# Exploring Good Stories with Language Models

In this exercise we will use a dataset of artificially crafted (boring) stories. The goal is to see whether we can improve them using prompts based on Bunn's guide to being a good writer. We divide our narrative dimensions into two categories: Sequential and Global. Sequential dimensions address narrative progression, while Global dimensions address what the story communicates when taken as a whole.

Your job is to build a notebook that takes a story as input and then creates a series of prompts to improve the story by getting the LLM to focus on the dimensions below. You must address all dimensions in your notebook. Test multiple stories and, for each dimension, assess how well the LLM models that concept. Discuss the strengths and weaknesses of the model and use examples to support your points. You won't be able to discuss all dimensions in your write-up, so be focused. Submit your notebook along with your report this week.

## Sequential Dimensions

1. Set the scene
2. Inciting incident
3. Rising Action
4. Climax (choose one)
   - epiphany
   - moral choice
   - decisive action
   - emotional release

## Global Dimensions

1. Does it respect the three unities: a single time, a single place, and coherence of action?
2. Do you feel like you are there?
3. How much tension is there?
4. Is there an element of strangeness?
5. Is there a significant change or turning point?
6. Is insight achieved?
7. Is there a unified emotion governing the story?
8. Are there meaningful shifts in valence (positive/negative)?
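The global dimensions are phrased as questions, so they could also be put to the model as an assessment of a rewritten story. The sketch below pairs each question with a request for a score and a short justification. The 1-to-5 scale, the response format, and the name `build_assessment_prompts` are assumptions made for illustration; the assignment does not prescribe a scoring scheme.

```python
# The eight global dimensions, taken from the list above.
GLOBAL_DIMENSIONS = [
    "Does it respect the three unities: single time, place, coherence of actions?",
    "Do you feel like you are there?",
    "How much tension is there?",
    "Is there an element of strangeness?",
    "Is there a significant change or turning point?",
    "Is insight achieved?",
    "Is there a unified emotion governing the story?",
    "Are there meaningful shifts in valence (positive/negative)?",
]


def build_assessment_prompts(story, scale_max=5):
    """Pair each global dimension with a rating request for the given story.

    The scoring scale is a hypothetical choice, not part of the assignment.
    """
    assessment_prompts = []
    for question in GLOBAL_DIMENSIONS:
        assessment_prompts.append(
            f"Read the following story: {story}\n"
            f"Question: {question}\n"
            f"Answer with a score from 1 to {scale_max}, "
            "followed by one sentence of justification."
        )
    return assessment_prompts


# Example with a placeholder story
example_prompts = build_assessment_prompts("David noticed he had put on weight.")
print(example_prompts[2])
```

Asking the same questions about the source story and about each rewrite gives a rough before-and-after comparison, though the model's self-reported scores should be checked against your own reading.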

Good luck!

In [None]:
%pip install ollama tqdm black pandas

Note: you may need to restart the kernel to use updated packages.


In [1]:
import pandas as pd

def read_story_from_csv(row_number, file_path):
    """
    Reads a story from a specified row in a CSV file.

    Parameters:
    - row_number: The zero-based row index from which to read the story.
    - file_path: The path to the CSV file.

    Returns:
    - The story as a string.
    """
    try:
        # Load the CSV file
        df = pd.read_csv(file_path)

        # Check if the row number is valid
        if row_number < 0 or row_number >= len(df):
            raise ValueError("Invalid row number")

        # Define the story columns
        story_columns = ['sentence1', 'sentence2', 'sentence3', 'sentence4', 'sentence5']

        # Concatenate the sentences to form the story
        story = ' '.join(df.loc[row_number, story_columns].dropna().astype(str))

        return story

    except FileNotFoundError:
        print(f"Error: The file {file_path} does not exist.")
        return None

    except Exception as e:
        print(f"An error occurred: {e}")
        return None

# Example usage
story_files_path = "ROCStories_winter2017 - ROCStories_winter2017.csv"
row_number = 0  # Replace with the desired row number

source_story = read_story_from_csv(row_number, story_files_path)
if source_story:
    print(source_story)


David noticed he had put on a lot of weight recently. He examined his habits to try and figure out the reason. He realized he'd been eating too much fast food lately. He stopped going to burger places and started a vegetarian diet. After a few weeks, he started to feel much better.


In [None]:
import ollama
import pandas as pd
from tqdm import tqdm
from datetime import datetime

# Define the rewrite instructions (add one entry per narrative dimension)
prompts = ["Rewrite the following story to make it more engaging"]

# Set the model name
model_name = "llama3:8b"

# Initialize an empty list to store outputs
outputs = []

print(f"Processing source story: {source_story[:20]}")

# Loop through each prompt
for prompt_index, prompt in enumerate(prompts):
    # Construct the prompt
    rewrite_prompt = f"Given the following source story: {source_story} please rewrite it according to the following instruction: {prompt}. Avoid any introduction, and directly output the generated text."
    
    # Generate text using the LLaMA model
    response = ollama.chat(model=model_name, messages=[{"role": "user", "content": rewrite_prompt}])
    generated_text = response["message"]["content"]
    
    # Store the prompt and generated text
    outputs.append({
        "Prompt": prompt,
        "Generated Text": generated_text
    })
    
    # Display progress
    print(f"Processed prompt {prompt_index+1} of {len(prompts)}")

# Convert outputs to a DataFrame and save to CSV
df = pd.DataFrame(outputs)

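# Extract the first two words of the source story, keeping only
# alphanumeric characters so the result is safe to use in a filename
first_two_words = [
    "".join(ch for ch in word if ch.isalnum()) for word in source_story.split()[:2]
]
first_two_words = [word for word in first_two_words if word]
if not first_two_words:  # Handle empty source story
    first_two_words = ["default", "story"]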

filename = f"{'_'.join(first_two_words)}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv"

df.to_csv(filename, index=False)

print(f"Saved output to file: {filename}")


Processing source story: David noticed he had
Processed prompt 1 of 1


NameError: name 'datetime' is not defined