In this section, new posts are generated in three steps:
1. Selecting existing, not-yet-used titles for each required judgement
2. Altering them to create unique titles
3. Generating posts from each title and expected judgement using the best-performing model.


Note: all parameters, as well as the system message, are reused from the post generator experiment to keep the process consistent.

### Imports & Installs & Constants

In [None]:
import pandas as pd

from transformers import AutoTokenizer, pipeline
import torch

from pprint import pprint
from tqdm.auto import tqdm

import time

import json

### Loading the dataset of posts for the one-shot examples

In [15]:
from google.colab import files
uploaded = files.upload()

Saving AITA_minorities_subset.csv to AITA_minorities_subset.csv


In [139]:
aita_subset = pd.read_csv('AITA_minorities_subset.csv')

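# rows 0-199: pool for the one-shot post examples
train_data = aita_subset[:200]

# rows 201-599: source of the not-yet-used titles (row 200 is skipped)
title_data = aita_subset[201:600]

# rows 451 onward: used below to look up a single title (overlaps title_data)
temp = aita_subset[451:]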


### Selecting not-yet-seen existing titles

In [34]:
yta_titles = list(title_data[title_data['label'] == 'YTA']['title'][:33])
nah_titles = list(title_data[title_data['label'] == 'NAH']['title'][:33])
esh_titles = list(title_data[title_data['label'] == 'ESH']['title'][:34])
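
The slices above return fewer titles without warning if a label has fewer than the requested number of rows. A minimal sketch of a stricter selection helper follows; `select_titles` and the toy DataFrame are hypothetical and only mirror the `label` and `title` columns used in this notebook.

```python
import pandas as pd


def select_titles(df: pd.DataFrame, label: str, n: int) -> list[str]:
    """Return the first n titles with the given label, failing loudly if too few exist."""
    titles = df.loc[df["label"] == label, "title"].head(n).tolist()
    if len(titles) < n:
        raise ValueError(f"only {len(titles)} titles with label {label!r}, needed {n}")
    return titles


# Toy stand-in for title_data (hypothetical rows, not taken from the dataset)
toy = pd.DataFrame({
    "title": ["t1", "t2", "t3", "t4", "t5"],
    "label": ["YTA", "NAH", "YTA", "ESH", "YTA"],
})
yta_demo = select_titles(toy, "YTA", 2)
print(yta_demo)
```

With the real data, the call would look like `select_titles(title_data, 'YTA', 33)`.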

In [38]:
# creating an empty dataframe to hold the final dataset
generated_aita_posts = pd.DataFrame()

generated_aita_posts['Original_title'] = yta_titles + nah_titles + esh_titles
generated_aita_posts['Judgement'] = ['YTA'] * len(yta_titles) + ['NAH'] * len(nah_titles) + ['ESH'] * len(esh_titles)

In [39]:
generated_aita_posts

Unnamed: 0,Original_title,Judgement
0,AITA for lying to a friend about vacation?,YTA
1,AITA for paying my mom too much rent?,YTA
2,AITA for feigning a food allergy?,YTA
3,AITA for calling my brother an idiot?,YTA
4,AITA for inviting my family into a wellness gr...,YTA
...,...,...
95,AITA for calling my sister selfish for announc...,ESH
96,AITA for letting my friend stay with me even t...,ESH
97,AITA for refusing to pay for my daughter’s pla...,ESH
98,AITA for asking my wife to apologize to my mom...,ESH


### Creating a one-shot example for each title type and formatting the input

In [40]:
temp[temp['label'] == 'ESH']['title'][454]

'AITA for confronting my husband for treating the girls so differently?'

In [41]:
title_system_message = """You are a highly intelligent language model designed to create new and unique titles for the 'Am I the Asshole?' (AITA) subreddit for a given judgement (YTA, NAH, ESH). Provided the judgement, return a unique title which will be suitable for the AITA post of required judgement. Return only one title.
Examples:
"""

yta_sample = """
Judgement: YTA,
Title:
"AITA for banning my husband's mother from the hospital because of what she did?",
Judgement: YTA,
Title:
 'AITA for calling my best friend fat?',
Judgement: YTA,
Title:
 'AITA For Calling GF a Pushover',
Judgement: YTA,
Title:
 'AITA for not wanting to pay my parents rent after we made a deal?',
Judgement: YTA,
Title:
 'AITA for inviting my family into a wellness group that i am a part of?',
Judgement: YTA,
Title:
 'AITA For Yelling at My Grandpa?'

"""

nah_sample = """
Judgement: NAH,
Title: "AITA for not being there when my bf puts his dog to rest?",

Judgement: NAH,
Title: "AITA for 'kicking' my girlfriend out my my room because my son had a migraine?",

Judgement: NAH,
Title: "AITA for refusing to compromise and take colder showers?",

Judgement: NAH,
Title: "AITA for being mildly annoyed at my girlfriend for using one of my eggs?"
"""

esh_sample = """
Judgement: ESH,
Title: "AITA for not texting or calling my cousins because they did it to me?",

Judgement: ESH,
Title: "AITA for arguing with my husband regarding Christmas?",

Judgement: ESH,
Title: "AITA for telling my SIL her kids are going to hate their names?",

Judgement: ESH,
Title: "AITA for letting my friend stay with me even though my fiancé said he didn’t want her in our home?",

Judgement: ESH,
Title: "AITA for asking my wife to apologize to my mom but not vice versa?",

Judgement: ESH,
Title: "AITA for throwing my wife a birthday party and royally pissing my mom off?",

Judgement: ESH,
Title: "AITA for having a photo of my son's dna test framed and hanging in the living room?"
"""


In [42]:
# few-shot block to prepend for each judgement
samples_by_judgement = {'YTA': yta_sample, 'NAH': nah_sample, 'ESH': esh_sample}

def format_input(row):
  # pick the examples that match the requested judgement
  one_shot = samples_by_judgement[row['Judgement']]
  judgement = f"Judgement: {row['Judgement']},"

  messages = [
    {"role": "system", "content": title_system_message + one_shot},
    {"role": "user", "content": judgement}
  ]

  return messages



In [157]:
inputs_df = pd.DataFrame()

inputs_df['gen_titles_input'] = generated_aita_posts.apply(format_input, axis=1)


In [44]:

inputs_df.head(1)

Unnamed: 0,gen_titles_input
0,"[{'role': 'system', 'content': 'You are a high..."


### Generating titles with Llama

In [None]:
!pip install -q accelerate
!pip install --upgrade transformers torch torchvision



In [None]:
import os
from huggingface_hub import login

# read the access token from the environment instead of hard-coding it in the notebook
login(token=os.environ["HF_TOKEN"])


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [None]:
model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"


In [None]:
llama_pipline = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto"
)

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]



In [None]:
def get_title(pipe, inputs):
    """
    :param pipe: text-generation pipeline
    :param inputs: list of input prompts
    :return: list
    """
    assistant_outputs = []

    terminators = [
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>")
    ]

    for out in tqdm(pipe(
        inputs,
        max_new_tokens=64,
        eos_token_id=terminators, # stop at end-of-sequence or at Llama's end-of-turn token
        do_sample=True, # setting this to true to ensure my generation is not using greedy method
        temperature=0.8, # increasing temperature to improve creativity
        # top_k = 50,
        num_return_sequences=1
    )):
      assistant_outputs.append(out[0]["generated_text"][-1]['content'].strip())

    return assistant_outputs
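
The indexing in `get_title` relies on the shape of a chat-style pipeline result: each result holds the full conversation under `generated_text`, with the new assistant message last. The sketch below reproduces that indexing on a hand-written mock; the mock structure and the title in it are assumptions made for illustration, not real model output.

```python
def extract_reply(pipeline_output):
    """Return the text of the last message in the first returned sequence."""
    return pipeline_output[0]["generated_text"][-1]["content"].strip()


# Hand-written mock of one pipeline result (assumed shape, hypothetical content)
mock_output = [{
    "generated_text": [
        {"role": "system", "content": "You are a highly intelligent language model..."},
        {"role": "user", "content": "Judgement: YTA,"},
        {"role": "assistant", "content": "  AITA for borrowing my sister's car without asking?  "},
    ]
}]

print(extract_reply(mock_output))
```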

In [None]:
start_time = time.time()
predictions = get_title(llama_pipline, inputs_df['gen_titles_input'].tolist())
print(f'Time: {int(time.time() - start_time)} seconds')

# print(predictions)
print(*predictions[:2], sep = "\n\n")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


  0%|          | 0/2 [00:00<?, ?it/s]

Time: 2619 seconds
AITA for faking a family emergency to get out of a friend's vacation plans?

AITA for setting my rent so low that my adult roommate can barely afford her own apartment?


In [None]:
inputs_df['gen_titles_input'][1]

[{'role': 'system',
  'content': "You are a highly intelligent language model designed to create new and unique titles for the 'Am I the Asshole?' (AITA) subreddit for a given judgement (YTA, NTA, ESH). Provided the original title and the associated judgement, return a similar but unique title which will be suitable for the AITA post of required judgement. Return only one title.\n\nExample:\n\n\nJudgement: YTA,\nTitles: 'AITA for telling my wife our son’s mental health should be more important to her than her sister’s physical health?', <another title>,....\nReturn: 'AITA for yelling at my boyfirend that my metnal illness must be more important to him then his MIL health', <another title>,....\n"},
 {'role': 'user',
  'content': 'Judgement: YTA, AITA for paying my mom too much rent?'}]

In [None]:
titles = predictions

In [None]:
aita_generated_titles = pd.DataFrame({'Original_title': generated_aita_posts['Original_title'],
                               'Judgement': generated_aita_posts['Judgement'],
                               'Generated_title': titles})
aita_generated_titles.to_csv('aita_generated_titles.csv')
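
Since the goal of this step is unique titles, it may be worth checking that no generated title simply repeats an original one. The following is a minimal sketch under that assumption; `normalize_title` and `find_reused_titles` are hypothetical helpers, and the comparison is deliberately simple (case, whitespace and wrapping quotes only).

```python
def normalize_title(title: str) -> str:
    """Lowercase, drop leading/trailing quote characters, and collapse whitespace."""
    return " ".join(title.strip().strip("'\"").lower().split())


def find_reused_titles(generated, originals):
    """Return the generated titles that match an original title after normalization."""
    seen = {normalize_title(t) for t in originals}
    return [t for t in generated if normalize_title(t) in seen]


originals_demo = ["AITA for lying to a friend about vacation?"]
generated_demo = [
    '"AITA for lying to a friend about vacation?"',  # same title, only wrapped in quotes
    "AITA for faking a family emergency to get out of a friend's vacation plans?",
]
print(find_reused_titles(generated_demo, originals_demo))
```

With the real data, the inputs would be `aita_generated_titles['Generated_title']` and `generated_aita_posts['Original_title']`.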

In [None]:

files.download('aita_generated_titles.csv')


In [45]:
from google.colab import files
uploaded = files.upload()

Saving aita_generated_titles to aita_generated_titles


In [116]:
df = pd.read_csv('aita_generated_titles.csv')
# copy so that columns added later do not write into a view of df
aita_generated_titles = df[['Judgement','Generated_title']].copy()


In [61]:
# pd.options.display.max_colwidth = 150
# pd.options.display.max_columns = 100

# df[df['Judgement'] == 'ESH']

Unnamed: 0,Judgement,Generated_title
66,ESH,"""AITA for letting my partner's ridiculous expectations ruin our family gatherings?"""
67,ESH,"""AITA for telling my parents to stop paying my bills when I'm 25 and 'faking' adult life?"""
68,ESH,"""AITA for refusing to foot the bill for my sister's expensive vacation after she invited herself to stay with me for a month?"""
69,ESH,"""AITA for assuming my family would be okay with me taking a 6-month break from social media?"""
70,ESH,"""AITA for expecting my family to respect my home boundaries after I've been consistently disrespected by them?"""
71,ESH,"""AITA for not including my partner's family in our holiday traditions because we only spend holidays with my family?"""
72,ESH,AITA for assuming my partner's family would be okay with a last-minute addition to the wedding guest list?
73,ESH,"""AITA for having a meltdown over my mom's birthday gift and then refusing to help her pick out a gift for my dad?"""
74,ESH,"""AITA for setting boundaries with my sibling after they had no issue disregarding mine?"""
75,ESH,"""AITA for being annoyed that my sibling doesn't respect my boundaries, but also respecting theirs when they've been hurt in the past?"""


In [64]:
aita_generated_titles.to_csv('aita_generated_titles.csv')

In [117]:
files.download('aita_generated_titles.csv')



In [118]:
aita_generated_titles

Unnamed: 0,Judgement,Generated_title
0,YTA,"'AITA for expecting my roommate to pay for the entire broken stove, even though it was his fault?'"
1,YTA,"""AITA for assuming my roommate's 'lost' phone was just hidden to avoid a bill?"""
2,YTA,'AITA for making my roommate do all the household chores while I get to work from home and relax?'
3,YTA,'AITA for getting into a public argument with my roommate over her extremely messy cooking habits?'
4,YTA,"""AITA for assuming my roommate's 'emergencies' are just excuses to avoid paying rent?"""
...,...,...
95,ESH,AITA for expecting my in-laws to respect my boundaries on social media after they constantly harass me?
96,ESH,AITA for having a big argument with my brother over who gets grandma's old house after she passed away?
97,ESH,"""AITA for buying a home with my partner without including their mom on the deed?"""
98,ESH,"""AITA for setting boundaries with my in-laws after they consistently disrespected them?"""
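
Some of the generated titles above are wrapped in single or double quotes while others are not. If a consistent format is wanted before the titles go into the post prompt, one option is to strip a single matching pair of wrapping quotes. This is a sketch only; `strip_wrapping_quotes` is a hypothetical helper and the sample titles are made up.

```python
import pandas as pd


def strip_wrapping_quotes(titles: pd.Series) -> pd.Series:
    """Remove one matching pair of quotes that encloses the whole title."""
    # ^(["'])  captures the opening quote, \1$ requires the same quote at the end
    return titles.str.strip().str.replace(r'^(["\'])(.*)\1$', r"\2", regex=True)


demo_titles = pd.Series([
    '"AITA for a?"',
    "'AITA for b?'",
    "AITA for 'c'?",  # inner quotes only, left unchanged
])
cleaned = strip_wrapping_quotes(demo_titles)
print(cleaned.tolist())
```

A title that both starts and ends with a quoted phrase would lose its outer quote characters under this pattern, so the result is worth a quick visual check.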



## Post Generation

In [119]:
# this length is selected to accommodate the average length of an AITA post
post_max_tokens = 768

In [120]:
post_gen_system_message = """
You are a highly intelligent language model designed to generate posts for the "Am I the Asshole?" (AITA) subreddit. Your task is to read the title of an AITA post and generate an engaging post that aligns with the provided judgement.

Instructions:

1. Read the Title:
Carefully read the title of the AITA post.

2. Acknowledge the Required Judgement:
Consider the judgement indicated (YTA - You are the Asshole, NAH - No Assholes Here, ESH - Everyone Sucks Here). Consider the key points you need to make to ensure the story aligns with the required judgement.

3. Generate a Story by Following those Steps:

Create an AITA post with no section titles. Include the following elements seamlessly in the narrative:

Brief Background: Provide context for the story. Introduce the main characters and their relationships, and set the scene for the events that will unfold.

The Incident: Describe the specific incident or series of events that led to the conflict. Be detailed and clear about what happened, who was involved, and what actions were taken.

Friends' or Family's Opinion: Describe the opinions of friends or family members about the situation. Include differing viewpoints to provide a balanced perspective on the conflict.

The Current Outcome: Detail the consequences of the incident. Explain how the people involved reacted, any changes in relationships, and any ongoing impact the conflict has had.

Conclusion:
Summarize the key points of the story and pose similar question to the readers: "Am I the Asshole for [OP ACTIONS]?"

4. Maintain Authenticity:
Ensure that the story feels realistic and relatable. Use natural language and tone as if a real person is sharing their experience.

5. Adhere to the Judgement:
Ensure that the generated story logically leads to the required judgment (e.g., if the judgment is NTA, the story should clearly indicate why the poster might be considered not the asshole).

"""

In [140]:
# keep only mid-length posts (between the 25th and 75th word-count percentiles):
# long examples would spend too many prompt tokens, while very short ones
# could nudge the model toward generating short stories
percentile_75 = int(train_data['word_count'].quantile(0.75))
percentile_25 = int(train_data['word_count'].quantile(0.25))
train_data = train_data[train_data['word_count'] < percentile_75]
train_data = train_data[train_data['word_count'] > percentile_25]
train_data = train_data.reset_index(drop=True)
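
The percentile filter above can be checked on a small example. The sketch below applies the same strict bounds with `Series.between`; `keep_mid_length` and the toy word counts are hypothetical.

```python
import pandas as pd


def keep_mid_length(df: pd.DataFrame, column: str = "word_count",
                    low: float = 0.25, high: float = 0.75) -> pd.DataFrame:
    """Keep rows whose value lies strictly between the two quantiles."""
    lower = int(df[column].quantile(low))
    upper = int(df[column].quantile(high))
    mask = df[column].between(lower, upper, inclusive="neither")
    return df[mask].reset_index(drop=True)


# Toy word counts: the 25th percentile is 200 and the 75th is 400
toy_counts = pd.DataFrame({"word_count": [100, 200, 300, 400, 500]})
mid = keep_mid_length(toy_counts)
print(mid["word_count"].tolist())
```

Because both bounds are exclusive, rows sitting exactly on a percentile are dropped, which matches the `<` and `>` comparisons used in the cell above.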

In [141]:
train_data['few_shot_input'] = 'Judgement: ' + train_data['label'] + ', Title: ' + train_data['title']
train_data['few_shot_output'] = train_data['body']

In [142]:
aita_one_shot_data = train_data[['title', 'body','label', 'few_shot_input', 'few_shot_output']]
# aita_one_shot_data.to_csv('aita_one_shot_data.csv')
# files.download('aita_one_shot_data.csv')

In [143]:
def create_example(examples):
  """Turn each row of `examples` into a user/assistant message pair."""
  one_shot = []
  for _, example in examples.iterrows():
    one_shot.append({"role": "user", "content": example['few_shot_input']})
    one_shot.append({"role": "assistant", "content": example['few_shot_output']})

  return one_shot

In [158]:
def format_post_gen_input(row):
  # sample one real post with the same label to serve as the one-shot example
  label = row['Judgement']
  one_shot = create_example(train_data[train_data['label'] == label].sample(1))
  judgement = f"Judgement: {label}, Title: "

  system_message = [{"role": "system", "content": post_gen_system_message}]
  user_message = [{"role": "user", "content": judgement + row['Generated_title']}]
  return system_message + one_shot + user_message


inputs_df.loc[:, 'get_post_input'] = aita_generated_titles.apply(format_post_gen_input, axis=1)
#pprint(test_data.loc[:1, 'get_post_input'].tolist(), sort_dicts=False)
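
Because `.sample(1)` draws a different example on every run, the prompts built above are not reproducible. If reproducibility matters, one option is to pass a seed through `random_state`. This is a sketch under that assumption; `pick_example` and the toy pool are hypothetical.

```python
import pandas as pd


def pick_example(pool: pd.DataFrame, label: str, seed: int) -> pd.DataFrame:
    """Draw one example row with the given label; the same seed yields the same row."""
    candidates = pool[pool["label"] == label]
    return candidates.sample(1, random_state=seed)


# Toy stand-in for train_data (hypothetical rows)
toy_pool = pd.DataFrame({
    "label": ["YTA", "YTA", "YTA", "NAH"],
    "title": ["a", "b", "c", "d"],
})
first = pick_example(toy_pool, "YTA", seed=7)
second = pick_example(toy_pool, "YTA", seed=7)
print(first.equals(second))
```

Using something like a base seed plus the row index would keep the runs repeatable while still giving different rows a chance at different examples.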

In [159]:
inputs_df['get_post_input'][0]

[{'role': 'system',
  'content': '\nYou are a highly intelligent language model designed to generate posts for the "Am I the Asshole?" (AITA) subreddit. Your task is to read the title of an AITA post and generate an engaging post that aligns with the provided judgement.\n\nInstructions:\n\n1. Read the Title:\nCarefully read the title of the AITA post.\n\n2. Acknowledge the Required Judgement:\nConsider the judgement indicated (YTA - You are the Asshole, NAH - No Assholes Here, ESH - Everyone Sucks Here). Consider the key points you need to make to ensure the story aligns with the required judgement.\n\n3. Generate a Story by Following those Steps:\n\nCreate an AITA post with no section titles. Include the following elements seamlessly in the narrative:\n\nBrief Background: Provide context for the story. Introduce the main characters and their relationships, and set the scene for the events that will unfold.\n\nThe Incident: Describe the specific incident or series of events that led to

In [160]:
len(aita_generated_titles)

100

In [161]:
len(inputs_df)

100

## Generating with GPT-4o

In [163]:
%pip install --upgrade openai



In [164]:
from openai import OpenAI
import os

# read the API key from the environment; never hard-code it in the notebook
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

In [165]:
def generate_posts(formatted_input):
    # one chat completion per prompt
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.9,
        max_tokens=post_max_tokens,
        messages=formatted_input,
    )

    return response.choices[0].message.content
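
A run of 100 sequential API calls can be interrupted by a single transient error. One common safeguard is to wrap the call in a retry with exponential backoff. The sketch below is generic and does not call the OpenAI API; `with_retries` is a hypothetical helper, and catching every `Exception` is a simplification.

```python
import time


def with_retries(func, max_attempts=3, base_delay=1.0):
    """Wrap func so that failures are retried with exponentially growing pauses."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == max_attempts:
                    raise  # give up after the last attempt
                time.sleep(base_delay * 2 ** (attempt - 1))
    return wrapper


# Demo with a stand-in function that fails twice before succeeding
demo_calls = {"count": 0}

def flaky_demo(prompt):
    demo_calls["count"] += 1
    if demo_calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return f"post for: {prompt}"

print(with_retries(flaky_demo, max_attempts=3, base_delay=0.0)("some title"))
```

In this notebook, the wrapped function could then be passed to `.apply` in place of `generate_posts`, for example `inputs_df['get_post_input'].apply(with_retries(generate_posts))`.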

In [166]:
import time
start_time = time.time()

aita_generated_titles['Generated_Posts_gpt_4o'] = inputs_df['get_post_input'].apply(generate_posts)

print(f'Time: {int(time.time() - start_time)} seconds')


Time: 699 seconds


In [170]:
aita_generated_titles['Generated_Posts_gpt_4o'][99]

'Growing up, my family has always had this tradition of gathering at our grandparents\' house every Sunday for a big family dinner. Everyone is expected to attend unless they have a really good excuse. I\'ve been part of this tradition my entire life, but recently, I\'ve decided to take a step back. I\'m in my late twenties now, and between work, friends, and trying to carve out some personal time for myself, these weekly dinners feel more like an obligation than a joyful reunion.\n\nTwo months ago, I told my family that I wouldn\'t be coming to the Sunday dinners every week anymore. I explained that I needed some time to focus on other aspects of my life and that I\'d still try to make it when I could. My parents didn\'t take it well, saying that family should always come first and that I was being selfish. My siblings also chimed in, accusing me of not caring about the family and setting a bad example for our younger cousins.\n\nThe last straw was this past Sunday. My mom called me t

In [90]:
aita_generated_titles

Unnamed: 0,Judgement,Generated_title,Generated_Posts_gpt_4o
0,YTA,"'AITA for expecting my roommate to pay for the entire broken stove, even though it was his fault?'","So, I (24, female) recently had a pretty intense situation with my roommate, Jake (25). We've been living together for about a year now, and for t..."
1,YTA,"""AITA for assuming my roommate's 'lost' phone was just hidden to avoid a bill?""","So, here's the deal. I (23M) have a roommate, Jake (24M), who tends to be a bit forgetful at times. He recently got a new phone and had been ravin..."


In [171]:
aita_generated_titles.to_csv('AITA_Generated_Posts.csv')
files.download('AITA_Generated_Posts.csv')


## Comparison view between GPT-4o and GPT-4o mini

In [99]:
aita_generated_titles['Generated_Posts_gpt_4o'][0]

"I (24F) live with my roommate, Jake (25M). We've been sharing an apartment for about a year now. Jake and I get along well for the most part, splitting rent and utilities down the middle. We have separate chores to keep the place tidy, and it's been working fine until recently.\n\nA couple of weeks ago, Jake decided to make a big dinner while I was out. He invited a few friends over, and things got pretty wild. At some point during the night, Jake and his friends got a bit too tipsy, and one of them accidentally knocked over a pot, causing the stove to break. When I got home, Jake was obviously embarrassed and apologized profusely.\n\nThe repair cost for the stove was around $300, and Jake suggested we split the cost since we both use the stove regularly. But I felt that since it was his friends and his party that caused the damage, he should be the one to pay for it. I told him this, and he seemed taken aback, arguing that it's both our apartment and we should share responsibility fo

In [98]:
gpt_mini_posts['Generated_Posts_gpt_4o_mini'][0]

"I (23, female) have been living with my roommate Jake (24, male) for about a year now. We have a pretty good living situation overall, but like any roommate dynamic, there have been a few bumps along the road. \n\nRecently, our stove broke down. It all started one night when Jake decided to cook a fancy dinner for himself. He was trying out this elaborate recipe that involved a lot of oil, and let’s just say he was a little too careless with it. He left the oil on high heat for too long, and eventually, it caused a fire. Fortunately, I was in my room when the smoke alarm went off, and I rushed out to see him frantically trying to put it out. We managed to avoid any major disaster, but the stove was completely ruined.\n\nAfter the incident, we had a long conversation about it. I pointed out that this was entirely his fault because he wasn't being responsible in the kitchen. He argued that accidents happen and that I was being too harsh. When the time came for repairs, I told him I expe

In [None]:
gpt_mini_posts.to_csv('GPT_mini_generations.csv')
files.download('GPT_mini_generations.csv')
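
Beyond reading individual posts side by side, a quick numeric comparison can show whether one model tends to write longer posts. The sketch below compares word counts of two text columns; `compare_lengths`, the column names and the toy posts are hypothetical.

```python
import pandas as pd


def compare_lengths(posts_a: pd.Series, posts_b: pd.Series,
                    name_a: str = "gpt_4o", name_b: str = "gpt_4o_mini") -> pd.DataFrame:
    """Build a table of per-post word counts for two models and their difference."""
    table = pd.DataFrame({
        f"{name_a}_words": posts_a.str.split().str.len(),
        f"{name_b}_words": posts_b.str.split().str.len(),
    })
    table["difference"] = table[f"{name_a}_words"] - table[f"{name_b}_words"]
    return table


# Toy posts standing in for the two generated-post columns
toy_a = pd.Series(["one two three", "one"])
toy_b = pd.Series(["one two", "one two"])
length_table = compare_lengths(toy_a, toy_b)
print(length_table)
```

With the real data, the inputs would be `aita_generated_titles['Generated_Posts_gpt_4o']` and `gpt_mini_posts['Generated_Posts_gpt_4o_mini']`, assuming both frames share the same row order.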