# Add Instruction Column to Enriched Labels

This notebook adds an instruction column to the ao3_tifu_enriched_labels dataset and saves the result as a new Parquet file.
Each instruction is built from the row's labels:
- Story type (one_liner or short_story)
- Emotion
- Tone (if available)
- Genre

In [9]:
import sys

import pandas as pd

# Make the parent directory importable so config.py can be found
sys.path.append('..')
from config import PROCESSED_DATA_DIR

pd.set_option('display.max_colwidth', None)

## Load Data

In [10]:
# Load enriched labels
input_path = PROCESSED_DATA_DIR / "ao3_tifu_enriched_labels.parquet"
df = pd.read_parquet(input_path)

print(f"Loaded {len(df)} rows")
print(f"\nColumns: {df.columns.tolist()}")
print(f"\nData types:")
print(df.dtypes)

df.head()

Loaded 51337 rows

Columns: ['id', 'text', 'emotion', 'emotion_score', 'type', 'tone', 'tone_score', 'genre', 'genre_score']

Data types:
id                object
text              object
emotion           object
emotion_score    float64
type              object
tone              object
tone_score       float64
genre             object
genre_score      float64
dtype: object


| | id | text | emotion | emotion_score | type | tone | tone_score | genre | genre_score |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ao3_41199483_1822 | His eyes were deep and fathomless, gaze more intense than it had ever been when levelled against Eddie. Eddie swallowed. He wet his mouth and watched Steve’s eyes dip, tracking the movement. | fear | 0.976 | short_story | neutral | 0.936 | Horror | 0.396911 |
| 1 | ao3_39492981_286 | Steve was looking at him now, his eyes wide in Eddie’s peripheral. He couldn’t stomach seeing his face, no matter what it wore. | disgust | 0.829 | short_story | neutral | 0.962 | Thriller | 0.23039 |
| 2 | ao3_41199483_1637 | The scars on Steve’s wrist were the same colour as the ones on his cheeks, Eddie reminded himself. Thin, spidery, silver patches that were unnatural. Eddie’s breath caught in his throat. | disgust | 0.911 | short_story | neutral | 0.835 | Horror | 0.477902 |
| 3 | ao3_39570081_324 | Steve moved to Eddie’s chest, caressing him as he kissed down his body. When he got to Eddie’s legs, which were bent at the knee, he lifted one to kiss the inner thigh. | disgust | 0.866 | short_story | neutral | 0.912 | Romance | 0.505607 |
| 4 | ao3_39346095_128 | Eddie stares down at him, his eyes flickering to different points on his body. Steve’s fully clothed, but he has never felt more exposed. He runs his hand through his messed up hair. | disgust | 0.803 | short_story | neutral | 0.956 | Mystery | 0.280344 |
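The instruction is built from the `type`, `emotion`, `tone`, and `genre` columns, so it can help to confirm they are present right after loading. The following is a minimal sketch, not part of the original notebook; the helper name `missing_columns` and the toy frame are hypothetical.

```python
import pandas as pd

# Columns the instruction builder reads (taken from the notebook's column list)
REQUIRED_COLUMNS = ["type", "emotion", "tone", "genre"]


def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the frame."""
    return [col for col in required if col not in frame.columns]


# Hypothetical toy frame that deliberately lacks the 'tone' column
toy = pd.DataFrame({"type": ["short_story"], "emotion": ["fear"], "genre": ["Horror"]})
missing = missing_columns(toy)
print(f"Missing columns: {missing}")
```

On the real data one would call `missing_columns(df)` and stop early if the list is non-empty.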


## Check Data Distribution

In [11]:
print("Story Type Distribution:")
print(df['type'].value_counts())

print("\nEmotion Distribution:")
print(df['emotion'].value_counts())

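# Note: this cell applies no tone_score filter; the ">= 0.8" in the label
# presumably refers to filtering done upstream.
print("\nTone Distribution (rows with tone >= 0.8):")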
print(df['tone'].value_counts())
print(f"\nRows with tone: {df['tone'].notna().sum()}")
print(f"Rows without tone: {df['tone'].isna().sum()}")

print("\nGenre Distribution:")
print(df['genre'].value_counts())

Story Type Distribution:
type
short_story    49246
one_liner       2091
Name: count, dtype: int64

Emotion Distribution:
emotion
disgust     19102
fear        13288
anger        8086
surprise     4169
sadness      3469
joy          2774
neutral       449
Name: count, dtype: int64

Tone Distribution (rows with tone >= 0.8):
tone
neutral           43565
sadness            1291
admiration         1213
gratitude           864
love                772
surprise            716
joy                 681
fear                639
confusion           428
amusement           311
realization         300
optimism             91
disgust              84
caring               80
embarrassment        52
anger                49
remorse              48
curiosity            47
approval             43
disappointment       40
excitement            9
disapproval           7
desire                7
Name: count, dtype: int64

Rows with tone: 51337
Rows without tone: 0

Genre Distribution:
genre
Fantasy      10562
Ho
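The tone label above mentions a 0.8 threshold, yet every row has a tone and the cell itself does no filtering. A quick check along these lines could confirm whether the threshold actually holds for `tone_score`. This is a sketch under assumptions: the helper `check_score_threshold` and the toy rows are hypothetical, and the 0.8 value is taken only from the printed label.

```python
import pandas as pd


def check_score_threshold(frame, score_col="tone_score", threshold=0.8):
    """Summarise how many rows meet a minimum score (NaN scores count as not meeting it)."""
    meets = frame[score_col] >= threshold
    return {
        "rows": int(len(frame)),
        "meeting_threshold": int(meets.sum()),
        "min_score": float(frame[score_col].min()),
    }


# Hypothetical toy rows; the last one falls below the threshold
toy = pd.DataFrame(
    {"tone": ["neutral", "sadness", "joy"], "tone_score": [0.936, 0.835, 0.41]}
)
summary = check_score_threshold(toy)
print(summary)
```

If `meeting_threshold` equals `rows` on the real data, the label is accurate; otherwise the label or the upstream filter needs a second look.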

## Create Instruction Function

In [12]:
def create_instruction(row):
    """
    Create an instruction based on story attributes.
    
    Format: "Write a {type} with {emotion} emotion{tone_part}{genre_part}."
    """
    # Story type
    story_type = row['type'].replace('_', ' ')  # "one_liner" -> "one liner"
    
    # Emotion
    emotion = row['emotion']
    
    # Tone (only if available)
    tone_part = ""
    if pd.notna(row.get('tone')):
        tone_part = f", {row['tone']} tone"
    
    # Genre
    genre = row['genre']
    genre_part = f", in the {genre} genre"
    
    instruction = f"Write a {story_type} with {emotion} emotion{tone_part}{genre_part}."
    
    return instruction

# Test on a few rows
print("Sample instructions:")
for i in range(5):
    row = df.iloc[i]
    instruction = create_instruction(row)
    print(f"\n{i+1}. {instruction}")
    print(f"   Text preview: {row['text'][:100]}...")

Sample instructions:

1. Write a short story with fear emotion, neutral tone, in the Horror genre.
   Text preview: His eyes were deep and fathomless, gaze more intense than it had ever been when levelled against Edd...

2. Write a short story with disgust emotion, neutral tone, in the Thriller genre.
   Text preview: Steve was looking at him now, his eyes wide in Eddie’s peripheral. He couldn’t stomach seeing his fa...

3. Write a short story with disgust emotion, neutral tone, in the Horror genre.
   Text preview: The scars on Steve’s wrist were the same colour as the ones on his cheeks, Eddie reminded himself. T...

4. Write a short story with disgust emotion, neutral tone, in the Romance genre.
   Text preview: Steve moved to Eddie’s chest, caressing him as he kissed down his body. When he got to Eddie’s legs,...

5. Write a short story with disgust emotion, neutral tone, in the Mystery genre.
   Text preview: Eddie stares down at him, his eyes flickering to different points on his
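All five sample rows carry a tone, so the branch that omits the tone part is never exercised above. The sketch below runs both branches on two hypothetical rows. The function body repeats the notebook's logic in compact form so the block runs on its own.

```python
import pandas as pd


def create_instruction(row):
    """Same logic as the notebook's function, repeated here for a standalone check."""
    story_type = row["type"].replace("_", " ")
    # The tone part is skipped when the tone is missing (None or NaN)
    tone_part = f", {row['tone']} tone" if pd.notna(row.get("tone")) else ""
    genre_part = f", in the {row['genre']} genre"
    return f"Write a {story_type} with {row['emotion']} emotion{tone_part}{genre_part}."


# Hypothetical rows: one with a tone, one without
toy = pd.DataFrame(
    {
        "type": ["short_story", "one_liner"],
        "emotion": ["fear", "joy"],
        "tone": ["neutral", None],
        "genre": ["Horror", "Fantasy"],
    }
)
with_tone = create_instruction(toy.iloc[0])
without_tone = create_instruction(toy.iloc[1])
print(with_tone)
print(without_tone)
```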

## Apply Instructions to All Rows

In [13]:
# Add instruction column
print("Creating instruction column...")
df['instruction'] = df.apply(create_instruction, axis=1)

print(f"✓ Added instruction column to {len(df)} rows")

# Show sample
df[['instruction', 'text', 'type', 'emotion', 'tone', 'genre']].head(10)

Creating instruction column...
✓ Added instruction column to 51337 rows


| | instruction | text | type | emotion | tone | genre |
|---|---|---|---|---|---|---|
| 0 | Write a short story with fear emotion, neutral tone, in the Horror genre. | His eyes were deep and fathomless, gaze more intense than it had ever been when levelled against Eddie. Eddie swallowed. He wet his mouth and watched Steve’s eyes dip, tracking the movement. | short_story | fear | neutral | Horror |
| 1 | Write a short story with disgust emotion, neutral tone, in the Thriller genre. | Steve was looking at him now, his eyes wide in Eddie’s peripheral. He couldn’t stomach seeing his face, no matter what it wore. | short_story | disgust | neutral | Thriller |
| 2 | Write a short story with disgust emotion, neutral tone, in the Horror genre. | The scars on Steve’s wrist were the same colour as the ones on his cheeks, Eddie reminded himself. Thin, spidery, silver patches that were unnatural. Eddie’s breath caught in his throat. | short_story | disgust | neutral | Horror |
| 3 | Write a short story with disgust emotion, neutral tone, in the Romance genre. | Steve moved to Eddie’s chest, caressing him as he kissed down his body. When he got to Eddie’s legs, which were bent at the knee, he lifted one to kiss the inner thigh. | short_story | disgust | neutral | Romance |
| 4 | Write a short story with disgust emotion, neutral tone, in the Mystery genre. | Eddie stares down at him, his eyes flickering to different points on his body. Steve’s fully clothed, but he has never felt more exposed. He runs his hand through his messed up hair. | short_story | disgust | neutral | Mystery |
| 5 | Write a short story with disgust emotion, neutral tone, in the Horror genre. | The air behind Hannibal shifted. Franklyn’s corpse met Will’s eyes: neck twisted at an awkward angle, blue contacts covering clouded brown irises. | short_story | disgust | neutral | Horror |
| 6 | Write a short story with disgust emotion, neutral tone, in the Romance genre. | Steve’s hand left his waist to pull Eddie’s hair to the side, exposing his neck. He began to kiss Eddie’s neck, starting at his jawline and traveling lower until he was kissing the nape. | short_story | disgust | neutral | Romance |
| 7 | Write a short story with disgust emotion, neutral tone, in the Mystery genre. | The presence is heavy and warm, seeping through Eddie’s thin shirt. Something Eddie has not been able to articulate in his mind is just how big Steve’s hands are. | short_story | disgust | neutral | Mystery |
| 8 | Write a short story with fear emotion, neutral tone, in the Romance genre. | But the look in Cas's eyes.... Dean was old enough to know that there are other forms of intimacy besides sex. There were other ways to feel vulnerable; other ways to feel exposed. | short_story | fear | neutral | Romance |
| 9 | Write a short story with disgust emotion, neutral tone, in the Romance genre. | Eddie flipped his palm upright. He was unable to feel flustered over the tender way Steve traced careful fingers against his skin. That sucked a little; he'd have liked to feel normal for a bit. | short_story | disgust | neutral | Romance |
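Row-wise `apply` works fine at this size, but the same strings can also be assembled with vectorized string operations. The sketch below is an alternative and not the notebook's method. The function name `build_instructions` and the toy frame are hypothetical, and it assumes `type`, `emotion`, and `genre` are never missing.

```python
import pandas as pd


def build_instructions(frame):
    """Vectorized equivalent of the row-wise instruction builder."""
    story_type = frame["type"].str.replace("_", " ", regex=False)
    tone = frame["tone"]
    # Build the tone fragment for every row, then blank it where the tone is missing
    tone_part = (", " + tone.fillna("") + " tone").where(tone.notna(), "")
    return (
        "Write a " + story_type
        + " with " + frame["emotion"] + " emotion"
        + tone_part
        + ", in the " + frame["genre"] + " genre."
    )


# Hypothetical rows: one with a tone, one without
toy = pd.DataFrame(
    {
        "type": ["short_story", "one_liner"],
        "emotion": ["fear", "joy"],
        "tone": ["neutral", None],
        "genre": ["Horror", "Fantasy"],
    }
)
instructions = build_instructions(toy).tolist()
print(instructions)
```

Comparing its output against the `apply` result on the full frame is a cheap way to confirm the two approaches agree before swapping one for the other.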


## Check Instruction Distribution

In [14]:
# Check unique instructions
print(f"Total rows: {len(df)}")
print(f"Unique instructions: {df['instruction'].nunique()}")

# Show most common instructions
print("\nTop 20 most common instructions:")
print(df['instruction'].value_counts().head(20))

Total rows: 51337
Unique instructions: 952

Top 20 most common instructions:
instruction
Write a short story with disgust emotion, neutral tone, in the Horror genre.       4158
Write a short story with disgust emotion, neutral tone, in the Fantasy genre.      3291
Write a short story with disgust emotion, neutral tone, in the Romance genre.      2967
Write a short story with fear emotion, neutral tone, in the Fantasy genre.         2877
Write a short story with disgust emotion, neutral tone, in the Family genre.       2040
Write a short story with fear emotion, neutral tone, in the Horror genre.          1861
Write a short story with disgust emotion, neutral tone, in the Thriller genre.     1793
Write a short story with anger emotion, neutral tone, in the Fantasy genre.        1721
Write a short story with disgust emotion, neutral tone, in the Mystery genre.      1532
Write a short story with fear emotion, neutral tone, in the Romance genre.         1425
Write a short story with fear e
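With 952 unique instructions over 51337 rows, the counts above are clearly skewed: the most common instruction alone covers 4158 rows, roughly 8% of the data. A small helper can quantify how concentrated the distribution is. This is a sketch; the name `top_k_share` and the toy series are hypothetical.

```python
import pandas as pd


def top_k_share(values, k=1):
    """Fraction of rows covered by the k most frequent values."""
    counts = values.value_counts()
    return float(counts.head(k).sum() / counts.sum())


# Hypothetical toy series: 'a' appears in 3 of 5 rows
toy = pd.Series(["a", "a", "a", "b", "c"])
share = top_k_share(toy, k=1)
print(f"Top-1 share: {share:.2f}")
```

On the real data, `top_k_share(df['instruction'], k=20)` would give the share covered by the top 20 instructions.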

## Sample Review

In [15]:
# Random sample review
print("Random sample review:")
print("=" * 80)

for _, row in df.sample(10).iterrows():
    print(f"\nInstruction: {row['instruction']}")
    print(f"Text: {row['text']}")
    print("-" * 80)

Random sample review:

Instruction: Write a short story with disgust emotion, neutral tone, in the Mystery genre.
Text: Harry left his hand there, his thumb slowly caressing the smooth, recently shaved skin. Malfoy closed his eyes and let himself lean a little against his touch. He kissed the palm of his hand, and Harry withdrew it.
--------------------------------------------------------------------------------

Instruction: Write a short story with disgust emotion, neutral tone, in the Family genre.
Text: He held the bottle to Louis’ lips, Louis tipping his head back and swallowing a mouthful. “Umph, it’s even cold,” Louis said, wincing and clenching his eyes, Harry drinking a gulp straight from the bottle.
--------------------------------------------------------------------------------

Instruction: Write a short story with anger emotion, neutral tone, in the Family genre.
Text: So why can’t he just get his arm to move? “You didn’t wear the vest,” Eddie said with a pout, ripping him
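Because `disgust` and `fear` dominate the data, a plain random sample will mostly show those emotions and will differ on every run. One option is a seeded sample with a fixed number of rows per emotion. The sketch below is an assumption-laden alternative, not the notebook's method; `sample_per_group` and the toy frame are hypothetical.

```python
import pandas as pd


def sample_per_group(frame, group_col="emotion", n=2, seed=42):
    """Shuffle reproducibly, then keep up to n rows from each group."""
    shuffled = frame.sample(frac=1, random_state=seed)
    return shuffled.groupby(group_col).head(n)


# Hypothetical toy frame: three 'fear' rows and one 'joy' row
toy = pd.DataFrame(
    {
        "emotion": ["fear", "fear", "fear", "joy"],
        "text": ["a", "b", "c", "d"],
    }
)
review = sample_per_group(toy, n=2, seed=42)
for _, row in review.iterrows():
    print(f"{row['emotion']}: {row['text']}")
```

Groups smaller than `n` simply contribute all their rows, so rare emotions such as `neutral` still appear in the review.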

## Save Updated Dataset

In [16]:
# Save with instructions
output_path = PROCESSED_DATA_DIR / "ao3_tifu_enriched_labels_with_instruction.parquet"
df.to_parquet(output_path, index=False)

print(f"✓ Saved {len(df)} rows to {output_path}")
print(f"\nColumns in saved file: {df.columns.tolist()}")

✓ Saved 51337 rows to /Users/averylee/Desktop/FriendFic/notebooks/../data/processed/ao3_tifu_enriched_labels_with_instruction.parquet

Columns in saved file: ['id', 'text', 'emotion', 'emotion_score', 'type', 'tone', 'tone_score', 'genre', 'genre_score', 'instruction']
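Before relying on the saved file, a light sanity check on the new column can catch empty or malformed instructions. This is a minimal sketch: the helper `validate_instructions` and the toy frame are hypothetical, and the prefix and suffix rules simply mirror the format string used in `create_instruction`.

```python
import pandas as pd


def validate_instructions(frame, col="instruction"):
    """Return a list of problems found in the instruction column (empty if none)."""
    problems = []
    if frame[col].isna().any():
        problems.append("null instructions")
    text = frame[col].dropna().astype(str)
    if not text.str.startswith("Write a ").all():
        problems.append("unexpected prefix")
    if not text.str.endswith(".").all():
        problems.append("missing trailing period")
    return problems


# Hypothetical toy frame: one valid instruction, one malformed, one missing
toy = pd.DataFrame(
    {
        "instruction": [
            "Write a short story with fear emotion, neutral tone, in the Horror genre.",
            "Compose a poem",
            None,
        ]
    }
)
problems = validate_instructions(toy)
print(problems)
```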
