# Phase 4: Evaluation Preparation (Gold Standard Creation)
**Goal:** Create a **150-sentence** "Gold Standard" test set for validating our Sentiment Analysis models.

## Requirements
1.  **Total:** 150 sentences.
2.  **Stratification:**
    *   50 from **Speech**
    *   50 from **Press Conf**
    *   50 from **Minutes**
3.  **Constraints:** Each sentence must contain between 10 and 60 words, inclusive.
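
The requirements above can be turned into an automated check on a candidate sample. The following is a minimal sketch, not part of the original pipeline: `check_gold_sample` is a hypothetical helper, and it assumes the sample still carries the `source` and `word_count` columns used in the code cell (so it applies before the export step, which drops `word_count`).

```python
import pandas as pd

# Hypothetical helper (not part of the notebook): checks a candidate sample
# against the requirements listed above.
def check_gold_sample(df, per_source=50, min_words=10, max_words=60):
    """Return a list of problems; an empty list means the sample passes."""
    problems = []
    sources = ('Speech', 'Press Conf', 'Minutes')

    # Requirement 1: total size (three sources, per_source rows each)
    expected_total = per_source * len(sources)
    if len(df) != expected_total:
        problems.append(f"expected {expected_total} rows, found {len(df)}")

    # Requirement 2: equal stratification across the three sources
    counts = df['source'].value_counts()
    for source in sources:
        n = int(counts.get(source, 0))
        if n != per_source:
            problems.append(f"expected {per_source} rows for {source}, found {n}")

    # Requirement 3: word count within the inclusive range
    out_of_range = (df['word_count'] < min_words) | (df['word_count'] > max_words)
    if out_of_range.any():
        problems.append(f"{int(out_of_range.sum())} sentence(s) outside {min_words}-{max_words} words")

    return problems

# Tiny synthetic demo: the third row is too short (9 words).
demo = pd.DataFrame({
    'source': ['Speech', 'Press Conf', 'Minutes'],
    'word_count': [12, 60, 9],
})
demo_problems = check_gold_sample(demo, per_source=1)
print(demo_problems)
```

For the real sample, calling `check_gold_sample(df_gold)` before the export columns are selected should return an empty list if all three requirements hold.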

## Output
Save to `../data/gold_standard/gold_standard_to_label_150.csv`.

In [1]:
import pandas as pd
import os

DATA_PATH = '../data/master/fed_master_corpus.csv'
OUTPUT_PATH = '../data/gold_standard/gold_standard_to_label_150.csv'

# 1. Load Data
if os.path.exists(DATA_PATH):
    df = pd.read_csv(DATA_PATH)
    print(f"Loaded {len(df)} records.")
else:
    raise FileNotFoundError(f"Master corpus not found at {DATA_PATH}.")

# 2. Filter for meaningful sentences
df_filtered = df[(df['word_count'] >= 10) & (df['word_count'] <= 60)].copy()
print(f"Filtered to {len(df_filtered)} sentences with length 10-60.")

# 3. Stratified Sampling (50 from each)
SAMPLES_PER_CLASS = 50

try:
    sample_speech = df_filtered[df_filtered['source'] == 'Speech'].sample(n=SAMPLES_PER_CLASS, random_state=42)
    sample_press = df_filtered[df_filtered['source'] == 'Press Conf'].sample(n=SAMPLES_PER_CLASS, random_state=42)
    sample_minutes = df_filtered[df_filtered['source'] == 'Minutes'].sample(n=SAMPLES_PER_CLASS, random_state=42)

    # Combine
    df_gold = pd.concat([sample_speech, sample_press, sample_minutes]).sample(frac=1, random_state=42).reset_index(drop=True)

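    # 4. Prepare for Export
    export_cols = ['date', 'source', 'text']
    # Explicit copy so the new column is not added to a slice (avoids a possible SettingWithCopyWarning)
    df_gold = df_gold[export_cols].copy()
    df_gold['manual_label'] = ''  # left empty, to be filled in during annotation

    # Create the output directory if it does not exist yet, then write the file
    os.makedirs(os.path.dirname(OUTPUT_PATH), exist_ok=True)
    df_gold.to_csv(OUTPUT_PATH, index=False)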

    print(f"SUCCESS: Generated {len(df_gold)} sentences.\nSaved to {OUTPUT_PATH}")
    print("Breakdown:")
    print(df_gold['source'].value_counts())
    
except ValueError as e:
    print(f"Error during sampling: {e}")
    print("Check if there are enough samples in each category.")

Loaded 7442 records.
Filtered to 5706 sentences with length 10-60.
SUCCESS: Generated 150 sentences.
Saved to ../data/gold_standard/gold_standard_to_label_150.csv
Breakdown:
source
Press Conf    50
Speech        50
Minutes       50
Name: count, dtype: int64
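
The three per-source `.sample()` calls in the cell above could also be written as a single grouped sample. The sketch below is an alternative, not the notebook's method: `stratified_sample` is a hypothetical helper, it relies on `DataFrameGroupBy.sample` (available in pandas 1.1 and later), and it is not guaranteed to select the same rows as the per-source calls even with the same seed. It checks group sizes first so that a shortage is reported by source name.

```python
import pandas as pd

# Hypothetical helper (not part of the notebook): draw the same number of
# rows from every group, then shuffle the combined result.
def stratified_sample(df, group_col, n_per_group, seed=42):
    # Fail early, naming the groups that are too small to sample from
    sizes = df[group_col].value_counts()
    short = sizes[sizes < n_per_group]
    if not short.empty:
        raise ValueError(f"Too few rows for n={n_per_group}: {short.to_dict()}")

    # One grouped call replaces one filtered .sample() call per source
    sampled = df.groupby(group_col).sample(n=n_per_group, random_state=seed)

    # Shuffle so the sources are interleaved, as in the notebook
    return sampled.sample(frac=1, random_state=seed).reset_index(drop=True)

# Synthetic toy data, for illustration only
toy = pd.DataFrame({
    'source': ['Speech'] * 5 + ['Press Conf'] * 4 + ['Minutes'] * 3,
    'text': [f"sentence {i}" for i in range(12)],
})
toy_sample = stratified_sample(toy, 'source', n_per_group=2)
print(toy_sample['source'].value_counts().to_dict())
```

With the notebook's data this would be called as `stratified_sample(df_filtered, 'source', SAMPLES_PER_CLASS)`.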


## Next Steps
Use the sentences in `gold_standard_to_label_150.csv` as input for LLM annotation; the empty `manual_label` column is reserved for the resulting labels.
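
Once annotation is under way, it may help to check how many rows are still unlabeled. The sketch below is one possible way to do that, under assumptions: `labeling_progress` is a hypothetical helper, the annotated file is assumed to keep the `date`, `source`, `text`, `manual_label` columns, and the rows and label values in the demo (`positive`, `neutral`) are made-up placeholders, not the project's label scheme. Note that an empty `manual_label` cell is read back by `pd.read_csv` as `NaN`, which the helper treats as unlabeled.

```python
import io
import pandas as pd

# Hypothetical helper (not part of the notebook): summarize annotation progress.
def labeling_progress(df):
    # Empty cells come back from read_csv as NaN; treat NaN and blank strings alike
    labels = df['manual_label'].fillna('').astype(str).str.strip()
    unlabeled = labels == ''
    return {
        'total': len(df),
        'labeled': int((~unlabeled).sum()),
        'unlabeled': int(unlabeled.sum()),
    }

# Synthetic stand-in for the annotated CSV (rows and labels are placeholders)
csv_text = """date,source,text,manual_label
2020-01-01,Speech,Example sentence one.,positive
2020-01-02,Minutes,Example sentence two.,
2020-01-03,Press Conf,Example sentence three.,neutral
"""
demo_labels = pd.read_csv(io.StringIO(csv_text))
progress = labeling_progress(demo_labels)
print(progress)
```

For the real file, replace the `io.StringIO` input with the path to the annotated CSV.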