# FinEmo-LoRA: Initial Project Work
## CSCI 4/6366 Intro to Deep Learning - Deliverable II

**Project:** Fine-Grained Financial Emotion Classification  
**Date:** November 21, 2025  
**Course:** CSCI 6366 Neural Networks and Deep Learning  
**Instructor:** Professor Joel Klein

---

## Team Information

**Team Member:**
- **Vaishnavi Kamdi** - GitHub: [@vaish725](https://github.com/vaish725)

---

## Project Summary

This project develops a deep learning system for fine-grained emotion classification in financial text. Unlike traditional sentiment analysis (positive/negative/neutral), we classify text into **6 economic emotions**: anxiety, excitement, optimism, fear, uncertainty, and hope.

**Approach:**
1. **Current:** Lightweight feature-based classifier (DistilBERT + MLP)
2. **Future:** LoRA fine-tuning of Llama 3.1 8B with two-stage training
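
To give a rough sense of why LoRA is attractive for the planned Llama 3.1 8B stage, the sketch below counts trainable parameters for a single weight matrix with and without a low-rank adapter. The rank of 8 and the 4096 × 4096 projection size are illustrative assumptions, not settings taken from this project.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix."""
    full = d_out * d_in
    # LoRA trains two small factors instead of the full matrix:
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    lora = rank * (d_in + d_out)
    return full, lora


# Illustrative values only (assumed, not from the project config).
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA adapter:     {lora:,} parameters ({lora / full:.2%} of full)")
```

Because the adapter cost grows with `rank * (d_in + d_out)` rather than `d_in * d_out`, the trainable fraction stays small even for large matrices.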

**Key Innovation:** Using GPT-4 for high-quality pseudo-labeling of financial text with confidence scoring and automated quality control.

---

## Dataset Sources

1. **FinGPT Sentiment Dataset**
   - Source: https://github.com/FinancialDiets/FINGPT
   - License: Apache 2.0
   - Description: Financial news headlines and social media posts

2. **GoEmotions Dataset**
   - Source: https://github.com/google-research/google-research/tree/master/goemotions
   - Paper: https://arxiv.org/abs/2005.00547
   - License: Apache 2.0
   - Description: 58k Reddit comments with 27 emotion labels (for transfer learning)

3. **SEntFiN Dataset**
   - Source: https://www.kaggle.com/datasets/ankurzing/sentiment-analysis-for-financial-news
   - License: CC BY-SA 4.0
   - Description: Entity-aware financial sentiment analysis

---

## 1. Setup and Imports

In [None]:
# Import required libraries
import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Add project root to path
project_root = Path('/Users/vaishnavikamdi/Documents/GWU/Classes/Fall 2025/NNDL/FinEmo-LoRA')
sys.path.insert(0, str(project_root))

# Set plotting style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("✓ Imports successful")
print(f"✓ Project root: {project_root}")
print(f"✓ Python version: {sys.version.split()[0]}")

## 2. Dataset Exploration

Load and explore the annotated financial emotion dataset.

In [None]:
# Load the cleaned annotated dataset
data_path = project_root / 'data' / 'annotated' / 'fingpt_annotated_v2.csv'

df = pd.read_csv(data_path)

print(f"Dataset shape: {df.shape}")
print(f"\nColumns: {list(df.columns)}")
print(f"\nFirst few rows:")
df.head()

In [None]:
# Emotion class distribution
print("="*80)
print("EMOTION DISTRIBUTION")
print("="*80)

emotion_counts = df['emotion'].value_counts().sort_index()
print("\nClass counts:")
for emotion, count in emotion_counts.items():
    percentage = (count / len(df)) * 100
    print(f"  {emotion:<15} {count:>4} ({percentage:>5.1f}%)")

print(f"\nTotal samples: {len(df)}")
print(f"Class imbalance ratio: {emotion_counts.max() / emotion_counts.min():.1f}x")

## 3. Data Visualization

Visualize emotion distribution and sample text characteristics.

In [None]:
# Visualize emotion distribution
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Bar plot
emotion_counts.plot(kind='bar', ax=axes[0], color='steelblue', alpha=0.8)
axes[0].set_title('Emotion Class Distribution', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Emotion', fontsize=12)
axes[0].set_ylabel('Count', fontsize=12)
axes[0].tick_params(axis='x', rotation=45)
axes[0].grid(axis='y', alpha=0.3)

# Pie chart
axes[1].pie(emotion_counts, labels=emotion_counts.index, autopct='%1.1f%%',
            startangle=90, colors=sns.color_palette('pastel'))
axes[1].set_title('Emotion Distribution (%)', fontsize=14, fontweight='bold')

plt.tight_layout()
plt.show()

print("✓ Visualization complete")

In [None]:
# Analyze text length distribution
df['text_length'] = df['text'].str.len()
df['word_count'] = df['text'].str.split().str.len()

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Text length distribution
axes[0].hist(df['text_length'], bins=30, color='coral', alpha=0.7, edgecolor='black')
axes[0].axvline(df['text_length'].mean(), color='red', linestyle='--', 
                linewidth=2, label=f'Mean: {df["text_length"].mean():.0f}')
axes[0].set_title('Distribution of Text Length (Characters)', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Character Count', fontsize=12)
axes[0].set_ylabel('Frequency', fontsize=12)
axes[0].legend()
axes[0].grid(alpha=0.3)

# Word count distribution
axes[1].hist(df['word_count'], bins=30, color='mediumseagreen', alpha=0.7, edgecolor='black')
axes[1].axvline(df['word_count'].mean(), color='red', linestyle='--',
                linewidth=2, label=f'Mean: {df["word_count"].mean():.0f}')
axes[1].set_title('Distribution of Word Count', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Word Count', fontsize=12)
axes[1].set_ylabel('Frequency', fontsize=12)
axes[1].legend()
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

print(f"\nText Statistics:")
print(f"  Avg characters: {df['text_length'].mean():.0f} (std: {df['text_length'].std():.0f})")
print(f"  Avg words: {df['word_count'].mean():.0f} (std: {df['word_count'].std():.0f})")

## 4. Sample Annotations

Examine sample annotations from GPT-4 with confidence scores and reasoning.

In [None]:
# Display sample annotations for each emotion
print("="*80)
print("SAMPLE ANNOTATIONS (One per emotion)")
print("="*80)

for emotion in sorted(df['emotion'].unique()):
    sample = df[df['emotion'] == emotion].iloc[0]
    print(f"\n[{emotion.upper()}]")
    print(f"Text: {sample['text'][:150]}...")
    if 'confidence' in df.columns:
        print(f"Confidence: {sample['confidence']}")
    if 'reasoning' in df.columns:
        print(f"Reasoning: {sample['reasoning'][:120]}...")
    print("-"*80)
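
The confidence scores shown above can also drive a simple quality gate. The following is a minimal sketch, assuming `confidence` is a number between 0 and 1 and using a hypothetical cut-off of 0.7; the rows are made-up examples, and the project's actual quality-control logic may differ.

```python
def split_by_confidence(rows, threshold=0.7):
    """Separate annotations into confident ones and ones needing manual review."""
    keep, review = [], []
    for row in rows:
        if row["confidence"] >= threshold:
            keep.append(row)
        else:
            review.append(row)
    return keep, review


# Made-up rows using the column names from the annotated CSV.
rows = [
    {"text": "Markets may swing after the rate decision",
     "emotion": "uncertainty", "confidence": 0.91},
    {"text": "Company acquires new assets",
     "emotion": "optimism", "confidence": 0.55},
]

keep, review = split_by_confidence(rows)
print(f"{len(keep)} kept, {len(review)} flagged for review")
```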

## 5. Baseline Model: Feature-Based Classifier

Our current baseline uses a **lightweight feature-based approach**:
1. **Feature Extraction:** DistilBERT (frozen, pre-trained) generates 768-dimensional embeddings
2. **Classifier:** MLP with three hidden layers (768 → 512 → 256 → 128 → 6 classes)
3. **Training:** Cross-entropy loss with class weights to handle imbalance

This approach is:
- ✅ **Fast**: Training takes ~2 minutes on CPU
- ✅ **CPU-friendly**: No GPU required
- ✅ **Interpretable**: Can analyze feature importance
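
Part of the reason the baseline trains quickly on CPU is that only the MLP head is trained while DistilBERT stays frozen. As a rough check, the sketch below counts the trainable parameters of the 768 → 512 → 256 → 128 → 6 head, assuming every linear layer has a bias term (the default for `nn.Linear`).

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


sizes = [768, 512, 256, 128, 6]
total = mlp_param_count(sizes)
print(f"Trainable parameters in the MLP head: {total:,}")
```

Dropout and ReLU layers add no parameters, so they do not affect this count.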

### Install Required Packages

First, let's ensure all required packages are installed in the notebook environment.

In [None]:
# Install required packages if not already installed
import importlib
import subprocess
import sys

# Map pip package names to import names (they differ for scikit-learn)
packages = {'xgboost': 'xgboost', 'scikit-learn': 'sklearn', 'torch': 'torch'}

for package, module_name in packages.items():
    try:
        importlib.import_module(module_name)
        print(f"✓ {package} already installed")
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])
        print(f"✓ {package} installed")

print("\n✓ All required packages are available")

In [None]:
# Load pre-trained classifier
import pickle
import torch
import torch.nn as nn

# Add scripts to path for unpickling
import sys
scripts_path = str(project_root / 'scripts')
if scripts_path not in sys.path:
    sys.path.insert(0, scripts_path)

# Define MLP architecture (must match training)
class MLPClassifierPyTorch(nn.Module):
    def __init__(self, input_dim=768, hidden_layers=(512, 256, 128), num_classes=6, dropout=0.3):
        super(MLPClassifierPyTorch, self).__init__()
        layers = []
        prev_dim = input_dim
        for hidden_dim in hidden_layers:
            layers.extend([
                nn.Linear(prev_dim, hidden_dim),
                nn.ReLU(),
                nn.Dropout(dropout)
            ])
            prev_dim = hidden_dim
        layers.append(nn.Linear(prev_dim, num_classes))
        self.model = nn.Sequential(*layers)
    
    def forward(self, x):
        return self.model(x)

# Load the most recent trained model
model_dir = project_root / 'models' / 'classifiers'
model_files = sorted(model_dir.glob('mlp_*.pkl'))

if model_files:
    latest_model = model_files[-1]
    print(f"Loading model: {latest_model.name}")
    
    # Import the classifier module to help with unpickling
    try:
        from classifier import train_classifier
    except ImportError:
        pass  # OK if module not found, we've defined the class above
    
    with open(latest_model, 'rb') as f:
        checkpoint = pickle.load(f)
    
    classifier = checkpoint['model']
    label_encoder = checkpoint['label_encoder']
    
    print(f"✓ Model loaded successfully")
    print(f"  Feature dimension: {checkpoint['feature_dim']}")
    print(f"  Classes: {list(label_encoder.classes_)}")
    print(f"  Model type: {checkpoint['classifier_type']}")
    
    # Display model architecture
    if isinstance(classifier, MLPClassifierPyTorch):
        print(f"\nModel architecture:")
        print(classifier)
else:
    print("⚠️  No trained model found. Please run training first.")

## 6. Model Evaluation Results

Load and visualize the latest evaluation results.

### Alternative: View Results from JSON

If loading the pickled model fails, the evaluation results can still be viewed directly from the saved metrics.

In [None]:
# Load evaluation results from JSON (if available)
import json

results_dir = project_root / 'results'

# Check for metrics files directly in results directory
metrics_files = sorted(results_dir.glob('evaluation_metrics_*.json'))

if metrics_files:
    latest_metrics = metrics_files[-1]
    print(f"Loading results from: {latest_metrics.name}\n")
    
    with open(latest_metrics, 'r') as f:
        metrics = json.load(f)
    
    print("="*80)
    print("EVALUATION RESULTS")
    print("="*80)
    
    print(f"\nOverall Metrics:")
    print(f"  Accuracy: {metrics['overall_accuracy']:.4f}")
    print(f"  Macro Precision: {metrics['macro_precision']:.4f}")
    print(f"  Macro Recall: {metrics['macro_recall']:.4f}")
    print(f"  Macro F1: {metrics['macro_f1']:.4f}")
    
    print(f"\nPer-Class Performance:")
    print(f"{'Emotion':<15} {'Precision':<12} {'Recall':<12} {'F1-Score':<12} {'Support':<10}")
    print("-" * 65)
    
    for emotion, class_metrics in metrics['per_class_metrics'].items():
        print(f"{emotion:<15} "
              f"{class_metrics['precision']:<12.4f} "
              f"{class_metrics['recall']:<12.4f} "
              f"{class_metrics['f1_score']:<12.4f} "
              f"{class_metrics['support']:<10}")
    
    print("\n✓ Results loaded successfully")
    print(f"\nTimestamp: {metrics['timestamp']}")
else:
    print("⚠️  No evaluation results found. Run evaluation first:")
    print("    python run_pipeline.py --stage evaluate")

In [None]:
# Display confusion matrix from latest evaluation
from PIL import Image

# Check both locations for confusion matrices
confusion_matrix_paths = [
    project_root / 'results',
    project_root / 'models' / 'classifiers'
]

cm_files = []
for path in confusion_matrix_paths:
    cm_files.extend(path.glob('confusion_matrix_*.png'))

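# Sort by file name, not full path, so the directory does not decide which file is "latest"
cm_files = sorted(cm_files, key=lambda p: p.name)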

if cm_files:
    latest_cm = cm_files[-1]
    print(f"Latest confusion matrix: {latest_cm.name}")
    print(f"Location: {latest_cm.parent.name}/\n")
    
    img = Image.open(latest_cm)
    plt.figure(figsize=(10, 8))
    plt.imshow(img)
    plt.axis('off')
    plt.title('Confusion Matrix - Latest Model', fontsize=14, fontweight='bold', pad=20)
    plt.tight_layout()
    plt.show()
else:
    print("⚠️  No confusion matrix found. Run evaluation first:")
    print("    python run_pipeline.py --stage evaluate")

## 7. Initial Results & Observations

### Current Performance (Baseline Model)

**Original Dataset (200 samples):**
- Accuracy: 63.33%
- Macro F1: 0.33
- **Challenge:** Model never predicted excitement, optimism, or fear (F1 = 0 for these classes)
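
The gap between accuracy and macro F1 follows from how macro F1 is computed: every class contributes equally, so a class that is never predicted adds an F1 of 0 to the average. The toy example below uses made-up labels (not the project's data) to show the effect with a model that only predicts the majority class.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; undefined F1 is counted as 0."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up toy data: 6 majority-class samples and 2 each of two rarer classes.
labels = ["uncertainty", "optimism", "fear"]
y_true = ["uncertainty"] * 6 + ["optimism"] * 2 + ["fear"] * 2
y_pred = ["uncertainty"] * 10  # the model ignores the rarer classes

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, labels)
print(f"accuracy = {accuracy:.2f}, macro F1 = {score:.2f}")
```

Here accuracy is 0.60 while macro F1 is only 0.25, because two of the three classes score 0.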

**Key Finding:** Error analysis revealed **data quality issues**:
- 60-70% of "optimism" labels were neutral factual statements
- Example: *"Company acquires €420M in assets"* (factual, not emotional)
- GPT-4 confused business-positive context with emotional optimism

### Data Cleaning Effort

Implemented automated annotation review with sentiment heuristics:
- Reviewed 83 "optimism" samples: 54 kept as optimism, 18 relabeled to uncertainty, and the remaining 11 removed
- Reduced the dataset from 200 to 189 samples with higher-quality labels
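
The notebook does not spell out the sentiment heuristics, so the following is only a hypothetical sketch of what such a review step could look like: an "optimism" label is flagged when the text contains none of a small set of emotional cue words. The cue list and the function names are illustrative assumptions, not the project's implementation.

```python
import re

# Hypothetical cue words; a real list would need to be tuned on the data.
EMOTION_CUES = {"hope", "hopeful", "optimistic", "confident", "upbeat", "expects"}


def looks_emotional(text):
    """Return True if the text contains at least one cue word."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return bool(tokens & EMOTION_CUES)


def flag_suspect_optimism(rows):
    """Collect 'optimism' rows whose text looks purely factual."""
    return [
        row for row in rows
        if row["emotion"] == "optimism" and not looks_emotional(row["text"])
    ]


rows = [
    {"text": "Company acquires €420M in assets", "emotion": "optimism"},
    {"text": "Analysts are optimistic about next quarter", "emotion": "optimism"},
]
flagged = flag_suspect_optimism(rows)
print([row["text"] for row in flagged])
```

A keyword check like this is crude, so flagged rows are better treated as candidates for manual review than as automatic relabels.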

**Cleaned Dataset (189 samples):**
- Accuracy: 44.74% (initial retrain)
- Macro F1: 0.39
- **Observation:** Accuracy dropped from 63.33% to 44.74%, which we attribute to the smaller dataset, while macro F1 rose from 0.33 to 0.39

### Next Steps

1. **Short-term:**
   - Consider 3-class taxonomy (negative/positive/uncertainty) for small dataset
   - Scale up to 500+ annotated samples
   - Experiment with data augmentation

2. **Long-term:**
   - Implement LoRA fine-tuning pipeline
   - Two-stage training: GoEmotions (58k samples) → Financial domain
   - Target: 75-85% accuracy with balanced per-class performance
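
For the 3-class taxonomy listed under the short-term steps, the six emotions would need to be collapsed into negative, positive, and uncertainty. The grouping below is one plausible assignment and should be read as an assumption, since the notebook does not fix the mapping.

```python
from collections import Counter

# Assumed grouping of the six emotions into three coarse classes.
SIX_TO_THREE = {
    "anxiety": "negative",
    "fear": "negative",
    "excitement": "positive",
    "optimism": "positive",
    "hope": "positive",
    "uncertainty": "uncertainty",
}


def collapse(labels):
    """Map fine-grained emotion labels to the coarse 3-class taxonomy."""
    return [SIX_TO_THREE[label] for label in labels]


fine = ["fear", "hope", "uncertainty", "optimism", "anxiety"]
coarse = collapse(fine)
print(Counter(coarse))
```

Merging the rare classes this way trades label granularity for more samples per class.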

### Challenges Identified

1. **Small Dataset:** 189 samples insufficient for 6-class classification
2. **Class Imbalance:** 12.7x ratio (uncertainty: 76, hope: 6)
3. **Annotation Quality:** Automated annotation requires careful quality control
4. **Rare Classes:** Hope, fear, and excitement have very few samples (<20 each)
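
One common way to counter the imbalance described above is inverse-frequency class weighting, the same formula scikit-learn uses for `class_weight="balanced"`. The sketch below applies it to the two counts reported here (uncertainty: 76, hope: 6); the other four counts are placeholders chosen only so the total is 189. Whether the baseline uses exactly this formula is an assumption.

```python
def balanced_class_weights(counts):
    """weight = n_samples / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


counts = {
    "uncertainty": 76,  # reported in this notebook
    "hope": 6,          # reported in this notebook
    "optimism": 54,     # placeholder
    "anxiety": 25,      # placeholder
    "fear": 15,         # placeholder
    "excitement": 13,   # placeholder
}

weights = balanced_class_weights(counts)
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label:<12} {weight:.2f}")
```

The resulting weights can be passed, in label-encoder order, as the `weight` tensor of `torch.nn.CrossEntropyLoss`, so that errors on rare classes such as hope cost more than errors on uncertainty.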

---

## Summary for Deliverable II

### Completed Work

✅ **Data Pipeline**
- Downloaded and preprocessed FinGPT dataset (200+ samples)
- Implemented data loaders and preprocessing utilities

✅ **Annotation System**
- GPT-4-based emotion annotation with confidence scoring
- Automated quality control with sentiment heuristics
- Generated 189 high-quality annotated samples

✅ **Baseline Model**
- Feature-based classifier (DistilBERT + MLP)
- CPU-friendly implementation
- Achieved 63.33% accuracy on initial data

✅ **Error Analysis**
- Identified annotation quality issues
- Implemented data cleaning pipeline
- Documented challenges and next steps

### Repository Contents

- `README.md` - Complete project documentation
- `scripts/` - Python modules for data, training, and evaluation
- `notebooks/` - This notebook demonstrating initial work
- `config.yaml` - Central configuration file
- `data/` - Datasets and annotations
- `models/` - Trained classifiers

### GitHub Repository

**URL:** https://github.com/vaish725/FinEmo-LoRA

**Team:** Vaishnavi Kamdi ([@vaish725](https://github.com/vaish725))