# Financial Sentiment Analysis with OPT-1.3B: Step-by-Step Guide

This notebook demonstrates **instruction fine-tuning** of the **OPT-1.3B** model for **financial sentiment analysis** using **Supervised Fine-Tuning (SFT)** with **LoRA**.

## 🎯 What You'll Learn

- How to fine-tune LLMs for domain-specific tasks (finance)
- How to use Deep Lake for dataset management
- How to combine LoRA with SFT for efficient training
- How to work with financial text data
- How to merge LoRA adapters with base models
- How to evaluate sentiment analysis models

## 💼 Use Case: Financial Sentiment Analysis

**Why this matters:**
- Financial markets are driven by sentiment
- Automated sentiment analysis saves time
- Domain-specific models often outperform generic ones on in-domain text
- Real-world application in trading, risk assessment

## 🔧 Requirements

- **GPU**: 16GB+ VRAM recommended
- **Time**: 2-4 hours (depending on epochs)
- **Dataset**: FinGPT sentiment (20K training samples)
- **Model**: Facebook OPT-1.3B (1.3 billion parameters)

## 📊 Key Stats

| Metric | Value |
|--------|-------|
| Base Model | OPT-1.3B (1.3B params) |
| Trainable with LoRA | ~0.24% (3.1M params) |
| Training Examples | 20,000 |
| Validation Examples | 2,000 |
| Training Time | 2-4 hours |

## 📖 Table of Contents

1. [Setup Environment](#1-setup-environment)
2. [Load Deep Lake Dataset](#2-load-deep-lake-dataset)
3. [Initialize Model and Trainer](#3-initialize-model-and-trainer)
4. [Fine-Tune with SFT](#4-fine-tune-with-sft)
5. [Merge LoRA and OPT](#5-merge-lora-and-opt)
6. [Inference and Testing](#6-inference-and-testing)

---

**Credits**: Based on the tutorial by [Youssef Hosni](https://youssef-hosni.medium.com/)

## 1. Setup Environment

### Install Required Packages

We'll install:
- **transformers**: Hugging Face Transformers
- **deeplake**: Dataset management
- **trl**: Supervised Fine-Tuning Trainer
- **peft**: LoRA implementation
- **wandb**: Experiment tracking

In [None]:
# Install required packages
!pip install -q transformers==4.32.0 deeplake==3.6.19 trl==0.6.0 peft==0.5.0 wandb==0.15.8

print("✅ All packages installed successfully!")

In [None]:
# Import necessary libraries
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    TrainingArguments
)
from peft import LoraConfig, PeftModel
from trl import SFTTrainer
from trl.trainer import ConstantLengthDataset
import deeplake

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

## 2. Load Deep Lake Dataset

### About the FinGPT Sentiment Dataset

The **FinGPT sentiment dataset** contains:
- **Financial text** (news headlines and tweets) from various sources
- **Sentiment labels**: positive, negative, neutral
- **Instructions** for each example
- **20,000** training samples
- **2,000** validation samples

**Format**: Each example has three fields:
- `instruction`: "What is the sentiment of this news?"
- `input`: The financial text
- `output`: The sentiment label

### Why Deep Lake?

- Efficient dataset management
- Easy streaming for large datasets
- Version control for data
- Works seamlessly with PyTorch

In [None]:
# Load the FinGPT sentiment dataset from Deep Lake
print("Loading FinGPT sentiment dataset...")
print("This may take a moment on first load...")

ds = deeplake.load('hub://genai360/FingGPT-sentiment-train-set')
ds_valid = deeplake.load('hub://genai360/FingGPT-sentiment-valid-set')

print("\n✅ Datasets loaded successfully!")
print(f"\nTraining dataset:")
print(ds)
print(f"\nTraining samples: {len(ds):,}")
print(f"Validation samples: {len(ds_valid):,}")

In [None]:
# Explore the dataset structure
print("=== Dataset Structure ===")
print(f"Tensors: {ds.tensors}")
print()

# Look at a sample
print("=== Sample Example ===")
sample_idx = 0
print(f"Instruction: {ds[sample_idx]['instruction'].text()}")
print(f"\nInput: {ds[sample_idx]['input'].text()}")
print(f"\nOutput: {ds[sample_idx]['output'].text()}")

In [None]:
# Show a few more examples
print("=== More Examples from Dataset ===\n")

for i in range(3):
    print(f"--- Example {i+1} ---")
    print(f"Instruction: {ds[i]['instruction'].text()}")
    print(f"Input: {ds[i]['input'].text()}")
    print(f"Output: {ds[i]['output'].text()}")
    print()

### Prepare Dataset for Training

We need to format each example into a structured text format that the model can learn from.

**Format:**
```
{instruction}

Content: {input}

Sentiment: {output}
```

This creates a consistent pattern for the model to learn.

In [None]:
# Define function to prepare sample text
def prepare_sample_text(example):
    """
    Prepare the text from a sample of the dataset.
    
    Combines instruction, input, and output into a formatted string.
    
    Args:
        example: A sample from the dataset
        
    Returns:
        Formatted text string
    """
    text = f"""{example['instruction'].text()}

Content: {example['input'].text()}

Sentiment: {example['output'].text()}"""
    return text

# Test the function
print("=== Formatted Example ===")
print(prepare_sample_text(ds[0]))

In [None]:
# Load the tokenizer for OPT-1.3B
model_name = "facebook/opt-1.3b"

print(f"Loading tokenizer for {model_name}...")
tokenizer = AutoTokenizer.from_pretrained(model_name)

print("\n✅ Tokenizer loaded successfully!")
print(f"\nVocabulary size: {len(tokenizer):,}")
print(f"Model max length: {tokenizer.model_max_length}")

In [None]:
# Test tokenization
test_text = prepare_sample_text(ds[0])
tokens = tokenizer(test_text, return_tensors="pt")

print(f"Original text length: {len(test_text)} characters")
print(f"Number of tokens: {len(tokens['input_ids'][0])}")
print(f"\nFirst 10 token IDs: {tokens['input_ids'][0][:10].tolist()}")
print(f"\nDecoded back: {tokenizer.decode(tokens['input_ids'][0][:100])}...")

### Create ConstantLengthDataset

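The `ConstantLengthDataset` from TRL:
- Tokenizes the formatted examples and concatenates them, separated by an end-of-sequence token
- Slices the token stream into fixed-length chunks (1024 tokens here), so no padding is needed
- Enables efficient batch processing
- Supports infinite sampling (`infinite=True`), restarting the data when it is exhausted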
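
To make the packing idea concrete, here is a minimal pure-Python sketch of concatenating tokenized examples and slicing them into constant-length chunks. It is a simplified illustration, not TRL's actual implementation; the function name `pack_into_chunks`, the token IDs, and the EOS ID are all made up for the example.

```python
from typing import Iterable, Iterator, List


def pack_into_chunks(
    token_lists: Iterable[List[int]], seq_length: int, eos_id: int
) -> Iterator[List[int]]:
    """Concatenate tokenized examples (EOS-separated) and yield fixed-length chunks."""
    buffer: List[int] = []
    for tokens in token_lists:
        # Append the example followed by a separator token
        buffer.extend(tokens + [eos_id])
        # Emit as many full chunks as the buffer currently holds
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]
            buffer = buffer[seq_length:]
    # Leftover tokens shorter than seq_length are dropped in this sketch


# Hypothetical token IDs for three short examples; 2 stands in for the EOS token
examples = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]
chunks = list(pack_into_chunks(examples, seq_length=4, eos_id=2))
print(chunks)
```

Note how the second chunk contains the end of one example and the start of the next: packed sequences cross example boundaries, which is why no padding tokens are required.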

In [None]:
# Create training dataset with constant length
seq_length = 1024

print(f"Creating training dataset with sequence length: {seq_length}")

train_dataset = ConstantLengthDataset(
    tokenizer,
    ds,
    formatting_func=prepare_sample_text,
    infinite=True,  # Allow infinite sampling
    seq_length=seq_length
)

print("\n✅ Training dataset created!")

In [None]:
# Inspect a sample from the prepared dataset
iterator = iter(train_dataset)
sample = next(iterator)

print("=== Prepared Sample ===")
print(f"Keys: {sample.keys()}")
print(f"\nInput IDs shape: {sample['input_ids'].shape}")
print(f"Labels shape: {sample['labels'].shape}")
print(f"\nFirst 20 input IDs: {sample['input_ids'][:20].tolist()}")
print(f"\nDecoded text (first 100 tokens):")
print(tokenizer.decode(sample['input_ids'][:100]))

In [None]:
# Create validation dataset
print("Creating validation dataset...")

eval_dataset = ConstantLengthDataset(
    tokenizer,
    ds_valid,
    formatting_func=prepare_sample_text,
    seq_length=seq_length
)

print("✅ Validation dataset created!")
print(f"\nDatasets ready for training!")

## 3. Initialize Model and Trainer

### Configure LoRA for Efficient Fine-Tuning

**LoRA (Low-Rank Adaptation)** allows us to:
- Train only ~0.24% of model parameters
- Significantly reduce memory requirements
- Maintain high performance
- Enable faster training

### LoRA Parameters

- **r=16**: Rank of low-rank matrices (higher = more capacity)
- **lora_alpha=32**: Scaling factor for LoRA updates
- **lora_dropout=0.05**: Dropout for regularization
- **task_type**: CAUSAL_LM for language modeling
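
The ~0.24% figure can be checked with some quick arithmetic. The sketch below assumes OPT-1.3B has a hidden size of 2048 and 24 decoder layers, and that the adapters attach to the `q_proj` and `v_proj` projections in each layer (the notebook does not set `target_modules`, so this is an assumption about the library default). The base parameter count is the nominal 1.3 billion, not an exact figure.

```python
def lora_param_count(r: int, d_in: int, d_out: int,
                     modules_per_layer: int, num_layers: int) -> int:
    """Count trainable LoRA parameters: each adapted weight gets A (r x d_in) and B (d_out x r)."""
    per_module = r * (d_in + d_out)
    return per_module * modules_per_layer * num_layers


# Assumed architecture values (not stated in this notebook)
trainable = lora_param_count(r=16, d_in=2048, d_out=2048,
                             modules_per_layer=2, num_layers=24)
base_params = 1.3e9  # nominal size of OPT-1.3B
pct = 100 * trainable / (base_params + trainable)

# LoRA updates are scaled by lora_alpha / r
scaling = 32 / 16

print(f"Trainable LoRA params: {trainable:,}")
print(f"Share of all params: {pct:.2f}%")
print(f"Update scaling (alpha / r): {scaling}")
```

Doubling `r` doubles the trainable parameter count, which is the capacity trade-off mentioned above.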

In [None]:
# Configure LoRA
lora_config = LoraConfig(
    r=16,  # Rank of the low-rank matrices
    lora_alpha=32,  # Scaling factor
    lora_dropout=0.05,  # Dropout for regularization
    bias="none",  # Don't train bias terms
    task_type="CAUSAL_LM",  # Causal language modeling
)

print("✅ LoRA configuration created!")
print(f"\nLoRA settings:")
print(f"  Rank (r): {lora_config.r}")
print(f"  Alpha: {lora_config.lora_alpha}")
print(f"  Dropout: {lora_config.lora_dropout}")
print(f"  Task type: {lora_config.task_type}")

### Configure Training Arguments

Key settings for financial sentiment fine-tuning:

**Training Duration:**
- 10 epochs (can adjust based on convergence)
- Save and evaluate after each epoch

**Optimization:**
- Learning rate: 1e-4 (conservative for fine-tuning)
- Cosine learning rate schedule
- 100 warmup steps

**Efficiency:**
- Batch size: 12 per device
- BFloat16 mixed precision
- Gradient accumulation: 1 step

**Monitoring:**
- Weights & Biases integration
- Log every 5 steps
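
To see what "cosine schedule with 100 warmup steps" means for the learning rate, here is a small sketch of linear warmup followed by cosine decay to zero. The function `lr_at_step` and the `total_steps=1000` value are hypothetical and chosen for illustration; the real number of steps depends on dataset size, batch size, and epochs.

```python
import math


def lr_at_step(step: int, base_lr: float = 1e-4,
               warmup_steps: int = 100, total_steps: int = 1000) -> float:
    """Linear warmup to base_lr, then cosine decay to zero (illustrative only)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, from 0.0 to 1.0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 550, 1000):
    print(f"step {step:>4}: lr = {lr_at_step(step):.2e}")
```

The learning rate climbs from zero to its peak during warmup, then falls smoothly, reaching half the peak midway through the remaining steps.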

In [None]:
# Define training arguments
training_args = TrainingArguments(
    output_dir="./OPT-fine_tuned-FinGPT",
    
    # Training duration
    num_train_epochs=10,
    
    # Batch sizes
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    
    # Optimization
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    weight_decay=0.05,
    
    # Gradient settings
    gradient_accumulation_steps=1,
    gradient_checkpointing=False,  # Enabled directly on the model later via gradient_checkpointing_enable()
    
    # Mixed precision
    fp16=False,
    bf16=True,  # BFloat16 for better stability
    
    # Evaluation and saving
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_steps=5,
    
    # Data handling
    dataloader_drop_last=True,
    
    # Distributed training
    ddp_find_unused_parameters=False,
    
    # Experiment tracking
    run_name="OPT-fine_tuned-FinGPT",
    report_to="wandb",  # Set to "none" if not using W&B
)

print("✅ Training arguments configured!")
print(f"\nKey settings:")
print(f"  Output dir: {training_args.output_dir}")
print(f"  Epochs: {training_args.num_train_epochs}")
print(f"  Batch size: {training_args.per_device_train_batch_size}")
print(f"  Learning rate: {training_args.learning_rate}")
print(f"  Mixed precision: BF16={training_args.bf16}")

### Load OPT-1.3B Model

We'll load the base OPT-1.3B model and prepare it for LoRA fine-tuning.

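**Steps:**
1. Load model in BFloat16 precision
2. Freeze base model parameters
3. Cast 1-D parameters (layer norms, biases) to FP32 for stability
4. Enable gradient checkpointing and input gradients
5. Wrap the LM head so its output logits are cast to FP32

The LoRA adapters themselves are injected later, when `SFTTrainer` receives `peft_config`.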
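
As a rough back-of-the-envelope check on why this setup fits on a single GPU, the sketch below estimates memory for the weights alone. It assumes the nominal 1.3 billion base parameters, the ~3.1M LoRA parameters from the Key Stats table, and that the LoRA weights, their gradients, and two optimizer moment buffers are each stored in FP32. Activations and framework overhead are ignored, so real usage will be higher.

```python
def to_gib(n_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


base_params = 1.3e9   # nominal parameter count of the frozen base model
lora_params = 3.1e6   # approximate trainable LoRA parameters

# Frozen base weights in BFloat16: 2 bytes per parameter
weights_gib = to_gib(base_params * 2)

# Trainable state (assumed FP32): weight + gradient + two optimizer moments = 4 buffers
lora_state_gib = to_gib(lora_params * 4 * 4)

print(f"Frozen weights (BF16): ~{weights_gib:.2f} GiB")
print(f"LoRA training state:   ~{lora_state_gib:.3f} GiB")
```

The trainable state is tiny compared with the frozen weights, which is where most of LoRA's memory saving over full fine-tuning comes from.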

In [None]:
# Load the OPT-1.3B model
print(f"Loading {model_name}...")
print("This may take a few minutes...")

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16
)

print("\n✅ Model loaded successfully!")
print(f"Model: {model_name}")

In [None]:
# Prepare model for LoRA training
from torch import nn

print("Preparing model for LoRA training...")

# Freeze base model parameters
for param in model.parameters():
    param.requires_grad = False  # Freeze the model - train adapters later
    if param.ndim == 1:
        # Cast small parameters (e.g., layernorm) to FP32 for stability
        param.data = param.data.to(torch.float32)

# Enable gradient checkpointing to reduce memory usage
model.gradient_checkpointing_enable()

# Enable input gradients
model.enable_input_require_grads()

# Define custom class to cast output to FP32
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

# Apply to language model head
model.lm_head = CastOutputToFloat(model.lm_head)

print("\n✅ Model prepared for training!")
print("  - Base parameters frozen")
print("  - Layer norms in FP32")
print("  - Gradient checkpointing enabled")
print("  - Output cast to FP32")

## 4. Fine-Tune with SFT

### Initialize SFTTrainer

The **SFTTrainer** (Supervised Fine-Tuning Trainer) from TRL:
- Handles the training loop automatically
- Integrates LoRA seamlessly
- Supports packing for efficient training
- Manages evaluation and checkpointing

**Packing**: Combines multiple examples into each fixed-length sequence so that no tokens are wasted on padding. Our datasets were already packed by `ConstantLengthDataset` above.

In [None]:
# Initialize SFTTrainer
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    peft_config=lora_config,
    packing=True,  # Pack multiple examples per sequence
)

print("✅ SFTTrainer initialized!")

In [None]:
# Function to print trainable parameters
def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    
    trainable_pct = 100 * trainable_params / all_param
    
    print(f"trainable params: {trainable_params:,}")
    print(f"all params: {all_param:,}")
    print(f"trainable%: {trainable_pct:.4f}%")
    
    return trainable_params, all_param, trainable_pct

print("=== Model Parameters ===")
trainable, total, pct = print_trainable_parameters(trainer.model)

print(f"\n💡 With LoRA, we're only training {pct:.2f}% of parameters!")
print(f"   That's {trainable:,} out of {total:,} parameters.")

In [None]:
# Start training!
print("=" * 70)
print("STARTING FINE-TUNING")
print("=" * 70)
print()
print(f"Training on {len(train_dataset)} samples")
print(f"Validating on {len(eval_dataset)} samples")
print(f"Running for {training_args.num_train_epochs} epochs")
print()
print("This will take approximately 2-4 hours...")
print("Monitor progress via logging output or W&B dashboard")
print()
print("=" * 70)
print()

# Train the model
# Uncomment the line below to start training
# trainer.train()

print("⚠️ Training is commented out by default.")
print("Uncomment 'trainer.train()' above to start actual training.")
print()
print("For testing, you can:")
print("  - Reduce num_train_epochs to 1-2")
print("  - Use a smaller subset of data")
print("  - Adjust batch size based on your GPU")

In [None]:
# After training completes, save the model
# Uncomment when training is done

# print("Saving fine-tuned model...")
# trainer.save_model("./OPT-fine_tuned-FinGPT/final")
# print("✅ Model saved!")

## 5. Merge LoRA and OPT

After training, we need to:
1. Load the base OPT-1.3B model
2. Load the trained LoRA adapters
3. Merge them together
4. Save the merged model for inference

**Why merge?**
- Simpler deployment (single model file)
- Faster inference (no adapter overhead)
- Standard model format (compatible with all tools)
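
Merging works because a LoRA layer computes `W x + (alpha / r) * B (A x)`, which is algebraically the same as applying a single merged matrix `W + (alpha / r) * B A`. The toy sketch below checks this with tiny hand-written matrices in pure Python; the matrices and helper functions are illustrative and unrelated to the actual model weights.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by a scalar."""
    return [[x * s for x in row] for row in a]


W = [[1.0, 2.0], [3.0, 4.0]]  # frozen base weight (d_out x d_in)
A = [[0.5, -1.0]]             # LoRA A matrix (r x d_in), with r = 1
B = [[2.0], [1.0]]            # LoRA B matrix (d_out x r)
alpha, r = 2.0, 1
s = alpha / r

x = [[1.0], [1.0]]            # input as a column vector

# Adapter path: base output plus the scaled low-rank update
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), s))

# Merged path: fold the update into the weight once, then apply it
W_merged = add(W, scale(matmul(B, A), s))
y_merged = matmul(W_merged, x)

print(y_adapter, y_merged)
```

Both paths give the same output, so after merging the adapter matrices are no longer needed at inference time.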

In [None]:
# Load the base model
print("Loading base OPT-1.3B model...")

model_base = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",
    return_dict=True,
    torch_dtype=torch.bfloat16
)

print("✅ Base model loaded!")

In [None]:
# Load the LoRA adapters and merge
# Point checkpoint_path at your saved adapter folder
# Example: "./OPT-fine_tuned-FinGPT/checkpoint-500" or "./OPT-fine_tuned-FinGPT/final"

checkpoint_path = "./OPT-fine_tuned-FinGPT/final"  # Update this path

print(f"Loading LoRA adapters from: {checkpoint_path}")

# Note: This will only work after training is complete
# Uncomment the lines below when you have a trained checkpoint

# model_merged = PeftModel.from_pretrained(model_base, checkpoint_path)
# model_merged.eval()
# model_merged = model_merged.merge_and_unload()

# print("\n✅ LoRA adapters merged with base model!")

# # Save the merged model
# merged_path = "./OPT-fine_tuned-FinGPT/merged"
# model_merged.save_pretrained(merged_path)
# print(f"\n✅ Merged model saved to: {merged_path}")

print("⚠️ Merging is commented out - run after training completes")

## 6. Inference and Testing

Now let's test both the **vanilla (base)** model and our **fine-tuned** model to compare their performance on financial sentiment analysis.

### Test Setup

We'll use a sample financial news headline and ask both models to:
1. Identify the sentiment
2. Provide reasoning (optional)

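**Format** (three-label variant; the test prompt in the next cell uses a seven-level label set and also asks for short reasons):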
```
What is the sentiment of this news? Please choose an answer from {negative/neutral/positive}

Content: [financial news text]

Sentiment:
```
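
Since the prompt template is reused with different label sets, a small helper can keep it consistent with the training format. This is a minimal sketch; `build_prompt` is a hypothetical helper, not part of the notebook's code.

```python
def build_prompt(text: str, labels=("negative", "neutral", "positive")) -> str:
    """Build an inference prompt that mirrors the training format."""
    options = "/".join(labels)
    return (
        f"What is the sentiment of this news? Please choose an answer from {{{options}}}\n\n"
        f"Content: {text}\n\n"
        "Sentiment: "
    )


prompt = build_prompt("Stock prices remain unchanged after quarterly earnings report")
print(prompt)
```

Passing a different `labels` tuple yields the seven-level variant without retyping the template.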

In [None]:
# Prepare test input
test_prompt = """What is the sentiment of this news? Please choose an answer from {strong negative/moderately negative/mildly negative/neutral/mildly positive/moderately positive/strong positive}, then provide some short reasons.

Content: UPDATE 1-AstraZeneca sells rare cancer drug to Sanofi for up to $300 mln.

Sentiment: """

print("=== Test Prompt ===")
print(test_prompt)
print("=" * 70)

In [None]:
# Tokenize the input
inputs = tokenizer(test_prompt, return_tensors="pt")

# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputs = {k: v.to(device) for k, v in inputs.items()}

print(f"Input tokenized: {inputs['input_ids'].shape[1]} tokens")
print(f"Device: {device}")

### Load Models for Comparison

We'll load:
1. **Vanilla Model**: Base OPT-1.3B (no fine-tuning)
2. **Fine-Tuned Model**: Our trained model with merged LoRA adapters

**Note**: In production, you'd only load the fine-tuned model.

In [None]:
# Load vanilla (base) model for comparison
print("Loading vanilla OPT-1.3B model...")

model_vanilla = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",
    torch_dtype=torch.bfloat16
)
model_vanilla.to(device)
model_vanilla.eval()

print("✅ Vanilla model loaded and ready!")

In [None]:
# Load fine-tuned model
# This requires that training has completed and model has been merged

merged_model_path = "./OPT-fine_tuned-FinGPT/merged"

print(f"Loading fine-tuned model from: {merged_model_path}")

# Uncomment when you have a trained model
# model_finetuned = AutoModelForCausalLM.from_pretrained(
#     merged_model_path,
#     torch_dtype=torch.bfloat16
# )
# model_finetuned.to(device)
# model_finetuned.eval()
# print("✅ Fine-tuned model loaded and ready!")

print("⚠️ Loading fine-tuned model is commented out")
print("   Uncomment after training and merging are complete")

In [None]:
# Generate with vanilla model
print("=" * 70)
print("VANILLA MODEL OUTPUT")
print("=" * 70)
print()

generation_output_vanilla = model_vanilla.generate(
    **inputs,
    return_dict_in_generate=True,
    output_scores=True,
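    max_length=256,  # Total length, prompt tokens included
    num_beams=1,
    do_sample=True,  # Sampling: output varies between runs
    repetition_penalty=1.5,
    length_penalty=2.0  # Only affects beam search; no effect with num_beams=1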
)

output_text_vanilla = tokenizer.decode(
    generation_output_vanilla['sequences'][0],
    skip_special_tokens=True
)

print(output_text_vanilla)
print()
print("=" * 70)

In [None]:
# Generate with fine-tuned model
# Uncomment when you have a trained model

# print("=" * 70)
# print("FINE-TUNED MODEL OUTPUT")
# print("=" * 70)
# print()

# generation_output_finetuned = model_finetuned.generate(
#     **inputs,
#     return_dict_in_generate=True,
#     output_scores=True,
#     max_length=256,
#     num_beams=1,
#     do_sample=False,  # Greedy for more consistent output
#     repetition_penalty=1.5,
#     length_penalty=2.0
# )

# output_text_finetuned = tokenizer.decode(
#     generation_output_finetuned['sequences'][0],
#     skip_special_tokens=True
# )

# print(output_text_finetuned)
# print()
# print("=" * 70)

print("⚠️ Fine-tuned model inference is commented out")
print("   Uncomment after training is complete")

### Expected Results Analysis

**Vanilla Model:**
- Often generates irrelevant or repetitive text
- May not understand the sentiment task
- Struggles with financial domain terminology
- Output format may not match instructions

**Fine-Tuned Model:**
- Correctly identifies sentiment (positive in this case)
- Follows instruction format precisely
- Understands financial context
- Provides concise, accurate response

**Example Expected Output:**
```
Fine-tuned: "positive"
```
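
Generated text can include extra words or reasoning after the label, so it helps to map the output onto one of the allowed labels before scoring. Below is a minimal sketch of one way to do that. The helper `extract_label` is hypothetical; it checks longer labels first so that a multi-word label such as "strong positive" can be matched as a whole.

```python
from typing import Optional, Sequence


def extract_label(generated: str, labels: Sequence[str]) -> Optional[str]:
    """Return the allowed label that the text after 'Sentiment:' starts with, if any."""
    tail = generated.split("Sentiment:")[-1].strip().lower()
    # Try longer labels first so multi-word labels are matched as a whole
    for label in sorted(labels, key=len, reverse=True):
        if tail.startswith(label):
            return label
    return None


three_way = ["negative", "neutral", "positive"]
seven_way = [
    "strong negative", "moderately negative", "mildly negative", "neutral",
    "mildly positive", "moderately positive", "strong positive",
]

print(extract_label("Content: ...\n\nSentiment: positive", three_way))
print(extract_label("Sentiment: Strong positive. The sale raises cash.", seven_way))
print(extract_label("Sentiment: I am not sure", three_way))
```

Returning `None` for unparseable outputs makes format failures visible instead of silently counting them as a class.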

### Why Fine-Tuning Works

1. **Domain Adaptation**: Model learns financial terminology
2. **Task Understanding**: Learns to classify sentiment
3. **Format Consistency**: Follows instruction patterns
4. **Efficient Training**: LoRA trains only 0.24% of parameters

### Test with Multiple Examples

Let's test with various financial scenarios to see how the model performs across different sentiments.

In [None]:
# Define multiple test cases
test_cases = [
    {
        "text": "Apple reports record-breaking quarterly revenue, exceeding all analyst expectations",
        "expected": "positive"
    },
    {
        "text": "Company announces massive layoffs affecting 20% of workforce",
        "expected": "negative"
    },
    {
        "text": "Stock prices remain unchanged after quarterly earnings report",
        "expected": "neutral"
    },
    {
        "text": "Tech giant faces potential antitrust investigation",
        "expected": "negative"
    },
    {
        "text": "Pharmaceutical company receives FDA approval for new drug",
        "expected": "positive"
    }
]

print("=== Test Cases Prepared ===")
for i, case in enumerate(test_cases, 1):
    print(f"{i}. Text: {case['text']}")
    print(f"   Expected: {case['expected']}")
    print()

In [None]:
# Function to test multiple examples
def test_sentiment(model, tokenizer, text, device):
    """
    Test sentiment analysis on a given text.
    
    Args:
        model: The model to use
        tokenizer: The tokenizer
        text: The financial text to analyze
        device: Device to run on
        
    Returns:
        Generated sentiment response
    """
    prompt = f"""What is the sentiment of this news? Please choose an answer from {{negative/neutral/positive}}

Content: {text}

Sentiment: """
    
    inputs = tokenizer(prompt, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    outputs = model.generate(
        **inputs,
        max_new_tokens=10,  # Budget for the answer only; max_length would also count prompt tokens
        num_beams=1,
        do_sample=False,
        repetition_penalty=1.5
    )
    
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Take the first word after the final "Sentiment:" marker; empty string if nothing was generated
    answer = response.split("Sentiment:")[-1].split()
    sentiment = answer[0] if answer else ""
    
    return sentiment

print("✅ Testing function defined!")
print("\nYou can use this to test your fine-tuned model on multiple examples.")

In [None]:
# Test all cases with vanilla model
print("=" * 70)
print("VANILLA MODEL - BATCH TEST RESULTS")
print("=" * 70)
print()

for i, case in enumerate(test_cases, 1):
    print(f"Test {i}:")
    print(f"  Text: {case['text'][:50]}...")
    print(f"  Expected: {case['expected']}")
    
    # Uncomment to run actual inference
    # result = test_sentiment(model_vanilla, tokenizer, case['text'], device)
    # print(f"  Predicted: {result}")
    
    print(f"  Predicted: [Run inference to see]")
    print()

print("💡 Uncomment inference code to see actual predictions")
print("=" * 70)

## 🎉 Congratulations!

You've successfully:
- ✅ Loaded financial sentiment dataset from Deep Lake
- ✅ Configured LoRA for efficient fine-tuning
- ✅ Set up SFTTrainer with optimal hyperparameters
- ✅ Prepared OPT-1.3B for domain-specific training
- ✅ Understood the complete fine-tuning pipeline
- ✅ Learned how to merge LoRA adapters
- ✅ Tested model performance on financial text

## 🎯 Key Takeaways

### Efficiency of LoRA
- **Only 0.24%** of parameters trained (3.1M out of 1.3B)
- **Significantly faster** than full fine-tuning
- **Lower memory** requirements (fits on single GPU)
- **Often comparable performance** to full fine-tuning

### Domain Adaptation
- **Base model** struggles with financial sentiment
- **Fine-tuned model** understands financial context
- **Instruction following** improves dramatically
- **Output format** becomes consistent

### Practical Benefits
- **Usable baseline** after roughly 2-4 hours of training
- **Easy deployment** after merging adapters
- **Scalable approach** for other financial tasks
- **Cost-effective** compared to full fine-tuning

## 🚀 Next Steps

### Immediate
1. **Train the model**: Uncomment training code and run
2. **Evaluate**: Test on validation set
3. **Tune hyperparameters**: Adjust learning rate, epochs, etc.

### Advanced
1. **More data**: Train on larger financial datasets
2. **Finer-grained labels**: Expand from 3 to 7 sentiment categories
3. **Other tasks**: Adapt for financial NER, summarization
4. **Ensemble**: Combine multiple fine-tuned models

### Production
1. **Optimize inference**: Quantization, ONNX conversion
2. **API deployment**: Serve via FastAPI or similar
3. **Monitoring**: Track prediction quality over time
4. **A/B testing**: Compare with baseline models

## 📚 Resources

### Papers
- [LoRA Paper](https://arxiv.org/abs/2106.09685)
- [OPT Paper](https://arxiv.org/abs/2205.01068)
- [Instruction Tuning Survey](https://arxiv.org/abs/2308.10792)

### Tools
- [PEFT Documentation](https://huggingface.co/docs/peft)
- [TRL Documentation](https://huggingface.co/docs/trl)
- [Deep Lake](https://docs.activeloop.ai/)
- [OPT Model Card](https://huggingface.co/facebook/opt-1.3b)

### Datasets
- [FinGPT Datasets](https://huggingface.co/datasets?search=fingpt)
- [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank)
- [FiQA Sentiment](https://sites.google.com/view/fiqa/)

## 💡 Tips for Better Results

### Training
1. **More epochs**: 10-20 epochs often improve results
2. **Learning rate**: Try 5e-5 to 2e-4
3. **Batch size**: Increase if you have more GPU memory
4. **Sequence length**: Adjust based on your text lengths

### Data
1. **More examples**: 50K+ samples for best results
2. **Balanced classes**: Equal positive/negative/neutral
3. **Quality over quantity**: Clean, accurate labels
4. **Domain coverage**: Diverse financial topics
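
A quick way to check class balance is to count the labels before training. The sketch below uses a short hypothetical label list; in practice you would collect the `output` field of each dataset example.

```python
from collections import Counter
from typing import Dict, Iterable


def label_distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of examples per label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical labels for illustration only
labels = ["positive", "neutral", "neutral", "negative", "neutral", "positive"]
dist = label_distribution(labels)

# Ratio between the most and least frequent class; 1.0 means perfectly balanced
imbalance = max(dist.values()) / min(dist.values())

for label, share in sorted(dist.items()):
    print(f"{label:>8}: {share:.1%}")
print(f"Imbalance ratio: {imbalance:.1f}")
```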

### Evaluation
1. **Hold-out test set**: Never seen during training
2. **Multiple metrics**: Accuracy, F1, precision, recall
3. **Error analysis**: Study misclassifications
4. **Human evaluation**: Sample random predictions
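
The metrics listed above can be computed in a few lines without extra dependencies. This is a minimal sketch with made-up predictions; the function name `classification_metrics` and the sample data are illustrative.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]
) -> Tuple[float, Dict[str, Tuple[float, float, float]], float]:
    """Return accuracy, per-class (precision, recall, F1), and macro-averaged F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    per_class: Dict[str, Tuple[float, float, float]] = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[label] = (precision, recall, f1)

    macro_f1 = sum(f1 for _, _, f1 in per_class.values()) / len(labels)
    return accuracy, per_class, macro_f1


# Hypothetical gold labels and predictions
y_true = ["positive", "negative", "neutral", "negative", "positive"]
y_pred = ["positive", "negative", "neutral", "neutral", "positive"]

accuracy, per_class, macro_f1 = classification_metrics(
    y_true, y_pred, labels=["negative", "neutral", "positive"]
)
print(f"Accuracy: {accuracy:.2f}")
print(f"Macro F1: {macro_f1:.3f}")
```

Macro F1 weights every class equally, so it exposes weak performance on a rare class that accuracy alone can hide.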

## 🔧 Troubleshooting

### Out of Memory
- Reduce `per_device_train_batch_size`
- Enable `gradient_checkpointing=True`
- Reduce `seq_length` to 512 or 768
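
When you shrink `per_device_train_batch_size` to save memory, raising `gradient_accumulation_steps` keeps the effective batch size (and so the optimization behavior) roughly the same. The arithmetic is sketched below; `effective_batch_size` is a hypothetical helper.

```python
def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of sequences contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


# Settings used in this notebook: batch size 12, no accumulation
original = effective_batch_size(per_device=12, accumulation_steps=1)

# Lower-memory alternative: smaller batches, accumulated over 3 steps
reduced = effective_batch_size(per_device=4, accumulation_steps=3)

print(original, reduced)
```

Each forward and backward pass then holds only 4 sequences in memory, at the cost of more passes per optimizer step.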

### Poor Performance
- Train for more epochs
- Increase LoRA rank (r=32 or r=64)
- Check data quality and balance
- Adjust learning rate

### Slow Training
- Increase batch size if possible
- Use `bf16=True` for mixed precision
- Disable gradient checkpointing (skip `model.gradient_checkpointing_enable()`) if memory allows; it trades speed for memory
- Use multiple GPUs if available

---

**Happy Fine-Tuning!** 🎓✨

For more tutorials, check out:
- [Full Fine-Tuning (GPT-2)](../01-Full-Fine-Tuning/)
- [PEFT (Falcon-7B LoRA)](../02-PEFT/)
- [Summarization (FLAN-T5)](./Summarization-FLAN-T5.ipynb)
- [Reasoning Tuning](../04-Reasoning-Tuning/)