# Nemotron Next 8B LoRA Fine-tuning with FiQA Dataset

This notebook demonstrates LoRA fine-tuning of [NVIDIA Nemotron Next 8B](https://huggingface.co/nvidia/Nemotron-Next-8B) on the [FiQA dataset](https://huggingface.co/datasets/explodinggradients/fiqa) for financial question answering.

## Table of Contents

1. [Setup & Environment](#setup)
2. [Model Loading](#model-loading)
3. [Dataset Loading & Preprocessing](#dataset)
4. [Baseline Evaluation](#baseline)
5. [LoRA Configuration](#lora-config)
6. [LoRA Training](#training)
7. [Fine-tuned Evaluation](#evaluation)
8. [Visualization & Analysis](#visualization)

---

## GPU Requirements

⚠️ **This notebook requires a GPU with 24GB+ VRAM** (A100, H100, or RTX 4090 recommended)
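
A back-of-the-envelope estimate suggests why 24 GB is a sensible floor. The sketch below assumes a nominal 8 billion parameters (read off the model name, not measured) stored in bfloat16, and uses the same 1e9-bytes-per-GB convention as the environment check in Section 1. Activations, LoRA gradients, and optimizer state all have to fit in whatever the weights leave free.

```python
# Rough VRAM estimate for holding the frozen base weights.
# ASSUMPTION: a nominal 8e9 parameters, taken from the model name, not measured.
NOMINAL_PARAMS = 8e9
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


weights_gb = weight_memory_gb(NOMINAL_PARAMS, BYTES_PER_BF16)
headroom_gb = 24 - weights_gb  # what a 24 GB card leaves for everything else
print(f"Weights: {weights_gb:.1f} GB, headroom on a 24 GB card: {headroom_gb:.1f} GB")
```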

| Phase | GPU Required | Time Estimate |
|-------|--------------|---------------|
| Model Loading | ✅ Yes | 2-5 min |
| Dataset Prep | ✅ Yes | 10-25 min |
| Baseline Eval | ✅ Yes | 30-60 min |
| LoRA Training | ✅ Yes | 3-6 hours |
| Final Eval | ✅ Yes | 30-60 min |


<a name="setup"></a>
## 1. Setup & Environment

First, let's verify our environment and import required libraries.


In [None]:
# GPU REQUIRED - Verify CUDA availability
import torch

print("=" * 50)
print("Environment Check")
print("=" * 50)
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    raise RuntimeError("❌ CUDA not available! This notebook requires a GPU.")

print("=" * 50)
print("✅ GPU environment verified!")


In [None]:
# Core imports
import os
import json
import time
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Data processing
import pandas as pd
from datasets import load_dataset, Dataset, DatasetDict
from tqdm.auto import tqdm

# NeMo AutoModel imports
try:
    import nemo_automodel
    from nemo_automodel._transformers import NeMoAutoModelForCausalLM
    from nemo_automodel.components._peft.lora import PeftConfig, apply_lora_to_linear_modules
    print("✅ NeMo AutoModel imported successfully")
except ImportError as e:
    print(f"⚠️ NeMo AutoModel not found: {e}")
    print("Please install: cd Automodel && uv pip install -e .")
    raise

# Transformers for tokenizer
from transformers import AutoTokenizer, AutoProcessor

# Evaluation
import evaluate

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Set plotting style
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("husl")

print("✅ All imports successful!")


In [None]:
# Configuration
CONFIG = {
    # Model
    "model_name": "nvidia/Nemotron-Next-8B",
    "torch_dtype": torch.bfloat16,
    
    # Dataset
    "dataset_name": "explodinggradients/fiqa",
    "train_split_ratio": 0.8,  # 80% train, 20% validation from original train
    "max_length": 512,
    
    # LoRA
    "lora_rank": 8,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    
    # Training
    "learning_rate": 2e-4,
    "batch_size": 4,
    "num_epochs": 3,
    "gradient_accumulation_steps": 4,
    "warmup_ratio": 0.1,
    
    # Paths
    "output_dir": "./outputs",
    "checkpoint_dir": "./checkpoints",
}

# Create output directories
Path(CONFIG["output_dir"]).mkdir(parents=True, exist_ok=True)
Path(CONFIG["checkpoint_dir"]).mkdir(parents=True, exist_ok=True)

print("✅ Configuration loaded")
print(f"   Model: {CONFIG['model_name']}")
print(f"   Dataset: {CONFIG['dataset_name']}")
print(f"   LoRA rank: {CONFIG['lora_rank']}, alpha: {CONFIG['lora_alpha']}")


---

<a name="model-loading"></a>
## 2. Model Loading

**⏱️ Time Estimate: 2-5 minutes** | **GPU REQUIRED**

Load Nemotron Next 8B using NeMo AutoModel APIs.

> 🚧 **TODO: Phase 2** - Implement model loading


---

<a name="dataset"></a>
## 3. Dataset Loading & Preprocessing

**⏱️ Time Estimate: 10-25 minutes** | **GPU REQUIRED**

Load FiQA dataset, create train/val/test splits, and format for instruction fine-tuning.
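
As a minimal sketch of the split-and-format step, the example below works on toy in-memory records instead of the real dataset. The field names (`question`, `ground_truths`), the prompt template, and the helper names are assumptions for illustration; check the actual column names after loading. The 0.8 ratio mirrors `train_split_ratio` in `CONFIG`.

```python
import random

# ASSUMPTION: each record has a "question" string and a "ground_truths" list of
# answers. These toy records stand in for the real data.
toy_records = [
    {"question": f"Toy question {i}?", "ground_truths": [f"Toy answer {i}."]}
    for i in range(10)
]

# Hypothetical prompt template; adjust it to the model's expected chat format.
PROMPT_TEMPLATE = "### Question:\n{question}\n\n### Answer:\n"


def split_train_val(records, train_ratio=0.8, seed=42):
    """Shuffle a copy of the records deterministically, then split by ratio."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def format_example(record):
    """Turn one record into a prompt/completion pair for instruction tuning."""
    return {
        "prompt": PROMPT_TEMPLATE.format(question=record["question"]),
        "completion": record["ground_truths"][0],
    }


train_records, val_records = split_train_val(toy_records, train_ratio=0.8)
train_examples = [format_example(r) for r in train_records]
print(f"train={len(train_examples)}, val={len(val_records)}")
print(train_examples[0]["prompt"] + train_examples[0]["completion"])
```

Keeping the prompt and completion separate makes it straightforward to mask the prompt tokens out of the loss later, if the training loop supports that.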

> 🚧 **TODO: Phase 3** - Implement dataset loading and preprocessing


---

<a name="baseline"></a>
## 4. Baseline Evaluation

**⏱️ Time Estimate: 30-60 minutes** | **GPU REQUIRED**

Evaluate the base model on FiQA test set before fine-tuning.
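
The Results table at the end reports Exact Match and F1. One common way to define them is the SQuAD-style normalization and token overlap sketched below. This is an assumption about how the metrics will be computed, not the notebook's confirmed method; the imported `evaluate` library may be used instead.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Toy strings for illustration only, not model outputs.
print(exact_match("The interest rate is 5%.", "interest rate is 5%"))
print(round(token_f1("The rate is high.", "interest rate is low"), 4))
```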

> 🚧 **TODO: Phase 4** - Implement baseline evaluation


---

<a name="lora-config"></a>
## 5. LoRA Configuration

**⏱️ Time Estimate: ~5 minutes** | **GPU REQUIRED**

Configure and apply LoRA adapter to the model.
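
To make the rank and alpha settings concrete: standard LoRA freezes a weight matrix `W` and learns a low-rank update, so the effective weight is `W + (alpha / rank) * B @ A`, with `A` of shape `rank x d_in` and `B` of shape `d_out x rank`. The sketch below counts the trainable parameters this adds to one linear layer. The 4096 x 4096 layer shape is a hypothetical value chosen for illustration, not the model's actual dimensions.

```python
# Values taken from CONFIG in Section 1.
RANK = 8    # lora_rank
ALPHA = 32  # lora_alpha

# ASSUMPTION: a square 4096 x 4096 projection, used only as an example shape.
D_IN, D_OUT = 4096, 4096


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


full_params = D_IN * D_OUT
adapter_params = lora_param_count(D_IN, D_OUT, RANK)
scaling = ALPHA / RANK  # multiplier applied to the low-rank update B @ A

print(f"Full layer:   {full_params:,} parameters")
print(f"LoRA adapter: {adapter_params:,} parameters "
      f"({adapter_params / full_params:.2%} of the layer)")
print(f"Update scaling (alpha / rank): {scaling}")
```

Under these assumptions, doubling the rank doubles the adapter size, and it halves the scaling factor unless `lora_alpha` is raised to match.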

> 🚧 **TODO: Phase 5** - Implement LoRA configuration


---

<a name="training"></a>
## 6. LoRA Training

**⏱️ Time Estimate: 3-6 hours** | **GPU REQUIRED**

Train the LoRA adapter on FiQA training data.
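
The training values in `CONFIG` imply an effective batch size and a warmup length, which the sketch below derives. The training-set size is a placeholder assumption (the real count depends on the split), the rounding choices are one reasonable convention, and the schedule shows linear warmup only, with any later decay omitted.

```python
import math

# Values taken from CONFIG in Section 1.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
NUM_EPOCHS = 3
WARMUP_RATIO = 0.1
LEARNING_RATE = 2e-4

# ASSUMPTION: placeholder size; substitute the real training-set length.
NUM_TRAIN_EXAMPLES = 1000

# Examples consumed per optimizer update.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS
# Optimizer updates per epoch, counting a final partial batch as one update.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * NUM_EPOCHS
# Round down so warmup never exceeds the stated ratio.
warmup_steps = int(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then constant (decay omitted)."""
    if step < warmup_steps:
        return LEARNING_RATE * (step + 1) / warmup_steps
    return LEARNING_RATE


print(f"effective batch={effective_batch}, steps/epoch={steps_per_epoch}, "
      f"total={total_steps}, warmup={warmup_steps}")
```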

> 🚧 **TODO: Phase 6** - Implement LoRA training


---

<a name="evaluation"></a>
## 7. Fine-tuned Evaluation

**⏱️ Time Estimate: 30-60 minutes** | **GPU REQUIRED**

Evaluate the fine-tuned model and compare with baseline.
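
One way to fill the Improvement column of the Results table is to report both the absolute and the relative change per metric. The helper below is a hypothetical sketch, and the numbers fed to it are placeholders chosen for easy arithmetic, not measured results.

```python
def compare_metrics(baseline: dict, finetuned: dict) -> dict:
    """Return per-metric absolute and relative change from baseline to fine-tuned."""
    rows = {}
    for name, base in baseline.items():
        tuned = finetuned[name]
        delta = tuned - base
        # Relative change is undefined when the baseline score is zero.
        relative_pct = delta / base * 100 if base else None
        rows[name] = {
            "baseline": base,
            "finetuned": tuned,
            "delta": delta,
            "relative_pct": relative_pct,
        }
    return rows


# PLACEHOLDER values for illustration only; these are not real results.
dummy_baseline = {"exact_match": 0.0, "f1": 0.25}
dummy_finetuned = {"exact_match": 0.125, "f1": 0.5}

comparison = compare_metrics(dummy_baseline, dummy_finetuned)
for metric, row in comparison.items():
    rel = "n/a" if row["relative_pct"] is None else f"{row['relative_pct']:+.1f}%"
    print(f"{metric}: {row['baseline']} -> {row['finetuned']} "
          f"(delta {row['delta']:+.3f}, relative {rel})")
```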

> 🚧 **TODO: Phase 7** - Implement fine-tuned evaluation


---

<a name="visualization"></a>
## 8. Visualization & Analysis

Create visualizations comparing baseline vs fine-tuned performance.

> 🚧 **TODO: Phase 8** - Implement visualization and analysis


---

## Summary

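This notebook covers LoRA fine-tuning of Nemotron Next 8B on the FiQA financial QA dataset and compares the base model with the fine-tuned adapter. Metrics remain TBD until the evaluation phases are implemented.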

### Results

| Metric | Baseline | Fine-tuned | Improvement |
|--------|----------|------------|-------------|
| Exact Match | TBD | TBD | TBD |
| F1 Score | TBD | TBD | TBD |
| BLEU | TBD | TBD | TBD |

### Next Steps

- Experiment with LoRA ranks other than the rank 8 used here
- Train for more than the 3 epochs configured here
- Evaluate on additional financial QA datasets
