# FinAI Contest Task 2 – LoRA Fine-Tuning Walk-through

This notebook is a simplified walk-through of **data preparation, fine-tuning, and inference** for FinAI Contest Task 2 using the [FinLoRA](https://github.com/Open-Finance-Lab/FinLoRA) framework. For more detailed instructions, see the tutorials in the FinLoRA docs: https://finlora-docs.readthedocs.io/en/latest/index.html. The full process for a **simplified** Buffett Agent model we created is documented here: https://finlora-docs.readthedocs.io/en/latest/tutorials/buffett_agent.html.

**Target Tasks:**
- CFA exams
- BloombergGPT public benchmarks  
- XBRL tasks


## 1. Environment Setup

**Prerequisites:**
- NVIDIA GPU with ≥ 24 GB VRAM (8-bit) or ≥ 16 GB VRAM (4-bit)
- CUDA ≥ 11.8
- Alternatively, use runpod.io (see FinLoRA docs for instructions on using it: https://finlora-docs.readthedocs.io/en/latest/tutorials/setup.html)
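
The VRAM prerequisites above can be checked before installing anything heavy. The following is a minimal sketch, not part of FinLoRA: the helper names `meets_vram_requirement` and `detect_vram_gb` are hypothetical, and the thresholds are simply copied from the list above. It assumes PyTorch may or may not be installed yet and reports nothing about the GPU if it is missing.

```python
# Minimum VRAM in GB per quantization level, copied from the prerequisites above.
MIN_VRAM_GB = {8: 24, 4: 16}


def meets_vram_requirement(total_gb: float, quant_bits: int) -> bool:
    """Return True if a GPU with `total_gb` of memory meets the stated minimum.

    The value is rounded to the nearest GB because drivers typically report
    slightly less than the nominal capacity (for example about 23.7 on a 24 GB card).
    """
    if quant_bits not in MIN_VRAM_GB:
        raise ValueError(f"Unsupported quant_bits: {quant_bits}")
    return round(total_gb) >= MIN_VRAM_GB[quant_bits]


def detect_vram_gb():
    """Return total memory of GPU 0 in GiB, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


vram = detect_vram_gb()
if vram is None:
    print("No CUDA GPU detected (or PyTorch is not installed yet).")
else:
    for bits in (8, 4):
        status = "OK" if meets_vram_requirement(vram, bits) else "insufficient"
        print(f"{bits}-bit: {vram:.1f} GB detected -> {status}")
```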

In [None]:
# Clone FinLoRA and install dependencies
!git clone https://github.com/Open-Finance-Lab/FinLoRA.git
%cd FinLoRA

# Option A - bash script
!chmod +x setup.sh && ./setup.sh

# Option B - conda (alternative)
# !conda env create -f environment.yml && conda activate finenv


In [None]:
# Authenticate for gated Llama models
!huggingface-cli login


## 2. Data Preparation

**Data Sources to Collect:**
- CFA mock-exam PDFs or CSVs
- BloombergGPT benchmark datasets (FPB, FiQA SA, Headline, NER, ConvFinQA)
- XBRL corpora for tag/value/formula tasks

**Required Format:** JSONL with `{"context": "<question>", "target": "<answer>"}`
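
Malformed lines are easier to catch before fine-tuning than during it. The sketch below shows one way to write and validate records in the format above; `to_jsonl_line` and `validate_record` are hypothetical helpers, not FinLoRA functions, and the rule that both fields must be non-empty strings is an assumption.

```python
import json

REQUIRED_KEYS = ("context", "target")


def to_jsonl_line(question: str, answer: str) -> str:
    """Serialize one Q&A pair as a single JSONL line in the required format."""
    return json.dumps({"context": question, "target": answer}, ensure_ascii=False)


def validate_record(line: str) -> dict:
    """Parse one JSONL line and check that 'context' and 'target' are non-empty strings."""
    record = json.loads(line)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(record, dict):
        raise ValueError("each line must be a JSON object")
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty field: {key!r}")
    return record


line = to_jsonl_line("What does EPS stand for?", "Earnings per share")
print(line)
print(validate_record(line))
```

Running `validate_record` over every line of your raw file before splitting gives an early, line-level error instead of a failure partway through training.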


In [None]:
import json
import random
from pathlib import Path

# Assume you have collected raw Q&A pairs in 'data/finai_raw.jsonl'
# Each line should be: {"context": "question", "target": "answer"}
# To evaluate your adapter on held-out data, split the data into train and test sets as follows.

# Read raw data
raw_file = Path('data/finai_raw.jsonl')  # Update path as needed
if raw_file.exists():
    with open(raw_file, 'r', encoding='utf-8') as f:
        lines = f.read().splitlines()
    
    # Shuffle for random split
    random.seed(42)
    random.shuffle(lines)
    
    # 80/20 split
    n = len(lines)
    n_train = int(0.8 * n)
    
    train_lines = lines[:n_train]
    test_lines = lines[n_train:]
    
    # Create directories
    Path('data/train').mkdir(parents=True, exist_ok=True)
    Path('data/test').mkdir(parents=True, exist_ok=True)
    
    # Save splits
    with open('data/train/finai_train.jsonl', 'w', encoding='utf-8') as f:
        f.write('\n'.join(train_lines) + '\n')
    
    
    with open('data/test/finai_test.jsonl', 'w', encoding='utf-8') as f:
        f.write('\n'.join(test_lines) + '\n')
    
    print(f"Split {n} examples into {len(train_lines)} examples for fine-tuning and {len(test_lines)} examples for testing")
else:
    print(f"Please create {raw_file} with your collected Q&A pairs first")
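
One caveat with a line-based split: blank lines end up in the output files, and a question that appears twice in the raw data can land in both the train and the test set, which inflates test scores. A minimal sketch of a cleaning step that could be applied to `lines` before shuffling (the helper name is hypothetical, and exact-string matching is an assumption; near-duplicates are not detected):

```python
def drop_blank_and_duplicate_lines(lines):
    """Remove blank lines and exact duplicates, keeping the first occurrence in order."""
    seen = set()
    cleaned = []
    for line in lines:
        key = line.strip()
        if not key or key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned


raw = ['{"context": "q1", "target": "a1"}', "", '{"context": "q1", "target": "a1"}']
print(len(drop_blank_and_duplicate_lines(raw)), "unique non-blank line(s)")
```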


## 3. Configure Fine-Tuning

Add configuration to `finetune_configs.json` for your FinAI model.


In [None]:
import json
from pathlib import Path

# Read existing config
config_file = Path('lora/finetune_configs.json')
with open(config_file, 'r') as f:
    configs = json.load(f)

# Add competition fine-tuned model configuration
configs["finai_llama_3_1_8b_8bits_r8_lora"] = {
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "dataset_path": "../data/train/finai_train.jsonl",
    "lora_r": 8,
    "quant_bits": 8,
    "learning_rate": 1e-4,
    "num_epochs": 4,
    "batch_size": 2,
    "gradient_accumulation_steps": 2
}

# Save updated config
with open(config_file, 'w') as f:
    json.dump(configs, f, indent=2)
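
A mistyped key or dataset path in the config usually only surfaces once training starts. The sketch below is a hypothetical pre-flight check, not a FinLoRA utility. It assumes that `dataset_path` is resolved relative to the `lora/` directory (where `finetune.py` is run from) and it checks only for the keys used in the example entry above, which may not be the full set FinLoRA accepts.

```python
from pathlib import Path

# Keys used in the example entry above (an assumption, not an official schema).
EXPECTED_KEYS = {
    "base_model", "dataset_path", "lora_r", "quant_bits",
    "learning_rate", "num_epochs", "batch_size", "gradient_accumulation_steps",
}


def check_config_entry(entry: dict, run_dir: str = "lora") -> list:
    """Return a list of human-readable problems found in one config entry."""
    problems = []
    missing = EXPECTED_KEYS - entry.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    dataset = entry.get("dataset_path")
    # Resolve the dataset path relative to the directory training is launched from.
    if dataset and not (Path(run_dir) / dataset).exists():
        problems.append(f"dataset not found: {Path(run_dir) / dataset}")
    return problems


example_entry = {
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "dataset_path": "../data/train/finai_train.jsonl",
    "lora_r": 8,
    "quant_bits": 8,
    "learning_rate": 1e-4,
    "num_epochs": 4,
    "batch_size": 2,
    "gradient_accumulation_steps": 2,
}
for problem in check_config_entry(example_entry) or ["no problems found"]:
    print(problem)
```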


## 4. Run Fine-Tuning

Training time depends on your dataset size and GPU setup.
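
For a rough sense of run length, the number of optimizer steps can be estimated from the config values. This is a sketch under assumptions: it treats `batch_size` as the per-GPU micro-batch size, and training frameworks differ in how they round the last partial batch, so the result is an approximation. The function name and the example count of 800 training examples are hypothetical.

```python
import math


def optimizer_steps(n_examples, batch_size, grad_accum_steps, num_epochs, n_gpus=1):
    """Return (effective batch size, approximate total optimizer steps)."""
    effective_batch = batch_size * grad_accum_steps * n_gpus
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    return effective_batch, steps_per_epoch * num_epochs


# Values from the example config: batch_size=2, gradient_accumulation_steps=2, num_epochs=4.
effective, total = optimizer_steps(
    n_examples=800, batch_size=2, grad_accum_steps=2, num_epochs=4
)
print(f"Effective batch size: {effective}, approximate optimizer steps: {total}")
```

Multiplying the step count by the seconds per step reported in the first few log lines gives a usable estimate of total training time.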


In [None]:
# Fetch DeepSpeed configs and run fine-tuning
%cd lora
!axolotl fetch deepspeed_configs

# Run the fine-tuning
# The argument must match the key added to finetune_configs.json in the previous step
!python finetune.py finai_llama_3_1_8b_8bits_r8_lora


## 5. Load Adapter & Run Inference

Once fine-tuning is complete, load the adapter and run inference as follows.


In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

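# Base model used for fine-tuning
base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
# Path to your adapter; the folder name should match your config key (adjust if your output directory differs)
adapter_path = "axolotl-output/finai_llama_3_1_8b_8bits_r8_lora"

# Load base model and tokenizer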
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Load and apply the LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_path)

# Test with sample questions
test_questions = [
    "What is the primary purpose of a cash flow hedge under IFRS?",
    "Explain the concept of economic value added (EVA).",
    "How do you calculate the price-to-earnings ratio?"
]

for question in test_questions:
    print(f"\nQuestion: {question}")
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    
    with torch.no_grad():
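        outputs = model.generate(
            **inputs,
            max_new_tokens=256,  # cap the answer length explicitly; adjust as needed
            pad_token_id=tokenizer.eos_token_id
        )
    
    # Decode only the newly generated tokens so the prompt is not echoed back
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    print(f"Answer: {response.strip()}")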
    print("-" * 80)
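
To score the adapter on the held-out split from Section 2, model answers can be compared against the `target` fields. The following is a minimal sketch, not the contest's official metric: exact match after lowercasing and whitespace normalization is an assumption that suits short answers such as multiple-choice letters or sentiment labels, and all helper names are hypothetical.

```python
import json


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions, targets) -> float:
    """Fraction of predictions equal to their target after normalization."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(normalize(p) == normalize(t) for p, t in zip(predictions, targets))
    return hits / len(targets)


# Toy example; in practice, predictions come from model.generate on each "context".
print(exact_match_accuracy(["Positive ", "neutral"], ["positive", "negative"]))

# Usage with the test split (path is relative to the FinLoRA root; use ../data/... from lora/):
# records = load_jsonl("data/test/finai_test.jsonl")
# targets = [r["target"] for r in records]
```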
