#### ============================================================
#### COLAB NOTEBOOK: Build Dataset (FiQA + Policy Q&A) with TinyLlama LoRA Fine-tuning
#### Personal Finance Education Assistant (Personal finance only)
#### =============================================================
#### This notebook:
#### 1) Downloads FiQA from Hugging Face
#### 2) Builds your policy_qa.jsonl (starter examples included)
#### 3) Converts both into a single instruction dataset
#### 4) Shuffles + splits into train/valid/test
#### 5) Saves JSONL files ready for LoRA fine-tuning
#### - LoRA adapter checkpoints
#### - An experiments results table
#### - Baseline vs fine-tuned comparisons
#### - Gradio chat UI
#### ===========================================

In [1]:
!nvidia-smi
!python --version

Sun Feb 22 20:21:01 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.07              Driver Version: 580.82.07      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   48C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+


In [4]:
!pip -q install -U \
  "transformers==4.45.2" \
  "accelerate==0.34.2" \
  "datasets==2.20.0" \
  "peft==0.13.2" \
  "evaluate==0.4.2" \
  "rouge-score==0.1.2" \
  "nltk==3.9.1" \
  "gradio==4.44.1"


In [2]:
import os, json, random, math, time
import torch
from datasets import load_dataset, Dataset

random.seed(42)
torch.manual_seed(42)

print("torch:", torch.__version__, "cuda:", torch.cuda.is_available())

torch: 2.10.0+cu128 cuda: True


In [3]:
import random

random.seed(42)

# Policy disclaimer prepended to every answer
POLICY_PREFIX = (
    "Note: I provide general financial education, not personalized financial advice. "
    "I don’t recommend specific stocks/crypto or predict prices. "
    "Please don’t share sensitive financial details.\n\n"
)

base_qa = [
    ("What is compound interest?",
     "Compound interest is when you earn interest on both your original amount and the interest already added. Over time it can grow savings faster, and it can also make debt grow if you carry balances."),
    ("What is an emergency fund?",
     "An emergency fund is money set aside for unexpected expenses like urgent repairs, medical costs, or income loss. Many educators suggest building toward 3–6 months of essential expenses."),
    ("How do I start budgeting?",
     "Start by listing your monthly income, then your essential expenses (rent, food, transport). Set a realistic savings goal, track spending, and adjust categories until your spending fits your income."),
    ("What is the 50/30/20 rule?",
     "It’s a budgeting guideline: about 50% for needs, 30% for wants, and 20% for savings or debt repayment. You can adjust it based on your situation."),
    ("What is the difference between saving and investing?",
     "Saving prioritizes safety and access, usually with lower returns. Investing aims for long-term growth but involves risk and ups and downs."),
    ("How can I pay off debt faster?",
     "Common approaches include paying more than the minimum, prioritizing high-interest debt first (avalanche), or paying smallest balances first for motivation (snowball). Reducing spending and avoiding new debt helps."),
    ("What’s the difference between interest rate and APR?",
     "The interest rate is the borrowing cost on the principal. APR often includes some fees and shows the broader yearly cost, which helps compare loans."),
    ("How do I avoid money scams?",
     "Be cautious of guaranteed returns, urgent pressure, secrecy, and requests for private info. Verify identities through official channels and avoid sending money to unknown parties."),
    ("Is day trading safe?",
     "Day trading is high-risk because prices move fast, fees add up, and emotions can drive poor decisions. Many people lose money, especially using leverage."),
    ("Should I put all my savings into crypto?",
     "Putting all savings into a single high-volatility asset increases risk. Crypto is considered high-risk and prices can move sharply. Consider diversification and keeping an emergency fund.")
]

# Create paraphrases to scale data
question_variants = [
    "Explain:", "Help me understand:", "In simple terms,", "Teach me:", "What does this mean:",
    "Can you explain:", "Quickly define:", "I’m confused about:"
]

def make_example(q, a):
    q2 = random.choice(question_variants) + " " + q
    # Apply policy style to every response
    a2 = POLICY_PREFIX + a
    return {"instruction": q2, "response": a2, "source": "local_finance"}

# Generate ~1200 examples by sampling with light variation
local_rows = []
for _ in range(1200):
    q, a = random.choice(base_qa)
    local_rows.append(make_example(q, a))

print("Local dataset size:", len(local_rows))
print(local_rows[0])

Local dataset size: 1200
{'instruction': 'Explain: What is an emergency fund?', 'response': 'Note: I provide general financial education, not personalized financial advice. I don’t recommend specific stocks/crypto or predict prices. Please don’t share sensitive financial details.\n\nAn emergency fund is money set aside for unexpected expenses like urgent repairs, medical costs, or income loss. Many educators suggest building toward 3–6 months of essential expenses.', 'source': 'local_finance'}
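
Each of the 1,200 rows pairs one of 10 base answers with one of 8 prefixes, so the pool holds at most 80 distinct rows and a random split will place copies of the same row in train, validation, and test. A minimal sketch for measuring that, assuming rows are dicts with `instruction` and `response` keys as above; `dedupe_rows` and `overlap_fraction` are hypothetical helper names, not part of the notebook.

```python
def dedupe_rows(rows):
    """Keep the first occurrence of each (instruction, response) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["instruction"], row["response"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def overlap_fraction(eval_rows, train_rows):
    """Fraction of eval rows whose exact text also appears in the train rows."""
    if not eval_rows:
        return 0.0
    train_keys = {(r["instruction"], r["response"]) for r in train_rows}
    hits = sum((r["instruction"], r["response"]) in train_keys for r in eval_rows)
    return hits / len(eval_rows)


# Tiny illustrative input (made up for this sketch, not notebook data)
demo = [
    {"instruction": "Explain: A", "response": "a"},
    {"instruction": "Explain: A", "response": "a"},
    {"instruction": "Teach me: B", "response": "b"},
]
print(len(dedupe_rows(demo)))
print(overlap_fraction(demo[1:], demo[:1]))
```

Running `overlap_fraction(val_rows, train_rows)` after the split cell would show how much of the validation set the model has already seen during training, which matters when reading the eval loss and perplexity reported later.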


In [4]:
policy_examples = [
  {"instruction":"Which stock will double in price this year?","response":POLICY_PREFIX + "I can’t predict market movements or recommend specific stocks. Markets are uncertain. If you want, I can explain diversification, risk, and long-term investing principles.", "source":"policy"},
  {"instruction":"Tell me which crypto token will pump next week.","response":POLICY_PREFIX + "I can’t predict market movements or recommend specific crypto tokens. Crypto is volatile and risky. I can explain risk management and how to evaluate claims critically.", "source":"policy"},
  {"instruction":"Can I paste my bank statement here for you to check?","response":POLICY_PREFIX + "Please don’t share sensitive data like account numbers, full statements, or PINs. If you summarize categories (income, rent, food) without identifiers, I can give general budgeting tips.", "source":"policy"},
  {"instruction":"How do I hide money from authorities?","response":POLICY_PREFIX + "I can’t help with illegal activity like fraud, money laundering, or tax evasion. If you need legal guidance, consult a qualified professional.", "source":"policy"},
  {"instruction":"Guarantee me a plan that makes profit.","response":POLICY_PREFIX + "No financial outcome can be guaranteed. All investing and business decisions involve risk and uncertainty. I can explain general principles and trade-offs.", "source":"policy"},
]

all_rows = local_rows + policy_examples
random.shuffle(all_rows)

print("Total rows:", len(all_rows))

Total rows: 1205


In [5]:
def split(rows, tr=0.9, va=0.05):
    n = len(rows)
    ntr = int(n*tr)
    nva = int(n*va)
    return rows[:ntr], rows[ntr:ntr+nva], rows[ntr+nva:]

train_rows, val_rows, test_rows = split(all_rows)

print("Train/Val/Test:", len(train_rows), len(val_rows), len(test_rows))
print("Policy in val:", sum(1 for r in val_rows if r["source"]=="policy"))
print("Policy in test:", sum(1 for r in test_rows if r["source"]=="policy"))

Train/Val/Test: 1084 60 61
Policy in val: 0
Policy in test: 1
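
With only five policy rows, a purely random 90/5/5 split can leave a split without any refusal examples, as the `Policy in val: 0` line shows. A minimal sketch of a per-source split that reserves at least one row of each source for validation and test whenever that source has three or more rows; `stratified_split` is a hypothetical helper, and the at-least-one rule is an assumption of this sketch rather than the notebook's method.

```python
import random
from collections import defaultdict


def stratified_split(rows, tr=0.9, va=0.05, key="source", seed=42):
    """Split rows into train/val/test separately for each value of `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    train, val, test = [], [], []
    for name in sorted(groups):
        group = groups[name][:]
        rng.shuffle(group)
        n = len(group)
        if n >= 3:
            # Reserve at least one row per split for small groups
            n_val = max(1, round(n * va))
            n_test = max(1, round(n * (1 - tr - va)))
        else:
            n_val = n_test = 0
        n_train = n - n_val - n_test
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]

    # Shuffle again so sources are interleaved within each split
    for part in (train, val, test):
        rng.shuffle(part)
    return train, val, test
```

Called as `train_rows, val_rows, test_rows = stratified_split(all_rows)`, this would put one policy row in validation and one in test for the 1,205-row dataset above, at the cost of leaving only three policy rows for training.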


In [6]:
PROMPT_TEMPLATE = """### Instruction:
{instruction}

### Response:
"""

def to_text(example):
    return {"text": PROMPT_TEMPLATE.format(instruction=example["instruction"]) + example["response"]}

train_ds = Dataset.from_list(train_rows).map(to_text, remove_columns=["instruction","response","source"])
val_ds   = Dataset.from_list(val_rows).map(to_text, remove_columns=["instruction","response","source"])
test_ds  = Dataset.from_list(test_rows).map(to_text, remove_columns=["instruction","response","source"])

train_ds[0]

Map:   0%|          | 0/1084 [00:00<?, ? examples/s]

Map:   0%|          | 0/60 [00:00<?, ? examples/s]

Map:   0%|          | 0/61 [00:00<?, ? examples/s]

{'text': '### Instruction:\nWhat does this mean: Should I put all my savings into crypto?\n\n### Response:\nNote: I provide general financial education, not personalized financial advice. I don’t recommend specific stocks/crypto or predict prices. Please don’t share sensitive financial details.\n\nPutting all savings into a single high-volatility asset increases risk. Crypto is considered high-risk and prices can move sharply. Consider diversification and keeping an emergency fund.'}

In [7]:
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.config.use_cache = False

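def apply_lora(r, alpha, dropout):
    # Load a fresh base model per call: get_peft_model injects the LoRA layers
    # into the model it receives, so reusing one instance would carry adapters
    # from earlier experiments into later ones.
    base = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    base.config.use_cache = False
    cfg = LoraConfig(
        r=r, lora_alpha=alpha, lora_dropout=dropout,
        bias="none", task_type="CAUSAL_LM",
        target_modules=["q_proj","k_proj","v_proj","o_proj"]
    )
    return get_peft_model(base, cfg)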

print(" Base model loaded")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/608 [00:00<?, ?B/s]

tokenizer_config.json: 0.00B [00:00, ?B/s]



tokenizer.json: 0.00B [00:00, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/551 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors:   0%|          | 0.00/2.20G [00:00<?, ?B/s]

Loading weights:   0%|          | 0/201 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

 Base model loaded


In [8]:
import torch, math, time
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments, Trainer, DataCollatorForLanguageModeling
from peft import LoraConfig, get_peft_model
from datasets import Dataset

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
MAX_LEN = 256

# Take a small subset for speed
train_small = train_rows[:500]
val_small = val_rows[:80]

PROMPT_TEMPLATE = """### Instruction:
{instruction}

### Response:
"""

def to_text(example):
    return {"text": PROMPT_TEMPLATE.format(instruction=example["instruction"]) + example["response"]}

train_ds = Dataset.from_list(train_small).map(to_text, remove_columns=["instruction","response","source"])
val_ds   = Dataset.from_list(val_small).map(to_text, remove_columns=["instruction","response","source"])

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",   # puts it on GPU if available
)
base_model.config.use_cache = False

lora_cfg = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    bias="none", task_type="CAUSAL_LM",
    target_modules=["q_proj","k_proj","v_proj","o_proj"]
)
model = get_peft_model(base_model, lora_cfg)

def tok(batch):
    return tokenizer(batch["text"], truncation=True, max_length=MAX_LEN, padding="max_length")

train_tok = train_ds.map(tok, batched=True, remove_columns=["text"])
val_tok   = val_ds.map(tok, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="outputs_fast/exp1",
    num_train_epochs=1,
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    per_device_eval_batch_size=1,
    logging_steps=20,
    save_steps=10_000_000,   # effectively "save at end"
    save_total_limit=1,
    report_to="none",
    fp16=True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_tok,
    eval_dataset=val_tok,
    data_collator=collator
)

t0 = time.time()
trainer.train()
ev = trainer.evaluate()
t1 = time.time()

eval_loss = float(ev.get("eval_loss", float("nan")))
ppl = math.exp(eval_loss) if not math.isnan(eval_loss) else float("nan")

print("Train minutes:", round((t1-t0)/60, 2))
print("Eval loss:", eval_loss)
print("Perplexity:", ppl)

model.save_pretrained("outputs_fast/exp1/adapter")
tokenizer.save_pretrained("outputs_fast/exp1/tokenizer")
print("Saved adapter")

Map:   0%|          | 0/500 [00:00<?, ? examples/s]

Map:   0%|          | 0/60 [00:00<?, ? examples/s]

Loading weights:   0%|          | 0/201 [00:00<?, ?it/s]

Map:   0%|          | 0/500 [00:00<?, ? examples/s]

Map:   0%|          | 0/60 [00:00<?, ? examples/s]

Step,Training Loss
20,1.936563
40,0.795495
60,0.476924


Train minutes: 1.51
Eval loss: 0.36716029047966003
Perplexity: 1.4436293006036027
Saved adapter
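
The perplexity printed above is the exponential of the mean per-token cross-entropy that `trainer.evaluate()` returns as `eval_loss`. A short check of that arithmetic, using the loss value from the output above; the `perplexity` function name is an assumption of this sketch.

```python
import math


def perplexity(eval_loss):
    """Perplexity = exp(mean cross-entropy in nats); NaN is passed through."""
    if math.isnan(eval_loss):
        return float("nan")
    return math.exp(eval_loss)


# Loss value copied from the training output above
print(perplexity(0.36716029047966003))
```

A perplexity of 1.0 would mean the model assigns probability 1 to every reference token, so values this close to 1 are worth reading alongside how similar the validation rows are to the training rows.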


In [12]:
import pandas as pd

df = pd.DataFrame([{
    "exp": "exp1_r8_lr2e-4",
    "lr": 2e-4,
    "r": 8,
    "alpha": 16,
    "dropout": 0.05,
    "epochs": 1,
    "max_len": 256,
    "train_examples": 500,
    "val_examples": 60,  # use what you actually used
    "eval_loss": 0.36716029047966003,
    "perplexity": 1.4436293006036027,
    "train_minutes": 1.51
}])

df.to_csv("outputs_fast/experiments_table.csv", index=False)
df

Unnamed: 0,exp,lr,r,alpha,dropout,epochs,max_len,train_examples,val_examples,eval_loss,perplexity,train_minutes
0,exp1_r8_lr2e-4,0.0002,8,16,0.05,1,256,500,60,0.36716,1.443629,1.51


In [9]:
from transformers import DataCollatorForLanguageModeling

MAX_LEN = 512

def tok(batch):
    return tokenizer(batch["text"], truncation=True, max_length=MAX_LEN, padding="max_length")

train_tok = train_ds.map(tok, batched=True, remove_columns=["text"])
val_tok   = val_ds.map(tok, batched=True, remove_columns=["text"])
test_tok  = test_ds.map(tok, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
print(" Tokenized")

Map:   0%|          | 0/1084 [00:00<?, ? examples/s]

Map:   0%|          | 0/60 [00:00<?, ? examples/s]

Map:   0%|          | 0/61 [00:00<?, ? examples/s]

 Tokenized


In [None]:
from transformers import TrainingArguments, Trainer

experiments = [
    {"name":"exp1_r8_lr2e-4",  "r":8,  "alpha":16, "dropout":0.05, "lr":2e-4, "epochs":1},
    {"name":"exp2_r16_lr1e-4", "r":16, "alpha":32, "dropout":0.05, "lr":1e-4, "epochs":1},
    {"name":"exp3_r16_lr5e-5", "r":16, "alpha":32, "dropout":0.05, "lr":5e-5, "epochs":2},
]

results = []

for e in experiments:
    print("\n====================")
    print("Running", e["name"])

    m = apply_lora(e["r"], e["alpha"], e["dropout"])

    args = TrainingArguments(
        output_dir=f"outputs_final/{e['name']}",
        num_train_epochs=e["epochs"],
        learning_rate=e["lr"],
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        per_device_eval_batch_size=1,
        save_steps=500,
        logging_steps=50,
        save_total_limit=1,
        report_to="none",
        fp16=True,
        seed=42,
    )

    trainer = Trainer(
        model=m,
        args=args,
        train_dataset=train_tok,
        eval_dataset=val_tok,
        data_collator=collator,
    )

    t0 = time.time()
    trainer.train()
    t1 = time.time()

    ev = trainer.evaluate()
    eval_loss = float(ev.get("eval_loss", float("nan")))
    ppl = math.exp(eval_loss) if not math.isnan(eval_loss) else float("nan")

    m.save_pretrained(f"outputs_final/{e['name']}/adapter")
    tokenizer.save_pretrained(f"outputs_final/{e['name']}/tokenizer")

    results.append({
        "exp": e["name"],
        "lr": e["lr"],
        "r": e["r"],
        "alpha": e["alpha"],
        "dropout": e["dropout"],
        "epochs": e["epochs"],
        "eval_loss": eval_loss,
        "perplexity": ppl,
        "train_time_min": round((t1-t0)/60, 1)
    })

import pandas as pd
df = pd.DataFrame(results).sort_values("perplexity")
df


Running exp1_r8_lr2e-4




In [9]:
best = df.iloc[0]["exp"]
best_dir = f"outputs_final/{best}"
print("Best:", best_dir)

from peft import PeftModel
from transformers import AutoModelForCausalLM, pipeline

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)
base_model.config.use_cache = False
ft_model = PeftModel.from_pretrained(base_model, f"{best_dir}/adapter")
ft_model.eval()

gen_ft = pipeline("text-generation", model=ft_model, tokenizer=tokenizer, max_new_tokens=120, do_sample=False)

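# NOTE: quick_generate, metrics, and base_metrics are not defined anywhere in
# this notebook; they must be defined in an earlier cell before this one runs.
ft_preds, ft_refs = quick_generate(gen_ft, val_rows, n=30)
ft_metrics = metrics(ft_preds, ft_refs)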

print("Baseline:", base_metrics)
print("Fine-tuned:", ft_metrics)

NameError: name 'df' is not defined
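
The cell above calls `quick_generate` and `metrics`, which do not appear anywhere in the notebook. A minimal sketch of what such helpers could look like, assuming `gen` behaves like a Hugging Face `text-generation` pipeline (returns `[{"generated_text": ...}]` with the prompt included) and rows are dicts with `instruction` and `response` keys. Both functions are hypothetical stand-ins, and the unigram-overlap F1 is a simple substitute for ROUGE, not the author's metric.

```python
import re
from collections import Counter

PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def quick_generate(gen, rows, n=30):
    """Generate answers for the first n rows; return (predictions, references)."""
    preds, refs = [], []
    for row in rows[:n]:
        prompt = PROMPT_TEMPLATE.format(instruction=row["instruction"])
        out = gen(prompt)[0]["generated_text"]
        # The output includes the prompt by default; keep only the continuation
        if out.startswith(prompt):
            out = out[len(prompt):]
        preds.append(out.strip())
        refs.append(row["response"].strip())
    return preds, refs


def _tokens(text):
    return re.findall(r"\w+", text.lower())


def metrics(preds, refs):
    """Mean unigram-overlap F1 and exact-match rate over paired texts."""
    f1_scores, exact = [], 0
    for pred, ref in zip(preds, refs):
        pred_counts, ref_counts = Counter(_tokens(pred)), Counter(_tokens(ref))
        overlap = sum((pred_counts & ref_counts).values())
        if overlap == 0:
            f1_scores.append(0.0)
        else:
            precision = overlap / sum(pred_counts.values())
            recall = overlap / sum(ref_counts.values())
            f1_scores.append(2 * precision * recall / (precision + recall))
        exact += int(pred == ref)
    n = max(len(f1_scores), 1)
    return {"unigram_f1": sum(f1_scores) / n, "exact_match": exact / n}
```

Under these assumptions, `base_metrics` would come from running the same two calls with a pipeline built on the base model without the adapter, so that both models are scored on identical rows.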

In [11]:
df.to_csv("outputs_final/experiments_table.csv", index=False)
print("Saved: outputs_final/experiments_table.csv")

NameError: name 'df' is not defined

In [19]:
import os

os.makedirs("outputs_fast/exp1/adapter", exist_ok=True)
os.makedirs("outputs_fast/exp1/tokenizer", exist_ok=True)

model.save_pretrained("outputs_fast/exp1/adapter")
tokenizer.save_pretrained("outputs_fast/exp1/tokenizer")

print("Saved adapter files:")
print(os.listdir("outputs_fast/exp1/adapter"))

Saved adapter files:
['adapter_model.safetensors', 'adapter_config.json', 'README.md']


In [23]:
import torch
import gradio as gr
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
ADAPTER_DIR = "outputs_fast/exp1/adapter"
TOKENIZER_DIR = "outputs_fast/exp1/tokenizer"

PROMPT_TEMPLATE = """### Instruction:
{instruction}

### Response:
"""

SYSTEM_NOTE = (
    "🧠 Finance Learning Assistant (Educational Only)\n"
    "- I share general financial education, not personalized financial advice.\n"
    "- I don’t recommend specific stocks/crypto or predict prices.\n"
    "- Please don’t share sensitive financial details (bank/card/PIN).\n"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_DIR, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id

# Load base model + LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)
ft_model = PeftModel.from_pretrained(base_model, ADAPTER_DIR)
ft_model.eval()

def generate_text(prompt: str, max_new_tokens: int = 350):
    inputs = tokenizer(prompt, return_tensors="pt")
    inputs = {k: v.to(ft_model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = ft_model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.15,
            no_repeat_ngram_size=3,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

def chat(q):
    try:
        q = (q or "").strip()
        if not q:
            return SYSTEM_NOTE + "\nPlease type a question 🙂"

        prompt = PROMPT_TEMPLATE.format(instruction=q)
        full = generate_text(prompt, max_new_tokens=350)

        marker = "### Response:"
        if marker in full:
            answer = full.split(marker, 1)[1].strip()
        else:
            answer = full.strip()

        # Cut off if model starts a new training example
        if "### Instruction:" in answer:
            answer = answer.split("### Instruction:", 1)[0].strip()

        # Remove repeated policy notes inside the answer (UI already shows it once)
        if "Note: I provide general financial education" in answer:
            # keep content after the note line if possible
            # (often the note is the first paragraph)
            chunks = answer.split("\n\n", 1)
            if len(chunks) == 2 and chunks[0].startswith("Note:"):
                answer = chunks[1].strip()

        if not answer:
            answer = "I generated an empty response. Try again or increase max_new_tokens."

        return SYSTEM_NOTE + "\n\n" + answer + "\n\n✅ Want more practice? Ask me another finance question!"

    except Exception as e:
        return SYSTEM_NOTE + f"\n⚠️ Error while generating response: {type(e).__name__}: {e}"

examples = [
    "What is compound interest?",
    "How do I start budgeting if my income is irregular?",
    "How do I avoid money scams?",
    "Is day trading a safe way to make quick money?",
    "Which stock will double this year?",
    "Which crypto will pump next week?",
    "Can I paste my bank statement for you to analyze?",
]

demo = gr.Interface(
    fn=chat,
    inputs=gr.Textbox(
        lines=4,
        placeholder="Type your finance question here...",
        label="Your Question"
    ),
    outputs=gr.Textbox(
        lines=18,
        max_lines=25,
        show_copy_button=True,
        label="Assistant Response"
    ),
    title="💸 Finance Education Assistant (TinyLlama + LoRA)",
    description="End-to-end: Finance Q&A + policy rules → LoRA fine-tuning → evaluation → demo UI.",
    examples=[[e] for e in examples],
    allow_flagging="manual"
)

demo.launch(share=True, debug=True)

Loading weights:   0%|          | 0/201 [00:00<?, ?it/s]



Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://2a8cc1a00a224342ba.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


Created dataset file at: .gradio/flagged/dataset2.csv
Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7863 <> https://2a8cc1a00a224342ba.gradio.live




In [24]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)
base_model.eval()

base_tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if base_tokenizer.pad_token is None:
    base_tokenizer.pad_token = base_tokenizer.eos_token

def generate_base(prompt):
    inputs = base_tokenizer(prompt, return_tensors="pt")
    inputs = {k: v.to(base_model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = base_model.generate(
            **inputs,
            max_new_tokens=220,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            pad_token_id=base_tokenizer.eos_token_id
        )

    return base_tokenizer.decode(outputs[0], skip_special_tokens=True)

Loading weights:   0%|          | 0/201 [00:00<?, ?it/s]

In [25]:
PROMPT_TEMPLATE = """### Instruction:
{instruction}

### Response:
"""

questions = [
    "What is compound interest?",
    "Which stock will double this year?",
    "Is day trading a safe way to make quick money?"
]

for q in questions:
    print("\n==============================")
    print("QUESTION:", q)

    prompt = PROMPT_TEMPLATE.format(instruction=q)

    print("\n--- BASE MODEL ---")
    print(generate_base(prompt))

    print("\n--- FINE-TUNED MODEL ---")
    print(generate_text(prompt, max_new_tokens=350))


QUESTION: What is compound interest?

--- BASE MODEL ---
### Instruction:
What is compound interest?

### Response:
Compound interest is the interest earned on an investment over a certain time period, which is compounded daily or monthly. In other words, the interest earned on an investment is the amount of interest earned over a period of time.

--- FINE-TUNED MODEL ---
### Instruction:
What is compound interest?

### Response:
Note: I provide general financial education, not personalized financial advice. I don’t recommend specific stocks/crypto or predict prices. Please don’ t share sensitive financial details.

Compound interest refers to when your principal grows faster than your interest rate. It works by adding an annual percentage rate (APR) to both the original amount and then the balloon payment. This can make it more expensive over time, especially if you borrow a lot.
People often say they want fixed rates, but this can actually increase their monthly payments since banks