In [1]:
%%capture
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes

In [2]:
from unsloth import FastLanguageModel
import torch
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


# Configuration

In [3]:
max_seq_length = 2048
dtype = None
load_in_4bit = True

# Load Llama-3 Model

In [4]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

==((====))==  Unsloth 2025.12.10: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.5.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.33.post2. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


# LoRA adapters

In [5]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)

Unsloth 2025.12.10 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
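
The adapter size follows from the rank and the shapes of the targeted projections: a LoRA pair on a linear layer with `d_in` inputs and `d_out` outputs adds `r * (d_in + d_out)` weights. The sketch below works through that arithmetic. The layer shapes are an assumption taken from the published Llama-3-8B configuration (hidden size 4096, MLP size 14336, key/value projection width 1024); they are not stated anywhere in this notebook.

```python
# Assumed Llama-3-8B dimensions (not stated in this notebook).
HIDDEN, KV_DIM, MLP_DIM, NUM_LAYERS = 4096, 1024, 14336, 32

# (in_features, out_features) for each module listed in target_modules.
target_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_param_count(rank, shapes, num_layers):
    """Count LoRA weights: each module gets A (rank x d_in) and B (d_out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


trainable = lora_param_count(rank=16, shapes=target_shapes, num_layers=NUM_LAYERS)
print(f"{trainable:,}")  # 41,943,040
```

Under these assumed shapes the result equals the `Trainable parameters = 41,943,040` figure that Unsloth prints when training starts.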


# Load & Format Dataset

In [6]:
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token  # appended so the model learns where a response ends
def formatting_prompts_func(examples):
    # Build one Alpaca-style training string per example in the batch.
    instructions = examples["instruction"]
    inputs       = examples["input"]
    outputs      = examples["output"]
    texts = []
    # `context` avoids shadowing the built-in `input`.
    for instruction, context, output in zip(instructions, inputs, outputs):
        text = alpaca_prompt.format(instruction, context, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }

dataset = load_dataset("gbharti/finance-alpaca", split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True)
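
With `batched = True`, `map` hands the function a dict of columns (lists) rather than one row at a time, and expects a dict of lists back. The sketch below mimics that contract on a hand-written batch so the formatting can be checked without downloading the dataset. The two rows are made up for illustration, and the EOS string is an assumption based on the `<|end_of_text|>` token that appears in the decoded output in the Inference section.

```python
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = "<|end_of_text|>"  # assumption: stands in for tokenizer.eos_token


def format_batch(examples):
    """Turn a dict of columns into a dict with one formatted 'text' per row."""
    rows = zip(examples["instruction"], examples["input"], examples["output"])
    texts = [
        ALPACA_PROMPT.format(instruction, context, output) + EOS_TOKEN
        for instruction, context, output in rows
    ]
    return {"text": texts}


# Hypothetical rows in the same column layout as the dataset.
toy_batch = {
    "instruction": ["Summarize the headline.", "What is a bond?"],
    "input": ["Operating profit rose.", ""],
    "output": ["Profit increased.", "A bond is a debt security."],
}

formatted = format_batch(toy_batch)
print(formatted["text"][1])
```

Rows with an empty `input` still get the `### Input:` header followed by a blank line, which is also what the fine-tuned model sees at inference time.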

In [49]:
sample = dataset[100]

print(f"INSTRUCTION: {sample['instruction']}")
print(f"INPUT:       {sample['input']}")
print(f"OUTPUT:      {sample['output']}")
print(f"Formatted Text: {sample['text']}")

INSTRUCTION: Can a credit card company raise my rates for making a large payment?
INPUT:       
OUTPUT:      No. Credit card companies will typically not care about your individual credit card account. Instead they look either at a "package" of card accounts opened at roughly the same time, or of "slices" of cardholder accounts by credit rating. If an entire package's or slice's balance drops significantly, they'll take a look, and will adjust rates accordingly (often they may actually decrease rates as an incentive to increase you use of the card). Because credit card debt is unstructured debt, the bank cannot impose an "early payment penalty" of any kind (there's no schedule for paying it off, so there's no way to prove that they're missing out on $X in interest because you paid early). Generally, banks don't like CC debt anyway; it's very risky debt, and they often end up writing large balances off for pennies on the dollar. So, when you pay down your balance by a significant amount

In [50]:
for i in range(1,11):
    sample = dataset[i]

    print(f"SAMPLE {i}")
    print(f"INSTRUCTION: {sample['instruction']}")
    print(f"INPUT:       {sample['input']}")
    print(f"OUTPUT:      {sample['output']}")
    print(f"Formatted Text (First 100 chars): {sample['text'][:100]}...")
    print("-" * 80)

SAMPLE 1
INSTRUCTION: Why does it matter if a Central Bank has a negative rather than 0% interest rate?
INPUT:       
OUTPUT:      That is kind of the point, one of the hopes is that it incentivizes banks to stop storing money and start injecting it into the economy themselves. Compared to the European Central Bank investing directly into the economy the way the US central bank has been doing. (The Federal Reserve buying mortgage backed securities) On a country level, individual European countries have tried this before in recent times with no noticeable effect.
Formatted Text (First 100 chars): Below is an instruction that describes a task, paired with an input that provides further context. W...
--------------------------------------------------------------------------------
SAMPLE 2
INSTRUCTION: Where should I be investing my money?
INPUT:       
OUTPUT:      Pay off your debt.  As you witnessed, no "investment" % is guaranteed.  But your debt payments are... so if you have cash, th

# Model Training

In [7]:
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)

trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 68,912 | Num Epochs = 1 | Total steps = 60
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040 of 8,072,204,288 (0.52% trained)
Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
1,2.8967
2,2.7196
3,2.6699
4,2.5344
5,2.1299
6,2.3296
7,2.3357
8,2.0181
9,2.0867
10,1.9117
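
The training arguments fix how much data this run actually touches and how the learning rate moves over the 60 steps. The sketch below works through both. The schedule function is an assumption: it follows the usual linear warmup followed by linear decay to zero, which is how the `linear` scheduler is commonly defined, and it is not read back from the trainer.

```python
# Values copied from the TrainingArguments and the training banner above.
per_device_batch, grad_accum, num_gpus = 2, 4, 1
max_steps, warmup_steps, peak_lr = 60, 5, 2e-4
num_examples = 68_912

effective_batch = per_device_batch * grad_accum * num_gpus
examples_seen = effective_batch * max_steps
coverage = examples_seen / num_examples


def linear_lr(step, peak=peak_lr, warmup=warmup_steps, total=max_steps):
    """Learning rate after `step` optimizer updates (assumed warmup-then-decay shape)."""
    if step < warmup:
        return peak * step / warmup
    return peak * max(0.0, (total - step) / (total - warmup))


print(f"effective batch size: {effective_batch}")
print(f"examples seen: {examples_seen} ({coverage:.2%} of the dataset)")
for step in (0, 5, 30, 60):
    print(f"step {step:>2}: lr = {linear_lr(step):.2e}")
```

With 8 examples per optimizer step, 60 steps cover 480 of the 68,912 examples, so this run is a short demonstration rather than a full pass over the data.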


# Saving the trained model

In [41]:
model.save_pretrained("finetuned_model")
tokenizer.save_pretrained("finetuned_model")

('finetuned_model/tokenizer_config.json',
 'finetuned_model/special_tokens_map.json',
 'finetuned_model/tokenizer.json')

# Inference

In [None]:
from unsloth import FastLanguageModel

max_seq_length = 2048
dtype = None
load_in_4bit = True

model_name = "finetuned_model"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

FastLanguageModel.for_inference(model)

In [43]:
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

In [9]:
inputs = tokenizer(
[
    alpaca_prompt.format(
        "Analyze the sentiment of this headline.",
        "Operating profit rose to EUR 13.5 mn from EUR 10.5 mn in 2007.",
        "",
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)

In [16]:
resp = tokenizer.batch_decode(outputs)

In [18]:
resp

['<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nAnalyze the sentiment of this headline.\n\n### Input:\nOperating profit rose to EUR 13.5 mn from EUR 10.5 mn in 2007.\n\n### Response:\nPositive. The company is doing well.<|end_of_text|>']

In [25]:
resp = resp[0].split("Response:")[1].replace("<|end_of_text|>","")
print(resp)


Positive. The company is doing well.
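
Splitting on the bare word `Response:` works for this prompt, but it would cut in the wrong place if the instruction or input happened to contain that word, and it raises an unhelpful `IndexError` when the marker is missing. The sketch below is one possible stricter variant that matches the full `### Response:` header. The helper name `extract_response` is hypothetical and not part of the notebook or of any library.

```python
RESPONSE_MARKER = "### Response:"
SPECIAL_TOKENS = ("<|begin_of_text|>", "<|end_of_text|>")


def extract_response(decoded):
    """Return the text after the first '### Response:' header, without special tokens."""
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    if not found:
        raise ValueError("response marker not found in decoded text")
    for token in SPECIAL_TOKENS:
        tail = tail.replace(token, "")
    return tail.strip()


# Shortened copy of the decoded string shown above.
decoded = (
    "<|begin_of_text|>Below is an instruction that describes a task.\n\n"
    "### Instruction:\nAnalyze the sentiment of this headline.\n\n"
    "### Input:\nOperating profit rose to EUR 13.5 mn from EUR 10.5 mn in 2007.\n\n"
    "### Response:\nPositive. The company is doing well.<|end_of_text|>"
)
print(extract_response(decoded))
```

Everything after the first header is kept, so any extra text the model generates past its answer is still returned and may need trimming separately.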


In [28]:
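def inference(instruction, context):
    # Fill the Alpaca template, leaving the response slot empty for the model to complete.
    inputs = tokenizer(
        [alpaca_prompt.format(instruction, context, "")],
        return_tensors = "pt",
    ).to("cuda")

    outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
    resp = tokenizer.batch_decode(outputs)
    # Keep only the text after the response marker and drop the end-of-text token.
    resp = resp[0].split("Response:")[1].replace("<|end_of_text|>","").strip()
    return resp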

In [29]:
inference(
    "Analyze the sentiment of this headline.",
    "The company's net profit dropped by 22% to $1.2 million due to rising operational costs."
)

'The headline\'s sentiment is negative, as it indicates that the company\'s financial performance has declined. The negative sentiment is reinforced by the use of words such as "dropped" and "due to rising operational costs."'

In [30]:
inference(
    "Categorize this financial news.",
    "TechGiant Inc. announces the acquisition of AI startup DataFlow for $450 million to boost its cloud division."
)

'TechGiant Inc. announces the acquisition of AI startup DataFlow for $450 million to boost its cloud division. This is a significant move for TechGiant, as it looks to expand its presence in the rapidly growing cloud computing market. DataFlow is a leading provider of AI-powered data analytics solutions, and its acquisition'

In [31]:
inference(
    "What is the outlook described in this text?",
    "The board is currently reviewing the fiscal policy for 2025, with no immediate changes expected to the dividend payout."
)

'The outlook described in this text is that the board is currently reviewing the fiscal policy for 2025, with no immediate changes expected to the dividend payout.'

In [32]:
inference(
    "Analyze the sentiment.",
    "Shares of BioPharma soared 15% in pre-market trading following FDA approval of their new drug."
)

'The sentiment is positive. The FDA approval of the new drug is a positive development for the company.'

In [46]:
inference(
    "Analyze the financial sentiment.",
    "Although Q3 revenue grew 5% year-over-year, the company missed analyst estimates by $0.10 per share."
)

'The financial sentiment is negative, as the company missed analyst estimates.'
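
The responses above are free-form sentences, so comparing them against labelled data needs a step that maps each one to a single label. The sketch below is one simple way to do that with a keyword search. The helper `first_sentiment_label` and the three-label set are assumptions made for illustration; the notebook itself does not define an evaluation step.

```python
import re

# Assumed label set; the notebook does not fix one.
LABEL_PATTERN = re.compile(r"\b(positive|negative|neutral)\b")


def first_sentiment_label(response):
    """Return the earliest sentiment word in the response, or None if there is none."""
    match = LABEL_PATTERN.search(response.lower())
    return match.group(1) if match else None


# Responses copied from the cells above.
responses = [
    "Positive. The company is doing well.",
    "The sentiment is positive. The FDA approval of the new drug is a positive development for the company.",
    "The financial sentiment is negative, as the company missed analyst estimates.",
    "The outlook described in this text is that the board is currently reviewing the fiscal policy for 2025, with no immediate changes expected to the dividend payout.",
]

for text in responses:
    print(first_sentiment_label(text), "<-", text[:50])
```

A response with no sentiment word, such as the outlook answer, maps to `None`, which makes prompts that did not produce a classification easy to spot.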