# Install Necessary Dependencies

In [1]:
!pip install git+https://github.com/huggingface/transformers.git
!pip install -q transformers accelerate peft datasets
!pip install -U bitsandbytes trl

Collecting git+https://github.com/huggingface/transformers.git
  Cloning https://github.com/huggingface/transformers.git to /tmp/pip-req-build-r426tka8
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-req-build-r426tka8

  Resolved https://github.com/huggingface/transformers.git to commit ba29a439adbe6f371710d0514659127264ae24b3
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting filelock (from transformers==4.49.0.dev0)
  Downloading filelock-3.17.0-py3-none-any.whl.metadata (2.9 kB)
Collecting huggingface-hub<1.0,>=0.24.0 (from transformers==4.49.0.dev0)
  Downloading huggingface_hub-0.28.1-py3-none-any.whl.metadata (13 kB)
Collecting numpy>=1.17 (from transformers==4.49.0.dev0)
  Downloading numpy-2.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Collecting pyyaml

In [2]:
!pip install ipywidgets

Collecting ipywidgets
  Downloading ipywidgets-8.1.5-py3-none-any.whl.metadata (2.3 kB)
Collecting widgetsnbextension~=4.0.12 (from ipywidgets)
  Downloading widgetsnbextension-4.0.13-py3-none-any.whl.metadata (1.6 kB)
Collecting jupyterlab-widgets~=3.0.12 (from ipywidgets)
  Downloading jupyterlab_widgets-3.0.13-py3-none-any.whl.metadata (4.1 kB)
Downloading ipywidgets-8.1.5-py3-none-any.whl (139 kB)
Downloading jupyterlab_widgets-3.0.13-py3-none-any.whl (214 kB)
Downloading widgetsnbextension-4.0.13-py3-none-any.whl (2.3 MB)
Installing collected packages: widgetsnbextension, jupyterlab-widgets, ipywidgets
Successfully installed ipywidgets-8.1.5 jupyterlab-widgets-3.0.13 widgetsnbextension-4.0.13


# Imports

In [3]:
import os
import torch
import warnings
import transformers
from trl import SFTTrainer
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, TaskType, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from huggingface_hub import HfApi, HfFolder, Repository, notebook_login

## Log in to Hugging Face Using an Access Token

In [4]:
notebook_login()

# Load Your Fine-Tuned Model from the Hugging Face Hub

In [6]:
model_name = "sagarvk24/Meta-Llama-3.1-8B-Finance-FineTune-Sagar"

## Configure 4-bit Quantization to Reduce Memory Usage

In [7]:
# Define quantization configuration
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,  # Enable 4-bit quantization
    bnb_4bit_compute_dtype=torch.float16,  # Use FP16 for computation
    bnb_4bit_use_double_quant=True,  # Also quantize the quantization constants to save more memory
)
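To get a feel for why 4-bit loading matters here, the following is a rough back-of-the-envelope sketch of weight storage at different precisions. The parameter count of 8 billion is an assumption based on the "8B" in the model name, and the helper `estimate_weight_memory_gb` is a hypothetical function written for this illustration. It counts weights only, so actual usage will be higher: some layers are typically kept in higher precision, and activations, the KV cache, and quantization constants add overhead.

```python
def estimate_weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (10**9 bytes) at a given precision."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: roughly 8 billion parameters, taken from the "8B" in the model name.
NUM_PARAMS = 8e9

fp16_gb = estimate_weight_memory_gb(NUM_PARAMS, 16)
int4_gb = estimate_weight_memory_gb(NUM_PARAMS, 4)

print(f"FP16 weights : ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{int4_gb:.1f} GB")
```

Under these assumptions the weights shrink to about a quarter of their FP16 size, which is what can make an 8B model fit on a single consumer or free-tier GPU.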

In [8]:
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model with quantization
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto"  # Let Accelerate place the model's layers on the available device(s)
)

# Run Inference

In [None]:
input_text = "What are the key financial trends in 2025?"
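# Move the inputs to whichever device the model was placed on
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).to(model.device)
# temperature and top_p only apply when sampling is enabled, so set do_sample explicitly
output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7, top_p=0.9)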
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True).strip()

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


In [13]:
print(decoded_output)

What are the key financial trends in 2025? 2025 is expected to be a year of economic growth and recovery for the US and global economies. The Federal Reserve will likely continue to raise interest rates to manage inflation and maintain economic stability. The European Union is expected to face significant economic challenges due


In [14]:
# Test inference
input_text = "Tell me about SIP in easy manner."
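# Move the inputs to whichever device the model was placed on
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).to(model.device)
# temperature and top_p only apply when sampling is enabled, so set do_sample explicitly
output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7, top_p=0.9)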
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True).strip()

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


In [15]:
print(decoded_output)

Tell me about SIP in easy manner. SIP is a way of buying a small amount of shares on a regular basis. This is a long-term investment strategy. SIP is a way of buying a small amount of shares on a regular basis. This is a long-term investment strategy. SIP stands


In [16]:
# Test inference
input_text = "What are mutual funds? Are they safe?"
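# Move the inputs to whichever device the model was placed on
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).to(model.device)
# temperature and top_p only apply when sampling is enabled, so set do_sample explicitly
output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7, top_p=0.9)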
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True).strip()

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


In [17]:
print(decoded_output)

What are mutual funds? Are they safe? Can I make money with them?
Mutual funds are a type of investment that pools money from many investors to invest in a variety of assets such as stocks, bonds, or other securities. They are managed by professional fund managers who aim to generate returns
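Each printed result above begins with the prompt itself, because a causal language model returns the prompt tokens followed by the continuation. If only the answer is wanted, one simple option is to strip the echoed prompt from the decoded string. The helper `strip_prompt` below is a hypothetical name introduced for this sketch; it is not part of `transformers`.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the continuation when the decoded text starts with the prompt."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].lstrip()
    # Leave the text untouched if the prompt is not echoed verbatim
    return decoded


prompt = "What are mutual funds? Are they safe?"
decoded = prompt + " Can I make money with them?"
print(strip_prompt(decoded, prompt))
```

String matching can fail if decoding changes whitespace or special tokens. A likely more robust alternative is to slice the generated token IDs before decoding, for example `output[0][inputs["input_ids"].shape[1]:]`, so that only the new tokens are decoded.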
