<a href="https://www.kaggle.com/code/kajuyerim/lora-guide-on-llama3-1-8b-instruct?scriptVersionId=191364321" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# LoRA Guide: Sentiment Analysis in the Financial Domain Using Llama 3.1 8B-Instruct
---

<font size="5">**What is LoRA?**</font> 
* #### Low-Rank Adaptation (LoRA) freezes the pretrained model weights and injects trainable rank-decomposition matrices into each layer of the Transformer architecture.
* #### The main idea is to shrink the weight update by expressing it as the product of two much smaller matrices, B (d × r) and A (r × k), where the rank r is far smaller than d and k.
* #### Instead of updating all of the model's weights, we update only the injected low-rank matrices.
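
To make the idea concrete, here is a minimal pure-Python sketch of a LoRA-adapted linear layer, following the paper's formulation h = Wx + (alpha / r) · BAx. The toy dimensions, the values of `alpha` and `r`, and the helper names `matvec` and `lora_forward` are illustrative assumptions, not part of this notebook. As in the paper, A starts with random values and B starts at zero, so the adapted layer initially behaves exactly like the frozen one.

```python
import random

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Compute h = W x + (alpha / r) * B (A x); W is frozen, only A and B would be trained."""
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # low-rank path: (d x r) times ((r x k) times x)
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d, k, r, alpha = 4, 6, 2, 4  # toy sizes (hypothetical); real layers have d and k in the thousands

W = [[random.gauss(0, 1) for _ in range(k)] for _ in range(d)]  # frozen weights, d x k
A = [[random.gauss(0, 1) for _ in range(k)] for _ in range(r)]  # trainable, r x k, random init
B = [[0.0] * r for _ in range(d)]                               # trainable, d x r, zero init
x = [random.gauss(0, 1) for _ in range(k)]

h_frozen = matvec(W, x)
h_lora = lora_forward(W, A, B, x, alpha, r)

# B is all zeros, so the low-rank update contributes nothing before training starts.
print(h_lora == h_frozen)
```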

<font size="5">**Why Use LoRA?**</font> 
* #### It greatly reduces the number of trainable parameters (by up to 10,000 times in the paper's GPT-3 175B experiments).
* #### It reduces the GPU memory requirement (by up to 3 times in the same experiments).
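
The parameter savings follow from simple arithmetic: a full update of a d × k weight matrix trains d · k values, while LoRA trains only r · (d + k). The sketch below works this out for a single 4096 × 4096 projection matrix with rank r = 8. Both numbers are illustrative assumptions, and the function names are hypothetical. The much larger reductions reported in the paper come from applying low ranks to only a subset of the weight matrices in a very large model, not from a single matrix.

```python
def full_finetune_params(d: int, k: int) -> int:
    """Trainable values when the whole d x k matrix is updated."""
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    """Trainable values in B (d x r) plus A (r x k)."""
    return r * (d + k)

d, k, r = 4096, 4096, 8  # assumed layer shape and rank, for illustration only
full = full_finetune_params(d, k)
lora = lora_params(d, k, r)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (r={r}): {lora:,} trainable parameters")
print(f"reduction factor: {full / lora:.0f}x")
```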

---
<font size="4">**! This guide is based on the LoRA paper (LoRA: Low-Rank Adaptation of Large Language Models)** (https://arxiv.org/abs/2106.09685#).</font>   
<font size="4">**! Any discrepancies between this guide and the paper are due to my interpretation of the material.**</font>  

---


In [None]:
!pip install -U transformers datasets

We first need to load and prepare the Financial PhraseBank dataset so that we can use it to train our model with LoRA.

In [None]:
# Import the Hugging Face dataset loader
from datasets import load_dataset

# Load the "sentences_allagree" configuration (sentences on which all annotators agreed)
# and shuffle it with a fixed seed for reproducibility
dataset = load_dataset("takala/financial_phrasebank", "sentences_allagree", split='train')
shuffled_dataset = dataset.shuffle(seed=50)

# Map numerical labels to string labels
label_mapping = {0: "negative", 1: "neutral", 2: "positive"}

# Format examples into "sentence" - "sentiment" format
formatted_examples = [
    f'"{example["sentence"]}" - "{label_mapping[example["label"]]}"' for example in shuffled_dataset
]

# Print the first 10 formatted examples
for example in formatted_examples[:10]:
    print(example)
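
To see what the formatting step produces without downloading anything, the sketch below applies the same label mapping and `"sentence" - "sentiment"` template to a few hand-written rows. The rows are hypothetical stand-ins for dataset records (not real Financial PhraseBank entries), and `format_example` is an illustrative helper name. It also counts the labels, which is a quick way to check class balance before training.

```python
from collections import Counter

label_mapping = {0: "negative", 1: "neutral", 2: "positive"}

# Hypothetical rows that mimic the {"sentence": str, "label": int} schema.
rows = [
    {"sentence": "Operating profit rose compared to the previous year.", "label": 2},
    {"sentence": "The company will publish its report in March.", "label": 1},
    {"sentence": "Net sales decreased during the quarter.", "label": 0},
    {"sentence": "The board has three members.", "label": 1},
]

def format_example(example: dict) -> str:
    """Render one record as: "<sentence>" - "<sentiment>"."""
    return f'"{example["sentence"]}" - "{label_mapping[example["label"]]}"'

formatted = [format_example(row) for row in rows]
for line in formatted:
    print(line)

# Count how often each sentiment occurs.
label_counts = Counter(label_mapping[row["label"]] for row in rows)
print(dict(label_counts))
```

On the real data, the same kind of count can be built by iterating over `shuffled_dataset` instead of `rows`.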


In [None]:
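from huggingface_hub import login
from kaggle_secrets import UserSecretsClient

# Read the Hugging Face access token from Kaggle's secret store
# (it must be saved there under the name "HUGGINGFACE_TOKEN")
user_secrets = UserSecretsClient()
hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN")

# Authenticate with the Hugging Face Hub
login(token=hf_token)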

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Local Kaggle path of the Llama 3.1 8B-Instruct weights
base_model = "/kaggle/input/llama-3.1/transformers/8b-instruct/1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the model in half precision; device_map="auto" distributes it across the available devices
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    return_dict=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Wrap the model and tokenizer in a text-generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)
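
Loading the model in `torch.float16` matters mainly for memory. As a rough back-of-the-envelope estimate, the weights alone take about (number of parameters) × (bytes per parameter). The sketch below assumes roughly 8 billion parameters for this model (an approximation) and ignores activations, the KV cache, and framework overhead. `weight_memory_gib` is a hypothetical helper name.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory needed to hold the weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

num_params = 8e9  # assumption: about 8 billion parameters

for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_gib(num_params, dtype):.1f} GiB")
```

Half precision halves the footprint of the weights, which is why it is a common choice when GPU memory is limited.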