This notebook fine-tunes the model on the financial_phrasebank dataset, which consists of text-label pairs for classifying finance-related sentences as <span style="color: red;">positive</span>, <span style="color: purple;">neutral</span>, or <span style="color: green;">negative</span>.
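
Because the model is trained as a sequence-to-sequence task (`--model_type SEQ_2_SEQ_LM` in the commands below), each class has to be expressed as a target string. The following is a minimal sketch of that conversion. The label order (0 = negative, 1 = neutral, 2 = positive), the field names, and the helper `to_text_pair` are assumptions made for illustration; they are not taken from `peft_train.py`.

```python
# ASSUMPTION: label ids map to names in this order; check the dataset's own
# label names before relying on it.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def to_text_pair(example):
    """Turn an integer-labelled example into a (source, target) text pair."""
    return {"source": example["sentence"], "target": ID2LABEL[example["label"]]}


# Made-up example record, for illustration only.
sample = {"sentence": "A made-up example sentence .", "label": 1}
pair = to_text_pair(sample)
print(pair)
```

With text targets like these, the tokenizer can encode both the sentence and its label, which is what lets a text-to-text model act as a classifier.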

# 1. Experimental Setup

## 1.1 Setup

## 1.2 Training

### 1.2.1 Run on CPU

In [None]:
!pwd

In [1]:
!python ../../../peft_train.py \
--model_name ../../../pretrain_models/flan-t5-base \
--max_seq_len 2048 \
--group_by_length \
--max_steps 200 \
--dataset_name ../../../text-classification/financial_phrasebank \
--num_labels 3 \
--epochs 5 \
--learning_rate 1e-3 \
--per_device_train_batch_size 64 \
--per_device_eval_batch_size 64 \
--model_type SEQ_2_SEQ_LM \
--output_model_path ./result/flan-t5-base-financial-lora \
--bnb_4bit_compute_dtype float16 \
--enable_peft  True \
--load_in_8bit False \
--use_4b False


  warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'
use AutoModelForSeq2SeqLM load  model.
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
trainable params: 7077888 || all params: 254655744 || trainable%: 2.779394601050114
tokenizer padding setting: </s>
Sentence: N-Viro operates processing facilities independently as well as in partnership with municipalities .
number of labels:3
Running tokenizer on dataset: 100%|█| 4075/4075 [00:00<00:00, 22218.71 examples/
Running tokenizer on da
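
The `trainable params` line in the log can be checked with a little arithmetic. The sketch below recomputes the percentage from the two logged counts. It then shows one LoRA configuration that would produce the same adapter size. That configuration (rank 64 on the query and value projections of every attention block, with hidden size 768) is an assumption for illustration only; the log does not confirm it.

```python
# Figures copied from the training log above.
trainable_params = 7_077_888
all_params = 254_655_744

trainable_pct = 100 * trainable_params / all_params
frozen_params = all_params - trainable_params
print(f"trainable%: {trainable_pct:.4f}")
print(f"frozen base parameters: {frozen_params}")


# ASSUMPTION (not confirmed by the log): LoRA is applied to the q and v
# projections of 36 attention blocks (12 encoder self-attention, 12 decoder
# self-attention, 12 decoder cross-attention), i.e. 72 adapted linear layers.
def lora_param_count(rank, d_in=768, d_out=768, n_modules=72):
    # Each adapted layer adds two matrices: A (rank x d_in) and B (d_out x rank).
    return n_modules * rank * (d_in + d_out)


print(lora_param_count(rank=64))
```

Only the adapter weights are updated during training. The remaining parameters belong to the frozen base model.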

### 1.2.2 Run on GPU
The cells below install the dependencies and train the adapter on a GPU (Google Colab). Once training has finished, the model and its adapter can be loaded with a few lines of code; see the loading snippet after the training command.

In [None]:
!pip install -q -U trl transformers accelerate git+https://github.com/huggingface/peft.git
!pip install -q datasets bitsandbytes einops wandb evaluate
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/MyDrive/Colab Notebooks/llms-peft-cook-colab/experiments/flan-t5-base-lora/financial_phrasebank

In [None]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
!python ../../../peft_train.py \
--model_name google/flan-t5-base \
--max_seq_len 2048 \
--group_by_length \
--max_steps 200 \
--dataset_name ../../../text-classification/financial_phrasebank \
--num_labels 3 \
--epochs 5 \
--learning_rate 1e-3 \
--per_device_train_batch_size 64 \
--per_device_eval_batch_size 64 \
--model_type SEQ_2_SEQ_LM \
--output_model_path ./result/flan-t5-base-financial-lora \
--bnb_4bit_compute_dtype float32 \
--need_hyperparameters_search False \
--enable_peft True \
--load_in_8bit \
--use_4b
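
The CPU and GPU commands pass boolean options in two styles: with an explicit value (`--load_in_8bit False`) and as a bare flag (`--load_in_8bit`). The parser inside `peft_train.py` is not shown, so the following is only a sketch of one way a script could accept both styles with the standard library. The helper names `str2bool` and `build_parser` are hypothetical.

```python
import argparse


def str2bool(value):
    """Parse common textual booleans; reject anything else."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser()
    for flag in ("--enable_peft", "--load_in_8bit", "--use_4b"):
        # nargs="?" with const=True accepts both `--flag` and `--flag False`.
        parser.add_argument(flag, type=str2bool, nargs="?", const=True, default=False)
    return parser


parser = build_parser()
gpu_args = parser.parse_args(["--enable_peft", "True", "--load_in_8bit", "--use_4b"])
cpu_args = parser.parse_args(
    ["--enable_peft", "True", "--load_in_8bit", "False", "--use_4b", "False"]
)
print(gpu_args.load_in_8bit, cpu_args.load_in_8bit)
```

If the real script parses these options differently, a bare flag and an explicit `False` may not be interchangeable, so it is worth checking the script's `--help` output before mixing the two styles.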

### 1.2.3 Load the trained adapter

In [7]:
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Directory containing the saved LoRA adapter.
# Note: the training commands above write to ./result/flan-t5-base-financial-lora;
# point this at the adapter you actually want to load.
peft_model_id = "./result/flan-t5-cup-lora"
# Not used below: the base model path is read from the adapter config instead.
base_model_name_or_path = '.../../../pretrain_models/google-flan-t5-small'
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model on CPU; set device_map="auto" to use an available GPU.
model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, torch_dtype="auto", device_map="cpu")
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights to the base model
model = PeftModel.from_pretrained(model, peft_model_id)

In [9]:
model.eval()
input_text = "In January-September 2009 , the Group 's net interest income increased to EUR 112.4 mn from EUR 74.3 mn in January-September 2008 ."
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(input_ids=inputs["input_ids"], max_new_tokens=10)

print("input sentence: ", input_text)
print(" output prediction: ", tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True))

input sentence:  In January-September 2009 , the Group 's net interest income increased to EUR 112.4 mn from EUR 74.3 mn in January-September 2008 .
 output prediction:  ['positive positive positive positive positive positive positive positive positive positive']
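
In the output above the label is repeated, apparently until the `max_new_tokens=10` budget is used up. One simple way to turn such a string into a single class is to keep the first known label. The sketch below does this in plain Python; the helper `normalize_prediction` and the `LABELS` tuple are hypothetical names introduced for illustration.

```python
LABELS = ("negative", "neutral", "positive")


def normalize_prediction(decoded, labels=LABELS):
    """Return the first known label found in the decoded text, else None."""
    for token in decoded.lower().split():
        if token in labels:
            return token
    return None


# Decoded string copied from the prediction above.
decoded = "positive positive positive positive positive positive positive positive positive positive"
print(normalize_prediction(decoded))
```

Lowering `max_new_tokens` in `model.generate` may also shorten the output, but post-processing like this keeps evaluation robust either way.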



## 1.3 Experimental Result

In [None]:
class HTMLRender:
    """Wrap an HTML string so that Jupyter renders it as rich output."""

    def __init__(self, html_str):
        self.html_str = html_str

    def _repr_html_(self):
        return self.html_str


In [None]:
model_accuracy_html = '''
<table>
  <tr>
    <th>eval loss</th>
    <th>eval accuracy</th>
    <th>eval precision</th>
    <th>eval recall</th>
    <th>eval f1</th>
  </tr>
  <tr>
    <td style="background-color:#4C72B0;">0.102</td>
    <td style="background-color:#55A868;">0.896</td>
    <td style="background-color:#C44E52;">0.90</td>
    <td style="background-color:#8172B2;">0.896</td>
    <td style="background-color:#64B5CD;">0.894</td>
  </tr>
</table>
'''
HTMLRender(model_accuracy_html)
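
The table reports the same value for accuracy and recall (0.896). That is what support-weighted averaging produces, because weighted recall is always equal to accuracy. The averaging mode used by the training script is not shown, so this is only a plausible reading. The sketch below computes the four metrics in plain Python on made-up toy labels to illustrate the relationship; `weighted_metrics` is a hypothetical helper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    total = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = actual / total  # share of true examples with this label
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, made up for illustration only.
y_true = ["positive", "neutral", "neutral", "negative", "neutral", "positive"]
y_pred = ["positive", "neutral", "positive", "negative", "neutral", "neutral"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Note that weighted F1 is an average of per-class F1 scores, so it need not equal the harmonic mean of the reported precision and recall.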

In [None]:
accuracy_html = '''
<img src="./image/flan-t5-small-accuracy.png" alt="flan-t5-small-accuracy" width="70%">
'''
HTMLRender(accuracy_html)

