# Fine-tuning Llama 2 for News Category (Evaluation)

Inspired by [Kshitiz Sahay's blog](https://medium.com/@kshitiz.sahay26/fine-tuning-llama-2-for-news-category-prediction-a-step-by-step-comprehensive-guide-to-fine-tuning-48c06dee28a9), a step-by-step tutorial for fine-tuning any LLM (Large Language Model).

This notebook is Part 3 of the guide:

**Part 3: Evaluate Model**
1. Load the model
2. Create test text

In [1]:
from peft import get_peft_config, PeftModel, PeftConfig, get_peft_model, LoraConfig, TaskType
from transformers import AutoModelForCausalLM
from transformers import LlamaTokenizer, LlamaForCausalLM
import torch
from transformers.trainer_callback import TrainerCallback
import os
import pandas as pd
from transformers import BitsAndBytesConfig
from trl import SFTTrainer
import mlflow

# Install the pinned dependencies before running the imports above, if needed:
# %pip install transformers==4.31.0 datasets==2.13.0 peft==0.4.0 accelerate==0.21.0 bitsandbytes==0.40.2 trl==0.4.7

Setting ds_accelerator to cuda (auto detect)


In [2]:
"""
if torch.cuda.is_available():
    # Get the number of CUDA devices
    device_count = torch.cuda.device_count()
    print(f"CUDA is available with {device_count} CUDA device(s)!")
    
    # Get the name of each CUDA device
    for i in range(device_count):
        print(f"Device {i}: {torch.cuda.get_device_name(i)}")
else:
    print("CUDA is not available. Running on CPU.")
"""
import wandb

wandb.init(mode="disabled")

[34m[1mwandb[0m: Currently logged in as: [33mseastii[0m ([33mseastii-ByteDance[0m). Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Tracking run with wandb version 0.17.4
[34m[1mwandb[0m: Run data is saved locally in [35m[1m/root/.jupyter/connection/wandb/run-20240708_040455-5evkl7w8[0m
[34m[1mwandb[0m: Run [1m`wandb offline`[0m to turn off syncing.
[34m[1mwandb[0m: Syncing run [33mlora[0m
[34m[1mwandb[0m: ⭐️ View project at [34m[4mhttps://wandb.ai/seastii-ByteDance/hd-test[0m
[34m[1mwandb[0m: 🚀 View run at [34m[4mhttps://wandb.ai/seastii-ByteDance/hd-test/runs/5evkl7w8[0m


1


In [8]:
#encoded_input = tokenizer("Sample input sequence", return_tensors='pt', padding=True, truncation=True)
#print(encoded_input)
BASE_DIR = '/mlx_devbox/users/haidong.shao/playground/'

In [9]:
model_path = 'openlm-research/open_llama_3b_v2'

In [10]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_path)
#tokenizer = LlamaTokenizer.from_pretrained(model_path)
tokenizer.add_special_tokens({'pad_token': '[PAD]'})

1

In [11]:
model = LlamaForCausalLM.from_pretrained(
    model_path, #load_in_8bit=True, 
    device_map='auto',
)

In [16]:
peft_model_id = BASE_DIR + 'results/checkpoint-250'
peft_model = PeftModel.from_pretrained(model, peft_model_id)

## New Model
The test articles below were copied from Google News.

In [26]:
test_strings = ["The result of Sunday’s parliamentary election runoff comes as a huge surprise, with France appearing to be on the verge of a major political shift – but not the one everyone was expecting.No pollster predicted before Sunday that a left-wing alliance would win and that the far right would come in third place. This is a shocking reversal of the outcome of the first round of voting, if tonight’s results match the projections. For now, France seems ungovernable. With no party projected to get close to clinching a majority, the parliament will be in a state of paralysis, split between three blocs.",
"Earlier this year, the South Korean tech giant, which is the largest phone maker in the world by shipment volume, announced the Galaxy Ring, its first finger-worn health tracker and a direct competitor to Oura's popular Oura Ring wellness device. With Samsung's Unpacked event coming up on July 10, we're expecting to learn a lot more about the company's big challenger to the Oura Ring, including its price, when it'll launch and additional details on its health-tracking capabilities. Given Samsung's massive presence in the consumer electronics space, you'd think Oura would be concerned about the Galaxy Ring's arrival. ",
"On the first day of production on the New Zealand location of “Avatar: The Way of Water,” actor Cliff Curtis asked if he could bring his family to the film’s home base to give a traditional blessing. Curtis showed up with 43 people and led an elaborate Maori blessing in front of the entire crew, then gave gifts to the crew members. Producer Jon Landau’s gift was a carved wooden oar — as Landau told TheWrap a couple of years later, “It was to help steer the ship as we were going into production. I still have it hanging on my wall"]

predictions = []
for test in test_strings:
  prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request..

  ### Instruction:
  Categorize the news article into one of the 5 categories:\n\ntech\nbusiness\nsport\nentertainment\npolitics
  
  Input:
  {}

  ### Response:""".format(test)
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to('cuda')

  # Generate through the PEFT wrapper so the fine-tuned LoRA adapter is used explicitly
  generation_output = peft_model.generate(
      input_ids=input_ids, max_new_tokens=156
  )
  predictions.append(tokenizer.decode(generation_output[0]))
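
Because the prompt above is written inline inside the loop, the two-space indentation of the source lines becomes part of the prompt text. A minimal sketch of a reusable builder is shown here. `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, the wording is copied from the cell above, and the unindented layout is an assumption: the template should match whatever format was used during fine-tuning, which this notebook does not show.

```python
# Hypothetical helper: the names and the unindented layout are assumptions,
# not part of the original notebook. The wording mirrors the inline prompt.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request..\n\n"
    "### Instruction:\n"
    "Categorize the news article into one of the 5 categories:\n\n"
    "tech\nbusiness\nsport\nentertainment\npolitics\n\n"
    "Input:\n"
    "{article}\n\n"
    "### Response:"
)


def build_prompt(article: str) -> str:
    """Fill the template with one article, trimming surrounding whitespace."""
    return PROMPT_TEMPLATE.format(article=article.strip())


# Usage with a placeholder article (not one of the test strings)
prompt = build_prompt("  Example article text.  ")
print(prompt)
```

Keeping the template in one constant makes it easier to check that evaluation uses the same prompt as training.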

In [28]:
def extract_response_text(input_string):
    start_marker = '### Response:'
    end_marker = '###'
    
    start_index = input_string.find(start_marker)
    if start_index == -1:
        return None
    
    start_index += len(start_marker)
    
    end_index = input_string.find(end_marker, start_index)
    if end_index == -1:
        return input_string[start_index:].strip()
    
    return input_string[start_index:end_index].strip()


# Print each article followed by the category extracted from the model output
for text, pred in zip(test_strings, predictions):
    print(text + '\n')
    print(extract_response_text(pred))
    print('--------')

The result of Sunday’s parliamentary election runoff comes as a huge surprise, with France appearing to be on the verge of a major political shift – but not the one everyone was expecting.No pollster predicted before Sunday that a left-wing alliance would win and that the far right would come in third place. This is a shocking reversal of the outcome of the first round of voting, if tonight’s results match the projections. For now, France seems ungovernable. With no party projected to get close to clinching a majority, the parliament will be in a state of paralysis, split between three blocs.

politics
--------
Earlier this year, the South Korean tech giant, which is the largest phone maker in the world by shipment volume, announced the Galaxy Ring, its first finger-worn health tracker and a direct competitor to Oura's popular Oura Ring wellness device. With Samsung's Unpacked event coming up on July 10, we're expecting to learn a lot more about the company's big challenger to the Oura
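
Since this part of the guide is about evaluation, the extracted responses could also be mapped onto the five category names and compared against reference labels. The following is a minimal sketch under stated assumptions: `normalize_category` and `accuracy` are hypothetical helpers, and the raw responses and reference labels in the usage lines are made up for illustration, not taken from the notebook's results.

```python
# Hypothetical evaluation helpers; not part of the original notebook.
CATEGORIES = ("tech", "business", "sport", "entertainment", "politics")


def normalize_category(response):
    """Return the category name that appears earliest in the response, or None."""
    if not response:
        return None
    lowered = response.lower()
    # Collect (position, name) for every category mentioned in the text
    hits = [(lowered.find(name), name) for name in CATEGORIES if name in lowered]
    return min(hits)[1] if hits else None


def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equally long")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Made-up responses and labels, for illustration only
raw_responses = [" politics</s>", "Tech\n", "no idea"]
reference = ["politics", "tech", "entertainment"]
predicted = [normalize_category(r) for r in raw_responses]
print(predicted)  # ['politics', 'tech', None]
print(accuracy(predicted, reference))
```

Substring matching is deliberately loose here (for example, "sports" maps to `sport`), so a stricter exact match may be preferable when the model output is clean.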

## open_llama_3b_v2
I could not find a prompt that makes the base model (without the fine-tuned adapter) produce reasonable output.

In [33]:
model_path = 'openlm-research/open_llama_3b_v2'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, load_in_8bit=True, device_map='auto',)

test_strings = ["The result of Sunday’s parliamentary election runoff comes as a huge surprise, with France appearing to be on the verge of a major political shift – but not the one everyone was expecting.No pollster predicted before Sunday that a left-wing alliance would win and that the far right would come in third place. This is a shocking reversal of the outcome of the first round of voting, if tonight’s results match the projections. For now, France seems ungovernable. With no party projected to get close to clinching a majority, the parliament will be in a state of paralysis, split between three blocs.",
"Earlier this year, the South Korean tech giant, which is the largest phone maker in the world by shipment volume, announced the Galaxy Ring, its first finger-worn health tracker and a direct competitor to Oura's popular Oura Ring wellness device. With Samsung's Unpacked event coming up on July 10, we're expecting to learn a lot more about the company's big challenger to the Oura Ring, including its price, when it'll launch and additional details on its health-tracking capabilities. Given Samsung's massive presence in the consumer electronics space, you'd think Oura would be concerned about the Galaxy Ring's arrival. ",
"On the first day of production on the New Zealand location of “Avatar: The Way of Water,” actor Cliff Curtis asked if he could bring his family to the film’s home base to give a traditional blessing. Curtis showed up with 43 people and led an elaborate Maori blessing in front of the entire crew, then gave gifts to the crew members. Producer Jon Landau’s gift was a carved wooden oar — as Landau told TheWrap a couple of years later, “It was to help steer the ship as we were going into production. I still have it hanging on my wall"]


In [34]:
predictions = []
for test in test_strings:
    prompt = 'Q: Categorize the following text into one of the 5 categories:\n\ntech\nbusiness\nsport\nentertainment\npolitics\n\n ' + test + '\n The category of above text is:'
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to('cuda')
    generation_output = model.generate(
        input_ids=input_ids, max_new_tokens=156
    )
    predictions.append(tokenizer.decode(generation_output[0]))

for i in range(3): 
  pred = predictions[i]
  print(pred)
  print('--------')

<s>Q: Categorize the following text into one of the 5 categories:

tech
business
sport
entertainment
politics

 The result of Sunday’s parliamentary election runoff comes as a huge surprise, with France appearing to be on the verge of a major political shift – but not the one everyone was expecting.No pollster predicted before Sunday that a left-wing alliance would win and that the far right would come in third place. This is a shocking reversal of the outcome of the first round of voting, if tonight’s results match the projections. For now, France seems ungovernable. With no party projected to get close to clinching a majority, the parliament will be in a state of paralysis, split between three blocs.
 The category of above text is:


*

*

*

[... the same `*` line repeats, 51 times in total ...]


--------
<s>Q: Categorize the following text into one of the 5 categories:

tech
business
sport
ente