## Fine-tuning Phi-3-mini-QLoRA

Based on the Phi-3CookBook notebook: https://github.com/microsoft/Phi-3CookBook/blob/main/code/04.Finetuning/Phi-3-finetune-qlora-python.ipynb

Install required packages

In [7]:
# from IPython.display import clear_output
# !pip install -qqq --upgrade bitsandbytes transformers peft accelerate datasets trl flash_attn
# !pip install huggingface_hub
# !pip install python-dotenv
# !pip install wandb -qqq
# !pip install absl-py nltk rouge_score
# !pip list | grep transformers
# clear_output()

In [1]:
from IPython.display import clear_output
!pip install peft==0.8.2
!pip install bitsandbytes==0.42.0
!pip install accelerate==0.26.1
!pip install datasets==2.16.1
!pip install GPUtil
!pip install transformers==4.38.0
!pip install huggingface-hub
!pip install trl
clear_output()

Import packages

In [2]:
import torch
from random import randrange
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Template Exploration

In [None]:
model_name = 'acorreal/phi3-mental-health'
adapter_name = 'acorreal/adapter-phi-3-mini-mental-health'
compute_dtype = torch.bfloat16


In [None]:
%%time

# Load model
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype=compute_dtype)
model = PeftModel.from_pretrained(model, adapter_name)
model = model.merge_and_unload()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(adapter_name)

# Print model name
print("Model:", model.name_or_path)
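The cell above loads the adapter with `PeftModel.from_pretrained` and then calls `merge_and_unload()`, which folds the LoRA update into the base weights so that inference no longer needs the adapter. A minimal sketch of that arithmetic on a toy 2x2 weight, assuming the usual LoRA form `W + (alpha / r) * B @ A`. The helper names and numbers are made up for illustration and are not read from the adapter.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def merge_lora(base_weight, lora_a, lora_b, lora_alpha, rank):
    """Return base_weight + (lora_alpha / rank) * (lora_b @ lora_a)."""
    scaling = lora_alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base_weight, delta)
    ]

# Toy values (hypothetical): rank-1 update on a 2x2 identity weight
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (rank, in_features)
lora_b = [[0.5], [0.0]]    # shape (out_features, rank)

merged = merge_lora(base, lora_a, lora_b, lora_alpha=2, rank=1)
print(merged)
```

After merging, the model behaves like a plain `AutoModelForCausalLM`, which is why it can be passed straight to `pipeline` in the next cell.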


In [None]:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)


In [None]:
# Test the template
pipe.tokenizer.apply_chat_template([
    {
        "role": "user",
        "content": "Hello, I am stressed"}
    ],
    tokenize=False,
    add_generation_prompt=True
)
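To make the effect of `apply_chat_template` easier to picture, here is a hand-written sketch of a Phi-3-style layout. It assumes each turn is rendered as `<|role|>`, a newline, the content, and `<|end|>`, followed by `<|assistant|>` when `add_generation_prompt=True`. The real template ships with the tokenizer and may differ (for example by adding a BOS token), so treat `render_phi3_style` as a hypothetical stand-in rather than the tokenizer's behaviour.

```python
def render_phi3_style(messages, add_generation_prompt=True):
    """Render chat messages in an assumed Phi-3-style layout (illustrative only)."""
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        parts.append("<|assistant|>\n")
    return "".join(parts)

example = render_phi3_style([{"role": "user", "content": "Hello, I am stressed"}])
print(repr(example))
```

Comparing this string with the actual cell output is a quick way to confirm which special tokens the loaded tokenizer inserts.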


In [None]:
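def predict(prompt: str) -> str:
    # Wrap the user message in the chat template and open the assistant turn
    prompt = pipe.tokenizer.apply_chat_template([{"role": "user", "content": prompt}], tokenize=False, add_generation_prompt=True)
    # Sample with a low temperature; stop after 256 new tokens or 180 seconds
    outputs = pipe(prompt, max_new_tokens=256, do_sample=True, num_beams=1, temperature=0.3, top_k=50, top_p=0.95, max_time=180)
    print(outputs)
    # The pipeline returns prompt + completion, so drop the prompt prefix
    return outputs[0]['generated_text'][len(prompt):].strip()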
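The last line of `predict` relies on the pipeline returning the prompt followed by the completion. A small sketch of that slicing on a fabricated output list (no model involved) shows what is kept. The names `strip_prompt`, `fake_prompt`, and `fake_outputs` are invented for this example.

```python
def strip_prompt(outputs, prompt):
    """Drop the prompt prefix from the first generated sequence."""
    return outputs[0]["generated_text"][len(prompt):].strip()

# Fabricated data shaped like a text-generation pipeline result
fake_prompt = "<|user|>\nHello<|end|>\n<|assistant|>\n"
fake_outputs = [{"generated_text": fake_prompt + " Hi, how can I help? "}]

reply = strip_prompt(fake_outputs, fake_prompt)
print(reply)
```

The text-generation pipeline also accepts `return_full_text=False`, which returns only the newly generated text and makes the manual slice unnecessary.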


In [None]:
%%time
predict("i am going through some things with my feelings and myself i barely sleep and i do nothing but think about how i am worthless and how i should not be here i have never tried or contemplated suicide i have always wanted to fix my issues but i never get around to it how can i change my feeling of being worthless to everyone")


# Testing Generated Responses

## Preparing the dataset

In [3]:
import pandas as pd
from datasets import Dataset

In [5]:
df = pd.read_csv('dataset.csv')
df.columns = ['input', 'output']
df['instruction'] = "You are a mental health assistant. Your job is to provide emotional support, actively listen, and offer practical suggestions for well-being. Respond empathically and do not give specific medical advice or diagnoses. Always make sure the user feels heard and supported. If the user mentions suicidal thoughts, encourage them to seek professional help immediately. Here's the conversation so far:\n\n"
df.head()

| | input | output | instruction |
|---|---|---|---|
| 0 | i am going through some things with my feeling... | if everyone thinks you are worthless then mayb... | You are a mental health assistant. Your job is... |
| 1 | i am going through some things with my feeling... | hello and thank you for your question and seek... | You are a mental health assistant. Your job is... |
| 2 | i am going through some things with my feeling... | first thing i would suggest is getting the sle... | You are a mental health assistant. Your job is... |
| 3 | i am going through some things with my feeling... | therapy is essential for those that are feelin... | You are a mental health assistant. Your job is... |
| 4 | i am going through some things with my feeling... | i first want to let you know that you are not ... | You are a mental health assistant. Your job is... |


In [6]:
# Load the dataset
dataset = Dataset.from_pandas(df)
dataset

Dataset({
    features: ['input', 'output', 'instruction'],
    num_rows: 2747
})

In [7]:
def create_message_column(row):
    messages = []
    user = {
        "content": f"{row['instruction']}\n Input: {row['input']}",
        "role": "user"
    }
    messages.append(user)
    assistant = {
        "content": f"{row['output']}",
        "role": "assistant"
    }
    messages.append(assistant)
    return {"messages": messages}

def format_dataset_chatml(row):
    return {"text": tokenizer.apply_chat_template(row["messages"], add_generation_prompt=False, tokenize=False)}
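To see what `create_message_column` produces before mapping it over the whole dataset, it can be run on a single hand-made row. The function body below is copied from the cell above. The toy row is invented for illustration and is much shorter than the real instruction text.

```python
def create_message_column(row):
    messages = []
    user = {
        "content": f"{row['instruction']}\n Input: {row['input']}",
        "role": "user"
    }
    messages.append(user)
    assistant = {
        "content": f"{row['output']}",
        "role": "assistant"
    }
    messages.append(assistant)
    return {"messages": messages}

# Hypothetical row with the same three columns as the DataFrame
toy_row = {
    "instruction": "Be supportive.\n\n",
    "input": "i feel anxious",
    "output": "that sounds hard",
}

result = create_message_column(toy_row)
for message in result["messages"]:
    print(message["role"], "->", repr(message["content"]))
```

Note that the instruction already ends in two newlines, so the user turn contains three consecutive newlines before ` Input:`.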

In [8]:
model_id = "microsoft/Phi-3-mini-4k-instruct"

In [9]:
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = 'right' # to prevent warnings




Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [10]:
dataset_chatml = dataset.map(create_message_column)
dataset_chatml = dataset_chatml.map(format_dataset_chatml)


In [11]:
dataset_chatml

Dataset({
    features: ['input', 'output', 'instruction', 'messages', 'text'],
    num_rows: 2747
})

In [12]:
dataset_chatml[0]

{'input': 'i am going through some things with my feelings and myself i barely sleep and i do nothing but think about how i am worthless and how i should not be here i have never tried or contemplated suicide i have always wanted to fix my issues but i never get around to it how can i change my feeling of being worthless to everyone',
 'output': 'if everyone thinks you are worthless then maybe you need to find new people to hang out withseriously the social context in which a person lives is a big influence in selfesteemotherwise you can go round and round trying to understand why you are not worthless then go back to the same crowd and be knocked down againthere are many inspirational messages you can find in social media maybe read some of the ones which state that no person is worthless and that everyone has a good purpose to their lifealso since our culture is so saturated with the belief that if someone does not feel good about themselves that this is somehow terriblebad feelings 

In [13]:
dataset_chatml = dataset_chatml.train_test_split(test_size=0.05, seed=1234)
dataset_chatml

DatasetDict({
    train: Dataset({
        features: ['input', 'output', 'instruction', 'messages', 'text'],
        num_rows: 2609
    })
    test: Dataset({
        features: ['input', 'output', 'instruction', 'messages', 'text'],
        num_rows: 138
    })
})
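The row counts in the output above can be checked by hand. The sketch below assumes the test share is rounded up and the remainder goes to training, which matches the 2609 / 138 split shown. `split_sizes` is a hypothetical helper, not part of `datasets`.

```python
import math

def split_sizes(n_rows, test_size):
    """Return (train_rows, test_rows), rounding the test share up (assumed rule)."""
    n_test = math.ceil(n_rows * test_size)
    return n_rows - n_test, n_test

train_rows, test_rows = split_sizes(2747, 0.05)
print(train_rows, test_rows)
```

With 2747 rows, 5% is 137.35, so the test split holds 138 rows and training keeps the remaining 2609.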

## Loading the models

In [18]:
original_model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)




A new version of the following files was downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct:
- configuration_phi3.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.



A new version of the following files was downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct:
- modeling_phi3.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.



In [14]:
finetuned_model_id = 'acorreal/phi3-mental-health'
finetuned_model = AutoModelForCausalLM.from_pretrained(finetuned_model_id, trust_remote_code=True)


ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`
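The ImportError above names two packages. A quick pre-flight check, sketched here with only the standard library, can report which of them cannot be imported before the load is attempted. The helper name `missing_packages` is made up for this example, and a package being importable does not guarantee that its installed version is recent enough.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Package names taken from the error message above
missing = missing_packages(["accelerate", "bitsandbytes"])
if missing:
    print("Not importable in this kernel:", ", ".join(missing))
else:
    print("accelerate and bitsandbytes are importable")
```

If packages were installed after the kernel started, restarting the kernel is usually needed before they become importable.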

In [20]:
original_pipeline = pipeline("text-generation", model=original_model, tokenizer=tokenizer)

In [27]:
finetuned_pipeline = pipeline("text-generation", model=finetuned_model, tokenizer=tokenizer)

In [22]:
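def predict(prompt, generator, tokenizer):
    # `generator` is a text-generation pipeline; the name avoids shadowing the imported `pipeline` function.
    # `tokenizer` is unused here (the pipeline's own tokenizer is applied) and is kept so existing calls still work.
    prompt = generator.tokenizer.apply_chat_template([{"role": "user", "content": prompt}], tokenize=False, add_generation_prompt=True)
    outputs = generator(prompt, max_new_tokens=256, do_sample=True, num_beams=1, temperature=0.3, top_k=50, top_p=0.95, max_time=180)
    # Drop the prompt prefix so only the model's reply is returned
    return outputs[0]['generated_text'][len(prompt):].strip()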

In [24]:
predict(dataset_chatml['test'][0]['messages'][0]['content'], original_pipeline, tokenizer) #Original Model

'I\'m truly sorry to hear that you\'re feeling this way, but I\'m unable to provide the help that you need. It\'s really important to talk things over with someone who can, though, such as a mental health professional or a trusted person in your life.\n\n\nIt\'s clear that you\'re seeking a healthy and secure relationship, and it\'s understandable that you\'re looking for change. Recognizing the patterns in your relationships is a significant first step. Here are some suggestions that might help you move forward:\n\n\n1. Self-Reflection: Take some time to reflect on your past relationships and consider what you truly want from a partner. Understanding your needs and boundaries can help you communicate them more effectively in future relationships.\n\n\n2. Personal Growth: Engage in activities that promote self-growth and independence. This can help you feel more comfortable being alone and lessen the need for someone else\'s presence to feel secure.\n\n\n3. Boundaries: Learn to set and

In [28]:
predict(dataset_chatml['test'][0]['messages'][0]['content'], finetuned_pipeline, tokenizer) #Finetuned Model

"I'm really sorry to hear that you're feeling this way, but I'm unable to provide the help that you need. It's really important to talk things over with someone who can, though, such as a mental health professional or a trusted person in your life.\n\n\nIt sounds like you're in a difficult situation, and it's understandable that you're seeking a secure and reciprocal relationship. Recognizing patterns in our behavior is a significant step towards change. Here are some general suggestions that might help you reflect on your situation:\n\n\n1. Self-Reflection: Take some time to understand what you're looking for in a relationship and why you might be drawn to certain types of people. Reflect on past relationships and consider what you've learned from them.\n\n\n2. Boundaries: It's essential to establish and maintain healthy boundaries. This means being clear about what you're comfortable with and not allowing others to dictate your actions or emotions.\n\n\n3. Self-Care: Focus on activit

In [36]:
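# Generate a response for every chat in the dataset, checkpointing to JSON along the way
import json
def load_responses(dataset, path, generator, tokenizer):
    result = []
    for counter, chat in enumerate(dataset):
        gen_text = predict(chat['messages'][0]['content'], generator, tokenizer)
        print(f"Got prediction: {counter}")
        result.append((chat['input'], gen_text))
        # Checkpoint every 10 predictions so an interrupted run keeps partial results
        if counter % 10 == 0:
            with open(path, 'w') as f:
                print(f"Saving {len(result)} responses in file")
                json.dump(result, f)
    # Final save so predictions made after the last checkpoint are not lost
    with open(path, 'w') as f:
        json.dump(result, f)
    return result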
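Because `load_responses` writes its results with `json.dump`, the `(input, response)` tuples come back as two-element lists when the file is read again. A small sketch of reading a checkpoint and restoring tuples follows. `read_responses` and the sample data are invented for this example, and the demo writes to a temporary file rather than to the notebook's output files.

```python
import json
import os
import tempfile

def read_responses(path):
    """Read a checkpoint written by json.dump and restore (input, response) tuples."""
    with open(path) as f:
        return [tuple(pair) for pair in json.load(f)]

# Round-trip demo with made-up data in a temporary file
sample = [("q1", "a1"), ("q2", "a2")]
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_responses.json")
    with open(demo_path, "w") as f:
        json.dump(sample, f)
    restored = read_responses(demo_path)

print(restored)
```

This makes it possible to resume analysis from a saved file without regenerating the responses.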


In [None]:
responses_fine_tuned_model = load_responses(dataset_chatml['test'], 'fine_tuned_model0.json', finetuned_pipeline, tokenizer)

Got prediction: 0
Saving 1 responses in file
Got prediction: 1
Got prediction: 2
Got prediction: 3
Got prediction: 4
Got prediction: 5
Got prediction: 6
Got prediction: 7
Got prediction: 8
Got prediction: 9
Got prediction: 10
Saving 11 responses in file
Got prediction: 11
Got prediction: 12
Got prediction: 13
Got prediction: 14


In [None]:
responses_original_model = load_responses(dataset_chatml['test'], 'original_model.json', original_pipeline, tokenizer)
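Once both runs have finished, the two result lists can be lined up for side-by-side reading. The sketch below pairs them by position, on the assumption that both calls iterated over the same test split in the same order. It does not pair by input text, since the dataset contains repeated questions. `pair_by_position` and the sample lists are hypothetical.

```python
def pair_by_position(original, finetuned):
    """Zip two lists of (input, response) pairs, checking that the inputs agree."""
    pairs = []
    for (input_a, reply_a), (input_b, reply_b) in zip(original, finetuned):
        if input_a != input_b:
            raise ValueError("response lists are not aligned")
        pairs.append({"input": input_a, "original": reply_a, "finetuned": reply_b})
    return pairs

# Made-up stand-ins for responses_original_model and responses_fine_tuned_model
original_demo = [("q1", "base answer 1"), ("q1", "base answer 2")]
finetuned_demo = [("q1", "tuned answer 1"), ("q1", "tuned answer 2")]

paired = pair_by_position(original_demo, finetuned_demo)
print(paired[0])
```

`zip` stops at the shorter list, so a run that was interrupted early still yields pairs for the examples both models answered.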