<a href="https://colab.research.google.com/github/madhavdasm/Model-FineTuning/blob/main/Mymodel_TinyLlama_QLoRA_Support_Bot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🚀 Fine-Tune TinyLlama-1.1B with QLoRA on Support Chat Data

In [29]:
!pip install -q bitsandbytes accelerate datasets loralib peft transformers trl


In [2]:
import torch
from datasets import Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments, Trainer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training


In [30]:
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token
model.config.use_cache = False
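As a rough back-of-the-envelope check on why 4-bit loading helps, the sketch below estimates the memory taken by the weights alone. The 1.1e9 parameter count is an approximation taken from the model name, and the estimate deliberately ignores layers that stay unquantized, quantization constants, activations, and optimizer state, so treat the numbers as an order-of-magnitude guide only.

```python
def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return n_params * bits_per_param / 8 / 2**30

N_PARAMS = 1.1e9  # assumption: rounded parameter count for TinyLlama-1.1B

fp16_gib = weight_gib(N_PARAMS, 16)
nf4_gib = weight_gib(N_PARAMS, 4)

print(f"fp16 weights: ~{fp16_gib:.2f} GiB")
print(f"4-bit weights: ~{nf4_gib:.2f} GiB")
```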


In [31]:
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
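For a rough sense of how small the adapter is, the sketch below counts the parameters that `r=8` on `q_proj` and `v_proj` would add. The layer shapes (hidden size 2048, 22 decoder layers, a 256-wide `v_proj` output from grouped-query attention) are assumptions about the TinyLlama-1.1B architecture, not values read from this notebook; `model.print_trainable_parameters()` reports the actual figure.

```python
def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x in), B is (out x r)."""
    return r * in_features + out_features * r

# Assumed TinyLlama-1.1B shapes (not taken from the notebook)
HIDDEN = 2048    # q_proj maps HIDDEN -> HIDDEN
KV_DIM = 256     # v_proj maps HIDDEN -> KV_DIM under grouped-query attention
N_LAYERS = 22
R = 8            # matches r=8 in lora_config

per_layer = lora_params(HIDDEN, HIDDEN, R) + lora_params(HIDDEN, KV_DIM, R)
total = per_layer * N_LAYERS
print(f"{per_layer:,} LoRA parameters per layer, {total:,} in total")
```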


In [38]:
import pandas as pd
from datasets import Dataset

# Step 1: Read the FAQ data from Mental_Health_FAQ.csv
df = pd.read_csv("Mental_Health_FAQ.csv")  # Expects 'Questions' and 'Answers' columns

# Step 2: Convert DataFrame to Hugging Face Dataset
dataset = Dataset.from_pandas(df)

# Step 3: Define formatting function
def format_instruction(example):
    return f"### Instruction:\n{example['Questions']}\n\n### Response:\n{example['Answers']}"

# Step 4: Apply formatting
dataset = dataset.map(lambda x: {"text": format_instruction(x)})

# Optional: Preview
print(dataset[0])


Map:   0%|          | 0/98 [00:00<?, ? examples/s]

{'Question_ID': 1590140, 'Questions': 'What does it mean to have a mental illness?', 'Answers': 'Mental illnesses are health conditions that disrupt a person’s thoughts, emotions, relationships, and daily functioning. They are associated with distress and diminished capacity to engage in the ordinary activities of daily life.\nMental illnesses fall along a continuum of severity: some are fairly mild and only interfere with some aspects of life, such as certain phobias. On the other end of the spectrum lie serious mental illnesses, which result in major functional impairment and interference with daily life. These include such disorders as major depression, schizophrenia, and bipolar disorder, and may require that the person receives care in a hospital.\nIt is important to know that mental illnesses are medical conditions that have nothing to do with a person’s character, intelligence, or willpower. Just as diabetes is a disorder of the pancreas, mental illness is a medical conditio
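The formatting step assumes every row has a usable question and answer. A minimal sketch of a guard that could run before formatting is shown below; `is_usable` and `REQUIRED` are hypothetical names introduced here, and the whitespace stripping is an optional addition rather than something the notebook does.

```python
REQUIRED = ("Questions", "Answers")  # column names used by the notebook's CSV

def is_usable(example: dict) -> bool:
    """True when both fields are present, are strings, and are not blank."""
    return all(
        isinstance(example.get(key), str) and example[key].strip()
        for key in REQUIRED
    )

def format_instruction(example: dict) -> str:
    """Same prompt template as the notebook, with surrounding whitespace removed."""
    question = example["Questions"].strip()
    answer = example["Answers"].strip()
    return f"### Instruction:\n{question}\n\n### Response:\n{answer}"

rows = [
    {"Questions": "What is stress? ", "Answers": "A response to pressure.\n"},
    {"Questions": "Empty answer?", "Answers": None},
]
for row in rows:
    if is_usable(row):
        print(format_instruction(row))
```

With a Hugging Face `Dataset`, the same predicate could be applied through `dataset.filter(is_usable)` before the `map` call.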

In [39]:
def tokenize_function(example):
    tokenized = tokenizer(example["text"], truncation=True, padding="max_length", max_length=512)
    tokenized["labels"] = tokenized["input_ids"].copy()
    return tokenized

tokenized_dataset = dataset.map(tokenize_function, batched=True)


Map:   0%|          | 0/98 [00:00<?, ? examples/s]
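Because `labels` is a plain copy of `input_ids`, the loss is also computed on the padding positions. One common alternative, sketched here on plain lists, replaces labels at padded positions with `-100`, the default index that PyTorch's cross-entropy loss ignores. The token ids are made up for illustration. Since this notebook reuses the EOS token for padding, masking every padded position would also remove the end-of-sequence signal unless the text itself ends with an EOS token that stays unmasked, so this is a trade-off rather than a drop-in fix.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default

def mask_padding(input_ids: list[int], attention_mask: list[int]) -> list[int]:
    """Copy input_ids into labels, ignoring positions where attention_mask is 0."""
    return [
        token if keep == 1 else IGNORE_INDEX
        for token, keep in zip(input_ids, attention_mask)
    ]

# Toy sequence: three real tokens followed by three padding tokens (hypothetical ids)
input_ids = [1, 523, 88, 2, 2, 2]
attention_mask = [1, 1, 1, 0, 0, 0]

labels = mask_padding(input_ids, attention_mask)
print(labels)  # [1, 523, 88, -100, -100, -100]
```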

In [40]:
training_args = TrainingArguments(
    output_dir="./tinyllama-qlora-support-bot",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    logging_dir="./logs",
    num_train_epochs=3,
    logging_steps=10,
    save_total_limit=2,
    save_strategy="epoch",
    bf16=True,
    optim="paged_adamw_8bit"
)


In [41]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    tokenizer=tokenizer
)

trainer.train()


No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


| Step | Training Loss |
|------|---------------|
| 10   | 1.1861        |
| 20   | 1.1396        |
| 30   | 1.0944        |




TrainOutput(global_step=39, training_loss=1.1040601974878557, metrics={'train_runtime': 489.9927, 'train_samples_per_second': 0.6, 'train_steps_per_second': 0.08, 'total_flos': 935355809267712.0, 'train_loss': 1.1040601974878557, 'epoch': 3.0})
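The reported `global_step=39` can be reproduced from the training arguments and the 98 examples in the dataset. The sketch below assumes a single GPU and that a trailing partial accumulation group still triggers an optimizer step, which is consistent with the 39 reported here; other `transformers` versions may round differently. `optimizer_steps` is a hypothetical helper, not part of any library.

```python
import math

def optimizer_steps(n_examples: int, per_device_batch: int, grad_accum: int,
                    epochs: int, n_devices: int = 1) -> int:
    """Estimate optimizer steps, rounding partial batches and groups up."""
    batches_per_epoch = math.ceil(n_examples / (per_device_batch * n_devices))
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs

# Values from this notebook: 98 examples, batch size 2, accumulation 4, 3 epochs
steps = optimizer_steps(n_examples=98, per_device_batch=2, grad_accum=4, epochs=3)
effective_batch = 2 * 4
print(f"effective batch size: {effective_batch}, optimizer steps: {steps}")
```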

In [42]:
model.save_pretrained("tinyllama-qlora-support-bot")
tokenizer.save_pretrained("tinyllama-qlora-support-bot")


('tinyllama-qlora-support-bot/tokenizer_config.json',
 'tinyllama-qlora-support-bot/special_tokens_map.json',
 'tinyllama-qlora-support-bot/chat_template.jinja',
 'tinyllama-qlora-support-bot/tokenizer.model',
 'tinyllama-qlora-support-bot/added_tokens.json',
 'tinyllama-qlora-support-bot/tokenizer.json')

In [45]:
from transformers import pipeline

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

instruction = "What does it mean to have a mental illness?"
prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"

output = pipe(prompt, max_new_tokens=100)
print(output[0]['generated_text'])


Device set to use cuda:0


### Instruction:
What does it mean to have a mental illness?

### Response:
A mental illness is not a disease or a disorder, although mental illnesses are often thought of as being similar to diseases. They are different because they are mental, and they have different causes and symptoms. A mental illness can or may be affecting your thinking, emotions, and behaviours. It may be something that happens to you when you have a mental illness or something that is going on in your brain or mind. Here are some examples: 
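The pipeline returns the prompt together with the continuation, so the answer has to be cut out of the generated text. A minimal sketch of one way to do that is shown below; it also trims anything after a further `### Instruction:` header in case the model starts a new turn. `extract_response` is a hypothetical helper and returns an empty string when the response marker is missing.

```python
RESPONSE_MARKER = "### Response:\n"
INSTRUCTION_MARKER = "### Instruction:"

def extract_response(generated_text: str) -> str:
    """Return only the model's answer from a prompt-plus-continuation string."""
    # Keep what follows the first response marker (empty if the marker is absent)
    _, _, tail = generated_text.partition(RESPONSE_MARKER)
    # Drop any new instruction block the model may have started on its own
    answer, _, _ = tail.partition(INSTRUCTION_MARKER)
    return answer.strip()

sample = (
    "### Instruction:\nWhat is stress?\n\n"
    "### Response:\nStress is a response to pressure.\n\n"
    "### Instruction:\nAnother question"
)
print(extract_response(sample))  # Stress is a response to pressure.
```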


In [12]:
import shutil
shutil.make_archive("tinyllama-qlora-support-bot", 'zip', "tinyllama-qlora-support-bot")


'/content/tinyllama-qlora-support-bot.zip'
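Since `output_dir` and the save directory are the same folder, the archive also picks up the intermediate `checkpoint-*` directories (they appear in the unzip listing further down). If only the final adapter and tokenizer files are needed, a sketch like the following could build a smaller archive; `zip_without_checkpoints` is a hypothetical helper, and the zip file should be written outside the source directory so it does not include itself.

```python
import zipfile
from pathlib import Path

def zip_without_checkpoints(src_dir: str, zip_path: str) -> list[str]:
    """Zip src_dir, skipping top-level checkpoint-* directories. Returns archived names."""
    src = Path(src_dir)
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            rel = path.relative_to(src)
            # Skip directories themselves and anything under checkpoint-*/
            if not path.is_file() or rel.parts[0].startswith("checkpoint-"):
                continue
            zf.write(path, rel.as_posix())
            written.append(rel.as_posix())
    return written
```

For example, `zip_without_checkpoints("tinyllama-qlora-support-bot", "tinyllama-qlora-support-bot.zip")` would write the archive next to the model folder.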

In [17]:
from google.colab import files

# Download the zipped model to your local system
files.download("tinyllama-qlora-support-bot.zip")


In [16]:
!pip install -q gradio


In [18]:
# Mount your Google Drive (optional)
# from google.colab import drive
# drive.mount('/content/drive')

# Unzip your fine-tuned model
!unzip -o tinyllama-qlora-support-bot.zip -d fine_tuned_model


Archive:  tinyllama-qlora-support-bot.zip
   creating: fine_tuned_model/checkpoint-12/
   creating: fine_tuned_model/checkpoint-18/
  inflating: fine_tuned_model/adapter_model.safetensors  
  inflating: fine_tuned_model/adapter_config.json  
  inflating: fine_tuned_model/special_tokens_map.json  
  inflating: fine_tuned_model/tokenizer_config.json  
  inflating: fine_tuned_model/tokenizer.model  
  inflating: fine_tuned_model/chat_template.jinja  
  inflating: fine_tuned_model/tokenizer.json  
  inflating: fine_tuned_model/README.md  
  inflating: fine_tuned_model/checkpoint-18/training_args.bin  
  inflating: fine_tuned_model/checkpoint-18/trainer_state.json  
  inflating: fine_tuned_model/checkpoint-18/adapter_model.safetensors  
  inflating: fine_tuned_model/checkpoint-18/adapter_config.json  
  inflating: fine_tuned_model/checkpoint-18/special_tokens_map.json  
  inflating: fine_tuned_model/checkpoint-18/tokenizer_config.json  
  inflating: fine_tuned_model/checkpoint-18/tokenizer.

In [22]:
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
import gradio as gr

# Load the fine-tuned Mental Health FAQ model (LoRA adapter and tokenizer unzipped above)
model = AutoModelForCausalLM.from_pretrained("./fine_tuned_model")
tokenizer = AutoTokenizer.from_pretrained("./fine_tuned_model")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

def format_prompt(instruction):
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

def chat_response(message, chat_history):
    prompt = format_prompt(message)
    output = pipe(prompt, max_new_tokens=300, do_sample=True, temperature=0.7)
    generated_text = output[0]['generated_text']
    response = generated_text.split("### Response:\n")[-1].strip()
    chat_history.append((message, response))
    return "", chat_history

# Gradio UI with Submit button
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("## 💬 Mental Health FAQ Assistant\nAsk general questions about mental health and where to find support.")

    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Your Question", placeholder="e.g., What does it mean to have a mental illness?", lines=1)
    submit = gr.Button("Submit")
    clear = gr.Button("Clear Chat")

    submit.click(chat_response, [msg, chatbot], [msg, chatbot])
    msg.submit(chat_response, [msg, chatbot], [msg, chatbot])  # Enter key still works
    clear.click(lambda: None, None, chatbot, queue=False)

demo.launch()


Device set to use cuda:0


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://9669015b168bf75e8a.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [23]:
# Upload CSV to Google Colab (from local machine)
from google.colab import files
import pandas as pd

# Method 1: Upload directly from your computer
uploaded = files.upload()

# Get the filename (assuming you upload just one file)
filename = next(iter(uploaded))

Saving Mental_Health_FAQ.csv to Mental_Health_FAQ.csv
