# **Fine-Tuning AI Models with LoRA and Deploying with Streamlit**
## **Hands-On Workshop**
### **Duration: 45 minutes**

This hands-on session covers fine-tuning AI models using **LoRA (Low-Rank Adaptation)** and deploying them using **Streamlit**.

### **Objectives:**
- Understand LoRA and its impact on efficient model fine-tuning.
- Apply LoRA fine-tuning to AI models based on project requirements.
- Fine-tune models including **GPT-2, BERT, Whisper, and Stable Diffusion**.
- Build and deploy an interactive **Streamlit web application**.
- Customize LoRA models for real-world project applications.
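
To make the first objective concrete, here is a minimal NumPy sketch of the idea behind LoRA: the pretrained weight matrix stays frozen, and a low-rank product scaled by `lora_alpha / r` is added to it. The layer size, the random seed, and the variable names are illustrative assumptions, not values taken from any model used in this workshop.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out = 64, 64      # illustrative layer size (assumption)
r, lora_alpha = 8, 16     # same rank and scaling factor used later in the workshop

# Frozen pretrained weight (random stand-in)
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors: A starts random, B starts at zero,
# so the adapted layer initially behaves exactly like the frozen one.
A = rng.normal(scale=0.01, size=(r, d_in))
B = np.zeros((d_out, r))

def lora_forward(x, W, A, B, scaling):
    """Frozen path plus scaled low-rank update: x W^T + scaling * x A^T B^T."""
    return x @ W.T + scaling * (x @ A.T @ B.T)

x = rng.normal(size=(4, d_in))
scaling = lora_alpha / r

base_out = x @ W.T
lora_out = lora_forward(x, W, A, B, scaling)

full_params = W.size
lora_params = A.size + B.size
print(f"full layer: {full_params} parameters, LoRA factors: {lora_params} parameters")
```

Only `A` and `B` would receive gradients during fine-tuning, which is where the savings in trainable parameters come from.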


## **Step 1: Install Dependencies**
First, install the required libraries.

In [5]:
!pip install transformers peft accelerate streamlit diffusers torch torchaudio datasets pandas



## **Step 2: Select and Load Your Model**
Choose the model based on your project:
- **GPT-2** for text generation.
- **BERT** for text classification.
- **Whisper** for speech-to-text.
- **Stable Diffusion** for text-to-image.

In [6]:
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSequenceClassification, AutoModelForSpeechSeq2Seq
from diffusers import StableDiffusionPipeline
from peft import LoraConfig, get_peft_model

# Choose model
model_choice = 'gpt2'  # Change to 'bert', 'whisper', or 'stable-diffusion' as needed

if model_choice == 'gpt2':
    model_name = 'gpt2'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
elif model_choice == 'bert':
    model_name = 'bert-base-uncased'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
elif model_choice == 'whisper':
    model_name = 'openai/whisper-small'
    tokenizer = None
    model = AutoModelForSpeechSeq2Seq.from_pretrained(model_name)
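elif model_choice == 'stable-diffusion':
    model_name = 'runwayml/stable-diffusion-v1-5'
    tokenizer = None
    model = StableDiffusionPipeline.from_pretrained(model_name)
else:
    # Fail early instead of leaving `model` undefined for the later cells
    raise ValueError(f"Unknown model_choice: {model_choice!r}")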

## **Step 3: Apply LoRA Fine-Tuning**
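Wrap the loaded model with LoRA adapters so that only a small set of low-rank matrices is trainable while the original weights stay frozen. The configuration below targets GPT-2 (`c_attn` with the `CAUSAL_LM` task type); other models need their own `target_modules` and `task_type`.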

In [7]:
# LoRA configuration for GPT-2
lora_config = LoraConfig(
    r=8,  # rank of the low-rank update matrices
    lora_alpha=16,  # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection layer
    task_type="CAUSAL_LM"  # causal language modeling
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

trainable params: 294,912 || all params: 124,734,720 || trainable%: 0.2364
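
The trainable-parameter count printed above can be reproduced by hand. The sketch below assumes the standard GPT-2 small dimensions (12 transformer blocks, hidden size 768, and a fused `c_attn` projection mapping 768 to 2304 features); these figures are not stated in the notebook itself.

```python
def lora_param_count(in_features, out_features, r):
    """Parameters added by one LoRA adapter: A is (r x in), B is (out x r)."""
    return r * in_features + out_features * r

# Assumed GPT-2 small dimensions (not given in the notebook)
n_layers = 12
hidden = 768
c_attn_out = 3 * hidden   # query, key and value are fused in one projection

per_layer = lora_param_count(hidden, c_attn_out, r=8)
trainable = n_layers * per_layer

all_params = 124_734_720  # total reported by print_trainable_parameters()
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / all_params:.4f}")
```

Doubling `r` doubles the trainable count, since each adapter's size is linear in the rank.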




## **Step 4: Test Fine-Tuned Model**
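Run a sample prompt through the LoRA-wrapped model. No training step has been run at this point, and LoRA adapters are initialized so that they leave the model's output unchanged, so the text below reflects base GPT-2 behavior.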

In [8]:
# Example for GPT-2
if model_choice == 'gpt2':
    prompt = "The future of AI is"
    inputs = tokenizer(prompt, return_tensors='pt')
    # Pass the attention mask and an explicit pad token so generation runs without warnings
    output = model.generate(**inputs, max_length=50, pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output[0], skip_special_tokens=True))


The future of AI is uncertain. The future of AI is uncertain.

The future of AI is uncertain. The future of AI is uncertain.

The future of AI is uncertain. The future of AI is uncertain.

The future
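
The output above loops on a single sentence, which is common with greedy decoding. The sketch below is one hedged way to look at it: a small helper (hypothetical, not part of `transformers`) measures how many word n-grams in a text are repeats, and a dictionary collects `generate` arguments that are commonly used to reduce repetition. The specific values are illustrative starting points, not tuned settings.

```python
from collections import Counter

def repeated_ngram_ratio(text, n=3):
    """Fraction of word n-grams that duplicate an earlier n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)

greedy_output = (
    "The future of AI is uncertain. The future of AI is uncertain. "
    "The future of AI is uncertain. The future of AI is uncertain."
)
print(f"repeated 3-gram ratio: {repeated_ngram_ratio(greedy_output):.2f}")

# Arguments accepted by model.generate() that typically reduce looping
generation_kwargs = {
    "max_new_tokens": 50,
    "do_sample": True,          # sample instead of always taking the top token
    "top_p": 0.9,               # nucleus sampling
    "temperature": 0.8,
    "no_repeat_ngram_size": 3,  # forbid repeating any 3-gram
    "repetition_penalty": 1.2,
}

# Usage, assuming `model`, `tokenizer` and `prompt` from the notebook:
# inputs = tokenizer(prompt, return_tensors="pt")
# output = model.generate(**inputs, **generation_kwargs, pad_token_id=tokenizer.eos_token_id)
```

Comparing the ratio before and after changing the decoding settings gives a rough, quantitative check that the change helped.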


## **Step 5: Deploy as a Streamlit Web App**
Now, create a simple **Streamlit web interface** for model interaction.

In [9]:
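%%writefile app.py
import streamlit as st
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

st.title('LoRA Fine-Tuned Model Web Interface')

# Load the model once and reuse it across Streamlit reruns
@st.cache_resource
def load_model():
    # Note: this loads base GPT-2; load your saved LoRA adapter here to serve a fine-tuned model
    tokenizer = AutoTokenizer.from_pretrained('gpt2')
    model = AutoModelForCausalLM.from_pretrained('gpt2')
    model.eval()
    return tokenizer, model

tokenizer, model = load_model()

# User input
prompt = st.text_input('Enter your prompt:')

if st.button('Generate Text'):
    if not prompt.strip():
        # An empty prompt produces no input tokens, so ask for text instead of calling generate
        st.warning('Please enter a prompt first.')
    else:
        inputs = tokenizer(prompt, return_tensors='pt')
        with torch.no_grad():
            output = model.generate(**inputs, max_length=50, do_sample=True,
                                    pad_token_id=tokenizer.eos_token_id)
        generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
        st.write(generated_text)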

Overwriting app.py


## **Step 6: Run the Streamlit App**
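Run the following command to launch the application. The cell keeps running while the server is up, so interrupt it to stop the app. On a hosted notebook such as Colab, `localhost` is not reachable from your browser directly; run the app locally or expose the port through a tunnel.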

In [10]:
!streamlit run app.py

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://10.0.80.179:8501

  For better performance, install the Watchdog module:

  $ xcode-select --install
  $ pip install watchdog

^C
  Stopping...
Exception ignored in: <module 'threading' from '/Users/nanxuan/miniconda3/envs/dscapstone/lib/python3.9/threading.py'>
Traceback (most recent call last):
  File "/Users/nanxuan/miniconda3/envs/dscapstone/lib/python3.9/threading.py", line 1447, in _shutdown
    atexit_call()
  File "/Users/nanxuan/miniconda3/envs/dscapstone/lib/python3.9/concurrent/futures/thread.py", line 31, in _python_exit
    t.join()
  File "/Users/nanxuan/miniconda3/envs/dscapstone/lib/python3.9/threading.py", line 1060, in join
    self._wait_for_tstate_lock()
  File "/Users/nanxuan/miniconda3/envs/dscapstone/lib/python3.9/threading.py", line 1080, in _wait_for_tst

## **Step 7: Customize for Your Project**
Participants should adapt LoRA fine-tuning and Streamlit deployment based on their specific project requirements.

### **Customizing LoRA for Your Project:**
- Adjust LoRA parameters such as rank and dropout based on dataset size.
- Train with domain-specific data to improve model accuracy.
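
As a rough guide when adjusting the rank, the sketch below tabulates how the number of trainable parameters grows with `r`. It assumes BERT-base dimensions (12 layers, hidden size 768) with LoRA on two square projections per layer, and it keeps `lora_alpha` at twice the rank, mirroring the `r=8`, `lora_alpha=16` pairing used in this notebook. The candidate ranks and the fixed dropout value are illustrative, not recommendations.

```python
def lora_trainable_params(r, hidden=768, n_layers=12, n_targets=2):
    """LoRA parameters when adapting `n_targets` square projections per layer."""
    return n_layers * n_targets * r * (hidden + hidden)

sweep = []
for r in (4, 8, 16, 32):
    sweep.append({
        "r": r,
        "lora_alpha": 2 * r,      # keeps lora_alpha / r at 2, as in the workshop config
        "lora_dropout": 0.1,      # held fixed here; raise it if the model overfits
        "trainable_params": lora_trainable_params(r),
    })

for row in sweep:
    print(row)
```

A common heuristic is to start with a small rank on small datasets and increase it only if validation metrics stop improving.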

### **Enhancing the Web Interface:**
- Modify the UI to include more features such as dropdowns and sliders.
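- Reduce latency, for example by caching the loaded model so it is not reloaded on every interaction, and tune the generation settings to improve the quality of text responses.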
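
The interface ideas above could look roughly like the following sketch. It assumes a text-generation app similar to `app.py`; the widget labels, value ranges, and the helper names `build_generation_kwargs` and `render_controls` are illustrative choices, not part of the original app.

```python
def build_generation_kwargs(max_new_tokens, temperature, strategy):
    """Translate UI selections into keyword arguments for model.generate()."""
    kwargs = {"max_new_tokens": int(max_new_tokens)}
    if strategy == "Sampling":
        kwargs.update(do_sample=True, temperature=float(temperature))
    else:  # "Greedy"
        kwargs["do_sample"] = False
    return kwargs

def render_controls():
    """Draw the sidebar widgets and return the resulting generation settings."""
    import streamlit as st  # imported here so the helper above works without Streamlit

    strategy = st.sidebar.selectbox("Decoding strategy", ["Sampling", "Greedy"])
    max_new_tokens = st.sidebar.slider(
        "Max new tokens", min_value=10, max_value=200, value=50, step=10
    )
    temperature = st.sidebar.slider(
        "Temperature", min_value=0.1, max_value=1.5, value=0.8, step=0.1
    )
    return build_generation_kwargs(max_new_tokens, temperature, strategy)

# Hypothetical usage inside app.py:
# gen_kwargs = render_controls()
# output = model.generate(**inputs, **gen_kwargs)
```

Keeping the widget-to-argument translation in a plain function makes it easy to check without starting the Streamlit server.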

### **Deploying Your Model:**
- Consider deploying the model on **Hugging Face Spaces** or **AWS Lambda** for wider accessibility.
- Document project results and improvements.

## **Load and Preprocess the Dataset**

In [11]:
import pandas as pd
from datasets import Dataset
from transformers import BertTokenizer

COLUMNS = ["ID", "Category", "Sentiment", "Text"]

# Map sentiment labels to numerical values
label_mapping = {"Negative": 0, "Neutral": 1, "Positive": 2}

def load_split(file_path):
    """Load one CSV split and return a Hugging Face Dataset with Text and Sentiment columns."""
    df = pd.read_csv(file_path)

    # Rename columns based on dataset structure
    # (if the CSV has no header row, read it with header=None so the first record is kept)
    df.columns = COLUMNS

    # Drop rows with missing text
    df = df.dropna(subset=["Text"]).copy()

    # Map labels to integers; any label not in label_mapping becomes NaN
    df["Sentiment"] = df["Sentiment"].map(label_mapping)

    # Drop rows whose label was not in label_mapping, then convert to integer type
    df = df.dropna(subset=["Sentiment"]).reset_index(drop=True)
    df["Sentiment"] = df["Sentiment"].astype(int)

    # Convert DataFrame to Hugging Face Dataset
    return Dataset.from_pandas(df[["Text", "Sentiment"]])

# Load training and validation datasets
train_dataset = load_split("archive/twitter_training.csv")
val_dataset = load_split("archive/twitter_validation.csv")
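
Because `.map(label_mapping)` turns any label that is missing from the mapping into `NaN`, the subsequent `dropna` removes those rows without warning. A quick count before dropping makes that visible. The rows below are made up for illustration (including the `Irrelevant` label), so the numbers say nothing about the real CSV files.

```python
import pandas as pd

label_mapping = {"Negative": 0, "Neutral": 1, "Positive": 2}

# Toy data standing in for the real CSV (made-up rows)
df = pd.DataFrame({
    "Sentiment": ["Positive", "Negative", "Irrelevant", "Neutral", "Irrelevant"],
    "Text": ["great game", "so buggy", "unrelated post", "patch notes are out", "spam"],
})

mapped = df["Sentiment"].map(label_mapping)

# Labels that have no entry in label_mapping and would be dropped
unmapped_counts = df.loc[mapped.isna(), "Sentiment"].value_counts()
print(unmapped_counts)

clean = df.assign(Sentiment=mapped).dropna(subset=["Sentiment"]).reset_index(drop=True)
clean["Sentiment"] = clean["Sentiment"].astype(int)
print(f"kept {len(clean)} of {len(df)} rows")
```

Running the same `value_counts` check on the real training file shows which labels, if any, are discarded before training.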

In [12]:
# Load tokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Tokenization function
def preprocess_function(examples):
    result = tokenizer(examples['Text'], truncation=True, padding='max_length', max_length=128)
    result["labels"] = examples["Sentiment"]
    return result

# Apply tokenization to training and validation datasets
train_dataset = train_dataset.map(preprocess_function, batched=True)
val_dataset = val_dataset.map(preprocess_function, batched=True)

Map: 100%|██████████| 61120/61120 [00:16<00:00, 3700.35 examples/s]
Map: 100%|██████████| 828/828 [00:00<00:00, 3105.60 examples/s]
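
With `padding='max_length'` and `truncation=True`, every example ends up with exactly `max_length` token ids plus an attention mask marking the real tokens. The plain-Python sketch below imitates that behavior on made-up token ids. It illustrates the idea only: it is not the tokenizer's actual implementation, and it ignores special tokens such as `[CLS]` and `[SEP]`.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    ids = list(token_ids[:max_length])   # truncation: keep at most max_length ids
    mask = [1] * len(ids)                # 1 marks a real token
    padding = max_length - len(ids)
    # 0 in the mask marks padding positions the model should ignore
    return ids + [pad_id] * padding, mask + [0] * padding

short_ids, short_mask = pad_or_truncate([101, 2023, 2003], max_length=6)
long_ids, long_mask = pad_or_truncate(list(range(10)), max_length=6)

print(short_ids, short_mask)
print(long_ids, long_mask)
```

Padding every example to 128 tokens is simple but wasteful for short tweets; padding per batch is a possible alternative when training time matters.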


## **LoRA Fine-Tuning (BERT Model)**

In [13]:
from transformers import BertForSequenceClassification, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model
import torch

# Load pre-trained BERT model
model_name = "bert-base-uncased"
base_model = BertForSequenceClassification.from_pretrained(model_name, num_labels=3)  # 3 sentiment classes

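# Define LoRA Configuration
# Note: no task_type is set here, so only the LoRA matrices are trainable and the
# newly initialized classifier head stays frozen; pass task_type="SEQ_CLS" to train it too.
lora_config = LoraConfig(
    r=8,  # LoRA rank
    lora_alpha=16,  # LoRA scaling factor
    lora_dropout=0.1,  # Dropout rate for LoRA layers
    target_modules=["query", "value"],  # Apply LoRA to the attention query and value projections
)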

# Apply LoRA to the model
model = get_peft_model(base_model, lora_config)

# Print model summary (optional)
model.print_trainable_parameters()

# Define Training Arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # named eval_strategy in recent transformers releases
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=50,
    report_to="none",  # Avoid logging to wandb/huggingface unless needed
)

# Trainer with validation dataset
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,  
)

# Start training
trainer.train()

# Save the trained model
trainer.save_model("./trained_lora_model")


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 294,912 || all params: 109,779,459 || trainable%: 0.2686


Epoch,Training Loss,Validation Loss
1,0.9318,No log
2,0.7993,No log
3,0.7973,No log
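
Two details in this output deserve a closer look. The trainable count of 294,912 matches the LoRA matrices alone, which suggests the newly initialized classifier head was frozen during training. The validation loss shows `No log`, which with PEFT-wrapped models is often a sign that the `Trainer` did not recognize the label column. The sketch below checks the arithmetic, assuming the standard BERT-base dimensions (12 layers, hidden size 768), and lists settings that may address both points. Treat them as suggestions to try, not as a verified fix for this run.

```python
# Assumed BERT-base dimensions (not stated in the notebook)
n_layers, hidden, num_labels, r = 12, 768, 3, 8

# LoRA on query and value: each adapter adds r * (in_features + out_features) parameters
lora_params = n_layers * 2 * r * (hidden + hidden)

# Linear classifier head: weight (num_labels x hidden) plus bias
classifier_params = hidden * num_labels + num_labels

print(f"LoRA params:       {lora_params:,}")
print(f"classifier params: {classifier_params:,}")
print(f"reported trainable count is LoRA only: {lora_params == 294_912}")

# Settings to try: LoraConfig(**lora_kwargs) and TrainingArguments(..., **extra_training_kwargs)
lora_kwargs = {
    "r": r,
    "lora_alpha": 16,
    "lora_dropout": 0.1,
    "target_modules": ["query", "value"],
    "task_type": "SEQ_CLS",   # PEFT then keeps the classification head trainable
}
extra_training_kwargs = {
    "label_names": ["labels"],  # tells the Trainer which column holds the labels during evaluation
}
```

If the head becomes trainable, the count printed by `print_trainable_parameters()` should rise above the LoRA-only figure, which is an easy way to confirm the setting took effect.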


## **Streamlit Deployment**

In [14]:
!streamlit run app.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://10.0.80.179:8501

  For better performance, install the Watchdog module:

  $ xcode-select --install
  $ pip install watchdog

Using device: mps
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2025-02-13 15:08:43.448 Examining the path of torch.classes raised:
Traceback (most recent call last):
  File "/Users/nanxuan/miniconda3/envs/dscapstone/lib/python3.9/site-packages/streamlit/watcher/local_sources_watcher.py", line 217, in get_module_paths
    potential_paths = extract_paths(module)
  File "/Users/nanxuan/miniconda3/envs/dscapstone/lib/python