# **Fine-Tuning AI Models with LoRA and Deploying with Streamlit**
## **Hands-On Workshop**
### **Duration: 45 minutes**

This hands-on session covers fine-tuning AI models using **LoRA (Low-Rank Adaptation)** and deploying them using **Streamlit**.

### **Objectives:**
- Understand LoRA and its impact on efficient model fine-tuning.
- Apply LoRA fine-tuning to AI models based on project requirements.
- Fine-tune models including **GPT-2, BERT, Whisper, and Stable Diffusion**.
- Build and deploy an interactive **Streamlit web application**.
- Customize LoRA models for real-world project applications.


## **Step 1: Install Dependencies**
First, install the required libraries.

In [2]:
!pip install transformers peft accelerate streamlit diffusers torch torchaudio


## **Step 2: Select and Load Your Model**
Choose the model based on your project:
- **GPT-2** for text generation.
- **BERT** for text classification.
- **Whisper** for speech-to-text.
- **Stable Diffusion** for text-to-image.

In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSequenceClassification, AutoModelForSpeechSeq2Seq
from diffusers import StableDiffusionPipeline
from peft import LoraConfig, get_peft_model

# Choose model
model_choice = 'bert'  # Options: 'gpt2', 'bert', 'whisper', or 'stable-diffusion'

if model_choice == 'gpt2':
    model_name = 'gpt2'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
elif model_choice == 'bert':
    model_name = 'bert-base-uncased'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
elif model_choice == 'whisper':
    model_name = 'openai/whisper-small'
    tokenizer = None
    model = AutoModelForSpeechSeq2Seq.from_pretrained(model_name)
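elif model_choice == 'stable-diffusion':
    model_name = 'runwayml/stable-diffusion-v1-5'
    tokenizer = None
    model = StableDiffusionPipeline.from_pretrained(model_name)
else:
    # Fail early on a typo instead of leaving `model` undefined
    raise ValueError(f"Unknown model_choice: {model_choice!r}")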

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
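The if/elif chain in the cell above can also be written as a lookup table, which keeps each checkpoint name next to its task. The following is a minimal sketch rather than part of the workshop code: `MODEL_REGISTRY` and `resolve_model` are hypothetical names introduced here, while the checkpoint names are the ones used in the cell.

```python
# Checkpoint names come from the cell above; the "task" strings are descriptive labels only.
# MODEL_REGISTRY and resolve_model are illustrative names, not part of any library.
MODEL_REGISTRY = {
    "gpt2": {"checkpoint": "gpt2", "task": "text generation"},
    "bert": {"checkpoint": "bert-base-uncased", "task": "text classification"},
    "whisper": {"checkpoint": "openai/whisper-small", "task": "speech-to-text"},
    "stable-diffusion": {"checkpoint": "runwayml/stable-diffusion-v1-5", "task": "text-to-image"},
}


def resolve_model(choice: str) -> dict:
    """Look up a model choice and fail with a clear message on a typo."""
    try:
        return MODEL_REGISTRY[choice]
    except KeyError:
        options = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown model_choice {choice!r}; expected one of: {options}") from None


print(resolve_model("bert"))
```

A table like this makes it harder to leave `model` undefined after a misspelled choice, because the lookup raises immediately.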


### **Loading the Dataset**

In [47]:
!pip install datasets



In [4]:
import pandas as pd

# Load the CSV before inspecting it (same path and encoding as the consolidated cell below)
df = pd.read_csv("/content/e-commerce.csv", encoding="ISO-8859-1")
print(df.columns)  # Ensure "Description" is a valid column name

In [5]:
df = df.dropna(subset=["Description"])
df = df.astype(str)  # Ensure all values are strings

In [50]:
from datasets import Dataset

dataset = Dataset.from_pandas(df)


In [51]:
def tokenize_function(examples):
    return tokenizer(examples["Description"], padding=True, truncation=True)

# Tokenize dataset
tokenized_datasets = dataset.map(tokenize_function, batched=True)



In [52]:
print(dataset[0])  # Check first row
print(dataset.column_names)  # Ensure "Description" is in the dataset


{'InvoiceNo': '536365', 'StockCode': '85123A', 'Description': 'WHITE HANGING HEART T-LIGHT HOLDER', 'Quantity': '6', 'InvoiceDate': '12/1/2010 8:26', 'UnitPrice': '2.55', 'CustomerID': '17850.0', 'Country': 'United Kingdom', '__index_level_0__': 0}
['InvoiceNo', 'StockCode', 'Description', 'Quantity', 'InvoiceDate', 'UnitPrice', 'CustomerID', 'Country', '__index_level_0__']


In [53]:
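import pandas as pd
from datasets import Dataset

# Consolidated version of the loading and tokenization cells above
df = pd.read_csv("/content/e-commerce.csv", encoding="ISO-8859-1")
df = df.dropna(subset=["Description"]).astype(str)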

dataset = Dataset.from_pandas(df)

def tokenize_function(examples):
    return tokenizer(examples["Description"], padding=True, truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)



## **Step 3: Apply LoRA Fine-Tuning**
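Wrap the model with LoRA adapters: the pre-trained weights stay frozen and only small low-rank matrices (plus the classification head) are trainable, which cuts the memory and compute needed for fine-tuning. The cell below attaches the adapters and reports the trainable-parameter count; it does not run a training loop.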

In [6]:
# Define LoRA config for BERT-based models
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["attention.self.query", "attention.self.key", "attention.self.value", "intermediate.dense"],  # Attention projections and the feed-forward intermediate layer
    task_type="SEQ_CLS"  # Sequence classification task
)

# Apply LoRA to the model
model = get_peft_model(model, lora_config)

# Check trainable parameters
model.print_trainable_parameters()


trainable params: 812,546 || all params: 110,296,324 || trainable%: 0.7367
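The trainable-parameter figure printed above can be reproduced by hand. The sketch below assumes the standard `bert-base-uncased` shapes (hidden size 768, feed-forward size 3072, 12 layers) and that, with `task_type="SEQ_CLS"`, the classification head is kept trainable alongside the adapters. `lora_params` is a helper name introduced here for illustration.

```python
# Shapes assumed for bert-base-uncased: hidden size 768, feed-forward size 3072, 12 layers
HIDDEN, INTERMEDIATE, NUM_LAYERS = 768, 3072, 12
RANK = 8          # r in the LoraConfig above
NUM_LABELS = 2    # num_labels used when the classifier was loaded


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x in), B is (out x rank)."""
    return rank * in_features + out_features * rank


# query, key and value are each 768 -> 768; intermediate.dense is 768 -> 3072
per_layer = 3 * lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, INTERMEDIATE, RANK)
adapter_total = NUM_LAYERS * per_layer

# Assumption: the classification head (weight + bias) is trainable as well
classifier_head = HIDDEN * NUM_LABELS + NUM_LABELS

trainable = adapter_total + classifier_head
print(f"adapters: {adapter_total:,}  head: {classifier_head:,}  trainable: {trainable:,}")
```

Under these assumptions the total comes to 812,546, matching the `print_trainable_parameters()` output, with almost all of it in the adapters and 1,538 parameters in the head.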


## **Step 4: Test Fine-Tuned Model**
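Provide a sample input to check that inference works. Note that the cell below loads the base `bert-base-uncased` masked-language-model checkpoint and rebinds `tokenizer` and `model`, so it exercises the pre-trained model rather than the LoRA-adapted classifier from Step 3.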

In [7]:
from transformers import BertTokenizer, BertForMaskedLM
import torch

# Load pre-trained BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')

# Define the sentence with a masked token
prompt = "The future of AI is [MASK]."

# Tokenize the input and get the tensor representation
input_ids = tokenizer(prompt, return_tensors='pt').input_ids

# Predict the masked token
with torch.no_grad():
    outputs = model(input_ids)
    predictions = outputs.logits

# Get the index of the predicted token
masked_index = torch.where(input_ids == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)

# Decode the predicted token
predicted_word = tokenizer.decode(predicted_token_id)

# Print the completed sentence
completed_prompt = prompt.replace("[MASK]", predicted_word)
print(completed_prompt)


The future of AI is uncertain.


## **Step 5: Deploy as a Streamlit Web App**
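Now, create a simple **Streamlit web interface** for model interaction. The app below uses the publicly available `nlptown/bert-base-multilingual-uncased-sentiment` checkpoint for its predictions rather than the LoRA-adapted model from Step 3.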

In [14]:
%%writefile app.py
import streamlit as st
import pandas as pd
import torch
from transformers import BertTokenizer, BertForSequenceClassification

MODEL_NAME = 'nlptown/bert-base-multilingual-uncased-sentiment'

# Map the predicted class index (0-4 scale) to a label
sentiment_map = {
    0: "Very Negative",
    1: "Negative",
    2: "Neutral",
    3: "Positive",
    4: "Very Positive"
}

st.title('BERT-based Sentiment Analysis for E-commerce Data')

# Load the dataset once; st.cache_data reuses the result across reruns
@st.cache_data
def load_data():
    return pd.read_csv("/content/sample_data/e-commerce.csv", encoding="ISO-8859-1")  # Update path if needed

# Load the pre-trained sentiment model once instead of on every button click
@st.cache_resource
def load_model():
    tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
    model = BertForSequenceClassification.from_pretrained(MODEL_NAME)
    model.eval()
    return tokenizer, model

df = load_data()
st.write("Dataset Loaded Successfully!")

# Ensure the expected column exists
if "Description" in df.columns:
    user_input = st.text_area("Enter a product description:")

    if st.button("Analyze Sentiment"):
        if not user_input.strip():
            st.warning("Please enter a product description first.")
        else:
            tokenizer, model = load_model()

            # Tokenize the user input (description)
            inputs = tokenizer(user_input, return_tensors='pt', truncation=True, padding=True, max_length=512)

            # Perform inference without tracking gradients
            with torch.no_grad():
                logits = model(**inputs).logits

            # Pick the class with the highest score and display its label
            sentiment = torch.argmax(logits, dim=1).item()
            st.write(f"Sentiment: {sentiment_map[sentiment]}")
else:
    st.error("CSV file must have a 'Description' column.")


Overwriting app.py
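The app reports only the highest-scoring class. A possible refinement is to show how confident the model is by converting the logits to probabilities with a softmax. The sketch below uses plain Python so the logic is easy to follow; `softmax`, `label_with_confidence`, and the example logits are illustrative and not part of the workshop code.

```python
import math

# Same 0-4 label scale as sentiment_map in app.py
SENTIMENT_LABELS = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SENTIMENT_LABELS[best], probs[best]


# Made-up logits for illustration; in the app they could come from logits[0].tolist()
example_logits = [-1.2, -0.3, 0.4, 2.1, 1.0]
label, confidence = label_with_confidence(example_logits)
print(f"Sentiment: {label} ({confidence:.0%})")
```

In the app itself, `torch.softmax(logits, dim=1)` would give the same probabilities directly on the tensor.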


## **Step 6: Run the Streamlit App**
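Run the following command in Colab to launch the application. The command blocks the cell until it is interrupted, and the local URL it prints refers to the Colab VM rather than your own machine, so a tunnel such as ngrok (set up in the cells after this one) is needed to open the app in your browser.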

In [15]:
!streamlit run app.py


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.0.12:8501
  External URL: http://34.16.189.129:8501

  Stopping...


In [10]:
!pip install streamlit pyngrok

Collecting pyngrok
  Downloading pyngrok-7.2.3-py3-none-any.whl.metadata (8.7 kB)
Downloading pyngrok-7.2.3-py3-none-any.whl (23 kB)
Installing collected packages: pyngrok
Successfully installed pyngrok-7.2.3


In [16]:
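import os
import subprocess
from pyngrok import ngrok

# Read the ngrok auth token from the environment instead of hard-coding it in the notebook.
# NGROK_AUTH_TOKEN is a variable name chosen here; set it before running this cell.
ngrok.set_auth_token(os.environ["NGROK_AUTH_TOKEN"])

port = 8501

# Start Streamlit in the background so this cell does not block
streamlit_process = subprocess.Popen(["streamlit", "run", "app.py", "--server.port", str(port)])

# Open an ngrok tunnel to the Streamlit port
public_url = ngrok.connect(port).public_url
print(f"Public URL: {public_url}")

Public URL: https://bc4f-34-16-189-129.ngrok-free.app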

## **Step 7: Customize for Your Project**
Participants should adapt LoRA fine-tuning and Streamlit deployment based on their specific project requirements.

### **Customizing LoRA for Your Project:**
- Adjust LoRA parameters such as rank and dropout based on dataset size.
- Train with domain-specific data to improve model accuracy.
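The effect of the rank setting can be read off the standard LoRA formulation, sketched here with generic symbols ($W_0$, $A$, $B$, $d$, $k$ are not names from the notebook). A frozen weight matrix $W_0 \in \mathbb{R}^{d \times k}$ receives a trainable low-rank update:

$$
W = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k}
$$

With the values from Step 3 (`r=8`, `lora_alpha=16`) the scaling factor $\alpha / r$ is 2. Each adapted layer adds $r\,(d + k)$ trainable parameters, so doubling the rank doubles the adapter size. A lower rank or a higher `lora_dropout` is a reasonable starting point for small datasets, where overfitting is the larger risk.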

### **Enhancing the Web Interface:**
- Modify the UI to include more features such as dropdowns and sliders.
- Optimize performance by reducing latency (for example, load the model once and cache it across reruns) and by making the displayed results clearer.

### **Deploying Your Model:**
- Consider deploying the model on **Hugging Face Spaces** or **AWS Lambda** for wider accessibility.
- Document project results and improvements.