<a href="https://colab.research.google.com/github/katybohanan/5588-LoRA-Streamlit/blob/main/LoRA_Streamlit_HandsOn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Fine-Tuning AI Models with LoRA and Deploying with Streamlit**
## **Hands-On Workshop**
### **Duration: 45 minutes**

This hands-on session covers fine-tuning AI models using **LoRA (Low-Rank Adaptation)** and deploying them using **Streamlit**.

### **Objectives:**
- Understand LoRA and its impact on efficient model fine-tuning.
- Apply LoRA fine-tuning to AI models based on project requirements.
- Fine-tune models including **GPT-2, BERT, Whisper, and Stable Diffusion**.
- Build and deploy an interactive **Streamlit web application**.
- Customize LoRA models for real-world project applications.


## **Step 1: Install Dependencies**
First, install the required libraries.

In [26]:
!pip install transformers peft accelerate streamlit diffusers torch torchaudio pyngrok kaggle



## **Step 2: Select and Load Your Model**
Choose the model based on your project:
- **GPT-2** for text generation.
- **BERT** for text classification.
- **Whisper** for speech-to-text.
- **Stable Diffusion** for text-to-image.

In [27]:
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSequenceClassification, AutoModelForSpeechSeq2Seq
from diffusers import StableDiffusionPipeline
from peft import LoraConfig, get_peft_model

# Choose model
model_choice = 'gpt2'

if model_choice == 'gpt2':
    model_name = 'gpt2'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
elif model_choice == 'bert':
    model_name = 'bert-base-uncased'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
elif model_choice == 'whisper':
    model_name = 'openai/whisper-small'
    tokenizer = None
    model = AutoModelForSpeechSeq2Seq.from_pretrained(model_name)
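elif model_choice == 'stable-diffusion':
    model_name = 'runwayml/stable-diffusion-v1-5'
    tokenizer = None
    model = StableDiffusionPipeline.from_pretrained(model_name)
else:
    # Fail early instead of hitting a NameError on `model` in later cells
    raise ValueError(f"Unsupported model_choice: {model_choice!r}")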

## **Step 3: Apply LoRA Fine-Tuning**
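Wrap the base model with LoRA adapters. The base weights are frozen and only the small low-rank adapter matrices are trainable, which is what makes fine-tuning efficient. This step only attaches the adapters; it does not run any training.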

In [28]:
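# LoRA configuration for GPT-2 (these settings assume model_choice == 'gpt2')
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor; the update is scaled by lora_alpha / r
    lora_dropout=0.05,          # dropout applied to the LoRA branch during training
    target_modules=["c_attn"],  # GPT-2's fused query/key/value projection
    task_type="CAUSAL_LM",      # causal language modeling
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()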

trainable params: 294,912 || all params: 124,734,720 || trainable%: 0.2364
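
The trainable-parameter count printed above can be checked by hand. The sketch below assumes the standard GPT-2 small dimensions (12 transformer blocks, hidden size 768, and a fused `c_attn` projection from 768 to 2304 features). For each targeted layer, LoRA adds one `r x d_in` matrix and one `d_out x r` matrix.

```python
# Worked check of the LoRA parameter count for GPT-2 small.
# Assumed architecture values: 12 blocks, hidden size 768, c_attn maps 768 -> 3 * 768.
r = 8
n_layers = 12
d_in = 768
d_out = 3 * 768  # fused query/key/value projection

# Each adapted layer gets A (r x d_in) and B (d_out x r).
params_per_layer = r * d_in + d_out * r
trainable = n_layers * params_per_layer

total = 124_734_720  # "all params" reported by print_trainable_parameters()
percent = 100 * trainable / total

print(f"trainable params: {trainable:,} || trainable%: {percent:.4f}")
# prints: trainable params: 294,912 || trainable%: 0.2364
```

Doubling `r` doubles the number of trainable parameters, since both matrices scale linearly with the rank.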



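The configuration above is specific to GPT-2. For the other models from Step 2, the target module names and task type differ. The mapping below is a sketch based on the attention-layer names used in the Hugging Face implementations of these models (`query`/`value` for BERT, `q_proj`/`v_proj` for Whisper). `LORA_SETTINGS` and `lora_kwargs` are hypothetical helper names, and the module choices are a common starting point rather than the only valid one. Stable Diffusion is left out because `StableDiffusionPipeline` is a pipeline object rather than a single model, so it would need to be adapted component by component (for example, its UNet).

```python
# Hypothetical per-model LoRA settings; module names follow the Hugging Face
# implementations of each architecture and are a starting point, not a rule.
LORA_SETTINGS = {
    "gpt2": {"target_modules": ["c_attn"], "task_type": "CAUSAL_LM"},
    "bert": {"target_modules": ["query", "value"], "task_type": "SEQ_CLS"},
    # Assumption: task_type is left unset for Whisper in this sketch.
    "whisper": {"target_modules": ["q_proj", "v_proj"], "task_type": None},
}


def lora_kwargs(model_choice, r=8, lora_alpha=16, lora_dropout=0.05):
    """Return keyword arguments for peft.LoraConfig for the chosen model."""
    if model_choice not in LORA_SETTINGS:
        raise ValueError(f"No LoRA settings defined for {model_choice!r}")
    settings = LORA_SETTINGS[model_choice]
    kwargs = {
        "r": r,
        "lora_alpha": lora_alpha,
        "lora_dropout": lora_dropout,
        "target_modules": list(settings["target_modules"]),
    }
    if settings["task_type"] is not None:
        kwargs["task_type"] = settings["task_type"]
    return kwargs


print(lora_kwargs("bert"))
```

With `peft` installed, the result can be passed straight through, for example `lora_config = LoraConfig(**lora_kwargs(model_choice))`.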

## **Step 4: Test Fine-Tuned Model**
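Run a sample prompt through the LoRA-wrapped model. Because the adapters have not been trained yet, the output matches the base model: LoRA initializes its update to zero.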

In [29]:
# Example for GPT-2
if model_choice == 'gpt2':
    prompt = "The future of AI is"
    # Pass the attention mask along with the input IDs, and set pad_token_id
    # explicitly (GPT-2 has no pad token), to avoid generation warnings.
    inputs = tokenizer(prompt, return_tensors='pt')
    output = model.generate(**inputs, max_length=50, pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output[0], skip_special_tokens=True))


The future of AI is uncertain. The future of AI is uncertain.

The future of AI is uncertain. The future of AI is uncertain.

The future of AI is uncertain. The future of AI is uncertain.

The future
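
The repeated sentence in the output above is typical of greedy decoding, which `generate` uses by default: always picking the most likely next token tends to loop. The sketch below collects a few standard `generate` arguments that usually reduce repetition. The specific values are illustrative assumptions, not tuned settings, and `SAMPLING_KWARGS` is a hypothetical name.

```python
# Illustrative decoding settings for model.generate(); values are assumptions.
SAMPLING_KWARGS = {
    "do_sample": True,          # sample from the distribution instead of taking the argmax
    "top_k": 50,                # consider only the 50 most likely tokens at each step
    "top_p": 0.95,              # nucleus sampling: keep tokens covering 95% of the probability mass
    "temperature": 0.8,         # values below 1 sharpen the distribution
    "no_repeat_ngram_size": 2,  # never generate the same 2-gram twice
    "max_length": 50,           # total length in tokens, prompt included
}

# Usage in the notebook (requires the model and tokenizer from Step 2):
# inputs = tokenizer("The future of AI is", return_tensors="pt")
# output = model.generate(**inputs, pad_token_id=tokenizer.eos_token_id, **SAMPLING_KWARGS)
# print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because sampling is random, the generated text will differ from run to run.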


## **Step 5: Deploy as a Streamlit Web App**
Now, create a simple **Streamlit web interface** for model interaction.

In [30]:
%%writefile app.py
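import streamlit as st
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

st.title('LoRA Fine-Tuned Model Web Interface')

# Load the model once and cache it; Streamlit reruns this script on every interaction.
# Note: this loads the base GPT-2 checkpoint, not LoRA adapter weights.
@st.cache_resource
def load_model():
    tokenizer = AutoTokenizer.from_pretrained('gpt2')
    model = AutoModelForCausalLM.from_pretrained('gpt2')
    model.eval()
    return tokenizer, model

tokenizer, model = load_model()

# User input
prompt = st.text_input('Enter your prompt:')

if st.button('Generate Text'):
    if not prompt.strip():
        # An empty prompt produces no input tokens, so generation would fail
        st.warning('Please enter a prompt first.')
    else:
        inputs = tokenizer(prompt, return_tensors='pt')
        with torch.no_grad():
            output = model.generate(**inputs, max_length=250, do_sample=True,
                                    pad_token_id=tokenizer.eos_token_id)
        generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
        st.write(generated_text)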

Overwriting app.py
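
As written, `app.py` loads the base `gpt2` checkpoint, so any LoRA training done in the notebook is not reflected in the app. One way to connect the two, sketched below, is to save the adapter with `save_pretrained` and load it on top of the base model with `PeftModel.from_pretrained`. The directory name `lora-adapter` and the helper names are hypothetical. The heavy imports are placed inside the loading function so the helpers can be defined without loading the libraries.

```python
ADAPTER_DIR = "lora-adapter"  # hypothetical output directory


def save_adapter(peft_model, adapter_dir=ADAPTER_DIR):
    """Save only the LoRA adapter weights and config (not the full base model)."""
    peft_model.save_pretrained(adapter_dir)
    return adapter_dir


def load_lora_model(base_name="gpt2", adapter_dir=ADAPTER_DIR):
    """Load the base model and attach the saved LoRA adapter for inference."""
    from transformers import AutoTokenizer, AutoModelForCausalLM
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_name)
    base_model = AutoModelForCausalLM.from_pretrained(base_name)
    model = PeftModel.from_pretrained(base_model, adapter_dir)
    model.eval()
    return tokenizer, model
```

In the notebook, call `save_adapter(model)` once the adapters have been trained. In `app.py`, the two `from_pretrained` calls can then be replaced with `tokenizer, model = load_lora_model()`.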


## **Step 6: Run the Streamlit App**
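Run the following command in Colab to launch the application. `streamlit run` blocks the cell for as long as the server is running, and Colab's `localhost` is not reachable from your browser, so run the ngrok cell below first to get a public URL (note the execution order: `In [32]` before `In [33]`).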

In [33]:
!streamlit run app.py


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.0.12:8501
  External URL: http://35.201.140.21:8501

2025-02-14 02:08:59.237674: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1739498939.253906   29177 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739498939.257728   29177 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-02-14 02:09:04.512 Examining the path of torch.classes raised:
Traceback (most recent call la

In [32]:
from pyngrok import ngrok

# Set up ngrok. Replace the placeholder with your own authtoken;
# never commit a real token to a shared notebook.
!ngrok authtoken YOUR_NGROK_AUTHTOKEN

public_url = ngrok.connect(8501)
print(f"Public URL: {public_url}")


Authtoken saved to configuration file: /root/.config/ngrok/ngrok.yml
Public URL: NgrokTunnel: "https://3ed3-35-201-140-21.ngrok-free.app" -> "http://localhost:8501"
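
Because `streamlit run` blocks the notebook cell, an alternative is to start the server as a background process and then open the tunnel from the same cell. The sketch below uses only the standard library to start the process. `--server.port` and `--server.headless` are Streamlit command-line options, while `streamlit_command` and `launch_app` are hypothetical helper names.

```python
import subprocess


def streamlit_command(app_path="app.py", port=8501):
    """Build the command line that serves the app without opening a browser."""
    return [
        "streamlit", "run", app_path,
        "--server.port", str(port),
        "--server.headless", "true",
    ]


def launch_app(app_path="app.py", port=8501):
    """Start Streamlit in the background and return the process handle."""
    return subprocess.Popen(
        streamlit_command(app_path, port),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.STDOUT,
    )


# Usage in Colab (requires streamlit and pyngrok, plus a configured authtoken):
# process = launch_app()
# public_url = ngrok.connect(8501)
# print(f"Public URL: {public_url}")
# process.terminate()  # stop the server when finished
```

Keeping the process handle makes it possible to stop the server cleanly instead of interrupting the cell.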


## **Step 7: Customize for Your Project**
Participants should adapt LoRA fine-tuning and Streamlit deployment based on their specific project requirements.

### **Customizing LoRA for Your Project:**
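- Adjust LoRA hyperparameters such as rank (`r`), `lora_alpha`, and `lora_dropout` based on dataset size; a higher rank adds capacity but also more trainable parameters.
- Train the adapters on domain-specific data to improve accuracy on your task.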
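
The notebook attaches LoRA adapters but does not include a training step. A minimal training loop for the GPT-2 case might look like the sketch below. It assumes `model` is the PEFT-wrapped model from Step 3, `tokenizer` is the GPT-2 tokenizer, and `texts` is a list of your own domain-specific strings. `train_lora` is a hypothetical helper, the learning rate and sequence length are illustrative, and each example is processed on its own (batch size 1) to keep the code short.

```python
def train_lora(model, tokenizer, texts, epochs=1, lr=2e-4, max_length=128):
    """Run a simple causal-LM training loop over the trainable (LoRA) parameters."""
    import torch  # imported here so the function can be defined before torch is loaded

    # Only the LoRA matrices require gradients; the base weights stay frozen.
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=lr)
    device = next(model.parameters()).device

    model.train()
    losses = []
    for _ in range(epochs):
        for text in texts:
            batch = tokenizer(
                text, return_tensors="pt", truncation=True, max_length=max_length
            ).to(device)
            # For causal language modeling the labels are the input IDs;
            # the model shifts them internally to predict the next token.
            outputs = model(**batch, labels=batch["input_ids"])
            outputs.loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            losses.append(outputs.loss.item())
    model.eval()
    return losses
```

A call such as `losses = train_lora(model, tokenizer, ["Example sentence from your domain."])` returns the per-step losses, which should trend downward over repeated epochs on a small dataset.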

### **Enhancing the Web Interface:**
- Add controls such as dropdowns and sliders so users can adjust generation settings.
- Reduce latency, for example by caching the loaded model and capping the output length, and improve response quality by tuning the decoding settings.
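
As one way to add such controls, the sketch below draws a dropdown and two sliders and turns the selections into `generate` arguments. The function takes the `streamlit` module as a parameter (call it as `render_controls(st)` in `app.py`), which also makes it easy to exercise without a running app. `render_controls` and the specific ranges are hypothetical choices for illustration.

```python
def render_controls(st):
    """Draw generation controls and return keyword arguments for model.generate()."""
    strategy = st.selectbox("Decoding strategy", ["sampling", "greedy"])
    max_length = st.slider(
        "Maximum length (tokens)", min_value=10, max_value=250, value=50
    )

    kwargs = {"max_length": max_length, "do_sample": strategy == "sampling"}
    if kwargs["do_sample"]:
        # Temperature only affects sampling, so the slider is shown only in that mode.
        kwargs["temperature"] = st.slider(
            "Temperature", min_value=0.1, max_value=2.0, value=1.0, step=0.1
        )
    return kwargs
```

In `app.py`, the returned dictionary can be unpacked into the generation call, for example `model.generate(**inputs, **render_controls(st))`.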

### **Deploying Your Model:**
- Consider deploying the model on **Hugging Face Spaces** or **AWS Lambda** for wider accessibility.
- Document project results and improvements.