<a href="https://colab.research.google.com/github/snehapriya-bs/AIMLOps/blob/main/M4_NB_MiniProject_1_Deploy_Medical_Q%26A_GPT2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Certification Programme in AI and MLOps
## A programme by IISc and TalentSprint
### Mini-Project: Medical Q&A using GPT2 | Deployment on Hugging Face Spaces

## Learning Objectives

At the end of the experiment, you will be able to:

* perform data preprocessing, EDA and feature extraction on the Medical Q&A dataset
* load a pre-trained tokenizer
* fine-tune a GPT-2 language model for medical question answering
* upload your fine-tuned model to the Hugging Face Model Hub
* deploy an application that uses the uploaded model on Hugging Face Spaces with Gradio

## Dataset Description

The dataset used in this project is the *Medical Question Answering Dataset* ([MedQuAD](https://github.com/abachaa/MedQuAD/tree/master)). It contains medical question-answer pairs along with additional information such as the question type, the question *focus*, and UMLS (Unified Medical Language System) details: the Concept Unique Identifier (*CUI*), the Semantic *Type*, and the Semantic *Group*.

To learn more about how this data was collected and constructed, refer to this [paper](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3119-4).

The data has been extracted into CSV format with the following features:

- **Focus**: the question focus
- **CUI**: concept unique identifier
- **SemanticType**
- **SemanticGroup**
- **Question**
- **Answer**

## Grading = 10 Points

## Information

Healthcare professionals often have to consult medical literature and documents when seeking answers to medical queries. Medical databases and search engines are powerful sources of up-to-date medical knowledge. However, the existing documentation is so large that it is difficult for professionals to retrieve answers quickly in a clinical setting. The problem with search engines and information retrieval systems is that they return a list of documents rather than answers. Healthcare professionals can instead use question-answering systems to retrieve short sentences or paragraphs in response to medical queries. The main advantage of such systems is that they generate answers and provide hints within a few seconds.

### Problem Statement

Fine-tune the GPT-2 model on the MedQuAD medical question-answering dataset to generate responses to medical queries. Then deploy the fine-tuned model on Hugging Face Spaces.

Please refer to ***M4 Assignment-1 Fine-tune GPT2*** and ***M4 AdditionalNB Fine-tune GPT2 for TextClassification*** to get familiar with loading the pre-trained GPT-2 tokenizer and model.

Please refer to ***The demo session held on 26 Jan - Hugging Face Spaces Deployment*** to get familiar with deploying on Hugging Face Spaces.

### Installing Dependencies

In [None]:
%%capture
!pip -q uninstall pyarrow -y
!pip -q install pyarrow==15.0.2
!pip -q install datasets
!pip -q install accelerate
!pip -q install transformers

### <font color="#990000">Restart Session/Runtime</font>

### Import required packages

In [None]:
import os
import re
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, DataCollatorForLanguageModeling
from transformers import Trainer, TrainingArguments

import warnings
warnings.filterwarnings('ignore')

In [None]:
#@title Download the dataset
!wget -q https://cdn.iisc.talentsprint.com/AIandMLOps/MiniProjects/Datasets/MedQuAD.csv
!ls | grep ".csv"

MedQuAD.csv


**Exercise 1: Read the MedQuAD.csv dataset**

**Hint:** pd.read_csv()

In [None]:
# YOUR CODE HERE
df = pd.read_csv('MedQuAD.csv')

### Pre-processing and EDA

**Exercise 2: Perform below operations on the dataset [0.5 Mark]**

- Handle missing values
- Remove duplicates from data considering `Question` and `Answer` columns

- **Handle missing values**

In [None]:
# YOUR CODE HERE
# Check for missing values in the dataset
missing_values = df.isnull().sum()

# Handle missing values:
# Option 1: Drop rows with missing Question or Answer
df_cleaned = df.dropna(subset=['Question', 'Answer'])



- **Remove duplicates from data considering `Question` and `Answer` columns**

In [None]:
# YOUR CODE HERE

# Check for duplicates considering 'Question' and 'Answer' columns
df_cleaned = df_cleaned.drop_duplicates(subset=['Question', 'Answer'])

# Verify the cleaning
print(f"Missing values after cleaning: \n{df_cleaned.isnull().sum()}")
print(f"Number of rows after removing duplicates: {df_cleaned.shape[0]}")

Missing values after cleaning: 
Focus             14
CUI              565
SemanticType     597
SemanticGroup    565
Question           0
Answer             0
dtype: int64
Number of rows after removing duplicates: 16359


**Exercise 3: Display the category name, and the number of records belonging to top 100 categories of `Focus` column [0.5 Mark]**

In [None]:
# Total categories in Focus column
# YOUR CODE HERE
focus_counts = df['Focus'].value_counts()


In [None]:
# Displaying the distinct categories of Focus column and the number of records belonging to each category
# (Top 100 only)

# YOUR CODE HERE
# Get the top 100 categories (or less if there are fewer than 100)
top_100_focus = focus_counts.head(100)

# Display the distinct categories and their respective counts
for category, count in top_100_focus.items():
    print(f"{category}: {count}")

Breast Cancer: 53
Prostate Cancer: 43
Stroke: 35
Skin Cancer: 34
Alzheimer's Disease: 30
Colorectal Cancer: 29
Lung Cancer: 29
High Blood Cholesterol: 28
Heart Failure: 28
Heart Attack: 28
Causes of Diabetes: 28
High Blood Pressure: 27
Parkinson's Disease: 25
Leukemia: 22
Shingles: 21
Osteoporosis: 21
Diabetes: 20
Hemochromatosis: 20
Age-related Macular Degeneration: 20
Diabetic Retinopathy: 19
Gum (Periodontal) Disease: 19
Psoriasis: 19
Kidney Disease: 17
Cataract: 16
Dry Mouth: 16
COPD: 16
Balance Problems: 16
Gout: 15
Glaucoma: 15
Prescription and Illicit Drug Abuse: 15
Medicare and Continuing Care: 15
Wilson Disease: 15
Problems with Taste: 14
Neuroblastoma: 14
Short Bowel Syndrome: 14
Osteoarthritis: 14
Narcolepsy: 14
Endometrial Cancer: 14
Rheumatoid Arthritis: 14
Peripheral Arterial Disease (P.A.D.): 13
Kidney Dysplasia: 13
Anxiety Disorders: 13
Dry Eye: 13
Pituitary Tumors: 13
Surviving Cancer: 13
Problems with Smell: 13
Urinary Tract Infections in Children: 13
Diabetes, Heart 

In [None]:
# Top 100 Focus categories names

# YOUR CODE HERE
# Display the top 100 focus category names
top_100_focus_names = top_100_focus.index.tolist()

# Print the top 100 focus category names
for i, focus_name in enumerate(top_100_focus_names, 1):
    print(f"{i}. {focus_name}")

1. Breast Cancer
2. Prostate Cancer
3. Stroke
4. Skin Cancer
5. Alzheimer's Disease
6. Colorectal Cancer
7. Lung Cancer
8. High Blood Cholesterol
9. Heart Failure
10. Heart Attack
11. Causes of Diabetes
12. High Blood Pressure
13. Parkinson's Disease
14. Leukemia
15. Shingles
16. Osteoporosis
17. Diabetes
18. Hemochromatosis
19. Age-related Macular Degeneration
20. Diabetic Retinopathy
21. Gum (Periodontal) Disease
22. Psoriasis
23. Kidney Disease
24. Cataract
25. Dry Mouth
26. COPD
27. Balance Problems
28. Gout
29. Glaucoma
30. Prescription and Illicit Drug Abuse
31. Medicare and Continuing Care
32. Wilson Disease
33. Problems with Taste
34. Neuroblastoma
35. Short Bowel Syndrome
36. Osteoarthritis
37. Narcolepsy
38. Endometrial Cancer
39. Rheumatoid Arthritis
40. Peripheral Arterial Disease (P.A.D.)
41. Kidney Dysplasia
42. Anxiety Disorders
43. Dry Eye
44. Pituitary Tumors
45. Surviving Cancer
46. Problems with Smell
47. Urinary Tract Infections in Children
48. Diabetes, Heart Disea

### Create Training and Validation set

**Exercise 4: Create training and validation set [1 Mark]**

- Consider 4 samples per `Focus` category, for each top 100 categories, from the dataset (It will give 400 samples for training)

- Consider 1 sample per `Focus` category (different from training set), for each top 100 categories, from the dataset (It will give 100 samples for validation)
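The two sampling rules above can be sketched as a small helper. This is a minimal illustration on toy data, not the notebook's own solution: the function name `split_per_category`, its parameters, and the toy DataFrame are all assumptions made for the example. It draws the training and validation rows for a category in a single sample without replacement, so the two splits cannot share a row.

```python
import pandas as pd


def split_per_category(df, categories, n_train=4, n_val=1, seed=42):
    """Draw disjoint train/validation rows for each Focus category."""
    train_parts, val_parts = [], []
    for category in categories:
        rows = df[df["Focus"] == category]
        if len(rows) < n_train + n_val:
            continue  # too few rows to fill both splits; skip this category
        # One draw without replacement, then slice, so the splits cannot overlap
        picked = rows.sample(n=n_train + n_val, random_state=seed)
        train_parts.append(picked.iloc[:n_train])
        val_parts.append(picked.iloc[n_train:])
    return pd.concat(train_parts), pd.concat(val_parts)


# Toy data (hypothetical): two categories with six rows each
toy = pd.DataFrame({
    "Focus": ["A"] * 6 + ["B"] * 6,
    "Question": [f"q{i}" for i in range(12)],
    "Answer": [f"a{i}" for i in range(12)],
})
toy_train, toy_val = split_per_category(toy, ["A", "B"])
print(len(toy_train), len(toy_val))
```

With 100 categories that each have at least five rows, the same call would return 400 training rows and 100 validation rows.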

In [None]:
# YOUR CODE HERE
training_samples = []
validation_samples = []

# Loop over the top 100 categories
for focus_category in top_100_focus.index:
    # Get all rows for the current focus category from the cleaned data,
    # so rows with a missing or duplicated Question/Answer are never sampled
    category_data = df_cleaned[df_cleaned['Focus'] == focus_category]

    # Select 4 random samples for the training set (without replacement)
    train_samples = category_data.sample(n=4, random_state=42, replace=False)
    training_samples.append(train_samples)

    # Select 1 random sample for the validation set, ensuring it's not in the training set
    remaining_data = category_data.drop(train_samples.index)
    val_sample = remaining_data.sample(n=1, random_state=42)
    validation_samples.append(val_sample)

# Combine the training and validation samples into dataframes
train_df = pd.concat(training_samples)
val_df = pd.concat(validation_samples)

# Display the resulting datasets
print(f"Training Set Size: {train_df.shape[0]}")
print(f"Validation Set Size: {val_df.shape[0]}")

Training Set Size: 400
Validation Set Size: 100


### Pre-process `Question` and `Answer` text

**Exercise 5: Perform below tasks:  [1 Mark]**

- Combine `Question` and `Answer` for train and validation data as shown below:
    - sequence = *'\<question\>' + question-text + '\<answer\>' + answer-text + '\<end\>'*

- Join the combined text using '\n' into a single string for training and validation separately

- Save the training and validation strings as separate text files
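The sequence format described above can be sketched with two small helpers. The names `build_sequence` and `build_corpus` and the two example pairs are assumptions for illustration (the pairs are not taken from MedQuAD); the `.strip()` calls are an extra tidy-up step that the exercise does not require.

```python
def build_sequence(question, answer):
    """Wrap one Q&A pair in the <question>, <answer> and <end> markers."""
    return "<question>" + question.strip() + "<answer>" + answer.strip() + "<end>"


def build_corpus(pairs):
    """Join all sequences with newlines into a single string."""
    return "\n".join(build_sequence(q, a) for q, a in pairs)


# Hypothetical example pairs, used only to show the resulting layout
pairs = [
    ("What is gout?", "A form of arthritis."),
    ("What is a cataract?", "A clouding of the lens of the eye."),
]
corpus = build_corpus(pairs)
print(corpus)
```

Each line of the resulting string holds one complete sequence, which is convenient when the text file is later read back line by line.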

- **Combine Question and Answer for train and val data**

In [None]:
# Combine Questions and Answers for train and val data
## sequence = '<question>' + question + '<answer>' + answer + '<end>'

# YOUR CODE HERE
train_sequences = ['<question>' + q + '<answer>' + a + '<end>' for q, a in zip(train_df['Question'], train_df['Answer'])]
val_sequences = ['<question>' + q + '<answer>' + a + '<end>' for q, a in zip(val_df['Question'], val_df['Answer'])]



- **Join the combined text using '\n' into a single string for training and validation separately**

In [None]:
# Train and Validation text for all Q&As

# YOUR CODE HERE
# Join the combined text using '\n' into a single string for training and validation separately
train_text = '\n'.join(train_sequences)
val_text = '\n'.join(val_sequences)



- **Save the training and validation strings as text files**

In [None]:
# Save the training and validation data as text files

# YOUR CODE HERE
# Write with an explicit encoding so the files read back identically on any platform
with open('train.txt', 'w', encoding='utf-8') as f:
    f.write(train_text)

with open('val.txt', 'w', encoding='utf-8') as f:
    f.write(val_text)

**Exercise 6: Load pre-trained GPT2Tokenizer**

- Use checkpoint = "gpt2"

In [None]:
# Set up the tokenizer
# YOUR CODE HERE
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")


**Exercise 7: Tokenize train and validation data [0.5 Mark]**

- Use the loaded pre-trained tokenizer
- Use training and validation data saved in text files

In [None]:
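# YOUR CODE HERE
from datasets import Dataset

# GPT-2 has no padding token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token

def read_sequences(path):
    # Read one sequence per line from a saved text file, skipping empty lines.
    # Note: an answer that itself contains line breaks is split into several sequences.
    with open(path, encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]

# Tokenize each sequence separately. Passing the whole file as one string would
# produce a single example truncated to the tokenizer's maximum length.
train_encodings = tokenizer(read_sequences('train.txt'), truncation=True)
val_encodings = tokenizer(read_sequences('val.txt'), truncation=True)

# Convert tokenized data to a Dataset; the data collator pads each batch later
train_dataset = Dataset.from_dict(dict(train_encodings))
val_dataset = Dataset.from_dict(dict(val_encodings))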

**Exercise 8: Create a DataCollator object**

In [None]:
# Create a Data collator object
# YOUR CODE HERE
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False  # Causal language modeling
)

**Exercise 9: Load pre-trained GPT2LMHeadModel**

In [None]:
# Set up the model
# YOUR CODE HERE
model = GPT2LMHeadModel.from_pretrained("gpt2")


**Exercise 10: Fine-tune GPT2 Model [1 Mark]**

- Specify training arguments and create a TrainingArguments object (Use 30 epochs)

- Train a GPT-2 model using the provided training arguments

- Save the resulting trained model and tokenizer to a specified output directory
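When choosing values such as `warmup_steps` and `logging_steps`, it can help to estimate how many optimizer steps the run will take. The sketch below is a rough calculation that assumes a single device and no gradient accumulation; the helper name `training_steps` is hypothetical and is not part of the `transformers` API.

```python
import math


def training_steps(num_samples, batch_size, epochs):
    """Return (steps per epoch, total optimizer steps) for a simple setup."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# 400 training samples, batch size 2, 30 epochs
steps_per_epoch, total_steps = training_steps(num_samples=400, batch_size=2, epochs=30)
print(steps_per_epoch, total_steps)  # 200 6000

# A dataset holding a single example gives one step per epoch
print(training_steps(num_samples=1, batch_size=2, epochs=30))  # (1, 30)
```

If the total number of steps is smaller than `warmup_steps`, the learning rate never finishes warming up, and if there is only one step per epoch, a `logging_steps` value of 10 means the first training loss is reported at epoch 10.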

In [None]:
# Set up the training arguments

# YOUR CODE HERE
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=30,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=2,
)




In [None]:
# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    data_collator=data_collator,
)

# Train the model
trainer.train()

# Save the model
model.save_pretrained("./fine-tuned-gpt2")

# Save the tokenizer
tokenizer.save_pretrained("./fine-tuned-gpt2")





| Epoch | Training Loss | Validation Loss |
|---|---|---|
| 1 | No log | 2.418651 |
| 2 | No log | 2.418595 |
| 3 | No log | 2.418479 |
| 4 | No log | 2.418309 |
| 5 | No log | 2.41808 |
| 6 | No log | 2.417804 |
| 7 | No log | 2.417481 |
| 8 | No log | 2.417109 |
| 9 | No log | 2.416682 |
| 10 | 2.503300 | 2.41621 |


('./fine-tuned-gpt2/tokenizer_config.json',
 './fine-tuned-gpt2/special_tokens_map.json',
 './fine-tuned-gpt2/vocab.json',
 './fine-tuned-gpt2/merges.txt',
 './fine-tuned-gpt2/added_tokens.json')

**Exercise 11: Test Model with user input prompts [1 Mark]**

- Create `generate_response()` function that takes a trained *model*, *tokenizer*, and a *prompt* string as input and generates a response using the GPT-2 model

- Test it with some user input prompts
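The training sequences wrap each pair in `<question>`, `<answer>` and `<end>` markers, so a prompt written in the same shape may be closer to what the fine-tuned model saw during training. The helpers below are a hedged sketch of that idea: `format_prompt` and `extract_answer` are hypothetical names, and the sketch assumes the markers are plain text in the vocabulary rather than registered special tokens, so they would survive decoding.

```python
def format_prompt(question):
    """Wrap a user question in the markers used by the training sequences."""
    return "<question>" + question.strip() + "<answer>"


def extract_answer(generated_text):
    """Return the text between the first <answer> marker and the next <end>."""
    if "<answer>" not in generated_text:
        return generated_text.strip()  # nothing to cut; return the text unchanged
    _, _, tail = generated_text.partition("<answer>")
    answer, _, _ = tail.partition("<end>")
    return answer.strip()


# Hypothetical generated text, used only to show the parsing
sample_output = "<question>What is gout?<answer>A form of arthritis.<end><question>What is"
print(format_prompt("What is gout?"))
print(extract_answer(sample_output))
```

In the notebook, the formatted prompt would be passed to `generate_response()` and the returned string to `extract_answer()`.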

In [None]:
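def generate_response(model, tokenizer, prompt, max_length=200):
    # Tokenize the prompt; this yields both input_ids and attention_mask
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Pass the attention mask and an explicit pad token id so generate() does not have to guess them
    outputs = model.generate(**inputs, max_length=max_length, num_return_sequences=1,
                             pad_token_id=tokenizer.eos_token_id)
    # Decode the single generated sequence back to text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)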






In [None]:
# Load the fine-tuned model and tokenizer

# YOUR CODE HERE
model = GPT2LMHeadModel.from_pretrained("./fine-tuned-gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("./fine-tuned-gpt2")

In [None]:
# Testing with a sample prompt 1
prompt = "What precautions to take for a healthy life?"
response = generate_response(model, tokenizer, prompt)
print(response)



The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


What precautions to take for a healthy life?

The following are some of the most common questions you'll hear from your doctor or nurse about your health.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from


In [None]:
# Testing with a sample prompt 2
prompt = "What to do after being diagnosed with cancer?"
response = generate_response(model, tokenizer, prompt)
print(response)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


What to do after being diagnosed with cancer?

The first step is to get your doctor's approval for a treatment.

If you have a cancer diagnosis, you may need to get a second opinion.

If you have a cancer diagnosis, you may need to get a second opinion. If you have a cancer diagnosis, you may need to get a third opinion.

If you have a cancer diagnosis, you may need to get a third opinion. If you have a cancer diagnosis, you may need to get a fourth opinion.

If you have a cancer diagnosis, you may need to get a fourth opinion. If you have a cancer diagnosis, you may need to get a fifth opinion.

If you have a cancer diagnosis, you may need to get a fifth opinion. If you have a cancer diagnosis, you may need to get a sixth opinion.

If you have a cancer diagnosis, you may need to get a sixth opinion. If you have


**Exercise 12: Compare the performance of a *GPT2 model* with the *GPT2 model fine-tuned* on MedQuAD data [0.5 Mark]**

- Load another pre-trained GPT2LMHeadModel and do not fine-tune it

- To generate response using the untuned model, pass it as a parameter to `generate_response()` function

- Test both models (fine-tuned and untuned) with below user input prompts:

    - "What precautions to take for a healthy life?"
    - "What to do after being diagnosed with cancer?"
    - "What to do when feeling sick?"
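The comparison can be organised as a small loop that runs every prompt through every model and collects the outputs side by side. The sketch below uses stand-in generator functions so that it runs without loading any model; the name `compare_models` and the stand-ins are assumptions for illustration only.

```python
def compare_models(generators, prompts):
    """Run every prompt through every generator and collect the outputs."""
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt}
        for name, generate in generators.items():
            row[name] = generate(prompt)
        rows.append(row)
    return rows


# Stand-in generators (hypothetical) so the sketch runs without a model
generators = {
    "fine-tuned": lambda p: p + " [fine-tuned output]",
    "untuned": lambda p: p + " [untuned output]",
}
prompts = [
    "What precautions to take for a healthy life?",
    "What to do after being diagnosed with cancer?",
    "What to do when feeling sick?",
]
results = compare_models(generators, prompts)
for row in results:
    print(row)
```

In the notebook, each stand-in would be replaced by a call such as `lambda p: generate_response(model, tokenizer, p)` for the fine-tuned model and the equivalent call for the untuned one.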

In [None]:
# Load a pre-trained GPT2 model, do not finetune it with MedQuAD data

# YOUR CODE HERE
untuned_model = GPT2LMHeadModel.from_pretrained("gpt2")
untuned_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")



In [None]:
# Testing with finetuned model: prompt 1
prompt = "What precautions to take for a healthy life?"
response_finetuned = generate_response(model, tokenizer, prompt)
print("Fine-tuned Model Response:", response_finetuned)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Fine-tuned Model Response: What precautions to take for a healthy life?

The following are some of the most common questions you'll hear from your doctor or nurse about your health.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from those of taking a drug.

What are the risks of taking certain medications?

The risks of taking certain medications are very different from


In [None]:
# Testing with untuned model: prompt 1
response_untuned = generate_response(untuned_model, untuned_tokenizer, prompt)
print("Untuned Model Response:", response_untuned)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Untuned Model Response: What precautions to take for a healthy life?

The following are some of the most common questions you'll hear from your doctor or nurse about your health.

What are the risks of taking a drug that can cause cancer?

The risks of taking a drug that can cause cancer are very high.

What are the risks of taking a drug that can cause cancer?

The risks of taking a drug that can cause cancer are very high.

What are the risks of taking a drug that can cause cancer?

The risks of taking a drug that can cause cancer are very high.

What are the risks of taking a drug that can cause cancer?

The risks of taking a drug that can cause cancer are very high.

What are the risks of taking a drug that can cause cancer?

The risks of taking a drug that can cause cancer are very high.

What are the risks of taking a drug that can cause


In [None]:

# Testing with finetuned model: prompt 2
prompt = "What to do after being diagnosed with cancer?"
response_finetuned = generate_response(model, tokenizer, prompt)
print("Fine-tuned Model Response:", response_finetuned)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Fine-tuned Model Response: What to do after being diagnosed with cancer?

The first step is to get your doctor's approval for a treatment.

If you have a cancer diagnosis, you may need to get a second opinion.

If you have a cancer diagnosis, you may need to get a second opinion. If you have a cancer diagnosis, you may need to get a third opinion.

If you have a cancer diagnosis, you may need to get a third opinion. If you have a cancer diagnosis, you may need to get a fourth opinion.

If you have a cancer diagnosis, you may need to get a fourth opinion. If you have a cancer diagnosis, you may need to get a fifth opinion.

If you have a cancer diagnosis, you may need to get a fifth opinion. If you have a cancer diagnosis, you may need to get a sixth opinion.

If you have a cancer diagnosis, you may need to get a sixth opinion. If you have


In [None]:
# Testing with untuned model: prompt 2
response_untuned = generate_response(untuned_model, untuned_tokenizer, prompt)
print("Untuned Model Response:", response_untuned)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Untuned Model Response: What to do after being diagnosed with cancer?

The first step is to get your doctor's approval for a treatment.

If you have a cancer diagnosis, you may need to get a second opinion.

If you have a cancer diagnosis, you may need to get a second opinion. If you have a cancer diagnosis, you may need to get a third opinion.

If you have a cancer diagnosis, you may need to get a third opinion. If you have a cancer diagnosis, you may need to get a fourth opinion.

If you have a cancer diagnosis, you may need to get a fourth opinion. If you have a cancer diagnosis, you may need to get a fifth opinion.

If you have a cancer diagnosis, you may need to get a fifth opinion. If you have a cancer diagnosis, you may need to get a sixth opinion.

If you have a cancer diagnosis, you may need to get a sixth opinion. If you have


In [None]:
# Testing with finetuned model: prompt 3
prompt = "What to do when feeling sick?"
response_finetuned = generate_response(model, tokenizer, prompt)
print("Fine-tuned Model Response:", response_finetuned)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Fine-tuned Model Response: What to do when feeling sick?

The first thing you should do is to get your body to relax. If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a nap. If you feel sick, you should take a nap.

If you feel sick, you should take a


In [None]:
# Testing with untuned model: prompt 3
response_untuned = generate_response(untuned_model, untuned_tokenizer, prompt)
print("Untuned Model Response:", response_untuned)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Untuned Model Response: What to do when feeling sick?

The first thing you should do is to get your body to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick, you should take a few minutes to relax.

If you're feeling sick


## Push your model to Hugging Face Model Hub

**Exercise 13: Follow below steps to push your fine-tuned model to HuggingFace Model Hub**

1. [Sign up](https://huggingface.co/join) for a Hugging Face account
2. Create an access token for your account and save it
3. Store your access token in the Hugging Face cache folder within colab
4. Push your fine-tuned model and tokenizer to Model Hub
5. Load the model back from Hub and test it with user input prompts

* **Create an access token for your account**

    Once you have an account, to create an access token:
    
    - Go to your `Settings`, then click on the `Access Tokens` tab. Click on the `New token` button to create a new User Access Token.
    - Select a Token type as `Write` and give a name for your token
    - Click on Create token
    - Once a token is created save it somewhere
    - When required later, use the old saved token or create a new token again

    To know more about Access Tokens, refer [here](https://huggingface.co/docs/hub/security-tokens).

* **Store your access token in the Hugging Face cache folder within colab**

    Once you have your User Access Token, run the following command to authenticate your identity to the Hub.
    - `!huggingface-cli login`
    - Paste your Access token when prompted
    - Type **n** when prompted to Add token as git credential? (Y/n)

    For more details on login, refer [here](https://huggingface.co/docs/huggingface_hub/quick-start#login).
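As an alternative to the interactive prompt, the token can be read from an environment variable and passed to `huggingface_hub.login`. This is a minimal sketch under stated assumptions: the helper name `hub_login_from_env` is hypothetical, and storing the token in a variable called `HF_TOKEN` is a choice made for this example.

```python
import os


def hub_login_from_env(var_name="HF_TOKEN"):
    """Log in to the Hub with a token read from an environment variable."""
    token = os.environ.get(var_name)
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    # Imported lazily so the check above works even if the package is missing
    from huggingface_hub import login
    login(token=token)


# Usage (not executed here, since it needs a real token):
# hub_login_from_env()
```

Keeping the token out of the notebook itself avoids saving it accidentally in a cell output.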

In [None]:
# YOUR CODE HERE
!huggingface-cli login



    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) Y
Token is valid (permission: write).
The token `token2` has been saved to /root/.cache/huggingface/stored_tokens
Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticat

* **Push your fine-tuned model and tokenizer to Model Hub [0.5 Mark]**

    - Use the `push_to_hub()` method of both your model and your tokenizer to push them to the Hub
    - Specify the name of the repository to push to using the `repo_id` parameter
    - Push the model and the tokenizer to the same repository

    - **Hint:**

        - Use `push_to_hub()` method of your model. For parameter details, refer [here](https://huggingface.co/docs/transformers/main_classes/model#transformers.PreTrainedModel.push_to_hub).
        - Use `push_to_hub()` method of your tokenizer. For parameter details, refer [here](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.PreTrainedTokenizer.push_to_hub).
        - Access your pushed model at `https://huggingface.co/[YOUR-USER-NAME]/[YOUR-MODEL-REPO-NAME]/tree/main`
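Pushing both objects to the same repository can be wrapped in one helper so the repository name is written only once. The sketch below is illustrative: `push_model_and_tokenizer` and the `FakePushable` stand-in are hypothetical names, and the check for the `USER-NAME/REPO-NAME` form is an assumption of this sketch rather than a requirement of `push_to_hub()`.

```python
def push_model_and_tokenizer(model, tokenizer, repo_id):
    """Push both artifacts to one repository and return the two results."""
    if repo_id.count("/") != 1:
        # Assumption of this sketch: require the full 'user/repo' form
        raise ValueError("repo_id should look like 'USER-NAME/REPO-NAME'")
    return model.push_to_hub(repo_id), tokenizer.push_to_hub(repo_id)


class FakePushable:
    """Hypothetical stand-in that records the repo it was pushed to."""

    def __init__(self):
        self.pushed_to = None

    def push_to_hub(self, repo_id):
        self.pushed_to = repo_id
        return repo_id


fake_model, fake_tokenizer = FakePushable(), FakePushable()
push_model_and_tokenizer(fake_model, fake_tokenizer, "YOUR-USER-NAME/YOUR-MODEL-REPO-NAME")
print(fake_model.pushed_to == fake_tokenizer.pushed_to)
```

In the notebook, the fine-tuned `model` and `tokenizer` would take the place of the stand-in objects.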

In [None]:
# Push model
# YOUR CODE HERE
model.push_to_hub("snehapriyabs/gpt2")




CommitInfo(commit_url='https://huggingface.co/snehapriyabs/gpt2/commit/70eb52d2316ab113b7b365aee4e657d8bbbd4ff8', commit_message='Upload model', commit_description='', oid='70eb52d2316ab113b7b365aee4e657d8bbbd4ff8', pr_url=None, repo_url=RepoUrl('https://huggingface.co/snehapriyabs/gpt2', endpoint='https://huggingface.co', repo_type='model', repo_id='snehapriyabs/gpt2'), pr_revision=None, pr_num=None)

In [None]:
# Push tokenizer
# YOUR CODE HERE
tokenizer.push_to_hub("snehapriyabs/gpt2")


CommitInfo(commit_url='https://huggingface.co/snehapriyabs/gpt2/commit/57429afd666a58179d6176c1220076ee1abebbbb', commit_message='Upload tokenizer', commit_description='', oid='57429afd666a58179d6176c1220076ee1abebbbb', pr_url=None, repo_url=RepoUrl('https://huggingface.co/snehapriyabs/gpt2', endpoint='https://huggingface.co', repo_type='model', repo_id='snehapriyabs/gpt2'), pr_revision=None, pr_num=None)

* **Load the model and tokenizer back from Hub and test it with user input prompts [0.5 Mark]**

    - In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the `from_pretrained()` method. **AutoClasses** can be used to automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.

    - Instantiating one of `AutoConfig`, `AutoModel`, and `AutoTokenizer` will directly create a class of the relevant architecture.

    - Because the fine-tuned GPT-2 model has a language modeling head on top, load it with an auto class that includes that head: `AutoModelForCausalLM`. The older `AutoModelWithLMHead` class is deprecated.

    - Specify the full path of your model repo, i.e. ***'YOUR-USER-NAME/YOUR-REPO-NAME'***, when calling the `from_pretrained()` method.

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

In [None]:

# Specify the full path of your model repo
model_name = 'snehapriyabs/gpt2'

In [None]:
# Load your tokenizer from hub

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the model with a language modeling head
model = AutoModelForCausalLM.from_pretrained(model_name)


In [None]:
# Generate a response from the loaded model for a user prompt
def generate_text(prompt, max_length=50):
    # Tokenize the input prompt (returns input_ids and attention_mask)
    inputs = tokenizer(prompt, return_tensors='pt')

    # Generate text; passing the attention mask and a pad token id avoids
    # the "attention mask and the pad token id were not set" warning
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        pad_token_id=tokenizer.eos_token_id,
        num_return_sequences=1,
    )

    # Decode the generated token ids back to text
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

    return generated_text

# Example usage
user_prompt = "Once upon a time"
generated_text = generate_text(user_prompt)
print(generated_text)



Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a
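
The repeated phrases in the output above are typical of greedy decoding, which is what `generate()` uses by default. As a rough sketch, sampling-related arguments can reduce the looping. The helper name `sampling_kwargs` is hypothetical and the values below are illustrative starting points, not tuned settings; the question in the usage comment is only an example prompt.

```python
def sampling_kwargs(max_new_tokens=50, temperature=0.7, top_p=0.9,
                    no_repeat_ngram_size=3):
    """Build keyword arguments for `model.generate()` that enable nucleus sampling."""
    return {
        "do_sample": True,                             # sample instead of greedy decoding
        "temperature": temperature,                    # < 1.0 sharpens the distribution
        "top_p": top_p,                                # nucleus (top-p) sampling threshold
        "no_repeat_ngram_size": no_repeat_ngram_size,  # block repeated n-grams of this size
        "max_new_tokens": max_new_tokens,              # counts generated tokens only
    }


# Usage with the model and tokenizer loaded above:
# inputs = tokenizer("What are the symptoms of glaucoma?", return_tensors="pt")
# outputs = model.generate(**inputs, pad_token_id=tokenizer.eos_token_id, **sampling_kwargs())
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because sampling is random, the generated text will differ from run to run.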


## Gradio Implementation

Gradio is an open-source Python library for quickly building easy-to-use, customizable UI components around an ML model, an API, or any arbitrary function in just a few lines of code. The GUI can be embedded directly in a Python notebook or shared with anyone through a link.

**Exercise 14: Create a Gradio app for your fine-tuned model pushed on Hugging Face Model Hub [1 Mark]**

- Install and import `gradio` library
- Create a function to use your fine-tuned model for response generation
    - Use the model and tokenizer directly within the function, do not pass them as parameters
    - Function should take input prompt text, and max response length as its input parameters
    - Function should output the generated response text
- Create input and output gradio elements
- Create a gradio interface object
- Launch the interface to generate UI

In [None]:
!pip -q install gradio


In [None]:
import gradio

In [None]:
# Import required libraries
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load your fine-tuned model and tokenizer
model_name = "snehapriyabs/gpt2"  # Replace with your model's path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Set pad_token if it doesn't exist
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Function to generate responses using your fine-tuned model
def generate_response(prompt, max_length=50):
    # Tokenize the input prompt
    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True)

    # Generate text with attention_mask and pad_token_id
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],  # Pass attention_mask
        pad_token_id=tokenizer.pad_token_id,      # Set pad_token_id
        max_length=int(max_length),               # Cast in case the slider passes a float
        num_return_sequences=1                    # Generate one sequence
    )

    # Decode the generated text
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response

# Create Gradio input and output elements
input_prompt = gr.Textbox(label="Input Prompt", placeholder="Enter your prompt here...")
input_max_length = gr.Slider(minimum=10, maximum=200, value=50, step=1, label="Max Response Length")
output_response = gr.Textbox(label="Generated Response")

# Create a Gradio interface
gradio_app = gr.Interface(
    fn=generate_response,              # Function to call
    inputs=[input_prompt, input_max_length],  # Input components
    outputs=output_response,           # Output component
    title="Fine-Tuned Model Text Generator",
    description="Generate text using your fine-tuned model. Adjust the max response length using the slider."
)

# Launch the Gradio interface
gradio_app.launch()

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://b95d0b704c39e8ceba.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




## Upload your Gradio application on Hugging Face Spaces

**Exercise 15: Upload your Gradio application on Hugging Face Spaces [2 Marks]**

1. Start a new Hugging Face Space by going to your profile and [clicking "New Space"](https://huggingface.co/new-space)

2. Provide details for your space:
    - Space name
    - License (e.g. [MIT](https://opensource.org/licenses/MIT))
    - Space SDK (software development kit) (e.g. `Gradio`)
    - Space hardware (CPU basic)
    - Choose whether your Space is public or private
    - Click "Create Space"

3. Use the ***Add files -> Create a new file*** option to add the files below:
    - `requirements.txt`: should list the dependencies needed to run your app, such as `transformers`, `torch`, and `gradio`
    - `app.py`: should contain code to
        - import the required packages
        - load your fine-tuned model and tokenizer from the Model Hub
        - define a function that uses your fine-tuned model for response generation
        - create input and output gradio elements
        - create a gradio interface object
        - launch the interface to generate the UI

4. Open the `App` tab of your Space to watch the build progress (use the build logs to debug if an error occurs)

5. Once the app has built successfully, test the application running on your Space with a user input prompt
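
The two files from step 3 could look roughly like the sketch below, which writes them to a local folder so they can be reviewed before being added to the Space. Several things here are assumptions made for illustration: the folder name `medical_qa_space`, the helper `write_space_files`, the app title, and the slider bounds. The model repo is the `snehapriyabs/gpt2` repo used earlier in this notebook; replace it with your own.

```python
from pathlib import Path
from textwrap import dedent

SPACE_DIR = Path("medical_qa_space")  # hypothetical local folder
MODEL_REPO = "snehapriyabs/gpt2"      # replace with your own model repo

# Dependencies named in step 3, one per line
REQUIREMENTS = "transformers\ntorch\ngradio\n"

# Contents of app.py; __MODEL_REPO__ is filled in when the file is written
APP_TEMPLATE = dedent('''\
    import gradio as gr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_REPO = "__MODEL_REPO__"

    # Load the fine-tuned model and tokenizer from the Model Hub
    tokenizer = AutoTokenizer.from_pretrained(MODEL_REPO)
    model = AutoModelForCausalLM.from_pretrained(MODEL_REPO)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token


    def generate_response(prompt, max_length=50):
        """Generate a response for the prompt, up to max_length tokens in total."""
        inputs = tokenizer(prompt, return_tensors="pt")
        outputs = model.generate(
            **inputs,
            max_length=int(max_length),
            pad_token_id=tokenizer.pad_token_id,
        )
        return tokenizer.decode(outputs[0], skip_special_tokens=True)


    demo = gr.Interface(
        fn=generate_response,
        inputs=[
            gr.Textbox(label="Input Prompt"),
            gr.Slider(minimum=10, maximum=200, value=50, step=1,
                      label="Max Response Length"),
        ],
        outputs=gr.Textbox(label="Generated Response"),
        title="Medical Q&A with fine-tuned GPT-2",
    )

    demo.launch()
''')


def write_space_files(space_dir=SPACE_DIR, model_repo=MODEL_REPO):
    """Write app.py and requirements.txt into space_dir and return their paths."""
    space_dir.mkdir(parents=True, exist_ok=True)
    app_path = space_dir / "app.py"
    req_path = space_dir / "requirements.txt"
    app_path.write_text(APP_TEMPLATE.replace("__MODEL_REPO__", model_repo), encoding="utf-8")
    req_path.write_text(REQUIREMENTS, encoding="utf-8")
    return app_path, req_path


app_path, req_path = write_space_files()
print(f"Wrote {app_path} and {req_path}")
```

The contents of the generated files can then be pasted into the ***Create a new file*** editor on the Space.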

