
# Advanced Certification Programme in AI and MLOps
## A programme by IISc and TalentSprint
### Mini-Project: Medical Q&A using GPT2 | Deployment on Hugging Face Spaces

## Learning Objectives

At the end of the experiment, you will be able to:

* perform data preprocessing, EDA and feature extraction on the Medical Q&A dataset
* load a pre-trained tokenizer
* fine-tune a GPT-2 language model for medical question answering
* upload your fine-tuned model to the Hugging Face Model Hub
* deploy an application that uses the uploaded model on Hugging Face Spaces using Gradio

## Dataset Description

The dataset used in this project is the *Medical Question Answering Dataset* ([MedQuAD](https://github.com/abachaa/MedQuAD/tree/master)). It includes medical question-answer pairs along with additional information such as the question type, the question *focus*, and its UMLS (Unified Medical Language System) details: the Concept Unique Identifier (*CUI*) and the Semantic *Type* and *Group*.

To learn more about how this data was collected and constructed, refer to this [paper](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3119-4).

The extracted data is in CSV format with the following features:

- **Focus**: the question focus
- **CUI**: concept unique identifier
- **SemanticType**
- **SemanticGroup**
- **Question**
- **Answer**

## Grading = 10 Points

## Information

Healthcare professionals often have to refer to medical literature and documents when seeking answers to medical queries. Medical databases and search engines are powerful resources of up-to-date medical knowledge. However, the existing documentation is large, which makes it difficult for professionals to retrieve answers quickly in a clinical setting. The problem with search engines and information retrieval systems is that they return a list of documents rather than answers. Instead, healthcare professionals can use question answering systems to retrieve short sentences or paragraphs in response to medical queries. The main advantage of such systems is that they generate answers and provide hints within a few seconds.

### Problem Statement

Fine-tune the GPT-2 model on the Medical Question Answering Dataset (MedQuAD) to generate responses to medical queries. Later, deploy the fine-tuned model on Hugging Face Spaces.

Please refer to ***M6 Assignment-1 Fine-tune GPT2*** and ***M6 AdditionalNB Fine-tune GPT2 for TextClassification*** to get familiar with how to load the pre-trained GPT-2 tokenizer and model.

Please refer to ***The demo session held on 14 Sep - Hugging Face Spaces Deployment*** to get familiar with deployment on Hugging Face Spaces.

### Installing Dependencies

In [1]:
%%capture
!pip -q uninstall pyarrow -y
!pip -q install pyarrow==15.0.2
!pip -q install datasets
!pip -q install accelerate
!pip -q install transformers

### <font color="#990000">Restart Session/Runtime</font>

### Import required packages

In [1]:
import os
import re
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, DataCollatorForLanguageModeling
from transformers import Trainer, TrainingArguments

import warnings
warnings.filterwarnings('ignore')

In [2]:
#@title Download the dataset
!wget -q https://cdn.iisc.talentsprint.com/AIandMLOps/MiniProjects/Datasets/MedQuAD.csv
!ls | grep ".csv"

MedQuAD.csv


**Exercise 1: Read the MedQuAD.csv dataset**

**Hint:** pd.read_csv()

In [3]:
df = pd.read_csv('MedQuAD.csv')

### Pre-processing and EDA

**Exercise 2: Perform the following operations on the dataset [0.5 Mark]**

- Handle missing values
- Remove duplicates from data considering `Question` and `Answer` columns

- **Handle missing values**

In [4]:
df.dropna(inplace=True)

- **Remove duplicates from data considering `Question` and `Answer` columns**

In [5]:
df.drop_duplicates(subset=['Question', 'Answer'], inplace=True)

**Exercise 3: Display the category name and the number of records for the top 100 categories of the `Focus` column [0.5 Mark]**

In [6]:
# Total categories in Focus column
df.Focus.nunique()

4770

In [7]:
# Displaying the distinct categories of Focus column and the number of records belonging to each category
# (Top 100 only)

df.Focus.value_counts()[:100]

Unnamed: 0_level_0,count
Focus,Unnamed: 1_level_1
Breast Cancer,53
Prostate Cancer,43
Stroke,35
Skin Cancer,34
Alzheimer's Disease,30
...,...
Alzheimer's Caregiving,11
Polycythemia Vera,11
"Diabetes, Heart Disease, and Stroke",11
Pelizaeus-Merzbacher disease,10


In [9]:
# Top 100 Focus categories names

top_100_focus = df.Focus.value_counts()[:100].index.tolist()

### Create Training and Validation set

**Exercise 4: Create training and validation set [1 Mark]**

- Consider 4 samples per `Focus` category, for each top 100 categories, from the dataset (It will give 400 samples for training)

- Consider 1 sample per `Focus` category (different from training set), for each top 100 categories, from the dataset (It will give 100 samples for validation)

In [10]:
train_df = pd.DataFrame()
val_df = pd.DataFrame()

for focus in top_100_focus:
  temp_df = df[df.Focus == focus]
  if len(temp_df) >= 5:
    train_samples = temp_df.sample(n=4, random_state=42)
    val_samples = temp_df.drop(train_samples.index).sample(n=1, random_state=42)
    train_df = pd.concat([train_df, train_samples])
    val_df = pd.concat([val_df, val_samples])

print(f"Training set size: {len(train_df)}")
print(f"Validation set size: {len(val_df)}")


Training set size: 400
Validation set size: 100


### Pre-process `Question` and `Answer` text

**Exercise 5: Perform the following tasks: [1 Mark]**

- Combine `Question` and `Answer` for train and validation data as shown below:
    - sequence = *'\<question\>' + question-text + '\<answer\>' + answer-text + '\<end\>'*

- Join the combined text using '\n' into a single string for training and validation separately

- Save the training and validation strings as separate text files
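
As a minimal sketch of the sequence format described above, the snippet below builds the combined string for a couple of records in plain Python. The helper name `build_sequence` and the sample question-answer pairs are hypothetical and are not taken from MedQuAD.

```python
# Hypothetical helper illustrating the '<question>...<answer>...<end>' format.
def build_sequence(question: str, answer: str) -> str:
    return f"<question>{question}<answer>{answer}<end>"

# Made-up records, used only to show the shape of the output.
records = [
    ("What is a fever?", "A fever is a body temperature above the normal range."),
    ("What causes a cough?", "A cough is often caused by an irritated airway."),
]

sequences = [build_sequence(q, a) for q, a in records]

# One sequence per line, which is the layout written to train.txt / val.txt.
text = "\n".join(sequences)
print(text)
```

Because the text is later read back one line per example, an answer that itself contained a newline would end up split across several examples.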

- **Combine Question and Answer for train and val data**

In [11]:
# Combine Questions and Answers for train and val data
## sequence = '<question>' + question + '<answer>' + answer + '<end>'
train_df['sequence'] = '<question>' + train_df['Question'] + '<answer>' + train_df['Answer'] + '<end>'
val_df['sequence'] = '<question>' + val_df['Question'] + '<answer>' + val_df['Answer'] + '<end>'

- **Join the combined text using '\n' into a single string for training and validation separately**

In [12]:
# Train and Validation text for all Q&As
train_text = '\n'.join(train_df['sequence'].tolist())
val_text = '\n'.join(val_df['sequence'].tolist())

- **Save the training and validation strings as text files**

In [13]:
# Save the training and validation data as text files
with open('train.txt', 'w') as f:
  f.write(train_text)

with open('val.txt', 'w') as f:
  f.write(val_text)

**Exercise 6: Load pre-trained GPT2Tokenizer**

- Use checkpoint = "gpt2"

In [14]:
# Set up the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.unk_token

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

**Exercise 7: Tokenize train and validation data [0.5 Mark]**

- Use the loaded pre-trained tokenizer
- Use training and validation data saved in text files

In [15]:
from datasets import load_dataset
train_file_path = 'train.txt'
val_file_path = 'val.txt'

dataset = load_dataset("text", data_files={"train": train_file_path,
                                           "validation": val_file_path})

block_size = 256     # max tokens in an input sample
def tokenize_function(examples):
    return tokenizer(examples["text"], padding='max_length', truncation=True, max_length=block_size, return_tensors='pt')

tokenized_datasets = dataset.map(tokenize_function, batched=True)

Generating train split: 0 examples [00:00, ? examples/s]

Generating validation split: 0 examples [00:00, ? examples/s]

Map:   0%|          | 0/400 [00:00<?, ? examples/s]

Map:   0%|          | 0/100 [00:00<?, ? examples/s]
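
To make the effect of `padding='max_length'` and `truncation=True` concrete, here is a toy sketch that works on plain lists of integers rather than calling the tokenizer. The function name `pad_or_truncate` and the sample ids are hypothetical; `50256` is the id of `<|endoftext|>` in the GPT-2 vocabulary, which this notebook reuses as the pad token. Padding is placed on the right, which is the GPT-2 tokenizer's default.

```python
# Toy illustration of right-padding and truncation to a fixed block size.
GPT2_EOS_ID = 50256  # id of '<|endoftext|>' in the GPT-2 vocabulary


def pad_or_truncate(token_ids, block_size, pad_id=GPT2_EOS_ID):
    kept = token_ids[:block_size]              # truncation: drop tokens past block_size
    n_pad = block_size - len(kept)             # number of pad positions needed
    input_ids = kept + [pad_id] * n_pad        # right-pad with the pad token id
    attention_mask = [1] * len(kept) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


short_ids, short_mask = pad_or_truncate([10, 11, 12], block_size=5)
long_ids, long_mask = pad_or_truncate(list(range(8)), block_size=5)

print(short_ids, short_mask)  # [10, 11, 12, 50256, 50256] [1, 1, 1, 0, 0]
print(long_ids, long_mask)    # [0, 1, 2, 3, 4] [1, 1, 1, 1, 1]
```

With `block_size = 256`, any question-answer sequence longer than 256 tokens loses its tail, including the closing `<end>` marker.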

**Exercise 8: Create a DataCollator object**

In [16]:
# Create a Data collator object
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False, return_tensors="pt")

**Exercise 9: Load pre-trained GPT2LMHeadModel**

In [17]:
# Set up the model
model = GPT2LMHeadModel.from_pretrained('gpt2')

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

**Exercise 10: Fine-tune GPT2 Model [1 Mark]**

- Specify training arguments and create a TrainingArguments object (Use 30 epochs)

- Train a GPT-2 model using the provided training arguments

- Save the resulting trained model and tokenizer to a specified output directory

In [18]:
# Set up the training arguments
model_output_path = "/content/gpt2_model"

training_args = TrainingArguments(
    output_dir = model_output_path,
    overwrite_output_dir = True,
    per_device_train_batch_size = 4, # try with 2
    per_device_eval_batch_size = 4,  #  try with 2
    num_train_epochs = 50,           # the exercise suggests 30; this run used 50 (hence the 5,000 logged steps)
    save_steps = 1_000,
    save_total_limit = 2,
    logging_dir = './logs',
    )

In [19]:
# Train the model
trainer = Trainer(
    model = model,
    args = training_args,
    data_collator = data_collator,
    train_dataset = tokenized_datasets["train"],
    eval_dataset = tokenized_datasets["validation"],
)
trainer.train()
# Save the model
saved_model_path = "/content/finetuned_gpt2_model"
trainer.save_model(saved_model_path)

# Save the tokenizer
tokenizer.save_pretrained(saved_model_path)

Step,Training Loss
500,1.995
1000,1.241
1500,0.7635
2000,0.464
2500,0.2956
3000,0.2098
3500,0.1623
4000,0.1348
4500,0.1188
5000,0.1106


('/content/finetuned_gpt2_model/tokenizer_config.json',
 '/content/finetuned_gpt2_model/special_tokens_map.json',
 '/content/finetuned_gpt2_model/vocab.json',
 '/content/finetuned_gpt2_model/merges.txt',
 '/content/finetuned_gpt2_model/added_tokens.json')
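
The step numbers in the training log above follow from the dataset size, batch size and epoch count. The short calculation below is a back-of-the-envelope check that assumes a single device and no gradient accumulation; neither is stated explicitly in the notebook.

```python
import math

train_samples = 400        # 4 samples x 100 Focus categories
per_device_batch = 4       # per_device_train_batch_size
num_epochs = 50            # num_train_epochs used in this run
num_devices = 1            # assumption: one GPU (or CPU)
grad_accum_steps = 1       # assumption: no gradient accumulation

effective_batch = per_device_batch * num_devices * grad_accum_steps
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)  # 100 5000
```

Under those assumptions the last logged step (5000) matches 50 epochs; running the 30 epochs suggested in the exercise would stop at step 3000.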

**Exercise 11: Test Model with user input prompts [1 Mark]**

- Create `generate_response()` function that takes a trained *model*, *tokenizer*, and a *prompt* string as input and generates a response using the GPT-2 model

- Test it with some user input prompts

In [20]:
def generate_response(model, tokenizer, prompt, max_length=200):
  input_ids = tokenizer.encode(prompt, return_tensors="pt")      # 'pt' for returning pytorch tensor

  # Check the device of the model
  device = next(model.parameters()).device

  # Move input_ids to the same device as the model
  input_ids = input_ids.to(device)

  # Create the attention mask and pad token id
  attention_mask = torch.ones_like(input_ids)
  pad_token_id = tokenizer.eos_token_id

  output = model.generate(
      input_ids,
      max_length=max_length,
      num_return_sequences=1,
      attention_mask=attention_mask,
      pad_token_id=pad_token_id
  )

  return tokenizer.decode(output[0], skip_special_tokens=True)
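
The training sequences wrap each answer in `<question>`, `<answer>` and `<end>` markers, and the notebook never registers these as special tokens, so `skip_special_tokens=True` leaves them in the decoded text. The sketch below shows one possible way to trim a generated string down to the answer part. `extract_answer` is a hypothetical helper, not part of the notebook, and the sample string is made up.

```python
# Hypothetical post-processing helper for text returned by generate_response().
def extract_answer(generated: str, prompt: str = "") -> str:
    text = generated
    # Drop the echoed prompt, if the output starts with it.
    if prompt and text.startswith(prompt):
        text = text[len(prompt):]
    # Keep only what follows the first '<answer>' marker, when present.
    if "<answer>" in text:
        text = text.split("<answer>", 1)[1]
    # Cut at the first '<end>' or at the start of a new '<question>'.
    text = text.split("<end>", 1)[0]
    text = text.split("<question>", 1)[0]
    return text.strip()


sample = "What to do when feeling sick?<answer>Rest and drink fluids.<end><question>Next"
print(extract_answer(sample, "What to do when feeling sick?"))  # Rest and drink fluids.
```

If the model never emits `<answer>`, the helper simply returns the text after the prompt.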


In [21]:
# Load the fine-tuned model and tokenizer

my_model = GPT2LMHeadModel.from_pretrained(saved_model_path)
my_tokenizer = GPT2Tokenizer.from_pretrained(saved_model_path)

In [22]:
# Testing with a sample prompt 1
prompt = 'What precautions to take for a healthy life?'
response = generate_response(my_model, my_tokenizer, prompt)
response

"What precautions to take for a healthy life? - Hemochromatosis and its variants is one of the most common genetic diseases in the United States. It's most common in Caucasians of Northern European descent. - Hemochromatosis is one of the most common genetic diseases in the United States. - There are two forms of hemochromatosis, which are more common in Caucasians of Northern European descent. - The forms are as follows: - Hyaluronic hemochromatosis - Hyperaluric hemochromatosis - Hyperuric hemochromatosis - Both forms are present in varying degrees. - Forms that are most common are hemochromotosis, which is less common in Caucasians, and troponotosis, which is more common in Caucasians of Northern European descent. - Other forms of the disease include - hemochromotosis-imbalances - hemochromotosis-resistant - hemochromot"

In [23]:
# Testing with a sample prompt 2

prompt = 'What to do after being diagnosed with cancer?'
response = generate_response(my_model, my_tokenizer, prompt)
response

'What to do after being diagnosed with cancer? - Ask your doctor about the right way to diagnose cancer. - Check with your doctor if the results indicate that the disease is not cancerous. - Check with your family doctor, dietitian, or physical health specialist to make a definite diagnosis. If the diagnosis is based on a small sample of blood or urine, your doctor may order a second analysis. If the diagnosis is based on a large sample of urine or saliva, your doctor may order a second analysis. Your doctor will work with your doctor to find out whether the disease is cancerous. If not, your doctor may order a routine biopsy. The doctor will use a combination of tests and laboratory tests to determine the disease. The results of a biopsy may show the disease is cancerous. If the disease is found to be cancerous, your doctor will perform a follow-up test: Screening for Skin Cancer. This test may assess the presence of cancer in your skin.'

**Exercise 12: Compare the performance of a *GPT2 model* with the *GPT2 model fine-tuned* on MedQuAD data [0.5 Mark]**

- Load another pre-trained GPT2LMHeadModel and do not fine-tune it

- To generate response using the untuned model, pass it as a parameter to `generate_response()` function

- Test both models (fine-tuned and untuned) with the following user input prompts:

    - "What precautions to take for a healthy life?"
    - "What to do after being diagnosed with cancer?"
    - "What to do when feeling sick?"
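
Besides reading the generated responses side by side, it can help to put a rough number on how much each response repeats itself. The metric below, the fraction of distinct sentences in a response, is an illustrative choice and not part of the assignment. `distinct_sentence_ratio` is a hypothetical helper and the sample strings are made up.

```python
import re


# Hypothetical repetition metric: 1.0 means no sentence is repeated,
# values near 0 mean the text is dominated by repeated sentences.
def distinct_sentence_ratio(text: str) -> float:
    # Split after '.', '?' or '!' when followed by whitespace (including newlines).
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    sentences = [p.strip().lower() for p in parts if p.strip()]
    if not sentences:
        return 0.0
    return len(set(sentences)) / len(sentences)


repetitive = "Take a rest. Take a rest. Take a rest. See a doctor."
varied = "Take a rest. Drink water. See a doctor."

print(distinct_sentence_ratio(repetitive))  # 0.5
print(distinct_sentence_ratio(varied))      # 1.0
```

A response from either model can be passed straight to this function; a low ratio only signals repetition and says nothing about medical correctness.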

In [24]:
# Load a pre-trained GPT2 model, do not finetune it with MedQuAD data
untuned_model = GPT2LMHeadModel.from_pretrained('gpt2')

In [25]:
# Testing with finetuned model: prompt 1

prompt = "What precautions to take for a healthy life?"
response = generate_response(model, tokenizer, prompt)
response

'What precautions to take for a healthy life? - Ask your doctor about any possible side effects of taking certain medications. - Check with your doctor if you have any that you think could be harmful. - If you have a family history of certain diseases or conditions that can cause a problem with your immune system, you may want to talk with your doctor about taking these medications with a grain of salt. You can take the medications with or without food, but you should always consult with your doctor before taking any medications. - Be sure to talk with your doctor about any side effects of taking certain medications with or without food. - If you have any questions about the medicines you take, ask your doctor about them in writing. He or she may ask how much you take, and how much you take in. - Be sure to tell your doctor about any changes in your eating, drinking, and daily routine. You may want to check to make sure that you are getting the proper nutrients and vitamins you need wh

In [26]:
# Testing with untuned model: prompt 1

prompt = "What precautions to take for a healthy life?"
response = generate_response(untuned_model, tokenizer, prompt)
response

"What precautions to take for a healthy life?\n\nThe following are some of the most common questions you'll hear from your doctor or nurse about your health.\n\nWhat are the risks of taking a drug that can cause cancer?\n\nThe risks of taking a drug that can cause cancer are very high.\n\nWhat are the risks of taking a drug that can cause cancer?\n\nThe risks of taking a drug that can cause cancer are very high.\n\nWhat are the risks of taking a drug that can cause cancer?\n\nThe risks of taking a drug that can cause cancer are very high.\n\nWhat are the risks of taking a drug that can cause cancer?\n\nThe risks of taking a drug that can cause cancer are very high.\n\nWhat are the risks of taking a drug that can cause cancer?\n\nThe risks of taking a drug that can cause cancer are very high.\n\nWhat are the risks of taking a drug that can cause"

In [None]:
# Testing with finetuned model: prompt 2

prompt = "What to do after being diagnosed with cancer?"
response = generate_response(model, tokenizer, prompt)
response

'What to do after being diagnosed with cancer? - Ask your doctor about the following questions. - When did I first become aware of my cancer? - Did I have a cold or the flu? - Did I have a head injury? - Did I have a head injury? - Was I exposed to the sun, air, or the earth for a long time? - Is this a normal part of growing older? - Does it affect my ability to do certain activities? - What medicines I take? What are the treatments? - What medicines do I take? What is the outlook for my family? Is this a normal part of growing older? How might I become aware of my cancer? When did I first become aware of my cancer? Ask your doctor about the following questions. When did I first become aware of my cancer? Did I have a cold or the flu? Did I have a head injury? Was I exposed to the sun, air, or the earth for a long time? Is'

In [None]:
# Testing with untuned model: prompt 2

prompt = "What to do after being diagnosed with cancer?"
response = generate_response(untuned_model, tokenizer, prompt)
response

"What to do after being diagnosed with cancer?\n\nThe first step is to get your doctor's approval for a treatment.\n\nIf you have a cancer diagnosis, you may need to get a second opinion.\n\nIf you have a cancer diagnosis, you may need to get a second opinion. If you have a cancer diagnosis, you may need to get a third opinion.\n\nIf you have a cancer diagnosis, you may need to get a third opinion. If you have a cancer diagnosis, you may need to get a fourth opinion.\n\nIf you have a cancer diagnosis, you may need to get a fourth opinion. If you have a cancer diagnosis, you may need to get a fifth opinion.\n\nIf you have a cancer diagnosis, you may need to get a fifth opinion. If you have a cancer diagnosis, you may need to get a sixth opinion.\n\nIf you have a cancer diagnosis, you may need to get a sixth opinion. If you have"

In [None]:
# Testing with finetuned model: prompt 3

prompt = "What to do when feeling sick?"
response = generate_response(model, tokenizer, prompt)
response

'What to do when feeling sick?<answer>A person who has felt ill for a long time may feel ill at first. However, over time, as the illness gets worse, the feeling may decrease until the person is usually not sick. (Watch the video to learn more about the different types of ill-considered illnesses. To enlarge the video, click the brackets in the lower right-hand corner. To reduce the video, press the Escape (Esc) button on your keyboard.) To manage your sick days, here are things to do every day. - Take your medicines for nausea and vomiting. Take your medicines for nausea and vomiting. Take your medicines for anxiety and depression. - Keep your clothes dry. Dry them quickly. - Keep your clothes and your medicines a dry day. - Bring all medicines with you to your next medical appointment. Keep your clothes dry. - Bring your medicines with you to your next medical appointment. - Keep your medicines a dry day. - Check your blood'

In [None]:
# Testing with untuned model: prompt 3

prompt = "What to do when feeling sick?"
response = generate_response(untuned_model, tokenizer, prompt)
response

"What to do when feeling sick?\n\nThe first thing you should do is to get your body to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick, you should take a few minutes to relax.\n\nIf you're feeling sick"

## Push your model to Hugging Face Model Hub

**Exercise 13: Follow the steps below to push your fine-tuned model to the Hugging Face Model Hub**

1. [Sign up](https://huggingface.co/join) for a Hugging Face account
2. Create an access token for your account and save it
3. Store your access token in the Hugging Face cache folder within Colab
4. Push your fine-tuned model and tokenizer to the Model Hub
5. Load the model back from the Hub and test it with user input prompts

* **Create an access token for your account**

    Once you have an account, to create an access token:
    
    - Go to your `Settings`, then click on the `Access Tokens` tab. Click on the `New token` button to create a new User Access Token.
    - Select `Write` as the token type and give your token a name
    - Click on Create token
    - Once the token is created, save it somewhere safe
    - When it is required later, use the saved token or create a new one

    To know more about Access Tokens, refer [here](https://huggingface.co/docs/hub/security-tokens).

* **Store your access token in the Hugging Face cache folder within Colab**

    Once you have your User Access Token, run the following command to authenticate your identity to the Hub.
    - `!huggingface-cli login`
    - Paste your Access token when prompted
    - Type **n** when prompted to Add token as git credential? (Y/n)

    For more details on login, refer [here](https://huggingface.co/docs/huggingface_hub/quick-start#login).

In [27]:
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) n
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


* **Push your fine-tuned model and tokenizer to Model Hub [0.5 Mark]**

    - Use the `push_to_hub()` method of both your model and your tokenizer to push them to the Hub
    - Specify the name of the repository where the model and tokenizer will be pushed using the `repo_id` parameter
    - Push the model and tokenizer to the same repository

    - **Hint:**

        - Use `push_to_hub()` method of your model. For parameter details, refer [here](https://huggingface.co/docs/transformers/main_classes/model#transformers.PreTrainedModel.push_to_hub).
        - Use `push_to_hub()` method of your tokenizer. For parameter details, refer [here](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.PreTrainedTokenizer.push_to_hub).
        - Access your pushed model at `https://huggingface.co/[YOUR-USER-NAME]/[YOUR-MODEL-REPO-NAME]/tree/main`

In [28]:
# Push model
my_repo = "medquad-text-generation-gpt2"
model.push_to_hub(repo_id=my_repo, commit_message="Upload fine-tuned model")

model.safetensors:   0%|          | 0.00/498M [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/Rohan5076/medquad-text-generation-gpt2/commit/8b35fd2682efdb39ade206b6f3cceb8d244b7ba3', commit_message='Upload fine-tuned model', commit_description='', oid='8b35fd2682efdb39ade206b6f3cceb8d244b7ba3', pr_url=None, pr_revision=None, pr_num=None)

In [29]:
# Push tokenizer
tokenizer.push_to_hub(repo_id=my_repo, commit_message="Upload tokenizer used")

README.md:   0%|          | 0.00/5.17k [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/Rohan5076/medquad-text-generation-gpt2/commit/f346794fd2d61942e80c8fb93f9233569d47fea0', commit_message='Upload tokenizer used', commit_description='', oid='f346794fd2d61942e80c8fb93f9233569d47fea0', pr_url=None, pr_revision=None, pr_num=None)

* **Load the model and tokenizer back from Hub and test it with user input prompts [0.5 Mark]**

    - In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the `from_pretrained()` method. **AutoClasses** can be used to automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.

    - Instantiating one of `AutoConfig`, `AutoModel`, and `AutoTokenizer` will directly create a class of the relevant architecture.

    - When the GPT-2 model has a language modeling head on top, you can use an auto class with a language modeling head as well: `AutoModelForCausalLM` (the older `AutoModelWithLMHead` is deprecated).

    - Specify the full path of your model repo, i.e. ***"YOUR-USER-NAME/YOUR-REPO-NAME"***, when calling the `from_pretrained()` method.

In [30]:
from transformers import AutoModelForCausalLM, AutoTokenizer

In [31]:
# Load your model from hub
# (AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead for GPT-2 style models)
username = "rohan5076"
my_checkpoint = username + '/' + my_repo
loaded_model = AutoModelForCausalLM.from_pretrained(my_checkpoint)

config.json:   0%|          | 0.00/907 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/498M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/119 [00:00<?, ?B/s]

In [32]:
# Load your tokenizer from hub

loaded_tokenizer = AutoTokenizer.from_pretrained(my_checkpoint)

tokenizer_config.json:   0%|          | 0.00/525 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/999k [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/470 [00:00<?, ?B/s]

In [33]:
# Response from loaded model

prompt = "What is the outlook for Skin Cancer ?"
response = generate_response(loaded_model, loaded_tokenizer, prompt)
response

'What is the outlook for Skin Cancer?<answer>The long-term outlook for a person with skin cancer is generally good. A person with a small amount of melanoma may develop small patches of thick, red skin with less than 10 percent of the cells remaining in the skin. A person with a high risk of developing skin cancer may also develop small areas of thick, red skin with less than 10 percent of the cells remaining in the skin. The risk of developing a new type of skin cancer or a combination of skin cancer and skin cancer is higher than the risk of developing any one type of skin cancer. Some types of cancer are more common in men than in women. Some types of cancer are more common in men than in women. There are three main types of melanoma: B3, SMC3, and RAD21. Most types of melanoma are benign (noncancerous) and usually do not cause health problems. Most people who have had melanoma will benefit from regular'

## Gradio Implementation

Gradio is an open-source Python library that allows us to quickly create easy-to-use, customizable UI components for an ML model, an API, or any arbitrary function in just a few lines of code. We can integrate the GUI directly into the Python notebook, or we can share the link with anyone.

**Exercise 14: Create a Gradio app for your fine-tuned model pushed to the Hugging Face Model Hub [1 Mark]**

- Install and import `gradio` library
- Create a function to use your fine-tuned model for response generation
    - Use the model and tokenizer directly within the function, do not pass them as parameters
    - Function should take input prompt text, and max response length as its input parameters
    - Function should output the generated response text
- Create input and output gradio elements
- Create a gradio interface object
- Launch the interface to generate UI
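
The steps above can be sketched with explicit Gradio components instead of the `"text"` and `"number"` shortcuts. This is one possible layout, not the required solution: the helper names `clamp_max_length` and `build_interface`, the labels, and the 16 to 512 bounds are all assumptions made for illustration. `gradio` is imported inside the builder so the helper can be read and tried on its own before the library is installed.

```python
# Hypothetical helper: UI widgets may hand back floats, while generate() expects an int.
def clamp_max_length(value, low=16, high=512):
    return max(low, min(high, int(value)))


# Hypothetical builder: wraps any (prompt, max_length) -> str function in a Gradio UI.
def build_interface(generate_fn):
    import gradio as gr  # requires `pip install gradio`

    def respond(prompt, max_length):
        return generate_fn(prompt, clamp_max_length(max_length))

    return gr.Interface(
        fn=respond,
        inputs=[
            gr.Textbox(label="Medical question", lines=2),
            gr.Slider(minimum=16, maximum=512, value=200, step=1,
                      label="Max response length (tokens)"),
        ],
        outputs=gr.Textbox(label="Generated response"),
    )


print(clamp_max_length(255.0))  # 255
```

Once the response function is defined, `build_interface(generate_query_response).launch()` would start the UI. The upper bound is kept well below GPT-2's 1024-token context window.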

In [34]:
!pip -q install gradio

In [35]:
import gradio
import gradio as gr

In [36]:
# Function for response generation

def generate_query_response(prompt, max_length=200):

    model = loaded_model
    tokenizer = loaded_tokenizer

    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    attention_mask = torch.ones_like(input_ids)
    pad_token_id = tokenizer.eos_token_id
    output = model.generate(
        input_ids,
        max_length=int(max_length),  # Gradio's "number" input may arrive as a float
        num_return_sequences=1,
        attention_mask=attention_mask,
        pad_token_id=pad_token_id
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)



In [37]:
# Quick check of the function before wiring it to Gradio

# Sample input
in_prompt = 'What is the outlook for Skin Cancer ?'
in_max_length = 255
# Sample output
out_response = generate_query_response(in_prompt, in_max_length)

In [38]:
# Gradio interface to generate UI link
iface = gradio.Interface(fn=generate_query_response,
    inputs=["text", "number"],
    outputs="text")

# Launch the interface
iface.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://883397b6b360f1820f.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




## Upload your Gradio application on Hugging Face Spaces

**Exercise 15: Upload your Gradio application on Hugging Face Spaces [2 Marks]**

1. Start a new Hugging Face Space by going to your profile and [clicking "New Space"](https://huggingface.co/new-space)

2. Provide details for your space:
    - Space name
    - License (eg. [MIT](https://opensource.org/licenses/MIT))
    - Space SDK (software development kit) (eg. `Gradio`)
    - Space hardware (CPU basic)
    - Choose whether your Space is public or private
    - Click "Create Space"

3. Go to the ***Add files -> Create a new file*** option to add the following files:
    - `requirements.txt`: should contain the dependencies needed to run your app, such as transformers, torch, and gradio
    - `app.py`: should contain the steps to
        - import the required packages
        - load your fine-tuned model and tokenizer from the Model Hub
        - define a function that uses your fine-tuned model for response generation
        - create input and output gradio elements
        - create a gradio interface object
        - launch the interface to generate the UI

4. Open the `App` tab of your repository to see the build progress (debug if the build fails)

5. Once the app has built successfully, test the application running on your Space with a user input prompt
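
A minimal sketch of what `app.py` might look like is given below. It is an outline under assumptions rather than a reference solution: the function names are hypothetical, the repository id is the one pushed earlier in this notebook and should be replaced with your own, and the heavy imports are placed inside the functions only so that each piece can be read in isolation (in a real `app.py` they would normally sit at the top of the file).

```python
# app.py (sketch). Replace MODEL_REPO with your own "USER-NAME/REPO-NAME".
MODEL_REPO = "Rohan5076/medquad-text-generation-gpt2"

# Lines that requirements.txt would contain, one package per line.
REQUIREMENTS = ["transformers", "torch", "gradio"]


def load_model(repo_id=MODEL_REPO):
    """Download the fine-tuned model and tokenizer from the Model Hub."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()
    return model, tokenizer


def make_generate_fn(model, tokenizer):
    """Return a (prompt, max_length) -> text function bound to the loaded model."""
    import torch

    def generate(prompt, max_length=200):
        inputs = tokenizer(prompt, return_tensors="pt")  # input_ids + attention_mask
        with torch.no_grad():
            output = model.generate(
                **inputs,
                max_length=int(max_length),              # the UI may pass a float
                pad_token_id=tokenizer.eos_token_id,
            )
        return tokenizer.decode(output[0], skip_special_tokens=True)

    return generate


def build_demo():
    """Create the Gradio interface around the loaded model."""
    import gradio as gr
    model, tokenizer = load_model()
    return gr.Interface(fn=make_generate_fn(model, tokenizer),
                        inputs=["text", "number"], outputs="text")


# In the actual app.py, end the file with:
# build_demo().launch()
```

Loading the model once at start-up, rather than inside the response function, avoids downloading or re-reading the weights on every request.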

