<a href="https://colab.research.google.com/github/Dhanasree-Rajamani/Deep_Learning_Assignments/blob/main/Assignment_2a/Gradio_Chat_GPT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Building a ChatGPT-like application using Gradio and Google's Flan-T5 Large from Hugging Face

Import Statements

In [9]:
from IPython.display import HTML, display
import torch
import multiprocessing

Use the GPU for computation if one is available, otherwise fall back to the CPU.

In [10]:
device = "cuda:0" if torch.cuda.is_available() else "cpu"

Installations

In [11]:
!pip install gradio transformers accelerate bitsandbytes sentencepiece



This notebook uses Google's Flan-T5 Large rather than Flan-T5 XL: the Large model has about 780M parameters, while the XL model has about 3B and needs much more RAM. My Colab instance did not have enough RAM for XL, so the Large model is used here.
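
As a rough illustration of the RAM argument, the sketch below estimates the memory needed just to hold the weights, assuming 4 bytes per parameter in float32 and 2 bytes in float16. The helper name `weight_memory_gb` is hypothetical, and the figures ignore activations and framework overhead, so treat them as lower bounds rather than measurements.

```python
def weight_memory_gb(num_params, bytes_per_param=4):
    """Approximate memory (in GiB) needed to store the model weights alone."""
    return num_params * bytes_per_param / 1024**3

# Approximate parameter counts: ~780M (Large) and ~3B (XL)
for name, params in [("flan-t5-large", 780e6), ("flan-t5-xl", 3e9)]:
    fp32 = weight_memory_gb(params, 4)  # 4 bytes per float32 parameter
    fp16 = weight_memory_gb(params, 2)  # 2 bytes per float16 parameter
    print(f"{name}: ~{fp32:.1f} GiB in float32, ~{fp16:.1f} GiB in float16")
```

Under these assumptions the weights alone come to roughly 3 GiB for Large versus roughly 11 GiB for XL in float32, which is consistent with XL not fitting in a small Colab instance.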

The T5 model is used here for tasks such as text summarization. It is trained end-to-end with text as input and modified text as output. This can protect sensitive text while preserving its business utility.

The Flan variants of T5 were produced by instruction finetuning, including multitask instruction finetuning and chain-of-thought finetuning.

In [12]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto")

This is the main function, called to obtain the model's answer to a question. It takes the input text (the user's question or passage), the minimum and maximum length of the expected result, the number of beams for beam search, and the temperature, which controls the randomness of sampling.

In [13]:
def generate(input_text, minimum_length, maximum_length, beam_search, temperature):
  # Tokenize the prompt and move it to the same device as the model
  input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
  # Gradio sliders may return floats, so cast the integer-valued arguments.
  # Note: temperature, top_k and top_p only take effect when do_sample=True.
  output = model.generate(input_ids, min_length=int(minimum_length), max_new_tokens=int(maximum_length), length_penalty=1.8, num_beams=int(beam_search), no_repeat_ngram_size=3, temperature=temperature, top_k=150, top_p=0.92, repetition_penalty=2.4)
  return tokenizer.decode(output[0], skip_special_tokens=True)
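
To make the role of temperature concrete, here is a minimal pure-Python sketch of how temperature rescales a distribution before sampling: the logits are divided by the temperature and then passed through a softmax. The function name `softmax_with_temperature` and the toy logits are illustrative assumptions, not part of this notebook or of the transformers API. In `model.generate`, this kind of rescaling is only applied when sampling is enabled with `do_sample=True`.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be strictly positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate almost all of the probability on the highest-scoring token, while higher temperatures flatten the distribution and make less likely tokens more probable.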

This cell builds the Gradio UI. It defines some example questions for the AI to answer, sliders for minimum length, maximum length, beam search and temperature, and the text boxes and buttons the interface needs. The generate function is called to get results from the model.

In [14]:
import gradio as gr

examples = [["Is there a best color on earth? What is the reason for this answer?"],
            ["Can Mahatma Gandhi have a conversation with Barack Obama? What is the rationale behind your answer?"],
            ["Why is Mars the only planet that can be habitable for human beings?"],
            ["Translate to Tamil: I love playing and eating a lot of food"],
            ["Generate a cooking recipe to prepare a blueberry cake. Give step by step instructions"],
            ["The tower is 330 metres (1,083 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest human-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure in the world to surpass both the 200-metre and 300-metre mark in height. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct. The tower has three levels for visitors, with restaurants on the first and second levels. The top level's upper platform is 276 m (906 ft) above the ground – the highest observation deck accessible to the public in the European Union."],
            ["Answer the following question by reasoning step by step. The world consists of 12 triangles. If we lose 3 triangles, and destroy 4 triangles, how many would be left?"]]

title = "Chat GPT using Flan-T5-Large and Gradio UI"

def inference(text, minimum_length, maximum_length, beam_search, temperature):
  return generate(text, minimum_length, maximum_length, beam_search, temperature)

io = gr.Interface(
    fn=inference, 
    inputs=[gr.Textbox(lines=4), gr.Slider(10,500), gr.Slider(20,1000), gr.Slider(1, 16, step=1), gr.Slider(0, 1)],
    outputs = [
        gr.Textbox(lines=2, label="Flan-T5-Large Inference")],title = title, examples = examples, css= """body{background-color: Purple}.input_text input{background-color: green !important;}""")
io.launch(share = True, debug=False)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://dcb02e0e-4b47-4e00.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
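
Because the sliders are independent, a user can pick a minimum length above the maximum length, or a temperature of 0. One possible way to guard against that before calling `generate` is sketched below. The helper `sanitize_params` and its clamping rules are assumptions for illustration, not something this notebook does.

```python
def sanitize_params(minimum_length, maximum_length, beam_search, temperature):
    """Coerce raw slider values into a consistent set of generation settings."""
    max_len = max(1, int(maximum_length))
    # The minimum length should never exceed the maximum length
    min_len = min(max(0, int(minimum_length)), max_len)
    beams = max(1, int(beam_search))
    # Sampling requires a strictly positive temperature; 1e-2 is an arbitrary floor
    temp = max(float(temperature), 1e-2)
    return min_len, max_len, beams, temp

# Example: minimum above maximum and a zero temperature are both corrected
print(sanitize_params(500.0, 20.0, 4.0, 0.0))
```

The cleaned values could then be forwarded unchanged, for example with `generate(text, *sanitize_params(minimum_length, maximum_length, beam_search, temperature))`.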


