## **Install Required Dependencies**

In [2]:
# accelerate is needed because the model is loaded with device_map="auto"
!pip install transformers gradio torch safetensors accelerate

Collecting gradio
  Downloading gradio-5.15.0-py3-none-any.whl.metadata (16 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Downloading aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Collecting fastapi<1.0,>=0.115.2 (from gradio)
  Downloading fastapi-0.115.8-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Downloading ffmpy-0.5.0-py3-none-any.whl.metadata (3.0 kB)
Collecting gradio-client==1.7.0 (from gradio)
  Downloading gradio_client-1.7.0-py3-none-any.whl.metadata (7.1 kB)
Collecting markupsafe~=2.0 (from gradio)
  Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting pydub (from gradio)
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart>=0.0.18 (from gradio)
  Downloading python_multipart-0.0.20-py3-none-any.whl.metadata (1.8 kB)
Collecting ruff>=0.9.3 (from gradio)
  Downloading ruff-0.9.5-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.meta

### **Import Necessary Libraries**

In [3]:
# For loading GPT-based models
from transformers import AutoModelForCausalLM, AutoTokenizer

# For creating an interactive UI
import gradio as gr

# PyTorch for deep learning computations
import torch

### **Load GPT-Neo Model**

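Model: GPT-Neo 2.7B (`EleutherAI/gpt-neo-2.7B`)

Precision: 16-bit float (`torch.float16`)

Device: Placed automatically on the available GPU/CPU (`device_map="auto"`)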


In [4]:
#  Function to Load GPT-Neo Model with Automatic Resource Management

def load_gpt_neo_model():
    try:
        model_name = "EleutherAI/gpt-neo-2.7B"
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(
            model_name,
            torch_dtype=torch.float16,
            device_map="auto"
        )
        return tokenizer, model
    except Exception as e:  # Prints an error message if loading fails
        print("Error loading GPT-Neo model:", e)
        raise   # Raises the exception to stop execution
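
As a rough illustration of why the model is loaded in 16-bit precision, the sketch below estimates the memory needed for the weights alone. The parameter count is an assumption taken from the model name ("2.7B"), not a measured value, and `weight_memory_gb` is a hypothetical helper. Activations and framework overhead are ignored.

```python
# Back-of-the-envelope estimate of weight memory.
# Assumption: ~2.7 billion parameters, read off the model name.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate size of the weights in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

approx_params = 2.7e9  # assumed, not measured
for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_gb(approx_params, dtype):.1f} GB")
# float32: ~10.8 GB
# float16: ~5.4 GB
```

Under these assumptions, half precision roughly halves the memory needed for the weights, which can decide whether the model fits on a single Colab GPU.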

### **Define GPT-Neo Response Function**

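Text: Tokenizes the prompt into a tensor and moves it to the model's device

Limit: 150 tokens in total (`max_length` counts the prompt as well as the generated text)

Sampling: Enabled (`do_sample=True`)

Top-k: 50 tokens

Nucleus sampling (top-p): 0.92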
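
To make the two sampling settings above more concrete, here is a simplified, pure-Python sketch of top-k followed by nucleus (top-p) filtering on a toy distribution. It is not the `transformers` implementation; `top_k_top_p_filter` and the toy vocabulary are made up for illustration.

```python
import random

def top_k_top_p_filter(probs, top_k=50, top_p=0.92):
    """Keep the top_k most likely tokens, then the smallest prefix of them
    whose cumulative probability reaches top_p, and renormalize."""
    # Rank tokens from most to least likely and apply the top-k cut.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]

    # Nucleus (top-p) cut: stop once the cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so their probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Toy next-token distribution (hypothetical values).
toy_probs = {"cat": 0.5, "dog": 0.3, "bird": 0.15, "fish": 0.05}
filtered = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8)
print(filtered)  # only "cat" and "dog" survive both cuts

# Sample the next token from the filtered distribution.
next_token = random.choices(list(filtered), weights=list(filtered.values()))[0]
print(next_token)
```

With `top_k=3`, "fish" is dropped first. With `top_p=0.8`, "bird" is then dropped because "cat" and "dog" already cover more than 80% of the remaining probability mass.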


In [5]:
# Function to generate a response using GPT-Neo

def gpt_neo_response(prompt):
    try:
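        # Tokenize the prompt and move it to the model's device (GPU or CPU),
        # so this also works when the CPU-only GPT-2 fallback is in use
        inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)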
        outputs = model.generate(
            inputs,
            max_length=150,
            do_sample=True,
            top_k=50,
            top_p=0.92
        )
        response = tokenizer.decode(outputs[0], skip_special_tokens=True)  # Converts tensor back to text
        return response
    except Exception as e:
        return f"Error generating response: {str(e)}" # Returns an error message in case of failure
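
Two details of the function above are easy to miss: `max_length` covers the prompt plus the generated tokens, and a causal language model returns the prompt followed by its continuation. The sketch below illustrates both with two hypothetical helpers, `new_token_budget` and `strip_prompt`. It assumes the decoded text starts with the prompt exactly, which may not hold after tokenization round trips.

```python
def new_token_budget(prompt_tokens: int, max_length: int = 150) -> int:
    """Tokens left for generation when max_length also counts the prompt."""
    return max(max_length - prompt_tokens, 0)

def strip_prompt(response: str, prompt: str) -> str:
    """Remove the echoed prompt from the start of the response, if present."""
    if response.startswith(prompt):
        return response[len(prompt):].lstrip()
    return response  # leave the text unchanged if the prompt is not echoed verbatim

print(new_token_budget(30))   # 120
print(new_token_budget(200))  # 0
print(strip_prompt("What is GPT-Neo? It is a language model.", "What is GPT-Neo?"))
# It is a language model.
```

If long prompts are expected, passing `max_new_tokens` to `generate` instead of `max_length` limits only the generated part.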

### **Initialize the Model**

Load: GPT-Neo

Fallback: GPT-2 tokenizer and model

In [6]:
# Attempt to initialize the GPT-Neo model

try:
    tokenizer, model = load_gpt_neo_model()
except Exception:
    # Any loading failure (not only insufficient resources) triggers the fallback
    print("Falling back to GPT-2 because GPT-Neo could not be loaded.")
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json:   0%|          | 0.00/200 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/1.46k [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/798k [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/90.0 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/10.7G [00:00<?, ?B/s]
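
The load-then-fall-back pattern used in this cell can be generalized to any number of candidates. The following is a minimal sketch with stand-in loaders so that it runs without downloading anything. `load_with_fallback`, `load_large`, and `load_small` are hypothetical names, not part of `transformers`.

```python
def load_with_fallback(loaders):
    """Try each (name, loader) pair in order and return the first success."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # broad on purpose: any failure moves to the next candidate
            errors.append(f"{name}: {exc}")
            print(f"Could not load {name}: {exc}")
    raise RuntimeError("All loaders failed: " + "; ".join(errors))

# Stand-in loaders that simulate a large model failing and a small one succeeding.
def load_large():
    raise MemoryError("not enough memory")

def load_small():
    return "small-model"

chosen_name, loaded = load_with_fallback([("large", load_large), ("small", load_small)])
print(chosen_name, loaded)  # small small-model
```

In the notebook, the loaders would wrap `load_gpt_neo_model()` and the two GPT-2 `from_pretrained` calls.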

### **Create a Gradio Chatbot Interface**

In [7]:
# Function to interact with the chatbot via Gradio

def chatbot(prompt):
    # Calls the response generation function
    return gpt_neo_response(prompt)
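
A thin wrapper like this is also a convenient place for input checks. The sketch below is one possible variant, not part of the original notebook. `make_chatbot` and `echo_model` are hypothetical names, and the stand-in responder replaces the real model so that the example runs on its own.

```python
def make_chatbot(respond, empty_message="Please enter a prompt."):
    """Wrap a response function so that empty prompts never reach the model."""
    def guarded_chatbot(prompt):
        cleaned = (prompt or "").strip()
        if not cleaned:
            return empty_message
        return respond(cleaned)
    return guarded_chatbot

# Stand-in for gpt_neo_response, used only for this demonstration.
def echo_model(prompt):
    return f"(model output for: {prompt})"

demo_chatbot = make_chatbot(echo_model)
print(demo_chatbot("   "))               # Please enter a prompt.
print(demo_chatbot("What is GPT-Neo?"))  # (model output for: What is GPT-Neo?)
```

In the notebook, `make_chatbot(gpt_neo_response)` could be passed as `fn` to the Gradio interface in place of `chatbot`.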

In [8]:
# Create a Gradio interface for the chatbot

interface = gr.Interface(
    fn=chatbot,   # Links the chatbot function to the UI
    inputs="text",
    outputs="text",
    title="GPT-Neo Chatbot",
    description="A chatbot using GPT-Neo with fallback to GPT-2.",
    theme="default"   # Uses Gradio's default theme
)


### **Launch the Gradio Chatbot**

In [None]:
# Starts the UI, allowing sharing and debugging

interface.launch(share=True, debug=True)

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://b1e0adf8e66ca09d12.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


**Public URL**

A unique Gradio public link is generated (e.g., https://c1d451a72e3218ffe9.gradio.live).

This link allows you to share the chatbot interface with others over the internet.

Note: This link typically expires after 72 hours.


**Gradio Interface**

Title: "GPT-Neo Chatbot"

This is the name shown at the top of the interface, set with the `title` parameter of `gr.Interface`.

Description: "A chatbot using GPT-Neo with fallback to GPT-2."

This summarizes what the chatbot does, as set with the `description` parameter.


**Input and Output Fields**

Input Field (prompt):
Users can enter a query or prompt here.

Example: "What is GPT-Neo?"

Output Field (output):
The chatbot generates and displays its response to the query here.


**Buttons**

Submit Button:
Triggers the chatbot to process the input and generate a response.

Clear Button:
Clears both the input and output fields.

Flag Button:
Saves the current input and output so that inappropriate or incorrect responses can be reviewed later.