<a href="https://colab.research.google.com/github/Madhusudan3223/MadBotX-AI-Chatbot/blob/main/MadBotX_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🤖 MadBotX: Real-Time AI Chatbot with Hugging Face Zephyr

MadBotX is a lightweight, interactive AI chatbot built on Hugging Face's **Zephyr-7B** model. Users chat with it through a **Gradio** web interface, and the model runs inside the notebook itself, so no paid external API is needed.

## 🔗 MadBotX Chatbot on Hugging Face Spaces
[Click here to open MadBotX](https://huggingface.co/spaces/madhumandal/MadBotX)

## 🚀 Features

- 🔍 **Conversational AI** using `HuggingFaceH4/zephyr-7b-alpha`
- ⚡ **Automatic precision and device placement** with `torch_dtype="auto"` and `device_map="auto"`
- 🌐 **Web UI** powered by Gradio ChatInterface
- 🧠 **Context-aware replies** with user-assistant memory
- ☁️ **Deployable on Hugging Face Spaces** for free public access

## 🖥️ UI & Interaction

The app uses **Gradio’s ChatInterface**, which:
- Keeps a history of user & assistant turns
- Passes each new message and the history to `generate_reply`, which formats the prompt
- Provides a clean and responsive frontend

## 📦 Install Dependencies & Authenticate with Hugging Face
Before running the chatbot, we need to install the required libraries and authenticate with the Hugging Face Hub to access the Zephyr model.

In [None]:
# Install required libraries
!pip install transformers accelerate gradio --quiet

# Login to Hugging Face
from huggingface_hub import login
from google.colab import userdata

# Read the token from Colab Secrets (stored under the key "HF_TOKEN_ID")
login(userdata.get("HF_TOKEN_ID"))
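
The cell above only works inside Colab, because `google.colab` is not importable elsewhere. As a minimal sketch, the token lookup could fall back to an environment variable when the notebook runs on another machine. The helper name `get_hf_token` and the choice of `HF_TOKEN` as the variable name are assumptions made for illustration, not part of the original notebook.

```python
import os


def get_hf_token(secret_name="HF_TOKEN_ID", env_var="HF_TOKEN"):
    """Return a Hugging Face token from Colab Secrets, else from the environment.

    Hypothetical helper: `secret_name` matches the key used in this notebook,
    `env_var` is an assumed fallback name.
    """
    try:
        # Only importable inside Google Colab
        from google.colab import userdata
        return userdata.get(secret_name)
    except Exception:
        # Not running in Colab, or the secret is missing or not shared
        return os.environ.get(env_var)
```

The result can then be passed to `login(...)` in place of the direct `userdata.get` call. If it is `None`, no token was found in either place.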

## 🧠 GPU Availability Check
Before running the Zephyr model for real-time inference, it's important to check whether a GPU is available. This block uses PyTorch to confirm GPU access, which speeds up model inference significantly.

In [None]:
import torch

if torch.cuda.is_available():
    print("GPU is available.")
    print("Device name:", torch.cuda.get_device_name(0))
else:
    print("GPU is not available.")

GPU is available.
Device name: Tesla T4
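
Finding a GPU is only half of the check; the model weights also have to fit in its memory. The sketch below adds up the eight checkpoint shard sizes reported when the model was downloaded for this notebook and compares the total with the T4. The 16 GB figure is the card's nominal capacity and is an assumption here; usable memory is somewhat lower, and `torch.cuda.get_device_properties(0).total_memory` reports the actual value in bytes.

```python
# Shard sizes in GB, as reported when the checkpoint was downloaded (shards 1 to 8)
SHARD_SIZES_GB = [1.89, 1.95, 1.98, 1.95, 1.98, 1.95, 1.98, 0.816]

# Assumption: nominal capacity of a Tesla T4; usable memory is lower in practice
GPU_MEMORY_GB = 16.0


def weight_headroom(shard_sizes_gb, gpu_memory_gb):
    """Return (total weight size, memory left over) in GB."""
    total = sum(shard_sizes_gb)
    return total, gpu_memory_gb - total


total, headroom = weight_headroom(SHARD_SIZES_GB, GPU_MEMORY_GB)
print(f"weights: {total:.2f} GB, headroom: {headroom:.2f} GB")
```

The weights come to roughly 14.5 GB, which leaves little room for activations and the generation cache. With `device_map="auto"`, layers that do not fit on the GPU may be placed on the CPU instead, which slows generation.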


## 🤖 Load Zephyr Chat Model
This block loads the Zephyr 7B Alpha model using Hugging Face's transformers pipeline. It enables real-time conversational AI directly in Colab.

In [None]:
from transformers import pipeline

# Load chat model
chatbot = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-alpha",
    torch_dtype="auto",
    device_map="auto"
)

# Test it with a simple input
response = chatbot("### User: What is the capital of France?\n### Assistant:", max_new_tokens=100)
print(response[0]['generated_text'])


Device set to use cuda:0


### User: What is the capital of France?
### Assistant: The capital of France is Paris.
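
The `### User:` / `### Assistant:` layout in the test prompt is a generic instruction format. The Zephyr model card describes a chat template built around `<|user|>` and `<|assistant|>` markers, and the tokenizer can render it from a list of role/content messages. The sketch below only builds that message list; the function name `to_messages` and the optional `system_prompt` argument are illustrative assumptions.

```python
def to_messages(message, history=(), system_prompt=None):
    """Convert a new message plus earlier (user, assistant) pairs into chat messages."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for user, bot in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": bot})
    # The newest user message always comes last
    messages.append({"role": "user", "content": message})
    return messages


messages = to_messages("What is the capital of France?")
```

With the pipeline loaded above, `chatbot.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should turn this list into a prompt string that uses the model's own markers, which can then be passed to `chatbot(...)` as before.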


## 💬 Build Gradio Chatbot Interface (MadBotX)
This section creates an interactive chatbot UI using Gradio and integrates it with the Zephyr 7B Alpha model.

In [None]:
import gradio as gr

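# Build a "### User / ### Assistant" prompt from the chat history and generate a reply
def generate_reply(message, history):
    # history arrives as (user, assistant) pairs, Gradio's tuple-style format
    prompt = ""
    for user, bot in history:
        prompt += f"### User: {user}\n### Assistant: {bot}\n"
    prompt += f"### User: {message}\n### Assistant:"

    # return_full_text=False returns only the newly generated text, not the prompt
    result = chatbot(prompt, max_new_tokens=150, do_sample=True, temperature=0.7,
                     return_full_text=False)
    # Drop any extra "### User:" turn the model may invent after its answer
    reply = result[0]['generated_text'].split("### User:")[0].strip()
    return reply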

# Launch interactive Gradio chat
gr.ChatInterface(
    fn=generate_reply,
    title="🤖 MadBotX",
    theme="soft",
    description="Ask anything in real-time using Hugging Face's Zephyr model!"
).launch(share=True)


Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://fbb7b6aed71e3e7929.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
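
Because `generate_reply` re-sends the whole history on every turn, the prompt grows with the conversation and can eventually exceed what the model handles well. One possible mitigation is to keep only the most recent turns that fit a size budget. The sketch below uses character count as a rough stand-in for token count; the helper name `trim_history` and the `max_chars` default are assumptions, not values from the original notebook.

```python
def trim_history(history, max_chars=2000):
    """Keep the most recent (user, assistant) pairs whose combined length fits max_chars."""
    kept = []
    used = 0
    # Walk backwards so the newest turns are kept first
    for user, bot in reversed(history):
        cost = len(user) + len(bot)
        if used + cost > max_chars:
            break
        kept.append((user, bot))
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Calling `history = trim_history(history)` at the top of `generate_reply` would apply it before the prompt is built. Counting tokens with the model's tokenizer would be more accurate than counting characters.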


