<a href="https://colab.research.google.com/github/jchen8000/DemystifyingLLMs/blob/main/6_Deployment/Chatbot_HuggingFace_Hosted_LLM_ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 6.11 Chatbot, Example of LLM-Powered Application

## Notebook Description & Compatibility Notice

This notebook demonstrates how to build a lightweight conversational chatbot using **Hugging Face Inference APIs** via the `huggingface_hub` Python SDK.

### What this notebook does
- Uses **Hugging Face hosted large language models** for text generation
- Authenticates requests using a **Hugging Face Access Token**
- Leverages the `InferenceClient` with **automatic provider selection**, which dynamically routes requests to the best available inference backend to reduce 404/503 errors
- Implements a simple interactive chat loop with conversation history and configurable generation parameters (temperature and max tokens)

### Authentication Requirement
This notebook **requires a Hugging Face Access Token** to run.

- Create a token at: https://huggingface.co/settings/tokens  
- Store the token securely (e.g., as a Colab secret named `HF_TOKEN`)
- The token is retrieved at runtime and is **not hard-coded** in the notebook

Without a valid token, inference requests will fail.
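
To run the same code outside Colab, the token lookup can fall back to an environment variable. The following is a minimal sketch, not part of the original notebook: the helper name `load_hf_token` is hypothetical, and it assumes the token is stored under the same name (`HF_TOKEN`) in both places.

```python
import os

def load_hf_token(secret_name="HF_TOKEN"):
    """Return the token from Colab secrets if available, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime
        from google.colab import userdata
        token = userdata.get(secret_name)
    except Exception:
        # Not running in Colab, or the secret is missing or access was not granted
        token = None

    # Fall back to an environment variable of the same name (an assumption)
    token = token or os.environ.get(secret_name)
    if not token:
        raise RuntimeError(
            f"No Hugging Face token found; set the {secret_name} secret "
            "or environment variable."
        )
    return token
```

Failing early with a clear message is usually easier to diagnose than an authentication error returned by the first inference request.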

### Model Notes
- The notebook uses the model: `Qwen/Qwen2.5-7B-Instruct`
- This model is stable and high-performance **as of 2026**
- The model ID can be changed to any compatible Hugging Face chat model

### ⚠️ Compatibility & Future-Proofing Notice
This notebook and its code are **tested and functional in 2026**. However, Hugging Face continuously updates its:
- Models and model availability
- API endpoints and request formats
- SDK behavior (`huggingface_hub`, `transformers`)
- Hosting providers and inference backends
- Authentication and usage policies

As a result, this notebook **may stop working in the future without modification**.

Potential breaking changes include:
- Model deprecation or renaming
- Changes to the inference API interface
- Authentication or quota policy updates
- SDK version incompatibilities

This notebook should be treated as a **reference implementation**, not a permanently guaranteed interface.
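
One way to spot an SDK version mismatch early is to compare the installed packages against the versions pinned in the install cell. This is a rough sketch using only the standard library; the helper name `find_version_mismatches` and the `PINNED` dictionary are illustrative and not part of the notebook.

```python
from importlib.metadata import version, PackageNotFoundError

# Versions taken from the notebook's install cell
PINNED = {"huggingface_hub": "0.36.0", "transformers": "4.57.3"}

def find_version_mismatches(pinned):
    """Map each package whose installed version differs from its pin
    to a tuple (expected, installed); installed is None if not found."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches

for name, (expected, installed) in find_version_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means the environment matches the pins; it does not guarantee that the hosted models or endpoints are unchanged.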

In [None]:
%pip install \
  huggingface_hub==0.36.0 \
  transformers==4.57.3

In [None]:
from huggingface_hub import InferenceClient
from google.colab import userdata

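# Read the token from Colab secrets so it never appears in the notebook
hf_token = userdata.get('HF_TOKEN')

# No model is set here; the model ID is passed with each request
client = InferenceClient(token=hf_token)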

In [None]:
MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"

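def ask_model(message, history, temperature=0.7, max_tokens=512):
    """Send the conversation so far plus the new message; return the reply text."""
    # Start every request with the system prompt
    messages = [{"role": "system", "content": "You are a helpful AI assistant."}]

    # Replay earlier turns so the model sees the conversation context
    for user_msg, bot_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})

    # The new user message goes last
    messages.append({"role": "user", "content": message})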

    try:
        # We pass the model ID here instead of in the Client constructor
        # This allows the 'auto' provider system to work best
        response = client.chat.completions.create(
            model=MODEL_ID,
            messages=messages,
            max_tokens=max_tokens,
            temperature=temperature,
        )
        return response.choices[0].message.content
    except Exception as e:
        # Any failure (network, authentication, quota, unavailable model) lands here
        return f"Request failed: {e}. Please try again in a moment."
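
`ask_model` resends the full history on every call, so the prompt grows with each turn. The sketch below moves message construction into a helper and caps the number of past turns it includes. The names `build_messages` and `max_turns` are illustrative assumptions, not part of the notebook, and the default cap of 10 is arbitrary.

```python
def build_messages(message, history,
                   system_prompt="You are a helpful AI assistant.",
                   max_turns=10):
    """Build a chat message list from (user, assistant) pairs,
    keeping only the most recent max_turns pairs."""
    messages = [{"role": "system", "content": system_prompt}]

    # history[-0:] would return the whole list, so handle zero explicitly
    recent = history[-max_turns:] if max_turns > 0 else []
    for user_msg, bot_msg in recent:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})

    # The new user message always goes last
    messages.append({"role": "user", "content": message})
    return messages
```

The returned list could be passed as `messages=` to `client.chat.completions.create` in place of the list built inline in `ask_model`. Dropping old turns trades long-range context for a bounded prompt size.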

In [None]:
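def chatbot():
    print(f"--- Chatbot Live ({MODEL_ID}) ---")
    print("Type 'quit' or 'exit' to stop.\n")
    history = []  # list of (user message, assistant reply) pairs

    while True:
        user_input = input("You: ")
        # Ignore case and surrounding whitespace when checking for an exit command
        if user_input.strip().lower() in ["quit", "exit"]:
            break

        answer = ask_model(user_input, history)
        print(f"\nAI: {answer}\n")
        # Keep the turn so later requests include it as context
        history.append((user_input, answer))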

chatbot()

Chatbot initialized. You can start chatting now (type 'quit' to stop)!

You: Hello
Chatbot: Hello! How can I assist you today? If you have any questions or need help with something, feel free to ask. I'm here to help.

You: How are you?
Chatbot: I'm just a computer program, so I don't have feelings or emotions like humans do. I'm here to provide information and help you with your questions to the best of my ability. How can I assist you today?

You: What are the top 5 largest cities in Canada?
Chatbot: The top 5 largest cities in Canada by population (as of 2021) are:

1. Toronto (Toronto-Durham Region CMA) - 6,417,516
2. Montreal (CMA) - 4,340,395
3. Vancouver (CMA) - 2,642,811
4. Calgary (CMA) - 1,388,988
5. Ottawa-Gatineau (CMA) - 1,429,629 (split between Ontario and Quebec)

These population numbers are for the entire metropolitan areas, not just the city proper. The cities themselves have smaller populations.

You: What is the next largest city?
Chatbot: The next largest city in C