<a href="https://colab.research.google.com/github/paulusshewamre/huggingface-chatbot-intent-recognition/blob/main/chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -q sentence-transformers transformers qdrant-client accelerate

Import dependencies

In [2]:
from sentence_transformers import SentenceTransformer
from transformers import pipeline
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct
import uuid
import numpy as np

Load embedding model + chat LLM

In [3]:
print("Loading models...")
embedder = SentenceTransformer("all-MiniLM-L6-v2")

chat_llm = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    max_new_tokens=80,
    do_sample=True,  # temperature is ignored unless sampling is enabled
    temperature=0.7
)
print("✅ Models loaded.\n")

Loading models...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Device set to use cpu


✅ Models loaded.



Define intents and example sentences

In [4]:
intents = {
    "greeting": {
        "examples": ["hi", "hello", "hey", "good morning", "good evening"],
        "response": "Hey there! How are you doing today?"
    },
    "goodbye": {
        "examples": ["bye", "see you", "goodbye", "catch you later"],
        "response": "Goodbye! Talk to you soon"
    },
    "ask_weather": {
        "examples": ["what's the weather", "is it raining", "how's the weather"],
        "response": "It looks like a nice day! ☀️ (I don’t have real-time data.)"
    },
    "ask_movie": {
        "examples": ["recommend a movie", "suggest a film", "best sci-fi movie"],
        "response": None
    },
    "ask_name": {
        "examples": ["what's your name", "who are you"],
        "response": "I'm your friendly chat assistant"
    },
    "general_chat": {
        "examples": ["how are you", "what's up", "how's it going"],
        "response": "I’m doing great! Thanks for asking."
    },
    "ask_description": {
        "examples": ["what's it about", "tell me more", "explain it"],
        "response": None
    }
}

Encode all example sentences for similarity lookup

In [5]:
all_examples = []
intent_labels = []

for intent, data in intents.items():
    for ex in data["examples"]:
        all_examples.append(ex)
        intent_labels.append(intent)

example_embeddings = embedder.encode(all_examples)

Initialize in-memory Qdrant vector database

In [6]:
qdrant = QdrantClient(":memory:")

if qdrant.collection_exists("chat_memory"):
    qdrant.delete_collection("chat_memory")

qdrant.create_collection(
    collection_name="chat_memory",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)  # 384 = all-MiniLM-L6-v2 output dimension
)

True

Define memory storage + retrieval functions

In [7]:
def store_memory(text):
    emb = embedder.encode(text).tolist()
    qdrant.upsert(
        collection_name="chat_memory",
        points=[
            PointStruct(
                id=str(uuid.uuid4()),
                vector=emb,
                payload={"text": text}
            )
        ]
    )

def retrieve_memory(query, limit=5):
    emb = embedder.encode(query).tolist()

    results = qdrant.query_points(
        collection_name="chat_memory",
        query=emb,
        limit=limit
    ).points

    return [r.payload["text"] for r in results]
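
To make the store/retrieve pattern above concrete without a running Qdrant instance, here is a minimal sketch using a plain-Python stand-in. `ToyMemory`, its methods, and the two-dimensional vectors are hypothetical and for illustration only; the real collection holds 384-dimensional embeddings and uses Qdrant's own search.

```python
import math
import uuid


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyMemory:
    """Hypothetical stand-in for the "chat_memory" collection (linear scan)."""

    def __init__(self):
        self.points = {}  # point id -> (vector, payload)

    def upsert(self, vector, text):
        # Mirror store_memory: random UUID as id, text kept in the payload.
        point_id = str(uuid.uuid4())
        self.points[point_id] = (vector, {"text": text})
        return point_id

    def query(self, vector, limit=5):
        # Mirror retrieve_memory: rank by cosine similarity, return the texts.
        scored = [(cosine(vector, v), p["text"]) for v, p in self.points.values()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


memory = ToyMemory()
memory.upsert([1.0, 0.0], "movie_recommendation: Inception")
memory.upsert([0.0, 1.0], "it is raining today")
memory.upsert([0.9, 0.1], "I like sci-fi films")

top_two = memory.query([1.0, 0.0], limit=2)
print(top_two)
```

Because `limit` caps the number of results, an older memory can drop out of the returned list once enough closer entries have been stored.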


Intent recognition using cosine similarity

In [8]:
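def recognize_intent(user_input):
    # Normalize so the dot product below is a true cosine similarity.
    user_emb = embedder.encode([user_input], normalize_embeddings=True)
    # all-MiniLM-L6-v2 already returns unit-length vectors for the examples.
    sims = np.dot(example_embeddings, user_emb.T).flatten()
    best_idx = int(np.argmax(sims))
    return intent_labels[best_idx], float(sims[best_idx])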
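
A plain dot product equals cosine similarity only when both vectors have unit length. all-MiniLM-L6-v2 normalizes its outputs, which is why `np.dot` works as a cosine score here. With a model that does not normalize, the scores would grow with vector length. The sketch below uses hand-picked toy vectors, not real embeddings, to show the difference.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice as long

raw = dot(a, b)                          # depends on vector length
unit = dot(normalize(a), normalize(b))   # length-independent
cos = cosine(a, b)

print(raw, unit, cos)
```

The raw dot product here is far above 1, while the normalized one agrees with the cosine. A fixed cutoff such as 0.40 is only meaningful on the normalized scale.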

Generate a response using intent logic + LLM

In [9]:
def generate_response(user_input):

    intent, score = recognize_intent(user_input)

    # Confident intent classification
    if score > 0.40:

        # Movie recommendation response
        if intent == "ask_movie":
            movie = "Inception"
            store_memory(f"movie_recommendation: {movie}")
            return f"You should watch *{movie}*!"

        # Movie description using memory
        if intent == "ask_description":
            memories = retrieve_memory("movie")
            for m in memories:
                if "movie_recommendation" in m:
                    movie = m.split(":", 1)[1].strip()  # split once so titles containing ":" stay intact
                    return f"{movie} is a mind-bending sci-fi thriller about entering dreams within dreams."
            return "Tell me what exactly you want to know more about."

        # Regular intent with predefined response
        response = intents[intent].get("response")
        if response:
            return response

    # If intent is unclear → fall back to LLM + memory context
    memories = retrieve_memory(user_input)
    memory_text = "\n".join(memories)

    prompt = f"""
User said: {user_input}
Relevant memories:
{memory_text}

Respond naturally and briefly:
"""

    # return_full_text=False keeps the prompt out of the returned text
    result = chat_llm(prompt, return_full_text=False)[0]["generated_text"]
    store_memory(user_input)
    return result.strip()
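
The description branch recovers the movie title from a stored `key: value` string. A minimal sketch of that parsing step follows. `parse_memory` is a hypothetical helper, not part of the notebook; it splits on the first colon only, so a title that itself contains a colon survives intact.

```python
def parse_memory(entry):
    """Split a "key: value" memory string on the first colon only."""
    key, sep, value = entry.partition(":")
    if not sep:
        # No colon: treat the whole entry as free text with no key.
        return None, entry.strip()
    return key.strip(), value.strip()


print(parse_memory("movie_recommendation: Inception"))
print(parse_memory("movie_recommendation: 2001: A Space Odyssey"))
print(parse_memory("how are you"))
```

Plain user messages stored by the fallback branch have no key, so the helper returns `None` for the key in that case instead of raising an error.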

Run interactive chat loop

In [None]:
print("🤖 Chatbot ready! Type 'exit' to quit.\n")

while True:
    user_input = input("You: ").strip()
    if user_input.lower() in ["exit", "quit"]:
        print("Bot: Goodbye!")
        break

    print("Bot:", generate_response(user_input))


🤖 Chatbot ready! Type 'exit' to quit.

You: hi
Bot: Hey there! How are you doing today?
You: what's up
Bot: I’m doing great! Thanks for asking.
You: how is the weather like today
Bot: It looks like a nice day! ☀️ (I don’t have real-time data.)
You: what is a good movie to watch
Bot: You should watch *Inception*!
You: what is it about
Bot: Inception is a mind-bending sci-fi thriller about entering dreams within dreams.
You: okay bye
Bot: Goodbye! Talk to you soon
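
For repeatable checks, the loop can be driven from a list of inputs instead of `input()`. The sketch below assumes a hypothetical `run_chat` helper and a stub responder in place of `generate_response`, so it runs without loading any model.

```python
def run_chat(inputs, respond):
    """Feed scripted inputs through a responder and return the transcript."""
    transcript = []
    for raw in inputs:
        text = raw.strip()
        if text.lower() in ("exit", "quit"):
            transcript.append(("Bot", "Goodbye!"))
            break
        transcript.append(("You", text))
        transcript.append(("Bot", respond(text)))
    return transcript


def echo_responder(text):
    # Stub standing in for generate_response; it only echoes the input.
    return f"You said: {text}"


transcript = run_chat(["hi", "  what's up ", "exit", "ignored"], echo_responder)
for speaker, line in transcript:
    print(f"{speaker}: {line}")
```

Passing `generate_response` as the `respond` argument would replay a fixed conversation against the real bot.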
