# 🇳🇵 Nepali Chatbot using Pre-trained Models
## Building a Conversational AI with HuggingFace Transformers


---

### Pre-trained Models Used:
1. **Sakonii/distilgpt2-nepali** - Nepali DistilGPT-2 for text generation
2. **NepBERTa/NepBERTa** - Nepali BERT for sentence embeddings
3. **google/mt5-small** - Multilingual T5 for translation/generation (listed for reference; not loaded in this notebook)
4. **facebook/mbart-large-50** - For multilingual conversations (listed for reference; not loaded in this notebook)

### What We'll Build:
- Text generation chatbot using Nepali GPT-2
- Retrieval-based question answering over a small knowledge base, using semantic similarity
- Hybrid chatbot that combines retrieval and generation
- Fine-tuning on custom conversation data

## 1. Setup and Installation

In [None]:
# Install required packages
#!pip install -q transformers torch sentencepiece accelerate datasets

In [None]:
# Import libraries (only the names used in this notebook)
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    AutoModel,
    Trainer,
    TrainingArguments,
)
from datasets import Dataset
import warnings
warnings.filterwarnings('ignore')

# Check device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

## 2. Load Pre-trained Nepali GPT-2 Model

We'll use **Sakonii/distilgpt2-nepali**, a DistilGPT-2 model trained on 13+ million Nepali text sequences.

In [None]:
# Load Nepali GPT-2 model
print("Loading Nepali GPT-2 model...")

MODEL_NAME = "Sakonii/distilgpt2-nepali"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model = model.to(device)

# Set pad token if not set
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print(f"✅ Model loaded: {MODEL_NAME}")
print(f"   Vocabulary size: {tokenizer.vocab_size}")
print(f"   Model parameters: {model.num_parameters():,}")

In [None]:
# Test basic text generation
def generate_text(prompt, max_length=100, temperature=0.8, top_p=0.9, num_return_sequences=1):
    """
    Generate Nepali text using the pre-trained model.

    Args:
        prompt: Input text in Nepali
        max_length: Maximum total length in tokens (prompt + generated text)
        temperature: Controls randomness (lower = more focused)
        top_p: Nucleus sampling parameter
        num_return_sequences: Number of sequences to generate

    Returns:
        Generated text(s)
    """
    # Encode input (the tokenizer call also returns the attention mask)
    inputs = tokenizer(prompt, return_tensors='pt').to(device)

    # Generate
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=temperature,
            top_p=top_p,
            do_sample=True,
            num_return_sequences=num_return_sequences,
            pad_token_id=tokenizer.eos_token_id,
            no_repeat_ngram_size=2
        )

    # Decode
    generated_texts = []
    for output in outputs:
        text = tokenizer.decode(output, skip_special_tokens=True)
        generated_texts.append(text)

    return generated_texts[0] if num_return_sequences == 1 else generated_texts

# Test generation
print("Testing text generation...")
test_prompts = [
    "नेपाल एक सुन्दर",
    "काठमाडौं शहरमा",
    "आज मौसम"
]

for prompt in test_prompts:
    generated = generate_text(prompt, max_length=50)
    print(f"\nPrompt: {prompt}")
    print(f"Generated: {generated}")
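
To make the `temperature` and `top_p` arguments above more concrete, here is a minimal pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering on a toy distribution. The logits, the token names, and the helper names `top_p_filter` and `sample` are made up for illustration; the Transformers library applies the same idea to tensors inside `generate`, and its implementation may differ in detail.

```python
import math
import random

def top_p_filter(logits, temperature=0.8, top_p=0.9):
    """Return {token: probability} restricted to the nucleus and renormalised."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    # Numerically stable softmax.
    peak = max(scaled.values())
    exp = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exp.values())
    probs = {tok: v / total for tok, v in exp.items()}
    # Keep the most likely tokens until their cumulative mass reaches top_p.
    nucleus, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        nucleus[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(nucleus.values())
    return {tok: p / norm for tok, p in nucleus.items()}

def sample(dist, rng):
    """Draw one token from a {token: probability} mapping."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]

# Toy logits for four candidate next tokens (illustrative values only).
toy_logits = {"a": 2.0, "b": 1.0, "c": 0.1, "d": -3.0}
nucleus = top_p_filter(toy_logits, temperature=0.8, top_p=0.9)
print(nucleus)                             # only "a" and "b" survive the cut
print(sample(nucleus, random.Random(0)))   # one sampled token
```

Lowering `top_p` shrinks the set of candidate tokens, while lowering `temperature` concentrates probability on the most likely ones; both make the output more predictable.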

## 3. Create Chatbot Class using Pre-trained Model

In [None]:
class NepaliGPTChatbot:
    """
    Nepali Chatbot using pre-trained GPT-2 model.
    """

    def __init__(self, model, tokenizer, device):
        self.model = model
        self.tokenizer = tokenizer
        self.device = device
        self.conversation_history = []

        # Chat templates
        self.user_prefix = "प्रयोगकर्ता: "  # "User: "
        self.bot_prefix = "बोट: "  # "Bot: "

    def generate_response(self, user_input, max_length=100, temperature=0.7):
        """
        Generate a response to user input.
        """
        # Create prompt with conversation context
        prompt = f"{self.user_prefix}{user_input}\n{self.bot_prefix}"

        # Add recent history for context
        if self.conversation_history:
            history_text = "\n".join(self.conversation_history[-4:])  # Last 2 exchanges
            prompt = history_text + "\n" + prompt

        # Encode (keep the attention mask so generate() does not have to infer it)
        inputs = self.tokenizer(prompt, return_tensors='pt').to(self.device)
        prompt_length = inputs['input_ids'].shape[1]

        # Generate; max_new_tokens counts only the tokens added after the prompt
        with torch.no_grad():
            outputs = self.model.generate(
                **inputs,
                max_new_tokens=max_length,
                temperature=temperature,
                top_p=0.9,
                do_sample=True,
                pad_token_id=self.tokenizer.eos_token_id,
                no_repeat_ngram_size=2
            )

        # Decode only the newly generated tokens (everything after the prompt)
        response = self.tokenizer.decode(
            outputs[0][prompt_length:], skip_special_tokens=True
        ).strip()

        # Clean up - keep the first line and drop any user turn the model invented
        if "\n" in response:
            response = response.split("\n")[0]
        if self.user_prefix in response:
            response = response.split(self.user_prefix)[0]

        # Update history
        self.conversation_history.append(f"{self.user_prefix}{user_input}")
        self.conversation_history.append(f"{self.bot_prefix}{response}")

        return response.strip()

    def chat(self, user_input):
        """Simple chat interface."""
        return self.generate_response(user_input)

    def reset(self):
        """Reset conversation history."""
        self.conversation_history = []
        print("Conversation reset.")

# Create chatbot instance
chatbot = NepaliGPTChatbot(model, tokenizer, device)
print("✅ Nepali GPT Chatbot created!")

In [None]:
# Test the chatbot
print("="*60)
print("🤖 NEPALI GPT CHATBOT TEST")
print("="*60)

test_inputs = [
    "नमस्ते",
    "तपाईंको नाम के हो?",
    "नेपालको राजधानी के हो?",
    "आज मौसम कस्तो छ?"
]

for user_input in test_inputs:
    response = chatbot.chat(user_input)
    print(f"\n👤 User: {user_input}")
    print(f"🤖 Bot: {response}")

# Reset for next test
chatbot.reset()
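
The prompt handling in the chatbot class can be examined without loading a model. The sketch below is a simplified, hypothetical stand-in: it uses English prefixes (`"User: "`, `"Bot: "`) instead of the Nepali ones, and the helper names `build_prompt` and `extract_reply` do not exist in the notebook. It shows how the last four history lines (two exchanges) are joined with the new user turn, and how a raw model continuation is trimmed to a single reply.

```python
USER_PREFIX = "User: "  # stand-in for the notebook's Nepali user prefix
BOT_PREFIX = "Bot: "    # stand-in for the notebook's Nepali bot prefix

def build_prompt(history, user_input, max_history_lines=4):
    """Join the most recent history lines with the new user turn."""
    recent = history[-max_history_lines:]
    return "\n".join(recent + [f"{USER_PREFIX}{user_input}", BOT_PREFIX])

def extract_reply(continuation):
    """Keep the first line of the continuation and drop any invented user turn."""
    reply = continuation.strip().split("\n")[0]
    return reply.split(USER_PREFIX)[0].strip()

history = [
    "User: hi", "Bot: hello",
    "User: how are you?", "Bot: fine",
    "User: what is your name?", "Bot: a chatbot",
]
prompt = build_prompt(history, "thanks")
print(prompt)  # the oldest exchange ("hi" / "hello") is no longer included

# A made-up continuation in which the model starts a new user turn.
print(extract_reply(" you are welcome User: bye\nBot: bye"))
```

Truncating the history keeps the prompt short, at the cost of the model forgetting anything said more than two exchanges ago.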

## 4. Retrieval-Based Chatbot with Semantic Similarity

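This section uses **NepBERTa** embeddings to compute the semantic similarity between the user query and the predefined questions in a knowledge base, then returns the answer stored for the closest question.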

In [None]:
# Load NepBERTa for embeddings
print("Loading NepBERTa for semantic similarity...")

try:
    nepberta_tokenizer = AutoTokenizer.from_pretrained("NepBERTa/NepBERTa")
    nepberta_model = AutoModel.from_pretrained("NepBERTa/NepBERTa")
    nepberta_model = nepberta_model.to(device)
    print("✅ NepBERTa loaded!")
    USE_NEPBERTA = True
except Exception as e:
    print(f"Could not load NepBERTa: {e}")
    print("Trying multilingual BERT instead...")
    nepberta_tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    nepberta_model = AutoModel.from_pretrained("bert-base-multilingual-cased")
    nepberta_model = nepberta_model.to(device)
    print("✅ mBERT loaded as fallback!")
    USE_NEPBERTA = False  # the fallback model is mBERT, not NepBERTa

In [None]:
# Create knowledge base for retrieval chatbot
knowledge_base = {
    # Greetings
    "नमस्ते": "नमस्ते! म तपाईंलाई कसरी मद्दत गर्न सक्छु?",
    "हेलो": "हेलो! कस्तो हुनुहुन्छ आज?",
    "के छ": "सबै ठीक छ, धन्यवाद! तपाईंको के छ?",
    "कस्तो छ": "म ठीक छु। तपाईंलाई केही सोध्नु छ?",

    # Identity
    "तपाईंको नाम के हो": "म नेपाली च्याटबोट हुँ, तपाईंको सहायक।",
    "तपाई को हो": "म एक AI च्याटबोट हुँ, नेपाली भाषामा कुराकानी गर्न बनाइएको।",
    "कसले बनायो तपाईंलाई": "म Deep Learning र Transformer प्रविधिबाट बनाइएको हुँ।",

    # Nepal Info
    "नेपालको राजधानी के हो": "नेपालको राजधानी काठमाडौं हो।",
    "नेपालको जनसंख्या कति छ": "नेपालको जनसंख्या लगभग ३ करोड छ।",
    "नेपाल कहाँ छ": "नेपाल दक्षिण एशियामा भारत र चीनको बीचमा अवस्थित छ।",
    "सगरमाथाको उचाई कति हो": "सगरमाथाको उचाई ८,८४८.८६ मिटर छ।",
    "नेपालको राष्ट्रिय फूल के हो": "नेपालको राष्ट्रिय फूल लालीगुराँस हो।",
    "नेपालको राष्ट्रिय चरा के हो": "नेपालको राष्ट्रिय चरा डाँफे हो।",

    # Food
    "दालभात के हो": "दालभात नेपालीहरूको मुख्य खाना हो - भात, दाल, तरकारी र अचार।",
    "मोमो के हो": "मोमो नेपालको लोकप्रिय खाना हो, डम्प्लिंग जस्तो।",
    "नेपाली खाना के हो": "नेपाली खानामा दालभात, मोमो, सेल रोटी, गुन्द्रुक प्रसिद्ध छन्।",

    # Festivals
    "दशैं के हो": "दशैं नेपालको सबैभन्दा ठूलो चाड हो, दुर्गा पूजाको रूपमा मनाइन्छ।",
    "तिहार के हो": "तिहार दीपावली जस्तो चाड हो, पाँच दिनसम्म मनाइन्छ।",
    "होली के हो": "होली रंगहरूको चाड हो, फागुन पूर्णिमामा मनाइन्छ।",

    # Tourism
    "पोखरा कहाँ छ": "पोखरा नेपालको पश्चिमी भागमा छ, फेवा ताल र हिमालको दृश्यको लागि प्रसिद्ध।",
    "लुम्बिनी किन प्रसिद्ध छ": "लुम्बिनी भगवान बुद्धको जन्मस्थान हो, UNESCO विश्व सम्पदा स्थल।",
    "चितवनमा के छ": "चितवनमा राष्ट्रिय निकुञ्ज छ, गैंडा र बाघ हेर्न सकिन्छ।",

    # General
    "धन्यवाद": "स्वागत छ! खुशी लाग्यो मद्दत गर्न पाएर।",
    "बाय": "अलविदा! फेरि भेटौंला!",
    "मलाई मद्दत चाहियो": "म तपाईंलाई मद्दत गर्न तयार छु। के चाहिन्छ?",

    # Weather/Time
    "आज मौसम कस्तो छ": "म मौसम जानकारी दिन सक्दिन, तर आशा गर्छु राम्रो छ!",
    "कति बज्यो": "म समय बताउन सक्दिन, तपाईंको फोन हेर्नुहोस्!",
}

print(f"Knowledge base created with {len(knowledge_base)} entries.")

In [None]:
def get_embedding(text, tokenizer, model, device):
    """
    Get sentence embedding using BERT model.
    """
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=128)
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)

    # Use mean pooling of last hidden state
    embeddings = outputs.last_hidden_state.mean(dim=1)
    return embeddings

def cosine_similarity(a, b):
    """Compute cosine similarity between two vectors."""
    return torch.nn.functional.cosine_similarity(a, b).item()

# Pre-compute embeddings for knowledge base
print("Computing embeddings for knowledge base...")
kb_embeddings = {}

for question in knowledge_base.keys():
    emb = get_embedding(question, nepberta_tokenizer, nepberta_model, device)
    kb_embeddings[question] = emb

print(f"✅ Computed embeddings for {len(kb_embeddings)} questions.")
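
`get_embedding` averages the last hidden state over every token position. That is fine for a single sentence, which is never padded. If several sentences were encoded in one padded batch, a common variant is to average only over real tokens using the attention mask. The sketch below illustrates the difference with plain Python lists; the helper name `mean_pool` and the toy vectors are assumptions for illustration, not part of the notebook.

```python
def mean_pool(token_vectors, attention_mask=None):
    """Average token vectors, skipping positions whose mask value is 0."""
    if attention_mask is None:
        attention_mask = [1] * len(token_vectors)
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

# Two real tokens followed by one padding position (toy 2-dimensional vectors).
tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 0.0]]
mask = [1, 1, 0]

plain = mean_pool(tokens)          # padding drags the average toward zero
masked = mean_pool(tokens, mask)   # average over the two real tokens only
print(plain)
print(masked)
```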

In [None]:
class NepaliRetrievalChatbot:
    """
    Retrieval-based Nepali Chatbot using semantic similarity.
    """

    def __init__(self, knowledge_base, kb_embeddings, tokenizer, model, device, threshold=0.6):
        self.knowledge_base = knowledge_base
        self.kb_embeddings = kb_embeddings
        self.tokenizer = tokenizer
        self.model = model
        self.device = device
        self.threshold = threshold
        self.default_response = "माफ गर्नुहोस्, मैले बुझिन। कृपया फेरि भन्नुहोस्।"

    def find_best_match(self, user_input):
        """
        Find the most similar question in knowledge base.
        """
        # Get embedding for user input
        user_emb = get_embedding(user_input, self.tokenizer, self.model, self.device)

        best_match = None
        best_score = -1

        for question, emb in self.kb_embeddings.items():
            score = cosine_similarity(user_emb, emb)
            if score > best_score:
                best_score = score
                best_match = question

        return best_match, best_score

    def chat(self, user_input):
        """
        Generate response based on semantic similarity.
        """
        best_match, score = self.find_best_match(user_input)

        if score >= self.threshold:
            return self.knowledge_base[best_match], best_match, score
        else:
            return self.default_response, None, score

    def get_response(self, user_input):
        """Simple response without debug info."""
        response, _, _ = self.chat(user_input)
        return response

# Create retrieval chatbot
retrieval_chatbot = NepaliRetrievalChatbot(
    knowledge_base, kb_embeddings,
    nepberta_tokenizer, nepberta_model, device,
    threshold=0.5
)

print("✅ Nepali Retrieval Chatbot created!")

In [None]:
# Test retrieval chatbot
print("="*60)
print("🔍 RETRIEVAL-BASED CHATBOT TEST")
print("="*60)

test_queries = [
    "नमस्ते",
    "तिम्रो नाम के हो?",
    "नेपालको capital के हो?",
    "सगरमाथा कति अग्लो छ?",
    "मोमो भनेको के हो?",
    "दशैं चाड कहिले हो?",
    "पोखरामा के हेर्ने?",
    "धन्यवाद"
]

for query in test_queries:
    response, matched, score = retrieval_chatbot.chat(query)
    print(f"\n👤 User: {query}")
    print(f"🤖 Bot: {response}")
    if matched:
        print(f"   [Matched: '{matched}' | Score: {score:.3f}]")
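
The matching logic reduces to a nearest-neighbour search with a cut-off. Here is a minimal sketch with hand-made 3-dimensional vectors standing in for BERT embeddings; the vectors, the keys, and the helper names `cosine` and `best_match` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(query_vec, kb_vectors, threshold=0.5):
    """Return (question, score); question is None when the best score is too low."""
    question, score = max(
        ((q, cosine(query_vec, v)) for q, v in kb_vectors.items()),
        key=lambda pair: pair[1],
    )
    return (question if score >= threshold else None), score

# Toy "embeddings" for three knowledge-base questions.
kb_vectors = {
    "greeting": [1.0, 0.0, 0.0],
    "capital": [0.0, 1.0, 0.0],
    "food": [0.0, 0.0, 1.0],
}

confident = best_match([0.9, 0.1, 0.0], kb_vectors)
uncertain = best_match([1.0, 1.0, 1.0], kb_vectors, threshold=0.6)
print(confident)   # close to "greeting", well above the threshold
print(uncertain)   # equally far from every entry, so no match is returned
```

Mean-pooled embeddings from a general-purpose BERT model tend to give fairly high cosine scores even for unrelated sentences, so the threshold is worth tuning against the scores printed by the test cell above rather than taken as given.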

## 5. Hybrid Chatbot (Retrieval + Generation)

The hybrid chatbot tries retrieval first and falls back to GPT-2 generation when the best similarity score is below a threshold.

In [None]:
class NepaliHybridChatbot:
    """
    Hybrid chatbot combining retrieval and generation.
    Uses retrieval for known queries, generation for unknown.
    """

    def __init__(self, retrieval_bot, generative_bot, retrieval_threshold=0.6):
        self.retrieval_bot = retrieval_bot
        self.generative_bot = generative_bot
        self.retrieval_threshold = retrieval_threshold
        self.conversation_history = []

    def chat(self, user_input):
        """
        Try retrieval first, fall back to generation.
        """
        # Try retrieval first
        response, matched, score = self.retrieval_bot.chat(user_input)
        method = "retrieval"

        # If retrieval confidence is low, use generation
        if score < self.retrieval_threshold:
            response = self.generative_bot.chat(user_input)
            method = "generation"

        # Store in history
        self.conversation_history.append({
            'user': user_input,
            'bot': response,
            'method': method,
            'score': score
        })

        return response, method, score

    def get_response(self, user_input):
        """Simple interface."""
        response, _, _ = self.chat(user_input)
        return response

    def reset(self):
        """Reset both chatbots."""
        self.conversation_history = []
        self.generative_bot.reset()

# Create hybrid chatbot
hybrid_chatbot = NepaliHybridChatbot(
    retrieval_chatbot,
    chatbot,  # GPT-based
    retrieval_threshold=0.55
)

print("✅ Nepali Hybrid Chatbot created!")

In [None]:
# Test hybrid chatbot
print("="*60)
print("🔀 HYBRID CHATBOT TEST")
print("="*60)

test_inputs = [
    "नमस्ते",  # Should use retrieval
    "नेपालको राजधानी बताउनुस्",  # Should use retrieval
    "आज के गर्ने?",  # May use generation
    "तपाईं कस्तो हुनुहुन्छ?",  # May use retrieval or generation
    "हिमाल चढ्न कति गाह्रो छ?",  # Should use generation
    "धन्यवाद मद्दतको लागि",  # Should use retrieval
]

for user_input in test_inputs:
    response, method, score = hybrid_chatbot.chat(user_input)
    print(f"\n👤 User: {user_input}")
    print(f"🤖 Bot: {response}")
    print(f"   [Method: {method} | Score: {score:.3f}]")
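
The routing rule can be tested on its own by swapping the two chatbots for stubs. In this sketch `route`, `fake_retrieve`, `fake_generate`, and the one-entry `FAQ` are all hypothetical stand-ins, not names from the notebook.

```python
def route(user_input, retrieve, generate, threshold=0.55):
    """Return (response, method, score): retrieval if confident, else generation."""
    response, score = retrieve(user_input)
    if score >= threshold:
        return response, "retrieval", score
    return generate(user_input), "generation", score

# Stub components for illustration only.
FAQ = {"hello": "Hello! How can I help?"}

def fake_retrieve(text):
    """Pretend retrieval: a high score for known questions, a low one otherwise."""
    if text in FAQ:
        return FAQ[text], 0.95
    return "Sorry, I did not understand.", 0.20

def fake_generate(text):
    """Pretend generation: echo the input so the route taken is visible."""
    return f"(generated reply to: {text})"

print(route("hello", fake_retrieve, fake_generate))
print(route("tell me a story", fake_retrieve, fake_generate))
```

Passing the components in as functions keeps the decision logic independent of any particular model, which makes the threshold easy to unit-test.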

## 6. Fine-tune GPT-2 on Custom Nepali Conversations

Let's fine-tune the pre-trained model on custom conversation data.

In [None]:
# Create training data for fine-tuning
conversation_data = [
    "प्रयोगकर्ता: नमस्ते\nबोट: नमस्ते! म तपाईंलाई कसरी मद्दत गर्न सक्छु?",
    "प्रयोगकर्ता: तपाईंको नाम के हो?\nबोट: मेरो नाम नेपाली च्याटबोट हो।",
    "प्रयोगकर्ता: नेपालको राजधानी के हो?\nबोट: नेपालको राजधानी काठमाडौं हो।",
    "प्रयोगकर्ता: सगरमाथा कति अग्लो छ?\nबोट: सगरमाथा ८,८४८.८६ मिटर अग्लो छ।",
    "प्रयोगकर्ता: दालभात के हो?\nबोट: दालभात नेपालीहरूको मुख्य खाना हो।",
    "प्रयोगकर्ता: मोमो कस्तो हुन्छ?\nबोट: मोमो नेपालको लोकप्रिय खाना हो, डम्प्लिंग जस्तो।",
    "प्रयोगकर्ता: दशैं कहिले हुन्छ?\nबोट: दशैं असोज-कार्तिक महिनामा मनाइन्छ।",
    "प्रयोगकर्ता: तिहार के हो?\nबोट: तिहार दीपावली जस्तो पाँच दिने चाड हो।",
    "प्रयोगकर्ता: पोखरामा के छ?\nबोट: पोखरामा फेवा ताल र हिमालको दृश्य छ।",
    "प्रयोगकर्ता: लुम्बिनी किन प्रसिद्ध छ?\nबोट: लुम्बिनी बुद्धको जन्मस्थान हो।",
    "प्रयोगकर्ता: नेपालको झन्डा कस्तो छ?\nबोट: नेपालको झन्डा दुई त्रिकोणको आकारमा छ, संसारमा अद्वितीय।",
    "प्रयोगकर्ता: धन्यवाद\nबोट: स्वागत छ! खुशी लाग्यो मद्दत गर्न पाएर।",
    "प्रयोगकर्ता: बाय\nबोट: अलविदा! फेरि भेटौंला!",
    "प्रयोगकर्ता: कस्तो छ?\nबोट: म ठीक छु, धन्यवाद! तपाईं कस्तो हुनुहुन्छ?",
    "प्रयोगकर्ता: मलाई मद्दत चाहियो\nबोट: म तपाईंलाई मद्दत गर्न तयार छु। के चाहिन्छ?",
    "प्रयोगकर्ता: तपाई के गर्न सक्नुहुन्छ?\nबोट: म नेपाली भाषामा कुराकानी र जानकारी दिन सक्छु।",
    "प्रयोगकर्ता: नेपालको जनसंख्या कति छ?\nबोट: नेपालको जनसंख्या लगभग ३ करोड छ।",
    "प्रयोगकर्ता: नेपाली भाषामा कति अक्षर छन्?\nबोट: नेपाली वर्णमालामा ३६ व्यञ्जन र १२ स्वर छन्।",
]

print(f"Created {len(conversation_data)} training conversations.")

In [None]:
# Prepare dataset for fine-tuning
def tokenize_conversations(conversations, tokenizer, max_length=256):
    """
    Tokenize conversation data for training.
    """
    tokenized_data = []

    for conv in conversations:
        # Add EOS token at the end
        text = conv + tokenizer.eos_token

        # Tokenize
        encodings = tokenizer(
            text,
            truncation=True,
            max_length=max_length,
            padding='max_length',
            return_tensors='pt'
        )

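        input_ids = encodings['input_ids'].squeeze()
        attention_mask = encodings['attention_mask'].squeeze()

        # Ignore padded positions in the loss (-100 is skipped by the LM loss).
        # Without this, every pad slot would be trained as an EOS token,
        # because pad_token was set to eos_token above.
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100

        tokenized_data.append({
            'input_ids': input_ids,
            'attention_mask': attention_mask,
            'labels': labels
        })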

    return tokenized_data

# Tokenize
tokenized_conversations = tokenize_conversations(conversation_data, tokenizer)

# Create HuggingFace Dataset
train_dataset = Dataset.from_list([
    {
        'input_ids': item['input_ids'].tolist(),
        'attention_mask': item['attention_mask'].tolist(),
        'labels': item['labels'].tolist()
    }
    for item in tokenized_conversations
])

print(f"Dataset created with {len(train_dataset)} samples.")
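
Because the pad token is set to the EOS token, padded positions look like real EOS tokens to the loss unless their labels are masked out. The following minimal sketch uses toy token ids, not real tokenizer output, and a hypothetical helper `pad_and_label`. It builds right-padded inputs together with labels that ignore the padding, relying on `-100` being the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default

def pad_and_label(token_ids, max_length, pad_id):
    """Right-pad token_ids and build labels that ignore the padded positions."""
    token_ids = token_ids[:max_length]            # truncate if too long
    n_pad = max_length - len(token_ids)
    input_ids = token_ids + [pad_id] * n_pad
    attention_mask = [1] * len(token_ids) + [0] * n_pad
    labels = token_ids + [IGNORE_INDEX] * n_pad   # the real EOS keeps its label
    return input_ids, attention_mask, labels

EOS_ID = 0  # toy id; in the notebook the pad token is the EOS token
ids, mask, labels = pad_and_label([11, 12, 13, EOS_ID], max_length=6, pad_id=EOS_ID)
print(ids)     # [11, 12, 13, 0, 0, 0]
print(mask)    # [1, 1, 1, 1, 0, 0]
print(labels)  # [11, 12, 13, 0, -100, -100]
```

The genuine end-of-sequence token still contributes to the loss, so the model learns when to stop, while the padding does not.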

In [None]:
# Fine-tuning configuration
training_args = TrainingArguments(
    output_dir='./nepali_chatbot_finetuned',
    overwrite_output_dir=True,
    num_train_epochs=5,
    per_device_train_batch_size=2,
    save_steps=50,
    save_total_limit=2,
    logging_steps=10,
    learning_rate=5e-5,
    warmup_steps=10,
    logging_dir='./logs',
)

# Create trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
)

print("Trainer configured!")
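
It helps to work out how long this run actually is. The arithmetic below assumes a single device and no gradient accumulation (the `Trainer` default); the variable names are illustrative only.

```python
import math

num_samples = 18   # len(conversation_data) in this notebook
batch_size = 2     # per_device_train_batch_size
num_epochs = 5     # num_train_epochs
grad_accum = 1     # assumed: default gradient_accumulation_steps, single device

steps_per_epoch = math.ceil(num_samples / (batch_size * grad_accum))
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)               # 9
print(total_steps)                   # 45
print(total_steps // 10)             # 4 logging events with logging_steps=10
print(total_steps >= 50)             # False: the save_steps=50 interval is never reached
print(round(10 / total_steps, 3))    # 0.222: share of steps spent in warmup
```

Under these assumptions the run is only 45 optimizer steps long, so the step-based checkpoint interval of 50 is never reached, and roughly a fifth of training happens during warmup. A smaller `save_steps` or `warmup_steps` may suit a dataset this small.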

In [None]:
# Fine-tune the model (this can take a while, especially on CPU)
print("="*60)
print("FINE-TUNING NEPALI GPT-2")
print("="*60)

# Comment out this block to skip fine-tuning and keep the pre-trained weights
trainer.train()
model.save_pretrained('./nepali_chatbot_finetuned')
tokenizer.save_pretrained('./nepali_chatbot_finetuned')
model.eval()  # switch back to inference mode (disables dropout) for the chat cells below
print("✅ Fine-tuning complete! Model saved.")

## 7. Interactive Chat Interface

In [None]:
def interactive_chat(chatbot_instance, chatbot_name="Nepali Chatbot"):
    """
    Interactive chat session.
    """
    print("\n" + "="*60)
    print(f"🤖 {chatbot_name} - Interactive Mode")
    print("="*60)
    print("Type 'quit', 'exit', 'bye', 'बाहिर', or 'बाय' to end the conversation.")
    print("Type 'reset' to clear conversation history.")
    print("-"*60)

    while True:
        user_input = input("\n👤 You: ").strip()

        if not user_input:
            continue

        if user_input.lower() in ['quit', 'exit', 'बाहिर', 'bye', 'बाय']:
            print("🤖 Bot: अलविदा! फेरि भेटौंला!")
            break

        if user_input.lower() == 'reset':
            if hasattr(chatbot_instance, 'reset'):
                chatbot_instance.reset()
            print("🤖 Bot: Conversation reset. नयाँ कुराकानी सुरु गरौं!")
            continue

        # Get response
        if hasattr(chatbot_instance, 'get_response'):
            response = chatbot_instance.get_response(user_input)
        else:
            response = chatbot_instance.chat(user_input)
            if isinstance(response, tuple):
                response = response[0]

        print(f"🤖 Bot: {response}")

# Start the interactive chat. input() blocks until you type 'quit';
# comment out the call below when running the notebook non-interactively.
interactive_chat(hybrid_chatbot, "Nepali Hybrid Chatbot")
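
The loop above reads from `input()`, which makes it hard to exercise in an automated run. One possible variation, sketched here under the assumption that the reading and writing functions are passed in as arguments, lets the same control flow be driven by a scripted list of messages. The name `run_chat` and its parameters are hypothetical.

```python
def run_chat(get_response, read_line, write_line, exit_words=("quit", "exit", "bye")):
    """Drive a chat loop with injectable I/O so it can be scripted or tested."""
    turns = 0
    while True:
        try:
            user_input = read_line().strip()
        except (EOFError, StopIteration):
            break                      # no more input available
        if not user_input:
            continue                   # skip empty lines, as the notebook loop does
        if user_input.lower() in exit_words:
            write_line("Bot: goodbye")
            break
        write_line(f"Bot: {get_response(user_input)}")
        turns += 1
    return turns

# Scripted session: the "bot" simply upper-cases the message.
scripted = iter(["hello", "", "what is momo?", "quit", "never read"])
transcript = []
turns = run_chat(lambda text: text.upper(), lambda: next(scripted), transcript.append)
print(turns)        # 2
print(transcript)   # ['Bot: HELLO', 'Bot: WHAT IS MOMO?', 'Bot: goodbye']
```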

In [None]:
# Demo conversation
print("\n" + "="*60)
print("🇳🇵 DEMO CONVERSATION")
print("="*60)

demo_conversation = [
    "नमस्ते",
    "तपाईंको नाम के हो?",
    "नेपालको बारेमा केही बताउनुस्",
    "सगरमाथा कति अग्लो छ?",
    "नेपाली खाना के राम्रो छ?",
    "धन्यवाद, बाय!"
]

hybrid_chatbot.reset()

for user_input in demo_conversation:
    response = hybrid_chatbot.get_response(user_input)
    print(f"\n👤 User: {user_input}")
    print(f"🤖 Bot: {response}")

## 8. Save the Knowledge Base and Conversation Data

In [None]:
import json

# Save knowledge base
with open('nepali_knowledge_base.json', 'w', encoding='utf-8') as f:
    json.dump(knowledge_base, f, ensure_ascii=False, indent=2)

print("✅ Knowledge base saved to nepali_knowledge_base.json")

# Save conversation data
with open('nepali_conversations.json', 'w', encoding='utf-8') as f:
    json.dump(conversation_data, f, ensure_ascii=False, indent=2)

print("✅ Conversation data saved to nepali_conversations.json")
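
The `ensure_ascii=False` and `encoding='utf-8'` arguments are what keep the Devanagari text readable in the saved files. A small round-trip sketch follows, using a temporary directory and a hypothetical file name `kb.json` so that nothing in the working directory is touched.

```python
import json
import os
import tempfile

sample = {"नमस्ते": "नमस्ते! म तपाईंलाई कसरी मद्दत गर्न सक्छु?"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "kb.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f, ensure_ascii=False, indent=2)
    with open(path, "r", encoding="utf-8") as f:
        raw = f.read()          # the file content as text
        f.seek(0)
        restored = json.load(f)

print("नमस्ते" in raw)                # True: the text is stored as Devanagari
print("\\u" in json.dumps(sample))   # True: the default escapes it as \uXXXX
print(restored == sample)            # True: the data survives the round trip
```

Both forms load back to the same Python objects; the unescaped form is simply easier to inspect and edit by hand.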

## 9. Summary

### Pre-trained Models Used:

| Model | Purpose | Source |
|-------|---------|--------|
| **Sakonii/distilgpt2-nepali** | Text generation | HuggingFace |
| **NepBERTa/NepBERTa** | Semantic embeddings | HuggingFace |
| **bert-base-multilingual-cased** | Fallback embeddings | HuggingFace |

### Chatbot Types Created:

| Type | Approach | Best For |
|------|----------|----------|
| **Generative** | GPT-2 text generation | Open-ended conversations |
| **Retrieval** | Semantic similarity | Known Q&A pairs |
| **Hybrid** | Both combined | General use |

### Key Features:
- Uses **actual pre-trained Nepali models**
- Semantic similarity with NepBERTa/mBERT
- Conversation history tracking
- Fine-tuning capability
- Knowledge base extensibility

In [None]:
print("\n" + "="*60)
print("✅ NEPALI CHATBOT NOTEBOOK COMPLETE!")
print("="*60)
print("\nYou have learned:")
print("  ✓ Loading pre-trained Nepali GPT-2 model")
print("  ✓ Building generative chatbot")
print("  ✓ Building retrieval-based chatbot with semantic similarity")
print("  ✓ Creating hybrid chatbot")
print("  ✓ Fine-tuning on custom data")
print("\n🇳🇵 धन्यवाद! (Thank you!)")