# 📘 Meteor Chatbot — Jupyter-Friendly Version

This notebook implements a simple NLP chatbot named **Meteor Bot**. It uses **NLTK**, **TF-IDF**, and **cosine similarity** to return relevant responses to user queries based on two corpora: one for general answers and another for module-related questions.

## Step 1: Import Libraries
We start by importing necessary libraries including NLTK for natural language processing,
Scikit-learn for TF-IDF and cosine similarity, and standard Python libraries for basic operations.

In [None]:
# 🔧 Setup and Imports
import random
import string
import warnings
from pathlib import Path
import nltk
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

## Step 2: Download Required NLTK Resources
We download the Punkt sentence-tokenizer models and the WordNet corpus. Punkt handles tokenization, and the WordNet lemmatizer relies on the corpus to normalize words.

In [None]:
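# 🚫 Disable Warnings and Download NLTK Resources
warnings.filterwarnings("ignore")
nltk.download("punkt", quiet=True)
# Newer NLTK releases (3.9 and later) look up "punkt_tab" instead of "punkt";
# requesting both keeps the notebook working across versions.
nltk.download("punkt_tab", quiet=True)
nltk.download("wordnet", quiet=True)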

## Step 3: Load Data Files
Ensure that `answer.txt` (general answers) and `chatbot.txt` (module-related questions) are placed in the same directory as this notebook. These files
contain the corpora that the chatbot will use for answering questions.

In [None]:
# 📂 Load Required Files
DATA_DIR = Path(".")
ANS_FILE = DATA_DIR / "answer.txt"
Q_FILE = DATA_DIR / "chatbot.txt"

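# Report which file is actually missing instead of a generic message.
missing = [f.name for f in (ANS_FILE, Q_FILE) if not f.exists()]
if missing:
    raise FileNotFoundError(f"❌  Missing {', '.join(missing)}: put 'answer.txt' and 'chatbot.txt' in the same directory!")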

## Step 4: Tokenize Text into Sentences
We split the raw content of each file into individual sentences so that responses can be matched at the sentence level.

In [None]:
# 📖 Tokenize the Corpora
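# Keep the original casing: Punkt uses capitalization as a cue for sentence
# boundaries, and LemNormalize lowercases text before matching anyway.
raw_ans = ANS_FILE.read_text(encoding="utf8", errors="ignore")
raw_q = Q_FILE.read_text(encoding="utf8", errors="ignore")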

sent_tokens_ans = nltk.sent_tokenize(raw_ans)
sent_tokens_q = nltk.sent_tokenize(raw_q)

## Step 5: Text Normalization Helpers
These helper functions strip punctuation, tokenize, and lemmatize both the user input and the corpus text, so that the two are compared in the same normalized form.

In [None]:
# 🧠 NLP Helpers: Lemmatizer and Normalizer
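lemmer = nltk.stem.WordNetLemmatizer()
# Translation table that deletes every ASCII punctuation character.
remove_punct = str.maketrans("", "", string.punctuation)

def LemTokens(tokens):
    # Reduce each token to its WordNet lemma (e.g. "modules" -> "module").
    return [lemmer.lemmatize(t) for t in tokens]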

def LemNormalize(text):
    return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct)))
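
To see what the punctuation-stripping step does without downloading any NLTK data, here is a minimal sketch. It keeps the translation-table idea from the cell above but swaps `nltk.word_tokenize` and the WordNet lemmatizer for a plain whitespace split. `simple_normalize` is a hypothetical name that is not part of the notebook, and because it skips lemmatization, plural and inflected forms stay as they are.

```python
import string

# Translation table that deletes every ASCII punctuation character.
remove_punct = str.maketrans("", "", string.punctuation)

def simple_normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split on whitespace (no lemmatization)."""
    return text.lower().translate(remove_punct).split()

print(simple_normalize("Hello, World! What's up?"))
# ['hello', 'world', 'whats', 'up']
```

Note that deleting the apostrophe turns "what's" into "whats". The same happens in `LemNormalize`, which is harmless as long as the corpus and the query go through the same function.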

## Step 6: Handle Greetings and Predefined Replies
This section defines how the bot should handle basic greetings and common small talk.

In [None]:
# 💬 Greeting Detection and Predefined Replies
INTRO_ANS = [
    "My name is Meteor Bot.",
    "You can call me Meteor Bot or B.O.T.",
    "I'm Meteor Bot, happy to chat!"
]
GREETING_IN = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREETING_OUT = ("hi", "hey", "hello", "hi there", "hello there")

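def greeting(sentence: str):
    # Strip punctuation from each word so "hello!" still matches, and check
    # multi-word phrases such as "what's up" against the whole sentence,
    # since a word-by-word comparison can never match them.
    text = sentence.lower()
    words = [w.strip(string.punctuation) for w in text.split()]
    for phrase in GREETING_IN:
        if phrase in words or (" " in phrase and phrase in text):
            return random.choice(GREETING_OUT)
    return None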

## Step 7: Core Logic - TF-IDF Based Response Matching
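We vectorize the user message together with the corpus sentences using TF-IDF, then use cosine similarity to pick the corpus sentence closest to the message. If no sentence shares any term with the message (similarity 0), the bot returns a fallback reply.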

In [None]:
# 🤖 TF-IDF Based Response Generator
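def generate_response(user_msg: str, corpus):
    fallback = "I'm sorry, I didn't understand that."
    if not corpus:
        return fallback

    # Fit on the corpus plus the query (last row) without mutating the caller's list.
    vec = TfidfVectorizer(tokenizer=LemNormalize, stop_words=None)
    tfidf = vec.fit_transform(list(corpus) + [user_msg])
    # Score the query against the corpus rows only, so it can never match itself
    # (the old argsort()[-2] lookup could pick the query when scores tied).
    sims = cosine_similarity(tfidf[-1], tfidf[:-1]).flatten()

    idx = int(sims.argmax())
    if sims[idx] == 0:
        return fallback
    return corpus[idx]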
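
To make the matching step concrete, the following is a hand-rolled sketch of TF-IDF weighting and cosine similarity on a toy corpus, using only the standard library. It follows the smoothed formula `idf = ln((1 + n) / (1 + df)) + 1` and the L2 normalization that scikit-learn documents as the `TfidfVectorizer` defaults, but it is not the library's code. It also tokenizes on whitespace rather than with `LemNormalize`. The function names, the toy sentences, and the query are all made up for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one L2-normalized {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors

def cosine(a, b):
    # Both vectors are unit length, so the dot product equals the cosine.
    return sum(w * b.get(term, 0.0) for term, w in a.items())

# Hypothetical corpus and query, for illustration only.
toy_corpus = [
    "the exam is in june",
    "lectures start at nine",
    "the library closes at ten",
]
query = "when is the exam"

# As in generate_response, the query is vectorized together with the corpus.
vectors = tfidf_vectors(toy_corpus + [query])
scores = [cosine(vectors[-1], v) for v in vectors[:-1]]
best = max(range(len(scores)), key=scores.__getitem__)
print(toy_corpus[best])  # the exam is in june
```

The second sentence shares no term with the query, so its score is exactly 0. That is the case the fallback reply in `generate_response` is meant to catch.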

## Step 8: Chatbot Interface Function
This function integrates canned responses and TF-IDF matching logic to respond to user queries.

In [None]:
# 🎯 Public Chat Interface
def chat(user_msg: str) -> str:
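    # Lowercase and drop trailing punctuation so that inputs like "How are you?"
    # or "bye!" still hit the exact-match canned replies below.
    u = user_msg.strip().lower().rstrip(string.punctuation + " ")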
    if not u:
        return "Please say something 🙂"

    if u in ("bye", "goodbye", "see you"):
        return "Bye! Take care."
    if u in ("thanks", "thank you", "thx"):
        return "You're welcome."
    if u in ("how are you", "how r u", "how're you", "how's it going", "how's everything"):
        return "I'm fine, thank you for asking!"
    if "your name" in u:
        return random.choice(INTRO_ANS)

    g = greeting(u)
    if g:
        return g

    if "module" in u:
        return generate_response(u, sent_tokens_q.copy())

    return generate_response(u, sent_tokens_ans.copy())
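
The order of the checks in `chat` matters, because the first branch that matches wins. The sketch below mirrors that order but returns a label instead of a reply, so the routing can be inspected without any corpus files. `route` and its labels are hypothetical and exist only for illustration.

```python
GREETING_IN = ("hello", "hi", "greetings", "sup", "what's up", "hey")

def route(user_msg: str) -> str:
    """Return a label for the branch that a chat()-style dispatcher would take."""
    u = user_msg.strip().lower()
    if not u:
        return "empty"
    if u in ("bye", "goodbye", "see you"):
        return "farewell"
    if u in ("thanks", "thank you", "thx"):
        return "thanks"
    if u in ("how are you", "how r u", "how're you", "how's it going", "how's everything"):
        return "small talk"
    if "your name" in u:
        return "intro"
    if any(word in GREETING_IN for word in u.split()):
        return "greeting"
    if "module" in u:
        return "module corpus"
    return "answer corpus"

for msg in ("Thanks", "thanks a lot", "which module covers nlp", "hey which module covers nlp"):
    print(repr(msg), "->", route(msg))
```

Two things stand out. The canned replies need the whole message to match, so "thanks a lot" falls through to the answer corpus. The greeting check also runs before the module check, so a message that starts with "hey" gets a greeting back even when it goes on to ask about a module.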

## Step 9: Command-Line Style Chat (Optional)
This lets you chat with the bot directly from the notebook using standard input/output. Call `run_cli()` in a cell to start the loop, then type `bye` or interrupt the kernel to stop it.

In [None]:
# 💻 Run Terminal-style Chat in Notebook
def run_cli():
    print("Meteor Bot — type 'bye' to quit")
    while True:
        try:
            user = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            print("\nBye!")
            break
        if not user:
            continue
        reply = chat(user)
        print("Bot:", reply)
        # Ignore trailing punctuation so "bye!" also ends the session.
        if user.lower().rstrip(string.punctuation + " ") in ("bye", "goodbye", "see you"):
            break
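
Because `run_cli` blocks on `input()`, it is awkward to use in automated runs. One possible workaround is a scripted driver that feeds a fixed list of messages to the bot and returns the transcript. This is a sketch under assumptions: `run_scripted`, `FAREWELLS`, and `echo_bot` are hypothetical names, and `echo_bot` stands in for the notebook's `chat` function so that the block runs on its own.

```python
from typing import Callable, Iterable

FAREWELLS = ("bye", "goodbye", "see you")

def run_scripted(messages: Iterable[str], chat_fn: Callable[[str], str]) -> list[tuple[str, str]]:
    """Feed messages to chat_fn in order and stop after a farewell, like run_cli."""
    transcript = []
    for msg in messages:
        user = msg.strip()
        if not user:
            continue  # run_cli also skips blank input
        transcript.append((user, chat_fn(user)))
        if user.lower() in FAREWELLS:
            break
    return transcript

# Stand-in for the notebook's chat(); it only echoes the message back.
def echo_bot(msg: str) -> str:
    return f"echo: {msg}"

log = run_scripted(["hi", "  ", "bye", "ignored"], echo_bot)
for user, reply in log:
    print("You:", user, "| Bot:", reply)
```

In the notebook itself, passing `chat` in place of `echo_bot` would replay the same messages against the real bot.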