Title: Building a Simple NLP Chatbot with Python: A Case Study on "QuestionStage"
Introduction
In this notebook, we'll explore the "QuestionStage" project, a Python console application leveraging Natural Language Processing (NLP) techniques to simulate a chatbot. We'll demonstrate how to analyze and respond to user inputs using Python NLP libraries and discuss how these concepts can be applied in a .NET MAUI application using C# NLP toolkits.

Table of Contents
1. Project Overview
2. Setting Up the Environment
3. Data Preprocessing
4. Implementing NLP Techniques
    Tokenization
    POS Tagging
    Lemmatization
5. Building the Chatbot Logic
6. Running the Chatbot
7. Transitioning to .NET MAUI with C#
8. Conclusion and Next Steps
1. Project Overview
"QuestionStage" is designed to help users simulate conversation stages typically experienced in online dating or casual interactions. The core functionality revolves around understanding and generating human-like responses to user queries based on predefined responses.

2. Setting Up the Environment
We'll begin by setting up the Python environment: installing nltk, textblob, and spacy, then downloading the NLTK data and the spaCy model that the later cells rely on.

In [None]:
import sys
import subprocess
from importlib import metadata  # standard-library replacement for the deprecated pkg_resources

def install(package):
    """Install `package` with pip unless a distribution of that name is already present."""
    try:
        metadata.version(package)
        print(f"{package} is already installed.")
    except metadata.PackageNotFoundError:
        try:
            subprocess.check_call([sys.executable, "-m", "pip", "install", package])
            print(f"Installed {package}.")
        except Exception as e:
            print(f"Failed to install {package}. Error: {e}")

# Install necessary packages
packages = ['nltk', 'textblob', 'spacy']
for package in packages:
    install(package)

# Import installed packages
import nltk
from nltk.corpus import wordnet
from textblob import TextBlob
import spacy

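# Ensure NLTK resources are downloaded. nltk.data.find() expects a
# category-prefixed path such as 'corpora/wordnet'; a bare name is never
# found, which would trigger a download on every run.
def download_nltk_resource(resource_path):
    try:
        nltk.data.find(resource_path)
    except LookupError:
        # nltk.download() takes the bare package name (the last path segment)
        nltk.download(resource_path.split('/')[-1])

# Note: recent NLTK releases may ask for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng' instead; download those if a LookupError names them.
download_nltk_resource('corpora/wordnet')
download_nltk_resource('taggers/averaged_perceptron_tagger')
download_nltk_resource('tokenizers/punkt')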

# Ensure spaCy model is downloaded and load it
try:
    nlp = spacy.load('en_core_web_sm')
except OSError:
    spacy.cli.download('en_core_web_sm')
    nlp = spacy.load('en_core_web_sm')
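
Before calling pip, it can help to check which modules the current kernel can already import. The following is a minimal sketch using only the standard library; missing_modules is a hypothetical helper, not part of QuestionStage, and it assumes the import name matches the package name (true for nltk, textblob, and spacy, but not for every package).

```python
import importlib.util

def missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# 'json' ships with Python; the second name is deliberately made up.
print(missing_modules(["json", "no_such_module_for_demo"]))
```

Passing ['nltk', 'textblob', 'spacy'] would list whichever of the three still need installing.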


3. Data Preprocessing
Preprocessing text data is a critical step in any NLP project. Here we demonstrate spelling correction with TextBlob followed by tokenization with NLTK; lemmatization is covered in section 4. Note that preprocess_text() does not exist in the QuestionProcessor.py script itself; the name is used here only to demonstrate the logic behind the script.

In [None]:
def preprocess_text(text):
    # Correct the question using TextBlob
    blob = TextBlob(text)
    corrected_text = str(blob.correct())
    
    # Tokenize
    tokens = nltk.word_tokenize(corrected_text)
    
    return corrected_text, tokens

text = "Helo! Hw cn I help yu today?"
corrected_text, tokens = preprocess_text(text)
print("Corrected Text:", corrected_text)
print("Tokens:", tokens)
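
To make the tokenization step concrete, here is a rough standard-library stand-in. It is not the algorithm behind nltk.word_tokenize, which applies more rules (for example around contractions); simple_tokenize is a hypothetical name used only for illustration.

```python
import re

def simple_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+(?:'\w+)? keeps contractions such as "don't" together;
    # [^\w\s] emits each punctuation mark as its own token.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

print(simple_tokenize("Hello! How can I help you today?"))
# ['Hello', '!', 'How', 'can', 'I', 'help', 'you', 'today', '?']
```

The point to notice is that punctuation becomes its own token, which is why the later lookup steps compare individual tokens rather than raw whitespace-separated words.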


4. Implementing NLP Techniques
POS Tagging
Part-of-speech (POS) tagging assigns a grammatical category (noun, verb, adjective, etc.) to each token. nltk.pos_tag() reports these categories as Penn Treebank tags.

In [None]:
nltk.download('averaged_perceptron_tagger')
pos_tags = nltk.pos_tag(tokens)
print("POS Tags:", pos_tags)
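
The tags printed above are short codes. As a reading aid, the sketch below maps a small subset of Penn Treebank tags to descriptions. The PENN_TAGS table and describe_tags helper are illustrative names, and the sample pairs are hand-written in the shape pos_tag() returns rather than actual tagger output.

```python
# A small subset of the Penn Treebank tag set, for reading pos_tag() output.
PENN_TAGS = {
    "NN": "noun, singular",
    "NNS": "noun, plural",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBG": "verb, gerund or present participle",
    "JJ": "adjective",
    "RB": "adverb",
    "PRP": "personal pronoun",
    "MD": "modal",
    "WRB": "wh-adverb",
}

def describe_tags(tagged_tokens):
    """Attach a readable description to each (token, tag) pair."""
    return [(tok, tag, PENN_TAGS.get(tag, "other")) for tok, tag in tagged_tokens]

# Hand-written example pairs; not produced by the tagger.
sample = [("How", "WRB"), ("can", "MD"), ("I", "PRP"), ("help", "VB")]
for row in describe_tags(sample):
    print(row)
```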


Lemmatization
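Lemmatization reduces words to their dictionary base form (the lemma). Called without a part of speech, as below, WordNetLemmatizer treats every token as a noun.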

In [None]:
lemmatizer = nltk.WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(token) for token in tokens]
print("Lemmas:", lemmas)
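
Why the part of speech matters can be shown with a toy lookup table. This is only a sketch of the idea, not how WordNet works internally; TOY_LEMMAS and toy_lemmatize are made-up names, and the entries are illustrative.

```python
# Toy lookup table keyed by (word, part of speech); entries are illustrative.
TOY_LEMMAS = {
    ("meeting", "v"): "meet",
    ("meeting", "n"): "meeting",
    ("better", "a"): "good",
    ("hobbies", "n"): "hobby",
}

def toy_lemmatize(word, pos="n"):
    """Return the base form for (word, pos), or the word itself if unknown."""
    return TOY_LEMMAS.get((word.lower(), pos), word)

print(toy_lemmatize("meeting"))        # treated as a noun by default
print(toy_lemmatize("meeting", "v"))   # treated as a verb
```

The same surface form maps to different lemmas depending on the tag, which is why the chatbot loop in section 6 passes a WordNet part of speech to the lemmatizer instead of relying on the noun default.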


5. Building the Chatbot Logic
The chatbot logic involves responding to user inputs based on a predefined dictionary of responses. The input is processed to identify keywords, and corresponding responses are retrieved.

In [None]:
# Define your responses
responses = {
    'name': 'your name',
    'location': 'your location',
    'age': 'your age',
    'occupation': 'your job',
    'height': 'your height',
    'physical activity': 'your level of physical activities',
    'educational level': 'your educational level',
    'drinking habit': 'your drinking frequency',
    'smoking habit': 'your smoking frequency',
    'gender identity': 'your gender',
    'seeking': 'what you seek from your partner',
    'want children': 'if you want children or not',
    'star sign': 'your horoscope sign',
    'politics': 'how political are you',
    'religion': 'your religion or lack of',
    'tribe': 'your tribe if you have one',
    'hobbies': 'your hobbies',
    'passion': 'your passion',
    'dreams': 'what you dream of doing or being',
    'expectations in a relationship': 'what you expect in a relationship',
    'favorite food': 'best food',
    'favorite color': 'best color',
    'favorite animal': 'best animal',
    'favorite movie': 'best movie',
    'favorite book': 'best book',
    'favorite music genre': 'best music type',
    'favorite artist': 'best musician',
    'favorite travel destination': 'best place to travel',
    'favorite sport': 'best sport',
    'favorite team': 'best team',
    'favorite player': 'best player',
    'favorite subject in school': 'best subject',
    'favorite type of music': 'best music type',
    'favorite type of food': 'best food type',
    'favorite type of movie': 'best movie type',
    'favorite type of book': 'best book type',
    'favorite type of sport': 'best sport type',
    'favorite type of travel': 'best travel type',
    'do you prefer being the one to start chats the most or your man?': 'no',
}

# Convert the keys in the responses dictionary to lowercase
responses = {key.lower(): value for key, value in responses.items()}

# Function to translate Penn Treebank tags to WordNet tags
def get_wordnet_pos(treebank_tag):
    if treebank_tag.startswith('J'):
        return wordnet.ADJ
    elif treebank_tag.startswith('V'):
        return wordnet.VERB
    elif treebank_tag.startswith('N'):
        return wordnet.NOUN
    elif treebank_tag.startswith('R'):
        return wordnet.ADV
    else:
        return None

# Function to map a verb to a derivationally related noun where WordNet has one
def verb_to_noun(verb):
    synsets = wordnet.synsets(verb, pos=wordnet.VERB)
    for synset in synsets:
        for lemma in synset.lemmas():
            if lemma.derivationally_related_forms():
                for related_lemma in lemma.derivationally_related_forms():
                    if related_lemma.synset().pos() == 'n':
                        return related_lemma.name()
            # Fallback: strip a past-tense suffix. Test 'ed' first, because
            # any word ending in 'ed' also ends in 'd'.
            elif lemma.name().endswith('ed'):
                return lemma.name()[:-2]
            elif lemma.name().endswith('d'):
                return lemma.name()[:-1]
    # No related noun found: return the verb unchanged
    return verb
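
Many keys in the responses dictionary are multi-word phrases such as 'favorite food', which a token-by-token lookup cannot match exactly. One possible complement, sketched here under the assumption that keys consist only of lowercase words, is to search for each key as a whole-word phrase in the normalised question. match_phrases is a hypothetical helper, not part of the original script.

```python
import re

def match_phrases(question, responses):
    """Return responses whose key occurs as a whole-word phrase in question."""
    # Normalise to lowercase words separated by single spaces, padded with
    # spaces so that ' age ' cannot match inside 'page'.
    words = re.findall(r"[a-z0-9']+", question.lower())
    padded = " " + " ".join(words) + " "
    return [reply for key, reply in responses.items()
            if " " + key + " " in padded]

demo_responses = {"age": "your age", "favorite food": "best food"}
print(match_phrases("What is your favorite food?", demo_responses))
# ['best food']
```

Keys that contain punctuation would need the same normalisation applied to them before comparison.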


6. Running the Chatbot
The chatbot runs in a loop, accepting user inputs and providing responses based on the predefined dictionary.

In [None]:
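while True:
    question = input("Ask me anything (type 'quit' to exit): ")

    # Give the user a way to leave the loop
    if question.strip().lower() in ('quit', 'exit'):
        break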

    # Correct the question using TextBlob
    blob = TextBlob(question)
    corrected_question = str(blob.correct())

    # Convert the corrected question to lowercase and split it into words
    keywords = corrected_question.lower().split()

    # List to store matching responses
    matching_responses = []

    # Tokenize and process the corrected question
    tokens = nltk.word_tokenize(corrected_question)
    pos_tags = nltk.pos_tag(tokens)

    for token, pos in pos_tags:
        wordnet_pos = get_wordnet_pos(pos)
        if wordnet_pos:
            lemma = nltk.WordNetLemmatizer().lemmatize(token, pos=wordnet_pos)
        else:
            lemma = token

        if wordnet_pos == wordnet.VERB:
            noun_form = verb_to_noun(lemma)
            if noun_form in responses:
                matching_responses.append(responses[noun_form])
        elif lemma in responses:
            matching_responses.append(responses[lemma])

    # Prefix matching for additional flexibility: compare the first three
    # characters of each word with the response keys, keeping the first hit
    for keyword in keywords:
        keyword_prefix = keyword[:3]
        for response_key in responses:
            if response_key.startswith(keyword_prefix):
                matching_responses.append(responses[response_key])
                break

    # Process the corrected question with spaCy
    doc = nlp(corrected_question)
    for token in doc:
        lemma = token.lemma_.lower()
        if lemma in responses:
            matching_responses.append(responses[lemma])

    # Print each unique response once, in the order it was first matched
    unique_responses = list(dict.fromkeys(matching_responses))
    for response in unique_responses:
        print(response)
    if not unique_responses:
        print("I don't have an answer for that.")
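
The prefix-matching step is the loosest part of the loop: a one-letter word such as 'a' yields the prefix 'a', which matches a key like 'age'. The sketch below isolates that step as a pure function so it can be tested without input(), and adds a minimum word length as one possible guard. The function name and the prefix_len parameter are assumptions for illustration, not part of the original script.

```python
def prefix_matches(question, responses, prefix_len=3):
    """Return (key, reply) pairs whose key starts with a word's prefix."""
    hits = {}
    for word in question.lower().split():
        if len(word) < prefix_len:
            continue  # skip short words such as 'a' or 'do'
        prefix = word[:prefix_len]
        for key, reply in responses.items():
            if key.startswith(prefix):
                hits.setdefault(key, reply)  # dict keeps first-seen order
                break
    return list(hits.items())

demo_responses = {"age": "your age", "want children": "if you want children or not"}
print(prefix_matches("I wander a lot", demo_responses))
# [('want children', 'if you want children or not')]
```

Even with the length guard, 'wander' still matches 'want children', so prefix matching trades precision for recall and is probably best treated as a fallback when the exact lookups find nothing.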


7. Transitioning to .NET MAUI with C#
To implement similar functionality in a .NET MAUI application, we can use C# NLP libraries and toolkits such as OpenNlp, Catalyst, and ML.NET. The concepts remain the same, but the implementation will differ in syntax and library usage. That work will be committed to GitHub under the repository name TalkingStage.
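
One way the keyword table might be shared between the two implementations, purely as an assumption about how the transition could be organised, is to export it from Python as JSON so that the C# application can read the same data. export_responses is a hypothetical helper.

```python
import json

def export_responses(responses):
    """Serialise the keyword-to-response table as a JSON string."""
    return json.dumps(responses, indent=2, sort_keys=True, ensure_ascii=False)

demo_responses = {"name": "your name", "age": "your age"}
payload = export_responses(demo_responses)
print(payload)
```

Keeping the table in a data file rather than in code would let both versions stay in sync without duplicating the dictionary by hand.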

8. Conclusion and Next Steps
We've demonstrated how to build a simple NLP chatbot in Python. The next step is to port the same pipeline to a .NET MAUI application in C#.