# CSC 620 - Natural Language Technologies

## Homework \# 2
### Slightly Improved Chatbot

### Instructions:

Implement two improvements to your chatbot from HA1.

1. Use word normalization to make your chatbot less brittle. Specifically, use an NLTK stemmer to accept word variations in the user's input. For example, your chatbot should recognize (through normalization/stemming) that 'Can I (*)' and 'Could I (*)' map to the same response pair. Print both the original and the stemmed versions of the user input to the chatlog.

2. Compute and print the total count of types and the total count of tokens in the stemmed version of the user's input from the chatlog. (Ignore the chatbot responses for this part.) Use 10-15 dialogues for this part. Don't use NLTK's tokenizer; implement a simple tokenizer of your own.

Notes:
* You may choose which NLTK stemmer to use in your program.

Submit: 
1. The Python program or IPython notebook
2. Output from your program (chatlog and the counts)
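
For part 2, one possible shape for the tokenizer and the type/token counts is sketched below. The names `simple_tokenize` and `count_types_and_tokens` are hypothetical, and the `stem` parameter defaults to the identity function so that the sketch runs without NLTK; in the homework it would be an NLTK stemmer's `stem` method. Note that this tokenizer splits hyphenated words, which is only one of several reasonable choices.

```python
import re
from collections import Counter


def simple_tokenize(text):
    """Lowercase the text and return runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_types_and_tokens(utterances, stem=lambda word: word):
    """Return (type_count, token_count) over the stemmed tokens of all utterances."""
    counts = Counter()
    for utterance in utterances:
        counts.update(stem(token) for token in simple_tokenize(utterance))
    # Types are the distinct stems; tokens are all occurrences
    return len(counts), sum(counts.values())


# Hypothetical user turns; only the user's side of the dialogue is counted
user_turns = ["Classes are hard", "I have lots of them from many technical classes."]
types, tokens = count_types_and_tokens(user_turns)
print("Type count:", types)    # 11 ("classes" occurs twice)
print("Token count:", tokens)  # 12
```

Passing the stemmer in as a callable keeps the counting logic independent of which NLTK stemmer is chosen.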

---

## Code:

In [13]:
# Natural Language Toolkit: Chatbot Utilities
#
# Copyright (C) 2001-2022 NLTK Project
# Authors: Steven Bird <stevenbird1@gmail.com>
# URL: <https://www.nltk.org/>
# For license information, see LICENSE.TXT

# Based on an Eliza implementation by Joe Strout <joe@strout.net>,
# Jeff Epler <jepler@inetnebr.com> and Jez Higgins <jez@jezuk.co.uk>.

import random
import re
from nltk.stem.snowball import SnowballStemmer

reflections = {
    "i am": "you are",
    "i was": "you were",
    "i": "you",
    "i'm": "you are",
    "i'd": "you would",
    "i've": "you have",
    "i'll": "you will",
    "my": "your",
    "you are": "I am",
    "you were": "I was",
    "you've": "I have",
    "you'll": "I will",
    "your": "my",
    "yours": "mine",
    "you": "me",
    "me": "you",
}

class Chat:
    def __init__(self, pairs, reflections={}):
        """
        Initialize the chatbot.  Pairs is a list of patterns and responses.  Each
        pattern is a regular expression matching the user's statement or question,
        e.g. r'I like (.*)'.  For each such pattern a list of possible responses
        is given, e.g. ['Why do you like %1', 'Did you ever dislike %1'].  Material
        which is matched by parenthesized sections of the patterns (e.g. .*) is mapped to
        the numbered positions in the responses, e.g. %1.

        :type pairs: list of tuple
        :param pairs: The patterns and responses
        :type reflections: dict
        :param reflections: A mapping between first and second person expressions
        :rtype: None
        """

        self._pairs = [(re.compile(x, re.IGNORECASE), y) for (x, y) in pairs]
        self._reflections = reflections
        self._regex = self._compile_reflections()
        self._raw_text = []
        self._text_types = {}

    def _compile_reflections(self):
        sorted_refl = sorted(self._reflections, key=len, reverse=True)
        return re.compile(
            r"\b({})\b".format("|".join(map(re.escape, sorted_refl))), re.IGNORECASE
        )

    def _substitute(self, str):
        """
        Substitute words in the string, according to the specified reflections,
        e.g. "I'm" -> "you are"

        :type str: str
        :param str: The string to be mapped
        :rtype: str
        """

        return self._regex.sub(
            lambda mo: self._reflections[mo.string[mo.start() : mo.end()]], str.lower()
        )

    def _wildcards(self, response, match):
        pos = response.find("%")
        while pos >= 0:
            num = int(response[pos + 1 : pos + 2])
            response = (
                response[:pos]
                + self._substitute(match.group(num))
                + response[pos + 2 :]
            )
            pos = response.find("%")
        return response

    def respond(self, str):
        """
        Generate a response to the user input.

        :type str: str
        :param str: The string to be mapped
        :rtype: str
        """

        # check each pattern
        for (pattern, response) in self._pairs:
            match = pattern.match(str)

            # did the pattern match?
            if match:
                resp = random.choice(response)  # pick a random response
                resp = self._wildcards(resp, match)  # process wildcards

                # fix munged punctuation at the end
                if resp[-2:] == "?.":
                    resp = resp[:-2] + "."
                if resp[-2:] == "??":
                    resp = resp[:-2] + "?"
                return resp

    # Clear stored conversations
    def clear_conversation(self):
        self._raw_text = []
        self._text_types = {}

    # Store raw conversation into a list
    def store_raw_text(self, text: list[str]):
        self._raw_text.extend(text)

    # Stem each token, strip punctuation from the stem, and tally how often each
    # stem (type) occurs; returns the stemmed text as a single string
    def process_text_types(self, text: list[str]) -> str:
        stemmer = SnowballStemmer("english")
        stemmed_text = []
        for word in text:
            # e.g. "really" -> "realli", "Classes" -> "class"
            word_stem = re.sub(r"[^\w\s']", "", stemmer.stem(word))
            stemmed_text.append(word_stem)
            self._text_types[word_stem] = self._text_types.get(word_stem, 0) + 1
        return " ".join(stemmed_text)
    
    # Count types
    def type_count(self) -> int:
        return len(self._text_types.keys())

    # Count tokens
    def token_count(self) -> int:
        return sum(self._text_types.values())

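    # Hold a conversation with the chatbot until the user enters the quit word
    def converse(self, quit="quit"):
        user_input = ""
        while user_input != quit:
            user_input = quit
            try:
                user_input = input(">")
            except EOFError:
                print(user_input)
            # Strip trailing sentence punctuation; the length check avoids an
            # IndexError when the input consists only of punctuation
            while user_input and user_input[-1] in "!.?":
                user_input = user_input[:-1]
            if user_input:
                # Default to the raw input so that "quit" still reaches respond()
                stemmed_user_input = user_input
                if user_input != quit:
                    # Simple whitespace tokenizer (NLTK's tokenizer is not used)
                    user_input_arr = user_input.split()
                    self.store_raw_text(user_input_arr)
                    stemmed_user_input = self.process_text_types(user_input_arr)
                    print("\nStemmed user input: " + stemmed_user_input + "\n")
                resp = self.respond(stemmed_user_input)
                if user_input != quit:
                    # The reply (minus its final punctuation mark) is stored and
                    # counted too, so the totals cover both sides of the dialogue
                    resp_arr = resp[:-1].split()
                    self.store_raw_text(resp_arr)
                    self.process_text_types(resp_arr)
                print(resp)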
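
The reflection step performed by `_compile_reflections` and `_substitute` can be looked at in isolation. The sketch below uses a reduced reflection table, and `reflect` is a hypothetical standalone name rather than part of the class. Sorting the keys longest-first matters because a regex alternation takes the first alternative that matches, so "i am" has to be tried before "i".

```python
import re

# Reduced table for illustration; the full table is defined earlier in the notebook
reflections = {"i am": "you are", "i": "you", "my": "your", "you": "me", "me": "you"}

# Longest keys first, so "i am" is tried before "i" in the alternation
keys = sorted(reflections, key=len, reverse=True)
pattern = re.compile(r"\b({})\b".format("|".join(map(re.escape, keys))), re.IGNORECASE)


def reflect(text):
    """Swap first- and second-person expressions in a single pass."""
    # The text is lowercased first so that every match is a key of the table
    return pattern.sub(lambda mo: reflections[mo.group(0)], text.lower())


print(reflect("I am worried about my exam"))  # you are worried about your exam
print(reflect("you like me"))                 # me like you
```

Because the substitution is a single pass, a replacement such as "you" is not reflected a second time.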

In [14]:
# Conversation pairs for college therapist

pairs = (
    (
        r"I need (.*)",
        (
            "Why do you need %1?",
            "Would it really help you to get %1?",
            "Are you sure you need %1?",
        ),
    ),
    (
        r"Why don\'t you (.*)",
        (
            "Do you really think I don't %1?",
            "Perhaps eventually I will %1.",
            "Do you really want me to %1?",
        ),
    ),
    (
        r"Why can\'t I (.*)",
        (
            "Do you think you should be able to %1?",
            "If you could %1, what would you do?",
            "I don't know -- why can't you %1?",
            "Have you really tried?",
        ),
    ),
    (
        r"I can\'t (.*)",
        (
            "How do you know you can't %1?",
            "Perhaps you could %1 if you tried.",
            "What would it take for you to %1?",
        ),
    ),
    (
        r"I(\'m| am) (.*) (final|exam|test|quiz(z)?|homework|project|assign)(.*)",
        (
            "Tell me more about your %3.",
            "How does being %2 about %3 affect you?",
        ),
    ),
    (
        r"I am (.*)",
        (
            "Did you come to me because you are %1?",
            "How long have you been %1?",
            "How do you feel about being %1?",
        ),
    ),
    (
        r"I\'m (.*)",
        (
            "How does being %1 make you feel?",
            "Do you enjoy being %1?",
            "Why do you tell me you're %1?",
            "Why do you think you're %1?",
        ),
    ),
    (
        r"Are you (.*)",
        (
            "Why does it matter whether I am %1?",
            "Would you prefer it if I were not %1?",
            "Perhaps you believe I am %1.",
            "I may be %1 -- what do you think?",
        ),
    ),
    (
        r"What (.*)",
        (
            "Why do you ask?",
            "How would an answer to that help you?",
            "What do you think?",
        ),
    ),
    (
        r"How (.*)",
        (
            "How do you suppose?",
            "Perhaps you can answer your own question.",
            "What is it you're really asking?",
        ),
    ),
    (
        r"Because (.*)",
        (
            "Is that the real reason?",
            "What other reasons come to mind?",
            "Does that reason apply to anything else?",
            "If %1, what else must be true?",
        ),
    ),
    (
        r"(.*) sorry (.*)",
        (
            "There are many times when no apology is needed.",
            "What feelings do you have when you apologize?",
        ),
    ),
    (
        r"Hello(.*)",
        (
            "Hello... I'm glad you could drop by today.",
            "Hi there... how are you today?",
            "Hello, how are you feeling today?",
        ),
    ),
    (
        r"I think (.*)",
        ("Do you doubt %1?", "Do you really think so?", "But you're not sure %1?"),
    ),
    (
        r"(.*) friend (.*)",
        (
            "Tell me more about your friends.",
            "When you think of a friend, what comes to mind?",
            "Why don't you tell me about a childhood friend?",
        ),
    ),
    (r"Yes", ("You seem quite sure.", "OK, but can you elaborate a bit?")),
    (r"No", 
        (
            "I understand you don't want to share. What would you like to talk about?", 
            "OK, is there something else you would like to talk about?"
        )
    ),
    (
        r"(.*) computer(.*)",
        (
            "Are you really talking about me?",
            "Does it seem strange to talk to a computer?",
            "How do computers make you feel?",
            "Do you feel threatened by computers?",
        ),
    ),
    (
        r"Is it (.*)",
        (
            "Do you think it is %1?",
            "Perhaps it's %1 -- what do you think?",
            "If it were %1, what would you do?",
            "It could well be that %1.",
        ),
    ),
    (
        r"It is (.*)",
        (
            "You seem very certain.",
            "If I told you that it probably isn't %1, what would you feel?",
        ),
    ),
    (
        r"Can you (.*)",
        (
            "What makes you think I can't %1?",
            "If I could %1, then what?",
            "Why do you ask if I can %1?",
        ),
    ),
    (
        r"Can I (.*)",
        (
            "Perhaps you don't want to %1.",
            "Do you want to be able to %1?",
            "If you could %1, would you?",
        ),
    ),
    (
        r"You are (.*)",
        (
            "Why do you think I am %1?",
            "Does it please you to think that I'm %1?",
            "Perhaps you would like me to be %1.",
            "Perhaps you're really talking about yourself?",
        ),
    ),
    (
        r"You\'re (.*)",
        (
            "Why do you say I am %1?",
            "Why do you think I am %1?",
            "Are we talking about you, or me?",
        ),
    ),
    (
        r"I don\'t (.*)",
        ("Don't you really %1?", "Why don't you %1?", "Do you want to %1?"),
    ),
    (
        r"I feel (.*)",
        (
            "Good, tell me more about these feelings.",
            "Do you often feel %1?",
            "When do you usually feel %1?",
            "When you feel %1, what do you do?",
        ),
    ),
    (
        r"I have (.*)",
        (
            "Why do you tell me that you've %1?",
            "Have you really %1?",
            "Now that you have %1, what will you do next?",
        ),
    ),
    (
        r"I would (.*)",
        (
            "Could you explain why you would %1?",
            "Why would you %1?",
            "Who else knows that you would %1?",
        ),
    ),
    (
        r"Is there (.*)",
        (
            "Do you think there is %1?",
            "It's likely that there is %1.",
            "Would you like there to be %1?",
        ),
    ),
    (
        r"(.*)(final|exam|test|quiz(z)?|homework|project|assign)(.*)",
        (
            "Talk to me about your %2.",
            "Can you elaborate on your %2?",
        ),
    ),
    (
        r"(.*)(comput|math(emat)?)(.*)",
        (
            "How do you feel about %2?",
            "What would you like to do with %2?",
            "What draws you to %2?",
            "Why did you choose to study %2?",
        ),
    ),
    (
        r"(.*)(class|cours)(.*)",
        (
            "Tell me more about your %2.",
            "How would you like your %2 to go?",
        ),
    ),
    (
        r"(.*)(grade|score)(.*)",
        (
            "How does your %2 make you feel?",
            "What do you think you can do to improve your %2?",
            "How important is your %2 to you?",
            "What do you think a good %2 will allow you to do?"
        ),
    ),
    (
        r"(.*)(job|career)(.*)",
        (
            "How important is having a good %2 to you?",
            "What would you like to accomplish in your future %2?",
            "If you can't handle it, why don't you just drop out of school?",
            "I don't know about you, but I don't think you are good enough for any job.",
            "There's always burger flipping..."
        ),
    ),
    (
        r"(.*)(teacher|instructor|professor|tutor)(.*)",
        (
            "What do you think about your %2?",
            "How does your %2 make you feel?",
            "What do you expect from your %2?",
            "How do you think your %2 is fulfilling your academic needs?",
        ),
    ),
    (
        r"My (.*)",
        (
            "I see, your %1.",
            "Why do you say that your %1?",
            "When your %1, how do you feel?",
        ),
    ),
    (
        r"You (.*)",
        (
            "We should be discussing you, not me.",
            "Why do you say that about me?",
            "Why do you care whether I %1?",
        ),
    ),
    (r"Why (.*)", ("Why don't you tell me the reason why %1?", "Why do you think %1?")),
    (
        r"I want (.*)",
        (
            "What would it mean to you if you got %1?",
            "Why do you want %1?",
            "What would you do if you got %1?",
            "If you got %1, then what would you do?",
        ),
    ),
    (
        # Capture the whole run of non-ASCII characters, not just the last one
        r"([^\x00-\x7F]+)",
        (
            "I'm sorry, I don't understand %1.",
            "Can you tell me what %1 means?"
        ),
    ),
    (
        r"(.*) mother(.*)",
        (
            "Tell me more about your mother.",
            "What was your relationship with your mother like?",
            "How do you feel about your mother?",
            "How does this relate to your feelings today?",
            "Good family relations are important.",
        ),
    ),
    (
        r"(.*) father(.*)",
        (
            "Tell me more about your father.",
            "How did your father make you feel?",
            "How do you feel about your father?",
            "Does your relationship with your father relate to your feelings today?",
            "Do you have trouble showing affection with your family?",
        ),
    ),
    (
        r"(.*) child(.*)",
        (
            "Did you have close friends as a child?",
            "What is your favorite childhood memory?",
            "Do you remember any dreams or nightmares from childhood?",
            "Did the other children sometimes tease you?",
            "How do you think your childhood experiences relate to your feelings today?",
        ),
    ),
    (
        r"(.*)\?",
        (
            "Why do you ask that?",
            "Please consider whether you can answer your own question.",
            "Perhaps the answer lies within yourself?",
            "Why don't you tell me?",
        ),
    ),
    (
        r"quit",
        (
            "Thank you for talking with me.",
            "Good-bye.",
            "Thank you, that will be $150.  Have a good day!",
            "I hate you.  Go away.",
            "This was the most worthless session I've participated in. Goodbye.",
        ),
    ),
    (
        r"(.*)",
        (
            "Please tell me more.",
            "Let's change focus a bit... Tell me about your family.",
            "Can you elaborate on that?",
            "Why do you say that %1?",
            "I see.",
            "Very interesting.",
            "%1.",
            "I see.  And what does that tell you?",
            "How does that make you feel?",
            "How do you feel when you say that?",
        ),
    ),
)
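
Because `respond` calls `pattern.match` on the stemmed input, each pattern is anchored at the start of the string and has to be written against stems rather than surface words. The sketch below, with stemmed strings modeled on the chatlog, illustrates two consequences: a bare prefix such as `r"No"` also matches input that starts with "not", and `r"Because (.*)"` does not match the stem "becaus". The helper `matches` and the word-boundary and alternation variants are suggestions for illustration, not part of the pattern list above.

```python
import re


def matches(pattern, text):
    """Mimic Chat.respond: case-insensitive match anchored at the start of text."""
    return re.compile(pattern, re.IGNORECASE).match(text) is not None


stemmed = "not long sinc it is becaus i am pull a bunch of allnight"

print(matches(r"No", stemmed))      # True: "No" is a prefix of "not"
print(matches(r"No\b", stemmed))    # False: \b requires the word to end after "no"

print(matches(r"Because (.*)", "becaus i am tire"))  # False: the stem lacks the final "e"
print(matches(r"Becaus (.*)", "becaus i am tire"))   # True

# A suffix-stripping stemmer is unlikely to conflate irregular forms such as
# "can" and "could", so an alternation is one way to share a response pair
print(matches(r"(?:Can|Could) I (.*)", "could i go"))  # True
print(matches(r"(?:Can|Could) I (.*)", "can i go"))    # True
```

Checking each pattern against a few stemmed inputs in this way is a quick test of whether a pair can ever fire.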

In [15]:
eliza_chatbot = Chat(pairs, reflections)

def eliza_chat():
    print("College Therapist\n---------")
    print("Talk to the program by typing in plain English, using normal upper-")
    print('and lower-case letters and punctuation.  Enter "quit" when done.')
    print("=" * 72)
    print("Hello.  How are you feeling today?")

    eliza_chatbot.converse()

In [16]:
eliza_chat()

College Therapist
---------
Talk to the program by typing in plain English, using normal upper-
and lower-case letters and punctuation.  Enter "quit" when done.
Hello.  How are you feeling today?
>I am quite tired today.

Stemmed user input: i am quit tire today

How long have you been quit tire today?
>Not long, since it is because I am pulling a bunch of all-nighters to catch up on all the work I need to do.

Stemmed user input: not long sinc it is becaus i am pull a bunch of allnight to catch up on all the work i need to do

OK, is there something else you would like to talk about?
>Classes are hard

Stemmed user input: class are hard

Tell me more about your class.
>My assignments are really difficult and time consuming.

Stemmed user input: my assign are realli difficult and time consum

Talk to me about your assign.
>I have lots of them from many technical classes.

Stemmed user input: i have lot of them from mani technic class

Now that you have lot of them from mani technic cla

In [17]:
print("Type count: " + str(eliza_chatbot.type_count()))
print("Token count: " + str(eliza_chatbot.token_count()))

Type count: 132
Token count: 348
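
The instructions ask for counts over the user's input only, whereas `converse` also passes the chatbot's replies to `process_text_types`, so the totals above cover both speakers. One way to keep the tallies separate is sketched below. `TurnCounter` and its method names are hypothetical, and the stemmer is injected as a callable (identity by default) so that the sketch runs without NLTK.

```python
from collections import Counter


class TurnCounter:
    """Tally stemmed tokens per speaker so user counts can be reported alone."""

    def __init__(self, stem=lambda word: word):
        # In the real program this could be an NLTK stemmer's stem method
        self._stem = stem
        self._counts = {"user": Counter(), "bot": Counter()}

    def add(self, speaker, text):
        tokens = text.lower().split()  # simple whitespace tokenizer
        self._counts[speaker].update(self._stem(token) for token in tokens)

    def type_count(self, speaker="user"):
        return len(self._counts[speaker])

    def token_count(self, speaker="user"):
        return sum(self._counts[speaker].values())


# Hypothetical, already-stemmed turns
counter = TurnCounter()
counter.add("user", "class are hard")
counter.add("bot", "tell me more about your class")
counter.add("user", "my class are realli hard")
print("Type count:", counter.type_count())    # 5
print("Token count:", counter.token_count())  # 8
```

Keeping one `Counter` per speaker leaves the bot's totals available for comparison without mixing them into the user's counts.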
