# Homework 2

## Instructions:

Implement two improvements to your chatbot from HA1.

1. Use word normalizations to make your chatbot less brittle.  Specifically, use NLTK's stemmer to accept word variations in the user's response.  For example, your chatbot should be able to understand (using normalization/stemming) that 'Can I (*)' and 'Could I (*)' map to the same response pair.  Print to the chatlog the original and the stemmed versions of the user input.  

2. Compute and print the total count of types and the total count of tokens in the stemmed version of user's input from the chatlog (Ignore the chatbot responses for this part.)  Use 10-15 dialogues for this part.  Don't use NLTK's tokenizer.  Implement a simple tokenizer of your own. 
Notes:

## Imports
Previously in homework 1 we imported the NLTK `Chat` class which provided the basic implementation of Eliza.

In [1]:
# Natural Language Toolkit: Eliza
#
# Copyright (C) 2001-2023 NLTK Project
# Authors: Steven Bird <stevenbird1@gmail.com>
#          Edward Loper <edloper@gmail.com>
# URL: <https://www.nltk.org/>
# For license information, see LICENSE.TXT

# Based on an Eliza implementation by Joe Strout <joe@strout.net>,
# Jeff Epler <jepler@inetnebr.com> and Jez Higgins <mailto:jez@jezuk.co.uk>.

# a translation table used to convert things you say into things the
# computer says back, e.g. "I am" --> "you are"

from nltk.chat.util import Chat, reflections

To improve on this by introducing text normalization, we will have to override some of NLTK's `Chat` functionality.

We can do this with making a child class that can inherits from `Chat`.

We will also import a stemmer from NLTK and the regex module.

In [2]:
from nltk.stem.snowball import SnowballStemmer
import re

We will also import the `Defaultdict` container from Python's `collections` module to store our word types when normalizing.

We are using a `defaultdict` rather than a dictionary because we are expecting unique keys. This will save us from having to check for existing keys (new types) whenever we are processing types.

In [3]:
from collections import defaultdict

In [4]:
class normalizedChat(Chat):
    def __init__(self, pairs, reflections={}):
        super().__init__(pairs, reflections)
        self._text = []
        self._types = defaultdict(int)
        self.stemmer = SnowballStemmer("english")
        
    def add_text(self, text: list[str]):
        self._text.extend(text);
    
    def add_type(self, text: list[str]):
        stemmed_text = []
        regex_pattern = re.compile(r'[^\w\s\']')
        
        for word in text:
            word_stem = re.sub(regex_pattern, '',self.stemmer.stem(word))
            stemmed_text.append(word_stem)
            self._types[word_stem] += 1
            
        return ' '.join(stemmed_text)
    
    def type_count(self):
        return len(self._types.keys())
    
    def token_count(self):
        return sum(self._types.values())
    
    def process_input(self, user_input):
        word_pattern = re.compile(r'\b\w+\b')
        words = word_pattern.findall(user_input)
        return words
    
    def converse(self, quit="quit"):
        user_input = ""
        response = ""
        while user_input != quit:
            user_input = quit
            try:
                user_input = input(">")
            except EOFError:
                print(user_input)
            if user_input:
                while(user_input[-1] in "!."):
                    user_input = user_input[:-1]
                input_list = self.process_input(user_input)
                self.add_text(input_list)
                stemmed_user_input = self.add_type(input_list)
                print("Stemmed user input: ", stemmed_user_input, "\n")
                response = self.respond(stemmed_user_input)
                self.add_text(input_list)
                self.add_type(input_list)
            print(response)
            print("Type count: ", self.type_count(), "\n")
            print("Token count: ", self.token_count(), "\n")
        

## Pairs
Most of the pairs remain the same from homework 1, but some were slight changed to support our implementation of stemming. It often removed the apostraphe in contractions, so input `don't` could be stemmed to `don t`.

In [5]:
# a table of response pairs, where each pair consists of a
# regular expression, and a list of possible responses,
# with group-macros labelled as %1, %2.
pairs = (
    # Table format (Regex of phrase to look for, (potential responses))
    (
        r"I need (.*)", # Regex of phrase to look for "I need X"
        (
            "Why do you need %1?", # Probing the necessity of the stated need
            "Would it really help you to get %1?", # Questions the effectiveness of fulfilling the need
            "Are you sure you need %1?", # Challenges the certainty of the need
        ),
    ),
    (
        r"Why (don\'t|don t) you (.*)",
        (
            "Do you really think I don't %1?",  # Questions the assumption that the program does not perform the action
            "Perhaps eventually I will %1.",  # Implies potential future capability
            "Do you really want me to %1?",  # Inquires about the user's desire for the program to perform the action

        ),
    ),
    (
        r"Why ((can\'t|can t)|can t) I (.*)",
        (
            "Do you think you should be able to %1?",  # Asks about the user's belief in their ability
            "If you could %1, what would you do?",  # Encourages the user to consider the implications of being able to perform the action
            "I don't know -- why can't you %1?",  # Reflective question about the user's limitations
            "Have you really tried?",  # Suggests exploring whether the user has attempted the action

        ),
    ),
    (
        r"I can\'t (.*)",
        (
            "How do you know you can't %1?",  # Questions the certainty of the user's inability
            "Perhaps you could %1 if you tried.",  # Suggests potential capability with effort
            "What would it take for you to %1?",  # Inquires about the conditions necessary for the user to perform the action

        ),
    ),
    (
        r"I am (.*)",
        (
            "Did you come to me because you are %1?",  # Investigates the user's motive for stating their current state
            "How long have you been %1?",  # Inquires about the duration of the user's current state
            "How do you feel about being %1?",  # Asks about the user's emotions regarding their current state

        ),
    ),
    (
        r"(I\'m|I m) (.*)", # Different set of responses for the contraction of "I am". 
        (
            "How does being %1 make you feel?",  # Asks about the user's emotional response to their state
            "Do you enjoy being %1?",  # Questions the user's satisfaction with their current state
            "Why do you tell me you're %1?",  # Inquires about the user's motive for sharing their state
            "Why do you think you're %1?",  # Asks about the user's perception of their state

        ),
    ),
    (
        r"Are you (.*)",
        (
            "Why does it matter whether I am %1?",  # Questions the significance of the program's state
            "Would you prefer it if I were not %1?",  # Asks about the user's preference regarding the program's state
            "Perhaps you believe I am %1.",  # Acknowledges the user's perception of the program's state
            "I may be %1 -- what do you think?",  # Encourages the user to share their opinion on the program's state

        ),
    ),
    (
        r"What (.*)",
        (
            "Why do you ask?",  # Inquires about the user's motive for asking the question
            "How would an answer to that help you?",  # Encourages the user to reflect on the purpose of their question
            "What do you think?",  # Invites the user to share their thoughts or opinions
        ),
    ),
    (
        r"How (.*)",
        (
            "How do you suppose?",  # Encourages the user to speculate or consider possibilities
            "Perhaps you can answer your own question.",  # Suggests self-reflection or problem-solving
            "What is it you're really asking?",  # Encourages the user to clarify their question or intent
        ),
    ),
    (
        r"Because (.*)",
        (
            "Is that the real reason?",  # Questions the validity or depth of the stated reason
            "What other reasons come to mind?",  # Encourages the user to consider alternative explanations
            "Does that reason apply to anything else?",  # Explores the broader implications of the stated reason
            "If %1, what else must be true?",  # Prompts the user to consider additional logical consequences
        ),
    ),
    (
        r"(.*) sorry (.*)",
        (
            "There are many times when no apology is needed.",  # Challenges the necessity of an apology
            "What feelings do you have when you apologize?",  # Encourages the user to reflect on their emotions when apologizing
        ),
    ),
    (
        r"Hello(.*)",
        (
            "Hello... I'm glad you could drop by today.",  # Greets the user warmly
            "Hi there... how are you today?",  # Initiates friendly conversation
            "Hello, how are you feeling today?",  # Inquires about the user's emotional state
        ),
    ),
    (
        r"I think (.*)",
        (
            "Do you doubt %1?",  # Questions the certainty or confidence of the user's statement
            "Do you really think so?",  # Seeks confirmation or clarification of the user's belief
            "But you're not sure %1?",  # Challenges the user's certainty or explores potential doubt
        )
    ),
    (
        r"(.*) friend (.*)",
        (
            "Tell me more about your friends.",  # Invites the user to share information about their social connections
            "When you think of a friend, what comes to mind?",  # Encourages the user to reflect on their perceptions of friendship
            "Why don't you tell me about a childhood friend?",  # Prompts the user to reminisce about past relationships
        ),
    ),
    (
        r"Yes", 
        (
            "You seem quite sure.",  # Acknowledges the user's confidence in their statement
            "OK, but can you elaborate a bit?",  # Requests further explanation or detail
        )
    ),
    (
        r"(.*) computer(.*)",
        (
            "Are you really talking about me?",  # Questions if the user is referring to the program itself
            "Does it seem strange to talk to a computer?",  # Inquires about the user's perception of interacting with a computer
            "How do computers make you feel?",  # Asks about the user's emotional response to computers
            "Do you feel threatened by computers?",  # Explores the user's feelings of insecurity or intimidation towards computers
        ),
    ),
    (
        r"Is it (.*)",
        (
            "Do you think it is %1?",  # Asks for the user's opinion or belief regarding the stated condition
            "Perhaps it's %1 -- what do you think?",  # Encourages the user to consider alternative possibilities
            "If it were %1, what would you do?",  # Prompting the user to consider potential actions based on the stated condition
            "It could well be that %1.",  # Acknowledges the possibility of the stated condition being true
        ),
    ),
    (
        r"It is (.*)",
        (
            "You seem very certain.",  # Acknowledges the user's confidence in the stated condition
            "If I told you that it probably isn't %1, what would you feel?",  # Encourages the user to reflect on their emotions in response to potential contradiction
        ),
    ),
    (
        r"Can you (.*)",
        (
            "What makes you think I can't %1?",  # Asks for the reasoning behind the user's doubt in the program's capability
            "If I could %1, then what?",  # Encourages the user to consider the implications if the program could perform the stated action
            "Why do you ask if I can %1?",  # Inquires about the user's motive for questioning the program's ability
        ),
    ),
    (
        r"Can I (.*)",
        (
            "Perhaps you don't want to %1.",  # Suggests hesitation or reluctance on the user's part to perform the stated action
            "Do you want to be able to %1?",  # Inquires about the user's desire or intention to perform the action
            "If you could %1, would you?",  # Prompting the user to consider hypothetical scenarios regarding their ability to perform the action
        ),
    ),
    (
        r"You are (.*)",
        (
            "Why do you think I am %1?",  # Inquires about the user's perception or reasoning behind attributing a certain characteristic to the program
            "Does it please you to think that I'm %1?",  # Explores the user's emotional response to ascribing certain traits to the program
            "Perhaps you would like me to be %1.",  # Suggests the user may have preferences regarding the program's attributes
            "Perhaps you're really talking about yourself?",  # Suggests that the user's statement may reflect their own thoughts or feelings
        ),
    ),
    (
        r"You\'re (.*)",
        (
            "Why do you say I am %1?",  # Asks for clarification or justification for attributing a certain characteristic to the program
            "Why do you think I am %1?",  # Inquires about the user's perception or reasoning behind the statement
            "Are we talking about you, or me?",  # Prompts the user to consider the subject of the conversation
        ),
    ),
    (
        r"I (don\'t|don t) (.*)",
        (
             "Don't you really %1?", # Questions the user's assertion, but more importantly
             "Why don't you %1?", # This and the next response encourages further thought and exploration as to WHY the user doesn't do something
             "Do you want to %1?"
        ),
    ),
    (
        r"I feel (.*)",
        (
            "Good, tell me more about these feelings.",  # Encourages the user to elaborate on their emotions
            "Do you often feel %1?",  # Inquires about the frequency of the stated emotion
            "When do you usually feel %1?",  # Asks about the circumstances or triggers associated with the stated emotion
            "When you feel %1, what do you do?",  # Encourages the user to reflect on their coping mechanisms or behaviors in response to the stated emotion
        ),
    ),
    (
        r"I have (.*)",
        (
            "Why do you tell me that you've %1?",  # Inquires about the significance or motive behind sharing the information
            "Have you really %1?",  # Questions the veracity or certainty of the user's statement
            "Now that you have %1, what will you do next?",  # Prompts the user to consider the implications or future actions related to the stated possession
        ),
    ),
    (
        r"I would (.*)",
        (
            "Could you explain why you would %1?",  # Asks for clarification or justification for the stated intention
            "Why would you %1?",  # Inquires about the motive or reasoning behind the stated intention
            "Who else knows that you would %1?",  # Prompts the user to consider others who may be aware of or affected by their intention
        ),
    ),
    (
        r"Is there (.*)",
        (
            "Do you think there is %1?",  # Asks for the user's opinion or belief regarding the existence of the stated condition
            "It's likely that there is %1.",  # Expresses agreement or support for the possibility of the stated condition
            "Would you like there to be %1?",  # Inquires about the user's preference or desire regarding the existence of the stated condition
        ),
    ),
    (
        r"My (.*)",
        (
            "I see, your %1.",  # Acknowledges the user's possession or association with the stated attribute
            "Why do you say that your %1?",  # Inquires about the significance or motive behind the user's statement
            "When your %1, how do you feel?",  # Encourages the user to reflect on their emotions or experiences related to the stated attribute
        ),
    ),
    (
        r"You (.*)",
        (
            "We should be discussing you, not me.",  # Redirects the focus of the conversation back to the user
            "Why do you say that about me?",  # Inquires about the user's reasoning or motive behind their statement regarding the program
            "Why do you care whether I %1?",  # Questions the significance of the user's interest in the program's actions or behaviors
        ),
    ),
    (
        r"Why (.*)", 
        (
            "Why don't you tell me the reason why %1?", # Both responses encourages the user to explain their own rationale or motivation behind their question.
            "Why do you think %1?" # Kinda cheap but I guess it works
        )
    ),
    (
        r"I want (.*)",
        (
            "What would it mean to you if you got %1?",  # Inquires about the significance or importance of the desired outcome to the user
            "Why do you want %1?",  # Encourages the user to explore the motives or desires underlying their want
            "What would you do if you got %1?",  # Prompts the user to consider potential actions or implications associated with obtaining the desired outcome
            "If you got %1, then what would you do?",  # Encourages further reflection on the potential consequences or follow-up actions
        ),
    ),
    (
        r"(.*) mother(.*)",
        ( # It seems any mention of the user mother will be met with a request to elaborate on the user's relationship with their mother
            "Tell me more about your mother.",  # Encourages the user to share additional information about their mother
            "What was your relationship with your mother like?",  # Inquires about the dynamics and interactions within the user's relationship with their mother
            "How do you feel about your mother?",  # Asks about the user's emotions or sentiments regarding their mother
            "How does this relate to your feelings today?",  # Encourages the user to connect their past experiences or relationships with their current emotions
            "Good family relations are important.",  # Provides a general statement acknowledging the significance of positive familial connections
        ),
    ),
    (
        r"(.*) father(.*)",
        ( # Similar to the responses from when the mother is mentioned except for the last
            "Tell me more about your father.",
            "How did your father make you feel?",
            "How do you feel about your father?",
            "Does your relationship with your father relate to your feelings today?",
            "Do you have trouble showing affection with your family?", # Why is this response only in the father? Is it implying that the mention of a father tends to be problematic.
        ),
    ),
    (
        r"(.*) child(.*)",
        (
            "Did you have close friends as a child?",  # Inquires about the user's childhood social experiences
            "What is your favorite childhood memory?",  # Encourages the user to reminisce about positive childhood experiences
            "Do you remember any dreams or nightmares from childhood?",  # Explores the user's recollections of dreams or nightmares from their childhood
            "Did the other children sometimes tease you?",  # Raises the topic of childhood teasing or bullying
            "How do you think your childhood experiences relate to your feelings today?",  # Encourages the user to consider the connection between past experiences and current emotions
        ),
    ),
    (
        r"(.*)\?", # Matches any questions from the user. I was wondering why the above questions (why, how) didn't have question marks in the regex. That means these following responses are potential responses for why and how inputs as well.
        (
            "Why do you ask that?",  # Inquires about the user's motive or intention behind asking the question
            "Please consider whether you can answer your own question.",  # Encourages the user to reflect on whether they already possess the answer to their question
            "Perhaps the answer lies within yourself?",  # Suggests that the user may already know the answer to their question
            "Why don't you tell me?",  # Encourages the user to share their thoughts or insights on the topic of their question
        ),
    ),
    (
        r"quit",
        (
            "Thank you for talking with me.",  # Expresses gratitude for the conversation
            "Good-bye.",  # Offers a farewell
            "Thank you, that will be $150.  Have a good day!",  # Humorous response suggesting a fee for the conversation
        ),
    ),
    (
        r"(.*)", # The responses below are potential for any kind of input from the user. Probably a good catch all in case the use says anything unexpected.
        (
            "Please tell me more.",  # Encourages the user to provide additional information or details
            "Let's change focus a bit... Tell me about your family.",  # Redirects the conversation to a new topic
            "Can you elaborate on that?",  # Requests further explanation or clarification
            "Why do you say that %1?",  # Inquires about the reasoning behind the user's statement
            "I see.",  # Acknowledges the user's input
            "Very interesting.",  # Expresses interest or curiosity
            "%1.",  # Repeats the user's statement, but why include this at all?
            "I see.  And what does that tell you?",  # Encourages the user to reflect on the implications of their statement
            "How does that make you feel?",  # Inquires about the user's emotional response
            "How do you feel when you say that?",  # Asks about the user's emotional state during the conversation

        ),
    ),
)

We then initialize the new `NormalizedChat` class, which inherits from NLTK's `Chat` class.

In [6]:
eliza_chatbot = normalizedChat(pairs, reflections) 

In [7]:
def eliza_chat():
    print("Therapist\n---------")
    print("Talk to the program by typing in plain English, using normal upper-")
    print('and lower-case letters and punctuation.  Enter "quit" when done.')
    print("=" * 72)
    print("Hello.  How are you feeling today?")

    eliza_chatbot.converse()

def demo():
    eliza_chat()

In [8]:
if __name__ == "__main__":
    demo()

Therapist
---------
Talk to the program by typing in plain English, using normal upper-
and lower-case letters and punctuation.  Enter "quit" when done.
Hello.  How are you feeling today?


> I'm feeling tired today.


Stemmed user input:  i m feel tire today 

Why do you tell me you're you m?
Type count:  5 

Token count:  10 



> Because I stayed up to do homework all night


Stemmed user input:  becaus i stay up to do homework all night 

How do you feel when you say that?
Type count:  13 

Token count:  28 



> I feel a little proud that I got my work done, but question whether it was worth the lost sleep


Stemmed user input:  i feel a littl proud that i got my work done but question whether it was worth the lost sleep 

Do you often feel a littl proud that you got your work done but question whether it was worth the lost sleep?
Type count:  30 

Token count:  68 



> Yes. But I think I will only do prioritize work over sleep when I'm in school


Stemmed user input:  yes but i think i will onli do priorit work over sleep when i m in school 

You seem quite sure.
Type count:  39 

Token count:  102 



> You're right, I might prioritize my professional work after I graduate too


Stemmed user input:  you re right i might priorit my profession work after i graduat too 

Why do you care whether I re right you might priorit your profession work after you graduat too?
Type count:  47 

Token count:  128 



> Because I might be a workaholic. I want to always do good work


Stemmed user input:  becaus i might be a workahol i want to alway do good work 

Let's change focus a bit... Tell me about your family.
Type count:  52 

Token count:  154 



> I just moved out from home. I lived with my family for over 20 years


Stemmed user input:  i just move out from home i live with my famili for over 20 year 

I see.  And what does that tell you?
Type count:  63 

Token count:  184 



> I like living on my own, but I definitely need to move back to save up a little more


Stemmed user input:  i like live on my own but i definit need to move back to save up a littl more 

Can you elaborate on that?
Type count:  71 

Token count:  222 



> I want more financial stability so that I don't have to worry about whether I can make rent next month


Stemmed user input:  i want more financi stabil so that i don t have to worri about whether i can make rent next month 

If you got more financi stabil so that you don t have to worri about whether you can make rent next month, then what would you do?
Type count:  84 

Token count:  264 



> I wouldn't be as distracted while doing the things I love


Stemmed user input:  i wouldn t be as distract while do the thing i love 

Can you elaborate on that?
Type count:  90 

Token count:  288 



> Sometimes when I'm spending time with friends, my mood can get brought down because I think about my responsibilities


Stemmed user input:  sometim when i m spend time with friend my mood can get brought down becaus i think about my respons 

Why don't you tell me about a childhood friend?
Type count:  99 

Token count:  328 



> She's coming over for dinner next Tuesday


Stemmed user input:  she s come over for dinner next tuesday 

How does that make you feel?
Type count:  104 

Token count:  344 



> I'm excited. I haven't seen her in a few months


Stemmed user input:  i m excit i haven t seen her in a few month 

Why do you think you're you m?
Type count:  109 

Token count:  368 



> We've been friends for a long time, I will always be excited to see her


Stemmed user input:  we ve been friend for a long time i will alway be excit to see her 

Tell me more about your friends.
Type count:  114 

Token count:  400 



> I used to not have a lot of friends. Since coming to SFSU, that has changed a lot


Stemmed user input:  i use to not have a lot of friend sinc come to sfsu that has chang a lot 

Tell me more about your friends.
Type count:  122 

Token count:  436 



> quit


Stemmed user input:  quit 

Thank you for talking with me.
Type count:  123 

Token count:  438 



# Summary and My Thoughts

I have written a lot of what I thought about this implementation of Eliza throughout the entire notebook. One thing I would like to expand upon is that although I think the implementation is rudimentary, I was still able to have a somewhat deep conversation with it. Shown in the attached screenshots, I talked quite a lot about my sister who is a new international student. We did not grow up together as she grew up in Vietnam with my parents while I lived in the US with my grandparents. The questions and responses from Eliza were enough to get my to explore my thoughts and feelings on my relationship with my sister. Regardless of how simple or rudimentary this program is, it can be effective and I'm totally sold on it. I would have continued the conversation with Eliza about my sister but it seemed to have bugged out and just kept repeating what I was saying. 