### Simple IR-based Chatbot

In [1]:
# April 2019
# Source: https://medium.com/analytics-vidhya/building-a-simple-chatbot-in-python-using-nltk-7c8c8215ac6e

In [2]:
#-*- coding: utf-8 -*-

In [3]:
%autosave 300

Autosaving every 300 seconds


In [4]:
import nltk
import numpy as np
import random
import string # to process standard python strings
import re
print('The nltk version is {}.'.format(nltk.__version__))

The nltk version is 3.4.


In [5]:
nltk.download('punkt') # first-time use only
nltk.download('wordnet') # first-time use only
nltk.download('stopwords') # first-time use only; needed for nltk.corpus.stopwords below

[nltk_data] Downloading package punkt to /Users/sohyun/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /Users/sohyun/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

### Sample Therapist Responses 

The therapist responses are loosely compiled from a collection of Carl Rogers' session transcripts.

In [6]:
# use a context manager so the file is closed after reading
with open('data/rogers-target.txt','r',errors = 'ignore') as f:
    raw=f.readlines()

In [7]:
raw[0] # contains weird characters due to encoding problems 

' And I get several feelings there. One is that you‚Äôre a little fearful that I will think, ‚ÄúOh, that‚Äôs not an important problem.‚Äù And another one is that, uh, uhm, you have a very definite goal for yourself and sometimes you achieve it and feel good, but then it cycles that‚Äôs very discouraging. And I was interested that when you reach as much weight as you have now, it‚Äôs not just a feeling of, of, uh, dissatisfaction, uh, you use a stronger word than that, uh, despair. Did you use despair?\n'

In [8]:
# due to rough preprocessing and encoding problems,
# do some decluttering  
data=[]
for r in raw:
    r=r.replace("‚Äô", "'")
    r=r.replace("‚Äú", '"')
    r=r.replace("‚Äù", "")
    r=r.replace("‚Ä¶", "")
    r=r.replace('"', "")
    r=r.replace("  ", " ")
    r=r.replace("\n", "")
    r=r.replace("\x1e", "")
    r=r.lstrip(" ") # drop all leading spaces (the old elif branches could never run)
    data.append(r)

In [9]:
len(data)

8509
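
The debris in `raw[0]` ("‚Äô", "‚Äú", "‚Äù", "‚Ä¶") has the shape of UTF-8 bytes that were decoded as Mac Roman at some point. Under that assumption, which the notebook itself does not confirm, the original characters can be restored with a round trip instead of being deleted one pattern at a time. The helper name `repair_mojibake` is hypothetical; this is a minimal sketch, not the cleaning used above.

```python
def repair_mojibake(text):
    """Undo 'UTF-8 bytes read as Mac Roman' damage; return text unchanged on failure."""
    try:
        # Re-encode to the bytes that were misread, then decode them as UTF-8
        return text.encode("mac_roman").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake (or only partly damaged): leave it alone
        return text


sample = "One is that you‚Äôre a little fearful"
print(repair_mojibake(sample))
```

Unlike the replace chain, this keeps curly quotes and ellipses in the text, so a later step would still have to decide whether to normalize or drop them.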

In [10]:
sent_tokens=[]
for d in data:
    temp=nltk.sent_tokenize(d)
    for t in temp:
        t=t.lstrip(" ") # drop all leading spaces (the old elif branches could never run)
        sent_tokens.append(t) # converts to list of sentences 
        
sent_tokens=[s for s in sent_tokens if not s.startswith(".")]

In [11]:
sent_tokens[:5]

['And I get several feelings there.',
 "One is that you're a little fearful that I will think, Oh, that's not an important problem.",
 "And another one is that, uh, uhm, you have a very definite goal for yourself and sometimes you achieve it and feel good, but then it cycles that's very discouraging.",
 "And I was interested that when you reach as much weight as you have now, it's not just a feeling of, of, uh, dissatisfaction, uh, you use a stronger word than that, uh, despair.",
 'Did you use despair?']

In [12]:
word_tokens=[]
for d in data:
    temp=nltk.word_tokenize(d)
    for t in temp:
        word_tokens.append(t) # converts to list of words

In [13]:
word_tokens[:10]

['And', 'I', 'get', 'several', 'feelings', 'there', '.', 'One', 'is', 'that']

In [14]:
from collections import Counter
count=Counter(sent_tokens)
count.most_common(20)
random.choice(count.most_common(20))[0] # pick one of the 20 most common therapist sentences at random

'Umm-hmm.'
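
The fallback idea in the cell above can be wrapped in a small function. The sketch below uses only the standard library; the name `common_reply`, its parameters, and the toy sentence list are assumptions made for illustration.

```python
import random
from collections import Counter


def common_reply(sentences, top_n=20, rng=random):
    """Pick one of the top_n most frequent sentences, uniformly at random."""
    most_common = Counter(sentences).most_common(top_n)  # list of (sentence, count)
    sentence, _count = rng.choice(most_common)
    return sentence


# Toy data standing in for sent_tokens
toy_sentences = ["Umm-hmm."] * 3 + ["I see."] * 2 + ["Did you use despair?"]
print(common_reply(toy_sentences, top_n=2))
```

Note that the choice is uniform over the top entries, so the single most frequent sentence is no more likely to be picked than the twentieth.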

In [15]:
lemmer = nltk.stem.WordNetLemmatizer()
#WordNet is a semantically-oriented dictionary of English included in NLTK.
def LemTokens(tokens):
    # given a list of tokens, lemmatize each token
    return [lemmer.lemmatize(token) for token in tokens]
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)
def LemNormalize(text):
    return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))
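
To see what the normalization step does before lemmatization, the following standard-library sketch reproduces only the lowercasing and punctuation removal. It splits on whitespace instead of calling `nltk.word_tokenize` and does no lemmatization, so it is an approximation of `LemNormalize`, and the name `simple_normalize` is hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def simple_normalize(text):
    """Lowercase, strip punctuation, split on whitespace (no lemmatization)."""
    return text.lower().translate(PUNCT_TABLE).split()


print(simple_normalize("Did you use despair?"))
print(simple_normalize("It's not just a feeling of, uh, dissatisfaction."))
```

Because apostrophes count as punctuation, contractions such as "don't" come out as "dont". Any stop-word list that still contains the apostrophe form will not match those tokens.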

In [16]:
GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up","hey",)
GREETING_RESPONSES = ["Hello.", "Hi.", "*nods*", "Hello there.", "How are you?", "Good to see you today."]
def greeting(sentence):
    for word in sentence.split():
        if word.lower() in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)
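
As written, `greeting` compares single whitespace-separated words, so the two-word entry "what's up" and inputs with trailing punctuation such as "Hey!" cannot match. One possible variant, sketched here under the hypothetical name `match_greeting`, strips surrounding punctuation and matches multi-word greetings as phrases.

```python
import random
import string

GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREETING_RESPONSES = ["Hello.", "Hi.", "*nods*", "Hello there.",
                      "How are you?", "Good to see you today."]


def match_greeting(sentence):
    """Return a random greeting reply if the sentence contains a greeting, else None."""
    # Strip punctuation around each word but keep inner apostrophes ("what's")
    words = [w.strip(string.punctuation) for w in sentence.lower().split()]
    padded = " " + " ".join(words) + " "
    for greeting_text in GREETING_INPUTS:
        # Padding with spaces makes this a whole-word (or whole-phrase) match
        if " " + greeting_text + " " in padded:
            return random.choice(GREETING_RESPONSES)
    return None


print(match_greeting("What's up?"))
print(match_greeting("This is fine"))  # "this" contains "hi" but is not a greeting
```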

In [17]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [18]:
stops = nltk.corpus.stopwords.words('english')
stops.append('"') # list.append works in place and returns None, so do not reassign
def response(user_response):
    # Fit TF-IDF on the corpus plus the user's input (the input is the last row).
    # sent_tokens itself is not modified, so earlier inputs never become candidate replies.
    TfidfVec = TfidfVectorizer(tokenizer=LemNormalize, stop_words=stops)
    tfidf = TfidfVec.fit_transform(sent_tokens + [user_response])
    # Compare the input against the corpus sentences only, never against itself
    vals = cosine_similarity(tfidf[-1], tfidf[:-1]).flatten()
    idx = vals.argmax()
    req_tfidf = vals[idx]
    if req_tfidf == 0:
        # no overlapping terms at all
        return "My apologies. I didn't quite get that."
    elif req_tfidf < 0.5:
        # weak match: fall back to one of the most common therapist sentences
        return random.choice(count.most_common(20))[0]
    else:
        return sent_tokens[idx]
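
The retrieval step can be illustrated without scikit-learn. The sketch below computes plain `tf * log(N / df)` weights and cosine similarity by hand. `TfidfVectorizer` applies smoothing and normalization by default, so its scores will differ from these; the point is only to show the mechanics. All function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (plain tf * idf, no smoothing)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, corpus):
    """Return (sentence, score) for the corpus sentence most similar to query."""
    vectors = tfidf_vectors(corpus + [query])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, vec) for vec in vectors[:-1]]
    best = max(range(len(corpus)), key=scores.__getitem__)
    return corpus[best], scores[best]


toy_corpus = ["i feel afraid of the dark",
              "tell me about your family",
              "the weather is nice"]
print(best_match("my family worries me", toy_corpus))
```

A score of 0 means the query shares no weighted term with any corpus sentence, which corresponds to the apology branch in `response`.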

### Chatbot 

The therapist's scripted prompts are hard-coded as `print` calls. <br>
For more varied prompts, one could generate them with a dedicated therapist-response function.
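
One way such a therapist-response function might look is sketched below. The stage names, the `PROMPTS` table, and the alternative wordings are all hypothetical; only the first wording in each list comes from the script in this notebook.

```python
import random

# Hypothetical paraphrases per conversation stage; {name} is filled in at call time
PROMPTS = {
    "feelings": [
        "Could you detail the feelings you had, {name}?",
        "What feelings came up for you, {name}?",
    ],
    "future": [
        "What is your hope for the future?",
        "Looking ahead, what do you hope for?",
    ],
}


def therapist_prompt(stage, name, rng=random):
    """Return a randomly chosen prompt for the given stage."""
    template = rng.choice(PROMPTS[stage])
    return template.format(name=name)


print("Rogers: " + therapist_prompt("feelings", "Gloria"))
```

Each hard-coded question string in the session script could then be replaced by a call such as `therapist_prompt("feelings", username)`.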

In [19]:
from time import sleep
# loading = 'LOADING...'
# for i in range(10):
#     print(loading[i], sep=' ', end=' ', flush=True); 

In [21]:
flag=True
username=input("What is your name?  ")
sleep(1.0)
print("\n")
print("Rogers: Hi %s, I'm Rogers. I'm here to help you write about how you feel. If you're ready, type 'OK' to begin." % (username))
user_response0 = input("You: ")
sleep(0.5)
if user_response0.lower()!="ok":
    print("\n")
    print("Rogers: I guess it wasn't quite the right time. You can come back anytime.")
    flag=False
else:
    print("\n")
    print("Rogers: Now, tell me about anything that caused you a significant emotional upheaval.")
    user_response1 = input("You: ")
    print("\n")
    print("Rogers: "+response(user_response1)+ " Could you detail the feelings you had, %s?" % (username))
    user_response2 = input("You: ")
    sleep(0.5)
    print("\n")
    print("Rogers: "+response(user_response2)+ " Would it be possible that this event has anything to do with your relationships with others, such as family and friends?")
    user_response3 = input("You: ")
    sleep(0.5)
    print("\n")
    print("Rogers: "+response(user_response3)+ " Hmm. How about this? Think about how this event may relate to your past.")
    user_response4 = input("You: ")
    sleep(0.5)
    print("\n")
    print("Rogers: "+response(user_response4)+ " Then how does this event affect how you feel about who you are now?")
    user_response5 = input("You: ")
    sleep(0.5)
    print("\n")
    print("Rogers: "+response(user_response5)+ " What is your hope for the future?")
    user_response6 = input("You: ")
    sleep(0.5)
    print("\n")
    print("Rogers: "+response(user_response6)+ " Thank you for sharing, %s. Please review what you've shared with me today: " % (username))
    sleep(2.5)
    print("\n")
    print("*"*115)
    print(user_response1+" "+ user_response2+" "+ user_response3+" "+ user_response4+" "+ user_response5+" "+ user_response6)
    print("*"*115)
    print("\n")
    print("Rogers: You've made one big step today. If you're done reviewing, please enter 'DONE'.")
    user_response7 = input("You: ")
    if user_response7.lower()!="done":
        sleep(0.5)
        print("Rogers: I guess something went wrong. You may now exit.")
        flag=False
    else:
        print("\n")
        sleep(0.5)
        print("Rogers: Alright. Take care, %s." % (username))
        flag=False

What is your name?  Gloria


Rogers: Hi Gloria, I'm Rogers. I'm here to help you write about how you feel. If you're ready, type 'OK' to begin.
You: OK


Rogers: Now, tell me about anything that caused you a significant emotional upheaval.
You: Well, I lied to my daughter, Pam, about the man I am dating and it bothers me so much that I can't focus on anything.


Rogers: Is that...? Could you detail the feelings you had, Gloria?
You: It's like... I feel suffocated. I am so afraid that Pam will find out one day and will be greatly disappointed at me. I can't bear to be a bad mom for her because she doesn't have a father.


Rogers: O.K. Would it be possible that this event has anything to do with your relationships with others, such as family and friends?
You: I guess some of it could be her father's fault in that he wasn't a responsible man to raise a family. But I chose him so...


Rogers: I see. Hmm. How about this? Think about how this event may relate to your past.
You: I don't think