### Simple Chat with Reflections
https://www.youtube.com/watch?v=FFT4p6me2g0

In [1]:
from nltk.chat.util import Chat, reflections

In [2]:
pairs = [['my name is marlena', ['hi marlena']]
         , ['my name is (.*)', ['hi %1']]
         , ['(hi|hello|hey|holla|hola)', ['hey there', 'hi there', 'haayyy']]
         , ['(.*) in (.*) is fun', ['%1 in %2 is indeed fun!']]
         , ['(.*)(location|city) ?', ['Tokyo, Japan']]
         , ['(.*) created you?', ['yuchild did using NLTK']]
         , ['(.*)help(.*)', ['I can help you!']]
         , ['how is the weather in (.*)', ['the weather in %1 is amazing like always']]
         , ['(.*) your name?', ['My name is Jarvis']]
        ]

In [3]:
my_reflections = {'go': 'gone'
                  , 'hello': 'hey there'
                 }

In [4]:
reflections

{'i am': 'you are',
 'i was': 'you were',
 'i': 'you',
 "i'm": 'you are',
 "i'd": 'you would',
 "i've": 'you have',
 "i'll": 'you will',
 'my': 'your',
 'you are': 'I am',
 'you were': 'I was',
 "you've": 'I have',
 "you'll": 'I will',
 'your': 'my',
 'yours': 'mine',
 'you': 'me',
 'me': 'you'}

In [5]:
chat = Chat(pairs, reflections)

In [6]:
# chat = Chat(pairs, my_reflections)
# chat.converse()
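
As a rough illustration of what `Chat` does with `pairs` and `reflections`, the sketch below imitates the idea using only the standard library: try each pattern in order, then fill `%1`, `%2`, ... with the captured groups after swapping pronouns. The helper names `reflect` and `respond` and the small `demo_*` tables are hypothetical; this is a simplified imitation, not NLTK's actual implementation.

```python
import re

# Hypothetical, trimmed-down tables for illustration only.
demo_reflections = {"i": "you", "my": "your", "your": "my", "you": "me", "me": "you"}
demo_pairs = [
    (r"my name is (.*)", "hi %1"),
    (r"(.*) in (.*) is fun", "%1 in %2 is indeed fun!"),
    (r"i like (.*)", "why do you like %1?"),
]

def reflect(fragment):
    """Swap first- and second-person words in a captured fragment."""
    return " ".join(demo_reflections.get(w, w) for w in fragment.lower().split())

def respond(text):
    """Return the response for the first pattern that matches, else None."""
    for pattern, template in demo_pairs:
        match = re.match(pattern, text, re.IGNORECASE)
        if match:
            # Replace %1, %2, ... with the reflected capture groups.
            return re.sub(r"%(\d)", lambda m: reflect(match.group(int(m.group(1)))), template)
    return None

print(respond("my name is Ada"))
print(respond("i like my dog"))
```

Because patterns are tried in order, a specific pattern such as `'my name is marlena'` has to come before the general `'my name is (.*)'` to ever be used.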

### Auto Chatbot
https://www.youtube.com/watch?v=QpMsT0WuIuI

In [7]:
# Description: This is a simple chatbot that answers questions from the text of a scraped article

In [8]:
# Install package NLTK

In [9]:
# ! pip install nltk

In [10]:
# Install package newspaper3k

In [11]:
# ! pip install newspaper3k

### Pipeline
1. Scrape a web page about chronic kidney disease
2. Build TF-IDF vectors from the article's sentences
3. Ask the chatbot questions and return the most similar sentence

In [12]:
from newspaper import Article
import random
import string
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import nltk
import numpy as np
import warnings
warnings.filterwarnings('ignore')

In [13]:
import ssl

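# Workaround for SSL certificate errors: this disables HTTPS certificate verification for the whole process
ssl._create_default_https_context = ssl._create_unverified_context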

In [14]:
# Download packages from NLTK
nltk.download('punkt', quiet=True)
nltk.download('wordnet', quiet=True)

True

In [15]:
# Get the article
article = Article('https://www.mayoclinic.org/diseases-conditions/chronic-kidney-disease/symptoms-causes/syc-20354521')
article.download()
article.parse()
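article.nlp()  # optional here: computes keywords and a summary; article.text only needs parse()
corpus = article.text

# Print the scraped article text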
print(corpus)

Overview

Chronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function. Your kidneys filter wastes and excess fluids from your blood, which are then excreted in your urine. When chronic kidney disease reaches an advanced stage, dangerous levels of fluid, electrolytes and wastes can build up in your body.

In the early stages of chronic kidney disease, you may have few signs or symptoms. Chronic kidney disease may not become apparent until your kidney function is significantly impaired.

Treatment for chronic kidney disease focuses on slowing the progression of the kidney damage, usually by controlling the underlying cause. Chronic kidney disease can progress to end-stage kidney failure, which is fatal without artificial filtering (dialysis) or a kidney transplant.

Chronic kidney disease care at Mayo Clinic

How kidneys work

Symptoms

Signs and symptoms of chronic kidney disease develop over time if kidney damage progresses slowly. Signs an

In [16]:
# Tokenization 
text = corpus
sent_tokens = nltk.sent_tokenize(text) # convert text to list of sentences

# print the list of sentences 
print(sent_tokens)

['Overview\n\nChronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function.', 'Your kidneys filter wastes and excess fluids from your blood, which are then excreted in your urine.', 'When chronic kidney disease reaches an advanced stage, dangerous levels of fluid, electrolytes and wastes can build up in your body.', 'In the early stages of chronic kidney disease, you may have few signs or symptoms.', 'Chronic kidney disease may not become apparent until your kidney function is significantly impaired.', 'Treatment for chronic kidney disease focuses on slowing the progression of the kidney damage, usually by controlling the underlying cause.', 'Chronic kidney disease can progress to end-stage kidney failure, which is fatal without artificial filtering (dialysis) or a kidney transplant.', 'Chronic kidney disease care at Mayo Clinic\n\nHow kidneys work\n\nSymptoms\n\nSigns and symptoms of chronic kidney disease develop over time if kidney damage

In [17]:
# Build a translation table that maps each punctuation character's code point (from ord) to None, so str.translate deletes it
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

# Print the punctuation characters
print(string.punctuation)

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~


In [18]:
print(remove_punct_dict)

{33: None, 34: None, 35: None, 36: None, 37: None, 38: None, 39: None, 40: None, 41: None, 42: None, 43: None, 44: None, 45: None, 46: None, 47: None, 58: None, 59: None, 60: None, 61: None, 62: None, 63: None, 64: None, 91: None, 92: None, 93: None, 94: None, 95: None, 96: None, 123: None, 124: None, 125: None, 126: None}
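
The table maps each punctuation code point to `None`, which `str.translate` treats as "delete this character". A minimal sketch with a hypothetical sample string; `str.maketrans` is shown as an equivalent, slightly shorter way to build the same table.

```python
import string

# Same table as in the notebook: code point -> None means "drop this character".
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

# str.maketrans with a third argument builds an equivalent deletion table.
alt_table = str.maketrans("", "", string.punctuation)

sample = "What is chronic kidney disease?"  # hypothetical input
cleaned = sample.lower().translate(remove_punct_dict)
print(cleaned)
```

Note that hyphens are removed too, so a term like "end-stage" becomes the single token "endstage".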


In [19]:
# Create a function to return a list of lower-case word tokens after removing punctuation
# (despite its name, this function does not lemmatize)
def LemNormalize(text):
    return nltk.word_tokenize(text.lower().translate(remove_punct_dict))

# Print the tokenized text
print(LemNormalize(text))

['overview', 'chronic', 'kidney', 'disease', 'also', 'called', 'chronic', 'kidney', 'failure', 'describes', 'the', 'gradual', 'loss', 'of', 'kidney', 'function', 'your', 'kidneys', 'filter', 'wastes', 'and', 'excess', 'fluids', 'from', 'your', 'blood', 'which', 'are', 'then', 'excreted', 'in', 'your', 'urine', 'when', 'chronic', 'kidney', 'disease', 'reaches', 'an', 'advanced', 'stage', 'dangerous', 'levels', 'of', 'fluid', 'electrolytes', 'and', 'wastes', 'can', 'build', 'up', 'in', 'your', 'body', 'in', 'the', 'early', 'stages', 'of', 'chronic', 'kidney', 'disease', 'you', 'may', 'have', 'few', 'signs', 'or', 'symptoms', 'chronic', 'kidney', 'disease', 'may', 'not', 'become', 'apparent', 'until', 'your', 'kidney', 'function', 'is', 'significantly', 'impaired', 'treatment', 'for', 'chronic', 'kidney', 'disease', 'focuses', 'on', 'slowing', 'the', 'progression', 'of', 'the', 'kidney', 'damage', 'usually', 'by', 'controlling', 'the', 'underlying', 'cause', 'chronic', 'kidney', 'disease'

In [20]:
# Keyword Matching...

# Greeting Inputs from User

greet_in = ['hi', 'hello', 'hola', 'greetings', 'whats up', 'wassup', 'hey']

# Greeting Responses to User
greet_out = ['howdy', 'hi', 'hey', 'whats good', 'hello', 'hey there']

# Make a function to return a random greeting in response to a user greeting
def greeting(sentence):
    # if the user input is a greeting then return random greeting response
    for word in sentence.split():
        if word.lower() in greet_in:
            return random.choice(greet_out)

greeting('hello')

'hey there'
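
Because `greeting` checks one whitespace-separated word at a time, a two-word entry such as 'whats up' can never match, and trailing punctuation ("hello!") also blocks a match. The variant below is one possible way to handle both cases; the function name `greeting_phrases` and the `rng` parameter are hypothetical additions, not part of the original notebook.

```python
import random
import string

greet_in = ['hi', 'hello', 'hola', 'greetings', 'whats up', 'wassup', 'hey']
greet_out = ['howdy', 'hi', 'hey', 'whats good', 'hello', 'hey there']

def greeting_phrases(sentence, rng=random):
    """Return a random greeting if the input contains a known greeting, else None."""
    # Lower-case and strip punctuation, then pad with spaces so that
    # whole words and whole phrases are matched, not substrings.
    cleaned = sentence.lower().translate(str.maketrans('', '', string.punctuation))
    padded = f" {' '.join(cleaned.split())} "
    for phrase in greet_in:
        if f" {phrase} " in padded:
            return rng.choice(greet_out)
    return None

print(greeting_phrases("Whats up, Jarvis?"))      # one of greet_out
print(greeting_phrases("Tell me about kidneys"))  # None
```

The space padding keeps 'hi' from matching inside words such as "high".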

In [21]:
sent_tokens

['Overview\n\nChronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function.',
 'Your kidneys filter wastes and excess fluids from your blood, which are then excreted in your urine.',
 'When chronic kidney disease reaches an advanced stage, dangerous levels of fluid, electrolytes and wastes can build up in your body.',
 'In the early stages of chronic kidney disease, you may have few signs or symptoms.',
 'Chronic kidney disease may not become apparent until your kidney function is significantly impaired.',
 'Treatment for chronic kidney disease focuses on slowing the progression of the kidney damage, usually by controlling the underlying cause.',
 'Chronic kidney disease can progress to end-stage kidney failure, which is fatal without artificial filtering (dialysis) or a kidney transplant.',
 'Chronic kidney disease care at Mayo Clinic\n\nHow kidneys work\n\nSymptoms\n\nSigns and symptoms of chronic kidney disease develop over time if kidney

In [22]:
# Input from user
user_in = 'What is chronic kidney disease?'
user_in = user_in.lower() # make user_in lower case

### Print User response:
print(user_in)

what is chronic kidney disease?


In [23]:
# Set the chatbot response to an empty string
robo_out = ''

# Append the user's input to the sentence list
sent_tokens.append(user_in)

### Print the sentence list after appending user_in
print(sent_tokens)

['Overview\n\nChronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function.', 'Your kidneys filter wastes and excess fluids from your blood, which are then excreted in your urine.', 'When chronic kidney disease reaches an advanced stage, dangerous levels of fluid, electrolytes and wastes can build up in your body.', 'In the early stages of chronic kidney disease, you may have few signs or symptoms.', 'Chronic kidney disease may not become apparent until your kidney function is significantly impaired.', 'Treatment for chronic kidney disease focuses on slowing the progression of the kidney damage, usually by controlling the underlying cause.', 'Chronic kidney disease can progress to end-stage kidney failure, which is fatal without artificial filtering (dialysis) or a kidney transplant.', 'Chronic kidney disease care at Mayo Clinic\n\nHow kidneys work\n\nSymptoms\n\nSigns and symptoms of chronic kidney disease develop over time if kidney damage

In [24]:
# Create a TfidfVectorizer object
tf_vec = TfidfVectorizer(tokenizer = LemNormalize, stop_words = 'english')

# Convert text to a matrix of TF-IDF features
tfidf = tf_vec.fit_transform(sent_tokens)

### Print the sparse TF-IDF matrix: one row per sentence, with the user input as the last row
print(tfidf)

  (0, 104)	0.21517271105967337
  (0, 142)	0.31781973567163924
  (0, 109)	0.36025299997679194
  (0, 64)	0.36025299997679194
  (0, 90)	0.31781973567163924
  (0, 31)	0.2643601450098346
  (0, 72)	0.1426325666011141
  (0, 133)	0.37394360688199496
  (0, 45)	0.3636909501497882
  (0, 173)	0.36025299997679194
  (1, 247)	0.29621491649318066
  (1, 88)	0.40366263341462977
  (1, 22)	0.24110051322517356
  (1, 101)	0.40366263341462977
  (1, 87)	0.40366263341462977
  (1, 257)	0.35611626124035106
  (1, 97)	0.40366263341462977
  (1, 134)	0.274835201145623
  (2, 23)	0.28616745795373816
  (2, 29)	0.32437471199107215
  (2, 80)	0.32437471199107215
  (2, 99)	0.2590589798027162
  (2, 138)	0.28616745795373816
  (2, 62)	0.32437471199107215
  (2, 221)	0.32437471199107215
  :	:
  (20, 206)	0.16848899614087914
  (20, 131)	0.0930600200686559
  (20, 185)	0.07181971765373221
  (20, 113)	0.07740664797917175
  (20, 144)	0.0930600200686559
  (20, 111)	0.16848899614087914
  (20, 229)	0.0930600200686559
  (20, 63)	0.18612
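
Each printed line reads `(sentence index, vocabulary index)  weight`. As a rough sketch of how such weights arise, the code below computes TF-IDF by hand for a toy corpus, using the smoothed formula that scikit-learn documents as its default, idf = ln((1 + n) / (1 + df)) + 1, followed by L2 normalisation of each row. The toy documents and the function name `tfidf_rows` are hypothetical, and tokenization is a plain `split`, which is simpler than `LemNormalize`.

```python
import math
from collections import Counter

def tfidf_rows(docs):
    """Return (vocabulary, rows) with L2-normalised TF-IDF weights per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        raw = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in raw)) or 1.0
        rows.append([w / norm for w in raw])
    return vocab, rows

docs = ["kidney disease", "kidney function", "blood pressure"]  # toy corpus
vocab, rows = tfidf_rows(docs)
for term, weight in zip(vocab, rows[0]):
    print(f"{term:10s} {weight:.3f}")
```

In the first toy document, "disease" occurs in fewer documents than "kidney", so it receives the larger weight.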

In [25]:
# Get measure of similarity (score)
vals = cosine_similarity(tfidf[-1], tfidf) # compare the user input (last row) with every row, including itself

### Print similarity scores (0-1)
print(vals)

[[0.50685596 0.         0.23643029 0.3919396  0.43337518 0.29784479
  0.37059411 0.43398704 0.10962133 0.         0.15811599 0.15010381
  0.         0.17344031 0.         0.05161854 0.14446323 0.1493211
  0.41960848 0.36629263 0.12003449 1.        ]]
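
Each entry is the cosine of the angle between the query vector and one sentence vector: 0 means no shared terms, and 1 means the same direction. The last entry is 1 because the query was appended to `sent_tokens` and is being compared with itself. A minimal pure-Python sketch of the formula, with hypothetical toy vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

query = [1.0, 1.0, 0.0]       # toy vectors, not taken from the notebook
same_terms = [2.0, 2.0, 0.0]  # same direction, different length
no_overlap = [0.0, 0.0, 3.0]  # shares no terms with the query

print(cosine(query, same_terms))
print(cosine(query, no_overlap))
```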


In [26]:
# Get the index of the sentence most similar to the user's input
idx = vals.argsort()[0][-2] # [0] because vals is 2-D; -1 is user_in itself (score 1), so -2 is the best real match

# Flatten vals from a 2-D array to a 1-D array
val_flat = vals.flatten()

# Sort val_flat in ascending order
val_flat.sort()

# Get the highest similarity score other than user_in itself
score = val_flat[-2] # -1 is the user_in, -2 is the top score

### Print similarity score
print(f'Similarity Score: {score}')

Similarity Score: 0.5068559627834549
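
`argsort` returns indices in ascending order of score, so position `-1` is the query matched against itself and `-2` is the best real match. A small sketch with a hypothetical score array; it also shows that `np.sort` yields the sorted values without an in-place `sort()` call.

```python
import numpy as np

# Hypothetical similarity row: the last entry is the query compared with itself.
vals = np.array([[0.5, 0.0, 0.2, 1.0]])

order = vals.argsort()[0]      # indices from lowest to highest score
idx = order[-2]                # best match other than the query itself
score = np.sort(vals[0])[-2]   # second-highest score; vals is left unchanged

print(order, idx, score)
```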


In [27]:
# If the score is 0, no sentence shares any term with the user's input
if (score == 0):
    robo_out = robo_out + " I apologize, I don't understand." 
else:
    robo_out = robo_out + sent_tokens[idx]
    
### Print robo_out
print(robo_out)

Overview

Chronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function.


In [28]:
# Remove user_in
sent_tokens.remove(user_in)
print(sent_tokens)

['Overview\n\nChronic kidney disease, also called chronic kidney failure, describes the gradual loss of kidney function.', 'Your kidneys filter wastes and excess fluids from your blood, which are then excreted in your urine.', 'When chronic kidney disease reaches an advanced stage, dangerous levels of fluid, electrolytes and wastes can build up in your body.', 'In the early stages of chronic kidney disease, you may have few signs or symptoms.', 'Chronic kidney disease may not become apparent until your kidney function is significantly impaired.', 'Treatment for chronic kidney disease focuses on slowing the progression of the kidney damage, usually by controlling the underlying cause.', 'Chronic kidney disease can progress to end-stage kidney failure, which is fatal without artificial filtering (dialysis) or a kidney transplant.', 'Chronic kidney disease care at Mayo Clinic\n\nHow kidneys work\n\nSymptoms\n\nSigns and symptoms of chronic kidney disease develop over time if kidney damage
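
The cells above can be folded into a single function so that the question is never left behind in the sentence list. This is a sketch under assumptions: the function name `respond` and the toy sentences are illustrative, and scikit-learn's default tokenizer is used instead of `LemNormalize` to keep the block self-contained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def respond(user_in, sentences):
    """Return the sentence most similar to user_in, or a fallback message."""
    # Work on a copy so the caller's list is not modified.
    docs = list(sentences) + [user_in.lower()]
    tfidf = TfidfVectorizer(stop_words='english').fit_transform(docs)
    # Compare the query (last row) with the corpus sentences only.
    scores = cosine_similarity(tfidf[-1], tfidf[:-1]).flatten()
    best = scores.argmax()
    if scores[best] == 0:
        return "I apologize, I don't understand."
    return sentences[best]

# Toy sentences for illustration; in the notebook this would be sent_tokens.
toy_sentences = [
    "Kidneys filter wastes from the blood.",
    "Dialysis is an artificial filtering treatment.",
    "Early stages may show few symptoms.",
]
print(respond("What is dialysis?", toy_sentences))
print(respond("Tell me a joke", toy_sentences))
```

Comparing against `tfidf[:-1]` leaves the query out of the candidates, so the best match is simply the maximum and no `-2` indexing is needed.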