# Building a Simple Chatbot from Scratch in Python (with NLTK)

The history of chatbots goes back to 1966, when Joseph Weizenbaum created a computer program called ELIZA. It imitated the language of a psychotherapist in only 200 lines of code. See: [Eliza](http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm?utm_source=ubisend.com&utm_medium=blog-link&utm_campaign=ubisend).

Like Weizenbaum, we will build a very simple chatbot, but one that uses Python's NLTK library. It is a very basic bot with hardly any cognitive abilities, but it is a small first step into NLP and a way to get to know chatbots.

## Importing modules

In [1]:
import io
import random
import string # to process standard python strings
import warnings
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
warnings.filterwarnings('ignore') # silence library warnings in the notebook output

## Importing NLTK

In [2]:
import nltk
from nltk.stem import WordNetLemmatizer
nltk.download('popular', quiet=True) # for downloading packages
#nltk.download('punkt') # first-time use only
#nltk.download('wordnet') # first-time use only

True

## Reading in your own text, e.g.:

Bostrom's Superintelligence and Yuval Harari's A Brief History of Humankind

In [None]:
with open('./data/bostrom-harari.txt', 'r', errors='ignore') as f:
    raw = f.read()
raw = raw.lower() # converts to lowercase

## Tokenisation

In [4]:
sent_tokens = nltk.sent_tokenize(raw)# converts to list of sentences 
word_tokens = nltk.word_tokenize(raw)# converts to list of words
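
To make the two tokenization steps concrete without the NLTK data files, here is a rough sketch using only regular expressions. The helper names `naive_sent_tokenize` and `naive_word_tokenize` are made up for this illustration; they are not NLTK functions and they do not handle abbreviations, contractions, or other cases that `nltk.sent_tokenize` and `nltk.word_tokenize` are designed for.

```python
import re

def naive_sent_tokenize(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def naive_word_tokenize(text):
    """Return runs of word characters and single punctuation marks as tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

sample = "Nobody knows. They were all human beings."
print(naive_sent_tokenize(sample))           # list of sentences
print(naive_word_tokenize("Nobody knows."))  # list of words and punctuation
```

The sentence list is what the bot later searches for an answer; the word list is only needed if you want to inspect the vocabulary.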

## Preprocessing

Define a function named `LemTokens` that takes a list of tokens and returns them **normalized** (lemmatized).

In [5]:
lemmer = nltk.stem.WordNetLemmatizer()
#WordNet is a semantically-oriented dictionary of English included in NLTK.
def LemTokens(tokens):
    return [lemmer.lemmatize(token) for token in tokens]
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

def LemNormalize(text):
    return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))
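
The punctuation removal inside `LemNormalize` relies on `str.translate` with a table that maps each punctuation code point to `None`. The following small sketch isolates just that step, without lemmatization; `strip_punct` is a name chosen for this example only.

```python
import string

# Same construction as remove_punct_dict above: code point -> None means "delete".
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

def strip_punct(text):
    """Lowercase the text and delete every ASCII punctuation character."""
    return text.lower().translate(remove_punct_dict)

print(strip_punct("Hello, World! What's up?"))  # hello world whats up
```

Note that the apostrophe is removed as well, so contractions such as "what's" collapse into "whats" before tokenization.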

## Keyword matching

Greetings

In [6]:
GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREETING_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello", "I am glad! You are talking to me"]
def greeting(sentence):
    # Return a random greeting if any word of the sentence is a known greeting, else None
    for word in sentence.split():
        if word.lower() in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)
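
One limitation of matching word by word: because the sentence is split on whitespace, the two-word entry "what's up" can never match, and an input such as "Hi!" is missed because of the trailing punctuation. A possible variant is sketched below; `greeting_normalized` is a hypothetical name, not part of the original notebook, and it is only one of several ways to handle this.

```python
import random
import string

# Same greeting lists as in the cell above.
GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREETING_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello", "I am glad! You are talking to me"]

# Delete punctuation except the apostrophe, so "what's up" stays intact.
_PUNCT = str.maketrans("", "", string.punctuation.replace("'", ""))

def greeting_normalized(sentence):
    """Hypothetical variant of greeting(): also matches 'Hi!' and 'what's up?'."""
    cleaned = " ".join(sentence.lower().translate(_PUNCT).split())
    padded = f" {cleaned} "
    for phrase in GREETING_INPUTS:
        # Padding with spaces gives whole-word (or whole-phrase) matching.
        if f" {phrase} " in padded:
            return random.choice(GREETING_RESPONSES)
    return None
```

With this variant, `greeting_normalized("what's up?")` returns one of the canned responses, while `greeting_normalized("this is high")` still returns `None` because "hi" only occurs inside a longer word.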

## Generating Response

To generate a response to the user's input, the bot uses the concept of document similarity. We define a function `response` that adds the user's utterance to the list of corpus sentences, converts all of them to TF-IDF vectors, and computes the cosine similarity between the utterance and every sentence. It returns the most similar corpus sentence. If no sentence shares a term with the input (the best similarity is 0), it returns: "I am sorry! I don't understand you"

In [7]:
def response(user_response):
    # Temporarily add the user's input as the last "document"; the main loop removes it again.
    sent_tokens.append(user_response)
    TfidfVec = TfidfVectorizer(tokenizer=LemNormalize, stop_words='english')
    tfidf = TfidfVec.fit_transform(sent_tokens)
    # Similarity of the user's input (last row) to every sentence, including itself.
    vals = cosine_similarity(tfidf[-1], tfidf)
    # The top score is normally the input matched against itself, so take the second highest.
    idx = vals.argsort()[0][-2]
    flat = vals.flatten()
    flat.sort()
    req_tfidf = flat[-2]
    if req_tfidf == 0:
        return "I am sorry! I don't understand you"
    return sent_tokens[idx]
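
To show what the TF-IDF plus cosine-similarity lookup does under the hood, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: the tokenizer is a plain whitespace split, there is no stop-word removal or lemmatization, and the function names (`tfidf_vectors`, `cosine`, `best_match`) are invented for this example. The IDF formula follows the smoothed form `ln((1 + n) / (1 + df)) + 1`.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (raw TF times smoothed IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t in tf})
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_match(corpus, query):
    """Return the corpus sentence most similar to the query, plus its score."""
    vectors = tfidf_vectors(corpus + [query])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, v) for v in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return corpus[best], scores[best]

corpus = ["the brain is an organ", "machines can learn from data", "cats sleep a lot"]
print(best_match(corpus, "how do machines learn"))
```

Unlike `response`, this sketch does not mutate the corpus list; it builds a temporary list for vectorization and scores only the original sentences, so there is nothing to remove afterwards. A score of 0 corresponds to the "I don't understand you" branch.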



Define the lines our bot should say when starting and ending a conversation, depending on the user's input.

In [None]:
flag = True
print("Transbot: My name is Transbot. I will answer your queries about AI & Mankind. If you want to exit, type Bye!")
while flag:
    user_response = input().lower()
    if user_response != 'bye':
        if user_response in ('thanks', 'thank you'):
            flag = False
            print("Transbot: You are welcome..")
        else:
            greet = greeting(user_response) # call once so the check and the reply agree
            if greet is not None:
                print("Transbot: " + greet)
            else:
                print("Transbot: ", end="")
                print(response(user_response))
                sent_tokens.remove(user_response) # undo the append done inside response()
    else:
        flag = False
        print("Transbot: Bye! take care..")

Transbot: My name is Transbot. I will answer your queries about AI & Mankind. If you want to exit, type Bye!
what is Artificial Intelligence?
Transbot: Artificial intelligence or whole brain emulation first?
i dont know? brain?
Transbot: Nobody knows.
human brain?
Transbot: This thing, the human brain,
has some capabilities that the brains of other animals lack.
and human?
Transbot: They were all human beings.
and AI?
Transbot: Instead of launching this AI directly,
imagine that we first built an oracle AI for the sole purpose of answering questions
about what the sovereign AI would do.
