# Natural Language Processing: Chatbots Assignment

In [1]:
!pip install chatterbot --quiet
!pip install chatterbot_corpus --quiet

import re
import requests
from bs4 import BeautifulSoup
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer


### Scrape the HTML from the URL below which contains questions and answers about databases.

In [2]:
url = 'https://www.wisdomjobs.com/e-university/database-interview-questions.html'

In [11]:
import nltk
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


True

In [19]:
from nltk import word_tokenize
from nltk.corpus import stopwords
from nltk.stem.snowball import SnowballStemmer
from nltk.stem.wordnet import WordNetLemmatizer

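def get_url_text(url):
    """Download a page and return the text of its headings, paragraphs, and list items."""
    response = requests.get(url)
    content = response.text

    # HTML defines heading levels h1 through h6 only, so 'h7' is not needed.
    TAGS = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'li']
    soup = BeautifulSoup(content, "lxml")
    text_list = [tag.get_text() for tag in soup.find_all(TAGS)]
    text = " ".join(text_list)
    return text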

def preprocess(docs):
    """Lowercase, drop stop words and non-alphabetic tokens, then lemmatize and stem."""
    lemmatizer = WordNetLemmatizer()
    stemmer = SnowballStemmer("english")
    # Build the stop-word set once instead of reloading the list for every token.
    stop_words = set(stopwords.words("english"))
    preprocessed = []

    for doc in docs:
        tokenized = word_tokenize(doc)
        cleaned = [
            stemmer.stem(lemmatizer.lemmatize(token.lower()))
            for token in tokenized
            if token.isalpha() and token.lower() not in stop_words
        ]

        untokenized = " ".join(cleaned)
        preprocessed.append(untokenized)

    return preprocessed

text = get_url_text(url)
cleaned = preprocess([text])  # normalized copy, kept in case it is needed for training

### Clean the acquired HTML, extracting just the text.

In [20]:
# Replace newlines with spaces and display the extracted text
text = text.replace('\n', ' ')
text

'Home Database Tutorial Database Interview Questions Database Interview Questions & Answers    Searching for a new job can be so stress that it can turn into a job itself. If you are expertise in Adobe Systems and graphics making then prepare well for the job interviews to get your dream job. Here\'s our recommendation on the important things to need to prepare for the job interview to achieve your career goals in an easy way. Database is a systematic collection of data and an electronic filing system. Databases helps in storing and manipulating the data. Databases are created in such a way that data is stored, retrieved, manipulated and deleted with operations. Follow Wisdomjobs page for Database job interview questions and answers page to get through your job interview successfully in first attempt. Database Interview Questions And Answers  Prev Next    Sql Database Tutorial    Database Practice Test    Database Pragnya Meter Exam  Database Jobs  All Interview Questions  Question 1. 
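
The output above still contains runs of repeated spaces, because replacing `'\n'` only swaps each newline for one space. A minimal sketch of a stricter cleanup follows. The helper name `normalize_whitespace` is hypothetical and not part of the notebook.

```python
import re

def normalize_whitespace(raw: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) into one space."""
    return re.sub(r"\s+", " ", raw).strip()

# Hypothetical sample that mimics the spacing seen in the scraped text.
sample = "Database Jobs  All Interview Questions \n Question 1.   "
print(normalize_whitespace(sample))
```

Applying it as `text = normalize_whitespace(text)` would be one option. Note that the splitting cell further down looks for the literal strings `'. '` and `'Answer :'`, which this normalization leaves intact.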

### Organize the text into a list of questions and answers.

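The page labels each entry ("Question N." for questions and "Answer :" for answers), so the text can be split wherever one of those labels appears, either with a regular expression or, as in the cell below, with plain string methods.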
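
As an illustration of the regular-expression approach, here is a minimal sketch. It assumes every entry follows the pattern "Question N. ... Answer : ..." and that no answer itself contains "Question" followed by a number. `QA_PATTERN` and `split_questions_answers` are hypothetical names, and the sample string is made up for the demonstration.

```python
import re

# One match per entry: the question text, then the answer text, which runs
# up to the next "Question N." label or to the end of the string.
QA_PATTERN = re.compile(
    r"Question\s+\d+\.\s*(?P<question>.*?)\s*"
    r"Answer\s*:\s*(?P<answer>.*?)\s*"
    r"(?=Question\s+\d+\.|$)",
    re.DOTALL,
)

def split_questions_answers(page_text):
    """Return a flat list [question, answer, question, answer, ...]."""
    script = []
    for match in QA_PATTERN.finditer(page_text):
        script.append(match.group("question"))
        script.append(match.group("answer"))
    return script

# Made-up sample in the same shape as the scraped page text.
sample = (
    "All Interview Questions  Question 1. What Is Sql? Answer : "
    "A query language.  Question 2. What Is A Table? Answer : "
    "Rows and columns. "
)
print(split_questions_answers(sample))
```

Because the pattern requires whitespace and a number after "Question", occurrences such as "Interview Questions" in the page header are skipped without a hard-coded slice.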

In [21]:
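# Split on the 'Question' label; chunks 0-4 hold the page header and introduction.
interview = text.split('Question')
interview = interview[5:125]
answer_ = "Answer :"
question_ = '. '
script = []
# Visit every second chunk.
for q in range(0, len(interview), 2):
  line = interview[q]
  # The question text starts after the '. ' that follows its number.
  question = line.find(question_) + len(question_)
  # The answer text starts after the 'Answer :' label.
  answer = line.find(answer_) + len(answer_)
  script.append(line[question:answer - len(answer_)])
  script.append(line[answer:])
script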

['Explain What Is Sqlite Transactions? ',
 'The transaction is referred as a unit of work that is performed against a database. It is the propagation of one or more changes to the database. Properties of transactions are determined by ACID.  Atomicity: It ensures that all work unit are successfully completed. Consistency: It ensures that the database changes states upon a successfully committed transaction. Isolation: It enables transactions to operate independently of and transparent to each other. Durability: It ensures that the result or effect of a committed transaction persists in case of a system failure.   ',
 'What Does Sql Stand For, And What Is It Used For? ',
 'SQL stands for structured query language, used with relational databases. It is used to query, update, and retrieve the contents of databases.  ',
 'List Out The Areas Where Sql Lite Works Well? ',
 'SQL lite works well with :  Embedded devices and the internet of things. Application file format. Data Analysis. Websit

### Train a Chatterbot chatbot on the list of questions and answers.

In [22]:
chatbot = ChatBot('Nicee')
chatbot.storage.drop()
trainer = ListTrainer(chatbot)
trainer.train(script)

List Trainer: [####################] 100%
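
As I understand ChatterBot's `ListTrainer`, it treats the list as one conversation in which each item is a response to the item before it. Training on the flat `script` list therefore also links every answer to the following question. A minimal sketch of one way to avoid this is to train on each question/answer pair separately. The helper `to_pairs` is hypothetical and the sample list is made up.

```python
def to_pairs(script):
    """Group a flat [q1, a1, q2, a2, ...] list into [[q1, a1], [q2, a2], ...].

    A trailing item without a partner is dropped.
    """
    return [script[i:i + 2] for i in range(0, len(script) - 1, 2)]

# Made-up sample in the same shape as the notebook's script list.
sample_script = [
    "What Is Sql?", "A query language.",
    "What Is A Table?", "Rows and columns.",
]
for pair in to_pairs(sample_script):
    # With the notebook's trainer, this line would be: trainer.train(pair)
    print(pair)
```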


### Write the user interface logic that allows a user to ask the chatbot questions about databases and have the chatbot return an answer. 

Include logic that checks whether the returned answer is itself a question and, if so, returns the next element in the list after that question.
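
The check described above could be sketched as follows. It assumes `script` is the flat list built earlier, with questions at even positions and their answers right after them. The function name `answer_for` is hypothetical.

```python
def answer_for(response_text, script):
    """Return the stored answer if response_text is one of the questions.

    Questions sit at even indices of script and each answer follows its
    question. Any other text is returned unchanged.
    """
    candidate = response_text.strip()
    for i in range(0, len(script) - 1, 2):
        # Compare stripped text because scraped entries carry trailing spaces.
        if script[i].strip() == candidate:
            return script[i + 1]
    return response_text

# Made-up sample in the same shape as the notebook's script list.
sample_script = [
    "What Is Sql? ", "A query language.",
    "What Is A Table? ", "Rows and columns.",
]
print(answer_for("What Is A Table?", sample_script))
```

In the chat loop, the bot's reply could be passed through it as `answer_for(str(chatbot.get_response(user_input)), script)`, relying on the response object converting to its text via `str()`.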

In [23]:
print("Hello. I am your next-best candidate. Ask any SQL Question and I will answer you. When the interview is over, simply say 'Thank You' and I will take my leave.")

while True:
  user_input = input("Hiring Manager: ")
  # End the session on the exit phrase, ignoring case and surrounding spaces.
  if user_input.strip().lower() == 'thank you':
    print('It was a pleasure meeting you. Goodbye.')
    break
  print(chatbot.get_response(user_input))

Hello. I am your next-best candidate. Ask any SQL Question and I will answer you. When the interview is over, simply say 'Thank You' and I will take my leave.
Hiring Manager: Hello
An attribute is a column in a table.
Hiring Manager: No, not that
Using transactions we can group all SQL commands into a single unit. The transaction begins with some task and finishes only when all tasks within it are over. The transaction gets over successfully only when all commands in it are successfully over. Even if one command fails, the whole transaction fails. The BEGIN TRANSACTION, ROLLBACK TRANSACTION, and COMMIT TRANSACTION statements are used to work with transactions. A group of tasks starts with the begin statement. In case of any problem, the rollback command is executed to abort the transaction. If all the tasks run successfully, all commands are executed through commit statement.
Hiring Manager: Stop
Fragmentation can be defined as a database feature of server that promotes control on data

KeyboardInterrupt: ignored