# HLT Chatbot Project
*Written by Anthony Maranto for the project by Anthony Maranto (ATM170000) and Usaid Malik (UXM170001).*

Run every cell in this notebook (Runtime -> Run all), scroll to the bottom, and wait for the bot to prompt you for input. Data is stored in the database and persists across sessions as long as you do not reset the runtime. Do not interrupt individual cells, because doing so stops the CoreNLP server. To restart the bot while keeping the same data, use "Runtime" -> "Restart and run all", which ensures that the CoreNLP environment is properly reloaded.

In [1]:
import os
if not os.path.exists("bot.py"):
  print("Downloading chatbot code")
  !curl https://personal.utdallas.edu/~atm170000/ChatbotHLT.zip --insecure > ChatbotHLT.zip

  password = b"usaid_and_tonys_chatbot_project_for_hlt_passw0rd"

  print("Unzipping with supplied password")
  from zipfile import ZipFile
  zf = ZipFile("ChatbotHLT.zip")

  zf.extractall(".", pwd=password)

Downloading chatbot code
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 4267k  100 4267k    0     0  2682k      0  0:00:01  0:00:01 --:--:-- 2680k
Unzipping with supplied password


In [2]:
import glob

if not os.path.isfile("CoreNLP.zip"):
  print("Downloading Stanford CoreNLP")
  !curl -L https://nlp.stanford.edu/software/stanford-corenlp-latest.zip > CoreNLP.zip
  print("Extracting Stanford CoreNLP")
  !unzip CoreNLP.zip

  # Record the paths to the two CoreNLP jars, one per line, for corenlp.py to read
  jf_1 = glob.glob("stanford-corenlp-?.?.?/stanford-corenlp-?.?.?.jar")[0]
  jf_2 = glob.glob("stanford-corenlp-?.?.?/stanford-corenlp-?.?.?-models.jar")[0]
  with open("corenlp.pth", "w") as f:
    f.write(jf_1 + "\n" + jf_2)

Downloading Stanford CoreNLP
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0   355    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  482M  100  482M    0     0  5399k      0  0:01:31  0:01:31 --:--:-- 5189k
Extracting Stanford CoreNLP
Archive:  CoreNLP.zip
   creating: stanford-corenlp-4.5.0/
  inflating: stanford-corenlp-4.5.0/pom-java-11.xml  
  inflating: stanford-corenlp-4.5.0/corenlp.sh  
  inflating: stanford-corenlp-4.5.0/javax.json-api-1.0-sources.jar  
  inflating: stanford-corenlp-4.5.0/Makefile  
  inflating: stanford-corenlp-4.5.0/input.txt  
  inflating: stanford-corenlp-4.5.0/ejml-core-0.39-sources.jar  
  inflating: stanford-corenlp-4.5.0/jaxb-api-2.4.0-b180830.0359.jar  
  inflating: stanford-corenlp-4.5.0/stanford-corenlp-4.5.0-javadoc.jar  
  inflating: stanford-corenlp-4.5.0/joda-time.jar  
  inflating: stanford-corenlp-4.5.0/jolly

In [3]:
!cat corenlp.pth

stanford-corenlp-4.5.0/stanford-corenlp-4.5.0.jar
stanford-corenlp-4.5.0/stanford-corenlp-4.5.0-models.jar
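
Since `corenlp.py` reads the server jar and the models jar from the first two lines of `corenlp.pth`, it can help to validate that file before starting the server. The following is a minimal sketch, not part of the project code; the helper name `read_jar_paths` is hypothetical.

```python
import os

def read_jar_paths(config_file="corenlp.pth"):
    """Return (server_jar, models_jar) read from the first two lines of config_file.

    Hypothetical helper: raises if either line is missing or points to a
    file that does not exist.
    """
    with open(config_file, "r") as f:
        lines = [line.strip() for line in f.readlines()[:2]]
    if len(lines) < 2 or not all(lines):
        raise ValueError(f"{config_file} must list two jar paths, one per line")
    missing = [p for p in lines if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("Missing jar file(s): " + ", ".join(missing))
    return lines[0], lines[1]
```

Calling `read_jar_paths()` after the download cell would fail early with a readable message instead of a later server startup error.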

In [4]:
# To run on Colab, we rewrite corenlp.py slightly so that it picks a random port and avoids ports that Colab already uses

#@markdown
f = open("corenlp.py", "w")
f.write('''
# A test script that runs the input through CoreNLP through NLTK

import atexit, sys, os, code
from nltk.parse.corenlp import CoreNLPServer, CoreNLPParser
from nltk import word_tokenize

print("Loading CoreNLP Server...")

config_file = "corenlp.pth"
if not os.path.isfile(config_file):
    warning_msg = f"Warning: {config_file} does not exist. Hopefully, the required jarfiles for Stanford CoreNLP are in the path. " + \
                  f"Otherwise, please download them from https://stanfordnlp.github.io/CoreNLP/download.html and put the paths to " + \
                  f"stanford-corenlp-X.X.X.jar and stanford-corenlp-X.X.X-models.jar as the first two lines of corenlp.pth."
    print(warning_msg, file=sys.stderr)
    corenlp_server = None
    corenlp_models = None
else:
    with open(config_file, "r") as f:
        corenlp_server = f.readline().strip()
        corenlp_models = f.readline().strip()

corenlp_options = ["-preload", "tokenize,ssplit,pos,lemma,parse,depparse,ner,openie"]

import random
port = random.randint(9000, 30000)
print("Using port", port)

corenlp_options.append("-port")
corenlp_options.append(str(port))

server = CoreNLPServer(corenlp_server, corenlp_models, corenlp_options=corenlp_options, port=port)
server.start() #(open("stdout.log", "wb"), open("stderr.log", "wb"))
atexit.register(server.stop)

parser = CoreNLPParser(server.url)

# item = list(parser.parse(word_tokenize("The end of the world is upon us, and Mario Kart 3 won't help.")))[0]

if __name__ == "__main__":
    if 'interact' in sys.argv:
        code.interact(local=locals())
    else:
        print("Enter sentences to be parsed")
        while True:
            for i, tree in enumerate(parser.parse(word_tokenize(input("> ")))):
                print(f"Tree {i+1}:")
                print(tree)
''')
f.close()
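
The script above picks a port with `random.randint(9000, 30000)`, which can still collide with a port that is already in use. One alternative, shown here only as a sketch and not as what the project does, is to ask the operating system for an unused port by binding to port 0. The helper name `find_free_port` is hypothetical.

```python
import socket

def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        # getsockname() returns (host, port); the OS filled in the port
        return s.getsockname()[1]

port = find_free_port()
print("Using port", port)
```

The port is released as soon as the socket closes, so another process could in principle claim it before the server starts. In practice that window is short, but it is the trade-off of this approach.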

In [5]:
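# Work around NLTK's expectation that the server will start up within thirty seconds.
# Note: this path is specific to the Python 3.7 Colab image.
path = "/usr/local/lib/python3.7/dist-packages/nltk/parse/corenlp.py"
with open(path, "r") as f:
  data = f.read()

# Retry ninety times instead of thirty, printing progress on each attempt
data = data.replace("for i in range(30):", "for i in range(90):\n            print('waiting...' + str(i + 1))\n")
# Skip the port probing and use the requested port as-is
data = data.replace("def try_port(port=0):\n", "def try_port(port=0):\n    return port\n")
# If the server process has already exited, print its exit code and stderr before raising
stem = '''            raise CoreNLPServerError(
                'Could not connect to the server.'
            )'''
data = data.replace(stem, '            if self.popen.poll() is not None: print(self.popen.poll(), self.popen.communicate()[1].decode("utf-8"))\n' + stem.replace("Could", "!!Could"))

# Write the patched source back only after all replacements have succeeded
with open(path, "w") as f:
  f.write(data)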
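
The patch above raises the number of startup checks from thirty to ninety and prints a progress line on each one. The same retry pattern can be written as a standalone function, sketched below under the assumption that readiness can be tested by some callable. The names `wait_until_ready` and `is_ready` are hypothetical and do not come from NLTK.

```python
import time

def wait_until_ready(is_ready, attempts=90, delay=1.0):
    """Poll is_ready() up to `attempts` times, sleeping `delay` seconds after each failure.

    Returns the 1-based number of the attempt that succeeded and raises
    TimeoutError if none did.
    """
    for i in range(attempts):
        if is_ready():
            return i + 1
        print("waiting..." + str(i + 1))
        time.sleep(delay)
    raise TimeoutError(f"Not ready after {attempts} attempts")
```

Keeping the attempt count and delay as parameters avoids editing installed library source whenever the server needs longer to start.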

In [10]:
# Import nltk and download necessary extra submodules
import nltk
nltk.download("wordnet")
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")
nltk.download('omw-1.4')

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...


True

In [7]:
print("Please note: loading CoreNLP may take up to ninety seconds.")
from bot import *

Please note: loading CoreNLP may take up to ninety seconds.
Loading CoreNLP Server...
Using port 10057
waiting...1
waiting...2
waiting...3
waiting...4
waiting...5
waiting...6
waiting...7
waiting...8
waiting...9
waiting...10
waiting...11
waiting...12
waiting...13
waiting...14
waiting...15
waiting...16
waiting...17
waiting...18
waiting...19
waiting...20
waiting...21
waiting...22
waiting...23
waiting...24
waiting...25
waiting...26
waiting...27
waiting...28
waiting...29
waiting...30
waiting...31
waiting...32
waiting...33
waiting...34
waiting...35


In [8]:
import requests

In [11]:
warning_message = """
Due to bugs with Google Colab (CoreNLP *REALLY* doesn't like being run here),
if you stop this cell, it will cause the internal CoreNLP server to quit. To
resolve this, if you would like to test this bot's data persistence, then
please use "Runtime" -> "Restart and run all" to restart the bot. DO NOT just
interrupt the execution or "break" a single cell, as that will likely cause a
connection error. 
"""
print("Starting bot process.")
print(warning_message.strip())

bot = GameBot()
try:
  bot.loop()
except requests.exceptions.ConnectionError:
  print("ConnectionError! Did you forget to use \"Restart and run all\" when")
  print("restarting this bot? Just interrupting (or \"quit\"ting and rerunning the cell) won't work!")

Starting bot process.
Due to bugs with Google Colab (CoreNLP *REALLY* doesn't like being run here),
if you stop this cell, it will cause the internal CoreNLP server to quit. To
resolve this, if you would like to test this bot's data persistence, then
please use "Runtime" -> "Restart and run all" to restart the bot. DO NOT just
interrupt the execution or "break" a single cell, as that will likely cause a
connection error.
Welcome back Tony. Type "logout" to log out.
Enter your prompt for GameBot. Type "quit" to quit.
> tony
Invalid input; please try again.
Enter your prompt for GameBot. Type "quit" to quit.
> What is Tony?
1 responses matched your query:
Tony is a human
Enter your prompt for GameBot. Type "quit" to quit.
> What is Usaid?
I'm afraid that I don't know much about that.
Enter your prompt for GameBot. Type "quit" to quit.
> quit
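
Because interrupting the cell stops the CoreNLP server, rerunning the bot afterwards fails with a `ConnectionError`. A pre-flight check could detect this before the bot loop starts. The sketch below only tests whether something is accepting TCP connections on a given host and port. The helper name `server_is_listening` is hypothetical, and how the host and port would be obtained from the bot's modules is an assumption left to the reader.

```python
import socket

def server_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

A successful connection only shows that some process is listening on that port, not that it is a healthy CoreNLP server, so this is a quick sanity check rather than a full health check.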
