# Eliza Bot

#### *COSC523: Assignment 1*
#### *Author: Christopher Pawlenok*

<br>

To get started, we used ChatGPT to generate a list of keywords, decomposition rules, and reassembly rules.

In [8]:
import yaml
from google.colab import drive

drive.mount('/content/drive')

try:
  with open("/content/drive/MyDrive/ElizaBot/keywords.yaml") as f:
      keywords = yaml.safe_load(f)
except Exception as e:
  print('Load keywords failed...', e)

try:
  with open("/content/drive/MyDrive/ElizaBot/pronouns.yaml") as f:
      pronouns = yaml.safe_load(f)
except Exception as e:
  print('Load pronouns failed...', e)

try:
  with open("/content/drive/MyDrive/ElizaBot/word_classes.yaml") as f:
      word_classes = yaml.safe_load(f)
except Exception as e:
  print('Load word classes failed...', e)

print('Data Loaded successfully!')

# keyword loading

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Data Loaded successfully!
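
The YAML files themselves are not shown in this notebook. Judging only from how the code below indexes the loaded data, the parsed structures might look roughly like the following sketch. Every name prefixed with `example_` and every concrete entry is a made-up illustration, not the actual generated data.

```python
# Hypothetical shapes, inferred from how the notebook indexes the loaded data.
# All concrete entries below are invented for illustration only.
example_keywords = {
    "NONE": {
        "rank": 0,
        "decompositions": [
            {"pattern": ["0"], "reassemblies": ["Please go on."]},
        ],
    },
    "AM": {
        "rank": 1,
        "decompositions": [
            {
                "pattern": ["0", "I", "AM", "0"],
                "reassemblies": ["Do you believe you are (2)?"],
            },
        ],
    },
}

example_pronouns = {
    "pre_subs": {"RECOLLECT": "REMEMBER"},              # applied to input words
    "post_subs": {"I": "YOU", "MY": "YOUR", "YOU": "I"},  # applied to wildcards
}

example_word_classes = {
    "classes": {"family": ["mother", "father", "sister", "brother"]},
}

# The access pattern used by find_keywords in the next section:
entry = example_keywords["AM"]
print(entry["rank"], entry["decompositions"][0]["pattern"])
```

In this layout, a pattern is a list of tokens in which `"0"` marks a wildcard, and a reassembly such as `(2)` refers to the second captured wildcard.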


# Decomposition

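This section contains the decomposition logic. It tries the keywords in rank order, finds the first decomposition pattern that matches the input, and returns that pattern's reassembly rules together with the captured wildcards.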

In [30]:
import re

# Normalize words
def prescan(word):
  if word in pronouns['pre_subs'].keys():
    return pronouns['pre_subs'][word]
  else:
    return word

# Swap pronouns inside each captured wildcard (for example, first and second person)
def postscan(wildcards):
  post_subs = pronouns['post_subs']
  wildcard_list = []
  for wildcard in wildcards:
    # substitute word by word, leaving words without a mapping unchanged
    words = [post_subs.get(word, word) for word in wildcard.split(' ')]
    wildcard_list.append(' '.join(words))
  return wildcard_list


# map a word to the (uppercased) name of its word class, or return it unchanged
def scan_classes(word):
  for key in word_classes['classes'].keys():
    for item in word_classes['classes'][key]:
      if word.casefold() == item.casefold():
        return key.upper()
  return word

# initial input scan
def scan_input(user_input):
  sentence = user_input.upper()
  words = sentence.split()
  new_words = []
  for word in words:
    word = word.replace(' ', '')\
      .replace(',', '')\
      .replace('.', '')\
      .replace('?', '')\
      .replace('!', '')\
      .replace('\'', '')\
      .replace('"', '')\
      .replace(':', '')\
      .replace(';', '')\
      .replace('(', '')\
      .replace(')', '')

    word = prescan(word)
    word = scan_classes(word)
    new_words.append(word)

  words = new_words
  return words

def find_keywords(words):
  stack = Keystack()

  # To handle no-keywords available inputs
  stack.push(("NONE",
              keywords["NONE"]['rank'],
              keywords["NONE"]['decompositions'])
  )
  for word in words:
    if word in keywords.keys():
      stack.push((word,
                     keywords[word]['rank'],
                     keywords[word]['decompositions'])
      )

  return stack

def find_decomposition(stack, words):
  while not stack.is_empty():
    keyword = stack.pop()
    for dec in keyword[2]:
      res, wildcards = match_token(dec['pattern'], words)
      if res:
        wildcards = postscan(wildcards)
        return dec['reassemblies'], wildcards

  return words, []

def match_token(pattern, words):
  pi = 0
  wi = 0  # start matching at the first word
  separator = ' '
  wildcards = []

  while pi < len(pattern):
    token = pattern[pi]

    if token == "0":
      if pi < len(pattern) - 1:
        next_token = pattern[pi+1]
        loc = get_index(words, next_token)
        if loc == -1:
          return False, []
        wildcards.append(separator.join(words[wi:loc]))
        wi = loc
        pi += 1
      else:
        wildcards.append(separator.join(words[wi:]))
        wi = len(words)
        pi += 1

    else:
      if wi >= len(words) or not words[wi] == token:
        return False, []

      wi += 1
      pi += 1

  if wi == len(words):
    return True, wildcards
  else:
    return False, []

# helper function to replace .index() without throwing error
def get_index(items, flag):
  for i, item in enumerate(items):
    if item == flag:
      return i
  return -1

# landing function for decomposing
def decompose(user_input):

  words = scan_input(user_input)
  stack = find_keywords(words)
  reassemblies, wildcards = find_decomposition(stack, words)

  return reassemblies, wildcards
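
To make the wildcard matching concrete, here is a minimal standalone sketch of the same idea. It is a variation rather than the notebook's exact code: the function name `match_pattern` is hypothetical, the search for the next literal token starts at the current position instead of at the beginning of the sentence, and it assumes a pattern never contains two `"0"` tokens in a row.

```python
def match_pattern(pattern, words):
    """Match pattern tokens against words.

    "0" stands for any run of words (possibly empty); every other token
    must equal the word at the current position.
    Returns (matched, wildcards).
    """
    wildcards = []
    wi = 0  # current position in words
    for pi, token in enumerate(pattern):
        if token == "0":
            if pi == len(pattern) - 1:
                # a trailing wildcard captures everything that is left
                wildcards.append(" ".join(words[wi:]))
                wi = len(words)
            else:
                # capture up to the next literal token, searching from wi
                try:
                    loc = words.index(pattern[pi + 1], wi)
                except ValueError:
                    return False, []
                wildcards.append(" ".join(words[wi:loc]))
                wi = loc
        else:
            if wi >= len(words) or words[wi] != token:
                return False, []
            wi += 1
    if wi != len(words):
        return False, []  # leftover words that the pattern did not cover
    return True, wildcards


matched, captured = match_pattern(
    ["0", "I", "AM", "0"], "YES I AM HAPPY TODAY".split()
)
print(matched, captured)  # True ['YES', 'HAPPY TODAY']
```

A reassembly rule that refers to `(2)` would then receive `HAPPY TODAY` after the pronoun substitutions in `postscan` have been applied.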


# Keystack

A keystack behaves much like a stack. The only difference from a plain LIFO stack is that the keystack re-ranks its keywords every time something is pushed, so the highest-ranked keyword is always popped first. I also chose to implement merge sort by hand to refresh my understanding of data structures.

In [21]:
# Quick keystack implementation
class Keystack:
  def __init__(self):
    self.items = []

  def push(self, item):
    self.items.append(item)
    self.items = merge_sort(self.items)

  def pop(self):
    return self.items.pop()

  def peek(self):
    return self.items[-1]

  def is_empty(self):
    return len(self.items) == 0

# Merge Sort
def merge_sort(items):
  if len(items) <= 1:
    return items

  mid = len(items) // 2
  l = merge_sort(items[:mid])
  r = merge_sort(items[mid:])

  return merge(l, r)

# helper function for merge sort
def merge(l, r):
  merged = []

  i, j = 0, 0

  # compare left and right elements
  while i < len(l) and j < len(r):
    if r[j][1] < l[i][1]:
      merged.append(r[j])
      j += 1
    else:
      # if equal we take the left since it is earlier in the sentence
      merged.append(l[i])
      i += 1

  # add any remaining elements to the end of the merged list
  while i < len(l):
    merged.append(l[i])
    i += 1

  while j < len(r):
    merged.append(r[j])
    j += 1

  return merged
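
For comparison, the same ranking behaviour can be sketched with Python's built-in stable sort instead of a hand-written merge sort. This is an alternative under stated assumptions, not the implementation used above: the class name `SortedKeystack` is hypothetical, and items are assumed to be `(word, rank, decompositions)` tuples as in `find_keywords`.

```python
class SortedKeystack:
    """Keystack sketch that keeps items ordered by rank using list.sort."""

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)
        # list.sort is stable, so items of equal rank keep their push order
        self.items.sort(key=lambda entry: entry[1])

    def pop(self):
        # the highest rank sits at the end of the list
        return self.items.pop()

    def is_empty(self):
        return not self.items


stack = SortedKeystack()
for item in [("NONE", 0, []), ("I", 2, []), ("REMEMBER", 5, []), ("YOU", 2, [])]:
    stack.push(item)

pop_order = []
while not stack.is_empty():
    pop_order.append(stack.pop()[0])
print(pop_order)  # ['REMEMBER', 'YOU', 'I', 'NONE']
```

Note that because popping takes from the end of an ascending, stable ordering, keywords of equal rank come out with the most recently pushed one first.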


# Reassembly

Here is where we reassemble the response based on the reassembly rules and add the wildcards.

In [20]:
import random
import re

def select_rule(rules):
  ri = random.randint(0, len(rules)-1)
  return rules[ri]

def apply_wildcards(rule, wildcards):
  # split on parentheses, keeping them as separate items so "(2)" yields "(", "2", ")"
  rule = re.split(r'([()])', rule)

  i=0
  output = ''
  for word in rule:
    if word.isdigit():
      rule[i] = wildcards[int(word)-1]
      rule[i] = rule[i].lower()
    if not word in ['(', ')']:
      output = output + rule[i]
    i+=1
  return output

def reassemble(rules, wildcards):
  rule = select_rule(rules)
  output = apply_wildcards(rule, wildcards)
  return output
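
As an illustration of the placeholder step, the substitution can also be written with `re.sub` and a replacement function. This is a sketch, not the notebook's method: the name `fill_placeholders` is hypothetical, and it assumes placeholders always have the form `(n)` with a 1-based index into the wildcard list.

```python
import re


def fill_placeholders(rule, wildcards):
    """Replace each "(n)" in rule with the n-th wildcard, lowercased."""

    def substitute(match):
        index = int(match.group(1)) - 1  # placeholders are 1-based
        return wildcards[index].lower()

    return re.sub(r"\((\d+)\)", substitute, rule)


response = fill_placeholders(
    "Do you believe you are (2)?",
    ["", "PASSIONATE ABOUT YOUR WORK"],
)
print(response)  # Do you believe you are passionate about your work?
```

Matching the parentheses and digits together means a digit elsewhere in a rule is left untouched, and indices with more than one digit are handled as well.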

In [31]:
print('Welcome to ELIZA Bot, input sentences in correct English format to begin...')

user_input = ''

# core user input loop
while True:
  user_input = input('Enter here: ')
  # stop before generating a response to the exit command itself
  if user_input in ('Exit', 'exit', 'Quit', 'quit'):
    break
  reassemblies, wildcards = decompose(user_input)
  output = reassemble(reassemblies, wildcards)
  print('ELIZA: ' + output + '\n')

Welcome to ELIZA Bot, input sentences in correct English format to begin...
Enter here: I am passionate about my work and my family.
ELIZA: Do you believe you are passionate about your work and your family?

Enter here: Yes I do believe I am passionate about my family.
ELIZA: You seem quite positive.

Enter here: I am quite positive!  You seem positive today as well.
ELIZA: Would you like to be quite positive i seem positive today as well?

Enter here: Yes, that is the way I want to feel. Can you help me feel that way more?
ELIZA: What makes you think I help you?

Enter here: Because talking out my thoughts helps a lot.
ELIZA: What other reasons might there be?

Enter here: You also give me good advice on how to handle my day to day life.
ELIZA: What makes you think I also give you?

Enter here: I think you're deflecting today more than usual.
ELIZA: That is interesting. Go on.

Enter here: You normally comment on my thoughts.
ELIZA: What comes to mind when you say that?

Enter here: Y