<a href="https://colab.research.google.com/github/nyp-sit/it3103/blob/main/week15/simple_chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simple Chatbot

In this final part, we take the models that we have trained and use them to recognize the intent and the entities in a line of text, then build simple responses based on them.

We have already saved a copy of our own trained models and uploaded them to a public server on Amazon for you to download. If you have your own copies of the models, you can instead copy them to your Google Drive, mount the drive in Colab, and use those.

Before starting, open Colab's Runtime > Manage Sessions menu and click the "TERMINATE OTHER SESSIONS" button.


In [2]:
!wget https://nyp-aicourse.s3.ap-southeast-1.amazonaws.com/pretrained-models/intent_model.zip
!wget https://nyp-aicourse.s3.ap-southeast-1.amazonaws.com/pretrained-models/token_model.zip
!unzip intent_model.zip
!unzip token_model.zip

--2024-09-16 02:16:41--  https://nyp-aicourse.s3.ap-southeast-1.amazonaws.com/pretrained-models/intent_model.zip
Resolving nyp-aicourse.s3.ap-southeast-1.amazonaws.com (nyp-aicourse.s3.ap-southeast-1.amazonaws.com)... 3.5.149.145, 3.5.148.173, 52.219.133.51, ...
Connecting to nyp-aicourse.s3.ap-southeast-1.amazonaws.com (nyp-aicourse.s3.ap-southeast-1.amazonaws.com)|3.5.149.145|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 246823911 (235M) [application/zip]
Saving to: ‘intent_model.zip’


2024-09-16 02:16:55 (18.3 MB/s) - ‘intent_model.zip’ saved [246823911/246823911]

--2024-09-16 02:16:55--  https://nyp-aicourse.s3.ap-southeast-1.amazonaws.com/pretrained-models/token_model.zip
Resolving nyp-aicourse.s3.ap-southeast-1.amazonaws.com (nyp-aicourse.s3.ap-southeast-1.amazonaws.com)... 3.5.150.157, 3.5.148.105, 3.5.148.186, ...
Connecting to nyp-aicourse.s3.ap-southeast-1.amazonaws.com (nyp-aicourse.s3.ap-southeast-1.amazonaws.com)|3.5.150.157|:443... connected.
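
If you re-run the notebook, the archives may already be extracted. The following is a minimal sketch of an idempotent extraction step in plain Python. It assumes each archive unpacks into a folder named after the model (`intent_model`, `token_model`), which is what the later `from_pretrained('intent_model')` call suggests; the helper name `extract_if_missing` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_if_missing(zip_path, model_dir, dest="."):
    """Extract zip_path into dest unless dest/model_dir already exists.

    Returns True when the archive was extracted, False when it was skipped.
    """
    target = Path(dest) / model_dir
    if target.is_dir():
        return False
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return True


# Assumption: the downloaded archives sit in the current working directory.
for name in ("intent_model", "token_model"):
    archive_path = Path(f"{name}.zip")
    if archive_path.exists():
        extracted = extract_if_missing(archive_path, name)
        print(name, "extracted" if extracted else "already present")
```

Skipping the extraction when the folder exists avoids the overwrite prompt that `unzip` shows on a second run.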

Next, run the following to install the Hugging Face Transformers library. Our models were trained using Transformers version 4.15, so it is safer to use the same version. The cell below installs the latest release and keeps the pinned install as a commented-out alternative.

In [1]:
# !pip install transformers==4.15
!pip install transformers
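
To see whether the installed library matches the version the models were trained with, a small check like the one below may help. It is only a sketch: the helper names `same_release` and `check_transformers_version` are hypothetical, and the only fact taken from this notebook is the training version, 4.15.

```python
from importlib import metadata

# Version the course models were trained with, as stated in this notebook.
EXPECTED_PREFIX = "4.15"


def same_release(installed, expected_prefix):
    """Return True when `installed` is the expected release or a patch of it."""
    return installed == expected_prefix or installed.startswith(expected_prefix + ".")


def check_transformers_version():
    """Describe whether the installed transformers matches the training version."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if same_release(installed, EXPECTED_PREFIX):
        return f"transformers {installed} matches the training version"
    return f"transformers {installed} differs from {EXPECTED_PREFIX}"


print(check_transformers_version())
```

If the versions differ and the models fail to load, switching to the pinned install is the first thing to try.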



## Section 1 - Inferring Intent

In this section, we define the code to infer the intent of a single line of input text.

In [3]:
# Import the necessary libraries
#
from transformers import (
    AutoTokenizer,
    TFAutoModelForSequenceClassification
)

import numpy as np
import tensorflow as tf


# Create the DistilBERT tokenizer
#
model_checkpoint = 'distilbert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)


# Create a list of unique labels that we will recognize.
#
sentence_labels = [
              "others",
              "atis_abbreviation",
              "atis_aircraft",
              "atis_airfare",
              "atis_airline",
              "atis_flight",
              "atis_flight_time",
              "atis_greeting",
              "atis_ground_service",
              "atis_quantity",
              "atis_yes",
              "atis_no"]

# Define a function to perform inference on a single input text.
#
def infer_intent(text, model, tokenizer):
    # Pass the text into the tokenizer
    # (named `inputs` to avoid shadowing Python's built-in input())
    #
    inputs = tokenizer(text, truncation=True, padding=True, return_tensors="tf")

    # Send the tokenizer output into our classification model
    #
    output = model(inputs)
    pred_label = int(np.argmax(tf.nn.softmax(output.logits, axis=-1)))

    # Return the result to the caller
    #
    return sentence_labels[pred_label]


# Load the saved model file
#
intent_model = TFAutoModelForSequenceClassification.from_pretrained('intent_model')



The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at intent_model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


Run the following cell to test the code that infers the intent.

In [4]:
infer_intent("How much is the ticket to fly to New York", intent_model, tokenizer)

'atis_airfare'
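
`infer_intent` returns the label with the highest softmax probability, however low that probability is. The sketch below shows the same softmax and argmax step with NumPy only, plus an optional confidence threshold that falls back to the `"others"` label. The threshold value, the helper name `pick_label`, and the example logits are assumptions for illustration, not part of the trained model.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def pick_label(logits, labels, threshold=0.5, fallback="others"):
    """Return (label, confidence) for a 1-D array of logits.

    The fallback label is returned when the best probability is below
    the threshold (hypothetical value, tune it on your own data).
    """
    probs = softmax(np.asarray(logits, dtype=np.float64))
    best = int(np.argmax(probs))
    confidence = float(probs[best])
    label = labels[best] if confidence >= threshold else fallback
    return label, confidence


demo_labels = ["others", "atis_airfare", "atis_flight"]

# One logit clearly dominates, so its label is returned.
print(pick_label([0.1, 4.0, 0.3], demo_labels))

# The logits are close together, so the result falls back to "others".
print(pick_label([1.0, 1.1, 0.9], demo_labels))
```

Softmax preserves the ordering of the logits, so the argmax is the same with or without it. The softmax only matters when you want to read the value as a confidence.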

## Section 2 - Inferring Entity

In this section, we define the code to infer an entity label for each individual word in a line of text. The labelled words are then combined into entities and returned to the caller.


In [5]:
from transformers import TFAutoModelForTokenClassification
import numpy as np

In [6]:
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)



In [7]:
# Define the list of unique labels that we will recognize
#
token_labels = ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']

# Define the function to infer the individual tokens
#
def infer_tokens(text, model, tokenizer):
    # here we assume the text has not yet been split into individual words
    text = text.split()

    encodings = tokenizer(
        [text],
        padding=True,
        truncation=True,
        is_split_into_words=True,
        return_tensors='tf')

    logits = model(encodings)[0] # assume only a single prediction
    preds = np.argmax(logits, axis=-1)[0]

    # as the prediction is on individual tokens, including subtokens,
    # we need to group subtokens belonging to the same word together
    # again, we use the word_ids to help us here
    previous_word_idx = None
    word_ids = encodings[0].word_ids
    labels = []
    for i, word_idx in enumerate(word_ids):
        # if the word_id differs from the previous one, it is a new word;
        # we also skip None word_ids, which belong to special tokens
        if word_idx != previous_word_idx and word_idx is not None:
            labels.append(token_labels[preds[i]])
        # update the previous_word_idx to current word_id
        previous_word_idx = word_idx

    return text, labels


# Define the function to combine individual tokens into a dictionary
#
def infer_combined_tokens(text, token_model, tokenizer):
    result = {
        "PER" : [],
        "LOC" : [],
        "ORG" : [],
        "MISC" : []
    }

    result_texts, result_tokens = infer_tokens(text, token_model, tokenizer)

    current_token_label = ""
    current_result_index = -1

    for i in range(len(result_tokens)):
        if result_tokens[i].startswith("B-"):
            current_token_label = result_tokens[i].replace("B-", "")
            result[current_token_label].append(result_texts[i])
            current_result_index = len(result[current_token_label]) - 1
        elif result_tokens[i].startswith("I-") and current_result_index >= 0:
            # Only extend when an entity has already been started, so an
            # "I-" tag with no preceding "B-" tag cannot raise a KeyError.
            result[current_token_label][current_result_index] += " " + result_texts[i]

    return result


In [8]:
token_model = TFAutoModelForTokenClassification.from_pretrained('token_model')

All model checkpoint layers were used when initializing TFDistilBertForTokenClassification.

All the layers of TFDistilBertForTokenClassification were initialized from the model checkpoint at token_model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForTokenClassification for predictions without further training.


In [9]:
text, tokens = infer_tokens("How much is the ticket to fly to New York", token_model, tokenizer)

In [10]:
print(tokens)
print(text)

['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-LOC', 'I-LOC']
['How', 'much', 'is', 'the', 'ticket', 'to', 'fly', 'to', 'New', 'York']
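
The output above has one label per word, even though the model predicts one label per subtoken. The sketch below isolates that grouping step so it can be run without a model. The `word_ids` and `token_preds` values are hand-written for illustration (a hypothetical split of one word into two subtokens), not real tokenizer or model output.

```python
def first_subtoken_labels(word_ids, token_preds, label_names):
    """Keep the prediction of the first subtoken of every word.

    word_ids maps each token position to a word index, or None for
    special tokens such as [CLS] and [SEP].
    """
    labels = []
    previous = None
    for position, word_id in enumerate(word_ids):
        if word_id is not None and word_id != previous:
            labels.append(label_names[token_preds[position]])
        previous = word_id
    return labels


label_names = ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG',
               'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']

# Hypothetical layout: [CLS] word0 word1 word2(part 1) word2(part 2) [SEP]
word_ids = [None, 0, 1, 2, 2, None]
token_preds = [0, 0, 0, 5, 6, 0]

print(first_subtoken_labels(word_ids, token_preds, label_names))
# ['O', 'O', 'B-LOC']
```

The second subtoken of word 2 is predicted as `I-LOC`, but only the first subtoken's label is kept, so the word is reported once as `B-LOC`.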


Run the following cell to test the code that extracts and combines all the entities for us.

In [11]:
infer_combined_tokens("Peter Leong and John Lim of Aims are going to fly to New York",
                      token_model,
                      tokenizer)

{'PER': ['Peter Leong', 'John Lim'],
 'LOC': ['New York'],
 'ORG': ['Aims'],
 'MISC': []}
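
The combining step follows the BIO tagging scheme: a `B-` tag starts an entity and the `I-` tags after it continue that entity. The sketch below is a standalone variant that works on plain lists of words and tags, so it can be tried without loading a model. It is not the notebook's function: the name `group_entities` is hypothetical, and unlike `infer_combined_tokens` it treats an `I-` tag that does not continue the current entity as the start of a new one.

```python
def group_entities(words, tags):
    """Group BIO-tagged words into entity strings per entity type."""
    result = {"PER": [], "LOC": [], "ORG": [], "MISC": []}
    current = None  # entity type of the span currently being built

    for word, tag in zip(words, tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "B" or (prefix == "I" and kind != current):
            # Start a new entity (also covers an "I-" tag with no open span).
            result.setdefault(kind, []).append(word)
            current = kind
        elif prefix == "I":
            # Continue the entity that is currently open.
            result[kind][-1] += " " + word
        else:
            # An "O" tag closes any open entity.
            current = None
    return result


words = ['How', 'much', 'is', 'the', 'ticket', 'to', 'fly', 'to', 'New', 'York']
tags = ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-LOC', 'I-LOC']
print(group_entities(words, tags))
# {'PER': [], 'LOC': ['New York'], 'ORG': [], 'MISC': []}
```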

## Section 3 - Implementing Logic for Our Chatbot

In this section, let's implement some very basic logic for our chatbot. We will make use of the two functions that we wrote above.

You can implement some simple logic that looks like the following:

```
if (intent == "atis_flight" or intent == "atis_airline") and len(tokens["LOC"]):
    print("Can I confirm that you just asked about flying to " + tokens["LOC"][0])
elif intent == "atis_yes":
    print("Great, then let me book the ticket for you")
elif intent == "atis_no":
    print("Oh I am sorry what did I get wrong?")
elif intent == "atis_greeting":
    print("Hi, how are you?")
else:
    print("I don't quite know how to respond to " + intent + " yet.")
```

In [None]:
def chatbot():
    print("Chatbot Started. Press 'Q'+Enter to quit.")

    while True:
        input_text = input()
        if input_text == "Q" or input_text == "":
            break

        intent = infer_intent(input_text, intent_model, tokenizer)
        tokens = infer_combined_tokens(input_text, token_model, tokenizer)

        # TODO:
        # Write your own logic to conduct a conversation with the user
        # about buying tickets and flying somewhere.
        #...#
        if (intent == "atis_flight" or intent == "atis_airline") and len(tokens["LOC"]):
            print("Can I confirm that you just asked about flying to " + tokens["LOC"][0])
        elif intent == "atis_yes":
            print("Great, then let me book the ticket for you")
        elif intent == "atis_no":
            print("Oh I am sorry what did I get wrong?")
        elif intent == "atis_greeting":
            print("Hi, how are you?")
        else:
            print("I don't quite know how to respond to " + intent + " yet.")

    print("Goodbye!")

chatbot()


Chatbot Started. Press 'Q'+Enter to quit.
I want to go new york
Can I confirm that you just asked about flying to new york
yes i ticket
I don't quite know how to respond to atis_airfare yet.
one ticket to new york
I don't quite know how to respond to atis_airfare yet.
Peter Leong and John Lim of Aims are going to fly to New York
Can I confirm that you just asked about flying to New York
yes
Great, then let me book the ticket for you
ok
Great, then let me book the ticket for you
yes
Great, then let me book the ticket for you
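
In the session above, every "yes" books a ticket again because the chatbot does not remember what it last asked. One possible way to extend the logic is to carry a small state dictionary between turns, as sketched below. The function `respond`, the `pending_destination` key, and the reply wording are all hypothetical; the intent names and the `LOC` key come from the code in this notebook.

```python
def respond(intent, entities, state):
    """Return (reply, new_state) for one user turn.

    state holds "pending_destination": the place we asked the user to
    confirm, or None when nothing is waiting for confirmation.
    """
    locations = entities.get("LOC", [])

    if intent in ("atis_flight", "atis_airline") and locations:
        destination = locations[0]
        reply = f"Can I confirm that you asked about flying to {destination}?"
        return reply, {"pending_destination": destination}

    if intent == "atis_yes":
        destination = state.get("pending_destination")
        if destination is None:
            return "Sorry, what would you like me to confirm?", state
        reply = f"Great, let me book the ticket to {destination} for you."
        return reply, {"pending_destination": None}

    if intent == "atis_no":
        return "Oh, I am sorry. What did I get wrong?", {"pending_destination": None}

    if intent == "atis_greeting":
        return "Hi, how are you?", state

    return f"I don't quite know how to respond to {intent} yet.", state


# Simulated turns with hand-written intents and entities (no model needed).
state = {"pending_destination": None}
turns = [
    ("atis_flight", {"LOC": ["New York"]}),
    ("atis_yes", {}),
    ("atis_yes", {}),
]
for intent, entities in turns:
    reply, state = respond(intent, entities, state)
    print(reply)
```

Inside `chatbot()`, the same function could be called with the results of `infer_intent` and `infer_combined_tokens`, keeping `state` in a variable outside the loop body.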
