# **COM3029 Group Project**

## Project Description

This project aims to deliver a chatbot that will act as a conversational diary. The user will be able to add entries to their own digital diary through natural conversation with the chatbot. Additionally, the chatbot will allow the user to request details regarding previous entries in the diary. Each user will be personally identifiable to the chatbot by providing the chatbot with their name and a special phrase or word that will unlock their diary information.

The diary will store user information regarding the places they visited, the people that they met, and how they felt that day.

## **Q1- Model Serving Decisions**

### **Model Serving Options**

#### ***Model embedded in the app*** 

Model embedding is the most direct way to use a model in an application. By embedding the file that contains the model in the application code, the application can interact with it directly and fetch predictions on demand. This infrastructure is simple to set up and allows the user to interact with the chatbot offline.

However, this approach does not scale well, as model files are often large. We found that, particularly with transformer-based architectures, the model files exceed 400 MB, which would require substantial bandwidth for an end user's initial setup. Additionally, almost 1 GB of data would be transferred every time the models are updated.

#### ***Model served as an API***

An alternative to embedding the model is to wrap the model's binary file in a microservice that makes the model accessible to applications. Here, a pickle or dump of the model's Python object is deserialised by the service and exposed through an endpoint for applications to interact with. This means that, despite the complexity of the model, it can be saved and loaded easily.
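
As a minimal sketch of this serialise-then-serve idea, the snippet below pickles a stand-in for a trained model and restores it, as a service might at start-up. The `trained_params` dictionary and the `predict` function are hypothetical placeholders, not the project's models.

```python
import pickle

# Hypothetical stand-in for learned parameters (not the project's model).
trained_params = {"bye": "goodbye", "hello": "greeting"}


def predict(params, text):
    """Return the label of the first known keyword, else "unknown"."""
    for word in text.lower().split():
        if word in params:
            return params[word]
    return "unknown"


# Training side: serialise the model object to bytes (or to a file).
blob = pickle.dumps(trained_params)

# Serving side: deserialise once at start-up, then answer requests.
served_params = pickle.loads(blob)
print(predict(served_params, "Hello there"))
```

Note that `pickle` can execute arbitrary code while loading, so a service should only deserialise files from a trusted source.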

#### ***Model serving choice***

We chose to serve the model as an API as it is a simple approach that we had experience with, and it would allow applications to interact with the model without having to embed large files into the application.

### **API Considerations**

For this project we wanted to deliver the project in such a way that it follows a realistic deployment process that would be appropriate for the delivery of a production-level application. 

Various model serving options using an API approach were explored to determine the best way to deliver the application.

#### ***Django***

Django is a well-known framework for building full-stack web applications. REST endpoints are typically exposed to clients through the Django REST framework, with HTTP methods such as GET and POST used to send client requests to the host.

However, a high-performance application can be difficult to achieve using Django, as it has a significantly larger codebase than the other solutions that we explored. Its monolithic workflow can also complicate matters, as Django includes many features that are not necessary for a simple project.

#### ***FastAPI***

FastAPI is a fast, high-performance web framework that allows developers to build APIs using Python.

It is a good option for creating scalable products. It can also be used with GraphQL as an alternative to REST.

While REST is the de facto standard for web APIs, it can cause over-fetching (responses containing more data than the client needs) when multiple endpoints are created.

By comparison, GraphQL is a query language that uses a single endpoint, and the values returned depend on the client's query.

As our project only required using two endpoints at most, GraphQL was not considered necessary for our process.

#### ***Flask***

Flask is another web framework that can be classified as a micro-framework.

It is a lightweight approach that allows simple prototypes to be built, enabling rapid development. It is also easily extended to cover many use cases, such as serving models from an endpoint, and it can be used to create REST endpoints for client requests to the server. Flask is widely regarded as one of the most polished and feature-rich micro-frameworks.

#### ***Bottle***

Bottle is similar to Flask. The main difference to Flask is that it is only a wrapper around a server. It is not as extensible as Flask nor does it scale to include other modules that Flask can include.

#### ***Final model serving approach***

As we decided to use a REST API for our endpoints, we chose Flask as our API framework because we found it easy to set up and develop with. Additionally, Flask provided enough functionality for our use case without extra bulk, as discussed, whilst still offering flexibility as a framework.

## **Q2- Web Service and Architectural Choices**

In this section, we discuss the process of building a web service to host our chosen models for the chat bot.

For each component (intent classification, NER, dialogue flow, and the chatbot's response mechanisms) we detail what models were chosen, how responses or predictions are fetched by the web service, and how they interact with other components of the model when necessary.

### **Core components**

To begin with, we walk through the individual components and their implementations, starting with the dialogue flow manager, as this forms the basis of our chatbot functionality. This is followed by the implementation of our intent classifier, sentiment analysis model, and the NER model.

#### **Dialogue Flow Manager**

We decided to implement a heuristics-based approach for our dialogue flow manager (DFM), as it performed better than our attempts with AI models during the research stage. The dialogue for our chatbot is controlled by a state machine that uses the intent classifier to determine state changes. The flow of the dialogue is shown below. It is also available as a .pdf file in the **documentation** folder.

![Dialog flow](images/dialogflow.jpg)

The chatbot begins when the client connects to the server. It then asks the user whether they have spoken to the bot before. If not, the chatbot asks the user to enter their name and a passphrase. If the user is already known to the chatbot, the chatbot asks for the user's passphrase in order to identify them. The user details are currently stored in a .csv file, but for a real deployed chatbot this information would need to be encrypted or otherwise stored securely. Additionally, the data would need to be stored in a scalable database (an SQL database hosted on the server machine, for example).
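
One common way to avoid storing passphrases in plain text is to store a salted hash instead. The sketch below uses only the Python standard library; the function names and the iteration count are illustrative assumptions, not part of the current implementation.

```python
import hashlib
import hmac
import os


def hash_passphrase(passphrase, salt=None, iterations=100_000):
    """Return (salt, digest) for a passphrase using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_passphrase(passphrase, salt, expected_digest, iterations=100_000):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_passphrase(passphrase, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


# Only the salt and digest would be stored, never the passphrase itself.
salt, stored_digest = hash_passphrase("test123")
print(verify_passphrase("test123", salt, stored_digest))
print(verify_passphrase("wrong phrase", salt, stored_digest))
```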

Once the user has been recognised or initialised to the bot, DearBot enters its base state and asks the user what they would like to do. The user can create a diary entry, view a previous diary entry, or exit the conversation:

Create entry - the user indicates they would like to create an entry (e.g. "I would like to talk about my day"):<br>
If an entry does not already exist for the current day, the user is asked about their day, and then a summary of their day is offered. The chatbot state is then returned to the base state.

View entry - the user indicates they would like to view an entry (e.g. "What was I doing last week"):<br>
The entry for the day that the user asks about in their message is returned by the bot to the user. The chatbot state is returned to the base state.

Exit conversation - the user indicates they are leaving or do not require any other services (e.g. "Nothing"/"Goodbye"):<br>
The chatbot says goodbye and returns to the start state. The webservice recognises the "Goodbye!" message from the chatbot and closes the client.

Throughout all stages, the DFM relies on the intent classifier to determine the intent of a user's messages and uses this to set the state of the chatbot. The list of states is in the **state_enum.py** file:

```python
from enum import Enum, auto


class STATE(Enum):
    GREETING = auto()

    CREATE_PROFILE_NAME = auto()
    CONFIRM_NAME = auto()

    CREATE_PROFILE_PHRASE = auto()
    CONFIRM_PHRASE = auto()

    LOGIN_NAME_ENTRY = auto()
    CONFIRM_LOGIN_NAME = auto()

    LOGIN_PHRASE_ENTRY = auto()
    CONFIRM_LOGIN_PHRASE = auto()

    RUNNING = auto()
    ASK_WHAT_TO_DO = auto()

    CHECK_IF_NEW = auto()

    ADD_ENTRY = auto()
    ADD_OVERWRITE = auto()
    VIEW_ENTRY = auto()
    CONFIRM_VIEW_ENTRY = auto()

    CONFIRM_OVERWRITE = auto()
    CONFIRM_EXIT = auto()
    QUIT = -1
```

The chatbot starts in the GREETING state, and then the responses from the user are analysed by the intent classifier. This is to ensure that any "cancel" messages or "goodbye" messages are processed properly. Once the intent has been analysed the chatbot acts according to the dialogue flow diagram. Cancel and return state changes have not been included in the diagram but are implemented throughout the chatbot states. When a user indicates that they want to add an entry, the chatbot uses the NER_handler to get the relevant information.
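
The idea of intent-driven state changes can be sketched as a small transition table. This is a simplified illustration with a hypothetical subset of states and a hypothetical `next_state` helper; the project's own methods contain more logic than a table lookup.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical subset of the states listed above.
    RUNNING = auto()
    ADD_ENTRY = auto()
    VIEW_ENTRY = auto()
    QUIT = auto()


# (current state, intent) -> next state; pairs not listed keep the current state.
TRANSITIONS = {
    (State.RUNNING, "add_entry"): State.ADD_ENTRY,
    (State.RUNNING, "query_entry"): State.VIEW_ENTRY,
    (State.ADD_ENTRY, "cancel"): State.RUNNING,
    (State.VIEW_ENTRY, "cancel"): State.RUNNING,
}


def next_state(state, intent):
    """Return the state that follows `state` when the user expresses `intent`."""
    if intent == "goodbye":  # handled the same way in every state
        return State.QUIT
    return TRANSITIONS.get((state, intent), state)


print(next_state(State.RUNNING, "add_entry"))
```

Keeping "goodbye" outside the table mirrors the way cancel and exit handling applies in every state rather than being drawn on each edge of the diagram.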

A diary entry consists of:
<ul>
<li>The unabridged entry as typed in by the user</li>
<li>The people that the user was with that day (as predicted by the NER model)</li>
<li>The location that the user was in (as predicted by the NER model)</li>
<li>And finally, the emotion that the user felt that day (as predicted by the sentiment analysis model)</li>
</ul>

This is currently stored in a separate .csv file, but again, for deployment, the data would need to be stored securely and in a scalable solution.
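
As an illustration of this storage layout, the sketch below writes one entry with the four parts listed above (plus a date) to CSV and looks it up again. The column names and the example entry are assumptions made for the sketch, and an in-memory buffer stands in for the per-user file.

```python
import csv
import io

# Hypothetical column names; the project's .csv layout may differ.
FIELDNAMES = ["date", "entry", "people", "location", "emotion"]


def write_entries(entries, handle):
    """Write diary entries (dicts keyed by FIELDNAMES) as CSV rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(entries)


def find_entry(handle, day):
    """Return the first entry whose date matches `day`, or None."""
    for row in csv.DictReader(handle):
        if row["date"] == day:
            return row
    return None


entry = {
    "date": "2022-05-18",
    "entry": "Went to the park, then met Sam",  # the comma is quoted by csv
    "people": "Sam",
    "location": "park",
    "emotion": "happy",
}

buffer = io.StringIO()  # stands in for a per-user .csv file
write_entries([entry], buffer)
buffer.seek(0)
print(find_entry(buffer, "2022-05-18"))
```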

An example of how the chatbot implements dialogue flow is shown below:

```python
def check_if_exists(self, user_input):
        response = "Tell me about your day"

        intent = intent_handler.predict_intent(user_input)

        if intent in ["cancel", "no"]:
            self.__change_state(STATE.RUNNING)
            response = "No problem, you can always add to your diary later. What would you like to do now?"
        elif intent == "goodbye":
            response = self.goodbye()
        else:
            user_data = pd.read_csv('csvs/user_csvs/{}.csv'.format(self.user_id))
            user_data = user_data.to_numpy()
            for row in user_data:
                if str(row[0]) == str(date.today()):
                    self.__change_state(STATE.CONFIRM_OVERWRITE)
                    return "It seems that you've already got an entry in your diary for today. Would you like to overwrite it?"

            self.__change_state(STATE.ADD_ENTRY)

        return response
```

In this method, the chatbot checks whether the user's message is classified as a "cancel" or "no" intent by the intent classifier. If it is, the state is returned to the base state (**STATE.RUNNING**). If the user says goodbye, the chatbot will commence the goodbye response.

Otherwise, the chatbot will check if an entry for the current day already exists before changing to **STATE.ADD_ENTRY**, where the chatbot will wait for the user to enter a diary entry.

#### **Intent Classification Component**

The intent classifier determines what the user wants to do with the chatbot. The following intents are included in the intent classification model to be used to determine the state of the chatbot:

* greeting - any messages that indicate a user is saying hi to the bot
* yes - any messages that indicate a positive agreement
* no - messages that indicate a negative response
* goodbye - when a user indicates leaving the chat, exiting, or saying bye to the bot
* cancel - when the user wants to cancel their last action or return to the previous state
* add_entry - intent that indicates a user wants to add a diary entry
* query_entry - any messages where the user wants to look back on previous entries

Additionally, there was an intent included for out-of-scope messages, such as "What is the time?" which are outside of the chatbot's intended use.

Any message that the user sends is passed to the intent classifier and is used by the dialogue flow manager to determine the chatbot's response.

(In the end, we did not use the "greeting" intent as it was unnecessary for the proposed dialogue flow. However, it is still included in the model.)

##### Model choice, dataset, and training results

The intent classification model chosen for this application is a CNN based architecture (inspired by Yoon Kim's <a href="https://doi.org/10.48550/arXiv.1408.5882">TextCNN</a>) featuring fastText word embeddings. This model architecture was chosen as it performed well during research completed for the individual coursework component of this module. Additionally, this model is more lightweight than the corresponding BERT or transformer based models.

As intent classification will be performed on all messages, a lighter model was chosen over potentially more accurate but bulkier models. (Usage of transformer based models is demonstrated in the NER and Sentiment Classification components which are only called once when the user makes a diary entry, so efficiency was less important in those cases.)

We trained the intent classifier model on a custom intent dataset inspired by <a href="https://archive.ics.uci.edu/ml/datasets/CLINC150">CLINC150</a> with manual dataset entry generation for the add_entry and query_entry intents. (The .csv of intents can be found in the `documentation/intent_classifier` folder).

The custom dataset contains over 1000 sample message entries and is visualised as below.

![Intent distribution](images/intents.png)

The code to train the model is provided in notebook format for reference purposes in the `documentation/intent classifier` folder. The model created from training on the custom dataset performed well against the validation set, reaching a validation accuracy of 96.23% and a loss of 0.0795. The intent classifier works sufficiently well at predicting the intent of a message for our intended purpose. Graphs of the training loss and accuracy against validation loss and accuracy are included below.

![Accuracy graph](images/accuracy_intents.png) ![Loss graph](images/loss_intents.png)

The model is then saved in keras' legacy h5 format, and the label encoder and vectoriser used for training are pickled. These three components form the model and are saved in the intent_classifier folder to be used in the intent handler for the web service.

The file `intent_handler.py` loads the model, vectoriser, and label encoder, and provides a method to return the intent as a string.

##### Loading the model


```python
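#imports assumed for this excerpt (TextVectorization is available under tf.keras.layers in recent TensorFlow 2.x releases)
import pickle

import tensorflow as tf
from tensorflow.keras.layers import TextVectorization
from tensorflow.keras.models import load_model

#load model from folder
model = load_model("intent_classifier/intents.h5")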

#unpickle vectoriser configs
from_disk = pickle.load(open("intent_classifier/vectoriser.pkl", "rb"))

#load vectoriser
vectoriser = TextVectorization.from_config(from_disk['config'])
vectoriser.adapt(tf.data.Dataset.from_tensor_slices(["xyz"]))  # call adapt on dummy data (necessary due to a keras bug)
vectoriser.set_weights(from_disk['weights'])
encoder = pickle.load(open("intent_classifier/encoder.pkl", "rb"))

#get the label mappings
class_names = encoder.classes_

#create the end-to-end pipeline model for intent prediction
input_str = tf.keras.Input(shape=(1,), dtype="string")
x = vectoriser(input_str)  # vectorize
output = model(x)
intent_classifier = tf.keras.Model(input_str, output)
```

##### Method to return intent

```python
def predict_intent(user_input):
    user_input = clean(user_input) # clean the text
    prediction = intent_classifier.predict([[user_input]]) # get model prediction
    intent = class_names[np.argmax(prediction[0])] # return label
    chatbot_logger.log_prediction("Predicting user intent", user_input, intent) # log
    return intent
```

##### Cleaning text


The pipeline additionally includes text cleaning as this was performed on the training data. We strip the punctuation and numbers from the user's message before passing it through to the model to generate an intent.

```python
import string


def clean(message):
    message = " ".join([word.lower() for word in message.split()])

    remove = str.maketrans((string.punctuation + '£' + string.digits), ' '*len((string.punctuation + '£' + string.digits)))
    result = message.translate(remove)
    return result
```
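
To make the effect of the cleaning step concrete, the sketch below repeats the same logic in a self-contained form and applies it to a made-up message. Because punctuation and digits are replaced with spaces rather than deleted, a word such as "it's" is split into two tokens.

```python
import string


def clean(message):
    """Lower-case the message and replace punctuation and digits with spaces."""
    message = " ".join(word.lower() for word in message.split())
    to_strip = string.punctuation + "£" + string.digits
    return message.translate(str.maketrans(to_strip, " " * len(to_strip)))


cleaned = clean("Hello, World! It's 9am")
print(cleaned.split())  # ['hello', 'world', 'it', 's', 'am']
```

The cleaned string still contains runs of spaces where characters were replaced, which is why the example splits it before printing.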

#### **Sentiment Classification Model**

We wanted to develop the chatbot to be able to classify user diary entries by the following emotions:

* happiness
* sadness
* anxiety
* anger
* neutral / no emotion

The chatbot will take a user's diary entry, analyse the content, and return an emotion to be saved as part of the diary entry.

##### Model choice, dataset, and training results

To implement the sentiment analysis component, we chose to fine-tune a BERT model using the Hugging Face Transformers library, specifically the <a href="https://huggingface.co/bert-base-uncased">BERT base uncased</a> pre-trained model.

We chose to use BERT as sentiment analysis is performed at most once per diary entry, as mentioned previously. Additionally, BERT performed best at predicting a diary entry's emotion during research and testing. As diary entries are longer than standard response messages, identifying the emotion of an entry relies on the context of words throughout the whole entry, and BERT generally performs well on such tasks.

A dataset collated from three different sources was used to train this model. We used entries from <a href="http://yanran.li/dailydialog.html">DailyDialog</a>, <a href="https://www.site.uottawa.ca/~diana/resources/emotion_stimulus_data/">Emotion Stimulus</a>, and <a href="https://github.com/sinmaniphel/py_isear_dataset">ISEAR</a>.

We pre-processed the datasets so that they could be combined cohesively, as they were labelled differently (e.g. "joy" vs "happy" vs "happiness" vs encoded emotion labels). We removed any emotions that were irrelevant, such as love, and saved this dataset for training BERT. Some of the emotions had far more entries than others, as shown in the distribution below. Neutral had over 800,000 entries compared to anxiety, which only contained around 1,000, so we randomly undersampled the neutral and happiness classes to 3,000 each. (The final dataset can be found in `documentation/sentiment_analysis/emotions_final.csv`).

![Emotions](images/emotions_dist.png)
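
The label harmonisation and undersampling steps described above can be sketched as follows. The `LABEL_MAP` contents (for example, mapping "fear" to "anxiety") and the tiny example data are assumptions for illustration; the project's notebook may differ. For brevity, the sketch caps every class, whereas we only undersampled the two largest.

```python
import random
from collections import Counter

# Hypothetical mapping from source-specific labels to the shared label set.
LABEL_MAP = {
    "joy": "happiness", "happy": "happiness", "happiness": "happiness",
    "sad": "sadness", "sadness": "sadness",
    "fear": "anxiety", "anxiety": "anxiety",
    "anger": "anger",
    "neutral": "neutral", "no emotion": "neutral",
}


def harmonise(rows):
    """Map (text, label) rows onto the shared labels, dropping unmapped ones."""
    return [(text, LABEL_MAP[label]) for text, label in rows if label in LABEL_MAP]


def undersample(rows, cap, seed=0):
    """Randomly keep at most `cap` rows per label."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[1], []).append(row)
    kept = []
    for label_rows in by_label.values():
        kept.extend(rng.sample(label_rows, min(cap, len(label_rows))))
    return kept


raw = [("text %d" % i, "no emotion") for i in range(50)]
raw += [("a", "joy"), ("b", "fear"), ("c", "love")]  # "love" is dropped
balanced = undersample(harmonise(raw), cap=10)
print(Counter(label for _, label in balanced))
```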

The code for fine-tuning BERT on our custom dataset is provided in notebook format for reference in `documentation/sentiment_analysis`. We split the dataset into training, validation, and test sets, tokenised the text using Hugging Face's AutoTokenizer, and fine-tuned BERT using AutoModelForSequenceClassification.

The evaluation metrics from training were as follows:

```python 
{'eval_loss': 1.1812270879745483,
 'eval_accuracy': 0.8042553191489362,
 'eval_f1': 0.8043095167183316,
 'eval_runtime': 1.3743,
 'eval_samples_per_second': 342.003,
 'eval_steps_per_second': 21.83,
 'epoch': 10.0}
```

And test prediction metrics:
```python
{'test_loss': 1.189466953277588,
 'test_accuracy': 0.781021897810219,
 'test_f1': 0.7798639468951565,
 'test_runtime': 3.5763,
 'test_samples_per_second': 306.459,
 'test_steps_per_second': 19.293}
```
Once fine-tuned on our data, the resulting model can be saved and shared (for this project, the model is saved in the `model` folder of the root directory). It can then be used to form an end-to-end pipeline that returns a predicted emotion for a user's input.
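
For readers unfamiliar with the accuracy and F1 figures reported above, the sketch below shows how such metrics can be computed from true and predicted labels. The macro average used here is an assumption, as the averaging behind the reported F1 scores may differ, and the labels are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


y_true = ["happy", "sad", "sad", "neutral"]
y_pred = ["happy", "sad", "neutral", "neutral"]
print(accuracy(y_true, y_pred))  # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.7778
```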

##### Loading the model

The file `sentiment_handler.py` of the webservice loads the model as a text classification pipeline and provides a method for the webservice to analyse text for an emotion.

```python
from transformers import pipeline

classifier = pipeline("text-classification", model='model') # load model in as a pipeline that returns the top prediction only
# the pipeline automatically cleans and tokenizes the text as required for BERT as well as making the prediction
```

##### Method to return emotion

```python
# dictionary of text emojis to represent emotions, could be replaced with real emojis if a GUI is implemented
moods = {
    "happy": ":D",
    "sad": ":C",
    "anxious": ":z",
    "angry": ">:C",
    "tired": "(z_Z)",
    "bored": ":|",
    "neutral": ":L"}


def predict_sentiment(user_input):
    sent_pred = classifier(user_input) # use pipeline to predict emotion
    sentiment = sent_pred[0]['label'] # return prediction as string
    chatbot_logger.log_prediction("Making Sentiment Prediction", user_input, sentiment) # log
    return sentiment


def get_emoticon(user_input):
    sentiment = predict_sentiment(user_input) # get emotion
    emoticon = moods[sentiment] # get emoticon related to emotion
    return sentiment, emoticon
```


The emotion is returned as a string to be stored with an entry.

One modification that needed to be made was to ensure the labels returned the correct emotions rather than an encoded label number. This required a one-time manual editing of the file "model/config.json" to change the id2label and label2id dictionaries to show an emotion rather than LABEL_03, for example.
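
The same edit could also be scripted rather than made by hand. The sketch below operates on an in-memory JSON string with two default labels; the label names and their order are assumptions, and the order must match the label encoding used during training.

```python
import json

# Hypothetical excerpt of a config.json with default label names.
config_text = """
{
  "id2label": {"0": "LABEL_0", "1": "LABEL_1"},
  "label2id": {"LABEL_0": 0, "LABEL_1": 1}
}
"""

# Hypothetical label order; it must match the encoding used for training.
emotions = ["happy", "sad"]

config = json.loads(config_text)
config["id2label"] = {str(i): name for i, name in enumerate(emotions)}
config["label2id"] = {name: i for i, name in enumerate(emotions)}
print(json.dumps(config, indent=2))
```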

#### **Named Entity Recognition**

Simple Transformers' [NERModel](https://simpletransformers.ai/docs/ner-model/) was used to train our named entity recognition component. The [conll2003](https://huggingface.co/datasets/conll2003) dataset was used as it focuses on language-independent named entity recognition, which we found performed well for our chatbot domain, where users may have non-English names. As most of the data in conll2003 comes from newspapers, we were able to extract date and time tokens, re-label them, and use them to train the model. The model itself is a pre-trained BERT model (bert-base-cased). We found that it had a high accuracy in entity recognition and returned predictions that were easy to process.

This was the run summary of the best performing model:

![Run summary of NERModel](images/ner_summary.png)

The `ner_handler.py` of the webservice loads the model from a serialised python object that is saved in a joblib format and provides several methods for retrieving named entities.

```python
import joblib
from simpletransformers.ner import NERModel

model = joblib.load("ner_classifier_nmp.joblib") #Loading pickled model
```

```python
#Returns a list of all entities in a users input
def predict_ner(user_input):
    prediction, _ = model.predict([user_input])
    grouped_entities = group_split_entities(prediction)
    chatbot_logger.log_prediction("Making NER Predictions",user_input,grouped_entities)
    return grouped_entities

```

NERModel includes the tokenizer in the class, with configs on how raw inputs should be processed:

![Example tokenizer json config file](images/example_tokeniser_json.png)

```python
#Returns a list of entities based on the required type
def get_entity(user_input,entity_type):
  predictions = predict_ner(user_input)
  #keep the entity text (the dictionary key) whenever its label matches the requested type
  entities = [k for k in predictions.keys() if predictions[k]==entity_type]
  return entities
```

**Formatting entity predictions into a more accessible format for the chatbot**

```python
import re

new_labels_enum = {"PER":"PERSON","LOC":"GPE","ORG":"ORGANIZATION","MISC":"MISCELLANEOUS"}

begin_ent = re.compile("B-[A-Za-z]+")

#This groups entities that should be together. For example, the model predicts "San" as B-LOC and "Francisco" as I-LOC. This function groups these as "San Francisco" with a new label of GPE
def group_split_entities(predictions):
  entities = {}
  list_prediction = list(predictions[0])
  for i in range(0,len(list_prediction)):
    end_entity = ""
    list_ent = list_prediction[i]
    entity = list(list_ent.keys())[0]
    if list_ent[entity] == "O" or re.match("I-[A-Za-z]+",list_ent[entity]):
      continue
    elif begin_ent.match(list_ent[entity]):
        end_entity = list_ent[entity][2:]
        valid_ent=True
        new_entity = entity
        i=i+1
        while i< len(list_prediction) and valid_ent:
          list_ent=list_prediction[i]
          entity = list(list_ent.keys())[0]
          if re.match("I-"+end_entity,list_ent[entity]):
            new_entity = new_entity+" "+entity
            i+=1
          else:
            valid_ent = False        
        entities = {**entities, new_entity:new_labels_enum[end_entity]}
        
    else:
      entities = {**entities,entity:list_ent[entity]}
    
  return entities

```
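
The grouping idea can also be shown in a more compact form. The sketch below is a simplified alternative, not the function used in the project: it assumes the predictions have already been flattened into `(word, tag)` pairs and returns a list of pairs, so repeated mentions of the same phrase are not merged.

```python
def group_bio(tagged_tokens):
    """Merge B-/I- tagged tokens into (phrase, entity_type) pairs."""
    entities = []
    current_words, current_type = [], None
    for word, tag in tagged_tokens:
        if tag.startswith("B-"):
            if current_words:  # close the previous entity first
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [word], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_words.append(word)  # continue the open entity
        else:  # "O" or a stray I- tag closes any open entity
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


tokens = [("I", "O"), ("visited", "O"), ("San", "B-LOC"), ("Francisco", "I-LOC"),
          ("with", "O"), ("Bob", "B-PER")]
print(group_bio(tokens))  # [('San Francisco', 'LOC'), ('Bob', 'PER')]
```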

##### **Handling Date entities**

Processing dates required more use of heuristics and basic language analysis to improve accuracy, as there are many ways to express a single idea. "1 day ago" and "yesterday" are syntactically different but share the same meaning. Another case is "last week" and "last week tuesday": both start with the same two tokens, but the word "tuesday" changes the meaning from a fairly vague period to a specific day. For this reason, methods were created to handle potential user inputs that might not otherwise be detected accurately.
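
A minimal sketch of this kind of heuristic is given below. The function `resolve_relative_date` is hypothetical and covers only the phrases mentioned above; reading "last week tuesday" as the Tuesday of the previous Monday-to-Sunday week is an assumption.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_relative_date(text, today):
    """Return a date for a few relative phrases, or None if none is recognised."""
    text = text.lower()
    if "yesterday" in text:
        return today - timedelta(days=1)
    match = re.search(r"(\d+)\s+days?\s+ago", text)
    if match:
        return today - timedelta(days=int(match.group(1)))
    if "last week" in text:
        for index, name in enumerate(WEEKDAYS):
            if name in text:
                # Monday of the previous week, then step forward to the named day.
                last_monday = today - timedelta(days=today.weekday() + 7)
                return last_monday + timedelta(days=index)
        return today - timedelta(days=7)  # vague "last week": one week back
    return None


today = date(2022, 5, 25)  # a Wednesday
print(resolve_relative_date("What was I doing 7 days ago?", today))
print(resolve_relative_date("what did I do last week tuesday", today))
```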

```python
#This function uses the NER model and regular expressions to look for date-like entities and returns a date based on the user's input.
def get_date(user_input):
  predictions = predict_ner(user_input)
  input_split = user_input.split(" ")
  #raw string avoids invalid escape sequence warnings; matches DD-MM-YYYY or DD/MM/YYYY tokens
  date_pattern = r'^(0[1-9]|[12][0-9]|3[01])(-|/)(0[1-9]|1[0-2])(-|/)\d{4}$'
  potential_dates = [v for v in input_split if re.match(date_pattern, v)]
  target_date = ""
  if len(potential_dates)>0:
    target_date=potential_dates[0]
  else:
    target_date = search_by_weekday(predictions,input_split)

  return target_date
```
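
The explicit-date branch can be illustrated in isolation. The sketch below uses an equivalent pattern and additionally converts the match to a `date` object; the helper name `find_explicit_date` and the calendar validation step are assumptions, not part of the handler above.

```python
import re
from datetime import date

# Matches DD-MM-YYYY or DD/MM/YYYY tokens.
DATE_PATTERN = re.compile(r"^(0[1-9]|[12][0-9]|3[01])[-/](0[1-9]|1[0-2])[-/]\d{4}$")


def find_explicit_date(user_input):
    """Return the first DD-MM-YYYY or DD/MM/YYYY token as a date, else None."""
    for token in user_input.split():
        if DATE_PATTERN.match(token):
            day, month, year = (int(part) for part in re.split(r"[-/]", token))
            try:
                return date(year, month, day)
            except ValueError:
                # e.g. 31/02/2022 passes the regex but is not a real date
                continue
    return None


print(find_explicit_date("show me 18/05/2022 please"))
print(find_explicit_date("what about 31/02/2022"))
```

Because the pattern is anchored at both ends, a token with trailing punctuation such as `18/05/2022?` is not matched, so such input would need the punctuation stripped first.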

#### **Creating REST Endpoints with Flask**

```python
from flask import Flask, request
from chatbot import Chatbot
import chatbot_logger
import json

bot_name = "DearBot"
print("Starting: " + bot_name)
app = Flask(__name__)


@app.post('/get_response')
def get_response():
    # expects a JSON body of the form {"msg": "<user message>"}
    data = request.json
    print(data)
    user_input = data['msg']
    response = bot.get_response(user_input)
    return json.dumps(response)


@app.post('/start_greeting')
def start_greeting():
    # returns the chatbot's opening message
    response = bot.say_greeting()
    print(response)
    return json.dumps(response)


if __name__ == '__main__':
    # the bot is created here, so the module must be run as a script
    bot = Chatbot(bot_name)
    app.run(debug=True, use_reloader=False)
```
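
A client can call these two endpoints with any HTTP library. The sketch below uses only the standard library; the base URL assumes Flask's default development address, and the helper names `build_request` and `call` are hypothetical rather than taken from the project's test client.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumed: Flask's default development address


def build_request(path, payload=None):
    """Build a JSON POST request for one of the two endpoints."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path, payload=None):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload)) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_request("/get_response", {"msg": "I would like to talk about my day"})
print(req.get_method(), req.full_url)

# With the server running, a conversation could start like this:
#   print(call("/start_greeting"))
#   print(call("/get_response", {"msg": "yes"}))
```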

## **Q3- Basic Functionality Testing**

To run the Flask app, run the command `python build_and_run.py` in a terminal.

Requirements:
Python Version: 3.9.7

Once the Flask server is running, a client can send REST requests to the app to interact with the bot.
This functionality can be tested either by using the `test_endpoints.ipynb` notebook or by running `test_client.py`.

If this does not work, you can start the server manually by following the instructions below.

<ol>
<li>Open a command line in the folder containing <code>dear_bot.py</code></li>
<li>Run the command <code>pip install --upgrade --user pip</code></li>
<li>Run the command <code>pip install -r requirements.txt</code></li>
<li>Run the command <code>python dear_bot.py</code></li>
<li>In a second terminal, run the command <code>python test_client.py</code></li>
</ol>

As diary entries cannot be retroactively added or edited (this was out-of-scope for our project and not a specified objective), we have provided example entries in pre-initialised .csv databases available in the `csvs/` folder. Normally, the program would create an empty users database when being used for the first time, and create new databases per new user.

This is provided so that the reader can test the chatbot's **view_entry** functionality, using the name `Bob` and the passphrase `test123`. Use the example below for testing.

```
Bot: Hey there! Have you used DearBot before?
User: yes
Bot: Great! Please could you tell me your name?
User: Steven
Bot: Just checking if I got that right, is your name Steven?
User: yes
Bot: Hi Steven! Can you please enter your special phrase?
User: test123
Bot: Hi Steven! Nice to see you again! What would you like to do?
User: What was I doing 7 days ago?
Bot: Here's your summary for 2022-05-18

You went to: New York City
You were with: Siren
Overall on this day you felt neutral :L

This was your full entry for the day: i visted my dad Siren in New York City

What else would you like to do today?
```

### **Example to add a new user and add an entry**

In the following example, a user has a straightforward conversation with the chatbot and adds a diary entry; the bot then summarises the user's day from that entry.

![Example of A new user adding an entry](images/dobby1.png)

We can see that the chatbot successfully extracts relevant information to save into the summary and recalls this for the user. It also was able to predict the sentiment of the entry, correctly saying that the user was happy.

This interaction shows that the bot is able to predict yes, no, add_entry, and goodbye intents, as well as extract information from a user's input. We can also see that the last user input, "nothing thanks", is correctly predicted with a "goodbye" intent.

In the next image we show the same user trying the same type of interaction of making a diary entry but being less straightforward.

![Example of confusing inputs](images/dobby2.png)

In this example, the bot has to handle several attempts to confuse it, such as being given an unexpected answer in the incorrect state. In this instance, the second 'it is' is sent even though the bot has already responded. The bot correctly responds with the "cannot understand" message and reiterates what it is looking for from the user.

This example also shows good accuracy in intent classification, as "it is" and "yes" are both accurately predicted as "yes" intents.

### **Functionality Testing**

Results from basic functionalities that have been tested on our chatbot are available as a .pdf accessible at `documentation/testing/functionality_tests.pdf`. The functionality tests that were performed are summarised in the following table.

| Functional Requirement    | Expected Outcome                                                                                                                             | Pass/Fail |
|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-----------|
| User Creation             | A new user is added to the .csv database and the user can continue                                                                           | Pass      |
| Failed Sign-up Indication | Chatbot requests a different phrase if a user of the same name and passphrase already exists in CSV                                          | Pass      |
| Recognise Unique Users    | If user and passphrase exist, a welcome message should indicate the user has been recognised and can continue                                | Pass      |
| Failed Login Attempt      | If user and passphrase combination do not exist, restart user recognition process.                                                           | Pass      |
| Named Entity Recognition  | Identify the location and names of people mentioned in user input                                                                            | Pass      |
| Sentiment Analysis        | Predict the correct sentiment from a diary entry                                                                                             | Pass      |
| Intent Classification     | Identify the correct intent from user input                                                                                                  | Pass      |
| Add Entry                 | When user adds entry, the .csv is updated with the entry, location, people, and emotion                                                      | Pass      |
| Change Entry              | If entry already exists for the current day, user is prompted about overwriting entry, and if a yes intent is indicated, the .csv is updated | Pass      |
| View Entry Success        | Fetch and display entry and entry summary from .csv when user asks to see an entry                                                           | Pass      |
| View Entry Fail           | If no entry exists for the date requested, return explanation message                                                                        | Pass      |
| User can exit chat bot    | User can cancel, or say goodbye at all stages                                                                                                | Pass      |
| User can exit chat bot    | User can cancel at all stages                                                                                                                | Pass      |
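
The sign-up and login cases in the table can be illustrated with a minimal sketch. It assumes a CSV with `name` and `passphrase` columns; those column names and both function names are hypothetical, since the project's actual user-handling code is not shown here.

```python
import csv
import io
from typing import TextIO


def user_exists(csv_file: TextIO, name: str, passphrase: str) -> bool:
    """True if a row with this exact name and passphrase is already stored."""
    csv_file.seek(0)
    return any(row["name"] == name and row["passphrase"] == passphrase
               for row in csv.DictReader(csv_file))


def sign_up(csv_file: TextIO, name: str, passphrase: str) -> bool:
    """Append a new user; return False so the bot can ask for another phrase."""
    if user_exists(csv_file, name, passphrase):
        return False
    csv_file.seek(0, io.SEEK_END)  # move to the end before appending
    csv.writer(csv_file).writerow([name, passphrase])
    return True


if __name__ == "__main__":
    # An in-memory file stands in for the real .csv database.
    users = io.StringIO("name,passphrase\r\n")
    print(sign_up(users, "Ada", "lovelace"))      # first sign-up
    print(sign_up(users, "Ada", "lovelace"))      # same name and phrase again
    print(user_exists(users, "Ada", "lovelace"))  # login check
```

A `False` return from `sign_up` corresponds to the "Failed Sign-up Indication" case, and `user_exists` covers both the successful and failed login cases.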

## **Q4- Performance of Chatbot**

### **Accuracy of solution**


We tested the `ner_handler` on a large, randomly generated list of first and last names. The model accurately predicted and grouped the names together 89% of the time. This is a fairly high result and, combined with the state machine that lets the chatbot keep asking for the entities it needs, we were happy with its performance.
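
A measurement of this kind could be sketched as follows. This is not the project's test script: the name pools, the sentence template and the `capitalised_pairs` stand-in extractor are all assumptions for illustration. In practice, the callable passed to `grouping_accuracy` would wrap the real `ner_handler`.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical name pools; the original test data is not shown in this report.
FIRST_NAMES = ["Alice", "Bilal", "Chen", "Dana", "Emeka"]
LAST_NAMES = ["Smith", "Khan", "Garcia", "Okafor", "Novak"]


def make_samples(n: int, seed: int = 0) -> List[Tuple[str, str]]:
    """Return n (sentence, expected full name) pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        full_name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        samples.append((f"Today I met {full_name} in town.", full_name))
    return samples


def grouping_accuracy(extract_people: Callable[[str], List[str]],
                      samples: List[Tuple[str, str]]) -> float:
    """Fraction of samples where the full name comes back as one entity."""
    if not samples:
        return 0.0
    hits = sum(1 for text, expected in samples
               if expected in extract_people(text))
    return hits / len(samples)


def capitalised_pairs(text: str) -> List[str]:
    """Toy stand-in for the NER model: joins adjacent capitalised words."""
    words = [w.strip(".,") for w in text.split()]
    return [f"{a} {b}" for a, b in zip(words, words[1:])
            if a.istitle() and b.istitle()]


if __name__ == "__main__":
    score = grouping_accuracy(capitalised_pairs, make_samples(100))
    print(f"Grouped correctly: {score:.0%}")
```

Counting a sample as correct only when the first and last name are returned together matches the "predict and group" criterion described above.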

### **Speed of solution**

The graph below shows the time between request and response. For 20 requests, the server returned a response in under 9 seconds. This slow speed is not necessarily a bad thing: a relatively long wait between responses suits a diary chatbot, which should feel more conversational than robotic. It does, however, show that our solution is not very scalable, a compromise we knew we were making by using Flask and models of this size.

![Load test](images/load_test.png)
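
The tool used to produce this graph is not described here, so the following is only a sketch of how such timings could be collected with the standard library. `fake_send` is a hypothetical stand-in that sleeps instead of calling the chatbot. A real run would replace it with a function that posts a message to the Flask endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def time_requests(send: Callable[[str], object], messages: List[str],
                  workers: int = 20) -> List[float]:
    """Send every message concurrently and return each latency in seconds."""

    def timed(message: str) -> float:
        start = time.perf_counter()
        send(message)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(timed, messages))


def fake_send(message: str) -> str:
    """Stand-in for an HTTP call to the chatbot; sleeps instead of predicting."""
    time.sleep(0.01)
    return f"echo: {message}"


if __name__ == "__main__":
    latencies = time_requests(fake_send, ["Hello"] * 20)
    mean = sum(latencies) / len(latencies)
    print(f"max {max(latencies):.3f}s, mean {mean:.3f}s")
```

Sending the requests from a thread pool rather than one after another is what exposes the scalability limit: latency grows as requests queue behind a single model.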

### **Size of solution**

One of the biggest downsides to our approach is the size of the models we use. Although BERT models produce very accurate results when trained correctly, they also take up a significant amount of storage space. Our models for emotion classification and named entity recognition averaged about 400MB each.

An improvement would have been to use DistilBERT. DistilBERT is based on BERT and, as its name suggests, is a fast, cheap and light alternative to it. It uses knowledge distillation during the pre-training phase, which reduces the size of a BERT model by 40%. According to [Hugging Face](https://huggingface.co/docs/transformers/model_doc/distilbert), DistilBERT also runs 60% faster than BERT while preserving over 95% of its performance.
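
As a rough check on these figures, the size of a model's weights can be estimated from its parameter count. The sketch below assumes the base-sized checkpoints (about 110 million parameters for BERT-base and about 66 million for DistilBERT) stored as 32-bit floats. The report does not state which checkpoints were used, so these are assumptions.

```python
BYTES_PER_FLOAT32 = 4

# Approximate published parameter counts for the base checkpoints (assumed).
BERT_BASE_PARAMS = 110_000_000
DISTILBERT_PARAMS = 66_000_000


def weights_size_mb(n_params: int,
                    bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return n_params * bytes_per_param / 1_000_000


bert_mb = weights_size_mb(BERT_BASE_PARAMS)
distil_mb = weights_size_mb(DISTILBERT_PARAMS)
saving = 1 - distil_mb / bert_mb

print(f"BERT-base ~{bert_mb:.0f} MB, DistilBERT ~{distil_mb:.0f} MB, "
      f"saving {saving:.0%}")
```

Under these assumptions the estimate is roughly 440 MB against 264 MB, which is consistent with both the file sizes we observed and the 40% reduction quoted above.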

### **Database scalability**

Our approach uses CSV files to store user information. Although this is very easy to implement, it has several issues. The main issue is that concurrency is not feasible: two or more people cannot access the application at the same time, because there is no way of ensuring that changes made by one user do not affect the data of another. The CSV method is also very slow, as the entire file must be read each time data is requested, and parsing the CSV strings into the necessary Python objects is memory and time intensive. Using a database solves the majority of these issues. An ORM (object-relational mapper) can translate Python code into SQL and manage an SQL database. Another method involves using a database manager such as MySQL Workbench to maintain the database. In contrast to the CSV method, an SQL database has a lower search complexity and allows concurrent access to the application.
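
To make the comparison concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module rather than the ORM or MySQL options mentioned above. The table layout and function names are hypothetical and simply mirror the fields the diary stores.

```python
import sqlite3
from typing import Optional, Tuple

# Hypothetical schema mirroring the fields the diary stores.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entries (
    user_name  TEXT NOT NULL,
    entry_date TEXT NOT NULL,
    entry      TEXT NOT NULL,
    location   TEXT,
    people     TEXT,
    emotion    TEXT,
    PRIMARY KEY (user_name, entry_date)
)
"""


def connect(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the entries table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_entry(conn: sqlite3.Connection, user_name: str, entry_date: str,
               entry: str, location: Optional[str], people: Optional[str],
               emotion: Optional[str]) -> None:
    """Insert the entry, or overwrite the one already stored for that day."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO entries VALUES (?, ?, ?, ?, ?, ?)",
            (user_name, entry_date, entry, location, people, emotion),
        )


def get_entry(conn: sqlite3.Connection, user_name: str,
              entry_date: str) -> Optional[Tuple]:
    """Look up one entry via the primary-key index rather than a full scan."""
    return conn.execute(
        "SELECT entry, location, people, emotion FROM entries "
        "WHERE user_name = ? AND entry_date = ?",
        (user_name, entry_date),
    ).fetchone()
```

The composite primary key gives each user one entry per day and lets the database find it through an index, while each write runs inside a transaction, so one user's change cannot leave another user's row half-written.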

## **Q5- Basic Monitoring Capability**

To log user input and the chatbot’s responses, we created a wrapper around Python's built-in `logging` module and extended it with functions to log various kinds of information.
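
The project's own functions appear as screenshots in the subsections that follow. As a text sketch of the same idea, a wrapper of this kind might look like the code below. The function names match those used in this report, but the signatures, log format and `chatbot.log` file name are assumptions rather than the project's actual code.

```python
import logging
from typing import Optional


def get_logger(name: str = "dear_bot",
               handler: Optional[logging.Handler] = None) -> logging.Logger:
    """Return a logger; writes to chatbot.log unless a handler is supplied."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers
        if handler is None:
            handler = logging.FileHandler("chatbot.log")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


def log_conversation(logger: logging.Logger, user_input: str,
                     bot_response: str) -> None:
    """Log one user message together with the chatbot's reply."""
    logger.info("USER: %s | BOT: %s", user_input, bot_response)


def log_bot_state(logger: logging.Logger, state: str) -> None:
    """Log the current dialog state."""
    logger.info("STATE: %s", state)


def log_prediction(logger: logging.Logger, kind: str, user_input: str,
                   prediction: object) -> None:
    """Log a model prediction, e.g. kind='intent' or kind='ner'."""
    logger.info("PREDICTION[%s]: %r -> %r", kind, user_input, prediction)
```

Keeping each kind of event in its own function gives every log line a fixed prefix, which makes the log easy to filter later.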

#### Conversation logging

![log_conversation function](images/log_conversation.png)

`log_conversation` is used to log an interaction between the user and the chatbot.


##### Example conversation log

![Example conversation log](images/user_chatbot_interaction.png "Example conversation log")

#### State logging

![log_bot_state function](images/log_state.png)

`log_bot_state` can be used to log the dialog state of the chatbot.


##### Example state logging

![Example logged state](images/logged_state.png)

#### Prediction logging

![log_prediction function](images/log_prediction.png)

`log_prediction` is used to log any prediction the chatbot makes about user input such as intent and NER predictions.

##### Example prediction logging

***NER Logging***

![log ner](images/log_ner.png)

***Intent Logging***

![log intent](images/logged_intent.png)

## **Q6- CI/CD Build and Deployment**

To deploy this project, we will use build management and continuous integration server software to host the server side of the chatbot application. The build management software of choice is TeamCity by JetBrains. The codebase for the project itself will be stored on GitHub. A project will be created on the TeamCity server, with the version control system (VCS) root pointing at the main branch of the project repository on GitHub.

![VCS Root in TeamCity](images/TeamCity_VCS.png "VCS Root in TeamCity")

Once the VCS root has been set up to track the main branch, a build configuration will be created to automate the deployment of the server application. We will call the step "run dear_bot", as this is the Python program that must be run to start the server.

![Build Configuration in TeamCity](images/TeamCity_Build_Config.png "Build Configuration in TeamCity")

We can then edit the build steps to execute a command of our choosing. This could be command-line input or, if the build agent supports it, the direct execution of a file. For this project, the server will be deployed onto a local machine rather than a machine in the cloud. This machine is known to have the necessary Python dependencies for the execution of `dear_bot.py`, so the build step can be set to directly execute this file.

![Build Steps in TeamCity](images/build_steps.png "Build Steps in TeamCity")

Additionally, the requirements to be installed are specified in a text file. This is so that if the server is deployed on a different machine, the required dependencies are installed automatically through TeamCity.

![Installing requirements in TeamCity](images/requirements.png "Installing requirements in TeamCity")
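
As an optional complement to the install step, a build step could run a small pre-flight check before starting the server. The sketch below is not part of the TeamCity configuration shown above; it is an assumed helper script that handles only simple requirement lines, reporting which of the listed packages are not installed on the build agent.

```python
import re
from importlib import metadata
from typing import List


def parse_requirements(lines: List[str]) -> List[str]:
    """Return bare package names from requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r
        # Cut at the first version specifier, marker or extras bracket.
        names.append(re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0])
    return names


def missing_packages(names: List[str]) -> List[str]:
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    with open("requirements.txt") as handle:
        absent = missing_packages(parse_requirements(handle.readlines()))
    if absent:
        raise SystemExit(f"Missing packages: {', '.join(absent)}")
    print("All listed packages are installed.")
```

Exiting with an error makes the build fail early with a clear message, rather than part-way through starting the server.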

We can then run the build in TeamCity to deploy the server on the local machine.

![Running the build in TeamCity](images/running_build.png "Running the build in TeamCity")

## **Q7- Recording**

The recording can be found in the submission zip: 