<a href="https://colab.research.google.com/github/sishef/nlpworkshop/blob/main/6_GeneratingTextWithML.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Notebook 6: Using machine learning to generate responses
In our previous chatbot we used ML to find the sentence in our training data closest in meaning to the user's input, and then responded with the sentence that follows it in the data.
This works, but it limits the chatbot's responses to the exact sentences it was trained on. We can do better!

DialoGPT is a pretrained machine learning text generator from Microsoft, built for conversation. Rather than picking from a fixed list, it can create entirely new sentences that fit the conversation, on most subjects. It has been trained on 147 million conversations taken from Reddit, so most topics are covered.

Here's some sample code you can run and then use in your chatbot. Don't worry too much about all the detail just yet.

In [None]:
%pip install transformers

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# choose a model size; uncomment one of the alternatives to compare them
model_name = "microsoft/DialoGPT-large"
# model_name = "microsoft/DialoGPT-medium"
# model_name = "microsoft/DialoGPT-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

In [None]:
# chat for 5 turns, using nucleus (top-p) sampling with a lowered temperature
for step in range(5):

    # take user input
    text = input(">> You:")

    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")

    # concatenate the new user input with the chat history (if there is any)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids

    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_p=0.95,
        top_k=0,
        temperature=0.75,
        pad_token_id=tokenizer.eos_token_id
    )
    
    # decode only the newly generated tokens (everything after the input) and print them
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")
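
The `do_sample`, `top_p` and `temperature` settings above control how the next token is picked. As a rough illustration only (this is a simplified pure-Python sketch, not the transformers implementation), nucleus sampling keeps the smallest set of most likely tokens whose probabilities add up to at least `top_p` and then samples from that set, while `temperature` below 1 makes the likely tokens even more likely. Setting `top_k=0` switches off the separate top-k filter. The function name `sample_top_p` and the toy scores are made up for this example.

```python
import math
import random

def sample_top_p(logits, top_p=0.95, temperature=0.75, rng=random):
    """Pick one token index using temperature scaling followed by nucleus (top-p) filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Softmax (subtract the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for i in order:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Sample from the nucleus; random.choices renormalizes the weights for us.
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]

# Toy scores for a 4-token vocabulary (hypothetical values).
toy_logits = [4.0, 3.0, 1.0, -2.0]
rng = random.Random(0)
picks = [sample_top_p(toy_logits, rng=rng) for _ in range(200)]
print(sorted(set(picks)))  # the two unlikely tokens fall outside the nucleus
```

Try raising `temperature` or `top_p` in this sketch to see less likely tokens start to appear, which is roughly why higher values give more surprising chatbot replies.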

A few things to notice about this code:

1. There are small, medium and large versions of the model. The smallest downloads quickest and generates results fastest, but how does its output compare with that of the medium and large versions?

2. You already know about tokenization. This code uses a tokenizer from the transformers library to turn your sentence into tokens.

3. The response is created by the 'model.generate(...)' call. This function takes in some text that has already been encoded (the bot_input_ids) and returns a response that is encoded. That's why we have to use 'tokenizer.decode(...)' to translate from these encodings back into regular text.

4. Notice also that we are remembering all of the previous sentences and adding them to our new sentence when we pass them into the model. This means the model should use the full context of the conversation so far when generating the next response, as opposed to just reacting to the latest user input.
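
Points 3 and 4 can be seen in miniature with plain Python lists. This is only a toy sketch: `encode`, `decode` and `fake_generate` are hypothetical stand-ins for the real tokenizer and model, and the word-level vocabulary is invented for the example. What it does mirror is the shape of the loop: the history is joined to the new input, the "model" returns the input followed by its reply, and we slice off the input to keep only the new part.

```python
# Toy stand-in for the tokenizer: a hypothetical word-level vocabulary (not DialoGPT's real one).
EOS = "<eos>"
vocab = {}

def encode(text):
    """Map each whitespace-separated word to an integer id, then append the end-of-string id."""
    return [vocab.setdefault(w, len(vocab)) for w in text.split() + [EOS]]

def decode(ids):
    """Map ids back to words, skipping the end-of-string marker."""
    id_to_word = {i: w for w, i in vocab.items()}
    return " ".join(id_to_word[i] for i in ids if id_to_word[i] != EOS)

def fake_generate(bot_input_ids):
    """Stand-in for model.generate: returns the input followed by a canned reply."""
    return bot_input_ids + encode("nice to meet you")

chat_history_ids = []
replies = []
for text in ["hello there", "how are you"]:
    input_ids = encode(text)
    # List concatenation plays the role of torch.cat in the real code.
    bot_input_ids = chat_history_ids + input_ids
    chat_history_ids = fake_generate(bot_input_ids)
    # Keep only what was generated this turn, i.e. everything after the input.
    new_ids = chat_history_ids[len(bot_input_ids):]
    replies.append(decode(new_ids))

print(replies)
print(len(chat_history_ids))  # the history keeps growing with every turn
```

Because the history grows every turn, a long conversation eventually makes the input very long, which is one reason the real code caps the total with `max_length`.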

**TASK:** Change your current chatbot to use the DialoGPT model to generate its responses

**TASK:** Play with the different models (small, medium, large) to see how much better or worse each one is

**TASK:** Change the code to generate two responses, one that uses the current conversation history and one that does not. Do the responses make noticeably more sense when the history is used?

Note: This code is taken from this tutorial: https://www.thepythoncode.com/article/conversational-ai-chatbot-with-huggingface-transformers-in-python
