# DialoGPT from Microsoft

**DialoGPT requires PyTorch and Hugging Face Transformers.**

https://github.com/microsoft/DialoGPT
https://huggingface.co/microsoft/DialoGPT-medium

A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT)

DialoGPT is a state-of-the-art (SOTA) large-scale pretrained dialogue response generation model for multi-turn conversations. Human evaluation results indicate that responses generated by DialoGPT are comparable in quality to human responses under a single-turn conversation Turing test. The model was trained on 147M multi-turn dialogues from Reddit discussion threads.

## Interesting use cases
* https://colab.research.google.com/drive/15wa925dj7jvdvrz8_z3vU7btqAFQLVlG A great tutorial by Rostyslav Neskorozhenyi on how to fine-tune DialoGPT to build a customized bot.

Let's make a chatbot! Source: https://www.thepythoncode.com/article/conversational-ai-chatbot-with-huggingface-transformers-in-python

## To download the pre-trained model, run the lines below

In [4]:
# the pretrained DialoGPT-medium is ~1G
# the pretrained DialoGPT-large is ~2G

import torch

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large")
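
The cell above loads the large checkpoint, while its size comments also mention the medium one. As a minimal sketch of switching between sizes, the hypothetical helper `dialogpt_model_id` below (an illustration, not part of Transformers) maps a size name to a Hub ID. It assumes the published checkpoints follow the `microsoft/DialoGPT-<size>` naming, with `small`, `medium`, and `large` as the available sizes.

```python
# Hypothetical helper for illustration; it is not part of the transformers library.
DIALOGPT_SIZES = ("small", "medium", "large")


def dialogpt_model_id(size: str) -> str:
    """Return the Hugging Face Hub ID for a DialoGPT checkpoint size."""
    normalized = size.strip().lower()
    if normalized not in DIALOGPT_SIZES:
        raise ValueError(f"unknown DialoGPT size: {size!r}")
    return f"microsoft/DialoGPT-{normalized}"


if __name__ == "__main__":
    print(dialogpt_model_id("medium"))
```

The returned string can be passed to both `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained` in place of the hard-coded name.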

In [7]:
# chat for 5 turns using nucleus (top-p) and top-k sampling with a lowered temperature,
# generating 5 candidate responses per turn

for step in range(5):

    # take user input
    text = input(">> You:")
    
    # encode the input and append the end-of-sequence (EOS) token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    
    # concatenate the new user input with the chat history (if any)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    
    # generate a bot response
    chat_history_ids_list = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.75,
        num_return_sequences=5,
        pad_token_id=tokenizer.eos_token_id
    )
    
    # print the candidate responses, decoding only the newly generated tokens
    for i in range(len(chat_history_ids_list)):
        output = tokenizer.decode(chat_history_ids_list[i][bot_input_ids.shape[-1]:], skip_special_tokens=True)
        print(f"DialoGPT {i}: {output}")
    
    choice_index = int(input("Choose the response you want for the next input: "))
    
    chat_history_ids = torch.unsqueeze(chat_history_ids_list[choice_index], dim=0)
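
The `generate` call in this cell combines `temperature`, `top_k`, and `top_p`. To make their effect concrete, here is a simplified pure-Python sketch of how the three settings narrow a next-token distribution. The function name `filter_top_k_top_p` and the toy logits are assumptions made for illustration; this is not the library's implementation, only the general idea of temperature scaling followed by top-k and nucleus filtering.

```python
import math


def filter_top_k_top_p(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering.

    Illustrative only: a toy version of the idea, not the transformers code.
    """
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = {i: e / total for i, e in enumerate(exps)}

    # Rank token indices from most to least probable.
    ranked = sorted(probs, key=probs.get, reverse=True)

    # Top-k: keep only the k most probable tokens, then renormalize.
    if top_k > 0:
        ranked = ranked[:top_k]
    mass = sum(probs[i] for i in ranked)
    probs = {i: probs[i] / mass for i in ranked}

    # Top-p (nucleus): keep the smallest prefix whose cumulative
    # probability reaches top_p.
    kept = []
    cumulative = 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens; sampling would draw from this.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up scores for four tokens
    print(filter_top_k_top_p(toy_logits, temperature=0.75, top_k=3, top_p=0.95))
```

Lowering `temperature` concentrates probability on the top tokens, so fewer tokens survive the `top_p` cutoff and the candidates become less varied.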
    
    

>> You: what's your favorite food?


DialoGPT 0: I'm a big fan of French toast.
DialoGPT 1: Haven't had it in a while. I like sushi and sushi is my favorite.
DialoGPT 2: I like pizza
DialoGPT 3: I like food.
DialoGPT 4: chicken parm


Choose the response you want for the next input:  1
>> You: I also love sushi. could you recommend any restaurants?


DialoGPT 0: I like The Sushi Club on 3rd and 11th. It's just a little bit out of my way, but it's not bad.
DialoGPT 1: I think they're all over the place.
DialoGPT 2: I don't like restaurants. I like sushi food.
DialoGPT 3: I like Nijiro. They have awesome sushi.
DialoGPT 4: I like the sushi at Haruki Nakamura in Nihonjin, it's a bit pricey, but it's great.


Choose the response you want for the next input:  4
>> You: Oh I want to go there. How much is it?


DialoGPT 0: It's a little pricey but it's a great restaurant.
DialoGPT 1: 25 for 4 meals.
DialoGPT 2: 75 for the meal.
DialoGPT 3: It's about 4.5 or 5 dollars. It's near a train station.
DialoGPT 4: It's 20


Choose the response you want for the next input:  2
>> You: hmm not too bad. I can buy you


DialoGPT 0: yeah. But
DialoGPT 1: hahahaha
DialoGPT 2: Haha
DialoGPT 3: It's on my wish list
DialoGPT 4: with a free meal, too.


Choose the response you want for the next input:  1
>> You: hahahaha


DialoGPT 0: hahahaha
DialoGPT 1: this guy is a real
DialoGPT 2: oh that's a good
DialoGPT 3: I'm not sure but
DialoGPT 4: lol what a guy.


Choose the response you want for the next input:  4
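
Two details of the loop above are worth guarding against. First, `int(input(...))` raises an error on a non-numeric or out-of-range choice. Second, when several sequences are returned at once, shorter ones are padded with `pad_token_id` (set to the EOS token here), so the chosen history can carry several trailing EOS tokens into the next turn. The sketch below shows one way to handle both on plain Python lists. The helper names `parse_choice` and `trim_trailing_padding` are hypothetical, and whether the extra padding actually affects later replies is not verified here.

```python
def parse_choice(raw, num_options):
    """Return a valid index in [0, num_options), or None if the input is unusable."""
    try:
        index = int(raw.strip())
    except ValueError:
        return None
    return index if 0 <= index < num_options else None


def trim_trailing_padding(token_ids, eos_id):
    """Drop trailing EOS padding, then append a single EOS as the turn separator."""
    end = len(token_ids)
    while end > 0 and token_ids[end - 1] == eos_id:
        end -= 1
    return token_ids[:end] + [eos_id]


if __name__ == "__main__":
    EOS = 50256  # illustrative value; in practice use tokenizer.eos_token_id
    padded = [101, 102, EOS, 201, 202, EOS, EOS, EOS]  # made-up token IDs
    print(trim_trailing_padding(padded, EOS))
    print(parse_choice("4", 5), parse_choice("7", 5), parse_choice("abc", 5))
```

In the chat loop, the chosen row could be converted with `.tolist()`, trimmed, and wrapped again with `torch.tensor([trimmed])` before it is stored as `chat_history_ids`; the prompt for a choice can be repeated while `parse_choice` returns `None`.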
