
# DialogRPT Online Demo (with 🤗)

How likely is a dialog response to be upvoted and/or to trigger more replies? This is what [DialogRPT](https://github.com/golsun/DialogRPT) is trained to predict.
It is a set of transformer-based dialog response ranking models trained on millions of human feedback data points.

The Hugging Face model cards for DialogRPT are available:

| Model card | Description  | 
| :-----------: | :----------- |
|   | **given a context and its two human responses, predict...** | 
| [`microsoft/DialogRPT-updown`](https://huggingface.co/microsoft/DialogRPT-updown) |  ... which gets more upvotes?  |
| [`microsoft/DialogRPT-width`](https://huggingface.co/microsoft/DialogRPT-width) | ... which gets more direct replies?  | 
| [`microsoft/DialogRPT-depth`](https://huggingface.co/microsoft/DialogRPT-depth) |  ... which gets a longer follow-up thread? | 
|  | **given a context and one human response, distinguish it from...**  |
| [`microsoft/DialogRPT-human-vs-rand`](https://huggingface.co/microsoft/DialogRPT-human-vs-rand) | ... a random human response  | 
| [`microsoft/DialogRPT-human-vs-machine`](https://huggingface.co/microsoft/DialogRPT-human-vs-machine) | ... a machine-generated response  | 

Related: Check out [this notebook](https://colab.research.google.com/drive/1jQXzTYsgdZIQjJKrX4g3CP0_PGCeVU3C) for a demo with the original implementation.



**Step 1**. Get the latest Hugging Face Transformers (`pip install` may NOT give you the latest version)

In [None]:
!git clone https://github.com/huggingface/transformers.git
%cd transformers
!pip install -e .
%cd src

**Step 2**. Set up the `DialogRPT` model using its Hugging Face model card

In [2]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_card = "microsoft/DialogRPT-updown"   # you can try the other model cards listed in the table above
tokenizer = AutoTokenizer.from_pretrained(model_card)
model = AutoModelForSequenceClassification.from_pretrained(model_card)

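def score(cxt, hyp):
  """Return the sigmoid of the model logit for response `hyp` given context `cxt`."""
  # Context and response are joined by the end-of-text token
  model_input = tokenizer.encode(cxt + "<|endoftext|>" + hyp, return_tensors="pt")
  with torch.no_grad():  # inference only, so no gradients are needed
    result = model(model_input, return_dict=True)
  return torch.sigmoid(result.logits)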

**Step 3**. Play!

In the following example, given the same context "I love NLP!", the model predicts that response B gets more upvotes than response A. The run below produced slightly different scores (0.125 and 0.640) than those in the table.

|  | Response of "I love NLP!"  | Score |
| :-----------: | :----------- | :-----------: |
|  A |  Me too! | 0.111|
|  B |  Here’s a free textbook (URL) in case anyone needs it. | 0.613|


In [3]:
cxt = "I love NLP!"
hyp_A = "Me too!"
hyp_B = "Here’s a free textbook (URL) in case anyone needs it."

print('%.3f %s'%(score(cxt, hyp_A).squeeze(), hyp_A))
print('%.3f %s'%(score(cxt, hyp_B).squeeze(), hyp_B))

0.125 Me too!
0.640 Here’s a free textbook (URL) in case anyone needs it.
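
To compare more than two candidate responses, the same scoring call can be applied to each candidate and the results sorted. The following is a minimal sketch: `rank_responses` and `toy_score` are hypothetical names, and `toy_score` is only a stand-in so that the block runs without downloading a model. In the notebook, the `score` function from Step 2 could be passed in its place.

```python
from typing import Callable, List, Tuple


def rank_responses(
    cxt: str,
    hyps: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[float, str]]:
    """Score every candidate against the context and sort best-first."""
    scored = [(float(score_fn(cxt, hyp)), hyp) for hyp in hyps]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_score(cxt: str, hyp: str) -> float:
    """Hypothetical stand-in scorer: longer responses get higher scores."""
    return len(hyp) / (len(hyp) + len(cxt))


ranked = rank_responses(
    "I love NLP!",
    ["Me too!", "Here's a free textbook (URL) in case anyone needs it."],
    toy_score,
)
for s, hyp in ranked:
    print("%.3f %s" % (s, hyp))
```
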


### Data

In [4]:
import json
import os

In [5]:
!pwd

/content/transformers/src


In [23]:
with open('../../data/convert_contexts.json', 'r', encoding="utf8") as f:
    train_context = json.load(f)

with open('../../data/convert_contexts_zero.json', 'r', encoding="utf8") as f:
    train_context_zero = json.load(f)

with open('../../data/convert_responses.json', 'r', encoding="utf8") as f:
    train_responses = json.load(f)

with open('../../data/convert_responses_zero.json', 'r', encoding="utf8") as f:
    train_responses_zero = json.load(f)

with open('../../data/val_context.json', 'r', encoding="utf8") as f:
    val_context = json.load(f)

with open('../../data/val_responses.json', 'r', encoding="utf8") as f:
    val_responses = json.load(f)
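
The six `open` / `json.load` pairs above follow one pattern, so they could be folded into a small helper. This is a sketch only: `load_json_files` is a hypothetical name, and it assumes every file lives in the same directory and ends in `.json`, as the paths above do.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable


def load_json_files(data_dir: str, names: Iterable[str]) -> Dict[str, Any]:
    """Load `<data_dir>/<name>.json` for each name and return the contents keyed by name."""
    loaded = {}
    for name in names:
        path = Path(data_dir) / f"{name}.json"
        with path.open("r", encoding="utf8") as f:
            loaded[name] = json.load(f)
    return loaded


# Example call with the file names used in this notebook:
# data = load_json_files("../../data", ["val_context", "val_responses"])
# val_context, val_responses = data["val_context"], data["val_responses"]
```
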

In [7]:
len(train_context), len(train_responses)

(9991, 9991)

In [8]:
len(train_context_zero), len(train_responses_zero)

(9991, 9991)

In [9]:
len(val_context), len(val_responses)

(2177, 2177)

In [24]:
responses = train_responses + train_responses_zero + val_responses

In [25]:
responses = [res[-1] for res in responses]
responses[0:3]

["i didn't even realize that he played any SPORT.",
 'i head it was because PERSON was so dominant at it.',
 'usually though ORGANIZATION.']

In [27]:
val_context = [[" ".join(ut) for ut in cnt] for cnt in val_context]
val_context[0:3]

[["Yes.  Talking about how big it is....it's relatively small in the big picture of the galaxy.... the sun is only 1 billionth the size of the biggest star discovered in  our galazy!",
  'It is a modest star. but it serves our needs well. i am glad that he will be around for the next billion years or so. LOL',
  'lol  On July 11. 2011 Neptune completed its first full orbit around the sun since its discovery in 1846.'],
 ['Yes I know that Wilson was the only one to have a PhD',
  'Nice. Do you know who the three wealthiest presidents in the us are?',
  'I have no idea they are ranked in that way. who is it?\n'],
 ['You voted for him, right?',
  'Did you vote for him, because I know that I did.',
  'I did, too.']]

In [28]:
len(val_context), len(responses)

(2177, 22159)

In [15]:
#del train_context, train_context_zero, val_context
#del train_responses, train_responses_zero, val_responses

In [44]:
output = []

for cnt in val_context[0:2]:
    output.append(" ".join(cnt) + '\t' + "\t".join(responses[0:3]))

In [50]:
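import csv
# newline='' is what the csv module expects for files it writes to
with open('../../data/to_score.tsv', 'w', newline='', encoding='utf8') as tsvfile:
    writer = csv.writer(tsvfile, delimiter='\t')
    for line in val_context[0:2]:
        # One row per context: the joined context turns, then every candidate response
        writer.writerow([" ".join(line)] + responses)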

In [39]:
!python src/score.py test --data=../../data/to_score.tsv -p=restore/updown.pth

python3: can't open file 'src/score.py': [Errno 2] No such file or directory


In [51]:
with open('../../data/to_score.tsv', encoding='utf8') as f:
    lines = f.readlines()

### Scoring

In [None]:
import numpy as np  # needed by top_preds below
from sklearn.metrics import f1_score, accuracy_score
from sklearn.metrics import classification_report
from tensorflow.keras.utils import to_categorical

In [None]:
def top_preds(probas: np.ndarray, top_n:int=1, num_classes=20) -> np.ndarray:
    """ extract the top_n predictions from the given probabilities as one-hot (or multi-hot) rows """
    preds = np.argsort(probas, axis=-1)[:,::-1][:,:top_n]
    preds = to_categorical(preds, num_classes=num_classes)
    
    if top_n > 1:
        preds = np.max(preds, axis=1)
    
    return preds
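
When each row has exactly one correct class, the same top-n check can be done on indices directly, without building one-hot matrices as wide as the number of classes. The sketch below assumes that single-gold setting; `recall_at_k` and the toy arrays are illustrative names and data, not part of the original notebook.

```python
import numpy as np


def recall_at_k(scores: np.ndarray, gold: np.ndarray, k: int) -> float:
    """Fraction of rows whose gold index is among the k highest-scoring columns."""
    # Indices of the k largest scores per row (order within the top k is irrelevant).
    top_k = np.argpartition(-scores, k - 1, axis=-1)[:, :k]
    hits = (top_k == gold[:, None]).any(axis=1)
    return float(hits.mean())


toy_scores = np.array([
    [0.1, 0.9, 0.3, 0.2],   # gold = 1, ranked 1st
    [0.8, 0.1, 0.5, 0.7],   # gold = 2, ranked 3rd
    [0.2, 0.3, 0.4, 0.1],   # gold = 0, ranked 3rd
])
toy_gold = np.array([1, 2, 0])

print(recall_at_k(toy_scores, toy_gold, k=1))
print(recall_at_k(toy_scores, toy_gold, k=3))
```
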

In [29]:
import numpy as np

In [30]:
scores = np.zeros([len(val_context), len(responses)], dtype='f')

In [32]:
with torch.no_grad():  # inference only
    for i in range(len(val_context)):
        cnt = " ".join(val_context[i])
        for j in range(len(responses)):
            # One forward pass per (context, response) pair
            scores[i, j] = score(cnt, responses[j]).item()

KeyboardInterrupt: ignored
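
The loop above calls the model once per (context, response) pair, which for 2177 contexts and 22159 responses is roughly 48 million forward passes. One common mitigation is to score several candidates per call. The sketch below shows only the chunking logic: `chunks`, `score_all`, and `toy_score_batch` are hypothetical names, and `toy_score_batch` stands in for a batched model call, whose padding and tokenization details are not covered here. Batching does not reduce the number of pairs, so restricting the candidate set may still be necessary.

```python
from typing import Callable, Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_all(
    contexts: Sequence[str],
    candidates: Sequence[str],
    score_batch: Callable[[str, Sequence[str]], List[float]],
    batch_size: int = 64,
) -> List[List[float]]:
    """Score every (context, candidate) pair, one batch of candidates at a time."""
    table = []
    for cxt in contexts:
        row: List[float] = []
        for batch in chunks(candidates, batch_size):
            row.extend(score_batch(cxt, batch))
        table.append(row)
    return table


def toy_score_batch(cxt: str, hyps: Sequence[str]) -> List[float]:
    """Hypothetical stand-in: a real version would run one model forward pass per batch."""
    return [len(hyp) / (len(hyp) + len(cxt)) for hyp in hyps]


table = score_all(["a b", "c"], ["x", "yy", "zzz"], toy_score_batch, batch_size=2)
print(table)
```
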

In [None]:
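# Validation responses start at index 19982 (= 9991 + 9991), after both training sets
y_true = list(range(19982, 19982 + len(val_context)))
y_true = to_categorical(y_true, num_classes=len(responses))  # responses is a list, so use len()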
y_true[0:4,19980:19986]

In [None]:
# responses is a plain list, so its size comes from len() rather than .shape
preds_at_1 = top_preds(scores, top_n=1, num_classes=len(responses))
preds_at_5 = top_preds(scores, top_n=5, num_classes=len(responses))
preds_at_10 = top_preds(scores, top_n=10, num_classes=len(responses))

In [None]:
accuracy_score(y_true, preds_at_1)

In [None]:
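# accuracy_score treats multi-hot rows as label sets that must match exactly,
# so count the rows where the true index is among the top 5 instead
np.mean((y_true * preds_at_5).sum(axis=1) > 0)

In [None]:
# Same check for the top 10
np.mean((y_true * preds_at_10).sum(axis=1) > 0)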