
<img src="https://github.com/golsun/DialogRPT/blob/master/doc/icon.png?raw=true" width="500">


# DialogRPT Online Demo

How likely is a dialog response to be upvoted by people and/or to trigger more replies? This is what [DialogRPT](https://github.com/golsun/DialogRPT) is trained to predict.
It is a set of transformer-based dialog response ranking models trained on millions of human feedback data points.

This demo is based on the [original implementation](https://github.com/golsun/DialogRPT). A [demo with the HuggingFace implementation](https://colab.research.google.com/drive/1cAtfkbhqsRsT59y3imjR1APw3MHDMkuV?usp=sharing) is also available.

* Download the repo

In [None]:
!git clone https://github.com/golsun/DialogRPT
%cd DialogRPT/

Cloning into 'DialogRPT'...
remote: Enumerating objects: 249, done.
remote: Counting objects: 100% (249/249), done.
remote: Compressing objects: 100% (186/186), done.
remote: Total 249 (delta 147), reused 147 (delta 55), pack-reused 0
Receiving objects: 100% (249/249), 277.65 KiB | 1.08 MiB/s, done.
Resolving deltas: 100% (147/147), done.
/content/DialogRPT


## Play with a single model
### Option 1: rankers only
In the following example, the model predicts that, given the same context "I love NLP!", response B gets more upvotes than response A.

|  | Response to "I love NLP!"  | Score |
| :-----------: | :----------- | :-----------: |
|  A |  Me too! | 0.111|
|  B |  Here’s a free textbook (URL) in case anyone needs it. | 0.613|


In [None]:
!python src/score.py play -p=restore/updown.pth

100% 1042301/1042301 [00:00<00:00, 2691834.25B/s]
100% 456318/456318 [00:00<00:00, 1772148.58B/s]
--2020-09-16 00:17:24--  https://xiagnlp2.blob.core.windows.net/dialogrpt/updown.pth
Resolving xiagnlp2.blob.core.windows.net (xiagnlp2.blob.core.windows.net)... 52.239.160.106
Connecting to xiagnlp2.blob.core.windows.net (xiagnlp2.blob.core.windows.net)|52.239.160.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1520029114 (1.4G) [application/octet-stream]
Saving to: ‘restore/updown.pth’


2020-09-16 00:17:48 (63.3 MB/s) - ‘restore/updown.pth’ saved [1520029114/1520029114]

loading from restore/updown.pth
enter empty to stop
use `_EOS_` to delimite turns for a multi-turn context

Context:  I love NLP!
Response: Here’s a free textbook (URL) in case anyone needs it.
score = 0.613

Context:  I love NLP!
Response: Me too!
score = 0.111

Context:  
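
The prompt above says that `_EOS_` delimits turns in a multi-turn context. A minimal sketch of building such a context string before pasting it at the `Context:` prompt is shown below. The helper name `build_context` is hypothetical, and padding the delimiter with spaces is an assumption, not something this demo confirms.

```python
from typing import List

# Delimiter named in the prompt printed by src/score.py.
EOS_DELIMITER = "_EOS_"


def build_context(turns: List[str]) -> str:
    """Join dialog turns into one context string, oldest turn first."""
    # Assumption: spaces around the delimiter are acceptable to the script.
    return f" {EOS_DELIMITER} ".join(turn.strip() for turn in turns)


context = build_context(["I love NLP!", "Me too!"])
print(context)
```

The printed string can then be entered as the context, with a candidate reply entered at the `Response:` prompt.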


### Option 2: generate and re-rank
You can integrate the rankers with your generation model by re-ranking its candidates, which can improve generation quality. In the following example, responses A and B are two hypotheses generated by DialoGPT. Although A is more likely to be generated (higher Generation Probability), it is less interesting. The ranker moves the more interesting one, response B, to the top position.


|  | Response to "Can we restart 2020?"  | Generation Probability | Ranker Score |
| :-----------: | :----------- | :-----------: | :-----------: |
|  A |  No, we can't. | 0.314| 0.350 |
|  B |  No, we can't. It's too late for that. We need to go back in time and start from the beginning of the universe | 0.210 | 0.506 |
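
The re-ranking idea in the table above can be illustrated with a minimal sketch. It is not the code in `src/generation.py`; the `Hypothesis` type and `rerank` function are hypothetical names, and the sketch assumes candidates are ordered by ranker score alone. The numbers are copied from the table.

```python
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    text: str
    gen_prob: float      # generation probability reported by the generator
    ranker_score: float  # score reported by the ranker


def rerank(hyps: List[Hypothesis]) -> List[Hypothesis]:
    """Return hypotheses sorted by ranker score, highest first."""
    return sorted(hyps, key=lambda h: h.ranker_score, reverse=True)


# Values copied from the table above.
hyps = [
    Hypothesis("No, we can't.", 0.314, 0.350),
    Hypothesis(
        "No, we can't. It's too late for that. We need to go back in time "
        "and start from the beginning of the universe",
        0.210,
        0.506,
    ),
]

top_by_generation = max(hyps, key=lambda h: h.gen_prob)
top_by_ranker = rerank(hyps)[0]
print("generator prefers:", top_by_generation.text)
print("ranker prefers:   ", top_by_ranker.text)
```

The generator alone would pick response A, while sorting by ranker score puts response B first.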

In [None]:
!python src/generation.py -pg=restore/medium_ft.pkl -pr=restore/updown.pth

loading from restore/updown.pth

cxt:	Can we restart 2020?
0.506 gen 0.210 ranker 0.506	No, we can't. It's too late for that. We need to go back in time and start from the beginning of the universe
0.505 gen 0.198 ranker 0.505	No, we can't. It's too late for that. We need to go back to 2015
0.483 gen 0.245 ranker 0.483	No, we can't. It's too late for that. We need to go back in time and start from the beginning of the universe.
0.471 gen 0.268 ranker 0.471	No, we can't. It's too late for that. We need to go back in time and start from the beginning.
0.470 gen 0.251 ranker 0.470	No, we can't. It's too late for that. We need to go back in time and start from the start.
0.462 gen 0.243 ranker 0.462	No, we can't. It's too late for that. We need to go back in time to when we started.
0.462 gen 0.258 ranker 0.462	No, we can't. It's too late for that. We need to go back in time and start from the beginning of time.
0.430 gen 0.247 ranker 0.430	No, we can't. It's too late for that. We need to 

## Play with an ensemble model
In the following example, the response is scored by multiple models trained on different tasks.

| Task | Description  |
| :-----------: | :----------- |
|  **Human feedback**  | **given a context and its two human responses, predict...** |
| `updown`|  ... which gets more upvotes?  | 
| `width` | ... which gets more direct replies?  | 
| `depth` |  ... which gets a longer follow-up thread? | 
| **Human-like** (i.e., human vs fake)  | **given a context and one human response, distinguish it from...**  |
| `human_vs_rand` | ... a random human response  |
| `human_vs_machine` | ... a machine generated response  | 

The final score is a weighted average of the scores from these models. See the file `restore/ensemble.yml` for details.
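
As a rough illustration of the weighting, the sketch below computes a weighted average for each of the two groups (`prior` and `cond`) using the weights that the command below prints when it loads `restore/ensemble.yml`. The per-model scores are hypothetical values chosen for illustration, and `weighted_average` is a hypothetical helper. How the two group averages are combined into the final score is not stated here, so the sketch stops at the group level.

```python
from typing import Dict, List

# Weights as printed when restore/ensemble.yml is loaded (paths omitted).
ENSEMBLE: Dict[str, List[dict]] = {
    "prior": [
        {"name": "human_vs_rand", "wt": 0.5},
        {"name": "human_vs_machine", "wt": 0.5},
    ],
    "cond": [
        {"name": "updown", "wt": 1.0},
        {"name": "depth", "wt": 0.48},
        {"name": "width", "wt": -0.5},
    ],
}


def weighted_average(members: List[dict], scores: Dict[str, float]) -> float:
    """Weighted average of the member scores, normalized by the sum of weights."""
    total_wt = sum(m["wt"] for m in members)
    return sum(m["wt"] * scores[m["name"]] for m in members) / total_wt


# Hypothetical per-model scores for one (context, response) pair.
scores = {
    "human_vs_rand": 0.8,
    "human_vs_machine": 0.6,
    "updown": 0.5,
    "depth": 0.4,
    "width": 0.3,
}

prior = weighted_average(ENSEMBLE["prior"], scores)
cond = weighted_average(ENSEMBLE["cond"], scores)
print(f"prior group average: {prior:.3f}")
print(f"cond group average:  {cond:.3f}")
```

Note that `width` carries a negative weight in the printed configuration, so a higher `width` score lowers the `cond` average.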

In [None]:
!python src/score.py play -p=restore/ensemble.yml

{'prior': [{'name': 'human_vs_rand', 'wt': 0.5, 'path': 'restore/human_vs_rand.pth'}, {'name': 'human_vs_machine', 'wt': 0.5, 'path': 'restore/human_vs_machine.pth'}], 'cond': [{'name': 'updown', 'wt': 1, 'path': 'restore/updown.pth'}, {'name': 'depth', 'wt': 0.48, 'path': 'restore/depth.pth'}, {'name': 'width', 'wt': -0.5, 'path': 'restore/width.pth'}]}
setting up model `human_vs_rand`
--2020-09-16 00:33:01--  https://xiagnlp2.blob.core.windows.net/dialogrpt/human_vs_rand.pth
Resolving xiagnlp2.blob.core.windows.net (xiagnlp2.blob.core.windows.net)... 52.239.160.106
Connecting to xiagnlp2.blob.core.windows.net (xiagnlp2.blob.core.windows.net)|52.239.160.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1520029114 (1.4G) [application/octet-stream]
Saving to: ‘restore/human_vs_rand.pth’


2020-09-16 00:33:35 (43.6 MB/s) - ‘restore/human_vs_rand.pth’ saved [1520029114/1520029114]

loading from restore/human_vs_rand.pth
setting up model `human_vs_machine`
--202