<a href="https://colab.research.google.com/github/kjahan/gpt-2/blob/run-exp-1/experiments.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GPT-2 Playground

## Background
In this Jupyter notebook we experiment with **OpenAI's GPT-2** language model from the paper **[Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf)**. We can choose between the small (**117M** parameters), medium (**345M** parameters), large (**774M** parameters), and XL (**1.5B** parameters) versions of GPT-2.

According to the authors, GPT-2 was trained on the task of *language modeling*, which tests a program's ability to predict the next word in a given sentence, by ingesting huge numbers of articles, blogs, and websites. Using just this data, with no task-specific training, it achieved state-of-the-art scores on a number of language benchmarks it had never been trained on, a setting known as *zero-shot learning*. It can also perform other writing-related tasks, like translating text from one language to another, summarizing long articles, and answering trivia questions.
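
To make "predicting the next word" concrete, here is a toy sketch that counts which word follows which in a tiny corpus and predicts the most frequent follower. It is only an illustration of the task: GPT-2 itself is a large neural network, not a word-count table, and the function names and corpus below are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count, for every word in `text`, which words follow it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpus, used only for illustration.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_counts(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often in this corpus
```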


In [1]:
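# Clone the GPT-2 fork used in this notebook and work from inside it.
!git clone https://github.com/ilopezfr/gpt-2/
import os
os.chdir('gpt-2')
# Download the model checkpoints. Only 774M is fetched here; uncomment the
# 124M and 345M lines before running the cells below that use those models.
# !python download_model.py 117M
# !python download_model.py 124M
# !python download_model.py 345M
!python download_model.py 774M
# !python download_model.py 1558M
# Install the dependencies, pinning the TensorFlow 1.x release used here.
!pip3 install -r requirements.txt
!pip3 install tensorflow==1.13.1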

Cloning into 'gpt-2'...
remote: Enumerating objects: 310, done.
remote: Total 310 (delta 0), reused 0 (delta 0), pack-reused 310
Receiving objects: 100% (310/310), 4.63 MiB | 3.59 MiB/s, done.
Resolving deltas: 100% (174/174), done.
Fetching checkpoint: 1.00kit [00:00, 809kit/s]                                                      
Fetching encoder.json: 1.04Mit [00:00, 58.3Mit/s]                                                   
Fetching hparams.json: 1.00kit [00:00, 760kit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 3.10Git [00:59, 52.2Mit/s]                                 
Fetching model.ckpt.index: 16.0kit [00:00, 9.20Mit/s]                                               
Fetching model.ckpt.meta: 1.38Mit [00:00, 64.1Mit/s]                                                
Fetching vocab.bpe: 457kit [00:00, 52.9Mit/s]                                                       
Collecting fire>=0.1.3
  Downloading https://fil
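
Before running a generation cell, it can help to confirm that the checkpoint for the chosen model is fully downloaded. The sketch below checks for the file names that appear in the download log above. The `models/<model_name>` directory layout and the helper name `missing_model_files` are assumptions for illustration, not something the log confirms.

```python
import os

# File names copied from the download log above.
EXPECTED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(model_name, models_dir="models"):
    """Return the expected checkpoint files that are absent for `model_name`.

    Assumes checkpoints live under `<models_dir>/<model_name>/`.
    """
    model_path = os.path.join(models_dir, model_name)
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_path, name))
    ]


# Report the status of the models used in this notebook.
for name in ["124M", "345M", "774M"]:
    missing = missing_model_files(name)
    print(name, "ready" if not missing else f"missing: {missing}")
```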

**Unconditional sample generation**

To generate unconditional samples from the small model, we need to run: 

`!python3 src/generate_unconditional_samples.py`

The cell below generates two samples with the 124M model and also saves them to the file `samples`.

In [None]:
!python3 src/generate_unconditional_samples.py --model_name='124M' --nsamples=2 --top_k=40 --temperature=0.7 | tee samples

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
2020-08-12 22:57:36.388236: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-08-12 22:57:36.391567: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2249995000 Hz
2020-08-12 22:57:36.391794: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x1ed8680 executing computations on platform Host. Devices:
2020-08-12 22:57:36.391822: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
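
The `--top_k` and `--temperature` flags in the command above control how the next token is drawn from the model's predictions. The following is a minimal, standard-library sketch of that idea, not the repository's TensorFlow implementation: the logits are divided by the temperature, everything outside the `top_k` highest-scoring tokens is masked out, and one token is sampled from the resulting softmax distribution. The function name and the example logits are hypothetical.

```python
import math
import random


def sample_top_k(logits, top_k=40, temperature=0.7, rng=random):
    """Sample one token index using temperature scaling and top-k filtering."""
    # A lower temperature sharpens the distribution; a higher one flattens it.
    scaled = [x / temperature for x in logits]
    if 0 < top_k < len(scaled):
        # Keep the k largest logits and mask out everything below the cutoff.
        cutoff = sorted(scaled)[-top_k]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]
    # Unnormalized softmax weights, shifted by the maximum for stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Hypothetical scores for a four-token vocabulary.
example_logits = [2.0, 1.0, 0.5, -1.0]
print(sample_top_k(example_logits, top_k=2, temperature=0.7))  # prints 0 or 1
```

In this sketch `top_k=0` disables the filter, and `top_k=1` always picks the highest-scoring token.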


**Conditional sample generation**

To generate conditional samples from the small model:

`!python3 src/interactive_conditional_samples.py`

### Text Completion

- Context: random unseen text

Sample prompt 1: ([*A fiction by Neil Gaiman*](https://www.reddit.com/r/slatestarcodex/comments/hmu5lm/fiction_by_neil_gaiman_and_terry_pratchett_by_gpt3/))
```
A short-short story is only a couple of paragraphs long. This award-winning short-short story is by Neil Gaiman: Chrysalis by Neil Gaiman
```

Sample prompt 2: ([*A fiction by Terry Pratchett*](https://www.reddit.com/r/slatestarcodex/comments/hmu5lm/fiction_by_neil_gaiman_and_terry_pratchett_by_gpt3/))

```
A short-short story is only a few paragraphs long. This award winning short-short story is by Terry Pratchett, author of Wee Free Men. The Underland by Terry Pratchett
```

In [None]:
!python3 src/interactive_conditional_samples.py --model_name='345M'  --nsamples=2 --top_k=100 --temperature=1



**(Note to ALI) Run your Text Completion tasks here!**

1.   Run this code snippet; a box will appear where you can enter your prompt.
2.   Fill out the box and press Enter. It might take a few minutes before the output appears.

In [None]:
!python3 src/interactive_conditional_samples.py --model_name='345M'  --nsamples=2 --top_k=100 --temperature=1



**Running Book Text Completion tasks here!**

```
# We are testing with the 774M version of GPT-2. The goal is to compare GPT-2 with GPT-3.
```



In [None]:
!python3 src/interactive_conditional_samples.py --model_name='774M'  --nsamples=2 --top_k=100 --temperature=1



### Question-Answering

- Context: passage, some question/answer pairs, and token `A:`
- For a single-word answer (e.g., Yes/No or a city name), set the flag `--length=1`. The length is counted in tokens, so longer answers need a larger value.

Sample prompt 1 ([*The Baseline test*](https://bladerunner.fandom.com/wiki/Baseline_Test))
```
ELIZA is an early natural language processing computer program created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum. Created to demonstrate the superficiality of communication between humans and machines, Eliza simulated conversation by using a "pattern matching" and substitution methodology that gave users an illusion of understanding on the part of the program, but had no built in framework for contextualizing events. Directives on how to interact were provided by "scripts", written originally in MAD-Slip, which allowed ELIZA to process user inputs and engage in discourse following the rules and directions of the script. The most famous script, DOCTOR, simulated a Rogerian psychotherapist (in particular, Carl Rogers, who was well-known for simply parroting back at patients what they would just said), and used rules, dictated in the script, to respond with non-directional questions to user inputs. As such, ELIZA was one of the first chatterbots and one of the first programs capable of attempting the Turing test.

Q: What programming language was ELIZA written in?
A: MAD-Slip
Q: Who invented ELIZA?
A: Joseph Weizenbaum
Q: What is ELIZA?
A: a natural language processing computer program
Q: Is ELIZA a human?
A: no
Q: Where was ELIZA created at?
A: MIT Artificial Intelligence Laboratory
Q: Did ELIZA pass Turing test?
A:
```

Sample prompt 2: 
```
Trump was born and raised in Queens, a borough of New York City, and received a bachelor's degree in economics from the Wharton School. He took charge of his family's real-estate business in 1971, renamed it The Trump Organization, and expanded its operations from Queens and Brooklyn into Manhattan. Trump later started various side ventures, mostly by licensing his name. He bought the Miss Universe brand of beauty pageants in 1996, and sold it in 2015. Trump and his businesses have been involved in more than 4,000 state and federal legal actions, including six bankruptcies. He produced and hosted The Apprentice, a reality television series, from 2003 to 2015. As of 2020, Forbes estimated his net worth to be $2.1 billion.
Q: Where was Trump born?
A: Queens
Q: What is Trump business?
A: real-estates
Q: How much is Trump wealth?
A: $2.1 billion
Q: What is Trump nationality?
A: American
Q: How many times Trump businesses have declared bankruptcies?
A: six
Q: What school did Trump go to?
A:
```
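
The prompt layout above (passage, answered `Q:`/`A:` pairs, then a final question with an open `A:`) can be assembled programmatically. The helper below is a small sketch of that format; `build_qa_prompt` is a hypothetical name and is not part of the repository.

```python
def build_qa_prompt(passage, qa_pairs, question):
    """Build a question-answering prompt that ends with an open `A:` line."""
    lines = [passage.strip(), ""]
    # Answered examples show the model the expected Q/A pattern.
    for q, a in qa_pairs:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    # The final question is left unanswered for the model to complete.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


prompt = build_qa_prompt(
    "ELIZA is an early natural language processing computer program "
    "created from 1964 to 1966 at the MIT Artificial Intelligence "
    "Laboratory by Joseph Weizenbaum.",
    [
        ("Who invented ELIZA?", "Joseph Weizenbaum"),
        ("Is ELIZA a human?", "no"),
    ],
    "Where was ELIZA created at?",
)
print(prompt)
```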


In [None]:
!python3 src/interactive_conditional_samples.py  --model_name='345M'  --nsamples=10 --top_k=40 --temperature=.80 --length=1



### Summarization



- Context: article and text *`TL;DR:`* or *`Summary:`* at the end.

Sample prompt:

```
NLP and other technologies are progressing quickly such that in the relatively near future it will be hard for humans to tell if they are talking with other humans or bots online.  This is problematic as individuals and organizations, including state and non-state actors, can use, both maliciously or just for convenience, such bots to pose as humans, including themselves, on message boards, dating sites, micro-broadcasting sites (e.g., Twitter and Instagram), and even between individual and group private text threads.  This is problematic as it undermines the trust individuals have in such forums and increases public disdain for technologies that may otherwise serve humanity.  The issue becomes how can we produce a kind of “proof of humaness” such that a party may rely on the humanness of the counterparty.  It seems one such adaptive approach may be for humans to change language faster than bots can mimic it--a kind of arms race between language use of humans and bots.
The assumption is that much of the state of the art in the near term will be trained on historical language data that is dated such that humans have a sufficient window of time to use “modern slang” consisting purposely misspelled words to connote emphasis (e,.g., niiiice), emojis, jifs, and other relatively new forms of language/expression, but in a much more frequent and ambiguous way so that humans can signal to other humans they are real.  While use of such language/expression is already occurring as a socio-cultural phenomena, to date it has not been used as a way of proving humanness.
An alternative or in conjunction with the above such modern slang may be used to train classifiers to help detect whether a bot is being used.
TL;DR: 
```
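
As a sketch of how the summarization prompt can be built and its output trimmed: the first helper appends the `TL;DR:` cue to an article, and the second keeps only the first few generated sentences, similar to the procedure the GPT-2 paper describes of using the first three sentences as the summary. Both helper names are hypothetical, and the sentence splitter is deliberately naive (it splits on `.`, `!`, or `?` followed by whitespace).

```python
import re


def build_summary_prompt(article, cue="TL;DR:"):
    """Append the summarization cue on its own line after the article."""
    return article.rstrip() + "\n" + cue


def first_sentences(generated, max_sentences=3):
    """Keep at most `max_sentences` sentences from the generated text."""
    # Naive split: sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", generated.strip())
    return " ".join(sentences[:max_sentences])


prompt = build_summary_prompt("Bots are getting harder to tell apart from humans online.  ")
print(prompt)
print(first_sentences("One. Two! Three? Four."))
```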

In [None]:
!python3 src/interactive_conditional_samples.py --model_name='345M' --nsamples=3 --length=100 --temperature=1 



### Translation



- Context: a few example pairs of the format *`english_sentence = spanish_sentence`*, and then *`english_sentence =`*  at the end. 

Sample prompt:
```
Good morning. = Buenos días.
I am lost. Where is the restroom? = Estoy perdido. ¿Dónde está el baño?
How much does it cost? = ¿Cuánto cuesta?
How do you say maybe in Spanish? = ¿Cómo se dice maybe en Español?
Would you speak slower, please. = Por favor, habla mas despacio.
Where is the book store? = ¿Dónde está la librería?
At last a feminist comedian who makes jokes about men. = Por fin un cómico feminista que hace chistes sobre hombres.

How old are you? = 


```
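
The translation prompt follows the same few-shot pattern: example pairs joined by ` = `, then the sentence to translate with the right-hand side left open. The sketch below builds such a prompt and keeps only the first line of the model's continuation, since the model tends to keep generating further pairs. The helper names are hypothetical and not part of the repository.

```python
def build_translation_prompt(example_pairs, sentence):
    """Build an `english = spanish` few-shot prompt ending with an open pair."""
    lines = [f"{source} = {target}" for source, target in example_pairs]
    lines.append(f"{sentence} =")
    return "\n".join(lines)


def first_line(generated):
    """Keep only the text before the first line break of the continuation."""
    stripped = generated.strip()
    return stripped.splitlines()[0].strip() if stripped else ""


prompt = build_translation_prompt(
    [
        ("Good morning.", "Buenos días."),
        ("How much does it cost?", "¿Cuánto cuesta?"),
    ],
    "How old are you?",
)
print(prompt)
```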


In [None]:
!python3 src/interactive_conditional_samples.py --model_name='345M'  --nsamples=3 --temperature=1

