# Sample Steam Store Descriptions with GPT-2
Code inspired by https://github.com/woctezuma/sample-steam-descriptions

## Setting up the GPT-2 model

Install the Python package

Reference: https://github.com/minimaxir/gpt-2-simple

In [30]:
!pip install gpt_2_simple



Import the required modules

In [0]:
import gpt_2_simple as gpt2
from datetime import datetime
from google.colab import files

## Downloading GPT-2

In [0]:
gpt2.download_gpt2()

Fetching checkpoint: 1.00kit [00:00, 353kit/s]                                                      
Fetching encoder.json: 1.04Mit [00:00, 49.2Mit/s]                                                   
Fetching hparams.json: 1.00kit [00:00, 417kit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:06, 72.5Mit/s]                                  
Fetching model.ckpt.index: 6.00kit [00:00, 2.25Mit/s]                                               
Fetching model.ckpt.meta: 472kit [00:00, 38.8Mit/s]                                                 
Fetching vocab.bpe: 457kit [00:00, 36.7Mit/s]                                                       


## Mounting Google Drive

In [32]:
gpt2.mount_gdrive()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Uploading a Training Text File to Colaboratory

#### Either get the data yourself

This is currently not possible on Colab, because you need either:
-   the app details (slow to download),
-   or aggregate.json (stored with Git LFS, which is not installed on Google Colab).

#### Or get a data snapshot from me

In [0]:
!curl -O https://raw.githubusercontent.com/woctezuma/sample-steam-descriptions/master/data/concatenated_store_descriptions.txt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 43.1M  100 43.1M    0     0  51.8M      0 --:--:-- --:--:-- --:--:-- 51.7M


## Finetune GPT-2

In [0]:
file_name = 'concatenated_store_descriptions.txt'

run_name = 'descriptions'

In [0]:
sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              run_name=run_name,
              dataset=file_name,
              steps=5000,
              restore_from='latest',   # resume from the last checkpoint; use 'fresh' to start from the base model
              print_every=10,   # how many steps between progress printouts
              sample_every=200,   # how many steps between demo samples
              save_every=500   # how many steps between checkpoint saves
              )

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Use tf.random.categorical instead.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
Loading checkpoint checkpoint/descriptions/model-4400
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from checkpoint/descriptions/model-4400


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [01:07<00:00, 67.52s/it]


dataset has 11714797 tokens
Training...
[4410 | 29.48] loss=2.49 avg=2.49
[4420 | 53.68] loss=2.30 avg=2.39
[4430 | 79.21] loss=2.24 avg=2.34
[4440 | 104.01] loss=1.93 avg=2.24
[4450 | 128.61] loss=2.39 avg=2.27
[4460 | 153.61] loss=2.37 avg=2.29
[4470 | 178.75] loss=2.09 avg=2.26
[4480 | 203.44] loss=2.32 avg=2.27
[4490 | 228.39] loss=2.39 avg=2.28
Saving checkpoint/descriptions/model-4500
[4500 | 257.82] loss=2.33 avg=2.29
[4510 | 282.55] loss=2.11 avg=2.27
[4520 | 307.87] loss=2.40 avg=2.28
[4530 | 332.64] loss=2.37 avg=2.29
[4540 | 357.30] loss=2.22 avg=2.28
[4550 | 382.31] loss=2.58 avg=2.30
[4560 | 407.52] loss=2.08 avg=2.29
[4570 | 432.24] loss=2.09 avg=2.28
[4580 | 457.02] loss=2.09 avg=2.26
[4590 | 482.12] loss=2.28 avg=2.27
ory</li></ul><br><img src="https://steamcdn-a.akamaihd.net/steam/apps/996940/extras/Banner_1.png?t=1554245910" ><br><br>Fishing: Ecosystem, is a singleplayer fishing simulator. <strong>The game is extremely easy to play with high difficulty.</strong>.  In 
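
The progress lines in the log above can be turned into numbers for a quick look at training speed. The following is only a minimal sketch: the helper name `parse_progress` is hypothetical, and it assumes the second field in each bracket is the elapsed time in seconds, which is how the log appears to read.

```python
import re

# Matches progress lines such as "[4410 | 29.48] loss=2.49 avg=2.49".
LOG_LINE = re.compile(r"\[(\d+) \| ([\d.]+)\] loss=([\d.]+) avg=([\d.]+)")


def parse_progress(log_text):
    """Return a list of (step, elapsed_seconds, loss, avg_loss) tuples.

    Hypothetical helper: lines that are not progress lines
    (for example "Saving checkpoint/...") are skipped.
    """
    rows = []
    for match in LOG_LINE.finditer(log_text):
        step, elapsed, loss, avg = match.groups()
        rows.append((int(step), float(elapsed), float(loss), float(avg)))
    return rows


# A few lines copied from the training output above.
sample_log = """\
[4410 | 29.48] loss=2.49 avg=2.49
[4420 | 53.68] loss=2.30 avg=2.39
Saving checkpoint/descriptions/model-4500
[4500 | 257.82] loss=2.33 avg=2.29
"""

rows = parse_progress(sample_log)
first, last = rows[0], rows[-1]
# Assumption: the second field is elapsed seconds since training started.
seconds_per_step = (last[1] - first[1]) / (last[0] - first[0])
print(f"{len(rows)} progress lines, about {seconds_per_step:.2f} s per step")
```

At roughly this pace, the remaining steps up to `steps=5000` can be estimated before deciding whether to keep the Colab session open.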

In [7]:
from google.colab import drive

mount_folder = '/content/gdrive'
drive.mount(mount_folder)

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [0]:
# !mkdir -p '/content/gdrive/My Drive/checkpoint/'
# !cp -r checkpoint/descriptions '/content/gdrive/My Drive/checkpoint/'

# gpt2.copy_checkpoint_to_gdrive()
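
The commented-out `mkdir`/`cp` commands above can also be written in plain Python, which makes the copy reusable in both directions. This is a minimal sketch, not what `gpt2.copy_checkpoint_to_gdrive()` does internally: the helper name `copy_run` is hypothetical, and unlike `cp -r` it replaces an existing destination folder instead of merging into it.

```python
import shutil
from pathlib import Path


def copy_run(run_name, src_root, dst_root):
    """Copy <src_root>/<run_name> to <dst_root>/<run_name>.

    Hypothetical helper: an existing destination folder is deleted first,
    so the destination always mirrors the source.
    """
    src = Path(src_root) / run_name
    dst = Path(dst_root) / run_name
    if not src.is_dir():
        raise FileNotFoundError(f"No checkpoint folder at {src}")
    if dst.exists():
        shutil.rmtree(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dst)
    return dst


# Example call in Colab, using the paths from the commented commands above:
# copy_run('descriptions', 'checkpoint', '/content/gdrive/My Drive/checkpoint')
```

Swapping the two root folders restores a checkpoint from Google Drive into the local `checkpoint/` folder.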

## Load a Trained Model Checkpoint

In [0]:
# !mkdir -p checkpoint/
# !cp -r '/content/gdrive/My Drive/checkpoint/descriptions' checkpoint/

# gpt2.copy_checkpoint_from_gdrive()

In [0]:
# sess = gpt2.start_tf_sess()

# gpt2.load_gpt2(sess,
#                run_name=run_name)

## Generate Text From The Trained Model

In [0]:
num_samples = 3
num_batches = 3  # passed as batch_size below: that many samples are generated in parallel, giving a massive speedup

gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches)

</strong><br><br>You are a survivor of the substance crash and the crash accident, you have been evacuated to a hospital. The hospital is a huge building big enough to accommodate as many people as you can fit inside. After a while, you notice that there is no one left. You can continue to explore the hospital and find clues to get a little information. You are given a chance to get a bag of supplies, but soon you find that no one has left the hospital and there is nothing left to do. You are given a chance to get a flashlight and to get out. You find yourself in a forest, which you can't see, you have no idea what is happening and you don't know how to get out. You find a room that is inhabited by a girl. She doesn't speak much but she tells you that she is a survivor of the substance crash and the crash accident. She explains that she has been evacuated to a hospital. She also says that she has no memory of what happened to her. She is a survivor of the crash and the accident.
They s
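
When changing `num_samples` or `num_batches`, it helps to check the two values against each other first. As far as I know, gpt-2-simple expects `nsamples` to be a multiple of `batch_size`; treat that as an assumption and check the library version you have installed. The helper `check_batching` below is hypothetical and only does the arithmetic.

```python
def check_batching(nsamples, batch_size):
    """Return how many batches are needed to produce nsamples.

    Hypothetical helper. Assumption: the generation call requires
    nsamples to be an exact multiple of batch_size.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if nsamples % batch_size != 0:
        raise ValueError(
            f"nsamples={nsamples} is not a multiple of batch_size={batch_size}"
        )
    return nsamples // batch_size


# With the values used in this notebook: 3 samples in a single batch of 3.
n_batches = check_batching(nsamples=3, batch_size=3)
print(n_batches)
```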

In [0]:
gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches,
              prefix='Half-Life 3 is the long-awaited sequel in the Half-Life franchise developped by Valve')

H-Life 3 is the long-awaited sequel in the Half-Life franchise developped by Valve from the creators of <a href="http://store.steampowered.com/app/260080/" target="_blank" rel="noreferrer"  >Half-Life 2</a> and <a href="http://store.steampowered.com/app/280150/" target="_blank" rel="noreferrer"  >Half-Life 2</a> back in 1998.<br><br>When Valve originally released Half-Life 3 on PC in 2001, the game created numerous critical and commercial successes. <br><br>The game enjoyed critical acclaim from the game community, including:<br><br><ul class="bb_ul"><li> <a href="http://www.IGN.com/The-End-of-Life-3-Concept/" target="_blank" rel="noreferrer"  >IGN.com</a><br></li><li> <a href="http://www.pcgamer.com/the-end-of-life-3-concept/" target="_blank" rel="noreferrer"  >PC Gamer</a><br></li><li> <a href="http://www.pcgamer.com/2014/6/the-end-of-life-3-game-final-design-final-work/" target="_blank" rel="noreferrer"  >PC Gamer</a><br></li><li> <a href="http://www.pcgamer.com/2014/6/the-end-of-li

In [0]:
gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches,
              prefix='Spelunky 2 is the sequel of the most acclaimed rogue-like platformer of all-time')

Selunky 2 is the sequel of the most acclaimed rogue-like platformer of all-time. The game never felt so good, and so dark. It is a love letter to old-school games like Super Meat Boy, Myst and Nintendogs and the Super Nintendo classic, Metroid.<br><br>After a total of six years of development, the game has finally been published on Steam! Now you can experience the game yourself in the form of a digital version, where you can view your progress in the Steam Workshop, or play the game in our dedicated Quick Time mode.<h2 class="bb_tag">Key Features</h2><br><ul class="bb_ul"><li><strong>Pirate Mode:</strong> The player controls a &quot;Pirate&quot; that can be captured by killing enemies. It can also be used to shoot enemies from an angle.<br></li><li><strong>Shoot Mode:</strong> Players have to shoot with a few shots to break the &quot;Pirate&quot;. To do this, press the &quot;&quot;&quot; key combination.<br></li><li><strong>Brand New:</strong> Players will receive a brand new version 
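
The samples above contain the HTML markup found in Steam store descriptions (`<br>`, `<strong>`, `&quot;`, and so on). To read them as plain text, the markup can be stripped after generation. This is a minimal sketch with a hypothetical helper `to_plain_text`; it relies on simple regular expressions, not a real HTML parser, so it is only suited to short snippets like these.

```python
import html
import re


def to_plain_text(description):
    """Convert a Steam-style HTML snippet to plain text.

    Hypothetical helper: line-breaking tags become newlines, every other
    tag is dropped, and HTML entities such as &quot; are decoded.
    """
    text = re.sub(r"<br\s*/?>|</li>|</h\d>|</p>", "\n", description,
                  flags=re.IGNORECASE)
    text = re.sub(r"<[^>]+>", "", text)  # drop all remaining tags
    text = html.unescape(text)           # &quot; -> "
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


sample = ('<strong>Pirate Mode:</strong> The player controls a '
          '&quot;Pirate&quot;.<br><br>Key Features')
plain = to_plain_text(sample)
print(plain)
```

Text that was cut off mid-tag at the end of a sample stays as-is, since an unclosed `<` does not match the tag pattern.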