# Sample Steam Reviews with GPT-2
Code inspired by https://github.com/woctezuma/sample-steam-reviews-with-gpt-2

## Setting the GPT-2 model

Install the Python package

Reference: https://github.com/minimaxir/gpt-2-simple

In [1]:
!pip install gpt_2_simple

Collecting gpt_2_simple
  Downloading https://files.pythonhosted.org/packages/b6/cf/4003c7d85425af353e15d938bc0d87a0bdedd6b00229e1f7808c2524b518/gpt_2_simple-0.2.tar.gz
Building wheels for collected packages: gpt-2-simple
  Building wheel for gpt-2-simple (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/51/d0/bd/293c80200f60bcd75a0f4028684e55e959da3a2727858d98a0
Successfully built gpt-2-simple
Installing collected packages: gpt-2-simple
Successfully installed gpt-2-simple-0.2


Import the required packages

In [0]:
import gpt_2_simple as gpt2
from datetime import datetime
from google.colab import files

## Downloading GPT-2

In [3]:
gpt2.download_gpt2()

Fetching checkpoint: 1.00kit [00:00, 332kit/s]
Fetching encoder.json: 1.04Mit [00:00, 52.3Mit/s]
Fetching hparams.json: 1.00kit [00:00, 592kit/s]
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:07, 69.5Mit/s]
Fetching model.ckpt.index: 6.00kit [00:00, 1.46Mit/s]
Fetching model.ckpt.meta: 472kit [00:00, 39.8Mit/s]
Fetching vocab.bpe: 457kit [00:00, 42.1Mit/s]


## Mounting Google Drive

In [4]:
gpt2.mount_gdrive()

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/drive


## Uploading a Training Text File to Colaboratory

#### Either get the data yourself

In [5]:
!curl -O https://raw.githubusercontent.com/woctezuma/sample-steam-reviews-with-gpt-2/master/export_review_data.py

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  5303  100  5303    0     0  27619      0 --:--:-- --:--:-- --:--:-- 27619


In [6]:
!curl -O https://raw.githubusercontent.com/woctezuma/sample-steam-reviews-with-gpt-2/master/requirements.txt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    37  100    37    0     0    234      0 --:--:-- --:--:-- --:--:--   234


In [7]:
!pip install -r requirements.txt

Collecting steamreviews==0.7.0 (from -r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/9d/32/96401f528f0c13f3750eb5e87ef47fcc473f9b940cf217ce69d6f9e448ad/steamreviews-0.7.0-py3-none-any.whl
Collecting langdetect==1.0.7 (from -r requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/59/59/4bc44158a767a6d66de18c4136c8aa90491d56cc951c10b74dd1e13213c9/langdetect-1.0.7.zip (998kB)
    100% |████████████████████████████████| 1.0MB 22.2MB/s
Building wheels for collected packages: langdetect
  Building wheel for langdetect (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/ec/0c/a9/1647275e7ef5014e7b83ff30105180e332867d65e7617ddafe
Successfully built langdetect
Installing collected packages: steamreviews, langdetect
Successfully installed langdetect-1.0.7 steamreviews-0.7.0


In [0]:
app_id = 583950

In [9]:
from export_review_data import apply_workflow_for_app_id

apply_workflow_for_app_id(app_id)

[appID = 583950] expected #reviews = 8699
#reviews = 177
Filtering out reviews which were not written in english.
#reviews = 177
Filtering out reviews with strictly fewer than 150 characters.
#reviews = 101
Filtering out reviews which were not detected as written in en.
#reviews = 101
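
The log above reports a length filter that drops reviews with strictly fewer than 150 characters. A minimal sketch of such a filter follows. The name `filter_short_reviews` and the choice to measure length with `len` on the raw string are assumptions made for illustration; this is not the code of `export_review_data.py`, and the language-detection step is left out.

```python
def filter_short_reviews(reviews, min_chars=150):
    """Keep reviews with at least `min_chars` characters.

    Hypothetical helper: the real script may measure length differently
    (for instance after stripping whitespace).
    """
    return [review for review in reviews if len(review) >= min_chars]


# Three dummy reviews of 149, 150 and 151 characters.
sample_reviews = ["x" * 149, "y" * 150, "z" * 151]
kept = filter_short_reviews(sample_reviews)
print(len(kept))
```

With the threshold at 150, a review of exactly 150 characters is kept, which matches the "strictly fewer than" wording of the log.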


#### Or get a data snapshot from me

A snapshot is currently available only for Artifact, as an example. The recommended approach is to run the code above for the game of your choice.

In [0]:
# !curl -O https://raw.githubusercontent.com/woctezuma/sample-steam-reviews-with-gpt-2/master/data/583950.txt
# !mkdir data
# !mv 583950.txt data/

## Finetune GPT-2

In [0]:
file_name = 'data/' + str(app_id) + '.txt'

run_name = 'reviews_' + str(app_id)
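
Before starting a long fine-tuning run, it can help to confirm that the training file exists and is not empty. The sketch below is an optional check, not part of the original notebook; `describe_dataset` is a hypothetical helper, and it assumes the file is UTF-8 text.

```python
from pathlib import Path


def describe_dataset(path):
    """Return basic statistics for a text file, or raise if it is missing or empty."""
    dataset_path = Path(path)
    if not dataset_path.is_file():
        raise FileNotFoundError(f"Training file not found: {dataset_path}")
    text = dataset_path.read_text(encoding="utf-8")
    if not text.strip():
        raise ValueError(f"Training file is empty: {dataset_path}")
    # Character and line counts give a rough idea of the dataset size.
    return {"characters": len(text), "lines": len(text.splitlines())}
```

In this notebook it could be called as `describe_dataset(file_name)` right before `gpt2.finetune`.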

In [12]:
sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              run_name=run_name,
              dataset=file_name,
              steps=1000,
              restore_from='fresh',   # change to 'latest' to resume training
              print_every=10,   # number of steps between progress printouts
              sample_every=200,   # number of steps between demo samples
              save_every=500   # number of steps between checkpoint saves
              )

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Use tf.random.categorical instead.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
Loading checkpoint models/117M/model.ckpt
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from models/117M/model.ckpt


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:00<00:00,  3.05it/s]


dataset has 27628 tokens
Training...
[10 | 29.21] loss=3.10 avg=3.10
[20 | 51.11] loss=2.79 avg=2.94
[30 | 73.34] loss=2.49 avg=2.79
[40 | 95.79] loss=1.99 avg=2.59
[50 | 118.35] loss=1.35 avg=2.34
[60 | 140.94] loss=1.22 avg=2.14
[70 | 163.62] loss=0.83 avg=1.95
[80 | 186.56] loss=0.67 avg=1.79
[90 | 209.57] loss=0.43 avg=1.63
[100 | 232.59] loss=0.24 avg=1.48
[110 | 255.66] loss=0.21 avg=1.36
[120 | 278.75] loss=0.12 avg=1.25
[130 | 301.87] loss=0.08 avg=1.16
[140 | 325.02] loss=0.08 avg=1.08
[150 | 348.15] loss=0.08 avg=1.00
[160 | 371.27] loss=0.06 avg=0.94
[170 | 394.40] loss=0.05 avg=0.88
[180 | 417.57] loss=0.05 avg=0.83
[190 | 440.70] loss=0.04 avg=0.79
 almost more compelling games (and also newer, more compelling games that rely more on surprise and card draw than on strategy than on direct standardplay matchmaking). That's really the only thing missing from this game right now, and I mean that in a lot of ways. You essentially have a single player campaign for this game (Moh
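
The `avg` column in the training log is not a plain mean of the printed losses. The first rows are consistent with a running average that decays older values by a factor of 0.99 and normalizes by the sum of the weights. This is an inference from the printed numbers, not a confirmed description of the library internals, and the function name below is hypothetical.

```python
def running_average(losses, decay=0.99):
    """Yield a weight-normalized exponential moving average after each loss."""
    weighted_sum = 0.0
    weight_total = 0.0
    for loss in losses:
        # Older losses are down-weighted by `decay` at every update.
        weighted_sum = weighted_sum * decay + loss
        weight_total = weight_total * decay + 1.0
        yield weighted_sum / weight_total


printed_losses = [3.10, 2.79, 2.49, 1.99]  # first four rows of the log
averages = [round(avg, 2) for avg in running_average(printed_losses)]
print(averages)  # [3.1, 2.94, 2.79, 2.59]
```

Under this assumption the computed values match the `avg` column for steps 10 to 40. Note that the loss itself falls below 0.1 within about 130 steps on a dataset of only 27628 tokens, so the model is likely memorizing the reviews rather than generalizing.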

In [0]:
# gpt2.copy_checkpoint_to_gdrive()

## Load a Trained Model Checkpoint

In [0]:
# gpt2.copy_checkpoint_from_gdrive()

In [0]:
# sess = gpt2.start_tf_sess()

# gpt2.load_gpt2(sess,
#                run_name=run_name)

## Generate Text From The Trained Model

In [16]:
num_samples = 3
num_batches = 3  # passed as batch_size below: the number of samples generated in parallel (not the number of batches), which speeds up generation

gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches)

Tldr
Great Game, well thought, well executed, with a sleek UI and plenty of variety. Fun, ha?
Got this game for free. It's addictive.
Was very interested in the replay, and enjoyed it aplenty.
Ya know the appeal of a fresh new player.
Got the game after a couple of days, and was very competitive.
Got the cards (2) and a copy of Dawn of the World to play which gave away the balance of the draft to random players in draft.
Had fun, but not great.
The cards (2) I GOT were too expensive to obtain for free.
So many things done wrong, from the market for cards, to the pay to play mods, the gameplay that is ruined by auto-placing minions on the board randomly... just, don't buy it, it sucks
It was clear 4 months ago that this game badly needed an update. It is lacking so many features. Now the playerbase has left, the game is dead and I wish I could get a refund.
I come from a Prismata, Hearthstone, Eternal and MTGA background and really loved this game the most. But Valve obviously drop the 
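
In the call above, `nsamples` is the total number of samples and `batch_size` is how many of them are generated in parallel, so the number of generation passes is their ratio. The sketch below assumes that `nsamples` has to be a multiple of `batch_size`, which is how gpt-2-simple appears to behave; `count_generation_passes` is a hypothetical helper and not part of the library.

```python
def count_generation_passes(nsamples, batch_size):
    """Return how many parallel passes are needed to produce `nsamples` samples."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if nsamples % batch_size != 0:
        # Assumption: the library rejects sample counts that do not divide evenly.
        raise ValueError("nsamples must be a multiple of batch_size")
    return nsamples // batch_size


print(count_generation_passes(3, 3))  # 1
```

With `num_samples = 3` and a batch size of 3, all three samples come out of a single pass.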

In [17]:
gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches,
              prefix='I love Artifact')

I love Artifact and its lore is fantastic, the draft is really fun and free. If you're planning a grand prize, a draft, I'm sure you're well on your way to greater riches. Don't get me wrong, you probably heard lots of good stuff about this game. Valve made sure to organize a closed invite only closed beta closed tournament with all the good card professional players you all love, they surely loved this game too and weren't there only for the money. Ahm.
Of course, you heard all the good stuff about Artifag from them and how they are happily still playing and streaming this best card game ever made, and also talking about it non-stop to their friends and their friends' mothers (and their mothers' friends' friends). You should follow them too and do it exactly as well (that is: continue playing Heartsone/Gwent/thatothergameyoulove, you don't have to think about it too much, you will do well sabotaging the competition). Valve carefully listened well to all their honest positive feedback,

In [18]:
gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches,
              prefix='I hate Artifact')

I hate Artifact and have spent more than dozens of Expert mode tickets. I also play Dota, so I gave it a shot. Boring.
4. Valve's silence.
I've enjoyed the game aplenty, and if you're still reading this by now, you're in luck. The game is alive with lore and insight is beyond me. Breath of the Wild is a great game, but Valve is silent as hell. I'm done.
I'm not happy to write this review, the game is good and well made, every cards has his lore and voice acting, the three boards mechanics is not so complicated and you can master it after a couple of hours (the time you need to practice with the most common cards).
What is wrong with this game is the microtransaction system, if you want to build your deck you need to pay, there is no trading with friends or other people but right now the cards are very cheap on the market (because the game is dead), but let me say that if you want to play for fun with pre-build decks you can for free.
What is more wrong with this game is Valve, they kil

In [19]:
gpt2.generate(sess,
              run_name=run_name,
              nsamples=num_samples,
              batch_size=num_batches,
              prefix='Please, Valve')

P, Valve, do not release game with all of the evidence to back it up.
The game is dead and Valve killed it.
Save yourself the headache.
The game is dead and Valve killed it.
If it weren't for the fact that this game had these terrible initial problems, and because Valve didn't provide any updates, we might not even be aware of the game is dead.
Holy cow.
I won't go into details about the taxes, because you're breaking the bank.
Because I don't want taxes on this game.
Because you're breaking the bank.
Because you should pay the taxes.
Because there is no other way to gain entry to this game.
There is no other way to gain entry to this game.
Game is dead and Valve killed it.
Yes, you can win boosters by participating in runs.
However, the ratio required to win cards is very low (4 wins for a maximum of 1 loss), so you will easily lose your tickets. By buying Artifact, you receive 5 tickets. Spend 1 ticket to (maybe) win a reward:
[list][*]Win 3 games (with maximum 1 defeat) to win 1 tic