# Sample Steam Reviews with GPT-2
Code inspired by https://github.com/woctezuma/sample-steam-reviews

## Mounting Google Drive

In [1]:
from google.colab import drive

mount_folder = '/content/gdrive'
drive.mount(mount_folder)

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [2]:
%cd '/content/gdrive/My Drive/'

/content/gdrive/My Drive


## Setting the GPT-2 model

Install the Python package

Reference: https://github.com/minimaxir/gpt-2-simple

In [3]:
!pip install gpt_2_simple



Download the pre-trained model

In [4]:
import gpt_2_simple as gpt2

gpt2.download_gpt2()

Fetching checkpoint: 1.00kit [00:00, 269kit/s]                                                      
Fetching encoder.json: 1.04Mit [00:00, 44.6Mit/s]                                                   
Fetching hparams.json: 1.00kit [00:00, 470kit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:09, 52.6Mit/s]                                  
Fetching model.ckpt.index: 6.00kit [00:00, 1.81Mit/s]                                               
Fetching model.ckpt.meta: 472kit [00:00, 35.8Mit/s]                                                 
Fetching vocab.bpe: 457kit [00:00, 40.6Mit/s]                                                       
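
As a quick sanity check after the download, one could verify that every file listed in the log above is present on disk. The sketch below is not part of gpt-2-simple: `EXPECTED_MODEL_FILES` and `missing_model_files` are hypothetical names, and the `models/117M` directory is an assumption based on the checkpoint path that the package prints when it loads this model.

```python
from pathlib import Path

# File names copied from the download log above.
EXPECTED_MODEL_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_MODEL_FILES
            if not (model_dir / name).is_file()]


# Assumption: the model lives in models/117M relative to the working directory.
print(missing_model_files("models/117M"))
```

An empty list means all seven files were found; any name in the list points to an incomplete download.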


## Setting the input data
Store page: https://store.steampowered.com/app/583950/Artifact/

Download

In [5]:
!curl -O https://raw.githubusercontent.com/woctezuma/sample-steam-reviews/master/output/583950.txt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3579k  100 3579k    0     0  8582k      0 --:--:-- --:--:-- --:--:-- 8562k


Load

In [0]:
artifact_file_name = '583950.txt'

In [7]:
with open(artifact_file_name, 'r', encoding='utf8') as f:
  lines = [line.strip() for line in f.readlines()]
  
print('#lines = {}'.format(len(lines)))

#lines = 25728


Remove empty lines

In [8]:
texts = [line for line in lines if len(line)>0]

print('#lines = {}'.format(len(texts)))

#lines = 17575
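
The two cells above strip each line and then drop the lines that end up empty, which removes 25728 - 17575 = 8153 lines in this run. As a minimal sketch, the same logic can be wrapped in one function and tried on a small in-memory sample; `non_empty_lines` and the sample strings are made up for illustration.

```python
def non_empty_lines(raw_lines):
    """Strip surrounding whitespace from each line and drop lines left empty."""
    stripped = (line.strip() for line in raw_lines)
    return [line for line in stripped if line]


# Made-up sample: two reviews separated by blank and whitespace-only lines.
sample = ["Great game.\n", "\n", "   \n", "  Too expensive.  \n"]
print(non_empty_lines(sample))
# prints ['Great game.', 'Too expensive.']
```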


Save file without empty lines

In [0]:
artifact_trimmed_file_name = 'artifact.txt'

In [0]:
line_separator = '\n'

with open(artifact_trimmed_file_name, 'w', encoding='utf8') as f:
  print(line_separator.join(texts), file=f)
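
Note that `print(..., file=f)` appends one final newline after the joined text. A small round-trip check, sketched below on made-up data in a temporary directory, shows that reading the file back yields the same list; `write_lines` and `read_lines` are hypothetical helper names, not functions from the notebook.

```python
import os
import tempfile


def write_lines(path, texts, line_separator="\n"):
    """Write texts joined by line_separator, followed by one final newline."""
    with open(path, "w", encoding="utf8") as f:
        f.write(line_separator.join(texts) + "\n")


def read_lines(path):
    """Read the file back as a list of lines without their trailing newline."""
    with open(path, "r", encoding="utf8") as f:
        return [line.rstrip("\n") for line in f]


# Round-trip check on made-up sample data.
sample_texts = ["First review.", "Second review."]
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt")
    write_lines(sample_path, sample_texts)
    assert read_lines(sample_path) == sample_texts
```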

## Fine-tune the GPT-2 model on my data
Reference: https://colab.research.google.com/drive/1VLG8e7YSEwypxU-noRNhsv5dW4NfTGce

In [11]:
sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset=artifact_trimmed_file_name,
              steps=1000,
              restore_from='fresh',   # use 'latest' to resume from the last saved checkpoint
              print_every=10,   # number of steps between progress printouts
              sample_every=200,   # number of steps between demo samples
              save_every=500   # number of steps between checkpoint saves
              )

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Use tf.random.categorical instead.
Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
Loading checkpoint models/117M/model.ckpt
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from models/117M/model.ckpt


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:04<00:00,  4.81s/it]


dataset has 834870 tokens
Training...
[10 | 29.71] loss=3.53 avg=3.53
[20 | 54.17] loss=3.58 avg=3.56
[30 | 77.76] loss=3.40 avg=3.50
[40 | 101.09] loss=3.48 avg=3.50
[50 | 124.79] loss=3.61 avg=3.52
[60 | 148.53] loss=3.37 avg=3.50
[70 | 172.11] loss=3.28 avg=3.46
[80 | 195.70] loss=3.13 avg=3.42
[90 | 219.39] loss=3.36 avg=3.41
[100 | 243.10] loss=3.44 avg=3.42
[110 | 266.79] loss=3.22 avg=3.40
[120 | 290.45] loss=3.20 avg=3.38
[130 | 314.08] loss=3.28 avg=3.37
[140 | 337.74] loss=3.25 avg=3.36
[150 | 361.38] loss=3.04 avg=3.34
[160 | 385.04] loss=3.17 avg=3.33
[170 | 408.68] loss=3.15 avg=3.32
[180 | 432.29] loss=3.12 avg=3.31
[190 | 455.90] loss=3.29 avg=3.30
, and when you get to the last 2 lanes, there's no way to get any cards back.
After having a few games played, I believe that the game really needs more updates to prevent people from buying the game on the Steam Market - and this update is a huge step forward.
This game is worth your $20, that means your money, even if you ar
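
The progress lines above follow a regular pattern, `[step | elapsed] loss=... avg=...`, so they can be parsed to track the loss or to estimate training time. The sketch below assumes the second field is the elapsed wall-clock time in seconds; `LOG_LINE`, `parse_training_log`, and `seconds_per_step` are hypothetical names, and the excerpt reuses three lines from the log above.

```python
import re

# Matches progress lines such as "[10 | 29.71] loss=3.53 avg=3.53".
LOG_LINE = re.compile(
    r"\[(?P<step>\d+) \| (?P<elapsed>\d+\.\d+)\] "
    r"loss=(?P<loss>\d+\.\d+) avg=(?P<avg>\d+\.\d+)"
)


def parse_training_log(text):
    """Return a list of (step, elapsed_seconds, loss, avg_loss) tuples."""
    records = []
    for match in LOG_LINE.finditer(text):
        records.append((
            int(match["step"]),
            float(match["elapsed"]),
            float(match["loss"]),
            float(match["avg"]),
        ))
    return records


def seconds_per_step(records):
    """Average seconds per step between the first and the last record."""
    first_step, first_time = records[0][0], records[0][1]
    last_step, last_time = records[-1][0], records[-1][1]
    return (last_time - first_time) / (last_step - first_step)


log_excerpt = """
[10 | 29.71] loss=3.53 avg=3.53
[100 | 243.10] loss=3.44 avg=3.42
[190 | 455.90] loss=3.29 avg=3.30
"""
records = parse_training_log(log_excerpt)
print(round(seconds_per_step(records), 2))
# prints 2.37
```

At that rate, and under the same assumption about the time unit, the 1000 requested steps would take roughly 40 minutes on this runtime.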

## Sample from the model

Generate samples

In [13]:
num_samples = 3
num_batches = 3  # passed as batch_size: number of samples generated in parallel, which speeds up sampling; nsamples must be a multiple of it

gpt2.generate(sess, nsamples=num_samples, batch_size=num_batches, prefix='I love Artifact')

I love Artifact, I love its gameplay and I love its lore.
But I also really love the monetization. I paid $15 and got Artifact for the first time, and it doesn't even cost that much to play competitively. I also got to actually buy cards that are worth more than the price of a pack, and it's not even gambling between packs!
And there's the business model. You can buy cards, sell them, trade them. It's really quite cool.
And the lore. It's really nice.
I love the lore of Dota 2. It's deep and complex and I love it.
But I love the fact that you can buy cards from other players. And the fact that you can trade them at will. The only way to get cards is by buying them.
So this is the first TCG I've ever played (and a really good one at that), but Valve has really outdone them.
They've really polished it up and I really like it.
They have a free draft mode which is pretty cool.
They have free tournaments which are really fun.
They have free hero emote and they changed it from a very powerfu
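
To my knowledge, `gpt2.generate` expects `nsamples` to be a multiple of `batch_size`, which is why both values are set to 3 above. When the number of samples changes, a compatible batch size can be picked as the largest divisor that still fits in memory. This is a minimal sketch: `largest_batch_size` is a hypothetical helper, and the memory cap `max_batch_size` is an assumption the user has to supply.

```python
def largest_batch_size(nsamples, max_batch_size):
    """Return the largest divisor of nsamples that does not exceed max_batch_size."""
    if nsamples < 1 or max_batch_size < 1:
        raise ValueError("nsamples and max_batch_size must be positive")
    # Count down from the cap; 1 always divides nsamples, so the loop always returns.
    for candidate in range(min(nsamples, max_batch_size), 0, -1):
        if nsamples % candidate == 0:
            return candidate


print(largest_batch_size(3, 3))   # prints 3
print(largest_batch_size(10, 4))  # prints 2
```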

In [14]:
gpt2.generate(sess, nsamples=num_samples, batch_size=num_batches, prefix='I hate Artifact')

I hate Artifact so I know it's not for me) but I'm fine with buying the game, the mechanics are very good and it gives me a lot of satisfaction. I don't mind the microtransaction at all. The only problem is I'm not interested in playing that much anymore (besides Hearthstone where I started) and the progression seems a bit too steep. That's not to say I'm not interested, it's just not possible to become a daily player with the casual and the draft modes combined. I'm very interested in playing the expert modes and doing constructed, if that's your thing I'd probably consider it.
Overall I have a very happy and satisfied consumer who is happy to spend time with their game and who's interested in checking out this game if they're on it for the long term. I'm not one of those people who just buys the game and leaves it out of a ga line wishlist, but if you're on the fence, you can still battle it out and maybe you'll scrap it a bit. If you're more of a casual player, definitely avoid this

In [18]:
gpt2.generate(sess, nsamples=num_samples, batch_size=num_batches, prefix='Please, Valve, ')

Please, Valve,  make the game better, because they know that theres nothing left to improve in this game.
i dont really like card games, but its nice to play a game that is not pay to win.
I want to like this game, but there is so much more to say.
Excellent game.  Very deep and interesting mechanics.  Not too expensive, but only if you want to play constructed mode (you have to own the cards in order to play constructed mode)
Artifact is a really fun game, with a really steep learning curve.  I think it is a good game for those who love TCGs, but just want to relax and play for a while.
For those who are wondering, Artifact is a very complex game.  There is a lot going on in this game that you need to know in order to really enjoy it.  It's not as complex as other card games, but it is more complicated than those that use a similar learning curve.  You can spend a lot of time learning about the cards and lanes, and in some games even the heros and items.  For me, I think Artifact is a much