In [1]:
import torch
from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer



In [2]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [3]:
model_dir = 'gamereveiw_distillgpt2/'
model = GPT2LMHeadModel.from_pretrained(model_dir)
tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
model = model.to(device)

In [4]:
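# Positive reviews: condition on the game title and [SCORE]1
prompt = "<|startoftext|>[GAME]Pokemon Wildlife Hunt[SCORE]1[REVIEW]"

# Encode the prompt as a batch of one sequence on the model's device
generated = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
generated = generated.to(device)

sample_outputs = model.generate(
    generated,
    do_sample=True,            # sample instead of decoding greedily
    top_k=15,                  # restrict sampling to the 15 most likely tokens
    top_p=0.9,                 # nucleus sampling: keep the top 90% of probability mass
    max_length=500,            # cap on total length in tokens, prompt included
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,  # use EOS as the padding token
)

for i, sample_output in enumerate(sample_outputs):
    print("{}: {}\n\n".format(i, tokenizer.decode(sample_output, skip_special_tokens=True)))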

0: [GAME]Pokemon Wildlife Hunt[SCORE]1[REVIEW]This game is pretty awesome! I love the way it blends humor and exploration together.  The biggest problem is it's so short, I had about 8 hours to complete it in the first 20 minutes! I really wish it had more content! I don't like that, but if you can find all of those hours on the website, there should be more to it!  I like the idea of the game! They have different creatures to fight, different creatures to fight, and the different types of enemies. The thing is, they can shoot you, so there should be more! It's fun to play.


1: [GAME]Pokemon Wildlife Hunt[SCORE]1[REVIEW]Good game, would definitely recommend to play it for VR, it really needs more players like me to see this game in VR.


2: [GAME]Pokemon Wildlife Hunt[SCORE]1[REVIEW]I had a lot of fun with this game and I really like the new Pokemon option. The graphics are pretty good and the sound track is really good. I like the game. I like how the game gives you different things 
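
The prompt above is assembled by hand from three control tags. A small helper can keep that layout in one place. This is a minimal sketch: `build_prompt` is a hypothetical name, and restricting the score to 1 or -1 is an assumption based on the values that appear in this notebook.

```python
def build_prompt(game: str, score: int) -> str:
    """Format a conditioning prompt in the tag layout used in this notebook."""
    # Assumption: only 1 (positive) and -1 (negative) are valid scores.
    if score not in (1, -1):
        raise ValueError("score must be 1 (positive) or -1 (negative)")
    return f"<|startoftext|>[GAME]{game}[SCORE]{score}[REVIEW]"


positive_prompt = build_prompt("Pokemon Wildlife Hunt", 1)
negative_prompt = build_prompt("Pokemon Wildlife Hunt", -1)
print(positive_prompt)
print(negative_prompt)
```

The returned string can be passed to `tokenizer.encode` exactly as `prompt` is in the cell above.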

In [5]:
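# Random reviews: only the start token, so the model picks the game and score itself
prompt = "<|startoftext|>"

# Encode the prompt as a batch of one sequence on the model's device
generated = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
generated = generated.to(device)

sample_outputs = model.generate(
    generated,
    do_sample=True,            # sample instead of decoding greedily
    top_k=15,                  # restrict sampling to the 15 most likely tokens
    top_p=0.9,                 # nucleus sampling: keep the top 90% of probability mass
    max_length=500,            # cap on total length in tokens, prompt included
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,  # use EOS as the padding token
)

for i, sample_output in enumerate(sample_outputs):
    print("{}: {}\n\n".format(i, tokenizer.decode(sample_output, skip_special_tokens=True)))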

0: [GAME]Dishonored[SCORE]1[REVIEW]A game that I will never forget.


1: [GAME]DayZ[SCORE]1[REVIEW] Early Access Review


2: [GAME]The Plan[SCORE]1[REVIEW]A short, simple game that's very simple. It's short, but is very very effective. The story is cute and there's not too much focus on the end. It's a short game and is short. I'd give it a 8.5/10.


3: [GAME]Terraria[SCORE]1[REVIEW]Terraria is a very fun game. You can play for hours on end and just do what you want with your life. I don't know what to say about it, but it is a great game. There is a lot of content, and even more things. It's a lot more than you could expect. If you're a fan of the game, I recommend it. I highly recommend it.


4: [GAME]DayZ[SCORE]-1[REVIEW] Early Access Review
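
Each decoded sample follows the `[GAME]...[SCORE]...[REVIEW]...` layout, so it can be split back into fields for later analysis. The following is a rough sketch, not part of the original notebook: `parse_generated` and the `RECORD` pattern are hypothetical names, and the pattern assumes the three tags appear once each, in that order.

```python
import re

# Assumed layout: [GAME]<title>[SCORE]<integer>[REVIEW]<free text>
RECORD = re.compile(
    r"\[GAME\](?P<game>.*?)\[SCORE\](?P<score>-?\d+)\[REVIEW\](?P<review>.*)",
    re.DOTALL,
)


def parse_generated(text):
    """Split one decoded sample into game, score and review; None if the tags are missing."""
    match = RECORD.match(text.strip())
    if match is None:
        return None
    return {
        "game": match["game"].strip(),
        "score": int(match["score"]),
        "review": match["review"].strip(),
    }


example = parse_generated("[GAME]DayZ[SCORE]-1[REVIEW] Early Access Review")
print(example)
```

Samples that do not match the layout return `None`, which makes malformed generations easy to count or discard.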




In [6]:
# Load OpenAI's GPT-2 output detector (RoBERTa-base)
from transformers import pipeline
detect_text = pipeline("text-classification", model="roberta-base-openai-detector")

Some weights of the model checkpoint at roberta-base-openai-detector were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [7]:
import pandas as pd
df_reviews = pd.read_csv('dataset.csv')
df_reviews.head()

Unnamed: 0,app_id,app_name,review_text,review_score,review_votes
0,10,Counter-Strike,Ruined my life.,1,0
1,10,Counter-Strike,This will be more of a ''my experience with th...,1,1
2,10,Counter-Strike,This game saved my virginity.,1,0
3,10,Counter-Strike,• Do you like original games? • Do you like ga...,1,0
4,10,Counter-Strike,"Easy to learn, hard to master.",1,1


In [8]:
df_reviews.loc[1, 'review_text']

"This will be more of a ''my experience with this game'' type of review, because saying things like ''great gameplay'' will not suit something I've experienced with Counter-Strike. Here you go:  I remember back in 2002 I was at a friend's house and he was playing a game. I didn't know the name of the game nor I had internet to find it. A few weeks passed by and another friend came over. He didn't have a computer, so he brought a disc with a game in it. He told me that it was one of the best games and from that very moment I knew that it is going to be the game I saw at the other friend's house. When I saw the Counter-Strike logo I was filled with gamegasm (?) and I was so happy. I was playing it hardcore. Made friends, clans, was involved in communities and even made two myself. Counter-Strike is my first game which I played competitively and it was a such an experience. Playing public servers with mods were very fun, but playing it competitively made it very intense and stressful. In 

In [9]:
i = 0
print(df_reviews.loc[i, 'review_text'])
print(detect_text(df_reviews.loc[i, 'review_text']))

Ruined my life.
[{'label': 'Real', 'score': 0.5562800168991089}]


In [10]:
i = 8
print(df_reviews.loc[i, 'review_text'])
print(detect_text(df_reviews.loc[i, 'review_text']))

Counter-Strike: Ok, after 9 years of unlimited fun with friends, I have finally quit counter strike. Counter strike, in all of its versions, its just a great FPS game that anyone can enjoy it. Its a great game and all, you just cant stop playing it, you can just sit and play with your friends for days with out stoping. The huge weaoponary option you can choose and the smooth and sound of the game, its just f*cking addicting. With this games I've met so many different people and unique friends. This game is literally G(OLD). To all the young players who are looking for a good cheep and fun game to play with their friends, I highlly recommend on this. I've got (with steam record) 2,484hrs record in counter strike IN TOTTAL,,, and who knows how many more hrs in a non official steam version of this game.... Great Game. GG WP. And too all the people who will keep playing this game, all I can say, as always, GL &amp; HF &lt;3
[{'label': 'Real', 'score': 0.9998098015785217}]
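
Two hand-picked reviews say little about how the detector behaves on real text overall. One way to get a broader picture is to tally the predicted labels over a larger sample. This is a minimal sketch: `summarize_labels` is a hypothetical helper, and `stub_detect` is a placeholder that only mimics the pipeline's output shape so the example runs without downloading the model.

```python
from collections import Counter


def summarize_labels(texts, detect):
    """Return the fraction of texts assigned to each label.

    `detect` is any callable mapping one string to a list like
    [{'label': ..., 'score': ...}], which is the shape printed above.
    """
    counts = Counter(detect(text)[0]["label"] for text in texts)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


def stub_detect(text):
    # Placeholder only, NOT a real detector: labels by length so the demo is deterministic.
    return [{"label": "Real" if len(text) > 20 else "Fake", "score": 1.0}]


demo = summarize_labels(["short", "a considerably longer piece of text"], stub_detect)
print(demo)

# In the notebook, the real pipeline would be passed instead, for example:
# summarize_labels(df_reviews["review_text"].head(100).astype(str), detect_text)
```

The `astype(str)` call in the commented usage is a precaution in case some `review_text` entries are missing. Very long reviews may also exceed the detector's input limit, which this sketch does not handle.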


In [11]:
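# Negative reviews: condition on the game title and [SCORE]-1
prompt = "<|startoftext|>[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW]"

# Encode the prompt as a batch of one sequence on the model's device
generated = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
generated = generated.to(device)

sample_outputs = model.generate(
    generated,
    do_sample=True,            # sample instead of decoding greedily
    top_k=10,                  # restrict sampling to the 10 most likely tokens
    top_p=0.9,                 # nucleus sampling: keep the top 90% of probability mass
    max_length=500,            # cap on total length in tokens, prompt included
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,  # use EOS as the padding token
)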

In [12]:
sample_texts = [tokenizer.decode(sample_output, skip_special_tokens=True) for sample_output in sample_outputs]

In [13]:
for text in sample_texts:
    print(text)
    print(detect_text(text)[0])
    print('\n')

[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW]     YouTube™ Video:  Jurassic Park Story - KR-46&nbsp;
{'label': 'Fake', 'score': 0.6107867956161499}


[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW]          YouTube™ Video:  PokemonTested Planet X &amp; PS3 - Karatanas&nbsp;
{'label': 'Real', 'score': 0.753854513168335}


[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW]This game is a complete waste of time and money. The game looks great and the developers are very responsive, and the developers are responsive. However, if you are looking for a game to relax and work your way up the list of problems, this isn't the game for you.
{'label': 'Fake', 'score': 0.9998102784156799}


[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW]The game itself is quite interesting, it's not a game at all. The story is a bit lacking and the gameplay is very repetitive. It's a cute game that is just not worth $30.
{'label': 'Fake', 'score': 0.9978112578392029}


[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW] Early Access R
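
The real reviews in cells 9 and 10 are scored as bare review text, while the generated samples above are scored with the `[GAME]...[SCORE]...[REVIEW]` prefix still attached. That prefix may influence the detector's output, though this notebook does not establish whether it does. A like-for-like comparison would strip it first. The following is a sketch only, and `review_body` is a hypothetical helper.

```python
def review_body(generated_text):
    """Return the text after the first [REVIEW] tag, or the whole text if the tag is absent."""
    _, tag, body = generated_text.partition("[REVIEW]")
    return body.strip() if tag else generated_text.strip()


example = review_body(
    "[GAME]Pokemon Wildlife Hunt[SCORE]-1[REVIEW]The game itself is quite interesting, "
    "it's not a game at all."
)
print(example)

# In the notebook, the stripped text would then go to the detector, for example:
# for text in sample_texts:
#     print(detect_text(review_body(text))[0])
```

Samples whose review part is empty after stripping would need to be skipped before calling the detector.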