# Collecting Steam Reviews

To collect the Steam reviews, I used steamreviews, a Python package for downloading Steam reviews: [pypi.org/project/steamreviews/](https://pypi.org/project/steamreviews/). With this library, we can download the review information we need for our game, Fall Guys: Ultimate Knockout. For this dataset, we keep only reviews written in English.

All of the data was collected on February 13, 2021, so the dataset contains only reviews written on or before that date.

In [2]:
# package for downloading Steam reviews
import steamreviews

In [29]:
# request parameters
request_params = dict()
request_params['filter'] = 'recent'
request_params['language'] = 'english'

# download the reviews; the results are saved as a json file in a folder called data
app_id = 1097150  # id of steam game (Fall Guys: Ultimate Knockout)
review_dict, query_count = steamreviews.download_reviews_for_app_id(app_id, chosen_request_params=request_params)

[appID = 1097150] expected #reviews = 128659
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
Number of queries 150 reached. Cooldown: 310 seconds
502 Bad Gateway for appID = 1097150 and cursor = AoJwnP+W4PMCdeLSmQI=. Cooldown: 10 seconds


At first glance, the 502 Bad Gateway error at the end of the log looks like a problem. This HTTP status code means that a server acting as a gateway or proxy received an invalid response from an upstream server, so it usually signals a temporary server-side fault rather than a rejection of our requests. The output states that it expects 128659 reviews. Checking the length of the data we extracted, we have 128657 reviews, only 2 fewer than expected, which is a negligible gap for our later analysis.

In [30]:
len(review_dict['reviews'])  # number of reviews extracted

128657
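The cooldown messages in the download log reflect a wait-then-retry pattern. As a rough illustration only, here is a minimal sketch of that pattern. It is not the library's actual implementation: `fetch_with_cooldown` and the `fetch` callable are hypothetical names introduced here, and the endpoint is simulated so that the example runs without network access.

```python
import time


def fetch_with_cooldown(fetch, max_attempts=3, cooldown_seconds=10, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is exhausted.

    `fetch` is a hypothetical zero-argument callable returning a
    (status_code, payload) tuple; it stands in for one HTTP request.
    """
    status_code = None
    for attempt in range(1, max_attempts + 1):
        status_code, payload = fetch()
        if status_code == 200:
            return payload
        if attempt < max_attempts:
            # Transient server-side failure (e.g. 502): wait, then retry.
            sleep(cooldown_seconds)
    raise RuntimeError(
        f"giving up after {max_attempts} attempts (last status {status_code})"
    )


# Simulated endpoint (made-up responses): fails once with 502, then succeeds.
responses = iter([(502, None), (200, {"reviews": []})])
waits = []  # record the requested cooldowns instead of actually sleeping
result = fetch_with_cooldown(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the sketch easy to test; in real use the default `time.sleep` would pause between attempts.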

We can then load the JSON file and write only the fields relevant to our analysis to a CSV file.

In [32]:
import json
import pandas as pd

# open json file of Steam reviews for Fall Guys
with open('data/review_1097150.json') as f:
    data = json.load(f)
    
# select reviews subset
data = data['reviews']

In [33]:
# initialize an empty list for storing the review records
review_df = []

# filter JSON file to extract only the relevant fields that we want
# (each key in data identifies one review)
for review_id, info in data.items():
    review = {'Steam ID': review_id,
              'Review': info['review'],
              'Recommended': info['voted_up'],
              'Hours_Played': info['author']['playtime_forever'],
              'Timestamp_Created': info['timestamp_created'],
              'Last_Played': info['author']['last_played']}

    review_df.append(review)
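The loop above indexes every field directly, so a single record without one of these keys would raise a `KeyError`. Whether real Steam records ever omit a field is an assumption on my part, but a more defensive variant could look like the sketch below. The helper `extract_fields` is a hypothetical name, and the sample records are made up for illustration.

```python
def extract_fields(review_id, info):
    """Flatten one review record, using None for any absent field."""
    author = info.get("author", {})
    return {
        "Steam ID": review_id,
        "Review": info.get("review"),
        "Recommended": info.get("voted_up"),
        "Hours_Played": author.get("playtime_forever"),
        "Timestamp_Created": info.get("timestamp_created"),
        "Last_Played": author.get("last_played"),
    }


# Made-up records shaped like the fields used above (not real reviews).
sample_reviews = {
    "1001": {
        "review": "Fun with friends",
        "voted_up": True,
        "timestamp_created": 1600000000,
        "author": {"playtime_forever": 120, "last_played": 1600003600},
    },
    "1002": {
        "review": "Too chaotic for me",
        "voted_up": False,
        "timestamp_created": 1600100000,
        "author": {"playtime_forever": 45},  # last_played deliberately absent
    },
}

rows = [extract_fields(rid, info) for rid, info in sample_reviews.items()]
print(rows[1]["Last_Played"])
```

Missing values become `None` here, which pandas would later treat as missing data rather than stopping the whole extraction.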

In [34]:
# StringIO wraps a JSON string in a file-like object for pd.read_json
from io import StringIO

In [35]:
# serialize the review records to a JSON string and load it into a dataframe
review_dump = json.dumps(review_df, indent=4)
df = pd.read_json(StringIO(review_dump))

# write the dataframe to a csv file
df.to_csv('reviews.csv', index=False, encoding='utf-8')
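One thing to keep in mind for later analysis is the units of the raw fields. As far as I know, Steam reports `playtime_forever` in minutes and the timestamp fields as Unix seconds, but treat both as assumptions to verify against the data. Under those assumptions, a minimal sketch of the conversions, applied to raw values as they appear in the JSON, might look like this. The helper names and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def minutes_to_hours(minutes):
    """Convert a playtime in minutes to hours, rounded to one decimal."""
    return round(minutes / 60, 1)


def unix_to_utc(timestamp):
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Made-up raw values in the assumed units (minutes and Unix seconds).
sample = {"playtime_forever": 90, "timestamp_created": 1613174400}

hours = minutes_to_hours(sample["playtime_forever"])
created = unix_to_utc(sample["timestamp_created"])
print(hours, created.isoformat())
```

If the assumption about minutes holds, the `Hours_Played` column would need this conversion before it can be read as hours.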