# Steam API Feasibility Study: An analysis of Steam player activity during COVID-19

### Why should we care about Steam?

"As many of you already know, this past year had more than its share of challenges, with everyone's lives upended by the global pandemic. While Steam was already seeing significant growth in 2020 before COVID-19 lockdowns, video game playtime surged when people started staying home, dramatically increasing the number of customers buying and playing games, and hopefully bringing some joy to counter-balance some of the craziness that was 2020. This has led to new highs for monthly active users (120.4 million), daily active users (62.6 million), peak concurrent users (24.8 million), first-time purchasers (2.6 million per month), hours of playtime (31.3 billion hours), and the number of games purchased (21.4% increase over 2019)."  --- Steam - 2020 Year in Review

### Two paths:
1. [ValveAPI](https://partner.steamgames.com/doc/webapi_overview)
2. [SteamSpy](https://steamspy.com/about): A third-party Steam stats service built on the Web API provided by Valve

### References:
1. https://steamspy.com/api.php
2. https://nik-davis.github.io/posts/2019/steam-data-collection/

In [2]:
# standard library imports
import csv
import datetime as dt
import json
import os
import statistics
import time

# third-party imports
import numpy as np
import pandas as pd
import requests
import nltk

In [3]:
def get_request(url, parameters=None):
    """Return json-formatted response of a get request using optional parameters.
    
    Parameters
    ----------
    url : string
    parameters : {'parameter': 'value'}
        parameters to pass as part of get request
    
    Returns
    -------
    json_data
        json-formatted response (dict-like)
    """
    try:
        response = requests.get(url=url, params=parameters)
    except requests.exceptions.SSLError as s:
        print('SSL Error:', s)
        
        for i in range(5, 0, -1):
            print('\rWaiting... ({})'.format(i), end='')
            time.sleep(1)
        print('\rRetrying.' + ' '*10)
        
        # recursively try again
        return get_request(url, parameters)
    
    if response:
        return response.json()
    else:
        # a falsy response means an HTTP error status (4xx/5xx), usually too many requests; wait and try again
        print('No response, waiting 10 seconds...')
        time.sleep(10)
        print('Retrying.')
        return get_request(url, parameters)
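
The recursive retry in `get_request` never gives up, so a persistent failure would keep looping until Python's recursion limit is reached. Below is a minimal sketch of a bounded alternative. The name `get_request_bounded` and the `max_retries`, `backoff`, `get`, and `sleep` parameters are assumptions for illustration and are not part of the original notebook.

```python
import time


def get_request_bounded(url, parameters=None, max_retries=5, backoff=2.0,
                        get=None, sleep=time.sleep):
    """Return parsed JSON, or None once max_retries attempts have failed.

    Hypothetical variant of get_request with a retry cap. `get` and `sleep`
    are injectable so the retry logic can be exercised without a network.
    """
    if get is None:
        import requests  # imported lazily; only needed for real HTTP calls
        get = requests.get

    for attempt in range(1, max_retries + 1):
        try:
            response = get(url, params=parameters)
        except OSError as err:
            # requests' exceptions (SSLError, ConnectionError, ...) subclass OSError
            print(f"Request error on attempt {attempt}: {err}")
        else:
            if response.ok:
                return response.json()
            print(f"HTTP {response.status_code} on attempt {attempt}")
        if attempt < max_retries:
            sleep(backoff * attempt)  # wait longer after each failure
    return None
```

Returning `None` instead of recursing lets the caller decide whether to skip the request or stop the run.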

### SteamSpy

In [4]:
url = "https://steamspy.com/api.php"
parameters = {"request": "all"}

# request 'all' from steam spy and parse into dataframe
json_data = get_request(url, parameters=parameters)
steam_spy_all = pd.DataFrame.from_dict(json_data, orient='index')

In [5]:
# Check what attributes we can have

# Although SteamSpy does not provide data going back more than two weeks, the two-week figures might still be useful around holidays (for example Christmas, Valentine's Day, or Halloween).
"""
## Return format for an app: ##

  * appid - Steam Application ID. If it's 999999, then data for this application is hidden on developer's request, sorry.
  * name - game's name
  * developer - comma separated list of the developers of the game
  * publisher - comma separated list of the publishers of the game
  * score_rank - score rank of the game based on user reviews
  * owners - owners of this application on Steam as a range.
  * average_forever - average playtime since March 2009. In minutes.
  * average_2weeks - average playtime in the last two weeks. In minutes.
  * median_forever - median playtime since March 2009. In minutes.
  * median_2weeks - median playtime in the last two weeks. In minutes.
  * ccu - peak CCU(concurrent user) yesterday.
  * price - current US price in cents.
  * initialprice - original US price in cents.
  * discount - current discount in percents.
  * tags - game's tags with votes in JSON array.
  * languages - list of supported languages.
  * genre - list of genres.

"""
steam_spy_all.sort_values(by = ["average_2weeks"], ascending=False).head(5)

Unnamed: 0,appid,name,developer,publisher,score_rank,positive,negative,userscore,owners,average_forever,average_2weeks,median_forever,median_2weeks,price,initialprice,discount,ccu
610080,610080,Realm Grinder,Divine Games,Kongregate,,4388,675,0,"500,000 .. 1,000,000",13395,8695,165,8695,0,0,0,1212
627690,627690,Idle Champions of the Forgotten Realms,Codename Entertainment Inc.,Codename Entertainment Inc.,,6671,1414,0,"500,000 .. 1,000,000",54389,7724,321,10638,0,0,0,4902
304930,304930,Unturned,Smartly Dressed Games,Smartly Dressed Games,,398887,37945,0,"20,000,000 .. 50,000,000",6601,4222,344,3625,0,0,0,27982
218620,218620,PAYDAY 2,OVERKILL - a Starbreeze Studio.,Starbreeze Publishing AB,,451016,60190,0,"10,000,000 .. 20,000,000",6305,3790,691,5215,999,999,0,49427
1263850,1263850,Football Manager 2021,Sports Interactive,SEGA,,15853,1088,0,"500,000 .. 1,000,000",19839,3690,29603,3992,2499,4999,50,63574
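
SteamSpy reports `owners` as a range string such as `"500,000 .. 1,000,000"`, which cannot be sorted or aggregated numerically as it stands. A minimal sketch of one way to split it into numeric bounds follows; the helper names `parse_owners` and `owners_midpoint` are hypothetical, and the midpoint is only a crude stand-in for the true owner count.

```python
def parse_owners(owners):
    """Split a SteamSpy owners string like '500,000 .. 1,000,000' into (low, high) ints."""
    low, high = owners.split("..")
    return int(low.replace(",", "").strip()), int(high.replace(",", "").strip())


def owners_midpoint(owners):
    """Midpoint of the range: a rough point estimate, not an actual owner count."""
    low, high = parse_owners(owners)
    return (low + high) / 2


print(parse_owners("500,000 .. 1,000,000"))        # (500000, 1000000)
print(owners_midpoint("20,000,000 .. 50,000,000"))  # 35000000.0
```

On the DataFrame this could be applied with `steam_spy_all["owners"].map(owners_midpoint)`.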


### Game Comments Study Using Valve API:

In my experience, it is always interesting to read the comments under games. Some games should have COVID-19-related comments that suggest how people got through this difficult time and turned their lemons into lemonade.

#### My plan
  - Collect and study the comments under a game called Plague Inc. (in this game the player can develop a contagious virus and destroy the world).
  - Filter for the comments, both positive and negative, that contain words like covid, covid-19, life, etc.
  - Within these comments, find out which words are mentioned frequently, which may suggest how most players got through this difficult time and turned their lemons into lemonade.
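
The filtering step in the plan can be sketched as a keyword match. The term list below is illustrative only: `covid` and `lockdown` come from this notebook, while `coronavirus`, `pandemic`, and `quarantine` are assumptions, and a broad word like `life` is left out because it would match many unrelated reviews. The names `COVID_TERMS`, `mentions_covid`, and `filter_reviews` are hypothetical.

```python
import re

# Illustrative keyword list; not a vetted dictionary
COVID_TERMS = ["covid", "coronavirus", "pandemic", "lockdown", "quarantine"]

# \b anchors only the start of the word, so "lockdowns" and "covid-19" still match
COVID_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in COVID_TERMS) + r")",
    flags=re.IGNORECASE,
)


def mentions_covid(text):
    """Return True if the review text contains any of the COVID-related terms."""
    return bool(COVID_PATTERN.search(text))


def filter_reviews(texts):
    """Keep only the review texts that mention a COVID-related term."""
    return [text for text in texts if mentions_covid(text)]
```

With a pandas column of review text, the same check could be applied as `df[df["review"].map(mentions_covid)]`.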

In [6]:
# API used in this example can be found in here:
# https://partner.steamgames.com/doc/store/getreviews


# Find the appid for Plague Inc.
temp = steam_spy_all[steam_spy_all["name"] == "Plague Inc: Evolved"]
# take the first matching row explicitly instead of calling int() on a Series
ID = int(temp["appid"].iloc[0])
ID

246620

In [7]:
url = "https://store.steampowered.com/appreviews/" + str(ID)

# for example, get the reviews from the last 100 days
parameters = {"json": 1, "day_range": 100, "num_per_page": 100, "language": "english"}
response = get_request(url, parameters=parameters)
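
A single call returns at most `num_per_page` reviews, so gathering enough COVID-related comments would likely mean following the `cursor` value that the getreviews endpoint returns with each page (start with `"*"`, then pass back the cursor from the previous response). The sketch below is an assumption about how that loop could look and has not been run against the live API; `collect_reviews`, `fetch_page`, and `max_pages` are hypothetical names.

```python
def collect_reviews(fetch_page, base_params, max_pages=10):
    """Follow the `cursor` field across pages and return all reviews seen.

    fetch_page(params) must return the decoded JSON dict for one page.
    """
    reviews, seen_cursors = [], set()
    cursor = "*"  # starting value for the first page
    for _ in range(max_pages):
        if cursor in seen_cursors:
            break  # the API handed back a cursor we already used
        seen_cursors.add(cursor)
        page = fetch_page({**base_params, "cursor": cursor})
        batch = page.get("reviews", [])
        if not batch:
            break  # an empty page means there is nothing more to read
        reviews.extend(batch)
        cursor = page.get("cursor")
        if not cursor:
            break
    return reviews
```

With the helper defined earlier, this could be called as `collect_reviews(lambda p: get_request(url, parameters=p), {"json": 1, "filter": "recent", "num_per_page": 100, "language": "english"})`. Setting `filter` to `recent` is an assumption: as I read the documentation, the default `all` filter always finds results to return, so a cursor loop might not end on its own, which is also why `max_pages` caps the loop.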

In [8]:
response = pd.DataFrame(response["reviews"])
response.head()

Unnamed: 0,recommendationid,author,language,review,timestamp_created,timestamp_updated,voted_up,votes_up,votes_funny,weighted_vote_score,comment_count,steam_purchase,received_for_free,written_during_early_access
0,88389555,"{'steamid': '76561199100552421', 'num_games_ow...",english,---{Gameplay}---\n☐ Try not to get addicted\n☑...,1615675467,1615675467,True,91,4,0.8573616743087769,0,True,False,False
1,90758198,"{'steamid': '76561198047197409', 'num_games_ow...",english,Scary.\nNeeds a Plague vs. Cure multiplayer sc...,1619144158,1619144416,True,56,0,0.8454376459121704,0,True,False,False
2,89176409,"{'steamid': '76561197974472406', 'num_games_ow...",english,The joke's not funny anymore.,1616844185,1616844185,True,61,57,0.8447731137275695,0,True,False,False
3,91889313,"{'steamid': '76561199009139092', 'num_games_ow...",english,the,1620845350,1620845350,True,164,31,0.8125506043434143,5,True,False,False
4,93022373,"{'steamid': '76561198297384332', 'num_games_ow...",english,If you need a distraction from the insanity cu...,1622593677,1622593677,True,34,23,0.7676195502281188,0,True,False,False
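
The `timestamp_created` values are Unix timestamps in seconds, so they need converting before reviews can be lined up against the pandemic timeline. A minimal sketch that buckets reviews by month (in UTC) follows; `review_month` is a hypothetical helper, and the sample values are the five timestamps shown in the table above.

```python
from collections import Counter
from datetime import datetime, timezone


def review_month(timestamp_created):
    """Convert a Unix timestamp in seconds to a 'YYYY-MM' string in UTC."""
    return datetime.fromtimestamp(timestamp_created, tz=timezone.utc).strftime("%Y-%m")


# timestamp_created values from the five rows shown above
sample = [1615675467, 1619144158, 1616844185, 1620845350, 1622593677]
reviews_per_month = Counter(review_month(ts) for ts in sample)
print(sorted(reviews_per_month.items()))
```

For the full DataFrame, `pd.to_datetime(response["timestamp_created"], unit="s")` gives the same conversion as a datetime column.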


In [25]:
reviews = response[["author", "review", "votes_up", "votes_funny"]].sort_values(
    by=["votes_up", "votes_funny"], ascending=False
)
reviews.head()

Unnamed: 0,author,review,votes_up,votes_funny
3,"{'steamid': '76561199009139092', 'num_games_ow...",the,164,31
6,"{'steamid': '76561198009676887', 'num_games_ow...",omg get this im gonna make a virus and name is...,91,40
0,"{'steamid': '76561199100552421', 'num_games_ow...",---{Gameplay}---\n☐ Try not to get addicted\n☑...,91,4
2,"{'steamid': '76561197974472406', 'num_games_ow...",The joke's not funny anymore.,61,57
1,"{'steamid': '76561198047197409', 'num_games_ow...",Scary.\nNeeds a Plague vs. Cure multiplayer sc...,56,0


In [26]:
review_list = pd.DataFrame(reviews["review"])
review_list

Unnamed: 0,review
3,the
6,omg get this im gonna make a virus and name is...
0,---{Gameplay}---\n☐ Try not to get addicted\n☑...
2,The joke's not funny anymore.
1,Scary.\nNeeds a Plague vs. Cure multiplayer sc...
...,...
75,"This was the game to play to simulate covid, t..."
76,yes
79,its cool i guess
80,Finally! I can get people sick without being ...


In [27]:
# Using the NLTK pre-trained Sentiment Analyzer
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to C:\Users\Ben
[nltk_data]     Gao\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

In [28]:
SA = SentimentIntensityAnalyzer()

In [48]:
# Score each review once and reuse the result for all four columns
scores = [SA.polarity_scores(rv) for rv in review_list["review"]]

review_list["pos"] = [s["pos"] for s in scores]
review_list["neg"] = [s["neg"] for s in scores]
review_list["neu"] = [s["neu"] for s in scores]
review_list["compound"] = [s["compound"] for s in scores]

review_list.sort_values(by = "compound", ascending=False)

# A fuller analysis should first keep only the comments that contain terms such as "covid", "covid-19", "lockdown", or "life", and then run this sentiment analysis.
# My concern is that this might not leave enough comments, or might miss some COVID-related reviews because of the limitations of the dictionary.
# For demo purposes, though, this code is enough to show how the API and the Python library used to analyze the reviews work.

Unnamed: 0,review,neg,pos,neu,compound
8,"This game is great, and it's great playing it ...",0.000,0.281,0.719,0.9763
0,---{Gameplay}---\n☐ Try not to get addicted\n☑...,0.137,0.224,0.639,0.9700
11,I love this game!\nMy biggest worry was whethe...,0.058,0.190,0.752,0.9656
34,While I wouldn't say it is worth the full aski...,0.103,0.200,0.697,0.9287
42,An amazing game which has almost infinite cont...,0.000,0.316,0.684,0.9211
...,...,...,...,...,...
80,Finally! I can get people sick without being ...,0.313,0.122,0.564,-0.5213
27,monke destroy everything,0.636,0.000,0.364,-0.5423
9,Incredible Gameplay And It Is Really Fun To De...,0.274,0.209,0.517,-0.5828
49,If you want to start the third world war betwe...,0.160,0.108,0.731,-0.7964
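
One way to sanity-check these scores is to turn `compound` into a label and compare it with the reviewer's own `voted_up` flag. The sketch below uses the ±0.05 cut-offs commonly suggested for VADER; the function names `label_compound` and `agreement_rate` are hypothetical, and the sample values in the usage line are made up for illustration rather than taken from the data.

```python
def label_compound(compound, threshold=0.05):
    """Map a VADER compound score to 'positive', 'negative', or 'neutral'."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def agreement_rate(compounds, voted_up):
    """Share of non-neutral reviews whose sentiment label matches voted_up.

    Returns None when every review is labelled neutral.
    """
    decided = [
        (label, vote)
        for label, vote in ((label_compound(c), v) for c, v in zip(compounds, voted_up))
        if label != "neutral"
    ]
    if not decided:
        return None
    hits = sum((label == "positive") == vote for label, vote in decided)
    return hits / len(decided)


# made-up scores and votes: two of the three non-neutral reviews agree
print(agreement_rate([0.9, -0.5, 0.0, 0.3], [True, True, False, True]))
```

A low agreement rate would suggest that lexicon-based scores are a poor fit for this game's reviews, where dark humor about infecting the world can read as negative even in a recommendation.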
