## Getting Twitch data from the OpenCritic "Hall of Fame" games list

This notebook uses the games in an OpenCritic "Hall of Fame" list (`hall_of_fame.csv`) as query parameters for a Twitch game-popularity API hosted on RapidAPI.

My main goals here:

* Learn to use a CSV table as the source of parameters for an API
* Create a DataFrame by appending all the API results
* Join the Twitch DataFrame with the OpenCritic DataFrame

In [253]:
import json
import requests
import pandas as pd
import datetime as dt
import numpy as np

In [254]:
# Read the OpenCritic Data from .csv to DataFrame
hall_of_fame = pd.read_csv('hall_of_fame.csv')
hall_of_fame.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 84 entries, 0 to 83
Data columns (total 3 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   name              84 non-null     object
 1   firstReleaseDate  84 non-null     object
 2   topCriticScore    84 non-null     int64 
dtypes: int64(1), object(2)
memory usage: 2.1+ KB


In [59]:
# The get_responses_twitch function gets data from the Twitch API for one game,
# one request per year from init_year to end_year (inclusive),
# and saves the JSON results in a list.
def get_responses_twitch(game,init_year, end_year):
    years = range(init_year, end_year+1, 1)
    
    response_list=[]
    
    for year in years:

        url = "https://twitch-game-popularity.p.rapidapi.com/game"

        querystring = {"name":game,"year":str(year)}

        headers = {
            "X-RapidAPI-Key": "589a49664fmshb31addec27c7813p100adbjsn8ca39eb71bf4",
            "X-RapidAPI-Host": "twitch-game-popularity.p.rapidapi.com"
        }

        # Using the request method to GET the response
        response = requests.request('GET', url, headers=headers, params=querystring)
        # Append the multiple responses
        response_list.append(response.json())
    return response_list
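
The function above keeps the API key inside the notebook. As a minimal sketch of one alternative, the request arguments can be built by a small helper that reads the key from an environment variable. The helper name `build_twitch_request` and the variable name `RAPIDAPI_KEY` are assumptions made for illustration; they are not part of the original notebook.

```python
import os

API_URL = "https://twitch-game-popularity.p.rapidapi.com/game"
API_HOST = "twitch-game-popularity.p.rapidapi.com"


def build_twitch_request(game, year, api_key=None):
    """Return (url, headers, params) for one game/year query.

    Hypothetical helper: RAPIDAPI_KEY is an assumed environment variable name.
    """
    if api_key is None:
        api_key = os.environ["RAPIDAPI_KEY"]  # raises KeyError if the variable is unset
    headers = {"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST}
    params = {"name": game, "year": str(year)}
    return API_URL, headers, params


# Build the arguments without calling the API (a placeholder key is passed explicitly)
url, headers, params = build_twitch_request(
    "What Remains of Edith Finch", 2017, api_key="dummy-key"
)
print(params)
```

The returned values could then be passed to `requests.get(url, headers=headers, params=params, timeout=30)`; the timeout value is an arbitrary choice.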

In [256]:
# The "all_games_data" function uses "get_responses_twitch" to fetch Twitch data
# for each game and year, then collects the selected features from the JSON
# responses into one list of rows per game.
def all_games_data(games, years, features):
    results = []
    for game in games:
        # One JSON response (a list of monthly records) per year
        responses = get_responses_twitch(game, years[0], years[1])
        game_data = []
        for response in responses:
            for record in response:
                game_data.append([record[feature] for feature in features])
        results.append(game_data)

    return results
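
The nested loops above assume that each yearly response is a list of monthly records, each one a dictionary keyed by feature name. A small sketch of that extraction on stand-in data follows; `extract_rows` is a hypothetical helper and the sample records are illustrative placeholders, not real API output.

```python
def extract_rows(responses, features):
    """Turn a list of per-year responses into one row per monthly record."""
    rows = []
    for year_response in responses:
        for record in year_response:
            rows.append([record[feature] for feature in features])
    return rows


# Illustrative stand-in for two yearly responses (not real API output)
sample_responses = [
    [{"Game": "Example Game", "Month": "04", "Year": "2017", "Rank": "192"}],
    [
        {"Game": "Example Game", "Month": "01", "Year": "2018", "Rank": "250"},
        {"Game": "Example Game", "Month": "02", "Year": "2018", "Rank": "260"},
    ],
]

rows = extract_rows(sample_responses, ["Game", "Year", "Month"])
print(rows)
# [['Example Game', '2017', '04'], ['Example Game', '2018', '01'], ['Example Game', '2018', '02']]
```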


In [229]:
# Select the features used in the "all_games_data" function
features =  ['Rank',
             'Game',
             'Month',
             'Year',
             'Hours_watched',
             'Hours_Streamed',
             'Peak_viewers',
             'Peak_channels',
             'Streamers',
             'Avg_viewers',
             'Avg_channels',
             'Avg_viewer_ratio']

# Select the range of years (first and last, inclusive) used in the "all_games_data" function
years = [2016,2022]

# Select the list of games used in the "all_games_data" function, extracted from the OpenCritic Hall of Fame
list_of_games = hall_of_fame['name']

# Call the "all_games_data" function and save the returned list
json_all_games_data = all_games_data(list_of_games,years, features)

# Important step: flatten the list of lists of lists (one list of rows per game)
# into a single list of lists (one list of values per row)
all_games_data1=[ite for sub_list in json_all_games_data for ite in sub_list]
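
The flattening step removes only the outer "per game" level of nesting. As a sketch on illustrative data, the same result can be obtained with `itertools.chain.from_iterable` from the standard library:

```python
from itertools import chain

# Illustrative nested structure: one inner list of rows per game
per_game_rows = [
    [["Game A", "2016"], ["Game A", "2017"]],
    [],  # a game with no rows contributes nothing
    [["Game B", "2020"]],
]

flat_rows = list(chain.from_iterable(per_game_rows))
print(flat_rows)
# [['Game A', '2016'], ['Game A', '2017'], ['Game B', '2020']]
```

Each row stays intact as a list, so the result can be passed directly to `pd.DataFrame`.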


In [247]:
# Create a DataFrame
df_games = pd.DataFrame(all_games_data1)
df_games.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 780 entries, 0 to 779
Data columns (total 12 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       780 non-null    object
 1   1       780 non-null    object
 2   2       780 non-null    object
 3   3       780 non-null    object
 4   4       780 non-null    object
 5   5       780 non-null    object
 6   6       780 non-null    object
 7   7       780 non-null    object
 8   8       780 non-null    object
 9   9       780 non-null    object
 10  10      780 non-null    object
 11  11      780 non-null    object
dtypes: object(12)
memory usage: 73.2+ KB
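
The `info()` output shows every column stored as `object`. Assuming the values arrive from the API as strings, a sketch of converting the numeric columns with `pd.to_numeric` might look like this; the rows and the reduced column list are illustrative only.

```python
import pandas as pd

columns = ["Rank", "Game", "Month", "Year", "Hours_watched"]
# Illustrative rows in the same string-valued shape as the flattened list
rows = [
    ["8", "Game A", "03", "2020", "30832827"],
    ["15", "Game A", "04", "2020", "22494164"],
]

df = pd.DataFrame(rows, columns=columns)

# Convert only the columns that hold numbers; "Month" is left as a zero-padded string
numeric_cols = ["Rank", "Year", "Hours_watched"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)
print(df.dtypes)
```

With numeric dtypes, sorting and aggregation behave numerically rather than lexicographically.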


In [260]:
# Define the column names ('Game' becomes 'name' to match the OpenCritic DataFrame)
columns = ['Rank',
             'name',
             'Month',
             'Year',
             'Hours_watched',
             'Hours_Streamed',
             'Peak_viewers',
             'Peak_channels',
             'Streamers',
             'Avg_viewers',
             'Avg_channels',
             'Avg_viewer_ratio']
df_games.columns = columns
df_games

ValueError: Length mismatch: Expected axis has 14 elements, new values have 12 elements

In [249]:
# Remove the trailing "hours" text (the last 6 characters) from Hours_Streamed
df_games['Hours_Streamed'] = df_games['Hours_Streamed'].str[:-6]
# Sort values by ['name','Year','Month']
df_games = df_games.sort_values(by=['name','Year','Month'])
df_games

|     | Rank | name | Month | Year | Hours_watched | Hours_Streamed | Peak_viewers | Peak_channels | Streamers | Avg_viewers | Avg_channels | Avg_viewer_ratio |
|-----|------|------|-------|------|---------------|----------------|--------------|---------------|-----------|-------------|--------------|------------------|
| 731 | 8 | Animal Crossing: New Horizons | 03 | 2020 | 30832827 | 455308 | 244171 | 4403 | 40068 | 41497 | 612 | 67.72 |
| 732 | 15 | Animal Crossing: New Horizons | 04 | 2020 | 22494164 | 519504 | 130992 | 1759 | 38121 | 31285 | 722 | 43.3 |
| 733 | 24 | Animal Crossing: New Horizons | 05 | 2020 | 10150782 | 306001 | 58370 | 862 | 25999 | 13661 | 411 | 33.17 |
| 734 | 41 | Animal Crossing: New Horizons | 06 | 2020 | 4530366 | 169253 | 24104 | 541 | 17257 | 6300 | 235 | 26.77 |
| 735 | 52 | Animal Crossing: New Horizons | 07 | 2020 | 3878213 | 156666 | 43056 | 681 | 16671 | 5219 | 210 | 24.75 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6 | 192 | Uncharted 4: A Thief's End | 04 | 2017 | 129993 | 21675 | 1665 | 65 | 3110 | 180 | 30 | 6 |
| 7 | 171 | Uncharted 4: A Thief's End | 06 | 2017 | 166236 | 20712 | 6514 | 63 | 4261 | 231 | 28 | 8.03 |
| 8 | 157 | Uncharted 4: A Thief's End | 08 | 2017 | 234356 | 20074 | 11747 | 60 | 3796 | 315 | 27 | 11.67 |
| 451 | 186 | What Remains of Edith Finch | 04 | 2017 | 134691 | 769 | 45323 | 26 | 271 | 187 | 1 | 175.15 |
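
Slicing with `.str[:-6]` drops the last six characters whatever they are. A sketch of a more explicit alternative, assuming the suffix is the literal text " hours" (inferred from the cell's comment, not confirmed against the API), removes the suffix only when it is present and then converts the column to numbers:

```python
import pandas as pd

# Illustrative values; the last one has no suffix and is left unchanged
hours = pd.Series(["455308 hours", "769 hours", "21675"])

# Strip a trailing "hours" (and any whitespace before it) only where it occurs
cleaned = hours.str.replace(r"\s*hours$", "", regex=True)
print(cleaned.tolist())
# ['455308', '769', '21675']

as_numbers = pd.to_numeric(cleaned)
```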


In [251]:
# Left-join the OpenCritic DataFrame (hall_of_fame) onto the Twitch DataFrame (df_games) by game name
df_games = df_games.set_index('name').join(hall_of_fame.set_index('name'), on='name', how='left')

# Convert the index ('name') back into a column
df_games = df_games.reset_index()
df_games

|     | name | Rank | Month | Year | Hours_watched | Hours_Streamed | Peak_viewers | Peak_channels | Streamers | Avg_viewers | Avg_channels | Avg_viewer_ratio | firstReleaseDate | topCriticScore |
|-----|------|------|-------|------|---------------|----------------|--------------|---------------|-----------|-------------|--------------|------------------|------------------|----------------|
| 0 | Animal Crossing: New Horizons | 8 | 03 | 2020 | 30832827 | 455308 | 244171 | 4403 | 40068 | 41497 | 612 | 67.72 | 2020-03-20 | 90 |
| 1 | Animal Crossing: New Horizons | 15 | 04 | 2020 | 22494164 | 519504 | 130992 | 1759 | 38121 | 31285 | 722 | 43.3 | 2020-03-20 | 90 |
| 2 | Animal Crossing: New Horizons | 24 | 05 | 2020 | 10150782 | 306001 | 58370 | 862 | 25999 | 13661 | 411 | 33.17 | 2020-03-20 | 90 |
| 3 | Animal Crossing: New Horizons | 41 | 06 | 2020 | 4530366 | 169253 | 24104 | 541 | 17257 | 6300 | 235 | 26.77 | 2020-03-20 | 90 |
| 4 | Animal Crossing: New Horizons | 52 | 07 | 2020 | 3878213 | 156666 | 43056 | 681 | 16671 | 5219 | 210 | 24.75 | 2020-03-20 | 90 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 775 | Uncharted 4: A Thief's End | 192 | 04 | 2017 | 129993 | 21675 | 1665 | 65 | 3110 | 180 | 30 | 6 | 2016-05-10 | 93 |
| 776 | Uncharted 4: A Thief's End | 171 | 06 | 2017 | 166236 | 20712 | 6514 | 63 | 4261 | 231 | 28 | 8.03 | 2016-05-10 | 93 |
| 777 | Uncharted 4: A Thief's End | 157 | 08 | 2017 | 234356 | 20074 | 11747 | 60 | 3796 | 315 | 27 | 11.67 | 2016-05-10 | 93 |
| 778 | What Remains of Edith Finch | 186 | 04 | 2017 | 134691 | 769 | 45323 | 26 | 271 | 187 | 1 | 175.15 | 2017-04-25 | 88 |
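
The join attaches one OpenCritic row to many Twitch rows per game. As a sketch on illustrative frames, `DataFrame.merge` can express the same left join on a shared column, and its `validate` argument checks that each game name appears only once on the OpenCritic side:

```python
import pandas as pd

# Illustrative stand-ins for the Twitch and OpenCritic DataFrames
twitch = pd.DataFrame({
    "name": ["Game A", "Game A", "Game B"],
    "Month": ["03", "04", "04"],
})
critic = pd.DataFrame({
    "name": ["Game A", "Game B"],
    "topCriticScore": [90, 88],
})

# validate="many_to_one" raises MergeError if a name is duplicated in `critic`
joined = twitch.merge(critic, on="name", how="left", validate="many_to_one")
print(joined)
```

Because `name` stays a regular column here, no `set_index` or `reset_index` step is needed.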


In [252]:
# Save the joined DataFrame to a CSV file
df_games.to_csv('list_of_games.csv')
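
By default `to_csv` also writes the row index as an unnamed first column, which comes back as `Unnamed: 0` when the file is read again. A small sketch using an in-memory buffer and illustrative data shows the difference that `index=False` makes:

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["Game A"], "topCriticScore": [90]})

with_index = io.StringIO()
df.to_csv(with_index)  # default: the index is written as the first column

without_index = io.StringIO()
df.to_csv(without_index, index=False)  # the index is omitted

cols_with = pd.read_csv(io.StringIO(with_index.getvalue())).columns.tolist()
cols_without = pd.read_csv(io.StringIO(without_index.getvalue())).columns.tolist()
print(cols_with)     # ['Unnamed: 0', 'name', 'topCriticScore']
print(cols_without)  # ['name', 'topCriticScore']
```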