# Steam Library Analyzer
### GitHub repository: [https://github.com/saulTejeda117/Steam-Data-Analyzer](https://github.com/saulTejeda117/Steam-Data-Analyzer)

Steam Library Analyzer is a data science project focused on predictive analysis of the gaming habits of [_`Steam`_](https://store.steampowered.com) users. Its main goal is to estimate the time needed to complete every game in a player's library. To do so, it draws on sources such as the video game website [_`How Long To Beat`_](https://howlongtobeat.com) and the [_`Steam API`_](https://steamcommunity.com/dev), which provide access to the required information, including:

- **Completion Rate:** The proportion of games a player has completed relative to the total number of games in their library.
  
- **Total games:** The total number of games the user currently has in their Steam library.
  
- **Perfect Games:** Games whose goals and achievements have been 100% completed, according to the statistics provided by Steam.
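
The relationship between these three metrics can be sketched as a simple ratio. This is a minimal illustration that assumes a "completed" game is the same as a perfect game; the function name `completion_rate` and the numbers used are hypothetical and do not come from the analyzed account.

```python
def completion_rate(perfect_games: int, total_games: int) -> float:
    """Return the share of the library that is fully completed (0.0 to 1.0)."""
    if total_games <= 0:
        # An empty library has nothing to complete
        return 0.0
    return perfect_games / total_games


# Hypothetical numbers, for illustration only
rate = completion_rate(perfect_games=12, total_games=240)
print(f"Completion rate: {rate:.1%}")
```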

<hr>

In [1]:
import requests
import json
import time
import pandas as pd
from IPython import display
import matplotlib.pyplot as plt
import numpy as np
import re

## 1.1 Obtain User Steam Profile Data

The analysis begins by gathering the essential information for the user account to be evaluated. The relevant data is read from the _JSON_ file _`"steam_credentials.json"`_, which holds:

- **Steam API key:** a unique identifier that Steam issues to developers and applications that want to access the Steam API.
  
- **Steam ID:** a unique identifier for a user and their profile on the Steam platform.
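
As a rough sketch, the credentials file can be read with a small helper that fails early when a field is missing. The key names `api_key` and `steam_id` are the ones the notebook reads; the flat file layout shown in the comment and the helper name `load_credentials` are assumptions.

```python
import json
from pathlib import Path

# Assumed file layout (placeholder values):
# {"api_key": "YOUR_STEAM_API_KEY", "steam_id": "YOUR_STEAM_ID"}


def load_credentials(path: str = "steam_credentials.json") -> tuple[str, str]:
    """Read the API key and Steam ID, raising KeyError if either is missing."""
    credentials = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in ("api_key", "steam_id") if not credentials.get(key)]
    if missing:
        raise KeyError(f"Missing fields in {path}: {', '.join(missing)}")
    # The Steam ID is returned as a string so it can be placed in a URL directly
    return credentials["api_key"], str(credentials["steam_id"])


# Usage, once the file exists:
#   api_key, steam_id = load_credentials()
```

Keeping the credentials in a separate file also makes it easier to exclude them from version control.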

In [2]:
# Load the steam credentials JSON file 
with open('steam_credentials.json') as json_file:
    credentials = json.load(json_file)

# Extract the API key and the Steam ID
api_key = credentials.get('api_key')
steam_id = credentials.get('steam_id')

Next, a request is sent to the [_`Steam API`_](https://steamcommunity.com/dev) to retrieve the data of the user account that the credentials belong to, and the response to the request is checked.


In [3]:
# Build the player-summary URL for the Steam API
player_info_url = f'https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key={api_key}&steamids={steam_id}'
response = requests.get(player_info_url)

# requests.get never returns None, so check the HTTP status instead
if response.status_code == 200:
    data = response.json()
    # print(data, "\n\n\n")
    print("Username: ", data['response']['players'][0]['personaname'])
    print("Avatar: ", data['response']['players'][0]['avatarfull'])
    print("Link: ", data['response']['players'][0]['profileurl'])

else:
    print("Something went wrong!")


Username:  Grabma
Avatar:  https://avatars.steamstatic.com/af32b9e84f67edb7cdacc52177c5f8f05ce0fded_full.jpg
Link:  https://steamcommunity.com/id/saultejm/
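
The response check can be separated from the request itself, which makes it easier to test. The sketch below is one possible shape for it, assuming that an unknown Steam ID yields an empty `players` list; the helper name `parse_player_summary` is hypothetical, while the field names are the ones printed above.

```python
def parse_player_summary(status_code: int, payload: dict) -> dict:
    """Validate a GetPlayerSummaries response and return the fields of interest."""
    if status_code != 200:
        raise RuntimeError(f"Steam API returned HTTP {status_code}")
    players = payload.get("response", {}).get("players", [])
    if not players:
        # Assumption: an unknown Steam ID produces an empty player list
        raise ValueError("No player found for this Steam ID")
    player = players[0]
    return {
        "username": player.get("personaname"),
        "avatar": player.get("avatarfull"),
        "link": player.get("profileurl"),
    }


# Usage with the notebook's `response` object:
#   summary = parse_player_summary(response.status_code, response.json())

sample = {"response": {"players": [{"personaname": "Grabma"}]}}
print(parse_player_summary(200, sample))
```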


<hr>

## 1.2 Obtain User Steam Library Data

After retrieving the account's user data, the next step is to fetch the game data from the user's library. The main fields of interest at this stage are:

- **appid:** the unique identifier that Steam assigns to each game or application.
  
- **playtime_forever:** the total time the user has played the game, in minutes.


### 1.2.1 Obtain AppID and Playtime Data

In [4]:
# Get data from my Steam library
games_endpoint = f"https://api.steampowered.com/IPlayerService/GetOwnedGames/v1/?key={api_key}&steamid={steam_id}"
response_games = requests.get(games_endpoint)
data_games = response_games.json()

df_games = pd.json_normalize(data_games['response']['games'])
df_games['game_name'] = None
df_games

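| | appid | playtime_forever | playtime_windows_forever | playtime_mac_forever | playtime_linux_forever | rtime_last_played | playtime_disconnected | playtime_2weeks | game_name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 9050 | 142 | 142 | 0 | 0 | 1597370032 | 0 | | |
| 1 | 9070 | 0 | 0 | 0 | 0 | 0 | 0 | | |
| 2 | 208200 | 0 | 0 | 0 | 0 | 0 | 0 | | |
| 3 | 400 | 245 | 245 | 0 | 0 | 1594507407 | 0 | | |
| 4 | 20900 | 0 | 0 | 0 | 0 | 0 | 0 | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 238 | 1144770 | 3 | 3 | 0 | 0 | 1698369396 | 0 | | |
| 239 | 544610 | 0 | 0 | 0 | 0 | 0 | 0 | | |
| 240 | 226620 | 0 | 0 | 0 | 0 | 0 | 0 | | |
| 241 | 43160 | 0 | 0 | 0 | 0 | 0 | 0 | | |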
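
The raw columns are easier to read after a unit conversion. The following is a small sketch on a few sample rows, assuming `playtime_forever` is in minutes, `rtime_last_played` is a Unix timestamp in seconds, and a value of 0 means the game was never launched; the column names `playtime_hours` and `last_played` are introduced here for illustration.

```python
import pandas as pd

# Sample rows copied from the library table
df_sample = pd.DataFrame({
    "appid": [9050, 9070, 400],
    "playtime_forever": [142, 0, 245],
    "rtime_last_played": [1597370032, 0, 1594507407],
})

# Convert minutes to hours
df_sample["playtime_hours"] = (df_sample["playtime_forever"] / 60).round(2)

# Treat a timestamp of 0 as "never played" (an assumption) so it becomes NaT
last_played = df_sample["rtime_last_played"].where(df_sample["rtime_last_played"] > 0)
df_sample["last_played"] = pd.to_datetime(last_played, unit="s", utc=True)

print(df_sample[["appid", "playtime_hours", "last_played"]])
```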


### 1.2.2 Obtain Game Names

In [8]:
errors = 0
all_games_data = []

# The appdetails response is keyed by the AppID as a string
df_games['appid'] = df_games['appid'].astype(str)

for game in range(len(df_games)):
    appid = df_games.iloc[game]['appid']
    app_details_endpoint = f"https://store.steampowered.com/api/appdetails/?appids={appid}"
    response_app_details = requests.get(app_details_endpoint)

    if response_app_details.status_code == 200:
        data_app_details = response_app_details.json()
        try:
            game_name = data_app_details[appid]['data']['name']
            df_games.loc[game, 'game_name'] = game_name
            print(game, " AppID: ", appid, "\n", "Name: ", game_name, "\n", "Play Time: ", df_games.iloc[game]['playtime_forever'], "\n")

        except (KeyError, TypeError):
            # The response has no usable 'data' entry for this app
            print("Error: ", appid, "\n")
            errors += 1

    else:
        print(f"Error: {game} {appid} {response_app_details.status_code}")
        errors += 1

    # Pause between requests to stay under the store API rate limit
    time.sleep(1)


print(f"Process Completed. Errors {errors}")

0  AppID:  9050 
 Name:  DOOM 3 
 Play Time:  142 

1  AppID:  9070 
 Name:  DOOM 3 Resurrection of Evil 
 Play Time:  0 

2  AppID:  208200 
 Name:  DOOM 3 
 Play Time:  0 

3  AppID:  400 
 Name:  Portal 
 Play Time:  245 

4  AppID:  20900 
 Name:  The Witcher: Enhanced Edition Director's Cut 
 Play Time:  0 

5  AppID:  13500 
 Name:  Prince of Persia: Warrior Within™ 
 Play Time:  0 

6  AppID:  13530 
 Name:  Prince of Persia: The Two Thrones™ 
 Play Time:  0 

7  AppID:  13600 
 Name:  Prince of Persia®: The Sands of Time 
 Play Time:  0 

8  AppID:  19980 
 Name:  Prince of Persia® 
 Play Time:  0 

9  AppID:  17470 
 Name:  Dead Space (2008) 
 Play Time:  37 

10  AppID:  32440 
 Name:  LEGO® Star Wars™ - The Complete Saga 
 Play Time:  101 

11  AppID:  550 
 Name:  Left 4 Dead 2 
 Play Time:  2244 

12  AppID:  33320 
 Name:  Prince of Persia: The Forgotten Sands™ 
 Play Time:  472 

13  AppID:  40800 
 Name:  Super Meat Boy 
 Play Time:  751 

14  AppID:  47780 
 Name:  Dea
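
The name lookup depends on the shape of the `appdetails` response. Below is a minimal sketch of a helper that isolates that parsing step, assuming the store wraps each result in an object keyed by the AppID with a `success` flag and a `data` object; the helper name `extract_game_name` is hypothetical.

```python
def extract_game_name(payload, appid):
    """Return the store name for `appid`, or None when the lookup failed."""
    entry = (payload or {}).get(str(appid)) or {}
    if not entry.get("success"):
        # Assumption: unavailable apps are reported with success set to False
        return None
    return (entry.get("data") or {}).get("name")


# Hand-written payloads that mimic the assumed response shape
found = {"400": {"success": True, "data": {"name": "Portal"}}}
missing = {"43160": {"success": False}}

print(extract_game_name(found, 400))
print(extract_game_name(missing, 43160))
```

Returning `None` instead of raising lets the caller count failures without a broad `try`/`except` around the whole loop body.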

In [10]:
df_games

| | appid | playtime_forever | game_name |
|---|---|---|---|
| 0 | 9050 | 142 | DOOM 3 |
| 1 | 9070 | 0 | DOOM 3 Resurrection of Evil |
| 2 | 208200 | 0 | DOOM 3 |
| 3 | 400 | 245 | Portal |
| 4 | 20900 | 0 | The Witcher: Enhanced Edition Director's Cut |
| ... | ... | ... | ... |
| 238 | 1144770 | 3 | SLUDGE LIFE |
| 239 | 544610 | 0 | Battlestar Galactica Deadlock |
| 240 | 226620 | 0 | Desktop Dungeons |
| 241 | 43160 | 0 | |
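
Two things stand out in this table: some rows have no name, and some names appear under more than one AppID. Both cases can be listed before querying another service by name. This sketch rebuilds a few rows by hand, so the variable names `unresolved` and `duplicates` are illustrative only.

```python
import pandas as pd

# A few rows copied from the table above
df_games = pd.DataFrame({
    "appid": ["9050", "208200", "400", "43160"],
    "playtime_forever": [142, 0, 245, 0],
    "game_name": ["DOOM 3", "DOOM 3", "Portal", None],
})

# Rows whose name lookup did not return anything
unresolved = df_games[df_games["game_name"].isna()]

# Names that appear under more than one AppID
named = df_games.dropna(subset=["game_name"])
duplicates = named[named.duplicated("game_name", keep=False)]

print("Unresolved AppIDs:", unresolved["appid"].tolist())
print("Duplicated names:", duplicates["appid"].tolist())
```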


<hr>

## 1.3 Data sets: Get the howlongtobeat data
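
Store names often carry symbols such as ™ and ® that a name search may not match, so this step strips them before querying. The sketch below shows the effect of the same character filter on names from the library. Collapsing repeated spaces is an extra step added for illustration, and the function name `clean_game_name` is hypothetical.

```python
import re


def clean_game_name(name: str) -> str:
    """Keep ASCII letters, digits, whitespace and : . , - then tidy the spacing."""
    cleaned = re.sub(r"[^a-zA-Z0-9\s:.,-]", "", name)
    # Extra step (not in the notebook): collapse runs of whitespace
    return re.sub(r"\s+", " ", cleaned).strip()


for raw in ("Prince of Persia®: The Sands of Time", "Dead Space (2008)"):
    print(repr(raw), "->", repr(clean_game_name(raw)))
```

The filter also drops accented letters, apostrophes and parentheses, which can change a title enough to affect search results.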

In [7]:
games = 0

# NOTE: `played_time_file` was never defined in this notebook (it raised a NameError);
# iterating over the names stored in df_games is assumed to be the intended input.
for game_name in df_games['game_name'].dropna():
    # Strip symbols such as ™ and ® that the name search may not match
    game_name1 = re.sub(r'[^a-zA-Z0-9\s\:\.\-\,]', '', game_name)

    beat_time_url = f"https://hltb-api.vercel.app/api?name={game_name1}"
    beat_time_response = requests.get(beat_time_url)

    if beat_time_response.status_code == 200:
        beat_time_data = beat_time_response.json()
        try:
            # 'gameplayCompletionist' could be used here instead of 'gameplayMain'
            print("Name:", beat_time_data[0]['name'], " - Beat Time:", beat_time_data[0]['gameplayMain'])
            games += 1
        except (IndexError, KeyError, TypeError):
            # No match was found or the response has an unexpected shape
            print("ERROR:", game_name1)
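
The project's stated goal is the estimated time needed to finish the library. Here is a rough sketch of how playtime and main-story estimates could be combined, assuming `gameplayMain` is expressed in hours and that time already played counts toward completion. The playtime values come from the library output, while the `gameplayMain` values are placeholders, not real How Long To Beat data.

```python
def remaining_hours(playtime_minutes: float, main_story_hours: float) -> float:
    """Estimate the hours left to finish a game, never going below zero."""
    played_hours = playtime_minutes / 60
    return max(main_story_hours - played_hours, 0.0)


# gameplayMain values below are placeholders for illustration only
library = [
    {"name": "DOOM 3", "playtime_forever": 142, "gameplayMain": 10},
    {"name": "Portal", "playtime_forever": 245, "gameplayMain": 3},
]

total = sum(
    remaining_hours(game["playtime_forever"], game["gameplayMain"])
    for game in library
)
print(f"Estimated hours left: {total:.1f}")
```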