**The RAWG website API was used.**

**The RAWG API is a video game data platform that provides information about games, including title, genre, platform, release date, rating, developer, publisher, description, images, videos, and more.**

**The API supports queries for specific games or for lists of games filtered by parameters such as name, genre, platform, and release date. Results can also be sorted by criteria such as popularity or release date.**

**The RAWG API is used by many developers, companies, and video game enthusiasts to obtain accurate, up-to-date information about games. It is a valuable data source for data science and market analysis projects related to video games.**
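
As a rough illustration of the filtering and ordering described above, the sketch below only builds a query URL and sends no request. The `key`, `ordering` and `platforms` parameters are the ones used later in this notebook; the `search` filter name and the helper `build_games_url` are assumptions made for the example and should be checked against the RAWG documentation.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.rawg.io/api/games"


def build_games_url(api_key, **filters):
    """Return a games-list URL carrying the given query parameters."""
    params = {"key": api_key}
    # Skip filters left as None so they are not sent at all
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}?{urlencode(params)}"


url = build_games_url(
    "YOUR_API_KEY",        # placeholder, not a real key
    search="portal",       # assumed filter name
    ordering="-released",  # newest first, as used in this notebook
    platforms="18,1,7,8",  # platform ids used in this notebook
)
print(url)
```

The leading minus sign in `ordering` requests descending order, which is how the notebook asks for the most recent releases first.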

In [15]:
import requests
import pandas as pd

# Get a RAWG API key from https://rawg.io/apidocs
api_key = 'b26e238e1b1041dbb8ff82e2c24a5398'

# Build the API URL with the required parameters
url = "https://api.rawg.io/api/games"
params = {
    "key": api_key,
    "page_size": 100,
    "page": 1,
    "ordering": "-released",
    "platforms": "18,1,7,8"
}
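
The key above is written directly in the notebook. One common alternative, sketched here under the assumption that the key has been exported in an environment variable, is to read it at run time. The variable name `RAWG_API_KEY` and the helper `load_api_key` are hypothetical.

```python
import os


def load_api_key(env_var="RAWG_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With this in place, `api_key = load_api_key()` could replace the literal string, so the notebook can be shared without exposing the key.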


In [16]:
import time

games = []
start_time = time.time()
max_pages = 1500

while len(games) < 10000 and params["page"] <= max_pages:
    response = requests.get(url, params=params)
    if response.status_code == 401:
        print("API authentication error")
        break
    data = response.json().get("results")
    if not data:
        print("Error: could not retrieve enough results")
        break
    games += data
    params["page"] += 1
    time.sleep(1)  # Wait one second between requests to avoid overloading the API
    # Time cap: trial runs suggested roughly 24 games per second, so 10,000 games should take about 6 to 7 minutes
    if time.time() - start_time > 1000:
        print("Maximum execution time reached")
        break

df = pd.DataFrame(games, columns=['id', 'name', 'background_image', 'rating', 'released', 'added', 'playtime', 'metacritic', 'genres', 'publishers', 'developers', 'tags', 'esrb_rating', 'platforms', 'stores', 'clip', 'short_screenshots', 'description_raw', 'website', 'reviews_count'])
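
The paging loop above can also be written as a small function that takes the page-fetching step as an argument, which makes the stopping rules easy to try without calling the API. This is a minimal sketch, not the notebook's method: `collect_games`, `fetch_page` and `fake_fetch` are hypothetical names, and the fake fetcher stands in for the real request.

```python
import time


def collect_games(fetch_page, target=10000, max_pages=1500, pause=1.0):
    """Collect up to `target` items by calling fetch_page(1), fetch_page(2), ...

    fetch_page is a hypothetical callable that returns the list of results
    for a page number, or an empty list when nothing more is available.
    """
    games = []
    page = 1
    while len(games) < target and page <= max_pages:
        batch = fetch_page(page)
        if not batch:
            break  # no more results, or the request failed upstream
        games.extend(batch)
        page += 1
        if len(games) < target and pause:
            time.sleep(pause)  # pause between requests
    return games[:target]  # trim any overshoot from the last page


# Stand-in for the real API call: 7 pages of 3 fake games each
def fake_fetch(page):
    return [{"id": page * 10 + i} for i in range(3)] if page <= 7 else []


sample = collect_games(fake_fetch, target=10, pause=0)
print(len(sample))
```

With `requests`, `fetch_page` could wrap a call such as `requests.get(url, params={**params, "page": page}).json().get("results", [])`.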


In [17]:
print(f"Retrieved {len(df)} games.")


Retrieved 10000 games.


In [18]:
df.to_csv("games.csv", index=False)
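
One thing to keep in mind with this export: columns such as `genres` and `platforms` hold lists of dicts, and a CSV file can only store them as text. The sketch below uses the standard `csv` module rather than pandas to show the round trip on a single made-up row, and how `ast.literal_eval` can turn the text back into a list.

```python
import ast
import csv
import io

row = {"id": 1, "genres": [{"id": 4, "name": "Action", "slug": "action"}]}

# Write one row the way a CSV export would: the list becomes its text representation
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "genres"])
writer.writeheader()
writer.writerow(row)

# Read it back: every cell is now a plain string
buffer.seek(0)
loaded = next(csv.DictReader(buffer))
print(type(loaded["genres"]).__name__)

# ast.literal_eval parses the Python-literal text back into a list of dicts
genres = ast.literal_eval(loaded["genres"])
print(genres[0]["name"])
```

When reading `games.csv` back with pandas, the same idea applies through `pd.read_csv("games.csv", converters={"genres": ast.literal_eval})`. Columns that contain empty cells would need a guard, since `ast.literal_eval` fails on an empty string.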

In [19]:
df.head(10)

,id,name,background_image,rating,released,added,playtime,metacritic,genres,publishers,developers,tags,esrb_rating,platforms,stores,clip,short_screenshots,description_raw,website,reviews_count
0,335919,The Snack World,,0.0,2030-01-01,2,0,,"[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...",,,[],,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...",,,[],,,0
1,333751,Galapagos,https://media.rawg.io/media/screenshots/3af/3a...,0.0,2030-01-01,5,0,,"[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...",,,[],,"[{'platform': {'id': 18, 'name': 'PlayStation ...",,,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,0
2,499270,Untitled Carnivores Game (2024),,0.0,2024-12-20,0,0,,"[{'id': 40, 'name': 'Casual', 'slug': 'casual'...",,,[],,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...",,,[],,,0
3,485219,Unknown 9: Awakening,https://media.rawg.io/media/games/57e/57e410a4...,0.0,2024-04-06,143,0,,"[{'id': 3, 'name': 'Adventure', 'slug': 'adven...",,,[],,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...",,,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,5
4,490430,Prince of Persia: The Sands of Time Remake,https://media.rawg.io/media/games/b92/b92b55ae...,3.63,2023-12-31,566,0,,"[{'id': 4, 'name': 'Action', 'slug': 'action'}]",,,"[{'id': 193, 'name': 'Classic', 'slug': 'class...","{'id': 3, 'name': 'Teen', 'slug': 'teen', 'nam...","[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 3, 'name': 'PlayStation Stor...",,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,38
5,845261,Assassin's Creed Mirage,https://media.rawg.io/media/games/fbd/fbd01280...,0.0,2023-12-31,276,0,,"[{'id': 4, 'name': 'Action', 'slug': 'action'}]",,,[],"{'id': 5, 'name': 'Adults Only', 'slug': 'adul...","[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 3, 'name': 'PlayStation Stor...",,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,3
6,874262,Midline '85,https://media.rawg.io/media/games/a2c/a2c23ea6...,0.0,2023-12-31,1,0,,"[{'id': 51, 'name': 'Indie', 'slug': 'indie'},...",,,[],,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...",,,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,0
7,554700,Eiyuden Chronicle: Hundred Heroes,https://media.rawg.io/media/games/8b7/8b7ae9f5...,0.0,2023-12-31,49,0,,"[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...",,,[],,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 1, 'name': 'Steam', 'slug': ...",,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,2
8,303576,Vampire: The Masquerade - Bloodlines 2,https://media.rawg.io/media/games/fb5/fb5e0fdb...,3.96,2023-12-31,1731,329,,"[{'id': 4, 'name': 'Action', 'slug': 'action'}...",,,"[{'id': 31, 'name': 'Singleplayer', 'slug': 's...",,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 1, 'name': 'Steam', 'slug': ...",,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,161
9,43253,Little Devil Inside,https://media.rawg.io/media/games/63b/63bd91ad...,4.62,2023-12-31,359,0,,"[{'id': 4, 'name': 'Action', 'slug': 'action'}...",,,"[{'id': 1, 'name': 'Survival', 'slug': 'surviv...",,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...",,,"[{'id': -1, 'image': 'https://media.rawg.io/me...",,,8


In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 20 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   id                 10000 non-null  int64  
 1   name               10000 non-null  object 
 2   background_image   9402 non-null   object 
 3   rating             10000 non-null  float64
 4   released           10000 non-null  object 
 5   added              10000 non-null  int64  
 6   playtime           10000 non-null  int64  
 7   metacritic         2703 non-null   float64
 8   genres             10000 non-null  object 
 9   publishers         0 non-null      float64
 10  developers         0 non-null      float64
 11  tags               9987 non-null   object 
 12  esrb_rating        4918 non-null   object 
 13  platforms          10000 non-null  object 
 14  stores             9474 non-null   object 
 15  clip               0 non-null      object 
 16  short_screenshots  9995
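
The non-null counts listed above can be turned into missing-value percentages with a little arithmetic. The sketch below copies the counts for the columns that have gaps. It is only a worked example of the calculation, not part of the original pipeline.

```python
total_rows = 10000

# Non-null counts copied from the df.info() output above
non_null = {
    "background_image": 9402,
    "metacritic": 2703,
    "publishers": 0,
    "developers": 0,
    "tags": 9987,
    "esrb_rating": 4918,
    "stores": 9474,
    "clip": 0,
}

# Share of missing values per column, in percent
missing_pct = {
    col: round(100 * (total_rows - count) / total_rows, 2)
    for col, count in non_null.items()
}

for col, pct in sorted(missing_pct.items(), key=lambda kv: -kv[1]):
    print(f"{col:<18}{pct:6.2f}% missing")
```

On the DataFrame itself, `df.isna().mean() * 100` gives the same percentages for every column at once.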

In [22]:
df.fillna(0)  # Preview with missing values shown as 0; this returns a copy, df itself is unchanged

,id,name,background_image,rating,released,added,playtime,metacritic,genres,publishers,developers,tags,esrb_rating,platforms,stores,clip,short_screenshots,description_raw,website,reviews_count
0,335919,The Snack World,0,0.00,2030-01-01,2,0,0.0,"[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...",0.0,0.0,[],0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...",0,0,[],0.0,0.0,0
1,333751,Galapagos,https://media.rawg.io/media/screenshots/3af/3a...,0.00,2030-01-01,5,0,0.0,"[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...",0.0,0.0,[],0,"[{'platform': {'id': 18, 'name': 'PlayStation ...",0,0,"[{'id': -1, 'image': 'https://media.rawg.io/me...",0.0,0.0,0
2,499270,Untitled Carnivores Game (2024),0,0.00,2024-12-20,0,0,0.0,"[{'id': 40, 'name': 'Casual', 'slug': 'casual'...",0.0,0.0,[],0,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...",0,0,[],0.0,0.0,0
3,485219,Unknown 9: Awakening,https://media.rawg.io/media/games/57e/57e410a4...,0.00,2024-04-06,143,0,0.0,"[{'id': 3, 'name': 'Adventure', 'slug': 'adven...",0.0,0.0,[],0,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...",0,0,"[{'id': -1, 'image': 'https://media.rawg.io/me...",0.0,0.0,5
4,490430,Prince of Persia: The Sands of Time Remake,https://media.rawg.io/media/games/b92/b92b55ae...,3.63,2023-12-31,566,0,0.0,"[{'id': 4, 'name': 'Action', 'slug': 'action'}]",0.0,0.0,"[{'id': 193, 'name': 'Classic', 'slug': 'class...","{'id': 3, 'name': 'Teen', 'slug': 'teen', 'nam...","[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 3, 'name': 'PlayStation Stor...",0,"[{'id': -1, 'image': 'https://media.rawg.io/me...",0.0,0.0,38
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,334945,Moshi Monsters: Katsuma Unleashed,0,0.00,2013-10-11,1,0,0.0,"[{'id': 11, 'name': 'Arcade', 'slug': 'arcade'}]",0.0,0.0,[],0,"[{'platform': {'id': 8, 'name': 'Nintendo 3DS'...",0,0,[],0.0,0.0,0
9996,13281,Doom & Destiny,https://media.rawg.io/media/screenshots/db9/db...,3.40,2013-10-11,186,4,0.0,"[{'id': 51, 'name': 'Indie', 'slug': 'indie'},...",0.0,0.0,"[{'id': 31, 'name': 'Singleplayer', 'slug': 's...","{'id': 4, 'name': 'Mature', 'slug': 'mature', ...","[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 1, 'name': 'Steam', 'slug': ...",0,"[{'id': -1, 'image': 'https://media.rawg.io/me...",0.0,0.0,10
9997,27310,Escape From Zombie City,https://media.rawg.io/media/screenshots/eaf/ea...,0.00,2013-10-10,0,0,0.0,[],0.0,0.0,"[{'id': 63, 'name': 'Zombies', 'slug': 'zombie...",0,"[{'platform': {'id': 8, 'name': 'Nintendo 3DS'...","[{'store': {'id': 6, 'name': 'Nintendo Store',...",0,"[{'id': -1, 'image': 'https://media.rawg.io/me...",0.0,0.0,0
9998,8764,KAMI,https://media.rawg.io/media/screenshots/9a0/9a...,3.52,2013-10-10,552,1,0.0,"[{'id': 40, 'name': 'Casual', 'slug': 'casual'...",0.0,0.0,"[{'id': 31, 'name': 'Singleplayer', 'slug': 's...","{'id': 2, 'name': 'Everyone 10+', 'slug': 'eve...","[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'store': {'id': 1, 'name': 'Steam', 'slug': ...",0,"[{'id': -1, 'image': 'https://media.rawg.io/me...",0.0,0.0,44


In [23]:
# Keep only the columns relevant to the analysis (.copy() avoids pandas' SettingWithCopyWarning)
df = df[['id', 'name', 'rating', 'released', 'genres', 'platforms', 'metacritic', 'playtime']].copy()

# Drop rows with missing values; among these columns only metacritic has gaps, so 2703 rows remain
df = df.dropna()

In [24]:
# Convert the release date column to datetime
df['released'] = pd.to_datetime(df['released'])

# Intended to turn the genres column into a list of genres.
# The cells already hold lists of dicts rather than strings, so .str.split() returns NaN for every row
df['genres'] = df['genres'].str.split(',')

# Same for platforms: the cells are lists of dicts, so the result is NaN for every row
df['platforms'] = df['platforms'].str.split(',')

# Cast the rating, metacritic and playtime columns to float
df['rating'] = df['rating'].astype(float)
df['metacritic'] = df['metacritic'].astype(float)
df['playtime'] = df['playtime'].astype(float)
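
Because `genres` and `platforms` arrive from the API as lists of dicts, one way to get plain lists of names is to pull the `name` field out of each entry. The helpers below are a hypothetical sketch. The sample values mirror the shapes visible in the `df.head(10)` output, where platform entries nest their data under a `platform` key.

```python
def genre_names(genres):
    """Return the genre names from a list of genre dicts."""
    return [g["name"] for g in genres or []]


def platform_names(platforms):
    """Return platform names; each entry nests its data under a 'platform' key."""
    return [p["platform"]["name"] for p in platforms or []]


# Sample values shaped like the cells shown in the df.head(10) output
sample_genres = [{"id": 4, "name": "Action", "slug": "action"}]
sample_platforms = [{"platform": {"id": 4, "name": "PC"}}]

print(genre_names(sample_genres))
print(platform_names(sample_platforms))
print(genre_names(None))
```

Applied to the DataFrame, this would look like `df['genres'] = df['genres'].apply(genre_names)` and `df['platforms'] = df['platforms'].apply(platform_names)`, used instead of the `.str.split(',')` calls.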

# Brief exploratory analysis of the new dataset:
The following lines of code make it possible to obtain a statistical summary of the dataset and to count the number of games by genre, platform, and release year. These results are a starting point for deeper analysis and for extracting useful information for the data science project.
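
Counting games per genre can be sketched with `collections.Counter`, assuming each game carries a list of genre names. The rows below are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical rows: each game carries a list of genre names
games_genres = [
    ["Action", "Adventure"],
    ["Action"],
    ["RPG", "Indie"],
    [],
]

# Flatten the lists and count how many games mention each genre
genre_counts = Counter(name for genres in games_genres for name in genres)
print(genre_counts.most_common(1))
```

With a DataFrame column of name lists, `df['genres'].explode().value_counts()` produces the same kind of tally. The same pattern works for platforms.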

In [25]:
# Get a statistical summary of the dataset
print(df.describe())

                  id       rating                       released  genres   
count    2703.000000  2703.000000                           2703     0.0  \
mean   217041.758787     2.977225  2018-07-18 23:33:21.775804416     NaN   
min         2.000000     0.000000            2013-10-12 00:00:00     NaN   
25%      9833.000000     2.815000            2016-05-16 12:00:00     NaN   
50%     52384.000000     3.440000            2018-07-24 00:00:00     NaN   
75%    403041.500000     3.875000            2020-10-26 12:00:00     NaN   
max    934469.000000     4.800000            2023-04-11 00:00:00     NaN   
std    254021.149999     1.388433                            NaN     NaN   

       platforms   metacritic     playtime  
count        0.0  2703.000000  2703.000000  
mean         NaN    73.286349     4.275990  
min          NaN    15.000000     0.000000  
25%          NaN    68.000000     1.000000  
50%          NaN    74.000000     3.000000  
75%          NaN    80.000000     5.000000  


In [29]:
# Count the number of games per release year
game_counts_by_year = df.groupby(df['released'].dt.year)['id'].count()
print(game_counts_by_year)

released
2013     51
2014    218
2015    286
2016    340
2017    322
2018    270
2019    287
2020    323
2021    342
2022    243
2023     21
Name: id, dtype: int64
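
As a quick consistency check, the yearly counts above should add up to the number of rows left after cleaning. The sketch below copies the counts from the output and sums them.

```python
# Per-year counts copied from the output above
games_per_year = {
    2013: 51, 2014: 218, 2015: 286, 2016: 340, 2017: 322, 2018: 270,
    2019: 287, 2020: 323, 2021: 342, 2022: 243, 2023: 21,
}

total = sum(games_per_year.values())
busiest_year = max(games_per_year, key=games_per_year.get)
print(total, busiest_year)
```

The total matches the 2703 rows that remain once games without a Metacritic score are dropped.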


In [30]:
df.count()

id            2703
name          2703
rating        2703
released      2703
genres           0
platforms        0
metacritic    2703
playtime      2703
dtype: int64

In [31]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2703 entries, 47 to 9993
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   id          2703 non-null   int64         
 1   name        2703 non-null   object        
 2   rating      2703 non-null   float64       
 3   released    2703 non-null   datetime64[ns]
 4   genres      0 non-null      float64       
 5   platforms   0 non-null      float64       
 6   metacritic  2703 non-null   float64       
 7   playtime    2703 non-null   float64       
dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
memory usage: 190.1+ KB


In [32]:
df.describe()

,id,rating,released,genres,platforms,metacritic,playtime
count,2703.0,2703.0,2703,0.0,0.0,2703.0,2703.0
mean,217041.758787,2.977225,2018-07-18 23:33:21.775804416,,,73.286349,4.27599
min,2.0,0.0,2013-10-12 00:00:00,,,15.0,0.0
25%,9833.0,2.815,2016-05-16 12:00:00,,,68.0,1.0
50%,52384.0,3.44,2018-07-24 00:00:00,,,74.0,3.0
75%,403041.5,3.875,2020-10-26 12:00:00,,,80.0,5.0
max,934469.0,4.8,2023-04-11 00:00:00,,,97.0,156.0
std,254021.149999,1.388433,,,,9.993158,8.724929


In [34]:
df['rating'].nunique()  # Number of unique values in the column

271

In [35]:
df.sort_values('released', ascending=False)

,id,name,rating,released,genres,platforms,metacritic,playtime
47,830271,Sherlock Holmes The Awakened,0.00,2023-04-11,,,75.0,0.0
55,842402,Dredge,4.33,2023-03-29,,,83.0,26.0
58,795632,Resident Evil 4,4.68,2023-03-24,,,92.0,3.0
60,850691,Atelier Ryza 3: Alchemist of the End & the Sec...,0.00,2023-03-24,,,84.0,0.0
62,705258,Have a Nice Death,3.85,2023-03-22,,,83.0,2.0
...,...,...,...,...,...,...,...,...
9988,1366,140,3.63,2013-10-15,,,80.0,2.0
9990,14,Tetrobot and Co.,3.27,2013-10-15,,,72.0,3.0
9991,15712,Goodbye Deponia,4.00,2013-10-15,,,80.0,5.0
9992,26968,Skylanders SWAP Force,3.44,2013-10-13,,,80.0,0.0


###### The dataset is smaller because only the columns needed for the analysis were kept and rows with missing values were dropped (from 10000 rows to 2703)