**Machine Learning Model:**

After completing the Exploratory Data Analysis (EDA) to understand the data, we build a recommendation model based on cosine similarity. The model works on item-to-item similarity between games: given the name or identifier of a game, the system returns a list of 5 recommended games that are similar to that reference game.
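As a quick illustration of the metric, the following minimal sketch computes cosine similarity (the dot product divided by the product of the vector norms) on hypothetical one-hot genre vectors. The `cosine` helper and the three toy vectors are assumptions made for illustration and are not part of the notebook.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical one-hot genre vectors over (Action, Casual, Indie).
action_game_1 = np.array([1, 0, 0])
action_game_2 = np.array([1, 0, 0])
casual_game = np.array([0, 1, 0])

same_genre = cosine(action_game_1, action_game_2)
different_genre = cosine(action_game_1, casual_game)
print(same_genre, different_genre)
```

When each game carries a single one-hot genre, the score can only be 1 (same genre) or 0 (different genre), so games within a genre are tied with one another.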

**Imports**

In [1]:
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
import scipy.sparse as sp
import pyarrow.parquet as pq
import matplotlib.pyplot as plt
import seaborn as sns

**Loading the Data**

In [3]:
df_games = pd.read_csv("../Archivos Json/steam_games_limpio.csv")
df_games

,app_name,id,developer,price_number,release_year,main_genre
0,Lost Summoner Kitty,761140,Kotoshiro,4.99,2018,Action
1,Ironbound,643980,Secret Level SRL,0.00,2018,Free to Play
2,Real Pool 3D - Poolians,670290,Poolians.com,0.00,2017,Casual
3,弹炸人2222,767400,彼岸领域,0.99,2017,Action
4,Battle Royale Trainer,772540,Trickjump Games Ltd,3.99,2018,Action
...,...,...,...,...,...,...
28829,Kebab it Up!,745400,Bidoniera Games,1.99,2018,Action
28830,Colony On Mars,773640,"Nikita ""Ghost_RUS""",1.99,2018,Casual
28831,LOGistICAL: South Africa,733530,Sacada,4.99,2018,Casual
28832,Russian Roads,610660,Laush Dmitriy Sergeevich,1.99,2018,Indie


**Selecting the Relevant Columns**

In this step, the relevant columns ("app_name", "id", and "main_genre") are selected from the original DataFrame df_games.

In [4]:
df_recomendacion = df_games[["app_name","id","main_genre"]]
df_recomendacion

,app_name,id,main_genre
0,Lost Summoner Kitty,761140,Action
1,Ironbound,643980,Free to Play
2,Real Pool 3D - Poolians,670290,Casual
3,弹炸人2222,767400,Action
4,Battle Royale Trainer,772540,Action
...,...,...,...
28829,Kebab it Up!,745400,Action
28830,Colony On Mars,773640,Casual
28831,LOGistICAL: South Africa,733530,Casual
28832,Russian Roads,610660,Indie


**One-Hot Encoding**

One-hot encoding is applied to the "main_genre" column with pd.get_dummies, which produces one 0/1 indicator column per genre.

In [5]:
# dtype=int keeps the indicators as 0/1; recent pandas versions default to booleans
modelo_recomendacion = pd.get_dummies(df_recomendacion, columns=["main_genre"], dtype=int)
modelo_recomendacion

,app_name,id,main_genre_Accounting,main_genre_Action,main_genre_Adventure,main_genre_Animation &amp; Modeling,main_genre_Audio Production,main_genre_Casual,main_genre_Design &amp; Illustration,main_genre_Early Access,...,main_genre_RPG,main_genre_Racing,main_genre_Simulation,main_genre_Sin Dato,main_genre_Software Training,main_genre_Sports,main_genre_Strategy,main_genre_Utilities,main_genre_Video Production,main_genre_Web Publishing
0,Lost Summoner Kitty,761140,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Ironbound,643980,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Real Pool 3D - Poolians,670290,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
3,弹炸人2222,767400,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Battle Royale Trainer,772540,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
28829,Kebab it Up!,745400,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
28830,Colony On Mars,773640,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
28831,LOGistICAL: South Africa,733530,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
28832,Russian Roads,610660,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
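To make the encoding step concrete, here is a minimal sketch on a toy frame. The rows are made up; only the column names mirror the notebook. Passing `dtype=int` is an assumption added so that the indicators print as 0/1 regardless of the pandas version.

```python
import pandas as pd

# Toy frame with made-up rows; only the column names mirror the notebook.
toy = pd.DataFrame({
    "app_name": ["Game A", "Game B", "Game C"],
    "id": [1, 2, 3],
    "main_genre": ["Action", "Casual", "Action"],
})

# One indicator column is created per distinct value of main_genre.
encoded = pd.get_dummies(toy, columns=["main_genre"], dtype=int)
print(encoded)
```

Each row ends up with exactly one indicator set to 1, the one matching its original genre.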


**Renaming the Columns**

The "main_genre_" prefix is removed from the column names, so that each indicator column is named after its genre.

In [6]:
# regex=False treats the prefix as a literal string rather than a pattern
modelo_recomendacion.columns = modelo_recomendacion.columns.str.replace("main_genre_", "", regex=False)
modelo_recomendacion

,app_name,id,Accounting,Action,Adventure,Animation &amp; Modeling,Audio Production,Casual,Design &amp; Illustration,Early Access,...,RPG,Racing,Simulation,Sin Dato,Software Training,Sports,Strategy,Utilities,Video Production,Web Publishing
0,Lost Summoner Kitty,761140,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Ironbound,643980,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Real Pool 3D - Poolians,670290,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
3,弹炸人2222,767400,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Battle Royale Trainer,772540,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
28829,Kebab it Up!,745400,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
28830,Colony On Mars,773640,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
28831,LOGistICAL: South Africa,733530,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
28832,Russian Roads,610660,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [7]:
modelo_recomendacion.columns

Index(['app_name', 'id', 'Accounting', 'Action', 'Adventure',
       'Animation &amp; Modeling', 'Audio Production', 'Casual',
       'Design &amp; Illustration', 'Early Access', 'Education',
       'Free to Play', 'Indie', 'Massively Multiplayer', 'Photo Editing',
       'RPG', 'Racing', 'Simulation', 'Sin Dato', 'Software Training',
       'Sports', 'Strategy', 'Utilities', 'Video Production',
       'Web Publishing'],
      dtype='object')
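As an alternative to renaming after the fact, `pd.get_dummies` accepts `prefix` and `prefix_sep` arguments, and setting both to empty strings yields bare genre names directly. The sketch below shows this on a made-up toy frame; it is one possible variant, not the approach the notebook takes.

```python
import pandas as pd

# Toy frame with made-up rows; only the column names mirror the notebook.
toy = pd.DataFrame({
    "app_name": ["Game A", "Game B", "Game C"],
    "id": [1, 2, 3],
    "main_genre": ["Action", "Casual", "Action"],
})

# Empty prefix and separator: indicator columns are named after the genre itself.
encoded_bare = pd.get_dummies(
    toy, columns=["main_genre"], prefix="", prefix_sep="", dtype=int
)
print(encoded_bare.columns.tolist())
```

This avoids a second pass over the column labels, at the cost of a possible name clash if a genre shares its name with an existing column.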

**Exporting the Clean CSV**

In [None]:
archivo_limpio = "modelo_reco_final.csv"
modelo_recomendacion.to_csv(archivo_limpio, index=False, encoding="utf-8")
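
With the one-hot matrix in place, the recommendation step described at the top could be sketched as follows. The `recommend` helper, its signature, and the toy data are hypothetical; the notebook's actual implementation is not shown in this section. The sketch compares one target row against all rows rather than building the full pairwise matrix, which for roughly 28,833 games would take several gigabytes in float64.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

def recommend(model: pd.DataFrame, game_id: int, n: int = 5) -> list:
    """Return up to n app names most similar to game_id (hypothetical helper)."""
    genre_cols = model.columns.drop(["app_name", "id"])
    matches = model.index[model["id"] == game_id]
    if len(matches) == 0:
        raise KeyError(f"id {game_id} not found")
    target = model.loc[[matches[0]], genre_cols]
    # Compare the single target row against every row: result has shape (1, n_games).
    scores = cosine_similarity(target, model[genre_cols])[0]
    ranked = (
        model.assign(score=scores)
        .drop(index=matches[0])  # never recommend the reference game itself
        .sort_values("score", ascending=False, kind="mergesort")
    )
    return ranked["app_name"].head(n).tolist()

# Toy data with made-up games, shaped like the exported frame.
toy_model = pd.DataFrame({
    "app_name": ["A", "B", "C", "D"],
    "id": [10, 20, 30, 40],
    "Action": [1, 0, 1, 1],
    "Casual": [0, 1, 0, 0],
})
result = recommend(toy_model, 10, n=2)
print(result)
```

Because the scores are tied within a genre, which same-genre titles come first depends only on row order; adding further features (for example developer or release year) would be one way to differentiate them.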