### In this process, merging the datasets and selecting numeric features are essential steps toward building an effective video game recommendation model.

In [1]:
import pandas as pd

### Load the .csv files into DataFrames.

In [2]:
df_games = pd.read_csv('output_steam_games_clean.csv')
df_items = pd.read_csv('australian_users_items_clean.csv')
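The `Unnamed: 0` columns that appear in later outputs usually mean a saved index is being read back as ordinary data. A minimal sketch of how `index_col=0` avoids that, assuming the cleaned CSVs were written with their index (the inline sample below is made up for illustration and is not taken from the real files):

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV that was saved together with its index:
# note the empty first cell in the header row.
csv_text = ",title,id\n0,Half-Life,70\n1,Game B,400\n"

# Default read: the saved index comes back as an 'Unnamed: 0' column.
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use the first column as the index instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(df_default.columns.tolist())
print(df_indexed.columns.tolist())
```

If the files were saved without an index, the plain `pd.read_csv(path)` call is already sufficient.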

### Merge the two DataFrames on the 'id' column using the .merge() method (inner join: only ids present in both are kept)

In [3]:
df_merged = pd.merge(df_games, df_items, on='id', how='inner')
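An inner join silently discards ids that exist in only one of the two tables. The sketch below uses made-up toy frames (not the real data) to show two optional checks built into `pd.merge`: `validate` asserts the expected key cardinality, and `indicator=True` on an outer join reveals which rows would be dropped. It assumes each game id is unique in the games table.

```python
import pandas as pd

# Toy data, for illustration only.
games = pd.DataFrame({"id": [70, 400, 620],
                      "title": ["Half-Life", "Game B", "Game C"]})
items = pd.DataFrame({"id": [70, 70, 400, 999],
                      "user_id": ["u1", "u2", "u1", "u3"],
                      "playtime_forever": [4, 21, 8, 3]})

# Same join as in the notebook, plus a cardinality check:
# raises MergeError if a game id is duplicated in `games`.
inner = pd.merge(games, items, on="id", how="inner", validate="one_to_many")

# Outer join with an indicator column to audit what the inner join discards.
audit = pd.merge(games, items, on="id", how="outer", indicator=True)
unmatched = audit[audit["_merge"] != "both"]

print(len(inner))
print(unmatched[["id", "_merge"]])
```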

### Preview of the merged DataFrame.

In [4]:
df_merged.head(3)

,Unnamed: 0,publisher,genres,title,id,developer,Action,Adventure,Animation &amp; Modeling,Audio Production,...,Strategy,Utilities,Video Production,Web Publishing,year,user_id,items_count,steam_id,playtime_forever,playtime_2weeks
0,88338,Valve,Action,Half-Life,70.0,Valve,1,0,0,0,...,0,0,0,0,1998,kube134,476,76561198031442694,4,0
1,88338,Valve,Action,Half-Life,70.0,Valve,1,0,0,0,...,0,0,0,0,1998,76561198030567998,75,76561198030567998,21,0
2,88338,Valve,Action,Half-Life,70.0,Valve,1,0,0,0,...,0,0,0,0,1998,itslonk,133,76561198027380739,8,0


### Drop the 'genres' column to reduce dimensionality

In [5]:
df_merged.drop(columns=['genres'], inplace=True)

### Export the merged DataFrame to a CSV file.

In [6]:
df_merged.to_csv('dataset_final.csv', index=False)

### Load the australian_user_reviews_clean.csv dataset

In [7]:
df_reviews = pd.read_csv('australian_user_reviews_clean.csv') 

### Merge both DataFrames on the 'user_id' column to obtain the final DataFrame.

In [8]:
df_final = pd.merge(df_merged, df_reviews, on='user_id')
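Joining on `user_id` alone pairs every game a user owns with every review that user wrote. This shows in the preview of `df_final`, where Half-Life (`id` 70.0) sits next to a review whose `item_id` is 251990.0. The notebook does not say whether that is intended. The sketch below uses made-up toy frames to contrast the two behaviours. Joining on the `user_id` and `id`/`item_id` pair is an assumption about what might be wanted, not the author's method.

```python
import pandas as pd

# Toy data, for illustration only.
owned = pd.DataFrame({"user_id": ["u1", "u1", "u2"],
                      "id": [70, 400, 70]})
reviews = pd.DataFrame({"user_id": ["u1", "u2"],
                        "item_id": [400, 70],
                        "recommend": [True, False]})

# Join on user only: each owned game is paired with each of the user's reviews,
# including reviews written about a different game.
by_user = pd.merge(owned, reviews, on="user_id")

# Join on user AND game: a review is attached only to the game it refers to.
by_user_and_item = pd.merge(
    owned, reviews,
    left_on=["user_id", "id"],
    right_on=["user_id", "item_id"],
)

print(len(by_user), len(by_user_and_item))
```

On real data the key columns would also need matching dtypes (both appear as floats in the outputs here) before a two-key join behaves as expected.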

In [9]:
df_final.columns

Index(['Unnamed: 0_x', 'publisher', 'title', 'id', 'developer', 'Action',
       'Adventure', 'Animation &amp; Modeling', 'Audio Production', 'Casual',
       'Design &amp; Illustration', 'Early Access', 'Education',
       'Free to Play', 'Indie', 'Massively Multiplayer', 'Photo Editing',
       'RPG', 'Racing', 'Simulation', 'Software Training', 'Sports',
       'Strategy', 'Utilities', 'Video Production', 'Web Publishing', 'year',
       'user_id', 'items_count', 'steam_id', 'playtime_forever',
       'playtime_2weeks', 'Unnamed: 0_y', 'user_url', 'funny', 'last_edited',
       'item_id', 'helpful', 'recommend', 'review', '0', 'posted_year',
       'sentiment_analysis'],
      dtype='object')

### Drop leftover index columns ('0', 'Unnamed: 0_x', 'Unnamed: 0_y') to reduce the DataFrame's size.

In [10]:
df_final.drop(columns=['0', 'Unnamed: 0_x', 'Unnamed: 0_y'], inplace=True)
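When the exact names of the leftover columns vary from run to run, they can be collected by prefix instead of being listed by hand. A minimal sketch on a made-up frame (the column values are invented for illustration):

```python
import pandas as pd

# Toy frame mimicking the leftover columns seen in df_final.columns.
df = pd.DataFrame({
    "Unnamed: 0_x": [0, 1],
    "title": ["Half-Life", "Half-Life"],
    "0": [None, None],
    "Unnamed: 0_y": [5, 6],
})

# Collect every column whose name starts with 'Unnamed:'.
leftover = [c for c in df.columns if c.startswith("Unnamed:")]

# errors='ignore' keeps the call from failing if a listed column is absent.
df_clean = df.drop(columns=leftover + ["0"], errors="ignore")

print(df_clean.columns.tolist())
```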

### Export the resulting DataFrame to a CSV file. This overwrites the dataset_final.csv written earlier.

In [11]:
df_final.to_csv('dataset_final.csv', index=False)

In [12]:
df_final.head(3)

,publisher,title,id,developer,Action,Adventure,Animation &amp; Modeling,Audio Production,Casual,Design &amp; Illustration,...,playtime_2weeks,user_url,funny,last_edited,item_id,helpful,recommend,review,posted_year,sentiment_analysis
0,Valve,Half-Life,70.0,Valve,1,0,0,0,0,0,...,0,http://steamcommunity.com/id/kube134,,,251990.0,1 of 1 people (100%) found this review helpful,True,It's good to be a magical queen... if you surv...,2014.0,2
1,Valve,Half-Life,70.0,Valve,1,0,0,0,0,0,...,0,http://steamcommunity.com/profiles/76561198030...,2 people found this review funny,,332800.0,194 of 282 people (69%) found this review helpful,True,10/10 would take kids here for birthday,2014.0,2
2,Valve,Half-Life,70.0,Valve,1,0,0,0,0,0,...,0,http://steamcommunity.com/profiles/76561198030...,,,319630.0,1 of 2 people (50%) found this review helpful,True,"Well for starters, when I write reviews they a...",2015.0,2


### Create a DataFrame with the relevant numeric features before applying the recommendation techniques.

In [13]:
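# One-hot encoded genre columns, excluded from the numeric feature set.
genre_cols = ['Action', 'Adventure', 'Animation &amp; Modeling', 'Audio Production',
              'Casual', 'Design &amp; Illustration', 'Early Access', 'Education',
              'Free to Play', 'Indie', 'Massively Multiplayer', 'Photo Editing',
              'RPG', 'Racing', 'Simulation', 'Software Training', 'Sports',
              'Strategy', 'Utilities', 'Video Production', 'Web Publishing']

# Keep only int/float columns, then drop the genre flags. Chaining returns a new
# DataFrame instead of modifying the select_dtypes result in place.
df_ord = df_final.select_dtypes(include=(int, float)).drop(columns=genre_cols)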
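To see which columns survive this kind of selection, the sketch below runs the same idea on a made-up frame. It uses `include="number"`, a broader alternative to `(int, float)` that matches every NumPy numeric dtype. Boolean columns such as `recommend` are left out, as are text columns.

```python
import pandas as pd

# Toy frame with a mix of dtypes, for illustration only.
df = pd.DataFrame({
    "title": ["Half-Life", "Half-Life"],   # object: excluded
    "id": [70.0, 70.0],                    # float: kept
    "Action": [1, 1],                      # genre flag: dropped explicitly
    "Adventure": [0, 0],                   # genre flag: dropped explicitly
    "playtime_forever": [4, 21],           # int: kept
    "recommend": [True, True],             # bool: not numeric, excluded
    "sentiment_analysis": [2, 2],          # int: kept
})

genre_cols = ["Action", "Adventure"]

numeric = df.select_dtypes(include="number")
features = numeric.drop(columns=genre_cols)

print(features.columns.tolist())
```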

### Export the numeric-feature DataFrame to a CSV file.

In [16]:
df_ord.to_csv('dataset_final_ord.csv', index=False)