# Guru: Game Recommendation System

#### Key features for the recommendation system
- id: unique identifier
- name: game title
- description: game description
- released: release date
- rating: user rating
- genre: genre
- tags: tags
- developers: developers
- publishers: publishers
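
As a rough illustration of how the text-like features in this list could drive a content-based recommender, the sketch below joins them into one string per game, vectorizes the result with `TfidfVectorizer`, and looks up neighbors with `NearestNeighbors` (both classes are imported in this notebook). The toy rows and the `build_text` and `recommend` helpers are hypothetical and only show the idea. They are not the notebook's actual pipeline.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Toy rows standing in for the real dataset (made-up values).
games = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "name": ["Space Miner", "Asteroid Miner", "Farm Life", "Star Digger"],
    "description": [
        "mine asteroids in deep space",
        "mine asteroids and upgrade your ship",
        "grow crops and raise animals on a farm",
        "dig for ore on distant stars in space",
    ],
    "genre": ["Simulation", "Simulation", "Casual", "Simulation"],
    "tags": ["space,mining", "space,mining", "farming,relaxing", "space,mining"],
})


def build_text(frame):
    """Join the text-like feature columns into one string per game (hypothetical helper)."""
    cols = ["name", "description", "genre", "tags"]
    return frame[cols].fillna("").agg(" ".join, axis=1)


vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(build_text(games))

# Cosine distance on TF-IDF vectors; brute-force search is fine at toy scale.
knn = NearestNeighbors(metric="cosine", algorithm="brute").fit(matrix)


def recommend(game_id, k=2):
    """Return the names of the k nearest games to game_id, excluding the game itself."""
    row = games.index[games["id"] == game_id][0]
    _, idx = knn.kneighbors(matrix[row], n_neighbors=k + 1)
    neighbors = [i for i in idx[0] if i != row][:k]
    return games.loc[neighbors, "name"].tolist()


print(recommend(1))
```

In this toy data the farming game shares no vocabulary with "Space Miner", so it should not appear among its two nearest neighbors.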

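#### Key features for filtering
- genre: genre
- released: release date (filter by a specific year)
- developers: developers
- publishers: publishers
- rating: rating (filter by a rating threshold)
- mode: game mode (e.g., single-player, multiplayer)
- platforms: platform (e.g., PC, console)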
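
A minimal sketch of how these filters might be combined with pandas boolean masks. It assumes multi-valued columns are stored as comma-separated strings and that `released` holds ISO dates. The toy rows and the `has_value` and `filter_games` helpers are hypothetical.

```python
import pandas as pd

# Toy rows standing in for the real dataset (made-up values).
games = pd.DataFrame({
    "name": ["Alpha", "Beta", "Gamma", "Delta"],
    "genre": ["Action,Indie", "Puzzle", "Action", "RPG"],
    "released": ["2018-05-22", "2019-09-09", "2018-11-02", "2022-02-14"],
    "rating": [4.1, 3.2, 2.5, 4.6],
    "mode": ["Singleplayer", "Singleplayer,Multiplayer", "Multiplayer", ""],
    "platforms": ["PC", "PC,iOS", "PlayStation 4", "PC"],
})


def has_value(column, value):
    """True where a string column mentions value (case-insensitive substring match)."""
    return column.str.contains(value, case=False, regex=False, na=False)


def filter_games(frame, genre=None, year=None, min_rating=None,
                 mode=None, platform=None):
    """Keep the rows that satisfy every filter that is not None."""
    mask = pd.Series(True, index=frame.index)
    if genre is not None:
        mask &= has_value(frame["genre"], genre)
    if year is not None:
        # Unparseable dates become NaT and never match a year.
        years = pd.to_datetime(frame["released"], errors="coerce").dt.year
        mask &= years == year
    if min_rating is not None:
        mask &= frame["rating"] >= min_rating
    if mode is not None:
        mask &= has_value(frame["mode"], mode)
    if platform is not None:
        mask &= has_value(frame["platforms"], platform)
    return frame[mask]


result = filter_games(games, genre="Action", year=2018, min_rating=4.0)
print(result["name"].tolist())
```

Substring matching is the simplest option but can over-match (a search for "PC" would also hit any platform name containing those letters). Splitting the column into lists and testing membership would be stricter.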

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors
from sklearn.model_selection import train_test_split
from sklearn.metrics import average_precision_score
import os

print(os.getcwd())

/Users/AIFFELthon/final/code/modelling


## Data Loading

In [2]:
df = pd.read_csv('/Users/AIFFELthon/final/data/modified_data_03.csv')
df.head()

Unnamed: 0,id,slug,name,description,released,status,tba,background_image,website,rating,...,mode,developers,publishers,requirements,added_status_yet,added_status_owned,added_status_beaten,added_status_toplay,added_status_dropped,added_status_playing
0,741344,peace-angel,Peace Angel,２０２０年度１年生特進クラス 中村 桃香さんの作品です。天使を操作し、悪魔から死者を守りつつ...,2022-02-14,Released,False,https://media.rawg.io/media/screenshots/415/41...,,0.0,...,,神戸電子ゲームソフト分野,,{},0.0,0.0,0.0,0.0,0.0,0.0
1,374441,brawl-planet,Brawl Planet,Eres un comandante al mando de la nave inteles...,2019-09-09,Released,False,https://media.rawg.io/media/screenshots/bd6/bd...,,0.0,...,Singleplayer,AlexisBot,,{},0.0,0.0,0.0,0.0,0.0,0.0
2,97470,obelus-arcade-boss-rush,OBELUS - Arcade Boss Rush,"In OBELUS, a bold robot battles three gargantu...",2018-05-22,Released,False,https://media.rawg.io/media/screenshots/736/73...,,0.0,...,Boss Rush,"3xBlast,BlauwPrint",,{},0.0,0.0,0.0,0.0,0.0,0.0
3,306287,pimple-popper-lite,Pimple Popper Lite,"Hello, you! We know you're itching for some fi...",2009-10-12,Released,False,https://media.rawg.io/media/screenshots/be3/be...,http://www.roomcandygames.com,0.0,...,,Room Candy Games,Room Candy Games,"{'minimum': 'iPad 2 Wifi, iPad 2 3G, iPhone 4S...",0.0,0.0,0.0,0.0,0.0,0.0
4,176964,square-square,SQUARE SQUARE,Left/right arrows - moveUp - restartClick on t...,2016-04-07,Released,False,https://media.rawg.io/media/screenshots/f26/f2...,,0.0,...,,Dmitry Degtyarev,,{},0.0,0.0,0.0,0.0,0.0,0.0


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 550564 entries, 0 to 550563
Data columns (total 36 columns):
 #   Column                Non-Null Count   Dtype  
---  ------                --------------   -----  
 0   id                    550564 non-null  int64  
 1   slug                  550562 non-null  object 
 2   name                  550561 non-null  object 
 3   description           550564 non-null  object 
 4   released              550564 non-null  object 
 5   status                550564 non-null  object 
 6   tba                   550564 non-null  bool   
 7   background_image      550564 non-null  object 
 8   website               550564 non-null  object 
 9   rating                550564 non-null  float64
 10  playtime              550564 non-null  float64
 11  achievements_count    550564 non-null  float64
 12  reddit_count          550564 non-null  float64
 13  twitch_count          550564 non-null  float64
 14  youtube_count         550564 non-null  float64
 15  

In [4]:
df.isna().sum()

id                      0
slug                    2
name                    3
description             0
released                0
status                  0
tba                     0
background_image        0
website                 0
rating                  0
playtime                0
achievements_count      0
reddit_count            0
twitch_count            0
youtube_count           0
reviews_text_count      0
ratings_count           0
suggestions_count       0
additions_count         0
game_series_count       0
series_game             0
esrb_rating             0
platforms               9
genre                   0
theme                   0
tags                    0
mode                    0
developers              0
publishers              0
requirements            0
added_status_yet        0
added_status_owned      0
added_status_beaten     0
added_status_toplay     0
added_status_dropped    0
added_status_playing    0
dtype: int64
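
Since most of the 36 columns have no missing values, it can be easier to show only the affected ones (`slug`, `name`, and `platforms` in the output above). A small sketch on a toy frame; the `toy` data is made up for illustration:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for df (made-up values).
toy = pd.DataFrame({
    "slug": ["a", np.nan, "c"],
    "name": ["A", np.nan, np.nan],
    "rating": [0.0, 4.2, 3.1],
})

# Count missing values per column, then keep only columns that have any.
missing = toy.isna().sum()
missing = missing[missing > 0].sort_values(ascending=False)
print(missing)
```

Note that `isna()` only counts real missing values (`NaN`/`None`); empty strings or placeholder text are not included in these counts.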

### Data Preprocessing

In [5]:
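# These three games appear to be literally named "NaN", "NULL", and "None".
# pd.read_csv parses such strings as missing values by default, which would
# explain the missing slug/name counts above. Restore the text by id.
df.loc[df['id'] == 119609, 'slug'] = 'nan'
df.loc[df['id'] == 119609, 'name'] = 'NaN'
df.loc[df['id'] == 100122, 'slug'] = 'null'
df.loc[df['id'] == 100122, 'name'] = 'NULL'
df.loc[df['id'] == 468408, 'slug'] = 'none'
df.loc[df['id'] == 468408, 'name'] = 'None'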
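
An alternative to patching rows by id is to stop pandas from interpreting such strings as missing values at load time. The sketch below uses a small in-memory CSV (the ids and names are made up) to show the default behavior and the effect of `keep_default_na=False` with `na_values=[""]`, which treats only empty fields as missing. It is an illustration under those assumptions, not a drop-in replacement for the loading cell.

```python
import io

import pandas as pd

# Tiny in-memory CSV; ids and names are made up for illustration.
csv_text = "id,slug,name\n1,nan,NaN\n2,null,NULL\n3,tetris,Tetris\n"

# Default parsing: strings such as "nan", "NaN", "null", "NULL" become missing values.
default = pd.read_csv(io.StringIO(csv_text))
print(default["name"].isna().sum())

# Disable the default NA strings and treat only empty fields as missing.
literal = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=[""])
print(literal["name"].tolist())
```

This setting applies to every column, so a numeric column that uses a marker such as "NA" for missing entries would then load as text. Whether that trade-off is acceptable depends on the rest of the dataset.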