# Video Game Recommender Project

## Part 3: Modeling

In [30]:
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy import sparse as sp

In [31]:
df = pd.read_csv('./Data/games_clean.csv')

The model builds a single feature matrix from two kinds of signal: TF-IDF vectors of each game's summary, and one-hot (multi-label) encodings of the categorical features, namely platforms, genres, and game modes. The blocks are joined with sp.hstack, which stacks matrices horizontally (column-wise), so each row describes one game by both the text of its 'summary' column and its categorical features. A Nearest Neighbors model fitted on this matrix then finds the games whose feature vectors are closest to the input title's. Finally, the recommender filters out games whose titles contain the input title, so the results are not dominated by sequels and editions of the same game.
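
To make the stacking step concrete, here is a minimal sketch on a three-row toy table. The toy data and the `toy_*` names are made up for illustration and are not part of the project data; only the genres column is encoded, to keep it short.

```python
import pandas as pd
from scipy import sparse as sp
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer

# toy data (hypothetical), standing in for games_clean.csv
toy = pd.DataFrame({
    'summary': [
        'build a village and raid rivals',
        'race cars on city streets',
        'raid dungeons with friends',
    ],
    'genres': [['Strategy'], ['Racing'], ['RPG', 'Strategy']],
})

# sparse TF-IDF matrix, one row per game
toy_tfidf = TfidfVectorizer().fit_transform(toy['summary'])

# dense 0/1 array, one column per distinct genre
toy_genres = MultiLabelBinarizer().fit_transform(toy['genres'])

# column-wise stack: same number of rows, widths add up
toy_combined = sp.hstack((toy_tfidf, toy_genres), format='csr')

print(toy_tfidf.shape, toy_genres.shape, toy_combined.shape)
```

The row count stays at one per game, and the combined width is the sum of the two block widths.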

In [32]:
# vectorize summary feature
tfidf_vectorizer = TfidfVectorizer()
tfidf_matrix = tfidf_vectorizer.fit_transform(df['summary'])

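# one-hot encode categorical features (platforms, genres, game_modes)
# note: fit_transform refits mlb on every call, so afterwards mlb.classes_ only holds the game_modes labels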
mlb = MultiLabelBinarizer()
platforms_encoded = mlb.fit_transform(df['platforms'])
genres_encoded = mlb.fit_transform(df['genres'])
game_modes_encoded = mlb.fit_transform(df['game_modes'])

# stack the TF-IDF matrix and the encoded arrays column-wise into one sparse CSR matrix
combined_features = sp.hstack((tfidf_matrix, platforms_encoded, genres_encoded, game_modes_encoded), format='csr')

# nearest neighbors model
nn_model = NearestNeighbors(n_neighbors=10, algorithm='auto')  # 'auto' lets scikit-learn choose the search algorithm
nn_model.fit(combined_features)
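
One thing worth checking, since the data is loaded from a CSV: MultiLabelBinarizer expects each cell to be a collection of labels. If a cell is a plain string, it is iterated character by character and every character becomes its own class. The sketch below assumes the cells are stored as Python-style list literals, which this notebook does not confirm; `raw_genres` is hypothetical sample data.

```python
import ast
from sklearn.preprocessing import MultiLabelBinarizer

# hypothetical cell values as they might look after a CSV round trip
raw_genres = ["['Shooter', 'Adventure']", "['Strategy']"]

# fitting on the raw strings yields one class per character
char_level = MultiLabelBinarizer().fit(raw_genres)

# parsing each cell back into a list yields one class per genre
parsed_genres = [ast.literal_eval(cell) for cell in raw_genres]
label_level = MultiLabelBinarizer().fit(parsed_genres)

print(len(char_level.classes_), 'character-level classes')
print(list(label_level.classes_))
```

If the columns in games_clean.csv turn out to be strings, applying a parsing step like this before fit_transform would give one column per platform, genre, or game mode.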

In [35]:
def recommend_game(game_title, k=10):
    game_title = game_title.lower()  # lowercase so the lookup is case-insensitive
    # find the index of the input game, and stop early if the title is not in the data
    matches = df.index[df['name'].str.lower() == game_title]
    if len(matches) == 0:
        print(f"No game named '{game_title}' was found.")
        return
    game_index = matches[0]
    
    # get the combined feature vector for the input game; it is already a row of
    # combined_features, so the summary does not need to be vectorized again
    input_features = combined_features[game_index]
    
    # e.g. for 'call of duty' the nearest neighbors are mostly other Call of Duty titles,
    # so request extra neighbors: after dropping similar names we can still return k recommendations
    n_neighbors = max(2 * k, 100)
    
    # find the nearest neighbors
    distances, indices = nn_model.kneighbors(input_features, n_neighbors=n_neighbors)
    
    # recommendations keyed by base name (the part of the title before ':')
    base_game_recommendations = {}

    # walk the neighbors from nearest to farthest
    for idx in indices.squeeze():
        game_name = df.iloc[idx]['name']

        base_name = game_name.split(':')[0].strip().lower()

        # skip titles that contain the input title, and keep only the nearest game per base name
        if game_title not in game_name.lower() and base_name not in base_game_recommendations:
            base_game_recommendations[base_name] = {
                'name': game_name,
                'url': df.iloc[idx]['url']
            }

    recommended_games = list(base_game_recommendations.values())[:k]  # return up to k unique recommendations
    
    print(f"Top {k} recommended games for '{game_title}':")
    for i, game in enumerate(recommended_games):
        print(f"{i+1}. {game['name']} {game['url']}")
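
The title filter can be tried on its own, without the model. The sketch below is a standalone illustration: the helper name `filter_titles` and the list of titles are hypothetical, and the list is assumed to be ordered nearest first. It keeps the first game seen for each base name, which is one possible reading of the filter's intent.

```python
def filter_titles(query, candidate_names, k=3):
    """Keep up to k names that do not contain the query, one per base name."""
    query = query.lower()
    kept = {}
    for name in candidate_names:  # assumed ordered nearest first
        # base name is the part of the title before the first ':'
        base_name = name.split(':')[0].strip().lower()
        if query in name.lower() or base_name in kept:
            continue
        kept[base_name] = name
    return list(kept.values())[:k]


# hypothetical neighbor list, nearest first
names = [
    'Call of Duty: Black Ops',
    'Battlefield: Bad Company',
    'Battlefield: Hardline',
    'Halo: Reach',
    'Medal of Honor',
]
print(filter_titles('call of duty', names))
# ['Battlefield: Bad Company', 'Halo: Reach', 'Medal of Honor']
```

The Call of Duty entry is dropped because it contains the query, and the second Battlefield entry is dropped because its base name is already taken.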

In [40]:
recommend_game('clash of clans')

Top 10 recommended games for 'clash of clans':
1. Flame VS Blaze https://www.igdb.com/games/flame-vs-blaze
2. Hero Royale https://www.igdb.com/games/hero-royale
3. Magic: ManaStrike https://www.igdb.com/games/magic-manastrike
4. Dungeon Keeper https://www.igdb.com/games/dungeon-keeper--1
5. Army Men Strike: Toy Wars https://www.igdb.com/games/army-men-strike-toy-wars
6. Might & Magic Heroes: Era of Chaos https://www.igdb.com/games/might-and-magic-heroes-era-of-chaos
7. Mutant Forge https://www.igdb.com/games/mutant-forge--1
8. Servant of Thrones https://www.igdb.com/games/servant-of-thrones
9. Fantasy Stars: Battle Arena https://www.igdb.com/games/fantasy-stars-battle-arena
10. Rush Wars https://www.igdb.com/games/rush-wars


In [26]:
print(platforms_encoded[1].shape)
print(genres_encoded[1].shape)
print(game_modes_encoded[1].shape)

(70,)
(46,)
(26,)


In [29]:
print(tfidf_matrix[1].shape)

(1, 102781)
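
Because sp.hstack concatenates column-wise, the widths printed above should add up to the width of the combined matrix. A quick arithmetic check, using only the numbers from those outputs (the variable names are just for illustration):

```python
# widths reported by the shape checks above
tfidf_cols = 102781
platform_cols = 70
genre_cols = 46
game_mode_cols = 26

expected_cols = tfidf_cols + platform_cols + genre_cols + game_mode_cols
print(expected_cols)  # 102923
```

In the notebook, this total could be compared against combined_features.shape[1], assuming the shape cells were run against the same fitted objects.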
