Video Game Recommendation Model (VGRM)

A video game recommendation model trained on self-curated data from IGDB, gathered using their REST API and Python. This project is a work in progress, and this repo is meant to document the entire process of creating a machine learning model mostly from scratch, including building our own dataset to be used in training, testing, and validating the model.

For this model, we are using a content-based filtering approach: we feed the model game content data (titles, genres, platforms, etc.) rather than user preference data. This methodology has its pros and cons, but was ultimately chosen because game content data is much easier to access than user preference data. The goal is for a user to provide the name of a video game, and for the model to respond with a short list of the titles most similar to it. This is quite limited, as it doesn't factor in a user's preferences or the games they've previously enjoyed. Eventually we will shift toward a mixed-filtering approach, allowing the user to provide some preference information that influences the model's recommendations, and offering various options and configurations for how the recommendations are presented. This may remedy the disadvantages of a solely content-based approach.

Current Status:

We are in the process of expanding the game library that will be used in model training. So far, we have created a data pipeline to and from the external data source (IGDB) and defined a procedure for preprocessing the resulting data. We have performed a baseline analysis of the data obtained so far (roughly 5000 titles) and identified the most viable features, as well as the encoding methods we will use to translate this data for the model. By the end of this section of the project, we hope to have accrued a dataset of at least 10000 titles and their selected features.

Project Outline:

Curating the dataset

define a base set of video game titles to find data for, using a dataset from Kaggle
pull data on each title and store it in a CSV representing the 'game library'
preprocess/clean the stored data
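
The pull and clean steps above could look roughly like the sketch below. It is not taken from data_pull.py: the function names, the field list, and the '|' separator are assumptions made for illustration. The endpoint, headers, and query syntax follow our understanding of the IGDB v4 API and should be checked against the official docs. The request is only built here, not sent.

```python
import urllib.request

# Assumed IGDB v4 endpoint for game records.
IGDB_GAMES_URL = "https://api.igdb.com/v4/games"

# Hypothetical default field list; the real project may pull different fields.
DEFAULT_FIELDS = ("name", "genres.name", "themes.name", "platforms.name", "summary")


def build_games_request(title, client_id, access_token, fields=DEFAULT_FIELDS):
    """Build (but do not send) a POST request that searches IGDB for one title."""
    safe_title = title.replace('"', '\\"')  # escape quotes inside the search string
    query = f'search "{safe_title}"; fields {",".join(fields)}; limit 1;'
    return urllib.request.Request(
        IGDB_GAMES_URL,
        data=query.encode("utf-8"),
        headers={
            "Client-ID": client_id,
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="POST",
    )


def flatten_game(record):
    """Turn one JSON game record into a flat row that csv.DictWriter can store."""
    def names(key):
        # Nested lists such as genres become a single '|'-separated string.
        return "|".join(item["name"] for item in record.get(key, []))

    return {
        "name": record.get("name", ""),
        "genres": names("genres"),
        "themes": names("themes"),
        "platforms": names("platforms"),
        # Collapse newlines and repeated spaces so each game stays on one CSV line.
        "summary": " ".join(record.get("summary", "").split()),
    }
```

Sending the request with urllib.request.urlopen and decoding the JSON response would yield records that flatten_game can turn into rows of the game library CSV.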

Feature Selection

perform exploratory analysis on the game library, identify underlying trends in the data
identify best features to utilize (genres, themes, summary, ...)
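
One simple way to compare candidate features is to check how often each one is filled in, and which values dominate. The sketch below assumes the game library rows are dicts of strings (as csv.DictReader would produce) with multi-valued fields joined by '|'. Both the row layout and the helper names are assumptions, not the contents of analysis.ipynb.

```python
from collections import Counter


def feature_coverage(rows, features):
    """Return the fraction of rows with a non-empty value for each feature."""
    total = len(rows)
    if total == 0:
        return {feature: 0.0 for feature in features}
    return {
        feature: sum(1 for row in rows if row.get(feature, "").strip()) / total
        for feature in features
    }


def value_counts(rows, feature, sep="|"):
    """Count how often each individual value (e.g. each genre) appears."""
    counts = Counter()
    for row in rows:
        counts.update(v for v in row.get(feature, "").split(sep) if v)
    return counts
```

A feature that is empty for most titles, or that takes the same value almost everywhere, is unlikely to help separate similar games from dissimilar ones.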

Feature Encoding

determine best methods for encoding each of the selected features from previous section
choose a similarity metric (cosine similarity)
calculate normalized similarity scores for each feature
determine/fine-tune weights for each feature (how much each feature contributes to the overall similarity score)
aggregate/calculate similarity scores pairwise
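
The encoding and scoring steps above can be sketched in plain Python. This version assumes multi-hot encoding for categorical features such as genres and themes, and a weighted average of per-feature cosine similarities. The encoding choice, the weights, and all function names are illustrative assumptions rather than the project's final design.

```python
import math


def multi_hot(values, vocabulary):
    """Encode a list of category values as a 0/1 vector over a fixed vocabulary."""
    present = set(values)
    return [1.0 if term in present else 0.0 for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_similarity(game_a, game_b, vocabularies, weights):
    """Weighted average of per-feature cosine similarities, in the range [0, 1]."""
    total_weight = sum(weights.values())
    score = 0.0
    for feature, weight in weights.items():
        vocab = vocabularies[feature]
        score += weight * cosine_similarity(
            multi_hot(game_a[feature], vocab), multi_hot(game_b[feature], vocab)
        )
    return score / total_weight


def most_similar(title, library, vocabularies, weights, k=5):
    """Return the k (title, score) pairs most similar to the given title."""
    target = library[title]
    scored = [
        (other, combined_similarity(target, features, vocabularies, weights))
        for other, features in library.items()
        if other != title
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Because multi-hot vectors are non-negative, each per-feature score already falls between 0 and 1, so dividing by the total weight keeps the aggregate score in the same range.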

Model Instantiation/Training

design the network layout for the model
define process of model training/testing/validation
train model on split datasets based off of game library
test the model (a lot)
examine performance & perform model diagnostic checks
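
Splitting the game library for training, validation, and testing could be done with a seeded shuffle, as in the minimal sketch below. The 80/10/10 ratios, the seed, and the function name are placeholder assumptions, since the actual split has not been decided yet.

```python
import random


def split_library(titles, train=0.8, validation=0.1, seed=42):
    """Shuffle titles reproducibly and split into train/validation/test lists."""
    shuffled = list(titles)
    random.Random(seed).shuffle(shuffled)  # local RNG, so global state is untouched
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains becomes the test set
    )
```

Fixing the seed keeps the three sets identical between runs, which makes later diagnostic checks comparable.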

Optimization

using the results of the previous step, identify possible improvements to the model at various stages

Develop a GUI

use pygame + tkinter to develop a front-end interface for inputting data into the model and viewing results

Sources:
