🎮 Game Recommendation Engine 🚀

Discover your next favorite Steam game! This project uses Natural Language Processing (NLP) to recommend games that match a description you provide. It encodes game descriptions as BERT embeddings, extracts thematic topics from user reviews with Latent Dirichlet Allocation (LDA), and matches games against your input using both signals.

📜 Table of Contents

🌟 Overview
🛠️ Technologies Powering the Engine
⚙️ How It Works: The Journey from Data to Recommendation
🏁 Conclusion & Future Horizons

🌟 Overview

The Steamlit-main project is an intelligent recommendation system designed to navigate the vast world of Steam games. It uniquely combines:

Topic Modeling (LDA): To understand the underlying themes and genres discussed in game reviews.
Contextual Word Embeddings (BERT): To grasp the nuanced meaning of game descriptions.

This dual approach allows the system to recommend games based on a textual description you provide, going beyond simple keyword matching to understand the essence of what you're looking for.

🛠️ Technologies Powering the Engine

This project is built with a robust stack of modern data science and web technologies:

Python: The core programming language.
Streamlit: For crafting the interactive web interface.
Pandas & NumPy: For efficient data manipulation and numerical operations.
scikit-learn: For machine learning utilities and algorithms.
Hugging Face Transformers: Providing the pre-trained BERT model for cutting-edge embeddings.
Latent Dirichlet Allocation (LDA): Implemented for sophisticated topic modeling.
SQLite: For lightweight and persistent data storage.

⚙️ How It Works: The Journey from Data to Recommendation

The recommendation process unfolds in several key stages:

1. Data Extraction & Preparation

Source: Game details (descriptions, metadata) and user reviews are fetched through the Steam Web API.
Storage: Raw data is cleaned and organized into an SQLite database, creating a structured foundation for analysis.
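
As a rough illustration of the storage step, the sketch below writes one cleaned game record and its reviews into SQLite using Python's built-in sqlite3 module. The table names, column names, the save_game helper, and the sample row are all hypothetical; the project's actual schema is not described here.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS games (
    app_id      INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    description TEXT
);
CREATE TABLE IF NOT EXISTS reviews (
    review_id INTEGER PRIMARY KEY AUTOINCREMENT,
    app_id    INTEGER NOT NULL REFERENCES games(app_id),
    review    TEXT NOT NULL
);
"""


def init_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_game(conn, app_id, name, description, reviews):
    """Store one game and its non-empty reviews in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO games (app_id, name, description) VALUES (?, ?, ?)",
            (app_id, name, description.strip()),
        )
        conn.executemany(
            "INSERT INTO reviews (app_id, review) VALUES (?, ?)",
            [(app_id, r.strip()) for r in reviews if r.strip()],
        )


# Made-up sample data; the whitespace-only review is dropped during cleaning.
conn = init_db()
save_game(conn, 1, "Example Game", "  A cozy farming sim. ", ["Relaxing!", "   "])
game_count = conn.execute("SELECT COUNT(*) FROM games").fetchone()[0]
review_count = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
stored_description = conn.execute("SELECT description FROM games").fetchone()[0]
```

Keeping games and reviews in separate tables keyed by app_id is one simple way to let later stages read descriptions and reviews independently.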

2. Exploratory Data Analysis (EDA)

Before model building, a thorough EDA ensures data quality and uncovers insights:

Filtering: Non-game items (soundtracks, DLCs, demos) are pruned to focus on core game titles.
Analysis: Distributions of game tags, genres, and user reviews are examined.
Visualization: Relationships between features are visualized to better understand the dataset's characteristics.
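
The filtering and distribution checks above could look roughly like this in Pandas. The column names ("type", "genres"), the semicolon-separated genre format, and the sample rows are assumptions made for illustration, not the project's actual data layout.

```python
import pandas as pd

# Made-up sample rows; the "type" and "genres" columns are assumed names.
games = pd.DataFrame({
    "name": ["Game A", "Game A Soundtrack", "Game B Demo", "Game C", "Game A DLC"],
    "type": ["game", "music", "demo", "game", "dlc"],
    "genres": ["RPG;Indie", "", "Action", "Action;Indie", "RPG"],
})

# Filtering: keep only core game titles.
core_games = games[games["type"] == "game"].reset_index(drop=True)

# Analysis: count how often each genre appears among the remaining titles.
genre_counts = core_games["genres"].str.split(";").explode().value_counts()
```

Here core_games keeps two rows, and genre_counts reports Indie twice and RPG and Action once each.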

3. Unveiling Game DNA: LDA & BERT

Two powerful NLP techniques work in tandem to understand each game:

Latent Dirichlet Allocation (LDA) for Topic Modeling

LDA sifts through user reviews to identify latent topics.

Core Idea: Assumes each review is a mixture of topics, and that each word in the review is drawn from one of those topics.
Output: Generates a topic distribution for each game, summarizing the key themes discussed by players.
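
A minimal sketch of this step with scikit-learn is shown below. It assumes that all reviews for a game are concatenated into one document before fitting, and it uses a tiny made-up corpus with two topics; the project's actual LDA library, preprocessing, and number of topics are not stated here.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Made-up reviews, already concatenated per game (an assumption).
reviews_by_game = {
    "Game A": "great story rich characters emotional story quests",
    "Game B": "fast shooting guns multiplayer shooting matches",
    "Game C": "story quests characters dialogue choices",
}

# LDA works on raw word counts, so build a bag-of-words matrix first.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(list(reviews_by_game.values()))

# n_components is the number of topics; 2 is chosen only for this toy corpus.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_dist = lda.fit_transform(counts)  # shape: (n_games, n_topics)

for name, row in zip(reviews_by_game, topic_dist):
    print(name, row.round(2))
```

Each row of topic_dist is a probability distribution over topics, so it sums to 1 and can serve directly as that game's topic profile.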

BERT Embeddings for Semantic Understanding

BERT (Bidirectional Encoder Representations from Transformers) converts game descriptions into rich numerical representations (embeddings).

Process: Descriptions are tokenized and processed by BERT to produce dense vectors.
Benefit: These embeddings capture deep semantic meaning, allowing the system to understand context and similarity beyond keywords.
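
The sketch below shows one common way to turn a description into a single vector with Hugging Face Transformers: run the text through BERT and average the token vectors, skipping padding. The checkpoint name "bert-base-uncased", the mean-pooling strategy, and the helper names mean_pool and embed_texts are assumptions; the project may use a different checkpoint or the [CLS] vector instead.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per text, ignoring padded positions.

    token_embeddings: array of shape (batch, tokens, hidden)
    attention_mask:   array of shape (batch, tokens), 1 = real token, 0 = padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    token_counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / token_counts


def embed_texts(texts, model_name="bert-base-uncased"):
    """Return one dense vector per text. Downloads the model on first use."""
    # Imported here so the pooling helper above works without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        output = model(**batch)
    return mean_pool(
        output.last_hidden_state.numpy(), batch["attention_mask"].numpy()
    )
```

Calling embed_texts(["A relaxing farming game with light combat"]) would return an array with one row per input text. Descriptions longer than 512 tokens are truncated in this sketch.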

4. Finding Your Match: Cosine Similarity

To find games that resonate with your input, we use Cosine Similarity.

Mathematics: It measures the cosine of the angle between two vectors. A smaller angle (cosine closer to 1) means higher similarity.
```
cosine_similarity(A, B) = (A · B) / (||A|| ||B||)
```
Application: Calculates the similarity between your input description's embedding and the combined (description + topic) embeddings of all games in our database.
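
To make the formula concrete, here is a small NumPy sketch that scores one query vector against a matrix of game vectors and returns the best matches. The function names cosine_scores and top_k and the two-dimensional toy vectors are illustrative assumptions; real feature vectors would be much longer.

```python
import numpy as np


def cosine_scores(query, items):
    """Cosine similarity between one query vector and every row of items."""
    query = np.asarray(query, dtype=float)
    items = np.asarray(items, dtype=float)
    norms = np.linalg.norm(query) * np.linalg.norm(items, axis=1)
    return items @ query / np.clip(norms, 1e-12, None)  # guard against zero vectors


def top_k(query, items, k=3):
    """Return (row_index, score) pairs for the k most similar rows."""
    scores = cosine_scores(query, items)
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


# Toy vectors: row 0 points the same way as the query, row 2 is orthogonal to it.
items = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
ranked = top_k([2.0, 0.0], items, k=2)
```

In this example row 0 scores 1.0 even though the query is twice as long, because cosine similarity depends only on direction. Row 1 sits at a 45 degree angle and scores about 0.707.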

5. Bringing It All Together: Implementation Highlights

The system is implemented through a streamlined pipeline:

Data Foundation: Extract, clean, and store game data from the Steam Web API into SQLite.
Topic Insights: Apply LDA to reviews to generate topic profiles for each game.
Semantic Understanding: Use BERT to create embeddings from game descriptions.
Unified Feature Matrix: Combine LDA topic vectors and BERT embeddings into a comprehensive feature set for each game.
Interactive Recommendations: A Streamlit app takes your textual game preferences, computes cosine similarity against the feature matrix, and presents the top matching games.
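
The unified feature matrix step might be sketched as follows. Scaling each block to unit length before concatenating is an assumption added here so that the much longer BERT vector (768 dimensions for a BERT-base model) does not drown out the short topic vector; the topic_weight parameter and helper names are likewise hypothetical. How the project maps a user's query into this combined space is not described, so that part is left out.

```python
import numpy as np


def l2_normalize(rows):
    """Scale every row to unit length (all-zero rows stay zero)."""
    rows = np.asarray(rows, dtype=float)
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    return rows / np.clip(norms, 1e-12, None)


def build_feature_matrix(bert_vectors, topic_vectors, topic_weight=1.0):
    """Concatenate normalized BERT and LDA blocks into one row per game."""
    return np.hstack(
        [l2_normalize(bert_vectors), topic_weight * l2_normalize(topic_vectors)]
    )


# Toy inputs: 3-dimensional stand-ins for BERT embeddings, 2 LDA topics.
bert_vectors = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 2.0]])
topic_vectors = np.array([[0.9, 0.1], [0.2, 0.8]])
features = build_feature_matrix(bert_vectors, topic_vectors)
```

The result has one row per game and one column per BERT dimension plus one per topic. Raising or lowering topic_weight shifts how much the review topics influence the similarity scores relative to the descriptions.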

🏁 Conclusion & Future Horizons

The Steamlit-main project successfully demonstrates how combining classical topic modeling (LDA) with state-of-the-art transformer models (BERT) can create a nuanced and effective game recommendation system. It offers users a powerful way to discover games based on rich, descriptive input.

Future enhancements could include:

Incorporating more diverse data sources (e.g., user tags, gameplay statistics).
Exploring more advanced hybrid recommendation algorithms.
Personalizing recommendations based on individual user history.

Thank you for exploring the Game Recommendation Engine!

