This project contains three Jupyter notebooks, each covering a different recommendation technique: collaborative filtering, content-based filtering, and popularity-based filtering.
This notebook explores collaborative filtering for recommendation systems. It covers the following steps:
- Introduction to Collaborative Filtering: Overview of the technique, which predicts a user's preferences from the preferences of similar users.
- Data Loading: Loading movie ratings data into a DataFrame.
- Dataset Creation: Using the Surprise library to create a dataset from the ratings data.
- Trainset Building: Building a training set for the collaborative filtering model.
- Model Training: Training a Singular Value Decomposition (SVD) model.
- Prediction: Making predictions for user ratings.
- Validation: Cross-validation to evaluate the model's performance.
This notebook explores content-based filtering for recommendation systems. It includes the following sections:
- Data Loading: Loading movie data and preparing it for analysis.
- TF-IDF Vectorization: Converting movie overviews into TF-IDF vectors.
- Similarity Matrix: Computing the similarity matrix based on TF-IDF vectors.
- Similarity Search: Finding similar movies based on their overviews.
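A minimal sketch of these sections is shown below. The four sample movies, the column names `title` and `overview`, the English stop-word list, the use of cosine similarity, and the helper `similar_movies` are assumptions made for illustration; the notebook may use a different similarity measure or data layout.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical movie data; the notebook loads its own file instead.
movies = pd.DataFrame({
    "title": ["Space Quest", "Star Voyage", "Love in Paris", "Paris Nights"],
    "overview": [
        "astronauts travel through space to save a distant planet",
        "a crew of astronauts explores deep space and a distant planet",
        "two strangers fall in love during a rainy summer in Paris",
        "a romantic summer love story set in Paris",
    ],
})

# TF-IDF vectorization: one row per movie, one column per term.
tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(movies["overview"])

# TF-IDF rows are L2-normalized by default, so the dot product
# computed by linear_kernel equals the cosine similarity.
similarity = linear_kernel(tfidf_matrix, tfidf_matrix)


def similar_movies(title, top_n=2):
    """Return the titles of the top_n movies most similar to `title`."""
    idx = movies.index[movies["title"] == title][0]
    ranked = similarity[idx].argsort()[::-1]           # most similar first
    ranked = [i for i in ranked if i != idx][:top_n]   # drop the movie itself
    return movies["title"].iloc[ranked].tolist()


print(similar_movies("Space Quest", top_n=1))
```

Here the two space movies share several overview terms, so each is the other's closest match, and the same holds for the two Paris movies.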
This notebook demonstrates popularity-based filtering for recommendation systems. It covers the following steps:
- Data Loading: Loading movie data, credits data, and ratings data.
- Minimum Votes Calculation: Determining the minimum number of votes required for consideration.
- Data Filtering: Filtering movies based on the minimum vote count.
- Weighted Rating Calculation: Calculating the weighted rating for each movie.
- Sorting and Cleaning Data: Sorting and cleaning the movie data based on weighted ratings.
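One common way to implement these steps is the IMDb-style weighted rating, WR = v/(v+m) * R + m/(v+m) * C, where v is a movie's vote count, R its average rating, m the minimum vote threshold, and C the mean rating across all movies. The sketch below assumes that formula, a 60th-percentile threshold, and the column names `vote_count` and `vote_average`; none of these are confirmed by the notebook description.

```python
import pandas as pd

# Hypothetical movie data; the notebook loads its own files instead.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "vote_count": [1000, 50, 800, 5, 1200],
    "vote_average": [8.0, 9.5, 6.5, 10.0, 7.0],
})

# Mean rating across all movies (C) and minimum votes required (m).
C = movies["vote_average"].mean()
m = movies["vote_count"].quantile(0.6)  # assumed percentile threshold

# Data filtering: keep only movies with at least m votes.
qualified = movies[movies["vote_count"] >= m].copy()

# Weighted rating: pulls each movie's average toward C,
# more strongly when the movie has few votes.
v = qualified["vote_count"]
R = qualified["vote_average"]
qualified["weighted_rating"] = (v / (v + m)) * R + (m / (v + m)) * C

# Sorting: highest weighted rating first.
top = qualified.sort_values("weighted_rating", ascending=False).reset_index(drop=True)
print(top[["title", "vote_count", "vote_average", "weighted_rating"]])
```

Movies B and D have the highest raw averages but too few votes to pass the threshold, which is the behavior the minimum-votes step is meant to produce.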
Each notebook provides detailed explanations and code implementation for its respective recommendation technique.
- Utilizes popular libraries such as Surprise for collaborative filtering and scikit-learn for content-based filtering.
- Demonstrates different approaches to recommendation systems, including collaborative, content-based, and popularity-based methods.
- JupyterLab: An interactive development environment for notebooks, code, and data.
- pandas: A data manipulation and analysis library.
- scikit-learn: A machine learning library for predictive data analysis.
- scikit-surprise: A library for building and analyzing recommender systems.
- conda: An open-source package and environment management system.
To use these notebooks, follow these steps:
- Clone or download this repository to your local machine.
- Install Anaconda or Miniconda if not already installed. Anaconda can be downloaded from the Anaconda website, and Miniconda can be downloaded from the Miniconda website.
- Create a Conda environment using the provided environment.yml file. Run the following command in your terminal or command prompt:
  conda env create -f environment.yml
  This will create a new Conda environment named recommender_systems_env with all the required packages installed.
- Activate the Conda environment:
  conda activate recommender_systems_env
- Launch JupyterLab:
  jupyter lab
- Open the desired notebook from the JupyterLab interface and follow the instructions within each notebook to execute the code and explore the recommendation techniques.
Contributions are welcome! Here are some ways you can contribute to the project:
- Report bugs and issues
- Suggest new features or improvements
- Submit pull requests with bug fixes or enhancements
This project is licensed under the MIT License, which permits free use, modification, distribution, and sublicensing of the code, provided that the copyright notice (attributed to emads22) and the permission notice are included in all copies or substantial portions of the software. The license is permissive and allows both commercial and non-commercial use.
Please see the LICENSE file for more details.
