Udacity Capstone project - Recommender system

Motivation

In this project, I build a recommender system web app with two functions: recommending items that a user may like, and finding items that are similar to a given item.

Project Description

This project is a movie recommender application built on the MovieLens 1M dataset. For more information about the dataset, please check here. User recommendations come from the Alternating Least Squares (ALS) model implemented in the implicit library, and related-item recommendations come from a content-based module. A simple web app serves recommendations for a given user or item (model serving).

This project covers four steps of an ML pipeline:

ETL: Clean the data and save the cleaned data to a file and a database.
Feature engineering: Transform the features into the form the models expect.
Modelling: Build a machine learning pipeline that runs feature engineering and trains the ML models.
Model serving: Build a Flask web app that returns recommendations for the user's input query.

Installation

The code is written in Python 3.9. All required packages are listed in requirements.txt.

For quick installation:

pip install -r requirements.txt

EDA

The MovieLens 1M dataset contains users' rating history and movie profiles. To see the EDA of MovieLens 1M, please go to this notebook.

Methodology

Model

Alternating Least Squares (ALS): A matrix factorization approach. The model decomposes the rating matrix into two factor matrices, one for users and one for items.
Content-based (CB): A content-based approach that uses cosine similarity to find the most similar items.
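
To make the ALS idea concrete, here is a minimal NumPy sketch of the alternating updates for explicit ratings. It is an illustration only, not the code of the implicit library (which targets implicit feedback and is far more optimized). The function name `als_factorize`, its parameters, and the toy rating matrix are all assumptions made for this example.

```python
import numpy as np


def als_factorize(ratings, mask, factors=2, reg=0.1, iterations=20, seed=0):
    """Approximate `ratings` (users x items) by U @ V.T on observed entries only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, factors))
    V = rng.normal(scale=0.1, size=(n_items, factors))
    ridge = reg * np.eye(factors)
    for _ in range(iterations):
        # Fix the item factors and solve a small ridge regression per user.
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                Vo = V[obs]
                U[u] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ ratings[u, obs])
        # Fix the user factors and solve a small ridge regression per item.
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Uo = U[obs]
                V[i] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ ratings[obs, i])
    return U, V


# Toy rating matrix (hypothetical data); 0 marks "not rated".
ratings = np.array(
    [[5, 3, 0, 1], [4, 0, 0, 1], [1, 1, 0, 5], [1, 0, 0, 4], [0, 1, 5, 4]],
    dtype=float,
)
mask = ratings > 0
U, V = als_factorize(ratings, mask)
pred = U @ V.T  # predicted ratings, including the unrated cells
train_rmse = float(np.sqrt(np.mean((pred[mask] - ratings[mask]) ** 2)))
print(f"training RMSE: {train_rmse:.3f}")
```

Each half-step has a closed-form solution because one factor matrix is held fixed, which is what makes the method "alternating".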

For the ALS model, I use the implementation from implicit for better performance. For the CB model, I wrote my own implementation, which combines several content features whose data type can be list, category, or text. The final similarity of a pair of items is the average of the similarities over all input features. You can find the code of this implementation here.
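
The averaging step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's actual CB module: every feature is reduced to a list of tokens, encoded as counts, and compared with cosine similarity. The helper names (`encode_tokens`, `cosine_matrix`, `combined_similarity`, `most_similar`) and the toy catalogue are hypothetical.

```python
import numpy as np


def encode_tokens(rows):
    """Encode one feature (a list of token lists) as an items x vocabulary count matrix."""
    vocab = sorted({tok for row in rows for tok in row})
    index = {tok: j for j, tok in enumerate(vocab)}
    mat = np.zeros((len(rows), len(vocab)))
    for i, row in enumerate(rows):
        for tok in row:
            mat[i, index[tok]] += 1.0
    return mat


def cosine_matrix(mat):
    """Pairwise cosine similarity between the rows of `mat`."""
    norms = np.linalg.norm(mat, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = mat / norms
    return unit @ unit.T


def combined_similarity(features):
    """Average the per-feature cosine similarity matrices."""
    sims = [cosine_matrix(encode_tokens(rows)) for rows in features]
    return np.mean(sims, axis=0)


def most_similar(item, sim, top_n=1):
    """Return the indices of the `top_n` items most similar to `item`."""
    order = np.argsort(-sim[item])
    return [int(j) for j in order if j != item][:top_n]


# Toy catalogue of three items (hypothetical data).
genres = [["Animation", "Comedy"], ["Animation", "Comedy"], ["Horror"]]  # list feature
decade = [["1990s"], ["1990s"], ["1980s"]]  # category feature as a single token
titles = [t.lower().split() for t in ["Toy Story", "Toy Story 2", "Night Terror"]]  # text

sim = combined_similarity([genres, decade, titles])
print(most_similar(0, sim))
```

Averaging gives every feature the same weight; a weighted mean would be a natural variation if some features matter more than others.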

Metrics

In this project, I only implement evaluation for ALS. To evaluate ALS, I use three metrics: RMSE, MAP@k and P@k.

RMSE (Root Mean Squared Error): the square root of the mean squared difference between predicted ratings and true ratings.
P@k (Precision at k): the fraction of the top k recommended items that are relevant to the user.
MAP@k (Mean Average Precision at k): the mean over all users of the average precision at k, which also rewards ranking relevant items higher in the list.
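
The three metrics can be written in a few lines of plain Python. The sketch below is not the project's evaluation code, and the function names are chosen for illustration. Note that several conventions exist for the average-precision denominator; this version assumes `min(k, number of relevant items)`.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long rating sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def average_precision_at_k(recommended, relevant, k):
    """Average of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)  # assumed convention, see the note above
    return total / denom if denom else 0.0


def map_at_k(all_recommended, all_relevant, k):
    """Mean of the per-user average precision at k."""
    scores = [
        average_precision_at_k(rec, rel, k)
        for rec, rel in zip(all_recommended, all_relevant)
    ]
    return sum(scores) / len(scores)


# Two hypothetical users: ranked recommendations and their held-out relevant items.
recommended = [[1, 2, 3, 4], [5, 6]]
relevant = [{1, 3}, {7}]
print(precision_at_k(recommended[0], relevant[0], k=4))
print(map_at_k(recommended, relevant, k=4))
```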

Results

Results of ALS on MovieLens 1M. You can view the details here.

Factors	RMSE	MAP@k	P@k
10	3.21	0.10	0.206
30	3.18	0.122	0.244
50	3.18	0.128	0.246
100	3.214	0.121	0.234
300	3.42	0.087	0.171
1000	3.67	0.0364	0.0733

Instructions

ETL pipeline

Both the user (rating) dataset and the item dataset need pre-processing. To run the ETL pipeline that cleans the user dataset, run the command below:

python movielens_rating_etl.py

To run the ETL pipeline that cleans the item dataset:

python movielens_meta_etl.py
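
As a rough idea of what a rating ETL step can look like, the sketch below parses lines in the MovieLens 1M ratings format (`UserID::MovieID::Rating::Timestamp`), drops malformed rows, keeps the latest rating per user and movie, and stores the result in SQLite. It is not the content of movielens_rating_etl.py; the function names, the cleaning rules, and the `ratings` table schema are assumptions for illustration.

```python
import sqlite3


def parse_ratings(lines):
    """Parse raw rating lines and keep the most recent rating per (user, movie)."""
    latest = {}
    for line in lines:
        parts = line.strip().split("::")
        if len(parts) != 4:
            continue  # skip malformed lines
        try:
            user, movie, rating, ts = (int(p) for p in parts)
        except ValueError:
            continue  # skip lines with non-numeric fields
        if not 1 <= rating <= 5:
            continue  # MovieLens 1M ratings are whole stars from 1 to 5
        key = (user, movie)
        if key not in latest or ts > latest[key][1]:
            latest[key] = (rating, ts)
    return [(u, m, r, ts) for (u, m), (r, ts) in sorted(latest.items())]


def save_ratings(rows, conn):
    """Write cleaned rows to a (hypothetical) `ratings` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ratings ("
        "user_id INTEGER, movie_id INTEGER, rating INTEGER, timestamp INTEGER, "
        "PRIMARY KEY (user_id, movie_id))"
    )
    conn.executemany("INSERT OR REPLACE INTO ratings VALUES (?, ?, ?, ?)", rows)
    conn.commit()


# Small hand-made sample, including a duplicate pair, a broken line and a bad rating.
raw = [
    "1::1193::5::978300760",
    "1::661::3::978302109",
    "1::661::4::978302200",
    "bad line",
    "2::1357::9::978298709",
]
rows = parse_ratings(raw)
conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
save_ratings(rows, conn)
print(conn.execute("SELECT COUNT(*) FROM ratings").fetchone()[0])
```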

Build and train the models

There are two models, ALS and ContentBased. To train the ALS model:

python als.py

To train the ContentBased model:

python cb.py

Run web app

Run the command below to start the web app on localhost:

python app.py

Then go to http://localhost:3000 to see the web app.
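
For readers curious about the serving step, a minimal Flask endpoint for user recommendations could look like the sketch below. It is not the project's web app: the `/recommend` route, its query parameters, and the `TOP_ITEMS` lookup table (standing in for a trained model) are all hypothetical.

```python
from flask import Flask, jsonify, request

# Hypothetical stand-in for a trained model: user id -> ranked movie ids.
TOP_ITEMS = {1: [1193, 661, 914], 2: [1357, 3068]}

app = Flask(__name__)


@app.route("/recommend")
def recommend():
    """Return the top-k recommended movie ids for the requested user."""
    user_id = request.args.get("user_id", type=int)
    if user_id is None or user_id not in TOP_ITEMS:
        return jsonify(error="unknown or missing user_id"), 404
    k = request.args.get("k", default=10, type=int)
    return jsonify(user_id=user_id, items=TOP_ITEMS[user_id][:k])


# Calling app.run(port=3000) would start the development server on the
# port mentioned above; it is left out here so the module can be imported safely.
```

With the server running, a request such as `/recommend?user_id=1&k=2` would return a small JSON document with the user id and a list of movie ids.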
