Python-Movie-Recommender Author: Amir Ali

Built using Python 2.7.13 with the Anaconda 4.4.0 (64-bit) distribution.

Required Python modules: pandas, numpy, scipy, sklearn, sqlite3, matplotlib

The data is provided by the GroupLens Research Project at the University of Minnesota and was downloaded from https://grouplens.org/datasets/movielens/

There are 3 data sets used:

- 100k - Consists of 100,000 ratings (1-5) from 943 users on 1,682 movies
- 1m - Consists of 1,000,209 anonymous ratings of approximately 3,900 movies made by 6,040 MovieLens users
- 20m - Consists of 20,000,263 ratings and 465,564 tag applications across 27,278 movies

See the README.txt file in each data set folder for more details.

There are 3 python recommenders:

recommender.py - Uses the 100k data set. Baseline recommender using collaborative filtering.

Notes:

There are many design choices to make when building a recommendation system. I have started the Coursera Specialization in Recommender Systems to get a better understanding of the pros and cons of each design choice.

Journal

Day 1:
- Imported the data into Python, starting with the 100k dataset.
- Learned the pandas and numpy libraries and read their documentation.
- Created a user-item matrix using pandas and numpy.
- Implemented a most-popular-movie recommender.
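The Day 1 steps can be sketched roughly as follows. This is a minimal illustration, not the code in recommender.py: the toy ratings, the column names, the `most_popular` helper, and the definition of "popular" (highest mean rating among movies with a minimum number of ratings) are all assumptions. It is written for Python 3, although the project itself targets Python 2.7.

```python
import pandas as pd

# Toy ratings in a (user_id, item_id, rating) layout; values are made up.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3, 3],
    "item_id": [10, 20, 10, 30, 10, 20, 30],
    "rating":  [5, 3, 4, 2, 4, 5, 1],
})

# User-item matrix: rows are users, columns are movies, NaN means "not rated".
user_item = ratings.pivot_table(index="user_id", columns="item_id", values="rating")


def most_popular(ratings, n=2, min_ratings=2):
    """Hypothetical helper: rank movies by mean rating, skipping rarely rated ones."""
    stats = ratings.groupby("item_id")["rating"].agg(["mean", "count"])
    stats = stats[stats["count"] >= min_ratings]
    ranked = stats.sort_values(["mean", "count"], ascending=False)
    return ranked.head(n).index.tolist()


top = most_popular(ratings)
```

The `min_ratings` threshold keeps a movie with a single 5-star rating from outranking widely rated movies; the right cutoff depends on the data set.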

Day 2:
- Implemented and tested the Pearson correlation similarity measure.
- Implemented collaborative filtering, both user-based and item-based.
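As a rough sketch of the Day 2 ideas, the example below computes Pearson correlation over co-rated movies and uses it for a user-based prediction. The toy data, the function names `pearson` and `predict`, and the mean-centered weighting scheme are assumptions for illustration; the repository's implementation may differ. It uses only the standard library and assumes Python 3 division.

```python
from math import sqrt

# Toy ratings: {user: {movie: rating}}. Names and values are illustrative only.
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 5},
    "cat": {"A": 1, "B": 5, "C": 2, "D": 1},
}


def pearson(u, v):
    """Pearson correlation computed over the movies both users rated."""
    common = [m for m in u if m in v]
    if len(common) < 2:
        return 0.0
    mu = sum(u[m] for m in common) / len(common)
    mv = sum(v[m] for m in common) / len(common)
    num = sum((u[m] - mu) * (v[m] - mv) for m in common)
    den = sqrt(sum((u[m] - mu) ** 2 for m in common)) * \
        sqrt(sum((v[m] - mv) ** 2 for m in common))
    return num / den if den else 0.0


def predict(user, movie):
    """User-based CF: user's mean plus similarity-weighted neighbour deviations."""
    own = ratings[user]
    own_mean = sum(own.values()) / len(own)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        sim = pearson(own, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        num += sim * (theirs[movie] - their_mean)
        den += abs(sim)
    # The result is not clipped to the 1-5 rating scale in this sketch.
    return own_mean + num / den if den else own_mean
```

Item-based filtering follows the same pattern with the roles of users and movies swapped: similarities are computed between movie columns instead of user rows.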

Day 3:
- Added metrics, evaluation and testing.
- Will add accuracy and error measures: MAE, RMSE and MSE.
- Setting up the evaluation methods now is necessary for tuning and optimizing the recommender later.
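For reference, the three error measures named on Day 3 can be written in a few lines. This is a generic sketch with made-up numbers, not the project's test code; the function names are assumptions.

```python
from math import sqrt


def mse(actual, predicted):
    """Mean squared error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE, in rating units."""
    return sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [4, 3, 5, 2]      # held-out true ratings (illustrative)
predicted = [3, 3, 4, 4]   # recommender output for the same items (illustrative)

print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

Because RMSE squares each error before averaging, it penalizes a few large misses more heavily than MAE does, which is why the two are usually reported together.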

Day 4:
- Learned and reviewed SQLite commands.
- Created a database and read the data using Python.
- Added more test cases for the first recommender.
- Added proper Python method comments (docstrings).
- Tested the item-based and user-based CF methods with a small hand-made data set.
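A minimal sketch of the Day 4 database step, using the standard-library `sqlite3` module. The table name, column names, sample rows and the `ratings_for_user` helper are hypothetical; the actual schema in the `database` folder is not described here. An in-memory database is used so the example leaves no file behind.

```python
import sqlite3

# In-memory database for illustration; a real run would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (user_id INTEGER, item_id INTEGER, rating INTEGER)"
)

# Made-up rows in (user_id, item_id, rating) order.
rows = [(1, 10, 5), (1, 20, 3), (2, 10, 4)]
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)
conn.commit()


def ratings_for_user(conn, user_id):
    """Hypothetical helper: return (item_id, rating) pairs for one user."""
    cur = conn.execute(
        "SELECT item_id, rating FROM ratings WHERE user_id = ? ORDER BY item_id",
        (user_id,),
    )
    return cur.fetchall()


user1 = ratings_for_user(conn, 1)
```

Passing `user_id` as a `?` parameter instead of formatting it into the SQL string lets sqlite3 handle quoting and avoids injection problems.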

Day 5:
- Added an SVD method: normalized the matrix before factorization and calculated the RMSE on the test set.
- Transitioned to the 20 million ratings dataset.
- Will implement stochastic gradient descent to estimate the SVD for larger matrices.
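The Day 5 approach might look roughly like the sketch below. The journal does not say how the matrix was normalized, so subtracting each user's mean rating is an assumption, as are the toy matrix, the use of 0 for missing ratings, the rank `k`, and the function names. It uses a dense `numpy.linalg.svd`, which is only practical for small matrices and is the reason the journal plans stochastic gradient descent for the 20m data set.

```python
import numpy as np

# Toy user-item matrix; 0 marks a missing rating (an assumption of this sketch).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)


def svd_predict(R, k=2):
    """Mean-center observed ratings per user, then take a rank-k SVD approximation."""
    mask = R > 0
    counts = np.maximum(mask.sum(axis=1), 1)           # avoid division by zero
    user_means = R.sum(axis=1) / counts
    centered = np.where(mask, R - user_means[:, None], 0.0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]            # keep the top k factors
    return approx + user_means[:, None]                # undo the normalization


def rmse(predicted, test):
    """RMSE over held-out (user_index, item_index, rating) triples."""
    errors = [predicted[u, i] - r for u, i, r in test]
    return float(np.sqrt(np.mean(np.square(errors))))


test = [(1, 1, 3.0), (3, 1, 1.0)]   # made-up held-out ratings
pred = svd_predict(R, k=2)
score = rmse(pred, test)
```

A smaller `k` smooths the matrix more aggressively; in practice it would be tuned against the test-set RMSE.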
