This repository contains a movie recommender system.
Data set:
The data set is taken from MovieLens: https://grouplens.org/datasets/movielens/
It consists of 930 users and more than 1630 movies.
Each movie has 19 features.
The data set contains the ratings users gave to movies, along with each movie's features encoded as 0s and 1s.
A 1 means the movie has that feature, and a 0 means it does not.
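As a rough illustration of the 0/1 feature encoding described above, the sketch below turns a list of feature names into a binary vector. The five names in GENRES and the function encode_genres are hypothetical and chosen only for the example; the actual data set uses 19 feature columns.

```python
# Hypothetical feature names for illustration only; the real data set has 19.
GENRES = ["Action", "Comedy", "Drama", "Romance", "Thriller"]


def encode_genres(movie_genres):
    """Return a 0/1 vector with a 1 for every feature the movie has."""
    tagged = set(movie_genres)
    return [1 if genre in tagged else 0 for genre in GENRES]


# A movie tagged as both Comedy and Romance.
vector = encode_genres(["Comedy", "Romance"])
print(vector)
```

Each position in the vector corresponds to one feature, so two movies can be compared position by position.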
Approach:
Users with similar interests are identified using K-means clustering and Pearson correlation similarity, and movies are
then recommended to those users.
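The sketch below shows one way the two pieces could fit together; it is an assumption, not necessarily how this repository combines them. Cluster labels are taken as given (for example from a separate K-means step), Pearson correlation is computed over the movies two users have both rated, and a rating is predicted as a similarity-weighted average over positively correlated users in the same cluster. The names pearson, predict_rating, ratings, and cluster_of are hypothetical.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the movies both users rated (dicts: movie -> rating)."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    a = [ratings_a[m] for m in common]
    b = [ratings_b[m] for m in common]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for constant ratings
    return cov / sqrt(var_a * var_b)


def predict_rating(user, movie, ratings, cluster_of):
    """Similarity-weighted average rating from positively correlated users in the same cluster."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or cluster_of[other] != cluster_of[user]:
            continue
        if movie not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim > 0:
            weighted_sum += sim * other_ratings[movie]
            weight_total += sim
    if weight_total == 0:
        return None  # no usable neighbours for this movie
    return weighted_sum / weight_total


# Toy data; cluster labels are assumed to come from a prior K-means step.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
cluster_of = {"alice": 0, "bob": 0, "carol": 1}

print(predict_rating("alice", "m4", ratings, cluster_of))
```

Restricting the neighbour search to one cluster keeps the number of pairwise correlations small, at the cost of ignoring similar users who ended up in other clusters.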
Test Data:
It consists of 9300 test examples.
The algorithm achieved a mean squared error of 1.24154447389.
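For reference, mean squared error is the average of the squared differences between actual and predicted ratings. The sketch below is a minimal version of that calculation; the function name and the toy ratings are illustrative and are not taken from the repository or its test data.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted ratings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one rating is required")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy ratings for illustration only.
actual = [4, 3, 5, 2]
predicted = [3.5, 3, 4, 3]
print(mean_squared_error(actual, predicted))
```

Because the errors are squared, a few badly wrong predictions raise the score more than many slightly wrong ones.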
Optimising mean squared error:
With n_clusters = 2, the mean squared error is 1.08155166095. This is the best result so far.
Plot of the data after applying PCA and clustering with K-means:
https://user-images.githubusercontent.com/22453634/31859508-b7b310be-b72a-11e7-91a6-7fdcde97d2e3.png