Data Analysis of the MovieLens Movie Dataset
This is a data analysis demo. The dataset is provided by GroupLens and was extracted from the movie website MovieLens. It contains over 20 million ratings across 27,278 movies, given by 138,493 users between January 09, 1995 and March 31, 2015. This report uses only the two datasets covering movie data and user ratings.
File description / Usage
The data is publicly available and is not included here. There are two main files:
This file runs PCA and K-means clustering on the user dataset within MovieLens.
This file runs data cleaning and plotting functions to visualize movie and views information for observational purposes.
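To illustrate the PCA and K-means step mentioned for the first file, here is a minimal sketch in Python using only NumPy. It is not the repository's code. The function names (pca_project, kmeans, make_toy_user_features), the synthetic user-by-feature matrix, and the choice of two components and two clusters are all assumptions made for this example. The actual file may well rely on a library such as scikit-learn and on features derived from the real ratings.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


def make_toy_user_features(seed=0):
    """Synthetic stand-in for a user feature matrix (NOT MovieLens data).

    Hypothetical layout: 100 users x 6 features, e.g. mean rating per genre,
    drawn from two groups with opposite tastes.
    """
    rng = np.random.default_rng(seed)
    group_a = rng.normal(loc=[4.5, 4.5, 4.5, 1.5, 1.5, 1.5], scale=0.3, size=(50, 6))
    group_b = rng.normal(loc=[1.5, 1.5, 1.5, 4.5, 4.5, 4.5], scale=0.3, size=(50, 6))
    return np.vstack([group_a, group_b])


X = make_toy_user_features()
Z = pca_project(X, n_components=2)      # reduce 6 features to 2 components
labels, centroids = kmeans(Z, k=2)      # cluster users in the reduced space
print(Z.shape, centroids.shape)
```

Running K-means on the PCA output rather than on the raw features keeps the clusters easy to plot in two dimensions, which matches the observational purpose of this demo.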
Part of the code is hardcoded to produce the desired images in the exp* (experiment) folders. This could be made adjustable by rewriting the main function to accept different variables and values.
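One possible way to remove the hardcoding is to let the main function read its settings from the command line. The sketch below uses Python's standard argparse module. Every option name and default value (--ratings, --out-dir, --n-components, --n-clusters) is hypothetical and chosen only for illustration; none of them exist in the current code.

```python
import argparse


def build_parser():
    """Command-line options for a hypothetical, parameterised analysis script."""
    parser = argparse.ArgumentParser(description="MovieLens analysis demo")
    # Assumed location of the ratings file; adjust to where the data was downloaded.
    parser.add_argument("--ratings", default="ratings.csv",
                        help="path to the ratings CSV file")
    parser.add_argument("--out-dir", default="exp1",
                        help="experiment folder for the generated images")
    parser.add_argument("--n-components", type=int, default=2,
                        help="number of PCA components to keep")
    parser.add_argument("--n-clusters", type=int, default=5,
                        help="number of K-means clusters")
    return parser


def main(argv=None):
    """Parse the options and return them as a dict.

    Placeholder only: a real script would load the data, run the analysis,
    and save the figures into args.out_dir at this point.
    """
    args = build_parser().parse_args(argv)
    return vars(args)


# Example call with explicit arguments instead of sys.argv.
settings = main(["--n-clusters", "3", "--out-dir", "exp2"])
print(settings)
```

With this layout, each experiment folder corresponds to one set of arguments rather than to an edited copy of the script.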
The full analysis can be found in my blog post on my personal website.