exp1
exp2
exp3
exp4
exp5
exp6
README.md
cluster+.py
cluster.py
data_process.py

README.md

Data analysis from Movie Dataset

This is a data analysis demo. The dataset is provided by GroupLens and was extracted from the movie website MovieLens. It contains over 20 million ratings across 27,278 movies, created by 138,493 users between January 09, 1995 and March 31, 2015. In this report, only two of the dataset's files were used: the movie data and the user ratings.

File description / Usage

The data is publicly available and is not included in this repository. There are two major files.

- cluster+.py

This file runs PCA and K-means clustering on the user data from MovieLens.

- data_process.py

This file contains data cleaning and plotting functions that plot movie and views information for exploratory purposes.
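To illustrate the PCA and K-means step described for cluster+.py, here is a minimal NumPy-only sketch. It is not the code from this repository: the toy `user_features` matrix, the function names `pca` and `kmeans`, and the choice of 2 components and 2 clusters are all assumptions made for illustration. The repository's scripts may use a different library or different settings.

```python
import numpy as np


def pca(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: 6 users x 4 features (e.g. mean rating per genre).
user_features = np.array([
    [5, 5, 1, 1],
    [5, 4, 1, 2],
    [4, 5, 2, 1],
    [1, 1, 5, 5],
    [2, 1, 5, 4],
    [1, 2, 4, 5],
], dtype=float)

reduced = pca(user_features, n_components=2)  # shape (6, 2)
labels, centroids = kmeans(reduced, k=2)      # one cluster label per user
```

Reducing the dimensionality first keeps the distance computations in K-means cheap and makes the clusters easy to plot in two dimensions.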

Part of the code is hardcoded to produce the desired images in the exp* (experiment) folders. This could be made adjustable by having the main function accept different variables and values.
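One possible way to replace the hardcoded values is a small command-line interface built with the standard `argparse` module. This is only a sketch: the option names (`--n-clusters`, `--n-components`, `--output-dir`) and their defaults are hypothetical and do not come from the repository's scripts.

```python
import argparse


def parse_args(argv=None):
    """Parse experiment settings; all options and defaults are hypothetical."""
    parser = argparse.ArgumentParser(
        description="Run one clustering experiment on the MovieLens user data."
    )
    parser.add_argument("--n-clusters", type=int, default=5,
                        help="number of K-means clusters")
    parser.add_argument("--n-components", type=int, default=2,
                        help="number of PCA components to keep")
    parser.add_argument("--output-dir", default="exp1",
                        help="folder where the plots are written")
    return parser.parse_args(argv)


# Passing an explicit list mimics a command line such as:
#   python cluster+.py --n-clusters 8 --output-dir exp2
args = parse_args(["--n-clusters", "8", "--output-dir", "exp2"])
```

With this in place, each experiment folder would correspond to one invocation with its own arguments instead of an edit to the source.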

Results

K-means

Remarks

The full analysis can be found in my blog post on my personal website.
