This is the code for "How to Best Visualize a Dataset Easily" by Siraj Raval on YouTube

# How to Best Visualize a Dataset Easily

# Overview

This is the code for this video by Siraj Raval on YouTube. The human activities dataset contains 5 classes (sitting-down, standing-up, standing, walking, and sitting) collected over 8 hours of activity from 4 healthy subjects. The dataset is downloaded from here. This code downloads the dataset, cleans it, creates feature vectors, and then uses t-SNE to reduce the dimensionality of the feature vectors to just 2. Finally, it uses matplotlib to visualize the data.
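The last two steps of that pipeline (t-SNE down to 2 dimensions, then a matplotlib scatter plot) can be sketched as follows. This is a minimal illustration, not the contents of `data_visualization.py`: it assumes scikit-learn supplies the t-SNE implementation, it uses synthetic data in place of the downloaded dataset, and the helper name `embed_2d` and the output file name `tsne_sketch.png` are made up for the example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_2d(features, perplexity=30.0, seed=0):
    """Standardize the features, then project them to 2 dimensions with t-SNE."""
    scaled = StandardScaler().fit_transform(features)
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(scaled)


# Synthetic stand-in for the real feature vectors (an assumption for this sketch):
# 5 classes, 40 samples each, 12 features per sample.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 5, 40, 12
centers = rng.normal(scale=5.0, size=(n_classes, n_features))
features = np.vstack(
    [center + rng.normal(size=(n_per_class, n_features)) for center in centers]
)
labels = np.repeat(np.arange(n_classes), n_per_class)

embedding = embed_2d(features)

# One point per sample, colored by class label.
fig, ax = plt.subplots()
ax.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10", s=10)
ax.set_title("t-SNE projection (synthetic data)")
fig.savefig("tsne_sketch.png")
```

Standardizing before t-SNE is a choice made for this sketch so that no single feature dominates the distance computation. The perplexity value must stay below the number of samples.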

## Dependencies

Install the dependencies with `pip install` (e.g. `pip install pandas`).

**Note:** an updated dataset is available here if the other link is broken: http://rstudio-pubs-static.s3.amazonaws.com/19668_2a08e88c36ab4b47876a589bb1d61c37.html

## Usage

To run this code, run the following in a terminal:

`python data_visualization.py`

## Challenge

The challenge for this video is to visualize this Game of Thrones dataset. Use t-SNE to lower the dimensionality of the data and plot it using matplotlib. In your README, write out 2-3 sentences about something you discovered in the data after visualizing it. This will be great practice in understanding why dimensionality reduction is so important and in analyzing data visually.
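Before t-SNE can run on a new table, every column has to be numeric. The sketch below shows one possible way to get there, assuming the table mixes text and numeric columns and has some missing values. The helper name `to_feature_matrix`, the column names, and the rows are all invented for illustration and are not taken from the challenge dataset.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def to_feature_matrix(df, label_column):
    """Split off the label column and turn the remaining columns into numeric features."""
    labels = df[label_column]
    features = df.drop(columns=[label_column])
    # One-hot encode text columns; numeric columns pass through unchanged.
    features = pd.get_dummies(features, dtype=float)
    # Fill missing numeric values with each column's median.
    features = features.fillna(features.median())
    # Put every column on a comparable scale before computing distances.
    return StandardScaler().fit_transform(features), labels


# Hypothetical rows, made up for this example only.
toy = pd.DataFrame({
    "house": ["A", "B", "A", "C"],
    "age": [30.0, None, 45.0, 22.0],
    "is_noble": [1, 0, 1, 0],
    "survived": [1, 0, 0, 1],
})
matrix, labels = to_feature_matrix(toy, label_column="survived")
```

The resulting `matrix` can then be passed to a t-SNE implementation, with `labels` used to color the points in the plot.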

## Due Date is December 29th 2016

## Credits

The credits for this code go to Yifeng-He. I've merely created a wrapper around the code to make it easy for people to get started.
