llSourcell/visualize_dataset_demo

#How to Best Visualize a Dataset Easily

#Overview

This is the code for this video by Siraj Raval on YouTube. The human activities dataset contains 5 classes (sitting-down, standing-up, standing, walking, and sitting) collected over 8 hours of activity from 4 healthy subjects. The dataset is downloaded from here. This code downloads the dataset, cleans it, creates feature vectors, then uses t-SNE to reduce the dimensionality of the feature vectors to just 2. Then, we use matplotlib to visualize the data.
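
The steps above (feature vectors, t-SNE down to 2 dimensions, matplotlib plot) can be sketched roughly as follows. This is not the code in `data_visualization.py`: it is a minimal illustration that assumes scikit-learn is installed and uses synthetic data from `make_blobs` as a stand-in for the activity dataset. The sample count, feature count, `perplexity` value, and output filename are all assumptions chosen for the example.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the activity data (assumption): 5 classes, 12 features.
X, y = make_blobs(n_samples=300, n_features=12, centers=5, random_state=0)

# Standardize so that no single feature dominates the pairwise distances.
X_std = StandardScaler().fit_transform(X)

# Reduce the 12-dimensional feature vectors to 2 dimensions with t-SNE.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_2d = tsne.fit_transform(X_std)

# Scatter plot of the embedding, one color per class label.
fig, ax = plt.subplots()
points = ax.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="viridis", s=10)
ax.set_xlabel("t-SNE dimension 1")
ax.set_ylabel("t-SNE dimension 2")
fig.colorbar(points, ax=ax, label="class")
fig.savefig("tsne_sketch.png")  # hypothetical output filename
```

Replace the `make_blobs` call with your own feature matrix and labels to try it on real data. The axes of a t-SNE plot have no physical meaning, so read the plot for cluster structure rather than for coordinates.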

##Dependencies

Install dependencies via 'pip install' (e.g. pip install pandas).

Note: an updated dataset is available here if the other link is broken: http://rstudio-pubs-static.s3.amazonaws.com/19668_2a08e88c36ab4b47876a589bb1d61c37.html

##Usage

To run this code, run the following in a terminal:

python data_visualization.py

##Challenge

The challenge for this video is to visualize this Game of Thrones dataset. Use t-SNE to lower the dimensionality of the data and plot it using matplotlib. In your README, write out 2-3 sentences about something you discovered in the data after visualizing it. This will be great practice in understanding why dimensionality reduction is so important and in analyzing data visually.
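
t-SNE needs purely numeric input with no missing values, so a table that mixes text and number columns has to be turned into feature vectors first. Here is one possible way to do that with pandas, shown on a tiny made-up table. The column names, the values, and the helper `to_feature_matrix` are hypothetical and are not taken from the actual challenge dataset.

```python
import pandas as pd

# Hypothetical toy table; columns and values are invented for illustration.
df = pd.DataFrame({
    "house": ["Stark", "Lannister", None, "Stark"],
    "age": [34.0, None, 27.0, 16.0],
    "is_noble": [1, 1, 0, 1],
})

def to_feature_matrix(frame):
    """Return an all-numeric DataFrame with no missing values."""
    # Numeric columns: fill gaps with the column median.
    numeric = frame.select_dtypes(include="number")
    numeric = numeric.fillna(numeric.median())
    # Non-numeric columns: label gaps, then one-hot encode each category.
    categorical = frame.select_dtypes(exclude="number").fillna("unknown")
    encoded = pd.get_dummies(categorical, dtype=float)
    return pd.concat([numeric, encoded], axis=1).astype(float)

features = to_feature_matrix(df)
print(features.columns.tolist())
```

The resulting `features.values` array can be passed to a t-SNE implementation. Median filling and one-hot encoding are just one reasonable default. Other choices, such as dropping sparse columns, may suit a given dataset better.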

##Due Date is December 29th 2016

##Credits

The credits for this code go to Yifeng-He. I've merely created a wrapper around the code to make it easy for people to get started.
