This is the code for "The best way to prepare a dataset easily" by @Sirajology on YouTube

# Prepare Dataset Challenge


This is the code for this video by Siraj on YouTube. The brainscan dataset is entirely fictional, but it serves as a good example of how to prepare a dataset. Real examples do exist, but they have too many features to sift through in a short video.



Run the following in a terminal:

$ python --train simdata/linear_data_train.csv --test simdata/linear_data_eval.csv --num_epochs 5 --verbose True

Add your own test data to test the model out.

## Challenge

The challenge for this video is to create a Pokémon classifier that predicts each Pokémon's type 1 (e.g. fire, water, grass) using this Pokémon dataset on Kaggle. It will be great practice in data preparation (feature selection, cleaning, etc.). Post your GitHub link in the comments and I'll announce the winner in the next video. Due date is December 22nd at noon PST.
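To get started on the data preparation side, here is a minimal sketch of the cleaning, feature selection, scaling, and label encoding steps using only the Python standard library. It is not the code from this repo. The column names (`Name`, `Type 1`, `HP`, `Attack`, `Defense`) and the sample rows are assumptions made up for illustration, so check them against the actual Kaggle file before reusing any of this.

```python
import csv
import io

# Tiny made-up sample standing in for the real file.
# Column names and values are assumptions, not taken from the Kaggle dataset.
SAMPLE_CSV = """Name,Type 1,HP,Attack,Defense
Charmander,Fire,39,52,43
Squirtle,Water,44,48,65
Bulbasaur,Grass,45,49,49
MissingNo,,33,136,0
Vulpix,Fire,38,41,
"""

FEATURES = ["HP", "Attack", "Defense"]  # hypothetical feature selection
LABEL = "Type 1"


def load_clean_rows(text):
    """Parse CSV text and drop rows with a missing label or feature."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if not row[LABEL] or any(not row[f] for f in FEATURES):
            continue  # cleaning step: skip incomplete rows
        rows.append(row)
    return rows


def min_max_scale(columns):
    """Scale each feature column to the [0, 1] range."""
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled.append([(v - lo) / span for v in col])
    return scaled


def prepare(text):
    """Return (X, y, label_names) ready to feed to a classifier."""
    rows = load_clean_rows(text)
    columns = [[float(r[f]) for r in rows] for f in FEATURES]
    scaled = min_max_scale(columns)
    X = [list(sample) for sample in zip(*scaled)]  # back to one row per sample
    label_names = sorted({r[LABEL] for r in rows})
    y = [label_names.index(r[LABEL]) for r in rows]  # integer class ids
    return X, y, label_names


X, y, label_names = prepare(SAMPLE_CSV)
print(label_names)
print(y)
```

For the real dataset you would read from the downloaded file instead of `SAMPLE_CSV`, and you may prefer filling in missing values over dropping rows if many rows are incomplete.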


Credits go to Jason Baldridge. I've merely created a wrapper to get people started.