
datascienceit/challenge-titanic


Challenge-titanic

This repository is for training future data scientists in an "industry-like" environment.

Instructions

  1. Fork this repository.
  2. Read about the challenge and download the data.
  3. Write code to solve the problem.
     • Use branches (don't work on the GitHub master branch).
  4. Export the notebook to a Python script and push both the notebook and the script to GitHub.
  5. Once you have good results, create a pull request.
  6. I will comment on the changes.
  7. We iterate on the comments until we're ready to move on to the next challenge.

Challenge

https://www.kaggle.com/c/titanic

Get above 85% accuracy.
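Accuracy here means the fraction of passengers whose survival is predicted correctly. A minimal sketch of that calculation follows; the `accuracy` helper and the label lists are made up for illustration and are not part of the example code in this repository. It also assumes you measure the score on your own held-out data before submitting anything.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for illustration only (1 = survived, 0 = did not survive).
validation_labels = [0, 1, 1, 0, 1, 0, 0, 1]
validation_preds = [0, 1, 0, 0, 1, 0, 1, 1]

score = accuracy(validation_preds, validation_labels)
print(f"validation accuracy: {score:.2%}")
print("target reached" if score > 0.85 else "keep iterating")
```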

Notes

  • The idea is to write good code which theoretically could be used for future deployments.
  • This project is about training, not just results.
  • Work with branches, not on master, in GitHub.
  • Use Python, Jupyter, and Turi.
  • Always start by splitting the data into three parts: train, validation, and test. To prevent overfitting, use the test dataset only once.
  • The example code already has issues in it. Good luck!
  • Try to commit every small change to GitHub instead of uploading large amounts of code at once.
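The three-way split from the notes can be sketched in plain Python as below. This is only an illustration under stated assumptions: the `split_three_ways` helper, the 70/15/15 proportions, and the seed are hypothetical choices, and the list of IDs stands in for the 891 rows of Kaggle's `train.csv`. In practice you would apply the same idea with whatever data-frame tooling you use.

```python
import random


def split_three_ways(rows, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle rows reproducibly and split them into train, validation, and test."""
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(rows)                      # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed makes the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


# Stand-in for the labeled rows: passenger IDs 1..891 (assumed row count of train.csv).
passenger_ids = list(range(1, 892))
train, validation, test = split_three_ways(passenger_ids)
print(len(train), len(validation), len(test))
```

Tune models against the validation part only, and set the test part aside until the final evaluation. Fixing the seed keeps the split identical across notebook runs, so results stay comparable between commits.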
