A case study on predicting tips in New York City taxis



First of all, I must say thank you to Chris Whong for obtaining the incredible dataset used in this project. Here is his little odyssey to get the data.



After cleaning the original dataset and taking a sample from it, it is possible to predict, with an accuracy of 71.74%, whether the tip for a NYC taxi trip will be less than 20% of the charge or greater than or equal to 20%. This is done without any information about the passengers, which would be essential data for this task.
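To make the prediction target concrete, here is a minimal sketch of how the two classes and the accuracy figure could be computed. It is not the code from the notebooks: the function names `tip_label` and `accuracy`, the sample trips, and the sample predictions are all made up for illustration.

```python
def tip_label(tip, charge, threshold=0.20):
    """Return 1 if the tip is at least `threshold` of the charge, else 0.

    Hypothetical helper: the 20% threshold comes from the text above,
    everything else is an assumption made for this example.
    """
    if charge <= 0:
        raise ValueError("charge must be positive")
    return 1 if tip >= threshold * charge else 0


def accuracy(predicted, actual):
    """Fraction of predicted labels that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up (tip, charge) pairs, not taken from the real dataset.
trips = [(2.5, 10.0), (1.0, 10.0), (3.0, 12.0), (0.0, 8.0)]
labels = [tip_label(tip, charge) for tip, charge in trips]

# Made-up model output for the same four trips.
predictions = [1, 0, 0, 0]
score = accuracy(predictions, labels)
```

With these four sample trips the labels are `[1, 0, 1, 0]`, and the made-up predictions get three of the four right.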

Extended version

For an extended version, there are some IPython notebooks that describe the complete process. You can find them in this repo, but for a better reading experience use this nbviewer link.