Python / Kaggle / Regression - Predict the duration of NYC taxi trips from location data using regression algorithms.

SebastienPavot/Kaggle-NYC-Taxi-Trip-Duration

Predict taxi trip duration of New York City - Kaggle Competition

In this Kaggle competition, the goal was to build a model that predicts the duration of taxi trips in New York City. This Jupyter Notebook walks through an end-to-end data science project. We first analyze the data and identify potential biases in it, then clean and process it. Next, we compute new features through feature engineering and add further variables from open data. Finally, we train regression models and optimize their hyperparameters. Our best result is a Log RMSE of 0.48757, obtained with a LightGBM model.
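For readers unfamiliar with the score quoted above, here is a minimal sketch of how a log RMSE can be computed, assuming it refers to the root mean squared logarithmic error (RMSLE) that this competition uses for evaluation. The function name `rmsle` and the sample durations are illustrative and are not taken from the notebook.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error of two equal-length sequences.

    Values are assumed to be non-negative trip durations (e.g. in seconds).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")
    # log1p(x) = log(1 + x), which keeps zero durations well defined.
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical durations in seconds, for illustration only.
actual = [455, 663, 2124]
predicted = [500, 600, 1800]
print(f"RMSLE: {rmsle(actual, predicted):.5f}")
```

Because the error is measured on the logarithm of the duration, it penalizes relative rather than absolute differences, so a 60-second miss matters more on a short trip than on a long one.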

Competition link: https://www.kaggle.com/c/nyc-taxi-trip-duration/overview

Figure: an example of our feature engineering, in which we created neighborhoods of New York City using the K-means algorithm.
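The neighborhood idea can be sketched as follows: cluster coordinates with K-means and use each point's cluster index as a categorical "neighborhood" feature. This is a dependency-free illustration of the technique, not the notebook's code, which may well rely on a library implementation. The names `kmeans` and `nearest`, the number of clusters, and the sample coordinates are all assumptions.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    # Treating (lat, lon) as planar coordinates is an approximation that is
    # reasonable at city scale.
    distances = [
        (point[0] - c[0]) ** 2 + (point[1] - c[1]) ** 2 for c in centroids
    ]
    return distances.index(min(distances))


def kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        new_centroids = []
        for cluster in range(k):
            members = [p for p, lab in zip(points, labels) if lab == cluster]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                # Keep the previous centroid if a cluster loses all its points.
                new_centroids.append(centroids[cluster])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical (latitude, longitude) pickup points, for illustration only.
pickups = [
    (40.70, -74.01), (40.71, -74.00), (40.72, -74.01),
    (40.80, -73.95), (40.81, -73.96), (40.82, -73.95),
]
centroids, neighborhoods = kmeans(pickups, k=2)
print(neighborhoods)  # one cluster index per pickup point
```

The same fitted centroids can be applied to dropoff coordinates with `nearest`, so that each trip gets both a pickup and a dropoff neighborhood.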
