In this case study, I solve a time series and regression problem: predicting the demand for Yellow Taxis in New York City in 10-minute intervals.
The data has been collected from http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml.
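To make the 10-minute interval concrete, here is a minimal sketch of how pickup timestamps could be grouped into 10-minute bins and counted. It uses only the Python standard library. The function names, the start time and the sample timestamps are hypothetical illustrations, not the code or data used in the case study.

```python
from collections import Counter
from datetime import datetime

BIN_SECONDS = 10 * 60  # length of one interval: 10 minutes


def bin_index(pickup, start):
    """Return the 0-based index of the 10-minute bin that contains `pickup`."""
    return int((pickup - start).total_seconds() // BIN_SECONDS)


def pickups_per_bin(pickups, start):
    """Count pickups in each 10-minute bin, with bins counted from `start`."""
    return Counter(bin_index(p, start) for p in pickups)


# Hypothetical sample timestamps (assumption: not taken from the real data set).
start = datetime(2016, 1, 1, 0, 0, 0)
sample = [
    datetime(2016, 1, 1, 0, 3, 12),   # falls in bin 0
    datetime(2016, 1, 1, 0, 9, 59),   # falls in bin 0
    datetime(2016, 1, 1, 0, 10, 0),   # falls in bin 1
    datetime(2016, 1, 1, 0, 25, 40),  # falls in bin 2
]
counts = pickups_per_bin(sample, start)
```

The count per bin is the demand value that the models try to predict. In practice this would probably be done per region as well as per time bin.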
How was it done?
Step 1 : Since I used Google Colab for this case study, I mounted the drive locally. After loading the data, I analyzed it.
Step 2 : After analyzing the data, I performed some data cleaning.
Step 3 : Applied baseline models such as moving average and weighted moving average, and found the corresponding MAPE.
Step 4 : Split the data into train and test sets.
Step 5 : Applied Linear Regression, Random Forest Regressor and XGBoost, and found the corresponding MAPE.
Step 6 : Applied Holt-Winters forecasting (a.k.a. triple exponential smoothing) to reduce MAPE to <12%.
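The baseline step can be sketched as follows: a simple moving average, a weighted moving average, and the MAPE used to score them. This is a minimal illustration assuming the standard MAPE definition (mean of absolute error divided by the actual value); the case study may compute it differently. The window size, the linear weights and the sample demand values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; bins with zero demand are skipped."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def moving_average_forecast(series, window):
    """Predict each value as the plain mean of the previous `window` values."""
    return [sum(series[t - window:t]) / window for t in range(window, len(series))]


def weighted_moving_average_forecast(series, window):
    """Like the moving average, but more recent values get larger weights."""
    weights = list(range(1, window + 1))  # oldest value -> 1, newest -> window
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, series[t - window:t])) / total
        for t in range(window, len(series))
    ]


# Hypothetical pickups per 10-minute bin (assumption: illustrative values only).
demand = [100, 110, 120, 130, 140, 150]
window = 3
actual = demand[window:]  # the values the forecasts are compared against

sma = moving_average_forecast(demand, window)
wma = weighted_moving_average_forecast(demand, window)
sma_mape = mape(actual, sma)
wma_mape = mape(actual, wma)
```

On a steadily rising series like this one, the weighted version lags less behind the latest values, so its MAPE is lower than that of the plain moving average.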
We can see that after adding Holt-Winters forecasting (a.k.a. triple exponential smoothing) and some features, the best mean absolute percentage error dropped to 9.48% for the XGBoost Regressor.
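For readers unfamiliar with triple exponential smoothing, here is a minimal pure-Python sketch of the additive Holt-Winters method. It tracks three components (level, trend and seasonal offsets), each updated with its own smoothing factor. Whether the case study used the additive or the multiplicative form is not stated, so the additive form is an assumption, and the smoothing factors, season length and sample series are hypothetical.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing.

    Returns `horizon` forecasts for the steps right after the end of `series`.
    """
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Initial level: mean of the first season.
    level = sum(series[:m]) / m
    # Initial trend: average per-step change between the first two seasons.
    trend = sum((series[m + i] - series[i]) / m for i in range(m)) / m
    # Initial seasonal offsets: deviation of the first season from its mean.
    seasonals = [series[i] - level for i in range(m)]

    for t in range(m, len(series)):
        y = series[t]
        prev_level = level
        s = seasonals[t % m]
        # Level: blend the deseasonalized observation with the previous estimate.
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal offset: blend the new deviation with the old offset.
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + h * trend + seasonals[(n - 1 + h) % m]
        for h in range(1, horizon + 1)
    ]


# Hypothetical demand with a repeating 4-step pattern and no trend.
history = [10, 20, 30, 20] * 3
forecast = holt_winters_additive(
    history, season_len=4, alpha=0.5, beta=0.3, gamma=0.4, horizon=4
)
```

On this purely periodic series the forecast repeats the seasonal pattern. In a setup like the one described above, such forecasts can be fed to the regression models as extra features.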