Seoul-Bike-sharing-demand-prediction-prajwal

Problem statement :

Rental bikes have been introduced in many cities to improve mobility and comfort. It is important to make rental bikes available and accessible to the public at the right time, as this reduces waiting time. Providing the city with a stable supply of rental bikes therefore becomes a major concern. The crucial part is predicting the bike count required at each hour so that this supply stays stable.

Data Description :

Date - year-month-day
Rented Bike Count - Count of bikes rented at each hour
Hour - Hour of the day
Temperature - Temperature in Celsius
Humidity - %
Windspeed - m/s
Visibility - 10m
Dew point temperature - Celsius
Solar radiation - MJ/m2
Rainfall - mm
Snowfall - cm
Seasons - Winter, Spring, Summer, Autumn
Holiday - Holiday/No holiday
Functional Day - NoFunc (non-functional hours), Fun (functional hours)

Variables description :

EDA : Exploratory data analysis

Checking the data distribution using distplot and boxplot, and checking for outliers in the dataset.
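A boxplot draws its whiskers at 1.5 times the interquartile range, so the same rule can be used to flag outliers numerically. The snippet below is a minimal sketch of that idea, not the notebook's actual code; the function name `iqr_outlier_mask` and the sample wind speed values are made up for illustration.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask marking points outside the boxplot whiskers."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

# Hypothetical wind speed readings in m/s (not taken from SeoulBikeData.csv).
wind_speed = [0.5, 1.2, 1.8, 2.0, 2.3, 2.9, 7.4]
mask = iqr_outlier_mask(wind_speed)
print(np.asarray(wind_speed)[mask])  # values flagged as outliers
```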

Creating some new features called weekend and timeshift.
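As a rough illustration of this step, the sketch below derives both features with pandas. The column names `Date` and `Hour` follow the data description above; the four time-of-day bins used for `timeshift` are an assumption, since this README does not state how the notebook defines them.

```python
import pandas as pd

# Small hypothetical sample; the real data lives in SeoulBikeData.csv.
df = pd.DataFrame({
    "Date": ["2017-12-01", "2017-12-02", "2017-12-03", "2017-12-04"],
    "Hour": [8, 14, 19, 2],
})
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")

# weekend: 1 for Saturday/Sunday, 0 otherwise (Monday=0 ... Sunday=6).
df["weekend"] = (df["Date"].dt.dayofweek >= 5).astype(int)

def hour_to_timeshift(hour):
    """Map an hour (0-23) to a coarse part of the day (assumed bins)."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"

df["timeshift"] = df["Hour"].apply(hour_to_timeshift)
print(df)
```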

Dropping features that are not needed.

Label encoding the categorical variables: Seasons, Holiday, Functioning Day, and timeshift.
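A minimal sketch of label encoding with scikit-learn's `LabelEncoder` is shown below. The category values are taken from the data description above, and the sample rows are hypothetical. Note that `LabelEncoder` assigns integer codes in sorted (alphabetical) order of the class labels.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical rows using the category values listed in the data description.
df = pd.DataFrame({
    "Seasons": ["Winter", "Spring", "Summer", "Autumn"],
    "Holiday": ["No holiday", "Holiday", "No holiday", "No holiday"],
    "Functioning Day": ["Fun", "Fun", "NoFunc", "Fun"],
})

encoders = {}
for col in ["Seasons", "Holiday", "Functioning Day"]:
    enc = LabelEncoder()
    df[col] = enc.fit_transform(df[col])
    encoders[col] = enc  # keep the encoder so codes can be mapped back later

print(df)
print(list(encoders["Seasons"].classes_))
```

Keeping the fitted encoders makes it possible to call `inverse_transform` when interpreting model output.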

Plotting all independent variables against Rented Bike Count using regplot.

Checking multicollinearity with a heatmap using the seaborn library.

Checking the VIF score for all independent variables and removing those with a high VIF score.
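The VIF of a feature is 1 / (1 - R^2), where R^2 comes from regressing that feature on all the others; equivalently, the VIFs are the diagonal of the inverse correlation matrix. The sketch below uses that identity with plain NumPy rather than whatever helper the notebook uses. The synthetic data and the cut-off of 10 are assumptions for illustration only; this README does not state the threshold that was applied.

```python
import numpy as np

def vif_scores(X):
    """VIF per column, computed as the diagonal of the inverse correlation matrix."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic example: dew point is built to be almost collinear with temperature.
rng = np.random.default_rng(0)
temperature = rng.normal(15, 10, 500)
dew_point = temperature - 5 + rng.normal(0, 1, 500)
humidity = rng.normal(60, 15, 500)

names = ["Temperature", "Dew point temperature", "Humidity"]
scores = vif_scores(np.column_stack([temperature, dew_point, humidity]))

VIF_THRESHOLD = 10  # assumed cut-off, not taken from the notebook
for name, score in zip(names, scores):
    flag = "high" if score > VIF_THRESHOLD else "ok"
    print(f"{name}: {score:.1f} ({flag})")
```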

Plotting the heatmap again, this time with the dependent variable.

Model building part

Now let's move to the model building part.

Before moving on to model building, we have to scale the dataset. Here we use MinMaxScaler().
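For reference, `MinMaxScaler` rescales each feature to the range [0, 1] using the minimum and maximum seen during fitting. The sketch below uses made-up values; fitting on the training split only and reusing the fitted scaler on the test split is a common practice assumed here, not something this README confirms.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical features: temperature (Celsius) and humidity (%).
X_train = np.array([[-5.0, 20.0], [10.0, 60.0], [25.0, 100.0]])
X_test = np.array([[10.0, 40.0]])

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max from training data
X_test_scaled = scaler.transform(X_test)        # reuse the same min/max

print(X_train_scaled)
print(X_test_scaled)
```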

Defining a function to train the input model and print the evaluation metrics. In the function, we fit the model so that multiple algorithms can be tested to find which gives the best result, compute evaluation metrics such as MSE, RMSE, and R2, and finally plot the result and return.
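A helper of this kind might look like the sketch below. The name `train_and_evaluate`, its signature, and the synthetic data are hypothetical, and the plotting step is left out to keep the example short; the notebook's own function may differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

def train_and_evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model, print regression metrics on the test split, and return them."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    metrics = {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": mean_absolute_error(y_test, y_pred),
        "R2": r2_score(y_test, y_pred),
    }
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    return metrics

# Synthetic regression data standing in for the scaled bike dataset.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 3))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

metrics = train_and_evaluate(LinearRegression(), X_train, X_test, y_train, y_test)
```

Because the function only relies on `fit` and `predict`, the same call works for any scikit-learn style regressor.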

Providing the range of values for hyperparameters such as n_estimators, max_depth, min_samples_split, min_samples_leaf, and eta.
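One common way to search such ranges is scikit-learn's `GridSearchCV`; whether the notebook uses it is an assumption, and the grid values and synthetic data below are placeholders. In scikit-learn the parameters are spelled `min_samples_split` and `min_samples_leaf`, while `eta` is XGBoost's alias for the learning rate and is therefore not part of this random forest grid.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Placeholder ranges; the values used in the notebook are not given here.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [3, 6],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 5],
}

# Synthetic data standing in for the scaled training set.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(150, 3))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 150)

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",  # higher (closer to 0) is better
    cv=3,
)
search.fit(X, y)

best_params = search.best_params_
print(best_params)
print(f"Best cross-validated RMSE: {-search.best_score_:.3f}")
```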

Building several regression models to test which one gives better results.

Linear Regression
Decision Tree Regressor
Random Forest Regressor
XGBoost Regressor
GradientBoosting Regressor

We observed the following results after completing the task:

Functioning day is the most influential feature and temperature is in second place for LinearRegressor.
Temperature is the most important feature for DecisionTree, RandomForest and GradientBoosting Regressor.
Functioning day is the most important feature and Winter is the second most important for XGBoostRegressor.

RMSE Comparisons :

LinearRegressor RMSE : 370.46
DecisionTreeRegressor RMSE : 302.53
RandomForestRegressor RMSE : 290.02
XGBoostRegressor RMSE : 242.72
GradientBoostingRegressor RMSE : 248.18

Temperature is at the top of the feature importance list for all the regressors except XGBoost.
XGBoost behaves differently from the other regressors, as it considers whether it is winter and whether it is a functioning day. Although winter is itself largely a function of temperature, this approach appears to give XGBoost better results.
XGBoostRegressor has the lowest root mean squared error, so it can be considered the best model for the given problem.

XGBoost gives the best accuracy, at around 82%.
