<a href="https://colab.research.google.com/github/SamuelWanjiru/Bike-sharing-forecast/blob/main/BikeSharing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#  **Bike Sharing Washington DC 🚲** 
---
## **Context**
Climate change is forcing cities to reimagine their transportation infrastructure. Shared mobility concepts such as car sharing, bike sharing, and scooter sharing are becoming increasingly popular.
If implemented well, they can contribute to mitigating climate change. Bike sharing is particularly interesting because this mode of transportation requires no electricity or gasoline (unless e-bikes are used). However, this type of shared mobility has inherent problems:
*   Varying demand at bike sharing stations needs to be balanced to avoid oversupply or shortages
*   Heavily used bikes break down more often

Forecasting future demand can help address these issues. Demand forecasts can also help operators decide whether to expand the business, set adequate prices, and generate additional income through advertisements at particularly busy stations.
Other challenges include redistributing bikes between stations and determining optimal routes. Choosing locations for new stations is also of interest to operators.

## **Content**
This dataset can be used to forecast demand in order to avoid oversupply and shortages. It spans January 1, 2011, to December 31, 2018. Determining new station locations, analyzing movement patterns, or planning routes would require additional data.

## **Connecting/mounting Google Drive**

In [1]:
# Mount Google Drive so that files under /content/gdrive are readable from Colab
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


## **Importing the relevant data analysis libraries**

In [4]:
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import seaborn as sb
import math
from scipy.stats import kruskal, pearsonr, randint, uniform, chi2_contingency, boxcox

### Loading the dataset from Google Drive

In [6]:
# Read the CSV from the mounted drive and parse the 'date' column as datetime
bike_data = pd.read_csv(
    r'/content/gdrive/My Drive/KAGGLE PROJECTS/Bike Sharing Washington DC/bike_sharing_dataset.csv',
    parse_dates=['date'],
)
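
To illustrate what `parse_dates=['date']` does at load time, here is a minimal sketch that reads a tiny in-memory CSV instead of the file on Google Drive. The three rows and the name `sample` are a stand-in chosen for illustration (an assumption, not the full dataset); only `date` and `total_cust` are included.

```python
import io

import pandas as pd

# Hypothetical stand-in for bike_sharing_dataset.csv: three rows, two columns.
csv_text = """date,total_cust
2011-01-01,959.0
2011-01-02,781.0
2011-01-03,1301.0
"""

# Same call pattern as the notebook, but reading from an in-memory buffer.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# With parse_dates, the column is datetime64 rather than plain strings,
# so min/max return Timestamps and date-based filtering works directly.
date_is_datetime = pd.api.types.is_datetime64_any_dtype(sample["date"])
first_day = sample["date"].min()
last_day = sample["date"].max()

print(date_is_datetime, first_day.date(), last_day.date())
```

Running the same `min()`/`max()` check on the real `bike_data` is a quick way to confirm that the loaded file covers the expected date range.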

### Understanding the data

In [7]:
# Display the first 5 rows of the bike dataset
bike_data.head()

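| Unnamed: 0 | date | temp_avg | temp_min | temp_max | temp_observ | precip | wind | wt_fog | wt_heavy_fog | wt_thunder | ... | wt_freeze_rain | wt_snow | wt_ground_fog | wt_ice_fog | wt_freeze_drizzle | wt_unknown | casual | registered | total_cust | holiday |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 | NaN | -1.566667 | 11.973333 | 2.772727 | 0.069333 | 2.575 | 1.0 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 330.0 | 629.0 | 959.0 | NaN |
| 1 | 2011-01-02 | NaN | 0.88 | 13.806667 | 7.327273 | 1.037349 | 3.925 | 1.0 | 1.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 130.0 | 651.0 | 781.0 | NaN |
| 2 | 2011-01-03 | NaN | -3.442857 | 7.464286 | -3.06 | 1.878824 | 3.625 | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 120.0 | 1181.0 | 1301.0 | NaN |
| 3 | 2011-01-04 | NaN | -5.957143 | 4.642857 | -3.1 | 0.0 | 1.8 | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 107.0 | 1429.0 | 1536.0 | NaN |
| 4 | 2011-01-05 | NaN | -4.293333 | 6.113333 | -1.772727 | 0.0 | 2.95 | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 82.0 | 1489.0 | 1571.0 | NaN |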
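
The first rows already show many empty cells, so a natural next step is to quantify missing values per column and to sanity-check the customer counts. The sketch below rebuilds a subset of the five displayed rows by hand; `head_sample` is a hypothetical stand-in for `bike_data`, and treating `total_cust` as the sum of `casual` and `registered` is an assumption that these five rows happen to satisfy.

```python
import numpy as np
import pandas as pd

# Subset of the five rows displayed above (a stand-in for the full bike_data).
head_sample = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-01-01", "2011-01-02", "2011-01-03", "2011-01-04", "2011-01-05",
    ]),
    "temp_avg": [np.nan] * 5,
    "wt_fog": [1.0, 1.0, np.nan, np.nan, np.nan],
    "wt_heavy_fog": [np.nan, 1.0, np.nan, np.nan, np.nan],
    "casual": [330.0, 130.0, 120.0, 107.0, 82.0],
    "registered": [629.0, 651.0, 1181.0, 1429.0, 1489.0],
    "total_cust": [959.0, 781.0, 1301.0, 1536.0, 1571.0],
    "holiday": [np.nan] * 5,
})

# Share of missing values per column, largest first.
missing_share = head_sample.isna().mean().sort_values(ascending=False)

# Check whether casual + registered matches total_cust in every row.
totals_consistent = bool(
    (head_sample["casual"] + head_sample["registered"])
    .eq(head_sample["total_cust"])
    .all()
)

print(missing_share)
print("casual + registered == total_cust:", totals_consistent)
```

On the full dataset, the same two expressions applied to `bike_data` would show which columns are mostly empty and whether the totals hold beyond these first rows.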
