# Bike Sharing 

 <p align = "center">
<img src = "https://www.anylogic.com/upload/blog/c67/c671ba340460d0018a250d6c17fe8ad1.jpg">
</p>
<p align = "center">
Fig - <a href="https://www.anylogic.com/upload/blog/c67/c671ba340460d0018a250d6c17fe8ad1.jpg">Image source.</a>
</p>

#### **The aim of this post is to answer the following questions:**

1. Is there a relation between count of rents and temperature?
2. Is there a relation between count of rents and humidity?
3. How does the weather affect renting a bike?
4. In which season does the number of rents increase?


In [None]:
import json
import pandas as pd
import zipfile
import seaborn as sns
import matplotlib.pyplot as plt


## Load data from Kaggle

In [None]:

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


## Load dataset
[Bike Sharing in Washington D.C. Dataset](https://www.kaggle.com/marklvl/bike-sharing-dataset?select=hour.csv)

In [None]:
df = pd.read_csv('../input/bike-sharing-dataset/day.csv') 

## Exploratory Data Analysis (EDA)

In [None]:
df.head()

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
# convert the 'dteday' column to datetime format
df['dteday']= pd.to_datetime(df['dteday'])
 
# Check the format of 'dteday' column
df.info()

In [None]:
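# plot the correlation matrix as a heatmap
fig, ax = plt.subplots(figsize=(16,9)) # create figure with figsize in inches
# keep numeric columns only: 'dteday' is now datetime and cannot be correlated
sns.heatmap(df.select_dtypes(include='number').corr(), annot=True, linewidths=.8, ax=ax);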


##### From the heatmap above, we can summarize that:
- **`casual`** has a strong negative correlation with **`workingday`** and a strong positive correlation with **`cnt`**.
- There is a negative correlation between **`workingday`** and **`holiday`**, as expected.
- The relation between **`temp`** and **`cnt`** is positive.
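
To read such relations off as numbers instead of colors, the correlations with `cnt` can be ranked directly. The snippet below is only a sketch: `toy` is a small made-up frame that borrows the column names of `day.csv`, so the printed values say nothing about the real data. In the notebook, the same two lines would be applied to `df`.

```python
import pandas as pd

# Hypothetical toy data with the same column names as day.csv.
# The values are invented for illustration; use the notebook's `df` instead.
toy = pd.DataFrame({
    "temp": [0.20, 0.35, 0.50, 0.65, 0.80],
    "hum": [0.80, 0.60, 0.70, 0.50, 0.55],
    "workingday": [1, 0, 1, 1, 0],
    "cnt": [1000, 2500, 4000, 5500, 7000],
})

# Pearson correlation of every numeric column with the rental count
corr_with_cnt = toy.select_dtypes(include="number").corr()["cnt"].drop("cnt")

# Order the columns by absolute strength, strongest first
order = corr_with_cnt.abs().sort_values(ascending=False).index
ranked = corr_with_cnt.reindex(order)
print(ranked)
```

The sign tells the direction of the relation and the absolute value its strength, which is the same information the heatmap encodes in color.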

#### **Is there a relation between count of rents and temperature?**

In [None]:
fig, ax = plt.subplots(figsize=(8,6)) # create figure with size (8,6)

g1 = sns.scatterplot(data = df, x='cnt',y='temp',hue='season') # create scatter plot 
sns.despine(top=True, right=True, left=False, bottom=False) # remove border
plt.legend(title='Season', loc='best', labels=['Spring', 'Summer', 'Fall', 'Winter'], frameon=False) # add legend labels

# axis labels
plt.xlabel("Count of rents")
plt.ylabel("Normalized temperature")

# plot title 
# add the title, title size, and the distance between title and plot
plt.title("The relation between count of rents and temperature", size=14, y=1.12) 

plt.show() # show the scatter plot (plt.show takes no plot argument)

#### **Is there a relation between count of rents and humidity?**

In [None]:
fig, ax = plt.subplots(figsize=(8,6)) # create figure with size (8,6)

g2 = sns.scatterplot(data = df, x='cnt',y='hum',hue='season') # create scatter plot 
sns.despine(top=True, right=True, left=False, bottom=False) # remove border
plt.legend(title='Season', loc='best', labels=['Spring', 'Summer', 'Fall', 'Winter'], frameon=False) # add legend labels

# axis labels
plt.xlabel("Count of rents")
plt.ylabel("Normalized humidity")

# plot title 
# add the title, title size, and the distance between title and plot
plt.title("The relation between count of rents and humidity", size=14, y=1.12) 

plt.show() # show the scatter plot (plt.show takes no plot argument)

### **How does the weather affect renting a bike?**


##### **Weathersit Classes**
1. Clear, Few clouds, Partly cloudy
2. Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
3. Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
4. Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog


In [None]:
fig, ax = plt.subplots(figsize=(14,5)) # create figure with figsize in inches
g3 = sns.lineplot(data=df, x="dteday", y="cnt",hue="weathersit");

# axis labels
plt.xlabel("Date")
plt.ylabel("Count of rents")

# plot title 
# add the title, title size, and the distance between title and plot
plt.title("The distribution of count of rents over the two years", size=14, y=1.12)
plt.show()



**As the line plot shows, most users rent a bike when `weathersit` is class 1 (the weather is clear, with few clouds, or partly cloudy).**
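
The same observation can be checked with a simple aggregation per weather class. This is a minimal sketch on invented rows: `toy` and its values are hypothetical and only reuse the `weathersit` and `cnt` column names. Running the `groupby` line on the notebook's `df` would give the real figures.

```python
import pandas as pd

# Hypothetical toy rows with the same column names as day.csv.
# Replace `toy` with the notebook's `df` to get the real numbers.
toy = pd.DataFrame({
    "weathersit": [1, 1, 1, 2, 2, 3],
    "cnt": [5000, 6000, 4000, 3500, 2500, 1000],
})

# Number of days, average daily rents and total rents per weather class
summary = toy.groupby("weathersit")["cnt"].agg(["count", "mean", "sum"])
print(summary)
```

Looking at `count` next to `mean` helps separate two effects: a class can dominate the totals simply because it covers more days, not only because each of its days has more rents.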


### **In which season does the number of rents increase?**

**Note:** 1: spring, 2: summer, 3: fall, 4: winter
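
Instead of relabelling ticks and legends by hand, the season codes can be mapped to names once, in a new column. The sketch below assumes the coding given in the note. `SEASON_NAMES`, `season_name` and the `toy` frame are names made up for this illustration.

```python
import pandas as pd

# Code-to-name mapping taken from the note (hypothetical constant name)
SEASON_NAMES = {1: "Spring", 2: "Summer", 3: "Fall", 4: "Winter"}

# Hypothetical toy rows; in the notebook this would be `df`
toy = pd.DataFrame({
    "season": [1, 2, 3, 4, 3],
    "cnt": [1500, 3500, 4500, 3000, 5000],
})

# Add a readable label column next to the numeric code
toy["season_name"] = toy["season"].map(SEASON_NAMES)
print(toy)
```

Passing `season_name` as `x` or `hue` to a seaborn plot should then produce the readable labels without a manual `labels=[...]` list, which also removes the risk of the names and the codes getting out of order.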

In [None]:
g4 = sns.barplot(data=df,x="season",  y="cnt", hue="yr", palette="rocket",  ci=None)
sns.despine(top=True, right=True, left=False, bottom=False) # remove border

plt.ylabel('Count of rents')
plt.xlabel('Season')
plt.title("Count of rents per season by year", size=14, y=1.12)

plt.xticks([0,1,2,3],['Spring', 'Summer', 'Fall', 'Winter'])

plt.legend(title='Year', loc='best', labels=['2011', '2012'], frameon=False)
plt.show()

As the figure shows, the number of rents peaks in fall.
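
By default `sns.barplot` draws the mean of `y` for each group, so the bar heights can be reproduced as a small table. The snippet is a sketch on invented numbers: `toy` is hypothetical and only shares the `season`, `yr` and `cnt` column names with the dataset.

```python
import pandas as pd

# Hypothetical toy rows (yr: 0 = 2011, 1 = 2012); values are invented.
toy = pd.DataFrame({
    "season": [1, 1, 2, 3, 3, 4, 1, 2, 2, 3, 4, 4],
    "yr":     [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "cnt":    [1000, 2000, 3500, 4000, 5000, 3000,
               3000, 5500, 6500, 7000, 4000, 5000],
})

# Mean daily rents per season (rows) and year (columns)
season_means = toy.groupby(["season", "yr"])["cnt"].mean().unstack("yr")
print(season_means)

# Season code with the highest mean in each year
best_season = season_means.idxmax()
print(best_season)
```

On the real `df`, `best_season` would state for each year which season code has the tallest bar, which is a quick way to confirm what the figure suggests.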

### All plots

In [None]:
# Create a 2x2 grid of subplots (2 rows, 2 columns)

f, axes = plt.subplots(2, 2, figsize=(16,10))

# plot 1:

g1 = sns.scatterplot(data = df, x='cnt',y='temp',hue='season'  , ax=axes[0,0]) # create scatter plot 
sns.despine(top=True, right=True, left=False, bottom=False) # remove border
axes[0,0].legend(title='Season', loc='best', labels=['Spring', 'Summer', 'Fall', 'Winter'], frameon=False) # add legend labels

# axis labels
axes[0,0].set_xlabel("Count of rents")
axes[0,0].set_ylabel("Normalized temperature")

# plot title 
# add the title, title size, and the distance between title and plot
axes[0,0].set_title("The relation between count of rents and temperature", size=14, y=1.12) 


# plot 2:

 
g2 = sns.scatterplot(data = df, x='cnt',y='hum',hue='season', ax=axes[0,1]) # create scatter plot 
sns.despine(top=True, right=True, left=False, bottom=False) # remove border
axes[0,1].legend(title='Season', loc='best', labels=['Spring', 'Summer', 'Fall', 'Winter'], frameon=False) # add legend labels

# axis labels
axes[0,1].set_xlabel("Count of rents")
axes[0,1].set_ylabel("Normalized humidity")

# plot title 
# add the title, title size, and the distance between title and plot
axes[0,1].set_title("The relation between count of rents and humidity", size=14, y=1.12) 

#plt.show(g2) # show the scatter plot

# plot 3:

g3 = sns.lineplot(data=df, x="dteday", y="cnt",hue="weathersit", ax=axes[1,0]);

# axis labels
axes[1,0].set_xlabel("Date")
axes[1,0].set_ylabel("Count of rents")

# plot title 
# add the title, title size, and the distance between title and plot
axes[1,0].set_title("The distribution of count of rents over the two years", size=14, y=1.12)


# plot 4:


g4 = sns.barplot(data=df,x="season",  y="cnt", hue="yr", palette="rocket",  ci=None, ax=axes[1,1])
sns.despine(top=True, right=True, left=False, bottom=False) # remove border

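# set labels, ticks and legend on the bar plot's own axes rather than via pyplot's current axes
axes[1,1].set_ylabel('Count of rents')
axes[1,1].set_xlabel('Season')
axes[1,1].set_title("Count of rents per season by year", size=14, y=1.12)

axes[1,1].set_xticks([0,1,2,3])
axes[1,1].set_xticklabels(['Spring', 'Summer', 'Fall', 'Winter'])

axes[1,1].legend(title='Year', loc='best', labels=['2011', '2012'], frameon=False);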



plt.tight_layout() # adjust spacing between plots to prevent label overlap
plt.show()