## Table of contents
- [Load and view raw data](#load-data)
    - [Dataset characteristics](#data-characteristics)
- [Data visualization](#Data-visualization)
    

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split

# Load and view raw data <a name="load-data" />

In [2]:
raw_data_set_path="../dataset/raw/bike-sharing-patterns.csv"


In [3]:
raw_data_df=pd.read_csv(raw_data_set_path)

In [4]:
raw_data_df.head()

Unnamed: 0,instant,dteday,season,yr,mnth,hr,holiday,weekday,workingday,weathersit,temp,atemp,hum,windspeed,casual,registered,cnt
0,1,2011-01-01,1,0,1,0,0,6,0,1,0.24,0.2879,0.81,0.0,3,13,16
1,2,2011-01-01,1,0,1,1,0,6,0,1,0.22,0.2727,0.8,0.0,8,32,40
2,3,2011-01-01,1,0,1,2,0,6,0,1,0.22,0.2727,0.8,0.0,5,27,32
3,4,2011-01-01,1,0,1,3,0,6,0,1,0.24,0.2879,0.75,0.0,3,10,13
4,5,2011-01-01,1,0,1,4,0,6,0,1,0.24,0.2879,0.75,0.0,0,1,1


In [18]:
raw_data_df.describe().loc[['min','max',]].iloc[:,1:9]

Unnamed: 0,season,yr,mnth,hr,holiday,weekday,workingday,weathersit
min,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0
max,4.0,1.0,12.0,23.0,1.0,6.0,1.0,4.0


In [None]:
raw_data_df.describe().loc[['min','max','mean','std']].iloc[:,8:]

### Dataset characteristics <a name="data-characteristics" />
  - `instant`: record index
  - `dteday`: date
  - `season`: season (1: spring, 2: summer, 3: fall, 4: winter)
  - `yr`: year (0 for 2011, 1 for 2012)
  - `mnth`: month (1 to 12)
  - `hr`: hour (0 to 23)
  - `holiday`: whether the day is a holiday (0: not a holiday, 1: holiday)
  - `weekday`: day of the week, from 0 (Sunday) to 6 (Saturday). For example, 2011-01-01 was a Saturday and has `weekday` 6 in the rows above.
  - `workingday`: 1 if the day is neither a weekend nor a holiday, otherwise 0
  - `weathersit`: weather situation
    1. Clear, Few clouds, Partly cloudy
    2. Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    3. Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    4. Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
  - `temp`: normalized temperature in Celsius; the values are divided by 41 (max)
  - `atemp`: normalized feels-like temperature in Celsius; the values are divided by 50 (max)
  - `hum`: normalized humidity; the values are divided by 100 (max)
  - `windspeed`: normalized wind speed; the values are divided by 67 (max)
  - `casual`: count of casual users
  - `registered`: count of registered users
  - `cnt`: count of total rental bikes including both casual and registered
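
The column conventions listed above can be checked against the data itself. The following is a minimal sketch, not part of the original analysis: `check_weekday_encoding` and `rescale_weather` are hypothetical helper names, the first tests whether `weekday` is 0 for Sunday through 6 for Saturday, and the second assumes the four weather columns were normalized by plain division, using the scale factors quoted in the list.

```python
import pandas as pd

# Scale factors quoted in the column descriptions (assumption: plain division by the max).
SCALES = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}


def check_weekday_encoding(df: pd.DataFrame) -> bool:
    """Return True if `weekday` is 0 for Sunday through 6 for Saturday on every row."""
    # pandas numbers days Monday=0 ... Sunday=6, so shift by one and wrap around.
    expected = (pd.to_datetime(df["dteday"]).dt.dayofweek + 1) % 7
    return bool((expected == df["weekday"]).all())


def rescale_weather(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with the normalized weather columns multiplied back to their original scale."""
    out = df.copy()
    for col, scale in SCALES.items():
        out[col] = out[col] * scale
    return out


# One-row sample copied from the first row of the `head()` output.
sample = pd.DataFrame({
    "dteday": ["2011-01-01"], "weekday": [6],
    "temp": [0.24], "atemp": [0.2879], "hum": [0.81], "windspeed": [0.0],
})
weekday_ok = check_weekday_encoding(sample)
rescaled = rescale_weather(sample)
```

On the full data, `check_weekday_encoding(raw_data_df)` and `rescale_weather(raw_data_df)[list(SCALES)].describe()` would run the same checks on every row.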

In [None]:
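# dteday is a string column, so restrict the correlation matrix to numeric columns
raw_data_df.select_dtypes(include='number').corr()['cnt']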

- Bike-sharing rentals are closely tied to environmental and seasonal conditions: weather, precipitation, day of the week, season, and hour of the day can all affect rental behavior.
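
To see which columns move most closely with `cnt`, the correlations can be ranked by absolute value. This is a minimal sketch under two assumptions: `rank_correlations` is a hypothetical helper, and `casual` and `registered` are excluded because, per the column descriptions, they add up to `cnt`. Pearson correlation only measures linear association, so a column with a strongly non-linear effect may rank lower than its real influence suggests.

```python
import pandas as pd


def rank_correlations(df, target="cnt", exclude=("casual", "registered")):
    """Rank numeric columns by absolute Pearson correlation with `target`."""
    numeric = df.select_dtypes(include="number").drop(columns=list(exclude), errors="ignore")
    corr = numeric.corr()[target].drop(target)
    # Order by magnitude so strong negative correlations are not pushed to the bottom.
    return corr.reindex(corr.abs().sort_values(ascending=False).index)


# Synthetic data for illustration only (not taken from the dataset).
demo = pd.DataFrame({
    "temp": [0.1, 0.2, 0.3, 0.4, 0.5],
    "hum": [0.9, 0.7, 0.8, 0.5, 0.6],
    "casual": [1, 2, 3, 4, 5],
    "registered": [9, 18, 27, 36, 45],
    "cnt": [10, 20, 30, 40, 50],
})
ranking = rank_correlations(demo)
```

On the real data the call would be `rank_correlations(raw_data_df)`.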

# Data visualization <a name="Data-visualization"/>
- Show the relation between the rental count (`cnt`) and the other columns.

In [None]:
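categorical_cols = ['season', 'mnth', 'holiday', 'hr', 'weekday', 'workingday', 'weathersit']
fig, ax = plt.subplots(1, len(categorical_cols), figsize=(35, 7))

# One bar chart per categorical column: total rentals (cnt) for each category value
for axis, col in zip(ax, categorical_cols):
    # Select cnt before summing so non-numeric columns such as dteday are never aggregated
    data = raw_data_df.groupby(col)['cnt'].sum()
    axis.bar(data.index.tolist(), data.tolist(), width=0.5)
    axis.set_xlabel(col)
    axis.set_ylabel('total cnt')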



- Rentals peak from 8 to 10 (going to work) and from 16 to 20 (returning from work).
- `weathersit`: most rentals happen in category 1 weather (Clear, Few clouds, Partly cloudy).
- Most rentals also fall on non-holidays.
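
The commute-hour reading of these observations can be checked by averaging rentals per hour separately for working and non-working days. This is a minimal sketch: `hourly_profile` is a hypothetical helper, and it uses the mean rather than the sum so that the two groups are comparable even though they contain different numbers of days.

```python
import pandas as pd


def hourly_profile(df):
    """Mean `cnt` per hour, with one column per `workingday` value (0 and 1)."""
    return df.pivot_table(index="hr", columns="workingday", values="cnt", aggfunc="mean")


# Synthetic rows for illustration only (not taken from the dataset).
demo = pd.DataFrame({
    "hr":         [8, 8, 8, 13, 13, 13],
    "workingday": [1, 1, 0, 1, 1, 0],
    "cnt":        [300, 500, 60, 150, 170, 250],
})
profile = hourly_profile(demo)
busiest_workday_hour = profile[1].idxmax()
```

On the real data, `hourly_profile(raw_data_df).plot()` would draw both curves on one axis.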

In [None]:
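cont_cols = ['temp', 'hum', 'windspeed']

fig, ax = plt.subplots(1, len(cont_cols), figsize=(25, 7))

# One line plot per continuous column: total rentals (cnt) at each observed value
for axis, col in zip(ax, cont_cols):
    # Select cnt before summing so non-numeric columns such as dteday are never aggregated
    data = raw_data_df.groupby(col)['cnt'].sum()
    axis.plot(data.index.tolist(), data.tolist())
    axis.set_xlabel(col)
    axis.set_ylabel('total cnt')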

- In the `temp` plot, rental counts are highest in moderate weather, neither cold nor hot.
- In the `windspeed` plot, rental counts fall as wind speed increases.
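
Because the plots use summed counts, a value that simply occurs more often in the data will show a higher total. One way to separate the two effects is to average `cnt` within bins of the continuous column. This is a minimal sketch: `binned_mean` is a hypothetical helper and the number of bins is an arbitrary choice.

```python
import pandas as pd


def binned_mean(df, col, target="cnt", bins=10):
    """Mean of `target` within equal-width bins of `col`."""
    binned = pd.cut(df[col], bins=bins)
    # observed=True keeps only the bins that actually contain rows.
    return df.groupby(binned, observed=True)[target].mean()


# Synthetic rows for illustration only (not taken from the dataset).
demo = pd.DataFrame({
    "windspeed": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "cnt":       [200, 220, 150, 130, 60, 40],
})
by_wind = binned_mean(demo, "windspeed", bins=3)
```

On the real data, `binned_mean(raw_data_df, 'temp')` and `binned_mean(raw_data_df, 'windspeed')` would give the per-bin averages to compare with the summed plots.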

In [None]:
# Correlation between the two temperature columns only
raw_data_df['temp'].corr(raw_data_df['atemp'])

- We can keep just one of `temp` and `atemp`, since the two are strongly correlated and carry almost identical information.
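
Dropping one column of a strongly correlated pair could look like the sketch below. `drop_if_redundant` is a hypothetical helper and the 0.95 threshold is an assumed cut-off, not a value from this analysis.

```python
import pandas as pd


def drop_if_redundant(df, keep, candidate, threshold=0.95):
    """Drop `candidate` when its absolute correlation with `keep` exceeds `threshold`."""
    corr = df[keep].corr(df[candidate])
    if abs(corr) > threshold:
        return df.drop(columns=[candidate])
    return df


# The (temp, atemp) pairs from the five rows shown by `head()`.
sample = pd.DataFrame({
    "temp":  [0.24, 0.22, 0.22, 0.24, 0.24],
    "atemp": [0.2879, 0.2727, 0.2727, 0.2879, 0.2879],
})
reduced = drop_if_redundant(sample, keep="temp", candidate="atemp")
```

The function returns a new frame and leaves its input unchanged, so `raw_data_df` stays available for comparison after a call such as `drop_if_redundant(raw_data_df, 'temp', 'atemp')`.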