## Introduction
Bike-sharing systems are a means of renting bicycles in which membership, rental, and bike return are automated through a network of kiosk locations across a city. With these systems, people can rent a bike at one location and return it at another on an as-needed basis.


In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import datasets



# You can configure the format of the images: ‘png’, ‘retina’, ‘jpeg’, ‘svg’, ‘pdf’.
%config InlineBackend.figure_format = 'svg'
# this statement allows the visuals to render within your Jupyter Notebook
%matplotlib inline 



In [11]:
df = pd.read_csv("C:/Users/الوعد للحاسبات/Downloads/SeoulBikeData.csv")
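
The `�` characters that appear in the temperature column names suggest that the degree sign was not decoded correctly when the file was read. A minimal sketch of one possible fix, assuming the file is stored in a single-byte encoding such as Latin-1 (this is an assumption, not something the notebook confirms). The sample below is built in memory and only stands in for the real file:

```python
import io
import pandas as pd

# Hypothetical two-column sample standing in for SeoulBikeData.csv,
# encoded as Latin-1 so the degree sign is the single byte 0xB0.
raw = "Hour,Temperature(°C)\n0,-5.2\n1,-5.5\n".encode("latin-1")

# Passing the encoding explicitly lets pandas decode the degree sign.
sample = pd.read_csv(io.BytesIO(raw), encoding="latin-1")
print(sample.columns.tolist())
```

If this approach were adopted for the real file, the column names would contain `°` instead of `�`, so any later cell that selects those columns by name would have to use the same spelling.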

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8760 entries, 0 to 8759
Data columns (total 14 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   Date                       8760 non-null   object 
 1   Rented Bike Count          8760 non-null   int64  
 2   Hour                       8760 non-null   int64  
 3   Temperature(�C)            8760 non-null   float64
 4   Humidity(%)                8760 non-null   int64  
 5   Wind speed (m/s)           8760 non-null   float64
 6   Visibility (10m)           8760 non-null   int64  
 7   Dew point temperature(�C)  8760 non-null   float64
 8   Solar Radiation (MJ/m2)    8760 non-null   float64
 9   Rainfall(mm)               8760 non-null   float64
 10  Snowfall (cm)              8760 non-null   float64
 11  Seasons                    8760 non-null   object 
 12  Holiday                    8760 non-null   object 
 13  Functioning Day            8760 non-null   object 

### Dataset Description
The dataset contains:<br><b>8760 rows</b><br><b>14 columns</b><br>Attribute information:<br>
Date - day/month/year<br>
Rented Bike Count - Count of bikes rented at each hour<br>
Hour - Hour of the day<br>
Temperature - Celsius<br>
Humidity - %<br>
Wind speed - m/s<br>
Visibility - 10m<br>
Dew point temperature - Celsius<br>
Solar radiation - MJ/m2<br>
Rainfall - mm<br>
Snowfall - cm<br>
Seasons - Winter, Spring, Summer, Autumn<br>
Holiday - Holiday/No holiday<br>
Functioning Day - No (non-functional hours), Yes (functional hours)<br>

In [4]:
df.head()

Unnamed: 0,Date,Rented Bike Count,Hour,Temperature(�C),Humidity(%),Wind speed (m/s),Visibility (10m),Dew point temperature(�C),Solar Radiation (MJ/m2),Rainfall(mm),Snowfall (cm),Seasons,Holiday,Functioning Day
0,01/12/2017,254,0,-5.2,37,2.2,2000,-17.6,0.0,0.0,0.0,Winter,No Holiday,Yes
1,01/12/2017,204,1,-5.5,38,0.8,2000,-17.6,0.0,0.0,0.0,Winter,No Holiday,Yes
2,01/12/2017,173,2,-6.0,39,1.0,2000,-17.7,0.0,0.0,0.0,Winter,No Holiday,Yes
3,01/12/2017,107,3,-6.2,40,0.9,2000,-17.6,0.0,0.0,0.0,Winter,No Holiday,Yes
4,01/12/2017,78,4,-6.0,36,2.3,2000,-18.6,0.0,0.0,0.0,Winter,No Holiday,Yes
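
The `Date` column is loaded as plain strings (dtype `object`), and the values shown above are in day/month/year order. A small sketch of how such strings could be parsed into real dates; the two sample values are copied from the first and last dates visible in this notebook, and the variable names are illustrative only:

```python
import pandas as pd

# Small stand-in for the `Date` column (day/month/year strings).
dates = pd.Series(["01/12/2017", "30/11/2018"], name="Date")

# An explicit format avoids the day/month ambiguity of "01/12/2017".
parsed = pd.to_datetime(dates, format="%d/%m/%Y")
print(parsed.dt.month.tolist())
```

Once parsed, attributes such as `.dt.month` make per-month summaries straightforward.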


# Questions

### How do the temperatures change across the seasons? What are the mean and median temperatures?

### Is there a correlation between the temp/atemp/mean.temp.atemp and the total count of bike rentals?

### Is there a difference between the real temperature and the felt temperature? If so, does the difference persist across the seasons?

### Is temperature associated with bike rentals (registered vs. casual)?

### Can the number of total bike rentals be predicted by holiday and weather?

### What are the mean temperature, humidity, windspeed and total rentals per month?

### What percentage of days are suitable for biking under each of the following weather conditions?

- Temperature > 5°, weather situation 1-3, windspeed < 40 km/h
- Temperature > 10°, weather situation 1-2, windspeed < 20 km/h




## Encoding Categorical data:

In [12]:
ohe1=pd.get_dummies(df['Seasons'])
df = df.drop(['Seasons'], axis=1)
df = pd.merge(df, ohe1, right_index=True, left_index=True)

In [13]:
ohe2=pd.get_dummies(df['Functioning Day'])
df = df.drop(['Functioning Day'], axis=1)
df = pd.merge(df, ohe2, right_index=True, left_index=True)

In [14]:
ohe3=pd.get_dummies(df['Holiday'])
df = df.drop(['Holiday'], axis=1)
df = pd.merge(df, ohe3, right_index=True, left_index=True)
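
The three cells above repeat the same encode, drop, and merge pattern. As a hedged alternative, `pd.get_dummies` can encode several columns in one call through its `columns` argument. The sketch below uses a toy frame with made-up rows, not the real data:

```python
import pandas as pd

# Toy frame with the three categorical columns from the dataset.
toy = pd.DataFrame({
    "Hour": [0, 1],
    "Seasons": ["Winter", "Autumn"],
    "Holiday": ["No Holiday", "Holiday"],
    "Functioning Day": ["Yes", "No"],
})

# One call encodes all three columns; the column-name prefix keeps
# the indicators distinguishable (e.g. "Functioning Day_Yes").
encoded = pd.get_dummies(toy, columns=["Seasons", "Holiday", "Functioning Day"])
print(encoded.columns.tolist())
```

Note that this variant produces prefixed names such as `Functioning Day_Yes` rather than the bare `Yes` and `No` columns created above, so it is not a drop-in replacement for cells that refer to the unprefixed names.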

In [15]:
df

Unnamed: 0,Date,Rented Bike Count,Hour,Temperature(�C),Humidity(%),Wind speed (m/s),Visibility (10m),Dew point temperature(�C),Solar Radiation (MJ/m2),Rainfall(mm),Snowfall (cm),Autumn,Spring,Summer,Winter,No,Yes,Holiday,No Holiday
0,01/12/2017,254,0,-5.2,37,2.2,2000,-17.6,0.0,0.0,0.0,0,0,0,1,0,1,0,1
1,01/12/2017,204,1,-5.5,38,0.8,2000,-17.6,0.0,0.0,0.0,0,0,0,1,0,1,0,1
2,01/12/2017,173,2,-6.0,39,1.0,2000,-17.7,0.0,0.0,0.0,0,0,0,1,0,1,0,1
3,01/12/2017,107,3,-6.2,40,0.9,2000,-17.6,0.0,0.0,0.0,0,0,0,1,0,1,0,1
4,01/12/2017,78,4,-6.0,36,2.3,2000,-18.6,0.0,0.0,0.0,0,0,0,1,0,1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8755,30/11/2018,1003,19,4.2,34,2.6,1894,-10.3,0.0,0.0,0.0,1,0,0,0,0,1,0,1
8756,30/11/2018,764,20,3.4,37,2.3,2000,-9.9,0.0,0.0,0.0,1,0,0,0,0,1,0,1
8757,30/11/2018,694,21,2.6,39,0.3,1968,-9.9,0.0,0.0,0.0,1,0,0,0,0,1,0,1
8758,30/11/2018,712,22,2.1,41,1.0,1859,-9.8,0.0,0.0,0.0,1,0,0,0,0,1,0,1


## Extracting independent variable:
First, I move the target column (`Rented Bike Count`) to the end of the DataFrame.

In [None]:
# Columns added by the one-hot encoding: Autumn, Spring, Summer, Winter, No, Yes, Holiday, No Holiday

In [17]:
# Reorder the columns so that the target, 'Rented Bike Count', comes last.
df = df[['Date', 'Hour', 'Temperature(�C)', 'Humidity(%)',
         'Wind speed (m/s)', 'Visibility (10m)', 'Dew point temperature(�C)',
         'Solar Radiation (MJ/m2)', 'Rainfall(mm)', 'Snowfall (cm)', 'Autumn',
         'Spring', 'Summer', 'Winter', 'No', 'Yes', 'Holiday', 'No Holiday', 'Rented Bike Count']]

In [18]:
x= df.iloc[:,:-1].values  

In [19]:
x

array([['01/12/2017', 0, -5.2, ..., 1, 0, 1],
       ['01/12/2017', 1, -5.5, ..., 1, 0, 1],
       ['01/12/2017', 2, -6.0, ..., 1, 0, 1],
       ...,
       ['30/11/2018', 21, 2.6, ..., 1, 0, 1],
       ['30/11/2018', 22, 2.1, ..., 1, 0, 1],
       ['30/11/2018', 23, 1.9, ..., 1, 0, 1]], dtype=object)
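
The array above has `dtype=object` because the `Date` strings are stored alongside the numeric columns. One possible way to obtain a purely numeric feature matrix is to replace `Date` with derived calendar features. This is a sketch on toy rows, and the `Month` and `Weekday` column names are hypothetical additions, not part of the dataset:

```python
import pandas as pd

# Toy rows mirroring a few of the notebook's columns (values are illustrative).
toy = pd.DataFrame({
    "Date": ["01/12/2017", "30/11/2018"],
    "Hour": [0, 23],
    "Humidity(%)": [37, 40],
    "Rented Bike Count": [254, 584],
})

# Derive numeric calendar features, then drop the raw string column
# and the target column.
parsed = pd.to_datetime(toy["Date"], format="%d/%m/%Y")
toy["Month"] = parsed.dt.month
toy["Weekday"] = parsed.dt.dayofweek  # Monday = 0
features = toy.drop(columns=["Date", "Rented Bike Count"])

x_numeric = features.to_numpy(dtype=float)
print(x_numeric.dtype, x_numeric.shape)
```

A float array like this can be passed directly to most numeric routines, whereas an `object` array usually cannot.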

## Extracting dependent variable:

In [20]:
y= df.iloc[:,-1].values  

In [21]:
y

array([254, 204, 173, ..., 694, 712, 584], dtype=int64)

### Analyses
1. How do the temperatures change across the seasons? What are the mean and median temperatures?

We then create a histogram of the temperatures for each season, with lines marking the mean and median temperatures.
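
A per-season histogram with mean and median lines could be sketched as follows. The helper name `plot_season_histograms` and the toy values are hypothetical, and the function expects a frame that still has the original `Seasons` column, that is, one taken before the one-hot encoding step drops it:

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_season_histograms(frame, temp_col, season_col="Seasons"):
    """Draw one temperature histogram per season with mean/median lines."""
    seasons = list(frame[season_col].unique())
    fig, axes = plt.subplots(1, len(seasons),
                             figsize=(4 * len(seasons), 3), squeeze=False)
    for ax, season in zip(axes[0], seasons):
        temps = frame.loc[frame[season_col] == season, temp_col]
        ax.hist(temps, bins=10)
        ax.axvline(temps.mean(), color="red", label="mean")
        ax.axvline(temps.median(), color="black", linestyle="--", label="median")
        ax.set_title(season)
        ax.legend()
    return fig

# Made-up values for illustration only, not taken from the dataset.
toy = pd.DataFrame({
    "Seasons": ["Winter", "Winter", "Winter", "Summer", "Summer", "Summer"],
    "Temp": [-5.0, -2.0, 1.0, 24.0, 27.0, 33.0],
})
fig = plot_season_histograms(toy, temp_col="Temp")
```

For the real data, `temp_col` would be the temperature column name exactly as it appears in the loaded DataFrame.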

2. Is there a correlation between the temp/atemp/mean.temp.atemp and the total count of bike rentals?
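
As a starting point for this question, the Pearson correlation between a temperature column and the rental count can be computed with `Series.corr`. The sketch below uses made-up values and a simplified `Temperature` column name, both of which are assumptions for illustration:

```python
import pandas as pd

# Made-up hourly values for illustration only.
toy = pd.DataFrame({
    "Temperature": [-5.0, 0.0, 10.0, 20.0, 30.0],
    "Rented Bike Count": [100, 150, 400, 800, 900],
})

# Pearson correlation coefficient between the two columns (between -1 and 1).
r = toy["Temperature"].corr(toy["Rented Bike Count"])
print(round(r, 3))
```

A value close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear association; it says nothing about causation.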



In [None]:
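# The DataFrame has no `Temp` column; use the temperature column as it is named in df.
plt.hist(df['Temperature(�C)']);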

In [None]:
#sns.barplot(x = 'target',y='sepal length (cm)',data=data);