### London Bike Sharing Data Analysis

Welcome to this Google Colab data analytics project on bike sharing in London. The project works with the "London Bike Sharing Dataset," which contains hourly bike-sharing records; the rows loaded below run from January 4, 2015, to January 3, 2017.

The primary objective of this project is to uncover patterns in the data. To do so, we use basic data manipulation techniques to inspect, clean, and prepare the data for analysis.

Throughout this Colab, we use Python and the data manipulation library Pandas to preprocess the data so that it is suitable for analysis.

As we work through the project, we inspect the column types, check for missing values, rename columns, rescale humidity, and replace the numeric weather and season codes with readable labels. The cleaned data is then used to visualize trends and to explore the factors that influence bike-sharing activity in London.

Finally, we build an interactive Tableau dashboard from the cleaned data:

https://public.tableau.com/app/profile/yash.bhardwaj1294/viz/LondonBikeRides-MovingAverageandHeatmap_16926900073190/Dashboard

# Importing Dependencies

Importing Pandas

In [None]:
import pandas as pd

Mounting the Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


# Data Cleaning and Manipulation

Importing the Data

In [None]:
# Path to the dataset on the mounted Drive
path = "/content/drive/MyDrive/data_analysis/london_merged.csv"

# Reading the CSV into a DataFrame named bikes
bikes = pd.read_csv(path)

### Checking the data

In [None]:
# .info() summarizes the columns, their non-null counts, and their datatypes
bikes.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17414 entries, 0 to 17413
Data columns (total 10 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   timestamp     17414 non-null  object 
 1   cnt           17414 non-null  int64  
 2   t1            17414 non-null  float64
 3   t2            17414 non-null  float64
 4   hum           17414 non-null  float64
 5   wind_speed    17414 non-null  float64
 6   weather_code  17414 non-null  float64
 7   is_holiday    17414 non-null  float64
 8   is_weekend    17414 non-null  float64
 9   season        17414 non-null  float64
dtypes: float64(8), int64(1), object(1)
memory usage: 1.3+ MB
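
The summary above shows `timestamp` stored as `object` (plain text) and no null entries in any column. As a minimal sketch, assuming one wants real datetime values before any time-based analysis, the snippet below parses the column and counts missing values. It runs on a small hand-built `sample` frame (a hypothetical stand-in, not the real file), and the notebook itself does not perform this conversion.

```python
import pandas as pd

# Hypothetical three-row sample mimicking two columns of london_merged.csv
sample = pd.DataFrame({
    "timestamp": ["2015-01-04 00:00:00", "2015-01-04 01:00:00", "2015-01-04 02:00:00"],
    "cnt": [182, 138, 134],
})

# Parse the text timestamps into datetime values
sample["timestamp"] = pd.to_datetime(sample["timestamp"])

# Count missing values per column (all zeros means there is nothing to fill or drop)
missing_per_column = sample.isna().sum()

print(sample.dtypes)
print(missing_per_column)
```

The same two calls could be applied to `bikes` directly if datetime handling is needed later.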


In [None]:
# .shape gives us the dimensions of the dataset as (rows, columns)
bikes.shape

(17414, 10)

In [None]:
# Printing the data in its original form
bikes

Unnamed: 0,timestamp,cnt,t1,t2,hum,wind_speed,weather_code,is_holiday,is_weekend,season
0,2015-01-04 00:00:00,182,3.0,2.0,93.0,6.0,3.0,0.0,1.0,3.0
1,2015-01-04 01:00:00,138,3.0,2.5,93.0,5.0,1.0,0.0,1.0,3.0
2,2015-01-04 02:00:00,134,2.5,2.5,96.5,0.0,1.0,0.0,1.0,3.0
3,2015-01-04 03:00:00,72,2.0,2.0,100.0,0.0,1.0,0.0,1.0,3.0
4,2015-01-04 04:00:00,47,2.0,0.0,93.0,6.5,1.0,0.0,1.0,3.0
...,...,...,...,...,...,...,...,...,...,...
17409,2017-01-03 19:00:00,1042,5.0,1.0,81.0,19.0,3.0,0.0,0.0,3.0
17410,2017-01-03 20:00:00,541,5.0,1.0,81.0,21.0,4.0,0.0,0.0,3.0
17411,2017-01-03 21:00:00,337,5.5,1.5,78.5,24.0,4.0,0.0,0.0,3.0
17412,2017-01-03 22:00:00,224,5.5,1.5,76.0,23.0,4.0,0.0,0.0,3.0


In [None]:
# Understanding the weather column
bikes.weather_code.value_counts()

1.0     6150
2.0     4034
3.0     3551
7.0     2141
4.0     1464
26.0      60
10.0      14
Name: weather_code, dtype: int64

In [None]:
# Understanding the season column
bikes.season.value_counts()

0.0    4394
1.0    4387
3.0    4330
2.0    4303
Name: season, dtype: int64

### Data manipulation

In [None]:
# Creating a dictionary that maps each original column name to a clearer one
new_column_dict ={
    'timestamp':'time',
    'cnt':'count',
    't1':'real_temp',
    't2':'temp_feel_like',
    'hum':'humidity_percentage',
    'wind_speed':'wind_speed',
    'weather_code':'weather',
    'is_holiday':'is_holiday',
    'is_weekend':'is_weekend',
    'season':'season',
}

# Renaming the original column headings using the dictionary
bikes.rename(new_column_dict, axis=1, inplace=True)
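
`rename` silently ignores dictionary keys that do not match any column, so a typo in a key would go unnoticed. The following is a minimal sketch, on a hypothetical `sample` frame with a subset of the columns, of checking the keys first and renaming into a new frame instead of modifying in place. The names `rename_map`, `unknown_keys`, and `renamed` are illustrative only.

```python
import pandas as pd

# Hypothetical one-row sample with a subset of the original column names
sample = pd.DataFrame({"timestamp": ["2015-01-04 00:00:00"], "cnt": [182], "t1": [3.0]})

# Subset of the renaming dictionary used above
rename_map = {"timestamp": "time", "cnt": "count", "t1": "real_temp"}

# Keys that match no existing column would be skipped by rename without an error
unknown_keys = [key for key in rename_map if key not in sample.columns]

# Return a renamed copy rather than changing the frame in place
renamed = sample.rename(columns=rename_map)

print(unknown_keys)
print(list(renamed.columns))
```

One side effect worth knowing: `count` is also the name of a DataFrame method, so after the rename `bikes.count` returns the method, and the column has to be reached as `bikes['count']`.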

In [None]:
# Checking the modified data for discrepancies
bikes

Unnamed: 0,time,count,real_temp,temp_feel_like,humidity_percentage,wind_speed,weather,is_holiday,is_weekend,season
0,2015-01-04 00:00:00,182,3.0,2.0,93.0,6.0,3.0,0.0,1.0,3.0
1,2015-01-04 01:00:00,138,3.0,2.5,93.0,5.0,1.0,0.0,1.0,3.0
2,2015-01-04 02:00:00,134,2.5,2.5,96.5,0.0,1.0,0.0,1.0,3.0
3,2015-01-04 03:00:00,72,2.0,2.0,100.0,0.0,1.0,0.0,1.0,3.0
4,2015-01-04 04:00:00,47,2.0,0.0,93.0,6.5,1.0,0.0,1.0,3.0
...,...,...,...,...,...,...,...,...,...,...
17409,2017-01-03 19:00:00,1042,5.0,1.0,81.0,19.0,3.0,0.0,0.0,3.0
17410,2017-01-03 20:00:00,541,5.0,1.0,81.0,21.0,4.0,0.0,0.0,3.0
17411,2017-01-03 21:00:00,337,5.5,1.5,78.5,24.0,4.0,0.0,0.0,3.0
17412,2017-01-03 22:00:00,224,5.5,1.5,76.0,23.0,4.0,0.0,0.0,3.0


In [None]:
# Converting the percentage into a value between 0 and 1
bikes.humidity_percentage = bikes.humidity_percentage/100
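
Because the cell above overwrites the column, running it a second time would divide by 100 again. A minimal sketch of one way around this, assuming it is acceptable to keep the original values, is to write the result to a separate column (here the hypothetical `humidity_fraction`) and confirm that it lies between 0 and 1:

```python
import pandas as pd

# Hypothetical sample of humidity readings in percent
sample = pd.DataFrame({"humidity_percentage": [93.0, 96.5, 100.0]})

# Writing to a new column keeps the source intact, so re-running gives the same result
sample["humidity_fraction"] = sample["humidity_percentage"] / 100

# Every rescaled value should fall in the closed interval [0, 1]
in_range = bool(sample["humidity_fraction"].between(0, 1).all())

print(sample)
print(in_range)
```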

In [None]:
# Checking the modified data for discrepancies
bikes

Unnamed: 0,time,count,real_temp,temp_feel_like,humidity_percentage,wind_speed,weather,is_holiday,is_weekend,season
0,2015-01-04 00:00:00,182,3.0,2.0,0.930,6.0,3.0,0.0,1.0,3.0
1,2015-01-04 01:00:00,138,3.0,2.5,0.930,5.0,1.0,0.0,1.0,3.0
2,2015-01-04 02:00:00,134,2.5,2.5,0.965,0.0,1.0,0.0,1.0,3.0
3,2015-01-04 03:00:00,72,2.0,2.0,1.000,0.0,1.0,0.0,1.0,3.0
4,2015-01-04 04:00:00,47,2.0,0.0,0.930,6.5,1.0,0.0,1.0,3.0
...,...,...,...,...,...,...,...,...,...,...
17409,2017-01-03 19:00:00,1042,5.0,1.0,0.810,19.0,3.0,0.0,0.0,3.0
17410,2017-01-03 20:00:00,541,5.0,1.0,0.810,21.0,4.0,0.0,0.0,3.0
17411,2017-01-03 21:00:00,337,5.5,1.5,0.785,24.0,4.0,0.0,0.0,3.0
17412,2017-01-03 22:00:00,224,5.5,1.5,0.760,23.0,4.0,0.0,0.0,3.0


In [None]:
# Creating a dictionary for the different seasons
season_dict = {
    '0.0':'spring',
    '1.0':'summer',
    '2.0':'autumn',
    '3.0':'winter',
}

# Creating a dictionary for the different weather codes
weather_dict ={
    '1.0':'Clear',
    '2.0':'Scattered Clouds',
    '3.0':'Broken Clouds',
    '4.0':'Cloudy',
    '7.0':'Rain',
    '10.0':'Rain with Thunderstorm',
    '26.0':'Snowfall',
}

# Converting the seasons column to a string datatype
bikes.season = bikes.season.astype('str')

# Replacing the original season codes with their labels
bikes.season = bikes.season.map(season_dict)

# Converting the weather column to a string datatype
bikes.weather = bikes.weather.astype('str')

# Replacing the original weather codes with their labels
bikes.weather = bikes.weather.map(weather_dict)
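
The string conversion above is needed only because the dictionary keys are strings. As an alternative sketch, assuming the columns are still `float64`, the dictionaries can use float keys and be passed to `map` directly. `map` returns `NaN` for any code missing from the dictionary, so counting `NaN` afterwards reveals unmapped codes. The `sample` frame and the code `99.0` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample of numeric codes; 99.0 deliberately has no label
sample = pd.DataFrame({"season": [3.0, 0.0, 1.0], "weather": [3.0, 1.0, 99.0]})

# Float keys match the float64 columns directly, so no astype('str') step is needed
season_labels = {0.0: "spring", 1.0: "summer", 2.0: "autumn", 3.0: "winter"}
weather_labels = {
    1.0: "Clear",
    2.0: "Scattered Clouds",
    3.0: "Broken Clouds",
    4.0: "Cloudy",
    7.0: "Rain",
    10.0: "Rain with Thunderstorm",
    26.0: "Snowfall",
}

sample["season"] = sample["season"].map(season_labels)
sample["weather"] = sample["weather"].map(weather_labels)

# Codes without a dictionary entry become NaN, which makes gaps easy to count
unmapped = int(sample["weather"].isna().sum())

print(sample)
print(unmapped)
```

On the real data, a count of zero for both columns would confirm that every code seen in `value_counts()` has a label.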

In [None]:
# Checking the modified data for discrepancies
# .head() prints only the first 5 rows of the dataset
bikes.head()

Unnamed: 0,time,count,real_temp,temp_feel_like,humidity_percentage,wind_speed,weather,is_holiday,is_weekend,season
0,2015-01-04 00:00:00,182,3.0,2.0,0.93,6.0,Broken Clouds,0.0,1.0,winter
1,2015-01-04 01:00:00,138,3.0,2.5,0.93,5.0,Clear,0.0,1.0,winter
2,2015-01-04 02:00:00,134,2.5,2.5,0.965,0.0,Clear,0.0,1.0,winter
3,2015-01-04 03:00:00,72,2.0,2.0,1.0,0.0,Clear,0.0,1.0,winter
4,2015-01-04 04:00:00,47,2.0,0.0,0.93,6.5,Clear,0.0,1.0,winter


# Exporting the modified data to an Excel Sheet

In [None]:
bikes.to_excel('Bikes_Final.xlsx', sheet_name='Data')