# *Exploring the Influence of Weather on TTC Streetcar Delays and Forecasting Delays*

- Created on: October, 2023
- Created by: Jessica Seo

---------

## 🚂Loading Streetcar Data

### Notebook content

- Introduction
- Data Loading
- Data Cleaning
- Merging Streetcar dataset and Weather Dataset
- Data Saving


----------

### Introduction

This notebook loads three individual TTC Streetcar delay datasets and merges them into a single dataset. The data covers January 01, 2021 to September 30, 2023 and comes from the City of Toronto's Open Data Source. After merging, we create an hourly datetime column and use it to join the delay data with the Weather dataset.

____


### Data Loading

In [1]:
#importing necessary python libraries
import pandas as pd
import time

In [2]:
#reading df1 and checking
df1 = pd.read_csv('capstone/Streetcar_Delay_2021.csv')
df1

Unnamed: 0,Date,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Bound,Vehicle
0,1-Jan-21,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,W,4574
1,1-Jan-21,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,,4500
2,1-Jan-21,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,S,4589
3,1-Jan-21,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,W,4582
4,1-Jan-21,506,05:37,Friday,MAIN STATION,Mechanical,10,20,N,3480
...,...,...,...,...,...,...,...,...,...,...
14591,31-Dec-21,510,00:39,Friday,SPADINA AND QUEEN,General Delay,0,0,B,0
14592,31-Dec-21,504,00:51,Friday,WILSON GARAGE,Operations,5,10,,0
14593,31-Dec-21,509,01:19,Friday,CNE LOOP,Operations,8,16,,4553
14594,31-Dec-21,503,01:22,Friday,QUEEN AND CONNAUGHT,Operations,8,17,,4459


In [3]:
#reading df2 and checking
df2 = pd.read_csv('capstone/Streetcar_Delay_2022.csv')
df2

Unnamed: 0,Date,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Bound,Vehicle
0,1-Jan-22,504,02:21,Saturday,BROADVIEW STATION,Collision - TTC Involved,30,60,E,8333
1,1-Jan-22,501,03:22,Saturday,718 QUEEN ST EAST,Operations,16,35,W,8068
2,1-Jan-22,504,03:28,Saturday,BROADVIEW STATION,Operations,18,36,S,0
3,1-Jan-22,510,03:34,Saturday,UNION STATION,Operations,30,60,,4406
4,1-Jan-22,301,03:39,Saturday,LAKESHORE AND TENTH,Security,5,25,W,8622
...,...,...,...,...,...,...,...,...,...,...
17650,31-Dec-22,504,01:20,Saturday,KING AND PORTLAND,Cleaning - Unsanitary,10,20,E,4588
17651,31-Dec-22,510,01:24,Saturday,UNION STATION,Security,10,20,N,4463
17652,31-Dec-22,501,01:42,Saturday,KINGSTON LOOP,Security,0,0,,4534
17653,31-Dec-22,504,01:46,Saturday,KING AND PETER,Cleaning - Unsanitary,10,20,W,4431


In [4]:
#reading df3 and checking
df3 = pd.read_csv('Streetcar_Delay_2023.csv')
df3

Unnamed: 0,Date,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Bound,Vehicle
0,1-Jan-23,509,02:37,Sunday,QUEENS QUAY AND SPADIN,Operations,0,0,E,4403
1,1-Jan-23,505,02:40,Sunday,BROADVIEW AND GERRARD,Held By,15,25,W,4460
2,1-Jan-23,504,02:52,Sunday,KING AND BATHURST,Cleaning - Unsanitary,10,20,W,4427
3,1-Jan-23,504,02:59,Sunday,KING AND BATHURST,Held By,25,35,E,4560
4,1-Jan-23,509,03:33,Sunday,QUEENS QUAY AND SPADIN,Operations,0,0,E,4570
...,...,...,...,...,...,...,...,...,...,...
10146,30-Sep-23,511,22:53,Saturday,FLEET AND MANITOBA,Diversion,44,54,W,4593
10147,30-Sep-23,505,23:21,Saturday,KINGSTON RD LOOP,Security,10,20,,4503
10148,30-Sep-23,513,23:41,Saturday,1626 QUEEN ST EAST,Operations,10,10,W,8818
10149,30-Sep-23,501,00:48,Saturday,QUEEN AND AUGUSTA,Diversion,41,61,,0


In [5]:
print(df1.columns)
print(df2.columns)
print(df3.columns)

Index(['Date', 'Line', 'Time', 'Day', 'Location', 'Incident', 'Min Delay',
       'Min Gap', 'Bound', 'Vehicle'],
      dtype='object')
Index(['Date', 'Line', 'Time', 'Day', 'Location', 'Incident', 'Min Delay',
       'Min Gap', 'Bound', 'Vehicle'],
      dtype='object')
Index(['Date', 'Line', 'Time', 'Day', 'Location', 'Incident', 'Min Delay',
       'Min Gap', 'Bound', 'Vehicle'],
      dtype='object')


Now that we have confirmed that the three dataframes have identical columns, we can confidently merge them.

In [6]:
#Merging all three dataframes and checking
df = df1.merge(df2, how='outer').merge(df3, how='outer')
df

Unnamed: 0,Date,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Bound,Vehicle
0,1-Jan-21,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,W,4574
1,1-Jan-21,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,,4500
2,1-Jan-21,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,S,4589
3,1-Jan-21,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,W,4582
4,1-Jan-21,506,05:37,Friday,MAIN STATION,Mechanical,10,20,N,3480
...,...,...,...,...,...,...,...,...,...,...
42397,30-Sep-23,511,22:53,Saturday,FLEET AND MANITOBA,Diversion,44,54,W,4593
42398,30-Sep-23,505,23:21,Saturday,KINGSTON RD LOOP,Security,10,20,,4503
42399,30-Sep-23,513,23:41,Saturday,1626 QUEEN ST EAST,Operations,10,10,W,8818
42400,30-Sep-23,501,00:48,Saturday,QUEEN AND AUGUSTA,Diversion,41,61,,0


The merge worked: the combined dataframe starts with the first 2021 record and ends with the last 2023 record.
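
Since the three dataframes share the same columns and the goal is simply to stack their rows, `pd.concat` is a common alternative to chaining outer merges. An outer merge without `on` joins on every shared column, so its row count depends on whether identical rows appear in both inputs, while `pd.concat` always returns the sum of the input rows. The sketch below uses small made-up frames (hypothetical rows, not the real files) to illustrate the idea.

```python
import pandas as pd

# Tiny stand-ins for the yearly delay files (hypothetical rows, same columns).
cols = ["Date", "Line", "Time", "Min Delay"]
df1 = pd.DataFrame(
    [["1-Jan-21", "501", "03:15", 19],
     ["1-Jan-21", "501", "03:15", 19]],  # an exact duplicate report
    columns=cols,
)
df2 = pd.DataFrame([["1-Jan-22", "504", "02:21", 30]], columns=cols)
df3 = pd.DataFrame([["1-Jan-23", "509", "02:37", 0]], columns=cols)

# Stack the frames row-wise; ignore_index rebuilds a clean 0..n-1 index.
stacked = pd.concat([df1, df2, df3], ignore_index=True)

# The row count is the sum of the inputs, duplicates included.
expected_rows = len(df1) + len(df2) + len(df3)
print(len(stacked) == expected_rows)
```

Comparing `len(stacked)` with the summed input lengths gives a quick check that no rows were lost or duplicated while combining the files.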

In [7]:
#checking the dataframe
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 42402 entries, 0 to 42401
Data columns (total 10 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   Date       42402 non-null  object
 1   Line       42186 non-null  object
 2   Time       42402 non-null  object
 3   Day        42402 non-null  object
 4   Location   42402 non-null  object
 5   Incident   42401 non-null  object
 6   Min Delay  42402 non-null  int64 
 7   Min Gap    42402 non-null  int64 
 8   Bound      35018 non-null  object
 9   Vehicle    42402 non-null  int64 
dtypes: int64(3), object(7)
memory usage: 3.6+ MB


In [8]:
#Combining the Date and Time columns into a single delaytime column
df["delaytime"] = pd.to_datetime(df["Date"] + " " + df["Time"])

In [9]:
#sanity check
df.head()

Unnamed: 0,Date,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Bound,Vehicle,delaytime
0,1-Jan-21,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,W,4574,2021-01-01 03:15:00
1,1-Jan-21,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,,4500,2021-01-01 03:37:00
2,1-Jan-21,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,S,4589,2021-01-01 04:00:00
3,1-Jan-21,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,W,4582,2021-01-01 04:03:00
4,1-Jan-21,506,05:37,Friday,MAIN STATION,Mechanical,10,20,N,3480,2021-01-01 05:37:00
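
Parsing works here without a format string, but passing an explicit `format` can make the conversion faster and makes it fail loudly on unexpected values. The format below is an assumption inferred from the rows displayed above (day, abbreviated English month, two-digit year, then hours and minutes); the sample frame is made up for illustration.

```python
import pandas as pd

# Hypothetical sample using the same Date/Time layout as the displayed rows.
sample = pd.DataFrame({
    "Date": ["1-Jan-21", "31-Dec-21"],
    "Time": ["03:15", "00:39"],
})

# Assumed layout: day-month abbreviation-two digit year, then HH:MM.
sample["delaytime"] = pd.to_datetime(
    sample["Date"] + " " + sample["Time"],
    format="%d-%b-%y %H:%M",
)
print(sample["delaytime"])
```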


In [10]:
#Dropping Date (now captured in delaytime) and Bound
df.drop(['Date','Bound'],axis=1, inplace=True)

In [11]:
print(f'This dataframe has {df.shape[0]} rows and {df.shape[1]} columns.')

This dataframe has 42402 rows and 9 columns.


In [12]:
#sanity check
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 42402 entries, 0 to 42401
Data columns (total 9 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Line       42186 non-null  object        
 1   Time       42402 non-null  object        
 2   Day        42402 non-null  object        
 3   Location   42402 non-null  object        
 4   Incident   42401 non-null  object        
 5   Min Delay  42402 non-null  int64         
 6   Min Gap    42402 non-null  int64         
 7   Vehicle    42402 non-null  int64         
 8   delaytime  42402 non-null  datetime64[ns]
dtypes: datetime64[ns](1), int64(3), object(5)
memory usage: 3.2+ MB


In [13]:
#Create a new datetime column to join Weather dataframe later
df['datetime'] =pd.to_datetime(df['delaytime'])

In [14]:
#checking
df.head()

Unnamed: 0,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Vehicle,delaytime,datetime
0,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,4574,2021-01-01 03:15:00,2021-01-01 03:15:00
1,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,4500,2021-01-01 03:37:00,2021-01-01 03:37:00
2,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,4589,2021-01-01 04:00:00,2021-01-01 04:00:00
3,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,4582,2021-01-01 04:03:00,2021-01-01 04:03:00
4,506,05:37,Friday,MAIN STATION,Mechanical,10,20,3480,2021-01-01 05:37:00,2021-01-01 05:37:00


In [15]:
#Setting the minutes to 0 in the datetime column so each delay matches an hourly weather row later
df['datetime'] = df['datetime'].apply(lambda x: x.replace(minute=0))

In [16]:
#checking
df.head(3)

Unnamed: 0,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Vehicle,delaytime,datetime
0,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,4574,2021-01-01 03:15:00,2021-01-01 03:00:00
1,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,4500,2021-01-01 03:37:00,2021-01-01 03:00:00
2,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,4589,2021-01-01 04:00:00,2021-01-01 04:00:00
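
As an aside, the same truncation to the hour can be sketched with the vectorized `Series.dt.floor`, which avoids a Python-level `apply`. One difference to keep in mind: `floor("h")` also zeroes seconds, whereas `replace(minute=0)` leaves them untouched. The timestamps below are copied from the preview above; the variable names are illustrative only.

```python
import pandas as pd

# Timestamps taken from the first three rows shown above.
times = pd.Series(pd.to_datetime([
    "2021-01-01 03:15:00",
    "2021-01-01 03:37:00",
    "2021-01-01 04:00:00",
]))

# Vectorized rounding down to the start of the hour.
floored = times.dt.floor("h")

# The approach used in the notebook, for comparison.
replaced = times.apply(lambda x: x.replace(minute=0))

# Both give the same hourly keys when the seconds are already zero.
print(list(floored) == list(replaced))
```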


In [17]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 42402 entries, 0 to 42401
Data columns (total 10 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Line       42186 non-null  object        
 1   Time       42402 non-null  object        
 2   Day        42402 non-null  object        
 3   Location   42402 non-null  object        
 4   Incident   42401 non-null  object        
 5   Min Delay  42402 non-null  int64         
 6   Min Gap    42402 non-null  int64         
 7   Vehicle    42402 non-null  int64         
 8   delaytime  42402 non-null  datetime64[ns]
 9   datetime   42402 non-null  datetime64[ns]
dtypes: datetime64[ns](2), int64(3), object(5)
memory usage: 3.6+ MB


------
### Merging Streetcar data and Weather data

In [18]:
#Loading Weather data
weather = pd.read_csv('final_weather.csv')
weather

Unnamed: 0,Temp Definition °C,Dew Point Definition °C,Rel Hum Definition %,Precip. Amount Definition mm,Wind Dir Definition 10's deg,Wind Spd Definition km/h,Visibility Definition km,Stn Press Definition kPa,Hmdx Definition,Wind Chill Definition,Weather Definition,datetime
0,-1.3,-4.7,78,0.0,25.0,4,16.1,102.17,,-3.0,LegendNANA,2021-01-01 00:00:00
1,-1.2,-3.6,84,0.0,24.0,4,16.1,102.16,,-3.0,LegendNANA,2021-01-01 01:00:00
2,-1.8,-3.2,90,0.0,27.0,4,16.1,102.19,,-3.0,LegendNANA,2021-01-01 02:00:00
3,-2.0,-3.3,91,0.0,30.0,5,16.1,102.26,,-4.0,LegendNANA,2021-01-01 03:00:00
4,-1.4,-3.5,85,0.0,27.0,5,16.1,102.24,,-3.0,LegendNANA,2021-01-01 04:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...
24067,17.6,15.7,89,0.0,24,4,16.1,101.52,,,LegendNANA,2023-09-30 19:00:00
24068,17.0,15.7,92,0.0,25,5,16.1,101.53,,,LegendNANA,2023-09-30 20:00:00
24069,16.6,15.1,91,0.0,26,8,16.1,101.53,,,LegendNANA,2023-09-30 21:00:00
24070,16.6,15.1,91,0.0,LegendMM,4,16.1,101.57,,,LegendNANA,2023-09-30 22:00:00


In [19]:
weather.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24072 entries, 0 to 24071
Data columns (total 12 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   Temp Definition °C            24044 non-null  object 
 1   Dew Point Definition °C       24044 non-null  object 
 2   Rel Hum Definition %          24044 non-null  object 
 3   Precip. Amount Definition mm  24044 non-null  float64
 4   Wind Dir Definition 10's deg  23632 non-null  object 
 5   Wind Spd Definition km/h      24044 non-null  object 
 6   Visibility Definition km      24044 non-null  object 
 7   Stn Press Definition kPa      24044 non-null  object 
 8   Hmdx Definition               4005 non-null   float64
 9   Wind Chill Definition         3885 non-null   float64
 10  Weather Definition            24044 non-null  object 
 11  datetime                      24072 non-null  object 
dtypes: float64(3), object(9)
memory usage: 2.2+ MB


In [20]:
#converting datetime to datetime dtype to merge
weather["datetime"] = pd.to_datetime(weather['datetime'])

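The next cell drops five weather columns that are not used in the rest of this analysis: dew point, relative humidity, wind direction, wind speed, and station pressure.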

In [21]:
#dropping five irrelevant columns
drops = ['Dew Point Definition °C','Rel Hum Definition %','Wind Dir Definition 10\'s deg','Wind Spd Definition km/h','Stn Press Definition kPa']

weather=weather.drop(drops, axis =1)

In [22]:
weather.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24072 entries, 0 to 24071
Data columns (total 7 columns):
 #   Column                        Non-Null Count  Dtype         
---  ------                        --------------  -----         
 0   Temp Definition °C            24044 non-null  object        
 1   Precip. Amount Definition mm  24044 non-null  float64       
 2   Visibility Definition km      24044 non-null  object        
 3   Hmdx Definition               4005 non-null   float64       
 4   Wind Chill Definition         3885 non-null   float64       
 5   Weather Definition            24044 non-null  object        
 6   datetime                      24072 non-null  datetime64[ns]
dtypes: datetime64[ns](1), float64(3), object(3)
memory usage: 1.3+ MB


In [23]:
weather.head()

Unnamed: 0,Temp Definition °C,Precip. Amount Definition mm,Visibility Definition km,Hmdx Definition,Wind Chill Definition,Weather Definition,datetime
0,-1.3,0.0,16.1,,-3.0,LegendNANA,2021-01-01 00:00:00
1,-1.2,0.0,16.1,,-3.0,LegendNANA,2021-01-01 01:00:00
2,-1.8,0.0,16.1,,-3.0,LegendNANA,2021-01-01 02:00:00
3,-2.0,0.0,16.1,,-4.0,LegendNANA,2021-01-01 03:00:00
4,-1.4,0.0,16.1,,-3.0,LegendNANA,2021-01-01 04:00:00


In [24]:
print(f'This dataframe has {weather.shape[0]} rows and {weather.shape[1]} columns.')

This dataframe has 24072 rows and 7 columns.


In [25]:
#testing the merge on the first 20 rows of each dataframe before merging the full datasets
streetcar_test=df.head(20)
weather_test=weather.head(20)

In [26]:
#testing the merge
test_test=streetcar_test.merge(weather_test, on='datetime', how='left')
test_test

Unnamed: 0,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Vehicle,delaytime,datetime,Temp Definition °C,Precip. Amount Definition mm,Visibility Definition km,Hmdx Definition,Wind Chill Definition,Weather Definition
0,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,4574,2021-01-01 03:15:00,2021-01-01 03:00:00,-2.0,0.0,16.1,,-4.0,LegendNANA
1,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,4500,2021-01-01 03:37:00,2021-01-01 03:00:00,-2.0,0.0,16.1,,-4.0,LegendNANA
2,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,4589,2021-01-01 04:00:00,2021-01-01 04:00:00,-1.4,0.0,16.1,,-3.0,LegendNANA
3,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,4582,2021-01-01 04:03:00,2021-01-01 04:00:00,-1.4,0.0,16.1,,-3.0,LegendNANA
4,506,05:37,Friday,MAIN STATION,Mechanical,10,20,3480,2021-01-01 05:37:00,2021-01-01 05:00:00,0.4,0.0,16.1,,,LegendNANA
5,555,06:00,Friday,TORONTO TRANSIT COMMIS,General Delay,0,0,0,2021-01-01 06:00:00,2021-01-01 06:00:00,0.0,0.0,16.1,,,LegendNANA
6,501,06:59,Friday,RUSSELL YARD,Emergency Services,10,20,4496,2021-01-01 06:59:00,2021-01-01 06:00:00,0.0,0.0,16.1,,,LegendNANA
7,504,07:55,Friday,KING AND QUEEN,Mechanical,10,20,4520,2021-01-01 07:55:00,2021-01-01 07:00:00,0.5,0.0,16.1,,,LegendNANA
8,511,09:35,Friday,BATHURST AND QUEEN,Cleaning,8,16,1406,2021-01-01 09:35:00,2021-01-01 09:00:00,1.4,0.0,16.1,,,LegendNANA
9,512,09:55,Friday,ST CLAIR STATION,Emergency Services,7,12,4543,2021-01-01 09:55:00,2021-01-01 09:00:00,1.4,0.0,16.1,,,LegendNANA


The delaytime column is not needed once the hourly datetime key exists, so we will drop it after merging everything.

The test merge looks correct, so we will go ahead and merge all the rows.

In [27]:
#Finally merging the whole datasets after checking that it went well.
final_dataframe=df.merge(weather, on='datetime', how='left')

In [28]:
#checking 
final_dataframe

Unnamed: 0,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Vehicle,delaytime,datetime,Temp Definition °C,Precip. Amount Definition mm,Visibility Definition km,Hmdx Definition,Wind Chill Definition,Weather Definition
0,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,4574,2021-01-01 03:15:00,2021-01-01 03:00:00,-2.0,0.0,16.1,,-4.0,LegendNANA
1,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,4500,2021-01-01 03:37:00,2021-01-01 03:00:00,-2.0,0.0,16.1,,-4.0,LegendNANA
2,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,4589,2021-01-01 04:00:00,2021-01-01 04:00:00,-1.4,0.0,16.1,,-3.0,LegendNANA
3,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,4582,2021-01-01 04:03:00,2021-01-01 04:00:00,-1.4,0.0,16.1,,-3.0,LegendNANA
4,506,05:37,Friday,MAIN STATION,Mechanical,10,20,3480,2021-01-01 05:37:00,2021-01-01 05:00:00,0.4,0.0,16.1,,,LegendNANA
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
42397,511,22:53,Saturday,FLEET AND MANITOBA,Diversion,44,54,4593,2023-09-30 22:53:00,2023-09-30 22:00:00,16.6,0.0,16.1,,,LegendNANA
42398,505,23:21,Saturday,KINGSTON RD LOOP,Security,10,20,4503,2023-09-30 23:21:00,2023-09-30 23:00:00,16.9,0.0,16.1,,,LegendNANA
42399,513,23:41,Saturday,1626 QUEEN ST EAST,Operations,10,10,8818,2023-09-30 23:41:00,2023-09-30 23:00:00,16.9,0.0,16.1,,,LegendNANA
42400,501,00:48,Saturday,QUEEN AND AUGUSTA,Diversion,41,61,0,2023-09-30 00:48:00,2023-09-30 00:00:00,16.9,0.0,16.1,,,LegendNANA


In [29]:
final_dataframe=final_dataframe.drop('delaytime', axis =1)

In [30]:
final_dataframe.head()

Unnamed: 0,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Vehicle,datetime,Temp Definition °C,Precip. Amount Definition mm,Visibility Definition km,Hmdx Definition,Wind Chill Definition,Weather Definition
0,501,03:15,Friday,QUEEN AND MCCAUL,Operations,19,24,4574,2021-01-01 03:00:00,-2.0,0.0,16.1,,-4.0,LegendNANA
1,504,03:37,Friday,BROADVIEW AND QUEEN,Operations,15,30,4500,2021-01-01 03:00:00,-2.0,0.0,16.1,,-4.0,LegendNANA
2,504,04:00,Friday,BROADVIEW STATION,Cleaning,15,30,4589,2021-01-01 04:00:00,-1.4,0.0,16.1,,-3.0,LegendNANA
3,504,04:03,Friday,DUNDAS WEST STATION,Cleaning,15,30,4582,2021-01-01 04:00:00,-1.4,0.0,16.1,,-3.0,LegendNANA
4,506,05:37,Friday,MAIN STATION,Mechanical,10,20,3480,2021-01-01 05:00:00,0.4,0.0,16.1,,,LegendNANA


The dataset's time frame is from January 01, 2021 to September 30, 2023, and the dataframe above tells us that everything loaded correctly. The Streetcar data has 42402 rows and 10 columns, and the Weather data has 24072 rows and 7 columns after the five drops. A left join on datetime keeps every Streetcar row and adds the 6 non-key Weather columns, so the merged dataframe has 42402 rows and 16 columns; dropping delaytime leaves the 15 columns shown above.
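
The row and column arithmetic for a left join can also be checked in code. The sketch below uses two small made-up tables (hypothetical names and values) and pandas' `validate="many_to_one"` option, which raises an error if the right-hand table has duplicate keys that would multiply rows.

```python
import pandas as pd

# Hypothetical miniature versions of the delay and weather tables.
delays = pd.DataFrame({
    "Line": ["501", "504", "504"],
    "Min Delay": [19, 15, 15],
    "datetime": pd.to_datetime(
        ["2021-01-01 03:00", "2021-01-01 03:00", "2021-01-01 04:00"]
    ),
})
hourly = pd.DataFrame({
    "datetime": pd.to_datetime(["2021-01-01 03:00", "2021-01-01 04:00"]),
    "Temp": [-2.0, -1.4],
    "Visibility": [16.1, 16.1],
})

# many_to_one: many delays may share one weather hour, but each hour
# must appear only once in the weather table.
merged = delays.merge(hourly, on="datetime", how="left", validate="many_to_one")

# A left join keeps every left row; columns add up minus the shared key.
rows_ok = len(merged) == len(delays)
cols_ok = merged.shape[1] == delays.shape[1] + hourly.shape[1] - 1
print(rows_ok, cols_ok)
```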

In [31]:
final_dataframe.sample(20)

Unnamed: 0,Line,Time,Day,Location,Incident,Min Delay,Min Gap,Vehicle,datetime,Temp Definition °C,Precip. Amount Definition mm,Visibility Definition km,Hmdx Definition,Wind Chill Definition,Weather Definition
33049,510,10:10,Wednesday,SPADINA AND DUNDAS,Security,10,10,4457,2023-01-25 10:00:00,0.3,0.0,9.7,,,Snow
1900,510,13:45,Saturday,SPADINA AND QUEEN'S QU,Cleaning,4,8,4502,2021-02-27 13:00:00,5.6,0.0,16.1,,,LegendNANA
29914,501,08:30,Thursday,1262 LAWRENCE AVE W,Operations,6,12,8372,2022-11-03 08:00:00,7.8,0.0,0.6,,,Fog
22263,511,23:13,Sunday,CHARLOTTE LOOP,Security,10,20,4488,2022-05-08 23:00:00,10.8,0.0,16.1,,,LegendNANA
255,511,18:41,Wednesday,BATHURST STATION,Investigation,7,15,4505,2021-01-06 18:00:00,2.0,0.0,16.1,,,LegendNANA
14616,506,09:04,Saturday,HOWARD PARK AND RONCES,Operations,9,18,0,2022-01-01 09:00:00,4.8,0.0,16.1,,,LegendNANA
2839,505,01:35,Tuesday,DUNDAS AND SHERBOURNE,Cleaning,10,20,4488,2021-03-23 01:00:00,8.2,0.0,16.1,,,LegendNANA
2516,506,16:18,Tuesday,GERRARD AND JONES,Cleaning,6,12,3418,2021-03-16 16:00:00,1.7,0.0,16.1,,,LegendNANA
23340,504,21:59,Sunday,RONCESVALLES AND DUNDA,Diversion,162,169,8332,2022-05-29 21:00:00,LegendMM,0.0,16.1,,,LegendNANA
38226,504,01:55,Saturday,KING AND DUFFERIN,Emergency Services,24,34,4582,2023-06-03 01:00:00,24.4,0.0,16.1,30.0,,LegendNANA


----
###  Data Saving

In [32]:
#saving the dataframe to a new csv file! 
final_dataframe.to_csv('Streetcar_Weather_Data.csv', index=False)