# Final Challenge: Noise Pollution

In [1]:
import pandas as pd
import requests
from io import BytesIO

## Context

1. Introduction
In the last two decades, changes in individuals’ behaviors, shifts in city rhythms, and investments by municipalities to revitalize historical city centers have led to an increase in open-air nightlife activities.
This is especially true in Italy (as in other Mediterranean and university cities), where large numbers of young people usually meet in residential historic districts, attracted by a high density of restaurants, bars, pubs, and clubs, and spend whole nights, several days each week, chatting and drinking. The center of nightlife often shifts within a few years, following new trends and new commercial initiatives, renewing the need for both citizens and the public administration to find a good balance between amusement, security, and quality of public spaces.
Municipalities are called to face this dynamic phenomenon and its negative externalities, first of all leisure noise, whose sources are mostly people and their behavior in urban open spaces.

This paper describes the approach of the City of Torino to this complex challenge in the San Salvario district: from the deployment of an ad hoc sound monitoring network, through the collection and analysis of data and the evaluation of indicators for communication activities, to the implementation and monitoring of reduction actions.

2. San Salvario District area
The area of interest is part of the historic district of San Salvario, located near the central railway station and bounded by the boulevards Vittorio Emanuele II (North), Nizza (West), Madama Cristina (East), and Marconi (South).
This residential area follows the grid plan typical of the old neighborhoods of Torino, with about 470 four- and five-story buildings with internal courtyards; about 7,300 people live in the area, which covers 0.26 km².
The district hosts a large daily open-air market and offers various commercial activities. Home to a growing immigrant community, the district is an example of cultural integration.

2.1 Nightlife “Movida” and noise issues
Starting in the 1990s, nightlife grew in this district thanks to the opening of many pubs, low-cost bars, restaurants, liquor stores, wine cellars, boutiques, and multi-ethnic shops. These venues stay open until late and have completely reshaped the map of city entertainment, known as “Movida” [from Spanish: movement, happening].
The nightlife hot spots in San Salvario are Largo Saluzzo and Via Baretti, where crowds gradually grow from the areas in front of bars until they occupy all public space, causing serious side effects: noise (chatting, shouting, quarrels), traffic blockages, irregular parking, obstruction of driveways, litter on the ground, etc.
The City Administration conducted some spot monitoring campaigns on noise levels in summertime; those measurements revealed that the legal zonal limits (50 dB(A) Leq night (22-06) and 55 dB(A) Leq 1hour on the façades) were exceeded during weekend nights, with Leq night from 58 dB(A) up to 72 dB(A) and Leq 1hour from 64 dB(A) up to 75 dB(A) between 11 PM and 3 AM.
These campaigns pointed to a high variability of noise levels in the area when many people gather in narrow streets; they also showed how difficult it can be to control noise levels in open urban spaces.
At the same time, citizens asked for more information and action on the noise problems caused by “Movida”, and they conducted their own additional surveys on noise levels and noise effects [8].
To promote an integrated approach to the management of “Movida”, the Municipality decided to strengthen its knowledge of noise levels in the San Salvario district and started the design and implementation of an ad hoc noise monitoring network.
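
Because the limits above are stated both for a single hour and for the whole night, it may help to recall how the two relate: equivalent levels are averaged on an energy basis, not arithmetically. The sketch below assumes eight equal-duration hourly Leq values for the 22-06 period; the function name `leq_energy_mean` and the sample values are illustrative and are not taken from the monitoring data.

```python
import numpy as np

def leq_energy_mean(levels_db):
    """Energy-average equal-duration Leq values given in dB(A)."""
    levels = np.asarray(levels_db, dtype=float)
    return 10.0 * np.log10(np.mean(10.0 ** (levels / 10.0)))

# Hypothetical hourly Leq values for 22:00-06:00 (illustrative only)
hourly_leq = [70.0, 72.0, 74.0, 73.0, 68.0, 62.0, 58.0, 56.0]
leq_night = leq_energy_mean(hourly_leq)

night_limit = 50.0  # dB(A), the zonal night limit quoted in the text
print(f"Leq night: {leq_night:.1f} dB(A), exceedance: {leq_night - night_limit:.1f} dB")
```

The energy average is dominated by the loudest hours, so it is always at least as high as the arithmetic mean of the same values.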

The location of the sensors was optimized to cover all significant features of the “Movida” area (Figure 3): one in a very crowded square (S_03, not active in the daytime), three in narrow streets with pubs and bars (S_01, S_04, S_05), one on a boulevard for traffic noise measurement (S_06), and the last one in a quieter area with no crowds and low traffic (S_02), as a global reference. The choice of installation points was also driven by the availability of power supply, so light poles, public offices, and bike-sharing stations were preferred.


Previous Initiatives:
4.2 City Ordinances for noise reduction
In summer 2017, two City Ordinances entered into force with the aim of limiting the noise pollution of “Movida” in the central area, San Salvario district included.
The first one, Ordinance n. 46, limited takeaway alcohol sales, as many people reach the “Movida” area, buy bottled alcohol at a low price, and spend all night wandering or sitting in the streets, chatting and shouting. Takeaway sales of bottled alcohol were therefore forbidden from 8 PM and throughout the night, with the same rules for bars, shops, and stores. This Ordinance stayed in force from 8th June until 30th September.
The second one, Ordinance n. 60, limited the serving of food and beverages on terraces or outside bars and shops, as many people stay outside these venues to enjoy the mild weather, to smoke, or because of overcrowding, disturbing residents with a growing din. Serving food and beverages was forbidden after 1:30 AM (Monday to Thursday), after 2 AM (Friday), or after 3 AM (Saturday, Sunday, and public holidays). Furthermore, the presence of stewards was made compulsory to limit customers' bad behavior. This Ordinance stayed in force from 8th July until 30th July.
The data collected allowed an assessment of the effects of the new regulations on Lnight levels, showing that both Ordinances led to a noise reduction, with a cumulative benefit of more than 3 dB(A) (Table 2).
Table 2. Leisure noise improvements in summer 2017, S_03 point

| Lnight | Ordinance n. 42, 8th June to 30th September | Ordinance n. 60, 8th to 30th July | 1st October to 31st December |
|---|---|---|---|
| 2016 | 69.8 | 70.4 | 68.9 |
| 2017 | 67.7 | 66.9 | 68.4 |
| ∆ | -2.1 | -3.5 | -0.5 |
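
The ∆ row of Table 2 is the 2017 value minus the 2016 value for each period. As a quick check, the sketch below recomputes it from the figures in the table; the column labels are shortened here for readability and are not the table's own wording.

```python
import pandas as pd

# Lnight values from Table 2 (S_03 point), one column per period
table2 = pd.DataFrame(
    {
        "8 Jun-30 Sep": [69.8, 67.7],
        "8-30 Jul": [70.4, 66.9],
        "1 Oct-31 Dec": [68.9, 68.4],
    },
    index=[2016, 2017],
)

# Year-over-year change in dB(A); negative means quieter in 2017
delta = (table2.loc[2017] - table2.loc[2016]).round(1)
print(delta)
```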

In [2]:
df_sensors_def = pd.read_csv('/content/drive/MyDrive/finals/noise_sensor_list.csv', sep=';')
df_sensors_def

Unnamed: 0,code,address,Lat,Long,streaming
0,s_01,"Via Saluzzo, 26 Torino",45059172,7678986,https://userportal.smartdatanet.it/userportal/...
1,s_02,"Via Principe Tommaso, 18bis Torino",45057837,7681555,https://userportal.smartdatanet.it/userportal/...
2,s_03,Largo Saluzzo Torino,45058518,7678854,https://userportal.smartdatanet.it/userportal/...
3,s_05,Via Principe Tommaso angolo via Baretti Torino,45057603,7681348,https://userportal.smartdatanet.it/userportal/...
4,s_06,"Corso Marconi, 27 Torino",45055554,768259,https://userportal.smartdatanet.it/userportal/...


In [21]:
df_wifi = pd.read_csv('/content/drive/MyDrive/finals/WIFI Count.csv', sep=',')
df_wifi.head()

Unnamed: 0,Time,No. of Visitors
0,2018-10-24 17:00,47
1,2018-10-24 18:00,155
2,2018-10-24 19:00,181
3,2018-10-24 20:00,211
4,2018-10-24 21:00,239


In [22]:
df_businesses = pd.read_csv('/content/drive/MyDrive/finals/businesses.csv', sep=';')
df_businesses.head()

Unnamed: 0,WKT,ADDRESS,OPEN YEAR,OPEN MONTH,TYPE,Description,Merchandise Type
0,POINT (1396322.217 4990301.69),VIA CLAUDIO LUIGI BERTHOLLET 24,1977,1,EXTRALIMENTARI,PICCOLE STRUTTURE,Extralimentari
1,POINT (1396322.217 4990301.69),VIA CLAUDIO LUIGI BERTHOLLET 24,1985,6,ALIMENTARI,PICCOLE STRUTTURE,Panificio
2,POINT (1396303.762 4990325.001),VIA CLAUDIO LUIGI BERTHOLLET 25/F,2017,9,ALTRO,DIA di somministrazione,Nessuna
3,POINT (1396434.395 4990540.6),CORSO VITTORIO EMANUELE II 21/A,2013,10,ALTRO,DIA di somministrazione,Nessuna
4,POINT (1396434.395 4990540.6),CORSO VITTORIO EMANUELE II 21/A,2009,2,ALTRO,DIA di somministrazione,Nessuna


In [23]:
df_sim_june = pd.read_csv('/content/drive/MyDrive/finals/sim_count/SIM_count_04_100618.csv', sep=';', encoding='latin-1')
df_sim_june.head()

Unnamed: 0,cluster,data_da,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi)
0,Presenze,2018-06-10T21:00:00Z,2018-06-10T22:00:00Z,3278,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
1,Presenze,2018-06-10T20:00:00Z,2018-06-10T21:00:00Z,3324,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
2,Presenze,2018-06-10T19:00:00Z,2018-06-10T20:00:00Z,3318,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
3,Presenze,2018-06-10T18:00:00Z,2018-06-10T19:00:00Z,3187,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
4,Presenze,2018-06-10T17:00:00Z,2018-06-10T18:00:00Z,2980,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600


In [24]:
df_sim_jan = pd.read_csv('/content/drive/MyDrive/finals/sim_count/SIM_count_15_210118.csv', sep=';', encoding='latin-1')
df_sim_jan.head()

Unnamed: 0,cluster,data_da,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi)
0,Presenze,2018-01-21T22:00:00Z,2018-01-21T23:00:00Z,3026,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
1,Presenze,2018-01-21T21:00:00Z,2018-01-21T22:00:00Z,3088,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
2,Presenze,2018-01-21T20:00:00Z,2018-01-21T21:00:00Z,3119,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
3,Presenze,2018-01-21T19:00:00Z,2018-01-21T20:00:00Z,3114,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
4,Presenze,2018-01-21T18:00:00Z,2018-01-21T19:00:00Z,2991,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600


In [25]:
df_sim_march = pd.read_csv('/content/drive/MyDrive/finals/sim_count/SIM_count_19_250318.csv', sep=';', encoding='latin-1')
df_sim_march.head()

Unnamed: 0,cluster,data_da,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi)
0,Presenze,2018-03-25T21:00:00Z,2018-03-25T22:00:00Z,3267,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
1,Presenze,2018-03-25T20:00:00Z,2018-03-25T21:00:00Z,3373,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
2,Presenze,2018-03-25T19:00:00Z,2018-03-25T20:00:00Z,3410,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
3,Presenze,2018-03-25T18:00:00Z,2018-03-25T19:00:00Z,3358,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
4,Presenze,2018-03-25T17:00:00Z,2018-03-25T18:00:00Z,3229,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600


In [26]:
# Skip the first 8 lines of the file, which precede the column header row
df_noise_2018 = pd.read_csv('/content/drive/MyDrive/finals/noise_data/san_salvario_2018.csv', skiprows=8, sep=';')
df_noise_2018.head()

Unnamed: 0,Data,Ora,C1,C2,C3,C4,C5
0,01-01-2018,00:00,687,,760,,666
1,01-01-2018,01:00,683,,682,,654
2,01-01-2018,02:00,598,,644,,644
3,01-01-2018,03:00,674,,675,,618
4,01-01-2018,04:00,680,,645,,605


In [27]:
df_noise_2018['date_hour'] = df_noise_2018['Data'] + ' ' + df_noise_2018['Ora']
df_noise_2018.drop(columns= ['Data', 'Ora'], inplace= True)

In [28]:
df_police_1 = pd.read_excel('/content/drive/MyDrive/finals/police_complaints/OpenDataContact_Gennaio_Giugno_2018.xlsx')
df_police_1.head()

Unnamed: 0,Categoria criminologa,Sottocategoria Criminologica,Circoscrizione,Localita,Area Verde,Data,Ora
0,Allarme Sociale,Altro,6.0,BELMONTE/(VIA),,01/02/2018,
1,Allarme Sociale,Altro,6.0,DONATORE DI SANGUE/(PIAZZA DEL),,12/02/2018,
2,Allarme Sociale,Altro,4.0,CIBRARIO/LUIGI (VIA),,26/02/2018,
3,Allarme Sociale,Altro,1.0,ROMA/(VIA),,02/03/2018,
4,Allarme Sociale,Altro,4.0,ZUMAGLIA/(VIA),,05/03/2018,


In [29]:
df_police_2 = pd.read_csv('/content/drive/MyDrive/finals/police_complaints/OpenDataContact_Luglio_Dicembre_2018.csv')
df_police_2.head()

Unnamed: 0,Categoria criminologa,Sottocategoria Criminologica,Circoscrizione,Localita,Area Verde,Data,Ora
0,Allarme Sociale,Altro,8.0,D'AZEGLIO/MASSIMO (CORSO) ...,,16/07/2018,
1,Allarme Sociale,Altro,1.0,REGINA MARGHERITA/(CORSO) ...,,17/07/2018,
2,Allarme Sociale,Altro,10.0,DUINO/(VIA) ...,,14/09/2018,
3,Allarme Sociale,Altro,,,,02/10/2018,9.4
4,Allarme Sociale,Altro,9.0,CARDUCCI/GIOSUE' (PIAZZA) ...,,27/11/2018,11.53


In [30]:
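# ignore_index=True gives the combined frame a fresh 0..n-1 index instead of repeated labels
df_police = pd.concat([df_police_1, df_police_2], ignore_index=True)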
df_police.head()

Unnamed: 0,Categoria criminologa,Sottocategoria Criminologica,Circoscrizione,Localita,Area Verde,Data,Ora
0,Allarme Sociale,Altro,6.0,BELMONTE/(VIA),,01/02/2018,
1,Allarme Sociale,Altro,6.0,DONATORE DI SANGUE/(PIAZZA DEL),,12/02/2018,
2,Allarme Sociale,Altro,4.0,CIBRARIO/LUIGI (VIA),,26/02/2018,
3,Allarme Sociale,Altro,1.0,ROMA/(VIA),,02/03/2018,
4,Allarme Sociale,Altro,4.0,ZUMAGLIA/(VIA),,05/03/2018,


### Merging Dataframe

In [32]:
df_noise_2018.head()

Unnamed: 0,C1,C2,C3,C4,C5,date_hour
0,687,,760,,666,01-01-2018 00:00
1,683,,682,,654,01-01-2018 01:00
2,598,,644,,644,01-01-2018 02:00
3,674,,675,,618,01-01-2018 03:00
4,680,,645,,605,01-01-2018 04:00


In [14]:
# 'Time' is kept as the Wi-Fi timestamp column; the merge further down joins on it

In [15]:
df_sim_all = pd.concat([df_sim_jan, df_sim_march, df_sim_june], axis=0)
df_sim_all.reset_index(inplace=True)

In [16]:
# Reformat the ISO start timestamp (e.g. 2018-01-21T22:00:00Z) as 'dd-mm-yyyy HH:MM'.
# The vectorized form avoids chained assignment, which triggers SettingWithCopyWarning.
df_sim_all['data_da'] = pd.to_datetime(df_sim_all['data_da']).dt.strftime('%d-%m-%Y %H:%M')


In [17]:
df_sim_all.rename(columns= {'data_da' : 'date_time'}, inplace=True)

In [31]:
df_sim_all.head()

Unnamed: 0,index,cluster,date_time,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi)
0,0,Presenze,21-01-2018 22:00,2018-01-21T23:00:00Z,3026,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
1,1,Presenze,21-01-2018 21:00,2018-01-21T22:00:00Z,3088,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
2,2,Presenze,21-01-2018 20:00,2018-01-21T21:00:00Z,3119,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
3,3,Presenze,21-01-2018 19:00,2018-01-21T20:00:00Z,3114,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600
4,4,Presenze,21-01-2018 18:00,2018-01-21T19:00:00Z,2991,5491d6d2-0c9e-47b7-bfde-c84c632efacc,Area 1,3600


In [None]:
df_police[df_police['Ora'].isna()] #many complaints do not have hours associated with them 

In [40]:
df_police['Ora'].mean()

11.180547368421063
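
If `Ora` encodes the time as `hh.mm` (so that `11.53` means 11:53, which is an assumption based on the values shown above), the arithmetic mean of the raw numbers is only a rough indicator of the typical hour. The sketch below converts to minutes after midnight before averaging; the helper `hhmm_to_minutes` and the sample values are hypothetical.

```python
import pandas as pd

def hhmm_to_minutes(value):
    """Turn an hh.mm float such as 11.53 into minutes after midnight."""
    hours = int(value)
    minutes = round((value - hours) * 100)
    return hours * 60 + minutes

# Hypothetical 'Ora' values; None marks a complaint without a recorded time
ora = pd.Series([9.4, 11.53, None, 23.05])

minutes = ora.dropna().map(hhmm_to_minutes)
mean_minutes = minutes.mean()
print(f"{int(mean_minutes // 60):02d}:{int(mean_minutes % 60):02d}")
```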

In [50]:
# Fill missing times with 11:00 (close to the mean of the recorded values),
# then render hh.mm floats as 'hh:mm' strings, e.g. 9.4 -> '9:40' (assumes Ora encodes hh.mm)
df_police['Ora'] = df_police['Ora'].fillna(11.0).map(lambda x: f"{float(x):.2f}".replace('.', ':'))

In [56]:
df_police['date_time'] = df_police['Data'] + ' ' + df_police['Ora'].astype('str')
# Dates are written day-first (dd/mm/yyyy), so parse them with dayfirst=True
df_police['date_time'] = pd.to_datetime(df_police['date_time'], dayfirst=True)
df_police['date_time'] = df_police['date_time'].dt.strftime("%d-%m-%y %H:%M")

In [57]:
df_police.head()

Unnamed: 0,Categoria criminologa,Sottocategoria Criminologica,Circoscrizione,Localita,Area Verde,Data,Ora,date_time
0,Allarme Sociale,Altro,6.0,BELMONTE/(VIA),,01/02/2018,11:00,01-02-18 11:00
1,Allarme Sociale,Altro,6.0,DONATORE DI SANGUE/(PIAZZA DEL),,12/02/2018,11:00,12-02-18 11:00
2,Allarme Sociale,Altro,4.0,CIBRARIO/LUIGI (VIA),,26/02/2018,11:00,26-02-18 11:00
3,Allarme Sociale,Altro,1.0,ROMA/(VIA),,02/03/2018,11:00,02-03-18 11:00
4,Allarme Sociale,Altro,4.0,ZUMAGLIA/(VIA),,05/03/2018,11:00,05-03-18 11:00
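
The `Data` column is written day-first, and strings such as `01/02/2018` are ambiguous: with default settings pandas reads a lone string like this month-first. The sketch below, on two sample strings in the same layout as the complaints data, contrasts an explicit day-first format with the default guess.

```python
import pandas as pd

raw = pd.Series(["01/02/2018 11:00", "26/02/2018 11:00"])

# Explicit day-first format: 01/02/2018 is read as 1 February
explicit = pd.to_datetime(raw, format="%d/%m/%Y %H:%M")

# A single ambiguous string parsed with defaults is read month-first (2 January)
default_guess = pd.to_datetime("01/02/2018 11:00")

print(explicit.dt.strftime("%d-%m-%y %H:%M").tolist())
print(default_guess.month)
```

Stating the format (or passing `dayfirst=True`) keeps the day and month from being swapped for days 1 to 12, which would otherwise misalign any join on the timestamp.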


In [60]:
df_weather = pd.read_csv("/content/drive/MyDrive/finals/hourly_weather.csv")
df_weather.head()

Unnamed: 0,hourly_date,winds,rainfall_mm,snowfall_mm
0,2016-01-06 00:00:00,0.716667,-0.003333,14.2
1,2016-01-06 01:00:00,0.51,-0.013,14.166667
2,2016-01-06 02:00:00,0.33,-0.007,14.3
3,2016-01-06 03:00:00,0.18,-0.014,14.266667
4,2016-01-06 04:00:00,0.3,0.008,14.233333


## Merging All dataframes


Merging noise, Wi-Fi, SIM, weather, ... police

In [81]:
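# Timestamps are day-first ('dd-mm-yyyy HH:MM'); dayfirst=True prevents day/month swaps
df_noise_2018['date_hour'] = pd.to_datetime(df_noise_2018['date_hour'], dayfirst=True)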
df_noise_2018['date_hour'] = df_noise_2018['date_hour'].dt.strftime("%d-%m-%y %H:%M")

In [83]:
df_wifi['Time'] = pd.to_datetime(df_wifi['Time'])
df_wifi['Time'] = df_wifi['Time'].dt.strftime("%d-%m-%y %H:%M")

In [84]:
df_final = df_noise_2018.merge(df_wifi, left_on= 'date_hour', right_on= 'Time', how='left')
df_final

Unnamed: 0,C1,C2,C3,C4,C5,date_hour,Time,No. of Visitors
0,687,,760,,666,01-01-18 00:00,,
1,683,,682,,654,01-01-18 01:00,,
2,598,,644,,644,01-01-18 02:00,,
3,674,,675,,618,01-01-18 03:00,,
4,680,,645,,605,01-01-18 04:00,,
...,...,...,...,...,...,...,...,...
8755,619,602,603,596,616,31-12-18 19:00,31-12-18 19:00,158.0
8756,625,589,582,616,616,31-12-18 20:00,31-12-18 20:00,171.0
8757,628,567,592,582,593,31-12-18 21:00,31-12-18 21:00,151.0
8758,605,572,589,581,572,31-12-18 22:00,31-12-18 22:00,125.0


In [85]:
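# 'date_time' is day-first ('dd-mm-yyyy HH:MM'); without dayfirst=True, 10-06-2018 would be read as 6 October
df_sim_all['date_time'] = pd.to_datetime(df_sim_all['date_time'], dayfirst=True)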
df_sim_all['date_time'] = df_sim_all['date_time'].dt.strftime("%d-%m-%y %H:%M")

In [86]:
df_final_1 = df_final.merge(df_sim_all, left_on= 'date_hour', right_on= 'date_time', how='left')
df_final_1

Unnamed: 0,C1,C2,C3,C4,C5,date_hour,Time,No. of Visitors,index,cluster,date_time,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi)
0,687,,760,,666,01-01-18 00:00,,,,,,,,,,
1,683,,682,,654,01-01-18 01:00,,,,,,,,,,
2,598,,644,,644,01-01-18 02:00,,,,,,,,,,
3,674,,675,,618,01-01-18 03:00,,,,,,,,,,
4,680,,645,,605,01-01-18 04:00,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17306,619,602,603,596,616,31-12-18 19:00,31-12-18 19:00,158.0,,,,,,,,
17307,625,589,582,616,616,31-12-18 20:00,31-12-18 20:00,171.0,,,,,,,,
17308,628,567,592,582,593,31-12-18 21:00,31-12-18 21:00,151.0,,,,,,,,
17309,605,572,589,581,572,31-12-18 22:00,31-12-18 22:00,125.0,,,,,,,,
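
After this merge the frame is roughly twice as long as `df_final` (the last index shown is 17309 versus 8758 before), which suggests that some hours appear more than once in `df_sim_all`. One possible cause is several rows per hour, for example one per area layer, but that is an assumption. The sketch below uses small hypothetical frames to show how repeated keys can be counted and collapsed before a left merge.

```python
import pandas as pd

# Hypothetical hourly frames: 'right' has two rows for the same hour
left = pd.DataFrame({
    "date_hour": ["21-01-18 21:00", "21-01-18 22:00"],
    "C1": [61.9, 62.5],
})
right = pd.DataFrame({
    "date_time": ["21-01-18 22:00", "21-01-18 22:00"],
    "layer_nome": ["Area 1", "Area 2"],
    "numero_presenze": [3026, 1500],
})

# Count keys that occur more than once on the right-hand side
n_repeated = right["date_time"].duplicated().sum()

# One row per hour: sum presences across layers (one possible choice)
hourly = right.groupby("date_time", as_index=False)["numero_presenze"].sum()

# validate raises MergeError if the right-hand keys are still not unique
merged = left.merge(hourly, left_on="date_hour", right_on="date_time",
                    how="left", validate="many_to_one")
print(n_repeated, len(left), len(merged))
```

Whether to sum, average, or keep a single layer depends on what the layers represent, so the aggregation here is only a placeholder.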


In [89]:
df_weather['hourly_date'] = pd.to_datetime(df_weather['hourly_date'])
df_weather['hourly_date'] = df_weather['hourly_date'].dt.strftime("%d-%m-%y %H:%M")

In [90]:
df_final_2 = df_final_1.merge(df_weather, left_on= 'date_hour', right_on= 'hourly_date', how='left')
df_final_2

Unnamed: 0,C1,C2,C3,C4,C5,date_hour,Time,No. of Visitors,index,cluster,date_time,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi),hourly_date,winds,rainfall_mm,snowfall_mm
0,687,,760,,666,01-01-18 00:00,,,,,,,,,,,01-01-18 00:00,0.366667,-0.010,2.600000
1,683,,682,,654,01-01-18 01:00,,,,,,,,,,,01-01-18 01:00,0.590000,0.009,2.600000
2,598,,644,,644,01-01-18 02:00,,,,,,,,,,,01-01-18 02:00,0.450000,0.008,2.266667
3,674,,675,,618,01-01-18 03:00,,,,,,,,,,,01-01-18 03:00,0.400000,0.006,2.266667
4,680,,645,,605,01-01-18 04:00,,,,,,,,,,,01-01-18 04:00,0.780000,-0.011,2.300000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17306,619,602,603,596,616,31-12-18 19:00,31-12-18 19:00,158.0,,,,,,,,,31-12-18 19:00,,0.002,4.200000
17307,625,589,582,616,616,31-12-18 20:00,31-12-18 20:00,171.0,,,,,,,,,31-12-18 20:00,,0.001,3.633333
17308,628,567,592,582,593,31-12-18 21:00,31-12-18 21:00,151.0,,,,,,,,,31-12-18 21:00,,0.011,2.600000
17309,605,572,589,581,572,31-12-18 22:00,31-12-18 22:00,125.0,,,,,,,,,31-12-18 22:00,,0.011,1.966667


In [91]:
df_final_2.columns

Index(['C1', 'C2', 'C3', 'C4', 'C5', 'date_hour', 'Time', 'No. of Visitors',
       'index', 'cluster', 'date_time', 'data_a', 'numero_presenze',
       'layer_id', 'layer_nome', 'dettaglio(secondi)', 'hourly_date', 'winds',
       'rainfall_mm', 'snowfall_mm'],
      dtype='object')

In [92]:
df_finalized = df_final_2.drop(columns = ['Time', 'date_time', 'hourly_date'] )

In [93]:
df_finalized

Unnamed: 0,C1,C2,C3,C4,C5,date_hour,No. of Visitors,index,cluster,data_a,numero_presenze,layer_id,layer_nome,dettaglio(secondi),winds,rainfall_mm,snowfall_mm
0,687,,760,,666,01-01-18 00:00,,,,,,,,,0.366667,-0.010,2.600000
1,683,,682,,654,01-01-18 01:00,,,,,,,,,0.590000,0.009,2.600000
2,598,,644,,644,01-01-18 02:00,,,,,,,,,0.450000,0.008,2.266667
3,674,,675,,618,01-01-18 03:00,,,,,,,,,0.400000,0.006,2.266667
4,680,,645,,605,01-01-18 04:00,,,,,,,,,0.780000,-0.011,2.300000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17306,619,602,603,596,616,31-12-18 19:00,158.0,,,,,,,,,0.002,4.200000
17307,625,589,582,616,616,31-12-18 20:00,171.0,,,,,,,,,0.001,3.633333
17308,628,567,592,582,593,31-12-18 21:00,151.0,,,,,,,,,0.011,2.600000
17309,605,572,589,581,572,31-12-18 22:00,125.0,,,,,,,,,0.011,1.966667


### About df_finalized

In [94]:
df_finalized.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 17311 entries, 0 to 17310
Data columns (total 17 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   C1                  16588 non-null  object 
 1   C2                  11923 non-null  object 
 2   C3                  7003 non-null   object 
 3   C4                  10086 non-null  object 
 4   C5                  17155 non-null  object 
 5   date_hour           17311 non-null  object 
 6   No. of Visitors     1639 non-null   float64
 7   index               9054 non-null   float64
 8   cluster             9054 non-null   object 
 9   data_a              9054 non-null   object 
 10  numero_presenze     9054 non-null   float64
 11  layer_id            9054 non-null   object 
 12  layer_nome          9054 non-null   object 
 13  dettaglio(secondi)  9054 non-null   float64
 14  winds               2261 non-null   float64
 15  rainfall_mm         17098 non-null  float64
 16  snow