# Investigating daytime/night-time and holiday period effects on sound levels

This notebook describes the code used to label the hourly sound recordings as 'day' or 'night'.

It also describes a regular expression used to extract dates in the Christmas holiday period, in order to look at any effect the holidays might have on the recorded sound levels.

## Looking at 'daytime' and 'night-time' values for sound recordings

In [1]:
# import libraries
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv('../Data/datasets/2015_2021noise_pollution_Cleaned.csv', sep = '\t', parse_dates = ['datetime'])

In [3]:
# add a column labelling each hourly reading as 'day' or 'night'

# set the hours at which daytime starts and night-time begins

day_start = 6  # 6am
day_end = 18   # 6pm

# create the time of day column and default every row to 'night'

df['time_of_day'] = 'night'

# relabel hours from 6am up to, but not including, 6pm as 'day'

df.loc[(df['hour_of_day'] >= day_start) & (df['hour_of_day'] < day_end), 'time_of_day'] = 'day'
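
As a quick check of how the boundary hours are treated, the same rule can be applied to a small made-up frame. This is only a sketch: the `toy` frame and its hours are invented for illustration, and `np.where` is used here as a one-step alternative to the two-step assignment in the cell above.

```python
import numpy as np
import pandas as pd

day_start = 6   # 6am, as in the notebook
day_end = 18    # 6pm, as in the notebook

# hypothetical hours chosen to sit on either side of each boundary
toy = pd.DataFrame({'hour_of_day': [0, 5, 6, 12, 17, 18, 23]})

# hours in [day_start, day_end) are 'day', everything else is 'night'
is_day = (toy['hour_of_day'] >= day_start) & (toy['hour_of_day'] < day_end)
toy['time_of_day'] = np.where(is_day, 'day', 'night')

print(toy)
```

Hour 6 is labelled 'day' and hour 18 is labelled 'night', because the interval is closed at the start and open at the end.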

In [4]:
# look at the mean noise level by time of day; night would be expected to have the lower mean

# group by time of day, transpose so each row is a monitor, and hide the unwanted rows
# numeric_only=True skips text columns, which recent pandas versions no longer drop automatically

df.groupby(by = 'time_of_day').mean(numeric_only=True).transpose().rename_axis(columns= 'monitor').style.hide(['day_of_month', 'day_of_week', 'hour_of_day'])


monitor,day,night
laeq_BullIsland2,52.847966,49.816465
laeq_Ballyferm3,57.266977,53.977654
laeq_Ballymun4,63.558368,60.281761
laeq_DCCRowingClub5,56.663883,52.851832
laeq_NavanRoad8,55.607841,53.122628
laeq_Raheny9,56.081498,52.391986
laeq_ChanceryPark11,62.731346,59.483342
laeq_BlessingtonBasin12,55.287723,49.6882
laeq_DolphinsBarn13,59.07175,55.944874


In [5]:
# split the data into separate daytime and night-time dataframes

df_day = df.loc[df['time_of_day'] == 'day'].copy()
df_night = df.loc[df['time_of_day'] == 'night'].copy()

# write df_day and df_night to csv files in case they are needed in other analyses

df_day.to_csv('../Data/datasets/2015_2021noise_pollution_Daytime.csv', sep='\t', index = False)
df_night.to_csv('../Data/datasets/2015_2021noise_pollution_Nighttime.csv', sep='\t', index = False)

## Looking at holiday dates and their effect on noise recordings

A regular expression pattern is used to extract Christmas Day (25th December), St Stephen's Day (26th December) and New Year's Day (1st January) from the data, and the mean noise levels on those dates are compared with the mean over the whole dataset.

The comparison is made three times: for the whole dataset, for daytime values only, and for night-time values only.

In [6]:
# drop unwanted columns
unwanted_cols = ['day_of_month','day_of_week','hour_of_day','month', 'day_name', 'date', 'part_of_week' ,'time_of_day']
df.drop(unwanted_cols, axis = 1, inplace=True)
df_day.drop(unwanted_cols, axis = 1, inplace=True)
df_night.drop(unwanted_cols, axis = 1, inplace=True)

In [7]:
# convert datetime to string so that a regex pattern can be matched against it
# the pattern is used to create new dataframes containing only noise values for Christmas Day (25th December), St Stephen's Day (26th December) and New Year's Day (1st January)
df['datetime'] = df['datetime'].astype(str)
df_day['datetime'] = df_day['datetime'].astype(str)
df_night['datetime'] = df_night['datetime'].astype(str)


# regular expression matching 25th December, 26th December or 1st January in any year
# (raw string so that \d is not treated as an invalid escape sequence)
holiday_pattern = r'\d{4}-12-25|\d{4}-12-26|\d{4}-01-01'
df_christmas = df[df['datetime'].str.contains(holiday_pattern)]
df_christmas_daytime = df_day[df_day['datetime'].str.contains(holiday_pattern)]
df_christmas_night = df_night[df_night['datetime'].str.contains(holiday_pattern)]

# compute the mean value for each monitor, for comparison
whole_data_mean = df.mean(numeric_only=True)
whole_Christmas_mean = df_christmas.mean(numeric_only=True)

daytime_mean = df_day.mean(numeric_only=True)
daytime_christmas = df_christmas_daytime.mean(numeric_only=True)

night_mean = df_night.mean(numeric_only=True)
night_christmas = df_christmas_night.mean(numeric_only=True)
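
An alternative that avoids converting `datetime` to a string is to filter on the month and day with the `.dt` accessor. The sketch below is not what the notebook does. It assumes `datetime` is still a datetime64 column (as it is straight after `parse_dates`), and the `sample` frame, the `laeq_example` column and the `holidays` list are made up for illustration.

```python
import pandas as pd

# hypothetical sample; only the dates matter here
sample = pd.DataFrame({
    'datetime': pd.to_datetime([
        '2015-12-24 10:00', '2015-12-25 10:00', '2015-12-26 23:00',
        '2016-01-01 03:00', '2016-01-02 03:00', '2016-12-25 12:00',
    ]),
    'laeq_example': [55.0, 50.0, 51.0, 58.0, 54.0, 49.0],
})

# (month, day) pairs for Christmas Day, St Stephen's Day and New Year's Day
holidays = [(12, 25), (12, 26), (1, 1)]

# build a boolean mask that is True on any of the holiday dates, in any year
month = sample['datetime'].dt.month
day = sample['datetime'].dt.day
mask = pd.Series(False, index=sample.index)
for m, d in holidays:
    mask |= (month == m) & (day == d)

sample_holiday = sample[mask]
print(sample_holiday)
```

Keeping the column as a datetime means later steps can still use date arithmetic or resampling, at the cost of a slightly longer filter.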

In [8]:
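# make a dataframe with the difference between the overall mean and the holiday period mean,
# for the whole dataset, daytime only and night-time only
# (overall minus holiday, so a negative value means the holiday dates were louder on average)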

mean_holiday_df = pd.DataFrame({'Whole day holiday difference': whole_data_mean - whole_Christmas_mean,'Daytime holiday difference': daytime_mean - daytime_christmas, 'Night-time holiday difference': night_mean - night_christmas})

# rename index for table and give 'Monitor' name to index column
mean_holiday_df.rename(index = {'laeq_BullIsland2': 'Bull Island', 'laeq_Ballyferm3': 'Ballyfermot', 'laeq_Ballymun4': 'Ballymun', 'laeq_DCCRowingClub5': 'DCC Rowing Club', 'laeq_NavanRoad8': 'Navan Road', 'laeq_Raheny9': 'Raheny', 'laeq_ChanceryPark11': 'Chancery Park', 'laeq_BlessingtonBasin12': 'Blessington Basin', 'laeq_DolphinsBarn13': 'Dolphins Barn'}).rename_axis(columns= 'Monitor')

Monitor,Whole day holiday difference,Daytime holiday difference,Night-time holiday difference
Bull Island,-2.557271,-0.5862,-4.535001
Ballyfermot,-0.951063,-0.056252,-1.846587
Ballymun,0.342525,1.351649,-0.667925
DCC Rowing Club,0.575082,1.828206,-0.68
Navan Road,-1.014637,-0.189763,-1.840827
Raheny,-1.053535,-0.178607,-1.92859
Chancery Park,0.932422,2.010929,-0.14781
Blessington Basin,-0.722226,1.414702,-2.86358
Dolphins Barn,-0.253641,0.967739,-1.477639
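
When reading the table, note that each value is the overall mean minus the holiday mean, so a negative number means the holiday dates were louder on average than the dataset as a whole. A tiny sketch with invented numbers makes the sign convention concrete (the monitor names and levels below are hypothetical, not taken from the data):

```python
import pandas as pd

# hypothetical mean levels for two made-up monitors
overall_mean = pd.Series({'monitor_a': 55.0, 'monitor_b': 60.0})
holiday_mean = pd.Series({'monitor_a': 57.5, 'monitor_b': 59.0})

# same subtraction order as in the notebook: overall minus holiday
difference = overall_mean - holiday_mean

# negative -> louder on the holiday dates, positive -> quieter
louder_on_holidays = difference < 0

print(pd.DataFrame({'difference': difference,
                    'louder_on_holidays': louder_on_holidays}))
```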
