# Integrate Monthly Datasets

## Set Up

Install the required library by running the following command in a terminal before executing the notebook:
- pip install pandas


Run the following cell in the Jupyter notebook first so that the required library is imported:

In [1]:
import pandas as pd

## Load Datasets

In [14]:
# Load data into dataframes.
df_air_quality = pd.read_csv('../../2-nsw-air-quality/data-processed-monthly.csv')
df_asthma = pd.read_csv('../../3-nsw-health-stats/respiratory-health/asthma/emergency-department-presentations/monthly/data-processed-alt.csv')

# View Headers.
print("Air Quality Headers:")
print(df_air_quality.columns.tolist())
print("\nAsthma Headers:")
print(df_asthma.columns.tolist())

Air Quality Headers:
['date', 'lhd', 'CO ppm', 'NO pphm', 'NO2 pphm', 'OZONE pphm', 'PM10 µg/m³', 'SO2 pphm']

Asthma Headers:
['date', 'lhd', 'Female rate per 100,000 population', 'Male rate per 100,000 population']
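
Both files share the 'date' and 'lhd' columns, which are the natural join keys. Before merging, it can help to check how well those keys line up. The following is a minimal sketch under stated assumptions: `toy_air`, `toy_asthma`, and the helper `key_report` are hypothetical names introduced for illustration, and the toy values are not taken from the real datasets.

```python
import pandas as pd

# Toy stand-ins for the two dataframes (illustrative values only).
toy_air = pd.DataFrame({
    'date': ['2014-07', '2014-08', '2014-09'],
    'lhd': ['Hunter New England'] * 3,
    'CO ppm': [0.1, 0.2, 0.3],
})
toy_asthma = pd.DataFrame({
    'date': ['2014-07', '2014-08', '2014-10'],
    'lhd': ['Hunter New England'] * 3,
    'Female rate per 100,000 population': [10.0, 20.0, 30.0],
})


def key_report(left, right, keys):
    """Summarise how two dataframes line up on the given key columns."""
    left_keys = set(left[keys].itertuples(index=False, name=None))
    right_keys = set(right[keys].itertuples(index=False, name=None))
    return {
        # Rows that repeat a key combination already seen in the same frame.
        'left_duplicates': int(left.duplicated(subset=keys).sum()),
        'right_duplicates': int(right.duplicated(subset=keys).sum()),
        # Key combinations present in both frames, or in only one of them.
        'shared': len(left_keys & right_keys),
        'left_only': len(left_keys - right_keys),
        'right_only': len(right_keys - left_keys),
    }


report = key_report(toy_air, toy_asthma, ['date', 'lhd'])
print(report)
```

Calling `key_report(df_air_quality, df_asthma, ['date', 'lhd'])` on the real dataframes would give the same kind of summary. Non-zero `left_only` or `right_only` counts indicate rows that an inner join would drop.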


## Data Manipulation

Rename columns for clarity.

In [15]:
df_asthma = df_asthma.rename(columns={
    'Female rate per 100,000 population': 'asthma edh [f] [rate per 100,000]',
    'Male rate per 100,000 population': 'asthma edh [m] [rate per 100,000]',
})

## Merge Datasets

Inner-join the two dataframes on the 'date' and 'lhd' columns, keeping only the (date, lhd) pairs that appear in both datasets.

In [16]:
# Merge dataframes on 'date' and 'lhd' columns.
df_merged = pd.merge(df_air_quality, df_asthma, on=['date', 'lhd'], how='inner')

# Preview the first five rows of the merged dataframe.
df_merged.head()

|   | date | lhd | CO ppm | NO pphm | NO2 pphm | OZONE pphm | PM10 µg/m³ | SO2 pphm | asthma edh [f] [rate per 100,000] | asthma edh [m] [rate per 100,000] |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2014-07 | Hunter New England | 0.2 | 1.22 | 1.14 | 1.5 | 15.6 | 0.12 | 45.3 | 32.3 |
| 1 | 2014-08 | Hunter New England | 0.2 | 0.72 | 0.94 | 1.9 | 14.875 | 0.14 | 49.4 | 42.6 |
| 2 | 2014-09 | Hunter New England | 0.2 | 0.6 | 0.88 | 1.966667 | 16.4 | 0.12 | 36.0 | 26.7 |
| 3 | 2014-10 | Hunter New England | 0.3 | 0.42 | 0.82 | 2.333333 | 23.775 | 0.14 | 34.4 | 31.2 |
| 4 | 2014-11 | Hunter New England | 0.2 | 0.24 | 0.62 | 2.433333 | 24.75 | 0.14 | 24.7 | 25.9 |
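
An inner join silently drops rows whose keys exist in only one dataframe. One way to see what was dropped is to run an outer join with `indicator=True`, and to pass `validate='one_to_one'` so that pandas raises an error if a key combination is duplicated on either side. The sketch below uses hypothetical toy frames (`toy_air`, `toy_asthma`) with illustrative values, not the real data.

```python
import pandas as pd

# Toy stand-ins for the two dataframes (illustrative values only).
toy_air = pd.DataFrame({
    'date': ['2014-07', '2014-08', '2014-09'],
    'lhd': ['Hunter New England'] * 3,
    'CO ppm': [0.1, 0.2, 0.3],
})
toy_asthma = pd.DataFrame({
    'date': ['2014-07', '2014-08', '2014-10'],
    'lhd': ['Hunter New England'] * 3,
    'asthma edh [f] [rate per 100,000]': [10.0, 20.0, 30.0],
})

# how='outer' keeps every row from both sides.
# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'.
# validate='one_to_one' raises MergeError if a key repeats on either side.
check = pd.merge(
    toy_air,
    toy_asthma,
    on=['date', 'lhd'],
    how='outer',
    indicator=True,
    validate='one_to_one',
)

# Rows that an inner join on the same keys would have discarded.
unmatched = check[check['_merge'] != 'both']
print(unmatched[['date', 'lhd', '_merge']])
```

If `unmatched` is empty for the real dataframes, the inner join loses nothing. Otherwise the listed rows show which months or districts are missing from one of the sources.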


## Output Dataset

In [17]:
df_merged.to_csv('data-merged.csv', index=False)
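
The `index=False` argument keeps the dataframe's row index out of the file. Without it, reading the file back yields an extra unnamed first column. The sketch below illustrates the difference with a hypothetical toy frame and in-memory buffers instead of files on disk.

```python
import io

import pandas as pd

# Toy dataframe (illustrative values only).
toy = pd.DataFrame({
    'date': ['2014-07', '2014-08'],
    'lhd': ['Hunter New England'] * 2,
    'CO ppm': [0.1, 0.2],
})

# Default behaviour (index=True): the row index is written as a first column.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()

# index=False: only the data columns are written.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()

print(cols_with_index)
print(cols_without_index)
```

The first list starts with an `'Unnamed: 0'` column, while the second matches the original column names.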