# Asthma-Like Illness Emergency Department Presentations (monthly) | Processing

The main tasks completed to clean and preprocess this dataset were:

**Data Manipulation**
1. Rename columns.
2. Remove ' LHD' from Local Health District (LHD) name values.
3. Remove 'All' data (representing a state-wide average).
4. Remove columns holding confidence interval data.
5. Remove rows holding 'Persons' data in the sex column (representing a rate per 100,000 for all sexes combined).

## Set Up

Ensure that the required libraries are installed by running the following in the terminal before executing the notebook:
- pip install pandas

Run the following cell in the Jupyter notebook to import the required libraries:

In [1]:
import pandas as pd

## Load Dataset

In [2]:
# File path.
file_path = 'data-raw.csv'

# Read the file.
df = pd.read_csv(file_path)

## Data Manipulation

Rename columns to match the Air Quality dataset.

In [3]:
# Rename columns.
df = df.rename(columns={
    'LHD': 'lhd',
    'Period': 'date'
})

# Set column names to lower case.
df.columns = df.columns.str.lower()
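
As a quick illustration of the renaming step, the sketch below applies the same two operations to a small made-up frame. Only `LHD` and `Period` are known raw column names; `Sex`, `Rate per 100,000 population` and `LL 95% CI` are assumptions inferred from later cells, and the confidence interval value is invented.

```python
import pandas as pd

# Toy frame standing in for the raw data; column names are partly assumed.
toy = pd.DataFrame({
    'Sex': ['Males'],
    'LHD': ['Sydney LHD'],
    'Period': ['2014-07'],
    'Rate per 100,000 population': [22.6],
    'LL 95% CI': [18.0],  # made-up value, for illustration only
})

# Rename first, then lower-case every header.
toy = toy.rename(columns={'LHD': 'lhd', 'Period': 'date'})
toy.columns = toy.columns.str.lower()

print(list(toy.columns))
# ['sex', 'lhd', 'date', 'rate per 100,000 population', 'll 95% ci']
```

Lower-casing after the rename means later cells can refer to headers such as `sex` and `% ci` without worrying about the capitalisation used in the raw file.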

Remove ' LHD' from Local Health District values.

In [4]:
# Remove ' LHD' from the 'lhd' column.
df['lhd'] = df['lhd'].str.replace(' LHD', '', regex=False)
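
`str.replace` removes ' LHD' wherever it occurs in a value. If the text only ever appears at the end of a name, a suffix-only removal is a possible alternative. The sketch below compares the two on example names, assuming the raw values look like 'Sydney LHD' and a pandas version of 1.4 or later (where `str.removesuffix` is available).

```python
import pandas as pd

# Assumed shape of the raw values: district name followed by ' LHD'.
names = pd.Series(['Sydney LHD', 'Western NSW LHD'])

# Literal replacement: removes ' LHD' wherever it appears in the string.
replaced = names.str.replace(' LHD', '', regex=False)

# Suffix-only removal: strips ' LHD' only when the string ends with it.
stripped = names.str.removesuffix(' LHD')

print(replaced.tolist())  # ['Sydney', 'Western NSW']
print(stripped.tolist())  # ['Sydney', 'Western NSW']
```

Both give the same result for names of this shape, so the choice only matters if ' LHD' could appear mid-string.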

Remove rows representing state-wide aggregated data.

In [5]:
# Remove rows with 'All' in the 'lhd' column.
df = df[~df['lhd'].str.contains('All')]
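
`str.contains('All')` is a substring match, so it would also drop any district whose name happens to contain 'All'. The sketch below contrasts it with an exact comparison on made-up values. Both the state-wide label 'All' and the district name 'Allambie' are hypothetical here; the exact label used in the raw file should be checked before switching approaches.

```python
import pandas as pd

# Made-up values; 'All' as the state-wide label and 'Allambie' are hypothetical.
toy = pd.DataFrame({'lhd': ['Sydney', 'All', 'Allambie']})

# Substring match, as in the cell above: also drops 'Allambie'.
by_substring = toy[~toy['lhd'].str.contains('All')]

# Exact match: drops only rows whose value is exactly 'All'.
by_exact = toy[toy['lhd'] != 'All']

print(by_substring['lhd'].tolist())  # ['Sydney']
print(by_exact['lhd'].tolist())      # ['Sydney', 'Allambie']
```

If no district name contains 'All', the two filters behave identically and the substring version is sufficient.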

Remove columns holding Confidence Interval data.

In [6]:
# Drop columns with '% ci' in the header (column names were lower-cased above).
df = df.loc[:, ~df.columns.str.contains('% ci')]
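
The column filter depends on the earlier lower-casing step, because `str.contains` is case-sensitive by default. The sketch below shows this on a toy frame; the confidence interval header names (`ll 95% ci`, `ul 95% ci`) and their values are assumptions, not taken from the raw file.

```python
import pandas as pd

# Toy frame with assumed confidence interval headers and made-up bounds.
toy = pd.DataFrame({
    'sex': ['Males'],
    'rate per 100,000 population': [22.6],
    'll 95% ci': [18.0],
    'ul 95% ci': [27.0],
})

# Boolean mask over the headers: True where the header contains '% ci'.
mask = toy.columns.str.contains('% ci', regex=False)
toy = toy.loc[:, ~mask]
print(list(toy.columns))  # ['sex', 'rate per 100,000 population']

# The match is case-sensitive: an upper-case header would not be caught.
upper_match = pd.Index(['LL 95% CI']).str.contains('% ci', regex=False)
print(upper_match.tolist())  # [False]
```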

Remove rows holding 'Persons' data in the sex column.

In [7]:
# Drop rows with 'Persons' in the 'sex' column.
df = df[~df['sex'].str.contains('Persons')]

## Output Processed Dataset

In [8]:
# File path.
file_path_output = 'data-processed.csv'

# Save the file.
df.to_csv(file_path_output, index=False)
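
The `index=False` argument keeps the DataFrame index out of the saved file. The sketch below illustrates the difference using in-memory buffers instead of real files, with two rows taken from the output shown in this notebook. The variable names are illustrative only.

```python
import io
import pandas as pd

toy = pd.DataFrame({
    'sex': ['Males', 'Females'],
    'lhd': ['Sydney', 'Western NSW'],
    'date': ['2014-07', '2023-05'],
    'rate per 100,000 population': [22.6, 57.7],
})

# index=False: only the data columns are written.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
reloaded = pd.read_csv(without_index)

# Default (index=True): the index is written as an extra, unnamed first column.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

print(list(reloaded.columns))
# ['sex', 'lhd', 'date', 'rate per 100,000 population']
print(reloaded_with_index.columns[0])
# Unnamed: 0
```

Reading the processed file back in this way is also a simple check that the saved columns match what downstream steps expect.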

## View Dataset

In [9]:
df

| | sex | lhd | date | rate per 100,000 population |
|---|---|---|---|---|
| 0 | Males | Sydney | 2014-07 | 22.6 |
| 1 | Males | Sydney | 2014-08 | 28.9 |
| 2 | Males | Sydney | 2014-09 | 15.7 |
| 3 | Males | Sydney | 2014-10 | 19.2 |
| 4 | Males | Sydney | 2014-11 | 19.7 |
| ... | ... | ... | ... | ... |
| 3127 | Females | Western NSW | 2023-02 | 41.4 |
| 3128 | Females | Western NSW | 2023-03 | 43.3 |
| 3129 | Females | Western NSW | 2023-04 | 43.9 |
| 3130 | Females | Western NSW | 2023-05 | 57.7 |
