# Death Comparison

This script was written to make death statistics for four locations easy to interpret. It is set up to load the data available [here](https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/causesofdeath/datasets/deathregistrationsandoccurrencesbylocalauthorityandhealthboard). The link allows an Excel file to be downloaded. This script requires the 'Occurrences - All data' sheet to be exported into a single CSV file.

I make no claim that this script is particularly efficient; it was made for personal interest.

For any questions, please contact:
> Andrew Paul Barnes<br>
> Doctoral Student & Teaching Assistant<br>
> Department of Architecture and Civil Engineering<br>
> University of Bath<br>
> a.p.barnes@bath.ac.uk

## Libraries

To begin, several libraries are imported to allow easy manipulation of the data.

In [10]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Loading Data Required

First, the areas of interest are defined in the variable *AREAS* and the file name/location is given in *MASTER_FILE*.

In [38]:
AREAS = ["Swindon", "Northampton", "Wiltshire", "Cheltenham"]
MASTER_FILE = "occurrences-alldata.csv"

Next, the CSV file is loaded and filtered to keep only the areas indicated above and the rows where the cause of death is 'All causes'.

In [71]:
# Load data
weekly_deaths = pd.read_csv(MASTER_FILE, delimiter=',', skiprows=3)

# Filter to only contain the areas of interest
weekly_deaths = weekly_deaths.loc[weekly_deaths['Area name'].isin(AREAS)]
weekly_deaths = weekly_deaths.loc[weekly_deaths['Cause of death'] == "All causes"]

# Select the columns required
weekly_deaths = weekly_deaths[["Area name", "Week number", "Place of death", "Number of deaths"]]

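# Extract the distinct places of death and encode each one as an integer index.
# The mapping is applied to the "Place of death" column only, so values in
# other columns can never be replaced by accident.
death_locations = weekly_deaths["Place of death"].unique().tolist()
location_codes = {dloc: locidx for locidx, dloc in enumerate(death_locations)}
weekly_deaths = weekly_deaths.assign(**{"Place of death": weekly_deaths["Place of death"].map(location_codes)})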
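The filtering and encoding steps above can be tried without the ONS file. The following is a minimal sketch on a tiny hand-made table: the rows, the `sample` and `areas` names, and the "COVID 19" label are invented for illustration and are not real ONS figures. It also swaps the enumeration loop for `pd.factorize`, which is an alternative to the approach used in this script rather than what the script itself does.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT real ONS figures.
sample = pd.DataFrame({
    "Area name": ["Swindon", "Swindon", "Bath", "Wiltshire"],
    "Cause of death": ["All causes", "COVID 19", "All causes", "All causes"],
    "Week number": [1, 1, 1, 2],
    "Place of death": ["Hospital", "Hospital", "Home", "Care home"],
    "Number of deaths": [5, 1, 3, 2],
})

# Hypothetical shortened list of areas of interest.
areas = ["Swindon", "Wiltshire"]

# Combine both conditions into a single boolean mask.
mask = sample["Area name"].isin(areas) & (sample["Cause of death"] == "All causes")
columns = ["Area name", "Week number", "Place of death", "Number of deaths"]
filtered = sample.loc[mask, columns].copy()

# pd.factorize returns integer codes plus the unique labels,
# ordered by first appearance in the column.
codes, labels = pd.factorize(filtered["Place of death"])
filtered["Place of death"] = codes

print(filtered)
print(list(labels))
```

Keeping `labels` around makes it possible to translate the integer codes back into readable place names later, for example in plot legends.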

Finally, split the data into areas. This step simplifies plotting and aggregating the weekly data.

In [73]:
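def split_data(deaths, areas):
    """ Splits the data from a master pandas sheet into area matrices. """
    area_matrices = {}
    for area in areas:
        # Drop the 'Area name' column and keep the remaining columns as a NumPy array
        area_matrices[area] = deaths.loc[deaths['Area name'] == area].iloc[:, 1:].to_numpy()
    # Return the dictionary so the caller can use the per-area matrices
    return area_matrices

area_matrices = split_data(weekly_deaths, AREAS)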
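Each area matrix built in `split_data` has three columns: week number, place-of-death index, and number of deaths. As a rough sketch of the aggregation this split is meant to simplify, the example below sums deaths per week across all places of death. The `area_matrix` values are invented for illustration, and the aggregation shown is one possible approach, not necessarily the one used later in this script.

```python
import numpy as np

# Made-up matrix for illustration only.
# Columns: week number, place-of-death index, number of deaths.
area_matrix = np.array([
    [1, 0, 5],
    [1, 1, 2],
    [2, 0, 4],
    [2, 1, 1],
])

# Distinct week numbers, sorted in ascending order.
weeks = np.unique(area_matrix[:, 0])

# For each week, sum the deaths column over every row belonging to that week.
weekly_totals = np.array(
    [area_matrix[area_matrix[:, 0] == week, 2].sum() for week in weeks]
)

print(weeks)
print(weekly_totals)
```

The resulting `weeks` and `weekly_totals` arrays line up element by element, so they could be passed directly to `plt.plot(weeks, weekly_totals)`.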