# A Quick Example: Some Visualisations on Harold Shipman's Cases

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import pandas as pd
import seaborn as sns
sns.set(style="ticks", color_codes=True)
sns.set_context("talk")

In [None]:
mpl.rcParams['figure.figsize'] = (12*1.5, 6*1.5)

## Essentials of Running/Navigating/Editing Notebooks

- A notebook has two modes: **command** and **edit**.
- _Command_ mode acts on the structure of the notebook; _edit_ mode acts on the content of a cell.
- `H` (in command mode) shows an overview of the key bindings for both modes.
- `Esc` takes you from _edit_ back to _command_ mode; `Enter` takes you from _command_ to _edit_ mode.
- `Shift+Enter` evaluates the content of the current cell.
- `M` turns a cell into a _Markdown_ cell (for lightly formatted text); `Y` turns it back into a code cell.
- `A` creates a new cell above the current one; `B` creates one below.
- A shortcut cheatsheet used to be [here](https://conferences.oreilly.com/jupyter/jup-ny/public/content/jupyter-shortcuts); it now appears to be hosted [here](https://treehouse-code-samples.s3.amazonaws.com/Python/jupyter-shortcuts.pdf).
- An overview of the Markdown syntax is [here](https://guides.github.com/features/mastering-markdown/).


## Links from The Art of Statistics for Shipman's Example

* [David Spiegelhalter's GitHub for the book](https://github.com/dspiegel29/ArtofStatistics).

* Within that repository, the resources for the age and year of death of the victims are [here](https://github.com/dspiegel29/ArtofStatistics/tree/master/00-1-age-and-year-of-deathofharold-shipmans-victims), and the comparison percentages for times of death are [here](https://github.com/dspiegel29/ArtofStatistics/tree/master/00-2-shipman-times). The corresponding raw datasets are [here](https://raw.githubusercontent.com/dspiegel29/ArtofStatistics/master/00-1-age-and-year-of-deathofharold-shipmans-victims/00-1-shipman-confirmed-victims-x.csv) and [here](https://raw.githubusercontent.com/dspiegel29/ArtofStatistics/master/00-2-shipman-times/00-2-shipman-times-x.csv).

## Importing/Loading/Inspecting Dataset

In [None]:
# !wget https://raw.githubusercontent.com/dspiegel29/ArtofStatistics/master/00-1-age-and-year-of-deathofharold-shipmans-victims/00-1-shipman-confirmed-victims-x.csv -O ../data/shipman-confirmed-victims.csv

In [None]:
# !ls ../data

In [None]:
# !head -n 3 ../data/shipman-confirmed-victims.csv
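The commented-out shell cells above rely on `wget` being installed. As a hedged alternative, here is a minimal sketch that uses only the Python standard library. The helper name `fetch_if_missing` is hypothetical (it is not part of the original notebook), and the example call is left commented out, like the shell cells, so that nothing is downloaded by accident.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_if_missing(url, dest):
    """Download `url` to `dest` unless `dest` already exists; return the path.

    Hypothetical helper, sketched as a stand-in for the `wget` cells.
    """
    dest = Path(dest)
    if not dest.exists():
        # Create the target directory (e.g. ../data) if it is not there yet.
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(dest))
    return dest


# Example call, using the same URL and target path as the wget cell above:
# fetch_if_missing(
#     'https://raw.githubusercontent.com/dspiegel29/ArtofStatistics/master/'
#     '00-1-age-and-year-of-deathofharold-shipmans-victims/'
#     '00-1-shipman-confirmed-victims-x.csv',
#     '../data/shipman-confirmed-victims.csv',
# )
```

Skipping the download when the file already exists keeps re-running the notebook cheap and avoids overwriting a local copy.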

In [None]:
cases_df = pd.read_csv('../data/shipman-confirmed-victims.csv', parse_dates=["DateofDeath"])

In [None]:
cases_df.head()

In [None]:
# Keep the original `gender` column as `gender_01`, and let `gender2` take the name `gender`.
cases_df = cases_df.rename(columns={'gender': 'gender_01', 'gender2': 'gender'})
cases_df.head()
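The rename above moves the original `gender` column out of the way so that `gender2` can take its name. The following sketch, on a made-up two-row frame (the values are illustrative assumptions, not taken from the dataset), suggests that a single mapping and two chained `rename` calls give the same result, because `rename` maps each old label independently.

```python
import pandas as pd

# Toy frame with invented values; only the column names mirror the dataset.
toy_df = pd.DataFrame({'gender': [0, 1], 'gender2': ['Women', 'Men']})

# One mapping: each old label is looked up independently.
single = toy_df.rename(columns={'gender': 'gender_01', 'gender2': 'gender'})

# Two chained calls, one label at a time.
chained = (toy_df
           .rename(columns={'gender': 'gender_01'})
           .rename(columns={'gender2': 'gender'}))

print(list(single.columns))
print(single.equals(chained))
```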

## Shipman's Confirmed Victims: Year of Death, Age, Gender

In [None]:
ax = sns.scatterplot(data=cases_df, x='fractionalDeathYear', y='Age', hue='gender', alpha=0.7)
ax.set_title("Shipman's Confirmed Victims: Year of Death Against Age, with Gender")
ax.set_xlabel('Year of Death')
ax.get_figure().savefig('../figures/shipman-cases-deathyear-age-gender.png', pad_inches=0.1, bbox_inches='tight')

In [None]:
# !ls -lh ../figures/

In [None]:
# !open ../figures/shipman-cases-deathyear-age-gender.png
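The `savefig` call above writes into `../figures`, and Matplotlib does not create missing directories, so the call fails if that folder is absent. A minimal sketch of a guard that could run before saving follows; the helper name `ensure_dir` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def ensure_dir(path):
    """Create `path` (and any missing parents) if needed; return it as a Path.

    Hypothetical helper: calling it on an existing directory is a no-op.
    """
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    return path


# Example call for this notebook's layout (commented out to avoid side effects):
# figures_dir = ensure_dir('../figures')
```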

## Shipman's Victims by Time of Day

In [None]:
# !wget https://raw.githubusercontent.com/dspiegel29/ArtofStatistics/master/00-2-shipman-times/00-2-shipman-times-x.csv -O ../data/shipman-comparison-times.csv

In [None]:
times_df = pd.read_csv('../data/shipman-comparison-times.csv', index_col='Hour')

In [None]:
times_df.head()

In [None]:
ax = times_df.plot.line()
ax.set_title('Percentage of Deaths by Hour of Day: Shipman against Typical')
ax.set_ylabel('Percentage of Deaths')
fig = ax.get_figure()
fig.savefig('../figures/shipman-time-percent.png', pad_inches=0.1, bbox_inches='tight')

In [None]:
# !open ../figures/shipman-time-percent.png