# Lecture 6 - Thinking about place

## Classroom exercise

### Nigel de Noronha - Sociology

**Setup code** - imports the packages and sets up map plotting with the `geopandas` Python package, called from Julia via `PyCall`.

In [None]:
using DataFrames
using Plots
import PyPlot
using PyCall

py"""
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

import matplotlib
matplotlib.rcParams['figure.figsize'] = [10, 8]

cov_map = gpd.read_file("cov.geojson")
cov_map = cov_map.to_crs(epsg=27700) # reproject to the British National Grid (EPSG:27700)

def plot_map(map, df, column, left_on="lsoa11nm", right_on="LSOA", 
             cmap="PuBu", **kwargs):
    df = pd.DataFrame(df)
    map = map.merge(df, left_on=left_on, right_on=right_on)
    ax = map.plot(column=column, edgecolors="black", cmap=cmap, **kwargs)
    vmin = map[column].min()
    vmax = map[column].max()
    fig = ax.get_figure()
    cax = fig.add_axes([0.9, 0.1, 0.03, 0.8])
    sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(vmin, vmax))
    sm._A = []
    fig.colorbar(sm, cax=cax)
"""
py_plot_map = py"plot_map"
cov_map = py"cov_map"

plot_map(map, df, column; args...) = 
    py_plot_map(map, Dict(zip(names(df), DataFrames.columns(df))), String(column); args...)

## Introduction

Open the dataset `age.csv`. Each row is an LSOA (Lower Layer Super Output Area), the header names the age bands, and each cell holds the number of people in that age band for that LSOA.

In [None]:
age = readtable("age.csv")
head(age)

If the `geopandas` Python package is installed, we can visualise this data on a map, e.g. to look at the numbers of people between the ages of 20 and 24:

In [None]:
plot_map(cov_map, age, :a20_24)

## Dependency ratios

We will look at a couple of key demographic ratios, the age-related dependency ratios, and the way they vary between local areas in Coventry.

The dependency ratios measure the size of the child and older populations relative to the working age population. The working age population is defined as those aged 16-64, children as those aged 0-15 and older people as those aged 65 or over.
Calculate three new fields to hold these three counts for each LSOA.

The child dependency ratio is calculated as:

$$
\text{Child dependency ratio} = \frac{\text{Number of children}}{\text{Working age population}}
$$

The older people dependency ratio (called the adult dependency ratio in the formula and code) is calculated as:

$$
\text{Adult dependency ratio} = \frac{\text{Number of people aged 65 or over}}{\text{Working age population}}
$$
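
As a quick check of the two formulas, here is a minimal sketch using made-up counts for three hypothetical areas. The numbers and variable names are illustrative assumptions and are not taken from `age.csv`.

```julia
# Made-up counts for three hypothetical areas (not taken from age.csv)
children    = [200, 150, 300]   # aged 0-15
working_age = [800, 600, 1000]  # aged 16-64
over65      = [160, 300, 100]   # aged 65 or over

# `./` divides element by element, giving one ratio per area
child_dep_ratio = children ./ working_age
adult_dep_ratio = over65 ./ working_age

println(child_dep_ratio)  # [0.25, 0.25, 0.3]
println(adult_dep_ratio)  # [0.2, 0.5, 0.1]
```

A child dependency ratio of 0.25 means there is one child for every four people of working age.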

a)	calculate the dependency ratios for each LSOA

b)	generate a histogram of the dependency ratios

c)	calculate summary statistics (mean, standard deviation, minimum and maximum)

### Solution to part (a)

Looking at the column names, we can see which columns we need to include to count children, working age population and those over 65.

In [None]:
names(age)

We can count the number of people in each category by summing the relevant columns. We then add two further columns to the `age` dataset holding the dependency ratios. Note that we use `./` for element-wise division.

In [None]:
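age[:children] = age[:a0_4] + age[:a5_7] + age[:a8_9] + age[:a10_14] + age[:a15]
# NOTE: the column names :a18_19 and :a65_74 are assumed to follow the pattern of the
# other columns; check them against the output of names(age) above
age[:working_age] = age[:a16_17] + age[:a18_19] + age[:a20_24] + age[:a25_29] + age[:a30_44] + age[:a45_59] + age[:a60_64]
age[:over65] = age[:a65_74] + age[:a75_84] + age[:a85_89] + age[:a90]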

age[:child_dep_ratio] = age[:children] ./ age[:working_age]
age[:adult_dep_ratio] = age[:over65] ./ age[:working_age]

head(age)

### Solution to part (b)

We can use the new columns we have computed to plot histograms of the two ratios:

In [None]:
histogram(age[:child_dep_ratio], label="Child dependency ratio", alpha=0.5)
histogram!(age[:adult_dep_ratio], label="Adult dependency ratio", alpha=0.5)

We can also plot these data on a map of Coventry:

In [None]:
plot_map(cov_map, age, :child_dep_ratio)

In [None]:
plot_map(cov_map, age, :adult_dep_ratio)

### Solution to part (c)

We can use `describe()` to show summary statistics:

In [None]:
describe(age[:child_dep_ratio])

In [None]:
describe(age[:adult_dep_ratio])
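
Depending on the package version, the output of `describe()` may not include every statistic the exercise asks for; the standard deviation in particular may be missing. As a fallback, the sketch below computes the four statistics by hand using only base Julia. The ratios are made-up values for five hypothetical areas, not results from `age.csv`.

```julia
# Made-up ratios for five hypothetical areas (not taken from age.csv)
ratios = [0.2, 0.3, 0.25, 0.4, 0.35]

n = length(ratios)
mean_ratio = sum(ratios) / n
# Sample standard deviation (divides by n - 1)
std_ratio = sqrt(sum((ratios .- mean_ratio) .^ 2) / (n - 1))

println("mean = ", mean_ratio)
println("std  = ", std_ratio)
println("min  = ", minimum(ratios))
println("max  = ", maximum(ratios))
```

For these values the mean is about 0.3 and the standard deviation about 0.079. Julia also provides `mean` and `std` functions (in the `Statistics` standard library on Julia 1.x), which should give the same results.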

## Public service demand

In this application we will look at the school places required for Coventry using the 2011 census data. In reality these calculations are regularly updated to reflect births and migration into and out of local areas. We are interested in four education sectors:

-	Pre-primary for ages 0-4
-	Primary for ages 5-10
-	Secondary for ages 11-15
-	Tertiary for ages 16-17

You will notice that one of the age bands (10-14) straddles the boundary between primary and secondary. We can apportion it by putting 20% into primary and 80% into secondary.
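
The 20/80 split follows from the band covering five single years of age, of which only age 10 falls in primary. It implicitly assumes that people are spread evenly across those five years. A minimal sketch of the arithmetic, using a made-up count for one hypothetical area:

```julia
# Made-up count for one hypothetical area (not taken from age.csv)
a10_14 = 250  # people aged 10-14

# Age 10 is one of the five single years in the band, so 1/5 goes to primary
to_primary   = 0.2 * a10_14
# Ages 11-14 are the other four years, so 4/5 goes to secondary
to_secondary = 0.8 * a10_14

println(to_primary)    # 50.0
println(to_secondary)  # 200.0
```

The two shares add back up to the original count, so nobody is lost or double-counted by the split.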

a)	calculate the number of potential students for each sector by LSOA

b)	calculate summary statistics (mean, standard deviation, minimum and maximum)

c)	identify the top five LSOAs for each education sector

### Solution to part (a)

We compute the new columns as before:

In [None]:
age[:pre_primary] = age[:a0_4]
age[:primary] = age[:a5_7] + age[:a8_9] + 0.2*age[:a10_14]
age[:secondary] = 0.8*age[:a10_14] + age[:a15]
age[:tertiary] = age[:a16_17];

### Solution to part (b)

As before, we can use `describe()` to print summary stats:

In [None]:
for sector in [:pre_primary, :primary, :secondary, :tertiary]
    println(sector)
    println("-----------")
    describe(age[sector])
    println()
end

### Solution to part (c)

We use `sort!()` to sort in place, specifying the column to sort by and that we want to reverse, so that the LSOAs with the largest numbers of places required come first. For example, for primary:

In [None]:
sort!(age, cols=:primary, rev=true)
head(age[[:LSOA,:primary]], 5) # show the top 5

We can also extract the relevant LSOAs directly:

In [None]:
age[1:5,:LSOA]

We can loop over all the sectors, using `display()` to output the tables (otherwise only the last line of code in a cell leads to output).

In [None]:
for sector in [:pre_primary, :primary, :secondary, :tertiary]
    sort!(age, cols=sector, rev=true)
    display(head(age[[:LSOA, sector]], 5))
end

Finally, we can also plot these data on maps, e.g. to highlight the areas with highest demand for primary schools. We use `convert()` to make an integer array because `geopandas` doesn't know how to plot Boolean (`true`/`false`) data on maps.

In [None]:
age[:top5_primary] = convert(Array{Int}, age[:primary] .>= 211.0) 
plot_map(cov_map, age, :top5_primary)
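
The cell above hard-codes the threshold `211.0`, presumably the fifth-largest primary count from the table shown earlier. As an alternative, the threshold can be derived from the data itself. The sketch below shows the idea in base Julia on made-up counts for eight hypothetical areas; the values and variable names are illustrative assumptions.

```julia
# Made-up primary-age counts for eight hypothetical areas (not taken from age.csv)
primary = [120.0, 250.0, 211.0, 90.0, 300.0, 230.0, 215.0, 180.0]

# The fifth-largest value is the smallest count that still makes the top five
threshold = sort(primary, rev=true)[5]

# Flag the areas at or above the threshold as 1, all others as 0
top5 = [p >= threshold ? 1 : 0 for p in primary]

println(threshold)  # 211.0
println(top5)       # [0, 1, 1, 0, 1, 1, 1, 0]
```

Note that if several areas tie at the threshold value, more than five will be flagged.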