# Analysis and visualisation of Essex population by local authorities
<br>
Essex is a county in the East of England region, in the south-east of the country. According to Wikipedia (https://en.wikipedia.org/wiki/Essex), Essex had a population of 1,832,752 (mid-2019 est.) and an area of 3,670 km<sup>2</sup>.
<br><br>
This document analyses and visualises the population of Essex by local authority. Ward-level data (https://en.wikipedia.org/wiki/Wards_and_electoral_divisions_of_the_United_Kingdom) are aggregated to local authorities. Two datasets are used to perform this work:<br>
- Ward boundaries (https://osdatahub.os.uk)<br>
- Ward population (https://www.ons.gov.uk)<br>
<br>
<b>Python libraries:</b>
<br>
- geopandas<br>
- pandas<br>
- numpy<br>
- bokeh<br>
<br>
<b>Steps:</b>
<br>
1. Check population and area from the datasets<br>
2. Plot local authority boundaries with population as color<br>
3. Plot local authority boundaries with population density as color<br>
4. Conclusion and next steps<br>
<br>
<b>BELOW TO KEEP INTERNAL</b><br>
<b>Data must be obtained from different sources:</b>
<br>
- Ward boundaries (https://osdatahub.os.uk/downloads/open/BoundaryLine)
<br>
- Ward population (https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/wardlevelmidyearpopulationestimatesexperimental)


<b>Sources that gave ideas:</b>
<br>
https://www.citypopulation.de/en/uk/eastofengland/
<br>
http://darribas.org/gds15/content/labs/lab_03.html
<br>

## Import libraries

In [1]:
import geopandas as gpd
import pandas as pd
import numpy as np
#import matplotlib.pyplot as plt
import json
from bokeh.io import output_notebook, show
from bokeh.models import (CDSView, ColorBar, ColumnDataSource,
                          CustomJS, CustomJSFilter, 
                          GeoJSONDataSource, HoverTool,
                          LinearColorMapper, Slider)
#from bokeh.layouts import column, row, widgetbox
from bokeh.plotting import figure, save
from bokeh.tile_providers import STAMEN_TERRAIN, CARTODBPOSITRON
from bokeh.models import NumeralTickFormatter
output_notebook()
#from shapely.geometry import Polygon # Required to use "overlay" https://gist.github.com/korakot/1cc3764602628dfdfcfe586305c31788
import Functions as fcn # custom functions

In [2]:
inputPath = './data/'
outputsPath = './docs/'

## Load and pre-process datasets previously prepared and saved as pickle

In [3]:
boundariesPopulationEssex = pd.read_pickle(inputPath + "/02_Preprocessed/boundariesPopulationEssex.pkl")

In [4]:
boundariesPopulationEssex = boundariesPopulationEssex.rename(columns={"Ward Name 1": "wardName", "All Ages": "allAges","LA name (2019 boundaries)": "LAname"})
boundariesPopulationEssex['density'] = round(boundariesPopulationEssex['allAges']/boundariesPopulationEssex['AREA_km2'],2)
boundariesPopulationEssex.head()

Unnamed: 0,FILE_NAME,AREA_CODE,DESCRIPTIO,CODE,HECTARES,geometry,AREA_km2,wardName,LAname,allAges,density
0,ESSEX_COUNTY,DIW,District Ward,E05004070,3256.236,"POLYGON ((561084.097 197904.500, 561080.403 19...",32.56236,Brizes and Doddinghurst,Brentwood,6272.0,192.62
1,ESSEX_COUNTY,DIW,District Ward,E05004076,2795.113,"POLYGON ((561248.800 198814.899, 561242.699 19...",27.95113,"Ingatestone, Fryerning and Mountnessing",Brentwood,6260.0,223.96
2,ESSEX_COUNTY,DIW,District Ward,E05004081,1827.16,"POLYGON ((560409.997 187764.202, 560412.297 18...",18.2716,Warley,Brentwood,6399.0,350.22
3,ESSEX_COUNTY,DIW,District Ward,E05004071,1984.608,"POLYGON ((560409.997 187764.202, 560407.804 18...",19.84608,"Herongate, Ingrave and West Horndon",Brentwood,3696.0,186.23
4,ESSEX_COUNTY,DIW,District Ward,E05004078,683.749,"POLYGON ((561084.097 197904.500, 561101.296 19...",6.83749,Shenfield,Brentwood,5400.0,789.76
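
As a quick sanity check of the `AREA_km2` and `density` columns, the values shown for the first ward can be reproduced by hand. The sketch below assumes, based on the table, that `AREA_km2` is simply `HECTARES` divided by 100 (1 km<sup>2</sup> = 100 ha); the helper name `population_density` is illustrative only and is not part of the notebook.

```python
def population_density(population, area_km2, ndigits=2):
    """Return inhabitants per square kilometre, rounded as in the notebook."""
    return round(population / area_km2, ndigits)


# Values copied from the first row of the table above (Brizes and Doddinghurst).
hectares = 3256.236
all_ages = 6272.0

# Assumption: 1 km2 = 100 ha, which matches the AREA_km2 column.
area_km2 = hectares / 100
density = population_density(all_ages, area_km2)

print(round(area_km2, 5), density)
```

The result agrees with the `density` value displayed for that ward.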


In [5]:
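# Group by "LAname": sum numeric columns only (geometry cannot be summed and is dissolved separately below)
boundariesPopulationEssex_1 = boundariesPopulationEssex.groupby('LAname').sum(numeric_only=True).reset_index()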
boundariesPopulationEssex_2 = boundariesPopulationEssex[['geometry','LAname']].dissolve(by='LAname').reset_index()
boundariesPopulationEssexLA = boundariesPopulationEssex_1.merge(boundariesPopulationEssex_2, on='LAname')
boundariesPopulationEssexLA['density'] = round(boundariesPopulationEssexLA['allAges']/boundariesPopulationEssexLA['AREA_km2'],2)
boundariesPopulationEssexLA = gpd.GeoDataFrame(boundariesPopulationEssexLA, geometry=boundariesPopulationEssexLA.geometry,crs="epsg:27700")
boundariesPopulationEssexLA

Unnamed: 0,LAname,HECTARES,AREA_km2,allAges,density,geometry
0,Basildon,11044.911,110.44911,187199.0,1694.89,"POLYGON ((577858.599 190816.499, 577862.499 19..."
1,Braintree,61170.799,611.70799,152604.0,249.47,"POLYGON ((583228.698 213642.605, 583244.397 21..."
2,Brentwood,15312.403,153.12403,77021.0,503.0,"POLYGON ((560409.997 187764.202, 560412.297 18..."
3,Castle Point,6374.319,63.74319,90376.0,1417.81,"POLYGON ((580730.701 181261.304, 580527.501 18..."
4,Chelmsford,34299.729,342.99729,178388.0,520.09,"POLYGON ((581875.601 197839.098, 581885.100 19..."
5,Colchester,34677.317,346.77317,194706.0,561.48,"POLYGON ((591668.904 216198.903, 591657.402 21..."
6,Epping Forest,33898.412,338.98412,131689.0,388.48,"POLYGON ((545356.198 192817.096, 545325.996 19..."
7,Harlow,3053.794,30.53794,87067.0,2851.11,"POLYGON ((548352.296 208891.505, 548341.396 20..."
8,Maldon,42804.921,428.04921,64926.0,151.68,"POLYGON ((596571.796 195062.103, 596548.702 19..."
9,Rochford,26290.041,262.90041,87368.0,332.32,"POLYGON ((599147.999 195110.999, 599181.498 19..."
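
The local authority density is recomputed from the summed population and area because summing the ward densities would give a meaningless number. The sketch below is a plain-Python stand-in for the `groupby`/`sum` step, not the notebook's code. It uses only the five Brentwood wards displayed earlier, so its totals are a partial illustration and not the full Brentwood figures.

```python
from collections import defaultdict

# (LAname, allAges, AREA_km2, ward density) for the five Brentwood wards shown earlier
wards = [
    ("Brentwood", 6272.0, 32.56236, 192.62),
    ("Brentwood", 6260.0, 27.95113, 223.96),
    ("Brentwood", 6399.0, 18.27160, 350.22),
    ("Brentwood", 3696.0, 19.84608, 186.23),
    ("Brentwood", 5400.0, 6.83749, 789.76),
]

totals = defaultdict(lambda: {"allAges": 0.0, "AREA_km2": 0.0, "density_sum": 0.0})
for la_name, population, area_km2, ward_density in wards:
    totals[la_name]["allAges"] += population
    totals[la_name]["AREA_km2"] += area_km2
    totals[la_name]["density_sum"] += ward_density

for la_name, t in totals.items():
    # Correct density: ratio of the sums, not the sum of the ward ratios.
    t["density"] = round(t["allAges"] / t["AREA_km2"], 2)
    print(la_name, t["allAges"], round(t["AREA_km2"], 5),
          "density:", t["density"], "summed ward densities:", round(t["density_sum"], 2))
```

The summed ward densities are several times larger than the ratio of the sums, which is why the `density` column is overwritten after the merge.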


## 1. Check population and area from the datasets

In [6]:
print('Total population Essex: ' + str('{:,}'.format(int(boundariesPopulationEssexLA['allAges'].sum()))))
print('Total area Essex [km2]: ' + str('{:,}'.format(round(boundariesPopulationEssexLA['AREA_km2'].sum(),2))))
print('Total population density Essex [/km2]: ' + str(round(boundariesPopulationEssexLA['allAges'].sum()/boundariesPopulationEssexLA['AREA_km2'].sum(),2)))

Total population Essex: 1,846,655
Total area Essex [km2]: 3,948.91
Total population density Essex [/km2]: 467.64


The total population from the datasets is close to the Wikipedia figure (1,846,655 vs 1,832,752, a difference of about 0.8%). The total area is noticeably higher than the Wikipedia value (3,949 km<sup>2</sup> vs 3,670 km<sup>2</sup>, about 7.6% more). Projecting the Earth onto a 2D plane distorts reality and cannot exactly represent the Earth's curvature, but this effect is expected to be small at the scale of a county. The more likely cause is visible when the boundaries are plotted on a map: some of them extend into the sea, which increases the calculated area.
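
The size of these gaps can be worked out directly from the figures quoted in this document. The sketch below is only arithmetic on those numbers; the helper name `relative_difference_pct` is illustrative.

```python
def relative_difference_pct(value, reference):
    """Signed difference of value from reference, in percent of reference."""
    return (value - reference) / reference * 100


# Figures quoted in this document
wiki_population, wiki_area_km2 = 1_832_752, 3_670
data_population, data_area_km2 = 1_846_655, 3_948.91

pop_diff = relative_difference_pct(data_population, wiki_population)
area_diff = relative_difference_pct(data_area_km2, wiki_area_km2)
wiki_density = wiki_population / wiki_area_km2

print(f"Population difference: {pop_diff:+.2f} %")
print(f"Area difference: {area_diff:+.2f} %")
print(f"Density from Wikipedia figures: {wiki_density:.2f} /km2")
```

The area gap is roughly ten times larger than the population gap, so the density derived from the datasets comes out lower than the one implied by the Wikipedia figures.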

## 2. Plot local authority boundaries with population as color

In [7]:
# Input GeoJSON source that contains features for plotting
geosource = GeoJSONDataSource(geojson = boundariesPopulationEssexLA.to_crs("EPSG:3857").to_json()) # convert to mercator projection
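
The background tiles are served in Web Mercator (EPSG:3857), which is why the boundaries are reprojected before plotting. As a rough illustration of what that projection does, the sketch below applies the standard spherical Web Mercator formulas to a longitude/latitude pair. It is not what `to_crs` runs internally for EPSG:27700 input, and the Chelmsford coordinates are approximate values assumed for the example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used as the sphere radius


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 longitude/latitude in degrees to EPSG:3857 metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_area_inflation(lat_deg):
    """Factor by which Web Mercator inflates small areas at a given latitude."""
    return 1 / math.cos(math.radians(lat_deg)) ** 2


# Approximate location of Chelmsford (assumed for illustration)
x, y = lonlat_to_web_mercator(0.47, 51.74)
inflation = mercator_area_inflation(51.74)
print(round(x), round(y), round(inflation, 2))
```

Web Mercator inflates areas considerably at this latitude, so areas are better taken from the `AREA_km2` column than measured on the reprojected geometry.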

In [12]:
nbColor = 50
custom_colors1 = fcn.linear_gradient("#fffff7","#fcc586",nbColor)['hex']
custom_colors2 = fcn.linear_gradient("#fcc586","#9e0000",nbColor)['hex']
custom_colors = custom_colors1 + custom_colors2[1:] # Use 2 linear gradient to be able to choose a "middle" color

color_mapper = LinearColorMapper(palette = custom_colors, low = 0, high = boundariesPopulationEssexLA['allAges'].max())

color_bar = ColorBar(color_mapper = color_mapper, 
                     label_standoff = 8,
                     width = 500, height = 20,
                     border_line_color = None,
                     location = (0,0), 
                     orientation = 'horizontal',
                     formatter = NumeralTickFormatter(format="0,0"))

p = figure(title = 'Essex population by local authorities (mid-2019)', 
           plot_height = 800 ,
           plot_width = 800, 
           toolbar_location = 'right',
           tools = "pan, wheel_zoom, box_zoom, reset",
           x_axis_type="linear", y_axis_type="linear")
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
# Add patch renderer to figure.
states = p.patches('xs','ys', source = geosource,
                   fill_color = {'field' :'allAges',
                                 'transform' : color_mapper},
                   line_color = 'gray',
                   line_width = 0.25, 
                   fill_alpha = 0.7)
# Create hover tool
p.add_tools(HoverTool(renderers = [states],
                      tooltips = [('LA name','@LAname'),
                                  ('Population','@allAges{0,0}'),
                                  ('Density [/km\u00b2]','@density{0,0}')]))
p.add_layout(color_bar, 'below')
p.axis.visible = False
p.add_tile(CARTODBPOSITRON)
p.title.text_font_size = '10pt'
p.title.text_font = 'verdana'
p.title.text_color = 'black'
show(p)
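
`fcn.linear_gradient` comes from the custom `Functions` module, whose source is not shown here. The sketch below is a hypothetical re-implementation, assuming the function interpolates linearly between two hex colours in RGB space and returns a dict with a `'hex'` list. It illustrates how joining two gradients gives a palette with a chosen middle colour.

```python
def hex_to_rgb(hex_color):
    """'#fcc586' -> (252, 197, 134)"""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def linear_gradient(start_hex, end_hex, n):
    """Hypothetical stand-in for fcn.linear_gradient: n colours from start to end (n >= 2)."""
    start, end = hex_to_rgb(start_hex), hex_to_rgb(end_hex)
    colors = []
    for i in range(n):
        t = i / (n - 1)  # 0.0 at the first colour, 1.0 at the last
        rgb = [round(start[c] + t * (end[c] - start[c])) for c in range(3)]
        colors.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return {"hex": colors}


nbColor = 50
custom_colors1 = linear_gradient("#fffff7", "#fcc586", nbColor)["hex"]
custom_colors2 = linear_gradient("#fcc586", "#9e0000", nbColor)["hex"]
# Drop the first colour of the second gradient so the middle colour is not repeated.
custom_colors = custom_colors1 + custom_colors2[1:]

print(len(custom_colors), custom_colors[0], custom_colors[49], custom_colors[-1])
```

Under these assumptions the palette has 99 entries, with `#fcc586` exactly in the middle.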

In [13]:
figureName = p
figureName.toolbar_location = None
fcn.saveFigure(figureName,outputsPath,'bokeh','png','mapPopulationEssexLA')

  warn("save() called but no resources were supplied and output_file(...) was never called, defaulting to resources.CDN")
  warn("save() called but no title was supplied and output_file(...) was never called, using default title 'Bokeh Plot'")


## 3. Plot local authority boundaries with population density as color

In [10]:
nbColor = 50
custom_colors1 = fcn.linear_gradient("#fffff7","#fcc586",nbColor)['hex']
custom_colors2 = fcn.linear_gradient("#fcc586","#9e0000",nbColor)['hex']
custom_colors = custom_colors1 + custom_colors2[1:] # Use 2 linear gradient to be able to choose a "middle" color

color_mapper = LinearColorMapper(palette = custom_colors, low = 0, high = boundariesPopulationEssexLA['density'].max())

color_bar = ColorBar(color_mapper = color_mapper, 
                     label_standoff = 8,
                     width = 500, height = 20,
                     border_line_color = None,
                     location = (0,0), 
                     orientation = 'horizontal',
                     formatter = NumeralTickFormatter(format="0,0"))

p = figure(title = 'Essex population density by local authorities (mid-2019)', 
           plot_height = 800 ,
           plot_width = 800, 
           toolbar_location = 'right',
           tools = "pan, wheel_zoom, box_zoom, reset",
           x_axis_type="linear", y_axis_type="linear")
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
# Add patch renderer to figure.
states = p.patches('xs','ys', source = geosource,
                   fill_color = {'field' :'density',
                                 'transform' : color_mapper},
                   line_color = 'gray',
                   line_width = 0.25, 
                   fill_alpha = 0.7)
# Create hover tool
p.add_tools(HoverTool(renderers = [states],
                      tooltips = [('LA name','@LAname'),
                                  ('Population','@allAges{0,0}'),
                                  ('Density [/km\u00b2]','@density{0,0}')]))
p.add_layout(color_bar, 'below')
p.axis.visible = False
p.add_tile(CARTODBPOSITRON)
p.title.text_font_size = '10pt'
p.title.text_font = 'verdana'
p.title.text_color = 'black'
show(p)
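
A `LinearColorMapper` splits the range between `low` and `high` into equal-width bins, one per palette colour. The function below is a simplified approximation of that mapping, not Bokeh's actual code, and `map_to_palette_index` is a hypothetical name. It uses the densities of three local authorities from the table shown earlier and, for illustration only, takes the largest of those three values as `high`, whereas the notebook uses the maximum over all local authorities.

```python
def map_to_palette_index(value, low, high, n_colors):
    """Approximate linear colour mapping: equal-width bins between low and high."""
    if value <= low:
        return 0
    if value >= high:
        return n_colors - 1
    return min(int((value - low) / (high - low) * n_colors), n_colors - 1)


n_colors = 99  # 50 + 49 colours from the two joined gradients
densities = {"Maldon": 151.68, "Basildon": 1694.89, "Harlow": 2851.11}
high = max(densities.values())  # assumption: only these three rows are considered

indices = {name: map_to_palette_index(d, 0, high, n_colors)
           for name, d in densities.items()}
print(indices)
```

Because the scale is linear and starts at 0, one very dense area pushes most of the others towards the pale end of the palette.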

In [11]:
figureName = p
figureName.toolbar_location = None
fcn.saveFigure(figureName,outputsPath,'bokeh','png','mapPopulationDensityEssexLA')

## 4. Conclusion and next steps

The south and north-east of Essex are the most populated parts of the county. Southend-on-Sea and Harlow are the most densely populated areas. Some coastal wards have part of their area in the sea; because these areas are uninhabitable, the population density calculated for them is only approximate and underestimates the density on land.
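
To get a feel for the size of this effect, the sketch below recomputes a density after removing a share of the area assumed to be sea. The 20% share is a made-up figure for illustration, not a measurement; the population and area are Maldon's values from the local authority table.

```python
def land_only_density(population, total_area_km2, sea_fraction):
    """Density over land only, given the fraction of the area that is sea."""
    if not 0 <= sea_fraction < 1:
        raise ValueError("sea_fraction must be in [0, 1)")
    return population / (total_area_km2 * (1 - sea_fraction))


population, area_km2 = 64926.0, 428.04921  # Maldon, from the local authority table
hypothetical_sea_fraction = 0.20           # illustrative assumption only

raw = round(population / area_km2, 2)
adjusted = round(land_only_density(population, area_km2, hypothetical_sea_fraction), 2)
print(raw, adjusted)
```

Measuring the actual land area, for example by clipping the boundaries to a coastline dataset, would be needed to replace the hypothetical fraction with a real one.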