# Election Data Visualisation Walkthrough

## Introduction

This walkthrough covers an example of data visualisation using election result data that is publicly available. The final result is a geographical map that shows the colour of the winning party in each constituency.

To run this exact code, you will need a `data` folder inside your working directory. `analysis` and `output` folders are also recommended to keep your files tidy, but you can follow this walkthrough without them. The data you will need can be found on the [GitHub repository for this Hackathon](https://github.com/MangoTheCat/rss-2018-hackathon).

Before starting this walkthrough, make sure you have followed the setup instructions in the [README.md](https://github.com/MangoTheCat/rss-2018-hackathon/blob/master/README.md) and downloaded and extracted all the datasets into your data folder.

Firstly, we load all the packages that we will need for this walkthrough, installing them first if necessary.

In [1]:
import pandas as pd
import geopandas as gpd
import folium

We set a display option so that the output appears better in the notebook: setting `max_columns` to `None` makes pandas show every column rather than truncating wide tables.

In [2]:
pd.options.display.max_columns = None

## Loading the Election Results Data

The next step is to import the election data that we will be using and assign it to an object. This is the `ge_2010_results.csv` file that should now be in the `election` subfolder of your data folder. We will then investigate the resulting dataframe so we know what we're dealing with.

In [3]:
results_2010 = pd.read_csv('data/election/ge_2010_results.csv')
results_2010.shape
results_2010.head()

Unnamed: 0,Press Association Reference,Constituency Name,Region,Election Year,Electorate,Votes,AC,AD,AGS,APNI,APP,AWL,AWP,BB,BCP,Bean,Best,BGPV,BIB,BIC,Blue,BNP,BP Elvis,C28,Cam Soc,CG,Ch M,Ch P,CIP,CITY,CNPG,Comm,Comm L,Con,Cor D,CPA,CSP,CTDP,CURE,D Lab,D Nat,DDP,DUP,ED,EIP,EPA,FAWG,FDP,FFR,Grn,GSOT,Hum,ICHC,IEAC,IFED,ILEU,Impact,Ind1,Ind2,Ind3,Ind4,Ind5,IPT,ISGB,ISQM,IUK,IVH,IZB,JAC,Joy,JP,Lab,Land,LD,Lib,Libert,LIND,LLPB,LTT,MACI,MCP,MEDI,MEP,MIF,MK,MPEA,MRLP,MRP,Nat Lib,NCDV,ND,New,NF,NFP,NICF,Nobody,NSPS,PBP,PC,Pirate,PNDP,Poet,PPBF,PPE,PPNV,Reform,Respect,Rest,RRG,RTBP,SACL,Sci,SDLP,SEP,SF,SIG,SJP,SKGP,SMA,SMRA,SNP,Soc,Soc Alt,Soc Dem,Soc Lab,South,Speaker,SSP,TF,TOC,Trust,TUSC,TUV,UCUNF,UKIP,UPS,UV,VCCA,Vote,Wessex Reg,WRP,You,Youth,YRDPL
0,1.0,Aberavon,Wales,2010.0,50838.0,30958,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,558.0,0.0,0.0,0.0,0.0,0.0,1276.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4411.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,919.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16073.0,0.0,5034.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2198.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,489.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2.0,Aberconwy,Wales,2010.0,44593.0,29966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,137.0,0.0,0.0,0.0,0.0,0.0,10734.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7336.0,0.0,5786.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5341.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,632.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,3.0,Aberdeen North,Scotland,2010.0,64808.0,37701,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,635.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4666.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16746.0,0.0,7001.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,8385.0,0.0,0.0,0.0,0.0,0.0,0.0,268.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,4.0,Aberdeen South,Scotland,2010.0,64031.0,43034,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,529.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,8914.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,413.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15722.0,0.0,12216.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,138.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5102.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,5.0,Aberdeenshire West & Kincardine,Scotland,2010.0,66110.0,45195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,513.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13678.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6159.0,0.0,17362.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7086.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,397.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [4]:
results_2010.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 650 entries, 0 to 649
Columns: 144 entries, Press Association Reference to YRDPL
dtypes: float64(141), int64(1), object(2)
memory usage: 731.3+ KB


This table has 650 rows and 144 columns, so we will not inspect the whole table or even the top few full rows. Instead let's look at the first 5 rows of the first 8 columns and the last column to get an idea of what is in the dataframe.

In [5]:
cols = list(results_2010.columns[:8])
cols.append(results_2010.columns[-1])
results_2010[cols].head()

Unnamed: 0,Press Association Reference,Constituency Name,Region,Election Year,Electorate,Votes,AC,AD,YRDPL
0,1.0,Aberavon,Wales,2010.0,50838.0,30958,0.0,0.0,0.0
1,2.0,Aberconwy,Wales,2010.0,44593.0,29966,0.0,0.0,0.0
2,3.0,Aberdeen North,Scotland,2010.0,64808.0,37701,0.0,0.0,0.0
3,4.0,Aberdeen South,Scotland,2010.0,64031.0,43034,0.0,0.0,0.0
4,5.0,Aberdeenshire West & Kincardine,Scotland,2010.0,66110.0,45195,0.0,0.0,0.0


From this we can see that the first 6 columns are constituency information and all columns from then on are the number of votes for each party. For simplicity, in this walkthrough we will focus on the results in Wales only, and so we need to filter the full results to just give us the Welsh entries.

In [6]:
wales_results_2010 = results_2010.loc[results_2010.Region == "Wales", :]
wales_results_2010.shape

(40, 144)

We can see this has worked by checking that the dataframe dimensions have changed. Now we are going to simplify the dataframe as much as possible by deleting any columns that we do not need for this evaluation, including parties with no votes in any Welsh constituencies, as well as some of the general information columns.

In [7]:
cols_not_needed = results_2010.columns[[3, 4, 5]]
wales_results_2010 = wales_results_2010.drop(cols_not_needed, axis=1)

In [8]:
parties_with_no_votes = wales_results_2010.columns[wales_results_2010.sum() == 0]
wales_results_2010 = wales_results_2010.drop(parties_with_no_votes, axis=1)

In [9]:
wales_results_2010.shape

(40, 21)

Now we have a much more manageable dataframe, with 40 rows and 21 columns.
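
The column-dropping step can be tried out on a small table first. The sketch below uses made-up vote counts (the `toy` frame and the party `XYZ` are purely illustrative) and, as one possible variation, sums only the numeric columns so that text columns such as the constituency name are left out of the check.

```python
import pandas as pd

# Made-up vote counts, for illustration only
toy = pd.DataFrame({
    "Constituency Name": ["A", "B", "C"],
    "Lab": [100.0, 50.0, 70.0],
    "Con": [80.0, 90.0, 60.0],
    "XYZ": [0.0, 0.0, 0.0],  # hypothetical party with no votes anywhere
})

# Sum only the numeric (vote) columns, then pick those whose total is zero
totals = toy.select_dtypes("number").sum()
no_votes = totals[totals == 0].index

# Drop the zero-vote columns; the text column is untouched
toy_trimmed = toy.drop(columns=no_votes)
print(list(toy_trimmed.columns))
```

Only `XYZ` is removed here, because it is the only numeric column whose total is zero.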

# Processing the Data

## Finding the Winning Party

Next, we need to work out which party won in each constituency, which is done by finding the name of the column with the highest number of votes in each row. Here a Series of the winning party names is created, then added as an extra column to our simplified Wales dataframe. Again we check that the dimensions have changed as expected.

In [10]:
party_cols = wales_results_2010.columns[3:]
winner = wales_results_2010[party_cols].idxmax(axis=1)
wales_results_2010["Winning"] = winner

In [11]:
wales_results_2010.head()

Unnamed: 0,Press Association Reference,Constituency Name,Region,AGS,Bean,BGPV,BNP,Ch P,Comm,Con,Grn,Ind1,Ind2,Lab,LD,MRLP,NF,PC,Soc Lab,TUSC,UKIP,Winning
0,1.0,Aberavon,Wales,0.0,558.0,0.0,1276.0,0.0,0.0,4411.0,0.0,919.0,0.0,16073.0,5034.0,0.0,0.0,2198.0,0.0,0.0,489.0,Lab
1,2.0,Aberconwy,Wales,0.0,0.0,0.0,0.0,137.0,0.0,10734.0,0.0,0.0,0.0,7336.0,5786.0,0.0,0.0,5341.0,0.0,0.0,632.0,Con
9,10.0,Alyn & Deeside,Wales,0.0,0.0,0.0,1368.0,0.0,0.0,12885.0,0.0,0.0,0.0,15804.0,7308.0,0.0,0.0,1549.0,0.0,0.0,1009.0,Lab
15,16.0,Arfon,Wales,0.0,0.0,0.0,0.0,0.0,0.0,4416.0,0.0,0.0,0.0,7928.0,3666.0,0.0,0.0,9383.0,0.0,0.0,685.0,PC
70,71.0,Blaenau Gwent,Wales,0.0,0.0,6458.0,1211.0,0.0,0.0,2265.0,0.0,0.0,0.0,16974.0,3285.0,0.0,0.0,1333.0,381.0,0.0,488.0,Lab


In [12]:
wales_results_2010.shape

(40, 22)
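
To see what `idxmax(axis=1)` does in isolation, here is a minimal sketch on made-up numbers (the `votes` frame is illustrative, not real data). It also flags ties, since `idxmax` reports only the first column holding the maximum; the tie check is an addition for illustration and is not part of the walkthrough above.

```python
import pandas as pd

# Made-up vote counts, for illustration only
votes = pd.DataFrame(
    {"Lab": [100, 50, 70], "Con": [80, 90, 70], "PC": [10, 20, 30]},
    index=["A", "B", "C"],
)

# For each row, the label of the column that holds the largest value
winner = votes.idxmax(axis=1)

# A row is tied when more than one column equals that row's maximum
is_tie = votes.eq(votes.max(axis=1), axis=0).sum(axis=1) > 1

print(winner.to_dict())
print(is_tie.tolist())
```

In row `C` both `Lab` and `Con` have 70 votes, so `idxmax` returns `Lab` (the first of the two columns) and the tie flag is set.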

## Assigning a Colour

In order to show this data on a map, we need to assign a colour to each winning party. To begin with we will look at the names of the winning parties in this data.

In [13]:
all_parties = wales_results_2010.Winning.unique()
all_parties

array(['Lab', 'Con', 'PC', 'LD'], dtype=object)

From this we can see that there were only 4 winning parties in Wales, and so we can easily assign each of these their appropriate colour by hand and create a reference table. However, we will also include code that would show any other parties as pink, which could be useful if mapping a larger area with more parties (such as the whole UK).

In [14]:
main_parties = ["Lab", "Con", "PC", "LD"]
other_parties = list(set(all_parties) - set(main_parties))
main_colours = ["red", "blue", "green", "orange"]

colour_reference = pd.DataFrame(
    {"Winning": main_parties + other_parties, 
     "Colour": main_colours + ["pink"] * len(other_parties)}
)

Let's look at this reference table to check it is as expected:

In [15]:
colour_reference

Unnamed: 0,Winning,Colour
0,Lab,red
1,Con,blue
2,PC,green
3,LD,orange


Now we want to use this reference frame to add the correct colour to each constituency in the main dataframe. We can do this by merging the two dataframes.

In [16]:
wales_results_2010 = wales_results_2010.merge(colour_reference, how='left')
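
As an aside, the same lookup could be done without a reference table by mapping a dictionary over the column. This is only a sketch of an alternative, not the approach used in this walkthrough; the `winners` Series and the party `XYZ` are made up for illustration.

```python
import pandas as pd

# Party-to-colour lookup, mirroring the colours chosen above
party_colours = {"Lab": "red", "Con": "blue", "PC": "green", "LD": "orange"}

# Made-up winners, including a hypothetical party "XYZ" that is not in the lookup
winners = pd.Series(["Lab", "PC", "XYZ", "LD"], name="Winning")

# map() looks each party up in the dict; parties not found become NaN,
# which fillna() then replaces with the fallback colour
colours = winners.map(party_colours).fillna("pink")
print(colours.tolist())
```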

In order to map the colours, all we now need from the main dataframe are the constituency IDs, the constituency names and their respective colours, so we extract these three columns into a new dataframe.

In [17]:
wales_colours_2010 = wales_results_2010.filter(items=["Press Association Reference", "Constituency Name", "Colour"])
wales_colours_2010 = wales_colours_2010.rename(columns={"Constituency Name": "Name"})

In [18]:
wales_colours_2010.head(6)

Unnamed: 0,Press Association Reference,Name,Colour
0,1.0,Aberavon,red
1,2.0,Aberconwy,blue
2,10.0,Alyn & Deeside,red
3,16.0,Arfon,green
4,71.0,Blaenau Gwent,red
5,89.0,Brecon & Radnorshire,orange


# Geospatial Data

Now that we've dealt with the election data, we need to create a map that we can put these colours onto. To do this we need to be able to map the borders of the constituencies. This requires a new set of data located at `data/geographic/Wales-Constituency-boundaries` in the GitHub repo for this Hackathon. This shapefile should be saved into your data folder and unzipped before continuing. We will then read it with the geopandas package, which returns a GeoDataFrame holding both the attributes and the geometry of each constituency.

In [19]:
borders = gpd.read_file(
    'data/geographic/Wales-Constituency-boundaries/National_Assembly_for_Wales_Constituencies_December_2015_Super_Generalised_Clipped_Boundaries_in_Wales.shp'
)

Let’s take a look at some features of this borders dataframe before continuing.

In [20]:
borders.shape

(40, 6)

In [21]:
borders.objectid.nunique()

40

In [22]:
borders.nawc15nm.nunique()

40

In [23]:
borders.head()

Unnamed: 0,objectid,nawc15cd,nawc15nm,st_areasha,st_lengths,geometry
0,1,W09000001,Ynys Mon,713430200.0,245438.947583,(POLYGON ((238892.2153000003 395248.7354000006...
1,2,W09000002,Arfon,409601600.0,112884.695237,"POLYGON ((265248.2999999998 356615.0999999996,..."
2,3,W09000003,Aberconwy,606410300.0,164784.269486,"POLYGON ((283201.0451999996 381406.0425000004,..."
3,4,W09000004,Clwyd West,925135800.0,200726.490374,"POLYGON ((300311.9201999996 379240.3910000008,..."
4,5,W09000005,Vale of Clwyd,215474800.0,107307.191139,"POLYGON ((314674.2999999998 365751.5999999996,..."


## Matching the Names

We can see that our constituency names are present in this data in the `nawc15nm` column, but these names are not in alphabetical order. Also, the ID we used in our original data, "Press Association Reference", does not match the `objectid` in the shapefile. But we do have the names, so can we use those?

Let's check that things line up between these two data sets before we try any kind of merge.

In [24]:
set(wales_colours_2010.Name) - set(borders.nawc15nm) 

{'Alyn & Deeside',
 'Brecon & Radnorshire',
 'Cardiff South & Penarth',
 'Carmarthen East & Dinefwr',
 'Carmarthen West & Pembrokeshire South',
 'Merthyr Tydfil & Rhymney'}

In [25]:
set(borders.nawc15nm) - set(wales_colours_2010.Name)

{'Alyn and Deeside',
 'Brecon and Radnorshire',
 'Cardiff South and Penarth',
 'Carmarthen East and Dinefwr',
 'Carmarthen West and South Pembrokeshire',
 'Merthyr Tydfil and Rhymney'}

Ah, right, not so straightforward after all! We can see that there are some discrepancies: the geospatial data uses 'and' instead of '&', and it also names one constituency 'South Pembrokeshire' rather than 'Pembrokeshire South'.

We could of course change these names to match the original election data, but for larger data sets this could get tedious. It's better if we instead try to use the ID column in some way.
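
To illustrate why renaming is brittle, here is a small sketch using two of the mismatched names from the output above. The `normalise` helper is hypothetical and written only for this example: a simple text rule fixes the '&' versus 'and' difference but cannot fix names whose words are in a different order.

```python
# Two of the mismatched names from each source, copied from the output above
results_names = {"Alyn & Deeside", "Carmarthen West & Pembrokeshire South"}
border_names = {"Alyn and Deeside", "Carmarthen West and South Pembrokeshire"}


def normalise(name):
    """Hypothetical helper: replace '&' with 'and' and collapse extra spaces."""
    return " ".join(name.replace("&", "and").split())


normalised = {normalise(n) for n in results_names}

# Names that still fail to match after the text rule has been applied
unmatched = normalised - border_names
print(unmatched)
```

'Alyn & Deeside' now matches, but the Carmarthen constituency would still need a hand-written exception.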

## ONS Geographic IDs

The IDs used are provided by the ONS and follow a standard naming policy (e.g. codes beginning with `W` relate to Wales). However, the various geographical boundaries and the hierarchies that make up the UK are far from straightforward, and to add further complication, these can also change over time as new boundaries get agreed. The end result is many codes, which may or may not map to the same region over time and at different levels. Great!

For Wales, this work of linking the two IDs has been done as part of the Data Manipulation Walkthrough notebook, and the output is stored under `data/geographic/wales_region_data.csv`, so we will use that. For more details, refer to that notebook.

In [26]:
wales_region_data = pd.read_csv('data/geographic/wales_region_data.csv')

In [27]:
wales_region_data.head()

Unnamed: 0,nawc15cd,Press Association ID Number,Constituency ID,CHD_Name
0,W09000022,1.0,W07000049,Aberavon
1,W09000003,2.0,W07000058,Aberconwy
2,W09000007,10.0,W07000043,Alyn and Deeside
3,W09000002,16.0,W07000057,Arfon
4,W09000038,71.0,W07000072,Blaenau Gwent


## Merge Results with Geographic Borders

We begin by merging the borders data with this region data, using the `nawc15cd` ID column.

In [28]:
mapping_data = pd.merge(borders, wales_region_data, how='left')

In [29]:
mapping_data.shape

(41, 9)
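
A left merge normally keeps the row count of the left-hand table, so a result with more rows than `borders` would suggest that some key appears more than once in the lookup table. The sketch below shows one way to check for this on toy data; the frames, codes and names are made up, and `left` and `right` merely stand in for the borders and region tables.

```python
import pandas as pd

# Toy stand-ins for the borders and lookup tables; all values are made up
left = pd.DataFrame({"code": ["W1", "W2", "W3"], "area": [10, 20, 30]})
right = pd.DataFrame({
    "code": ["W1", "W2", "W2"],  # "W2" appears twice, "W3" is missing
    "name": ["Alpha", "Beta", "Beta (alt)"],
})

# indicator=True adds a "_merge" column recording where each row came from
merged = pd.merge(left, right, how="left", on="code", indicator=True)
print(merged.shape)  # one row more than `left`, caused by the duplicated key

# Keys that occur more than once in the right-hand table
dup_codes = right.loc[right["code"].duplicated(keep=False), "code"].unique().tolist()

# Left-hand rows that found no partner in the right-hand table
unmatched = merged.loc[merged["_merge"] == "left_only", "code"].tolist()
print(dup_codes, unmatched)
```

Passing `validate="one_to_one"` to `pd.merge` is another option: it raises an error instead of silently adding rows when a key is duplicated.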

Good start. Now let's do the same with the general election result colours, using the Press Association ID. The column is named slightly differently in the two data sets (`Press Association ID Number` and `Press Association Reference`), but the numbers refer to the same constituencies.

In [30]:
mapping_data = pd.merge(
    mapping_data, wales_colours_2010, 
    left_on="Press Association ID Number", 
    right_on="Press Association Reference"
)

In [31]:
mapping_data.head()

Unnamed: 0,objectid,nawc15cd,nawc15nm,st_areasha,st_lengths,geometry,Press Association ID Number,Constituency ID,CHD_Name,Press Association Reference,Name,Colour
0,1,W09000001,Ynys Mon,713430200.0,245438.947583,(POLYGON ((238892.2153000003 395248.7354000006...,647.0,W07000041,Ynys Môn,647.0,Ynys Mon,red
1,2,W09000002,Arfon,409601600.0,112884.695237,"POLYGON ((265248.2999999998 356615.0999999996,...",16.0,W07000057,Arfon,16.0,Arfon,green
2,3,W09000003,Aberconwy,606410300.0,164784.269486,"POLYGON ((283201.0451999996 381406.0425000004,...",2.0,W07000058,Aberconwy,2.0,Aberconwy,blue
3,4,W09000004,Clwyd West,925135800.0,200726.490374,"POLYGON ((300311.9201999996 379240.3910000008,...",155.0,W07000059,Clwyd West,155.0,Clwyd West,blue
4,5,W09000005,Vale of Clwyd,215474800.0,107307.191139,"POLYGON ((314674.2999999998 365751.5999999996,...",588.0,W07000060,Vale of Clwyd,588.0,Vale of Clwyd,red


We now have a dataframe with both the geospatial data and the colour data that we need for plotting. We now clean it up, keeping only the columns we need.


In [32]:
mapping_data.columns

Index(['objectid', 'nawc15cd', 'nawc15nm', 'st_areasha', 'st_lengths',
       'geometry', 'Press Association ID Number', 'Constituency ID',
       'CHD_Name', 'Press Association Reference', 'Name', 'Colour'],
      dtype='object')

In [33]:
mapping_data = mapping_data.filter(['nawc15cd', 'Press Association ID Number', 'CHD_Name', 'Colour', 'geometry'])
mapping_data = mapping_data.rename(
    columns={'Press Association ID Number': 'PA_ID', 
             'CHD_Name': 'Region'}
)

The final step is to save the output. To make use of this later we will save it as a file in GeoJSON format, which stores our column variables alongside the region geometry for easy reference later.

In [34]:
mapping_data.to_file(filename='data/geographic/wales_mapping_data.json', driver='GeoJSON')

# Plotting the Map

To do the actual visualisation we will use the `folium` library, which allows us to create some very nice interactive plots with maps.

We start by displaying a map of Wales, where the latitude and longitude are close to the centre of Wales, and the zoom is set (by trial and error) to show the whole of Wales. We use the Mapbox tiles to give a cleaner view to work with.

In [35]:
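# The geographic centre of Wales is roughly 52.1307° N, 3.7837° W;
# the location used below is a nearby point chosen to frame the whole country.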
m = folium.Map(
        location=[52.4, -3.58], 
        zoom_start=8,
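        # Note: 'Mapbox Bright' may be unavailable in recent folium releases;
        # a built-in style such as 'CartoDB positron' is one possible alternative.
        tiles='Mapbox Bright',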
)
m

Next we will display our border information, passing in the `mapping_data` dataframe that we have just saved as GeoJSON.

In [36]:
folium.GeoJson(mapping_data).add_to(m)
m

The borders display correctly, but now we need to change the styling to shade each region based on the colours we defined earlier. The `GeoJson` class in `folium` has an argument `style_function` which can be used to map each region to a colour/style. To use it we define a function that takes in one feature (a row of our data) and returns a dictionary defining the style. The columns of the `mapping_data` dataframe are available in the feature's "properties" dictionary.

In [37]:
def get_style(data_row):
    return {
        'fillColor': data_row['properties']['Colour'],
        'color' : 'grey',
        'weight' : 1.5,
        'dashArray' : '5, 5'
    }
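
A style function can be tested without drawing a map by calling it on a hand-written feature. The sketch below does this with a variant that falls back to a default colour when a feature has no `Colour` value; the function name `get_style_with_default`, the `lightgrey` default and the feature contents are all assumptions made for illustration.

```python
# A hand-written GeoJSON-like feature; the values are illustrative only
feature = {
    "type": "Feature",
    "properties": {"Region": "Arfon", "Colour": "green"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
    },
}


def get_style_with_default(data_row, default="lightgrey"):
    """Hypothetical variant of get_style with a fallback fill colour."""
    # A missing or empty "Colour" property falls back to the default
    colour = data_row["properties"].get("Colour") or default
    return {
        "fillColor": colour,
        "color": "grey",
        "weight": 1.5,
        "dashArray": "5, 5",
    }


styled = get_style_with_default(feature)
missing = get_style_with_default({"type": "Feature", "properties": {}, "geometry": None})
print(styled["fillColor"], missing["fillColor"])
```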

In [38]:
m = folium.Map(
        location=[52.4, -3.58], 
        zoom_start=8,
        tiles='Mapbox Bright',
)

folium.GeoJson(
    data=mapping_data, 
    style_function=get_style,
    name='2010 Outcome',
).add_to(m)

# This lets us toggle the layers on and off 
folium.LayerControl().add_to(m)

m

The final map draws the constituency borders as dashed grey lines over the base map, with each constituency filled in the colour of its winning party.

# Extensions

Now that you have followed this walkthrough to get you going, try any (or all) of the following ideas for yourself:

* Repeat a similar method for the `ge_2015_results.csv` data set on the GitHub repository.
* Repeat a similar process for the Scotland/England data, or for the whole of Great Britain. You will need the relevant geospatial data and to combine it with the relevant results data. The following notebook will help guide you: [process-geographic-codes-Wales.ipynb](process-geographic-codes-Wales.ipynb)
* Repeat a similar method to map predicted results, as found in the [predict-general-election-results.ipynb](predict-general-election-results.ipynb) notebook, so that these can be visually compared to actual results.
* Rewrite this process into functions so it can be reused on any year's or country's data.
* Anything else you can think of, be creative!
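
As a possible starting point for the function-based extension, here is a minimal sketch that wraps the filter, winner and colour steps into one function. The function name, its parameters and the toy data are all hypothetical; it assumes a results table with `Region` and `Constituency Name` columns like the one used above.

```python
import pandas as pd


def winning_colours(results, region, info_cols, party_colours, default="pink"):
    """Hypothetical helper: names, winners and colours for one region."""
    # Keep only the rows for the requested region
    subset = results.loc[results["Region"] == region]
    # Everything that is not an information column is treated as a party
    votes = subset.drop(columns=info_cols)
    winner = votes.idxmax(axis=1)
    return pd.DataFrame({
        "Name": subset["Constituency Name"],
        "Winning": winner,
        "Colour": winner.map(party_colours).fillna(default),
    })


# Made-up data, for illustration only
toy = pd.DataFrame({
    "Constituency Name": ["A", "B", "C"],
    "Region": ["Wales", "Wales", "Scotland"],
    "Lab": [100.0, 50.0, 70.0],
    "Con": [80.0, 90.0, 60.0],
})

out = winning_colours(
    toy, "Wales", ["Constituency Name", "Region"], {"Lab": "red", "Con": "blue"}
)
print(out)
```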

**Good Luck and Have Fun!**