Let's import our favourite module: pandas

In [1]:
import pandas as pd

Loading CSV files into Python is easy with `pandas`:

`pd.read_csv("path/to/file.csv")`

In [3]:
# store our CSV in a data frame
scraping_results = pd.read_csv("../scraping-traitors/data.csv")

# describe it... here we go: 122 rows, as expected
scraping_results.describe()

Unnamed: 0,MP,Party,Constituency
count,122,122,122
unique,122,10,122
top,Mike Weir,Labour,Cardiff Central
freq,1,44,1


I prepared this file for us: it lists all the referendum constituencies, along with their voting data.

In [18]:
constituencies = pd.read_csv("constituencies.csv")
constituencies.describe()

Unnamed: 0,refno,Estimated Leave proportion,Known result,Figure to use,Errors,Unnamed: 9
count,632.0,632.0,155.0,632.0,155.0,1.0
mean,327.530063,0.521149,0.508449,0.520727,-0.001722,-0.001722
std,186.631494,0.114091,0.139143,0.114213,0.023033,
min,1.0,0.184812,0.205397,0.205397,-0.096978,-0.001722
25%,165.75,0.453627,0.427662,0.45393,-0.014553,-0.001722
50%,327.5,0.537723,0.522086,0.537383,0.0,-0.001722
75%,488.25,0.601774,0.618988,0.601387,0.00784,-0.001722
max,650.0,0.749608,0.741852,0.749608,0.065017,-0.001722


### Merging
Merging is super-duper important: it lets us combine two data frames that share a column.

In this example, we use `pd.merge()`, which works this way:

`pd.merge(data_frame_one, data_frame_two, on="a String", how="how we perform the merge")`

So we start by passing in our two data frames, then:

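- `on=""` is the name of the column the two data frames have in common. It's "Constituency" in our example, which should be unique enough to match rows reliably
- `how=""` is how we do the merge: left, right, inner, outer. For instance, `how="left"` keeps every row of the first data frame, whether or not it finds a match in the second. See [this doc](https://www.shanelynn.ie/merge-join-dataframes-python-pandas-index-1/#mergetypes)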
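To make the `how` options concrete, here is a minimal sketch on two toy data frames. The names `mps` and `votes` and all the values in them are made up for illustration; they are not taken from the real files.

```python
import pandas as pd

# Toy data frames (hypothetical values, not from the real CSV files)
mps = pd.DataFrame({
    "MP": ["A", "B", "C"],
    "Constituency": ["North", "South", "East"],
})
votes = pd.DataFrame({
    "Constituency": ["North", "South", "West"],
    "Leave": [0.6, 0.4, 0.5],
})

# left: keep every row of `mps`; rows without a match get NaN in `Leave`
left = pd.merge(mps, votes, on="Constituency", how="left")

# inner: keep only the constituencies present in both data frames
inner = pd.merge(mps, votes, on="Constituency", how="inner")

# outer: keep the constituencies found in either data frame
outer = pd.merge(mps, votes, on="Constituency", how="outer")

print(len(left), len(inner), len(outer))  # prints: 3 2 4
```

A left merge suits our case because we want to keep every MP from the scraped list, even when no voting data is found for their constituency.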

In [36]:
# we store the result of the merge in a variable
merge = pd.merge(scraping_results, constituencies, on='Constituency', how='left')
merge.describe()

Unnamed: 0,refno,Estimated Leave proportion,Known result,Figure to use,Errors,Unnamed: 9
count,109.0,109.0,35.0,109.0,35.0,1.0
mean,283.513761,0.394789,0.365457,0.394125,-0.002065,-0.001722
std,170.332008,0.098378,0.112221,0.098248,0.026685,
min,3.0,0.184812,0.207115,0.207115,-0.096187,-0.001722
25%,156.0,0.323257,0.277492,0.321961,-0.011143,-0.001722
50%,263.0,0.400267,0.362034,0.402813,0.0,-0.001722
75%,431.0,0.455067,0.446354,0.4582,0.00939,-0.001722
max,624.0,0.65499,0.638081,0.638081,0.065017,-0.001722
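In the output above, `refno` has a count of 109 while `scraping_results` has 122 rows, which suggests that some constituency names found no match in `constituencies`. One way to list the unmatched rows is the `indicator=True` option of `pd.merge()`. The sketch below uses made-up toy data, with a deliberate typo, rather than the real files.

```python
import pandas as pd

# Toy data frames (hypothetical values); "Eest" is a deliberate typo
mps = pd.DataFrame({
    "MP": ["A", "B", "C"],
    "Constituency": ["North", "South", "Eest"],
})
votes = pd.DataFrame({
    "Constituency": ["North", "South", "East"],
    "Leave": [0.6, 0.4, 0.5],
})

# indicator=True adds a `_merge` column telling us where each row came from:
# "both", "left_only" or "right_only"
checked = pd.merge(mps, votes, on="Constituency", how="left", indicator=True)

# rows of `mps` that found no partner in `votes`
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched[["MP", "Constituency"]])
```

Unmatched rows often come from small spelling differences between the two sources, so they are worth checking by eye before drawing conclusions.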


Finally, we can hand-pick rows from the merge. In this case, we want to see the MPs on our anti-Article 50 list whose constituencies voted majority Leave, i.e. those for which `Figure to use` is at least 0.5 (50%).

In [30]:
merge[merge['Figure to use'] >= 0.5]

Unnamed: 0,MP,Party,Constituency,refno,PCON11CD,Estimated Leave proportion,Known result,Figure to use,Notes,Region,Errors,Unnamed: 9
3,Mr Graham Allen,Labour,Nottingham North,433.0,E14000866,0.65499,0.638081,0.638081,http://www.bbc.co.uk/news/uk-politics-38762034,East Midlands,-0.01691,
13,Tom Brake,Liberal Democrat,Carshalton and Wallington,133.0,E14000621,0.563024,,0.563024,,London,,
18,Chris Bryant,Labour,Rhondda,470.0,W07000052,0.612429,,0.612429,,Wales,,
28,Ann Clwyd,Labour,Cynon Valley,174.0,W07000070,0.569627,,0.569627,,Wales,,
33,Mary Creagh,Labour,Wakefield,591.0,E14001009,0.619813,0.627651,0.627651,http://www.bbc.co.uk/news/uk-politics-38762034,Yorkshire and The Humber,0.007838,
44,Jonathan Edwards,Plaid Cymru,Carmarthen East and Dinefwr,131.0,W07000067,0.537655,,0.537655,,Wales,,
46,Paul Farrelly,Labour,Newcastle-under-Lyme,413.0,E14000834,0.616674,,0.616674,,West Midlands,,
81,Catherine McKinnell,Labour,Newcastle upon Tyne North,416.0,E14000833,0.571474,,0.571474,,North East,,
114,Dr Eilidh Whiteford,Scottish National Party,Banff and Buchan,27.0,S14000007,0.540184,,0.540184,,Scotland,,
115,Dr Alan Whitehead,Labour,"Southampton, Test",524.0,E14000956,0.506528,,0.506528,,South East,,
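The filter above relies on a boolean mask: the comparison produces one `True`/`False` value per row, and indexing with it keeps only the `True` rows. Here is a small sketch with made-up values (the data frame `df` is hypothetical) that also shows what happens to rows with missing data.

```python
import pandas as pd

# Toy data frame (hypothetical values); None becomes NaN, like an unmatched row
df = pd.DataFrame({
    "MP": ["A", "B", "C", "D"],
    "Figure to use": [0.64, 0.50, 0.39, None],
})

# the comparison returns a boolean Series, one value per row;
# comparing NaN with anything gives False
mask = df["Figure to use"] >= 0.5

# indexing with the mask keeps only the rows where it is True
leave_majority = df[mask]

print(mask.tolist())                  # prints: [True, True, False, False]
print(leave_majority["MP"].tolist())  # prints: ['A', 'B']
```

Note that rows with a missing `Figure to use` silently drop out of the result, so MPs whose constituency did not match during the merge will never appear in this list.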
