In [None]:
import pandas as pd
from urllib.request import urlopen
import json
from time import sleep

# Parliament members

The file below lists the names and ids of MPs. Using the ids, we will retrieve more information through the Parliament Members API.

In [None]:
!wget https://www.dropbox.com/s/n038gl81vpanv4e/df_all_mps_sub.csv

In [None]:
df_mps = pd.read_csv("df_all_mps_sub.csv")
df_mps.head()

Unnamed: 0,id,name
0,172,"Abbott, Ms Diane"
1,4212,"Abrahams, Debbie"
2,4057,"Adams, Nigel"
3,4639,"Afolami, Bim"
4,1586,"Afriyie, Adam"


## Get members

The API we are going to access is this:

* Documentation: https://members-api.parliament.uk/index.html

* Endpoint: https://members-api.parliament.uk/api/Members/{id}


### Keir Starmer

Now let's try to access Keir Starmer's information. The following line looks up his id:

In [None]:
df_mps[df_mps['name'].str.contains("Starmer")]

Unnamed: 0,id,name,gender,party,const
563,4514,"Starmer, Keir",M,Labour,Holborn and St Pancras


#### Create URL

We need to construct a URL to access. The following code does this with the `.format()` method. Simple concatenation would be sufficient for this example, but the template approach helps in more complicated cases.



In [1]:
url_template = 'https://members-api.parliament.uk/api/Members/{id}'
url_template.format(id = "4514")

'https://members-api.parliament.uk/api/Members/4514'

Now, let's access the page with `urlopen()` and hand the response to `json.load()`

In [None]:
current_id = "4514"  # Keir Starmer's id, found above
res = urlopen(url_template.format(id = current_id))
member_json = json.load(res)
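
As a slightly more defensive variant of the cell above, the request can be wrapped in a small helper that closes the connection when done and gives up after a timeout. This is only a sketch: the name `fetch_json` and the 10-second default are assumptions for illustration, not part of the original notebook.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Open `url` and parse the response body as JSON.

    Hypothetical helper: the name and the default timeout are assumptions.
    """
    # The `with` block closes the response even if JSON parsing fails.
    with urlopen(url, timeout=timeout) as res:
        return json.load(res)
```

With this helper, the two lines in the cell above would reduce to `member_json = fetch_json(url_template.format(id="4514"))`.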

Once the JSON is loaded, you can print it out with `json.dumps()`:

In [None]:
print(json.dumps(member_json, indent=4))

Let's extract
- gender
- party
- electoral constituency
- membership start date

For example, the membership start date can be extracted like this:


In [None]:
member_json.get("value").get('latestHouseMembership').get('membershipStartDate')

'2015-05-07T00:00:00'

Let's extract all 4 fields and convert them (together with the id) to a dataframe called `df_info`.

The result looks like this:

|id |membershipStartDate|gender             |party|membershipFrom|
|---|-------------------|-------------------|-----|--------------|
|4514|2015-05-07T00:00:00|M                  |Labour|Holborn and St Pancras|
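
One possible way to build `df_info` is sketched below. To keep it runnable without a network request, it uses a hand-written stand-in for the API response filled with the values from the table above. Only the `latestHouseMembership` / `membershipStartDate` path is confirmed by the earlier cell. The other key names (`gender`, `latestParty`, `name`, `membershipFrom`) and the helper name `extract_member_fields` are assumptions to check against your own JSON dump.

```python
import pandas as pd

# Hand-written stand-in for the API response. Key names other than
# `latestHouseMembership` / `membershipStartDate` are assumptions.
sample_member_json = {
    "value": {
        "id": 4514,
        "gender": "M",
        "latestParty": {"name": "Labour"},
        "latestHouseMembership": {
            "membershipFrom": "Holborn and St Pancras",
            "membershipStartDate": "2015-05-07T00:00:00",
        },
    }
}


def extract_member_fields(member_json):
    """Return the id and the four fields of interest as a flat dict."""
    # `or {}` keeps the chained lookups safe when a level is missing.
    value = member_json.get("value") or {}
    membership = value.get("latestHouseMembership") or {}
    party = value.get("latestParty") or {}
    return {
        "id": value.get("id"),
        "membershipStartDate": membership.get("membershipStartDate"),
        "gender": value.get("gender"),
        "party": party.get("name"),
        "membershipFrom": membership.get("membershipFrom"),
    }


# A list holding one dict becomes a one-row dataframe.
df_info = pd.DataFrame([extract_member_fields(sample_member_json)])
print(df_info)
```

Collecting the fields in a plain dict first keeps the column order explicit and makes it easy to add more fields later.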


#### Accessing contact information

There is another Members API endpoint, which provides contact information.

* Endpoint: https://members-api.parliament.uk/api/Members/{id}/Contact

Let's access it and extract some more information.



In [2]:
url_template_contact = 'https://members-api.parliament.uk/api/Members/{id}/Contact'
url_template_contact.format(id = "4514")


'https://members-api.parliament.uk/api/Members/4514/Contact'

Access the endpoint and extract some information

Dump the JSON.

Get the website address and Twitter URL as a dictionary. The result looks like this:

```
{'Website': 'http://www.keirstarmer.com/',
 'Twitter': 'https://twitter.com/keir_starmer'}
```
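
A dictionary like this could be built with a dict comprehension over the contact entries. The sketch below runs on a hand-written stand-in for the Contact response. The key names `type` and `line1`, the placeholder office entry, and the helper name `extract_web_contacts` are all assumptions, so compare them with your own JSON dump before relying on them.

```python
# Hand-written stand-in for the Contact response. The key names `type`
# and `line1` are assumptions; the first entry is a made-up placeholder.
sample_contact_json = {
    "value": [
        {"type": "Office", "line1": "placeholder postal address"},
        {"type": "Website", "line1": "http://www.keirstarmer.com/"},
        {"type": "Twitter", "line1": "https://twitter.com/keir_starmer"},
    ]
}


def extract_web_contacts(contact_json, wanted=("Website", "Twitter")):
    """Map each wanted contact type to its address."""
    entries = contact_json.get("value") or []
    return {
        entry.get("type"): entry.get("line1")
        for entry in entries
        if entry.get("type") in wanted
    }


contacts = extract_web_contacts(sample_contact_json)
print(contacts)
```

Passing a longer `wanted` tuple, for example one that also names Facebook and Instagram, would pick up further contact types without changing the function.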

Convert the dictionary into a dataframe

Concatenate with `df_info`

The resulting dataframe should look like this:

|id  |membershipStartDate|gender|party |membershipFrom        |Website                    |Twitter                         |
|----|-------------------|------|------|----------------------|---------------------------|--------------------------------|
|4514|2015-05-07T00:00:00|M     |Labour|Holborn and St Pancras|http://www.keirstarmer.com/|https://twitter.com/keir_starmer|
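
The conversion and concatenation steps can be sketched as follows. The values are copied from the tables above so that the example runs without any request; the names `df_contact` and `df_member` are illustrative assumptions.

```python
import pandas as pd

# Values copied from the tables above, so no request is needed here.
df_info = pd.DataFrame([{
    "id": 4514,
    "membershipStartDate": "2015-05-07T00:00:00",
    "gender": "M",
    "party": "Labour",
    "membershipFrom": "Holborn and St Pancras",
}])
contacts = {
    "Website": "http://www.keirstarmer.com/",
    "Twitter": "https://twitter.com/keir_starmer",
}

# Wrapping the dict in a list yields a single row.
df_contact = pd.DataFrame([contacts])

# axis=1 places the frames side by side; rows are aligned on the index,
# which is 0 in both frames here.
df_member = pd.concat([df_info, df_contact], axis=1)
print(df_member)
```

Because `pd.concat(..., axis=1)` aligns rows on the index, both frames need the same index for the columns to end up on one row.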


## Combine the previous work into a function

In [None]:
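def get_member_info(id, sleep_sec = 3):
  # Placeholder: fill df_member_info using the steps from the sections above
  df_member_info = pd.DataFrame()
  sleep(sleep_sec)  # pause so that repeated calls do not overload the API
  return df_member_info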
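
One way the function might be filled in is sketched below; it is not the notebook's reference solution. The `fetch` parameter, the helper `fetch_json`, and the constants `BASE_URL` and `WEB_TYPES` are assumptions introduced for illustration, and the JSON key names other than `membershipStartDate` are assumed as well. Making the downloader a parameter lets the function be tried with canned data before it touches the network.

```python
import json
from time import sleep
from urllib.request import urlopen

import pandas as pd

BASE_URL = "https://members-api.parliament.uk/api/Members/{id}"
# Contact types to keep; assumed to match the `type` values in the response.
WEB_TYPES = ("Website", "Twitter", "Facebook", "Instagram")


def fetch_json(url):
    """Download `url` and parse the body as JSON (hypothetical helper)."""
    with urlopen(url, timeout=10) as res:
        return json.load(res)


def get_member_info(member_id, sleep_sec=3, fetch=fetch_json):
    """Return a one-row dataframe with basic and contact info for one member.

    `fetch` is an assumed hook: any callable that maps a URL to parsed JSON.
    """
    member_url = BASE_URL.format(id=member_id)
    member = fetch(member_url).get("value") or {}
    contact = fetch(member_url + "/Contact").get("value") or []

    membership = member.get("latestHouseMembership") or {}
    row = {
        "id": member_id,
        "membershipStartDate": membership.get("membershipStartDate"),
        "gender": member.get("gender"),
        "party": (member.get("latestParty") or {}).get("name"),
        "membershipFrom": membership.get("membershipFrom"),
    }
    # Add one column per web contact the member actually has.
    for entry in contact:
        if entry.get("type") in WEB_TYPES:
            row[entry.get("type")] = entry.get("line1")

    sleep(sleep_sec)  # pause so that repeated calls do not overload the API
    return pd.DataFrame([row])
```

Members without a given contact type simply lack that column, so the missing values appear only once the per-member frames are concatenated.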

Check if the function works

In [None]:
get_member_info('4212')

#### Loop over the first 20 members and concatenate. Check that everything works

In [None]:
list_df = [get_member_info(cid) for cid in df_mps[:20]['id']]
df_20_members = pd.concat(list_df, axis = 0)
df_20_members

Unnamed: 0,id,membershipStartDate,gender,party,membershipFrom,Website,Twitter,Facebook,Instagram
0,172,1987-06-11T00:00:00,F,Labour,Hackney North and Stoke Newington,http://www.dianeabbott.org.uk,https://twitter.com/HackneyAbbott,,
0,4212,2011-01-13T00:00:00,F,Labour,Oldham East and Saddleworth,http://www.debbieabrahams.org.uk/,https://twitter.com/Debbie_abrahams,,
0,4057,2010-05-06T00:00:00,M,Conservative,Selby and Ainsty,http://www.selbyandainsty.com/,,https://www.facebook.com/nigel.adamsmp,
0,4639,2017-06-08T00:00:00,M,Conservative,Hitchin and Harpenden,,https://twitter.com/BimAfolami,,
0,1586,2005-05-05T00:00:00,M,Conservative,Windsor,http://www.adamafriyie.org/,https://twitter.com/AdamAfriyie,,
0,4741,2019-12-12T00:00:00,F,Conservative,Cities of London and Westminster,https://www.nickieaiken.org.uk/,https://twitter.com/twocitiesnickie,https://www.facebook.com/twocitiesnickie,https://www.instagram.com/twocitiesnickie/
0,4069,2010-05-06T00:00:00,M,Conservative,Waveney,http://www.peteraldous.com/,https://twitter.com/peter_aldous,,
0,4138,2010-05-06T00:00:00,F,Labour,Bethnal Green and Bow,http://www.rushanaraali.org/,https://twitter.com/rushanaraali,,
0,4747,2019-12-12T00:00:00,M,Labour,"Birmingham, Hall Green",,,,
0,4411,2015-05-07T00:00:00,F,Conservative,Telford,http://www.lucyallan.com/,https://twitter.com/lucyallan,,
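
Every row in the output above carries index 0 because each one-row dataframe brings its own index into the concatenation. The small offline illustration below uses two hand-made frames, with values taken from rows shown above, to show how `ignore_index=True` renumbers the rows and how columns missing from one frame are filled with `NaN`. The names `df_a`, `df_b` and `df_both` are illustrative.

```python
import pandas as pd

# Two one-row frames with partly different columns, as the function
# would return for members with different contact types.
df_a = pd.DataFrame([{"id": 172, "party": "Labour",
                      "Twitter": "https://twitter.com/HackneyAbbott"}])
df_b = pd.DataFrame([{"id": 4057, "party": "Conservative",
                      "Facebook": "https://www.facebook.com/nigel.adamsmp"}])

# ignore_index=True replaces the repeated 0 labels with 0, 1, ...
df_both = pd.concat([df_a, df_b], axis=0, ignore_index=True)
print(df_both)
```

A unique index makes later row lookups with `.loc` unambiguous.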
