This is a notebook state saved on 16/2

This notebook contains several attempts at using an API (Geopy, Nominatim and HERE) for location data around Manchester.

<h1><center>Discovering Manchester</center></h1>

<h4><center> IBM Capstone Project - Exploring the best areas for a young professional to move to in Greater Manchester </center></h4>

## [Table of Contents:](#Table-of-Contents:)

* [Project Goals](#Project-Goals)
* [Libraries](#Libraries)
* [Data](#Data)
    * [Areas](#Areas)
        * [Metropolitan Districts](#Metropolitan-Districts:)
        * [Wards](#Wards:)
        * [Postcodes](#Postcodes:)
        * [Distance from Centre](#Distance-from-Centre)
    * [Desirability](#Desirability)
* [References](#References)

## Project Goals 

This project seeks to analyse areas of Greater Manchester and seeks to answer the question; where are be the best locations for a young professional to move to?

This notebook shows my thought process and approach to this project and the avenues I have explored in trying to reach my ultimate goal of area comparison in Manchester, as such some of these sections could be removed to add readability and make the notebook much more concise and clean.

## Libraries

The following libraries have been used in this project:

In [4]:
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
from urllib.request import urlopen
import re
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="aaa")
import folium
import requests

## Data

### Areas

The first task in this project is to determine how the areas of Greater Manchester will be divided.

Manchester is somewhat difficult to divide into clearly defined areas for historical reasons. Areas {EXPAND ON THIS}!!!!!!!!!!!!!!!!!!!!!!!  <sup>[1](#1.)</sup>

The following options were considered:

#### <b>Metropolitan Districts:</b><br>

Greater Manchester comprises of ten metropolitan areas:
1. City of Manchester
2. Stockport
3. Tameside
4. Oldham
5. Rochdale
6. Bury
7. Bolton
8. Wigan
9. City of Salford
10. Trafford

<figure>
<img src="images/Greater_Manchester_numbered_districts.svg.png" style="width:500px;height:300px;">
<figcaption>Image 2: Greater Manchester metropolitan districts. Source: Wikipedia.</figcaption>
</figure>
<br>

PLACE SOME OBSERVATIONS HERE<br>
<br>

#### <b>Wards:</b><br>

The City of Manchester countains contain 32 Wards<sup>[2](#2.)</sup>:

<figure>
<img src="Images/Ward-Map-Manchester_District_(B).jpg" style="width:500px;height:500px;">
<figcaption>Image 2: Greater Manchester metropolitan districts. Source: Wikipedia.</figcaption>
</figure>
<br>

PLACE SOME OBSERVATIONS HERE<br>
<br>


#### <b>Postcodes</b>:


Postcodes in the United Kingdom include a outward and an inward code. <br>
The outward code comprised of a postcode area, a one or two letter code, and a postcode district, one or two digits or a digit followed by a letter. For example, Machester city centre has the outward postal code M1.<br>
The inward code comprises of a single digit postcode sector followed by a two character postcode unit. A postal code may be a single street, or even a single building or organisation.<sup>[4](#4.)</sup>


<figure>
<img src="images/map-postcode-area-M-Manchester.jpg" style="width:500px;height:300px;">
<figcaption>Image 2: M-Postcode districts. Source: Geopunk</figcaption>
</figure>

While inner codes seem to be far too precise for this project's aims, outer codes with the M prefix seem to give good coverage of Greater Manchester while remaining precise enough to give locally defined areas. This is explored below, details on how this is acheived is noted in further detail the comments.

Unfortunately, Geopy was unable to find locations using only outer code of Postcodes. A CSV  for postcode outcode coordinates in the format of id, postcode, latitude, longitude was found from the Office of National Statistics (ONS) <sup>[5](#5.)</sup> on [free map tools](https://www.freemaptools.com/download-uk-postcode-lat-lng.htm) and used for geographical co-ordinates instead. Initially, Geopunk was scraped for postcode information, however once it was compared to this CSV it was found that many postcodes were missing from the Geopunk source. The csv file alone will be used for all postcode district geographical information as it is considerably easier to work with and any extra work to scrape websites or search for geographical co-ordinates with Geopy is superfluous.

The orignal code for scraping geopunks website and using Geopy for is left below for reference of the first method acheived for geographical location. It has been changed to a Markdown cell to save computational time and can be viewed in the dropdown below.

<details>
<summary>Old code</summary>

```python

# Scrape geopunk for M-Postcode list

# open site to bs4 object
with urlopen("https://geopunk.co.uk/postcode-areas/M") as fp:
    soup = BeautifulSoup(fp)

# Extract list as result set.
# Note the list of postcodes are in links on the website so first all links are extracted.
res_set = soup.find('a').find_all('a')

# Next the text form the links is converted to a list; a pattern match function has been added to the list comprehension so only 'Mx' 
# Postcodes have been extracted.

#Search for postcodes beginning with M then a digit
pattern1 = re.compile("M\d")

#Search for postcodes with a space to make distinction between full and partial post codes
pattern2 = re.compile(" ")

# Create DataFrame (list name is old syntax)
list_PC = pd.DataFrame([link.text for link in res_set if (pattern1.match(link.text) is not None) and (pattern2.search(link.text) is None)],
                      columns = ['Postcode_District'])

# Obtain geographical information using geopy (method1)
MCR_geo1 = [geolocator.geocode({'postalcode':PC, 'city':'Manchester','country':'United Kingdom'}) for PC in list_PC.Postcode_District]

# Obtain geographical information using geopy (method2)
MCR_geo2 = [geolocator.geocode('{}, Manchester, United Kingdom'.format(PC)) for PC in list_PC.Postcode_District]

```

Neither of these methods provided sufficient results for the geographical locations of the postcodes. Further inspection of Nominatim.openstreetmap showed very little mention of postcodes in the Manchester area so this method of obtaining geographical coordinates was abandoned.
    
</details>

Geographical coordinates for the M-Postcode Districts extracted from the postcode outcodes ONS csv file and loaded into a DataFrame.

In [175]:
# Get DataFrame of all UK Postcodes. Note there is some kwargs passed for formatting purposes and ease of use.
UK_PC = pd.read_csv('https://www.freemaptools.com/download/outcode-postcodes/postcode-outcodes.csv', usecols=[1,2,3], 
                    header = 0, names = ['Postcode_District', 'Latitude','Longitude'])

# Extract M- only postcodes
MP_ONS = UK_PC.loc[UK_PC.Postcode_District.str.match('M\d')].reset_index(drop=True)

# Drop Non geographic postcode M61 - See link 5
MP_ONS = MP_ONS[MP_ONS.Postcode_District != 'M61']

This list of postcodes is compared to Wikipedia's [M postcode area](https://en.wikipedia.org/wiki/M_postcode_area)<sup>[6](#6.)</sup> page to ensure no postcodes are missing.

Extract postcodes from Wikipedia:

In [3]:
with urlopen("https://en.wikipedia.org/wiki/M_postcode_area") as fp:
    soup = BeautifulSoup(fp)

In [4]:
res_set = soup.find_all('table')
res_set

# Define function for converting html to list
def cleanhtml(raw_html):
    cleaner = re.compile('<.*?>')
    text = re.sub(cleaner, '', str(raw_html)).splitlines()
    return list(filter(None, text))

# further cleaning of list and conversion to Pandas df
table = [cleanhtml(row) for row in res_set][1]
table = np.array(table).reshape(len(table)//4,4)
MP_w_f = pd.DataFrame(table, columns = table[0])
MP_w_f = MP_w_f.drop(MP_w_f.index[0])

# Drop Non geographic postcode M61 - See link 5
MP_w_f = MP_w_f[MP_w_f['Postcode district'] != 'M61']

# We can also see in the dataframe that M60 and M99 are non-residential so are also dropped
MP_w_f = MP_w_f[(MP_w_f['Postcode district'].str.match('M60') == False)]
MP_w_f = MP_w_f[MP_w_f['Postcode district'] != 'M99']

# Merge M3 Postcode
# Remove post code sector information
cleaner2 = re.compile('\((.*?)\)')
MP_w_f['Postcode district'] = [re.sub(cleaner2, '', i) for i in MP_w_f['Postcode district']]

# Merge columns
MP_W = MP_w_f.groupby(by = MP_w_f['Postcode district']).aggregate(lambda x: ', '.join(x)).reset_index()

As can be seen in the comments above some cleaning of the wikipedia table was required to produce a final dataframe of M-Postcodes (MP_W), including the removal of non-residential postcodes. Checking the ONS csv dataframe we see that the M60 and M99 postcodes also exist here. As such, they are also removed from this dataframe.

In [5]:
MP_ONS = MP_ONS[MP_ONS.Postcode_District != 'M60']
MP_ONS = MP_ONS[MP_ONS.Postcode_District != 'M99']

Finally, we check Wikipedia and ONS csv match, first by ensuring they have the same length and then that the sort postcode values match for all indices: 

In [6]:
len(MP_W) == len(MP_ONS)

True

In [9]:
MP_match = [(i, MP_W["Postcode district"].sort_values().iloc[i] == MP_ONS["Postcode_District"].sort_values().iloc[i]) 
            for i in range(len(MP_W))]

Exploring the values for MPC match we can see the 43 Postcodes match.

##### <b>Distance from Centre</b>

The final option for determing the areas used in analysis is searching for areas with type boundary:political on OpenStreetMap using teh Overpass api Overpy.

In [22]:
import overpy

api = overpy.Overpass()

result = api.query("""
area[name="Manchester, UK"];
out;
""")

for area in result.areas:
    print(
        "Name: %s (%i)" % (
            area.tags.get("name", "n/a"),
            area.id
        )
    )
    for n, v in area.tags.items():
        print("  Tag: %s = %s" % (n, v))

HERE API using url; Includes api key (NB: Key expires every 60 minutes)

In [83]:
# Get 

URL = 'https://geocode.search.hereapi.com/v1/geocode'
location = 'Manchester' #taking user input
api_key = 'vjCV6buP-fTPKKILqN6sU03PB6J6KQOp8YbbVVnB0yA' 
PARAMS = {'apikey':api_key,'q':location} 

# sending get request and saving the response as response object 
r = requests.get(url = URL, params = PARAMS) 
data = r.json()

latitude = data['items'][0]['position']['lat']
longitude = data['items'][0]['position']['lng']

HERE API geopy gecoder object initialisation

In [180]:
from geopy.geocoders import Here
geolocator = Here(apikey=api_key)


In [176]:
MP_ONS

Unnamed: 0,Postcode_District,Latitude,Longitude
0,M1,53.47734,-2.23508
1,M11,53.47834,-2.17933
2,M12,53.46482,-2.20187
3,M13,53.4603,-2.21389
4,M14,53.4477,-2.22437
5,M15,53.46563,-2.25008
6,M16,53.45481,-2.26357
7,M17,53.46906,-2.31789
8,M18,53.46127,-2.16871
9,M19,53.43696,-2.19421


Had some luck with finding postcodes with this API - this should be investigated further

In [183]:
###################################33 TRY TO RUN THROUGH THIS ###################
for PC in MP_ONS.Postcode_District:
    print(PC, geolocator.geocode({'postalcode':PC, 'city':'Manchester'}))
# test = [geolocator.geocode({'postalcode':PC}) for PC in MP_ONS.Postcode_District]

M1 M1 2, Manchester, England, United Kingdom, Manchester, England M1 2, GBR
M11 M1 1, Manchester, England, United Kingdom, Manchester, England M1 1, GBR
M12 M1 2, Manchester, England, United Kingdom, Manchester, England M1 2, GBR
M13 M1 3, Manchester, England, United Kingdom, Manchester, England M1 3, GBR
M14 M1 4, Manchester, England, United Kingdom, Manchester, England M1 4, GBR
M15 M1 5, Manchester, England, United Kingdom, Manchester, England M1 5, GBR
M16 None
M17 M1 7, Manchester, England, United Kingdom, Manchester, England M1 7, GBR
M18 None
M19 None
M2 M2 3, Manchester, England, United Kingdom, Manchester, England M2 3, GBR
M20 None
M21 None
M22 None
M23 None
M24 M2 4, Manchester, England, United Kingdom, Manchester, England M2 4, GBR
M25 M2 5, Manchester, England, United Kingdom, Manchester, England M2 5, GBR
M26 M2 6, Manchester, England, United Kingdom, Manchester, England M2 6, GBR
M27 M2 7, Manchester, England, United Kingdom, Manchester, England M2 7, GBR
M28 None
M29 No

This API is also useful for named searches

In [7]:
geolocator.geocode('Manchester')

Location(Manchester, Greater Manchester, North West England, England, United Kingdom, (53.4794892, -2.2451148, 0.0))

Unfortunately only one area could be found using HERE reverse method

In [110]:
geolocator.reverse(query = (53.44251, -2.27656), exactly_one=False, radius =1000.00, mode='retrieveAreas', maxresults=5) 

[Location(Manchester, England, United Kingdom, Manchester, England M3 3, GBR, (53.44251, -2.27656, 0.0))]

Nominatim API was initialised and explored

In [112]:
from geopy.geocoders import Nominatim
geolocatorN = Nominatim(user_agent="aaa")


Again only one area could be found with reverse method

In [123]:
geolocatorN.reverse(query = (53.44, -2.27), exactly_one=False, zoom=12)

[Location(Wythenshawe, Manchester, Greater Manchester, North West England, England, M22 5RE, United Kingdom, (53.3803596, -2.2632079, 0.0))]

API through URL like above should be set up: Lookup method and details should be checked

In [170]:
URL = 'https://autocomplete.search.hereapi.com/v1/browse'
location = 'Manchester' 
PARAMS = {'q':location,'at':'53.4794892,-2.2451148','apikey':api_key,'in':'circle:53.4794892,-2.2451148;r=1000000','types':'aa'}

r = requests.get(url = URL, params = PARAMS) 
data = r.json()

In [186]:
################## THIS IS GOOOOOOD

URL = 'https://browse.search.hereapi.com/v1/browse'
api_key = 'foERkdIHR5_9FKln3qpHQNWvW2auvWr63bkdfDo3DPM' 
PARAMS = {'at':'53.4794892,-2.2451148','apikey':api_key,'in':'circle:53.4794892,-2.2451148;r=50000','types':'district', 'limit':'100'} 

r = requests.get(url = URL, params = PARAMS) 
data = r.json()

In [191]:
for sall in data['items']:
     print(sall['title'])

Manchester Central, Manchester, England, United Kingdom
Shopping District, Salford, England, United Kingdom
China Town, Manchester, England, United Kingdom
Spinningfields, Salford, England, United Kingdom
The Gay Village, Manchester, England, United Kingdom
Deansgate Locks, Manchester, England, United Kingdom
Castlefield, Salford, England, United Kingdom
Northern Quarter, Manchester, England, United Kingdom
Blackfriars, Salford, England, United Kingdom
Victoria, Manchester, England, United Kingdom
Ancoats, Manchester, England, United Kingdom
Piccadilly, Manchester, England, United Kingdom
Greenquarter, Manchester, England, United Kingdom
Hulme, Manchester, England, United Kingdom
New Islington, Manchester, England, United Kingdom
University District, Manchester, England, United Kingdom
Manchester Science Park, Manchester, England, United Kingdom
Ardwick, Manchester, England, United Kingdom
Cheetham Hill, Salford, England, United Kingdom
Ordsall, Salford, England, United Kingdom
Collyhu

In [5]:
URL = 'https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W104393803,N240109189&format=json'

with urlopen(URL) as fp:
    json = pd.read_json(fp)

In [6]:
json

Unnamed: 0,place_id,licence,osm_type,osm_id,boundingbox,lat,lon,display_name,class,type,importance,address
0,256898587,"Data © OpenStreetMap contributors, ODbL 1.0. h...",relation,146656,"[53.3401044, 53.5445923, -2.3199185, -2.1468288]",53.479489,-2.245115,"Manchester, Greater Manchester, North West Eng...",boundary,administrative,0.681719,"{'city': 'Manchester', 'county': 'Greater Manc..."
1,566458,"Data © OpenStreetMap contributors, ODbL 1.0. h...",node,240109189,"[52.3570365, 52.6770365, 13.2288599, 13.5488599]",52.517037,13.38886,"Berlin, 10117, Deutschland",place,city,0.787539,"{'city': 'Berlin', 'state': 'Berlin', 'postcod..."


### Desirability

The next task will be to determine what features will be used to determine the desirabilty of the given areas. Some examples of this may be:
* Public Transport
* Pubs and Restaraunts
* Green Spaces
* Average age of community (i.e. do similar people live in the area)
* Landmarks and points of interest (e.g Old Trafford, City of Manchester Stadium)
* Museums and Libraries
* Gyms and Leisure
* Access to Healthcare

## References

##### 1. 
https://ilovemanchester.com/manchester-and-salford-so-whats-the-difference
##### 2. 
https://geopunk.co.uk/council/Manchester-District-(B)
##### 3. 
https://www.geopunk.co.uk/postcode-areas/M  #### No longer in use
##### 4. 
https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Outward_code
##### 5. 
postcode-outcodes.csv, Office for National Statistics licensed under the Open Government Licence v.3.0, https://www.freemaptools.com/download-uk-postcode-lat-lng.htm
##### 6. 
https://en.wikipedia.org/wiki/M_postcode_area


https://en.wikipedia.org/wiki/Greater_Manchester <br>

https://en.wikipedia.org/wiki/Transport_in_Manchester <br>

https://www.cityoftrees.org.uk/explore <br>

[Top](#Discovering-Manchester)