# London Neighborhood mining

This notebook combines data sources (Wikipedia and the Foursquare API) to assemble a list of London neighborhoods and map them to their coordinates for further processing.

In [2]:
import pandas as pd
import re

from geocoder import enrich_neighborhoods_with_geocoder, map_neighborhoods

### Scrape Wikipedia to compile the London neighborhood list

In [3]:
tables = pd.read_html('https://en.wikipedia.org/wiki/List_of_London_boroughs')
tables[0].head()

Unnamed: 0,Borough,Inner,Status,Local authority,Political control,Headquarters,Area (sq mi),Population (2013 est)[1],Co-ordinates,Nr. in map
0,Barking and Dagenham [note 1],,,Barking and Dagenham London Borough Council,Labour,"Town Hall, 1 Town Square",13.93,194352,51°33′39″N 0°09′21″E / 51.5607°N 0.1557°E,25
1,Barnet,,,Barnet London Borough Council,Conservative,"North London Business Park, Oakleigh Road South",33.49,369088,51°37′31″N 0°09′06″W / 51.6252°N 0.1517°W,31
2,Bexley,,,Bexley London Borough Council,Conservative,"Civic Offices, 2 Watling Street",23.38,236687,51°27′18″N 0°09′02″E / 51.4549°N 0.1505°E,23
3,Brent,,,Brent London Borough Council,Labour,"Brent Civic Centre, Engineers Way",16.7,317264,51°33′32″N 0°16′54″W / 51.5588°N 0.2817°W,12
4,Bromley,,,Bromley London Borough Council,Conservative,"Civic Centre, Stockwell Close",57.97,317899,51°24′14″N 0°01′11″E / 51.4039°N 0.0198°E,20
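
The `Co-ordinates` column already carries decimal degrees after the slash, so it can serve as a sanity check on geocoded values. The following is a minimal sketch, assuming the cell format shown in the output above. `parse_decimal_coords` is a hypothetical helper and is not part of this notebook's `geocoder` module.

```python
import re

# Matches the decimal part of a cell such as
# "51°33′39″N 0°09′21″E / 51.5607°N 0.1557°E".
_DECIMAL_COORDS = re.compile(r"(\d+(?:\.\d+)?)°([NS])\s+(\d+(?:\.\d+)?)°([EW])")


def parse_decimal_coords(cell):
    """Return (latitude, longitude) as signed floats, or None if absent."""
    match = _DECIMAL_COORDS.search(cell)
    if match is None:
        return None
    lat, ns, lon, ew = match.groups()
    # South latitudes and west longitudes are negative by convention.
    latitude = float(lat) if ns == "N" else -float(lat)
    longitude = float(lon) if ew == "E" else -float(lon)
    return latitude, longitude


print(parse_decimal_coords("51°37′31″N 0°09′06″W / 51.6252°N 0.1517°W"))
```

Western longitudes come back negative, which matches the sign convention of the `Longitude` values used later in this notebook.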


In [6]:
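# Copy the column so later assignments do not act on a slice of tables[0]
df = tables[0][['Borough']].copy()
# Remove footnote markers such as "[note 1]" and any whitespace left behind
df['Borough'] = df['Borough'].apply(lambda x: re.sub(r'\s*\[.*?\]', '', x).strip())
df.columns = ['Neighborhood']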
df.head()

Unnamed: 0,Neighborhood
0,Barking and Dagenham
1,Barnet
2,Bexley
3,Brent
4,Bromley
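
Footnote markers such as `[note 1]` can also be removed with a small standalone helper. The sketch below is one possible approach, with `strip_footnotes` as a hypothetical name. It uses a bracket-bounded pattern so that two markers in one name are removed separately rather than as a single greedy span, and it trims the whitespace left behind.

```python
import re

# One bracketed marker, plus any whitespace directly in front of it.
_FOOTNOTE = re.compile(r"\s*\[[^\]]*\]")


def strip_footnotes(name):
    """Remove bracketed footnote markers and surrounding whitespace."""
    return _FOOTNOTE.sub("", name).strip()


print(strip_footnotes("Barking and Dagenham [note 1]"))
```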


#### Check for duplicates

In [7]:
df['Neighborhood'].value_counts(dropna=False)

Wandsworth                 1
Haringey                   1
Barking and Dagenham       1
Hillingdon                 1
Camden                     1
Richmond upon Thames       1
Tower Hamlets              1
Westminster                1
Sutton                     1
Kingston upon Thames       1
Bexley                     1
Brent                      1
Harrow                     1
Enfield                    1
Lambeth                    1
Croydon                    1
Hackney                    1
Kensington and Chelsea     1
Newham                     1
Hounslow                   1
Lewisham                   1
Havering                   1
Greenwich                  1
Islington                  1
Barnet                     1
Southwark                  1
Merton                     1
Redbridge                  1
Hammersmith and Fulham     1
Waltham Forest             1
Bromley                    1
Ealing                     1
Name: Neighborhood, dtype: int64
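
Every borough appears exactly once, so nothing needs to be dropped here. If a source list did contain repeats, they could be inspected and removed along the following lines. This is a sketch on a made-up sample frame, not on the scraped data.

```python
import pandas as pd

# Made-up sample with one repeated name, for illustration only.
sample = pd.DataFrame({"Neighborhood": ["Barnet", "Bexley", "Barnet"]})

# Rows whose name already appeared earlier in the frame.
repeats = sample[sample["Neighborhood"].duplicated()]

# Keep the first occurrence of each name and renumber the index.
deduped = sample.drop_duplicates(subset="Neighborhood").reset_index(drop=True)

print(len(repeats), list(deduped["Neighborhood"]))
```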

### Combine Wikipedia data with geocoder data


In [8]:
enrich_neighborhoods_with_geocoder(df, "London, England")
df.head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[key] = _infer_fill_value(value)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Barking and Dagenham,51.554117,0.150504
1,Barnet,51.648784,-0.172913
2,Bexley,51.441679,0.150488
3,Brent,51.584778,-0.29918
4,Bromley,51.402805,0.014814
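
`enrich_neighborhoods_with_geocoder` comes from the local `geocoder` module, whose source is not shown in this notebook. As a rough illustration of what such a step does (not the author's implementation), the sketch below accepts any lookup function that maps a query string to `(latitude, longitude)` or `None`. The names `add_coordinates` and `fake_lookup` are assumptions made for this example, and the canned lookup stands in for a real geocoding service.

```python
import pandas as pd


def add_coordinates(df, city, lookup):
    """Return a copy of df with Latitude and Longitude columns added."""
    result = df.copy()  # work on a copy so the caller's frame is untouched
    latitudes, longitudes = [], []
    for name in result["Neighborhood"]:
        coords = lookup(f"{name}, {city}")
        # Unresolved names get NaN instead of raising an error.
        lat, lon = coords if coords is not None else (float("nan"), float("nan"))
        latitudes.append(lat)
        longitudes.append(lon)
    result["Latitude"] = latitudes
    result["Longitude"] = longitudes
    return result


def fake_lookup(query):
    """Hypothetical stand-in for a geocoding call; returns canned values."""
    canned = {"Barnet, London, England": (51.648784, -0.172913)}
    return canned.get(query)


boroughs = pd.DataFrame({"Neighborhood": ["Barnet", "Atlantis"]})
enriched = add_coordinates(boroughs, "London, England", fake_lookup)
print(enriched)
```

Returning a new frame rather than assigning into the argument is one way to avoid pandas' chained-assignment warnings when the input is a slice of another frame.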


### Save neighborhood coordinates dataset

In [10]:
df.to_csv('data/london_neighborhood_coords.csv')
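
By default `to_csv` writes the row index as an unnamed first column, which appears as `Unnamed: 0` when the file is read back without further options. The sketch below shows the round trip on a one-row sample, using an in-memory buffer in place of the real file path.

```python
import io

import pandas as pd

coords = pd.DataFrame(
    {"Neighborhood": ["Barnet"], "Latitude": [51.648784], "Longitude": [-0.172913]}
)

# Write to an in-memory buffer the same way the cell above writes to disk.
buffer = io.StringIO()
coords.to_csv(buffer)

# Reading back without options keeps the index as an extra column.
buffer.seek(0)
naive = pd.read_csv(buffer)

# index_col=0 restores the first column as the index instead.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)

print(list(naive.columns))
print(list(restored.columns))
```

Passing `index=False` to `to_csv` is an alternative when the index carries no information, provided any downstream reader does not expect that column.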

### Visualize London neighborhoods
<a id="vis-neighborhoods"></a>

In [11]:
m = map_neighborhoods(df, "London, England")
m