<h1 align="center">The Battle of Neighborhoods (Week 1)</h1> 

<h3 align="center">Alan Sun</h3> 

## In which London borough should Starbucks open a new coffee shop?

### Introduction/Business problem

Starbucks is a staple of London's high streets, with shops all over the city. Like every chain, it is always looking for new locations in which to open shops. This notebook provides analysis and advice to Starbucks stakeholders on where in London they should open their newest coffee shop. (Please don't sue me for defamation: all companies mentioned in this notebook are fictitious, and any likeness to existing companies is purely coincidental.)

Greater London is naturally divided into 33 principal divisions: the 32 London boroughs and the City of London. We will use these divisions to decide which borough would be the most profitable location for Starbucks' newest branch. The decision will be based on Foursquare location data for each borough, specifically the venue types that feature most frequently there.

To give a crude example, suppose that people who visit coffee shops also tend to eat at Italian restaurants. You would then expect a new coffee shop in an area with a high density of Italian restaurants to be more profitable than one in an area with very few.

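The crude example above can be made concrete with a small sketch. The area names and venue counts below are invented purely for illustration (they are not Foursquare data), and a high correlation would only suggest that the two venue types tend to appear together, not that a new shop would be profitable.

```python
import pandas as pd

# Hypothetical venue counts per area; these numbers are made up for illustration.
counts = pd.DataFrame(
    {
        "Italian Restaurant": [12, 3, 8, 1, 10],
        "Coffee Shop": [20, 6, 15, 4, 18],
    },
    index=["Area A", "Area B", "Area C", "Area D", "Area E"],
)

# Pearson correlation between the two venue types across the areas.
correlation = counts["Italian Restaurant"].corr(counts["Coffee Shop"])
print(f"Correlation between venue counts: {correlation:.2f}")
```

A value close to 1 would mean that areas with many Italian restaurants also tend to have many coffee shops, which is the kind of pattern the analysis looks for.
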
### Data source and uses

In [1]:
from bs4 import BeautifulSoup
import pandas as pd
import requests # library to handle requests
print('Libraries imported.')

Libraries imported.


We will scrape data from the following Wikipedia page to find a list of all the London boroughs, along with their latitude and longitude.

In [2]:
boroughs_url = "https://en.wikipedia.org/wiki/List_of_London_boroughs"
r = requests.get(boroughs_url)
r.status_code

200

In [3]:
soup = BeautifulSoup(r.text, 'html.parser')

In [4]:
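table = soup.find('table')  # the first table on the page lists the boroughs
table_contents = []
for row in table.find_all('tr')[1:]:  # skip the header row
    cells = row.find_all('td')
    cell = {}
    # Drop footnote markers such as ' [note 1]' and anything after a newline
    cell['Borough'] = cells[0].text.split(' [')[0].split('\n')[0]
    # The coordinates cell holds several formats separated by '/'; take the second one
    coords = cells[8].text.split('/')[1]
    # Remove the invisible U+FEFF character and whitespace that precede the latitude
    cell['Latitude'] = coords.split('°N ')[0].replace('\ufeff', '').strip()
    # Drop the trailing '°E' or '°W' and the characters that follow it
    cell['Longitude'] = coords.split('°N ')[1][:-4]
    if 'W' in coords:  # western longitudes are negative
        cell['Longitude'] = '-' + cell['Longitude']
    table_contents.append(cell)

# Add the City of London by hand, since the loop above only covers the 32 boroughs
table_contents.append({'Borough': 'City of London', 'Latitude': '51.5155', 'Longitude': '-0.0922'})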
# print(table_contents)
df = pd.DataFrame(table_contents)
print(df.shape)
df

(33, 3)


| | Borough | Latitude | Longitude |
|---|---|---|---|
| 0 | Barking and Dagenham | 51.5607 | 0.1557 |
| 1 | Barnet | 51.6252 | -0.1517 |
| 2 | Bexley | 51.4549 | 0.1505 |
| 3 | Brent | 51.5588 | -0.2817 |
| 4 | Bromley | 51.4039 | 0.0198 |
| 5 | Camden | 51.5290 | -0.1255 |
| 6 | Croydon | 51.3714 | -0.0977 |
| 7 | Ealing | 51.5130 | -0.3089 |
| 8 | Enfield | 51.6538 | -0.0799 |
| 9 | Greenwich | 51.4892 | 0.0648 |

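The scraping cell stores latitude and longitude as strings and relies on fixed slice positions. As a hedged alternative, the sketch below parses a coordinate string with a regular expression and returns signed floats. The helper name `parse_coords` and the assumed input format (for example `51.6252°N 0.1517°W`, possibly with invisible U+FEFF characters) are illustrative assumptions, not part of the original notebook.

```python
import re

# Matches strings such as "51.5607°N 0.1557°E"; this format is an assumption
# based on the decimal coordinates shown in the table above.
COORD_RE = re.compile(r"(\d+(?:\.\d+)?)°([NS])\s+(\d+(?:\.\d+)?)°([EW])")


def parse_coords(text):
    """Return (latitude, longitude) as signed floats, or None if nothing matches."""
    cleaned = text.replace("\ufeff", "")  # drop invisible U+FEFF characters
    match = COORD_RE.search(cleaned)
    if match is None:
        return None
    lat, ns, lon, ew = match.groups()
    latitude = float(lat) * (-1 if ns == "S" else 1)
    longitude = float(lon) * (-1 if ew == "W" else 1)
    return latitude, longitude


print(parse_coords("\ufeff51.6252°N 0.1517°W"))  # (51.6252, -0.1517)
```

Returning floats rather than strings means the values can be passed straight to mapping or distance calculations without further conversion.
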

In addition to the above data, we will use Foursquare location data to gain insight into the venue types in each borough.
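
As a rough sketch of what "insight into the venue types" could look like, the snippet below computes the share of each venue category within each borough and picks the most frequent one. The rows are hypothetical placeholders standing in for Foursquare results, and the column names `Borough` and `Venue Category` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical venue rows; real rows would come from Foursquare.
venues = pd.DataFrame(
    {
        "Borough": ["Camden", "Camden", "Camden", "Barnet", "Barnet", "Barnet"],
        "Venue Category": ["Coffee Shop", "Pub", "Coffee Shop", "Park", "Park", "Pub"],
    }
)

# Share of each category within each borough (each row sums to 1).
category_share = pd.crosstab(
    venues["Borough"], venues["Venue Category"], normalize="index"
)

# Most frequent venue category per borough.
top_category = category_share.idxmax(axis=1)

print(category_share)
print(top_category)
```

Normalising by row makes boroughs comparable even when Foursquare returns different numbers of venues for each of them.
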