## Final Project 

#### Find a location to open a Mcdonald's restaurant in Liverpool, UK. 

1. Build a dataframe of neighborhoods in Liverpool, UK by web scraping the information from a Wikipedia page 
2. Assign the geodata of the neighborhoods 
3. Sort the venues data for each neighborhood with Foursquare API
4. Cluster the neighborhoods

### Import the libs

In [5]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes  
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    branca-0.4.1               |             py_0          26 KB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    altair-4.1.0               |             py_1         614 KB  conda-forge
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    openssl-1.1.1g             |       h516909a_1         2.1 MB  conda-forge
    certifi-2020.6.20          |   py36h9f0ad1d_0         151 KB  conda-forge
    ca-certificates-2020.6.20  |       hecda079_0         145 KB  conda-forge
    folium-0.5.0               |             py_0          45 KB  conda-forge
    ------------------------------------------------------------
                       

In [6]:
! pip install geocoder

Collecting geocoder
[?25l  Downloading https://files.pythonhosted.org/packages/4f/6b/13166c909ad2f2d76b929a4227c952630ebaf0d729f6317eb09cbceccbab/geocoder-1.38.1-py2.py3-none-any.whl (98kB)
[K     |████████████████████████████████| 102kB 6.8MB/s ta 0:00:011
Collecting ratelim (from geocoder)
  Downloading https://files.pythonhosted.org/packages/f2/98/7e6d147fd16a10a5f821db6e25f192265d6ecca3d82957a4fdd592cad49c/ratelim-0.1.6-py2.py3-none-any.whl
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


In [7]:
from bs4 import BeautifulSoup 
import geocoder

In [8]:
from sklearn.datasets import make_blobs

! pip install yellowbrick
from yellowbrick.cluster import KElbowVisualizer

Collecting yellowbrick
[?25l  Downloading https://files.pythonhosted.org/packages/13/95/a14e4fdfb8b1c8753bbe74a626e910a98219ef9c87c6763585bbd30d84cf/yellowbrick-1.1-py3-none-any.whl (263kB)
[K     |████████████████████████████████| 266kB 13.2MB/s eta 0:00:01
Installing collected packages: yellowbrick
Successfully installed yellowbrick-1.1


### Import data

In [9]:
#request to fetch data
neighborhooddata=requests.get("https://en.wikipedia.org/wiki/Liverpool").text

In [10]:
soup=BeautifulSoup(neighborhooddata,'html.parser')

In [11]:
neighborhoodlist=[]#a list to store data

In [12]:

for row in soup.find_all("ol")[0].findAll("li"):
    neighborhoodlist.append(row.text)
for row in soup.find_all("ol",start="16")[0].findAll("li"):
    neighborhoodlist.append(row.text)

In [14]:
dfnbh=pd.DataFrame({"Neighborhood":neighborhoodlist})
dfnbh

Unnamed: 0,Neighborhood
0,Allerton and Hunts Cross
1,Anfield
2,Belle Vale
3,Central
4,Childwall
5,Church
6,Clubmoor
7,County
8,Cressington
9,Croxteth


In [15]:
dfnbh.shape

(30, 1)

### The Geo Data

In [16]:
def get_coords(neighborhood):
    latlngcoords=None
    while (latlngcoords is None):
        g=geocoder.arcgis('{},Liverpool'.format(neighborhood))
        latlngcoords=g.latlng
    return latlngcoords

In [17]:
coordinates=[get_coords(neighborhood) for neighborhood in dfnbh["Neighborhood"].tolist()]

In [18]:
coordinates

[[53.35987000000006, -2.856179999999938],
 [53.430540000000065, -2.947469999999953],
 [53.39044000000007, -2.8528799999999706],
 [28.652250000000038, 77.18306000000007],
 [53.39581000000004, -2.8892499999999472],
 [43.10388000000006, -76.20652999999999],
 [53.43463000000003, -2.9336399999999685],
 [53.44300000000004, -2.9707199999999716],
 [53.35883000000007, -2.911929999999927],
 [53.461820000000046, -2.895369999999957],
 [53.429490000000044, -2.967389999999966],
 [53.469100000000026, -2.915269999999964],
 [53.473088474057676, -3.020253150581013],
 [53.41201950839983, -2.9503748296923713],
 [53.43436000000003, -2.9855499999999324],
 [53.41772000000003, -2.8893999999999664],
 [53.380220000000065, -2.913479999999936],
 [53.442090000000064, -2.9188599999999383],
 [53.41349000000008, -2.9127399999999284],
 [53.39979000000005, -2.925799999999981],
 [53.39268000000004, -2.9545499999999265],
 [53.440590026011435, -2.885208107796332],
 [53.351153505786804, -2.8851477389598434],
 [53.363330000

In [19]:
dfcoords=pd.DataFrame(coordinates,columns=['Latitude','Longitude'])

In [20]:
dfnbh['Latitude']=dfcoords['Latitude']
dfnbh['Longitude']=dfcoords['Longitude']

In [50]:
dfnbh

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Allerton and Hunts Cross,53.35987,-2.85618
1,Anfield,53.43054,-2.94747
2,Belle Vale,53.39044,-2.85288
3,Central,28.65225,77.18306
4,Childwall,53.39581,-2.88925
5,Church,43.10388,-76.20653
6,Clubmoor,53.43463,-2.93364
7,County,53.443,-2.97072
8,Cressington,53.35883,-2.91193
9,Croxteth,53.46182,-2.89537


In [21]:
dfnbh.to_csv("neighborhood.csv", index=False)

### Mapping 

In [22]:
address='Liverpool,United Kingdom'
geolocator=Nominatim(user_agent="my-application")
location=geolocator.geocode(address)
latitude=location.latitude
longitude=location.longitude

print("Coordinates of Liverpool, UK {},{}.".format(latitude, longitude))

Coordinates of Liverpool, UK 53.407154,-2.991665.


In [23]:
liverpoolmap=folium.Map(location=[latitude,longitude],zoom_start=12)

for lat, lng, neighborhood in zip(dfnbh['Latitude'],dfnbh['Longitude'],dfnbh['Neighborhood']):
        label='{}'.format(neighborhood)
        label=folium.Popup(label,parse_html=True)
        folium.CircleMarker(
            [lat,lng],
            radius=3,
            popup=label,
            color='blue',
            fill=True,
            fill_color="#F2E109",
            fill_opacity=0).add_to(liverpoolmap)
liverpoolmap


In [24]:
liverpoolmap.save('liverpoolmap.html')

### Get the venues by Foursquare

In [25]:
CLIENT_ID = '4ZECKELTFNPQBBUIXEZU1WWE0QTI2JP5TB0R4DYPWGOA2QY3'
CLIENT_SECRET = 'LDDACG44NEDPHPWFIJOPPRSXLUDNSLK4VAQ0AQQEMFJSIVP5' 
VERSION = '20180605' 

In [26]:
radius=3000
limit=100


venues=[]
for lat, long, neighborhood in zip(dfnbh['Latitude'],dfnbh['Longitude'],dfnbh['Neighborhood']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        limit)

    venueresults=requests.get(url).json()["response"]['groups'][0]['items']
    for venue in venueresults:
        venues.append((
            neighborhood,
            lat,
            long,
            venue['venue']['name'],
            venue['venue']['location']['lat'],
            venue['venue']['location']['lng'],
            venue['venue']['categories'][0]['name']))


In [27]:
dfvenues=pd.DataFrame(venues)
dfvenues.columns=['Neighborhood','Latitude','Longitude','Venuename','Venuelat','Venuelong','Category']

dfvenues

Unnamed: 0,Neighborhood,Latitude,Longitude,Venuename,Venuelat,Venuelong,Category
0,Allerton and Hunts Cross,53.35987,-2.85618,The Elephant,53.375512,-2.866899,Pub
1,Allerton and Hunts Cross,53.35987,-2.85618,Childhood Home of John Lennon,53.377164,-2.881661,Historic Site
2,Allerton and Hunts Cross,53.35987,-2.85618,Childhood Home of Paul McCartney,53.369586,-2.897883,Historic Site
3,Allerton and Hunts Cross,53.35987,-2.85618,Woolton Picture House,53.375562,-2.867691,Movie Theater
4,Allerton and Hunts Cross,53.35987,-2.85618,Dobbies Garden Centre Liverpool,53.348783,-2.864384,Garden Center
5,Allerton and Hunts Cross,53.35987,-2.85618,"Speke Hall, Garden and Estate",53.336292,-2.870727,History Museum
6,Allerton and Hunts Cross,53.35987,-2.85618,Strawberry Field,53.380427,-2.883861,Historic Site
7,Allerton and Hunts Cross,53.35987,-2.85618,Starbucks,53.34866,-2.862175,Coffee Shop
8,Allerton and Hunts Cross,53.35987,-2.85618,M&S Simply Food,53.35168,-2.880489,Grocery Store
9,Allerton and Hunts Cross,53.35987,-2.85618,Crowne Plaza,53.347567,-2.880677,Hotel


In [28]:
dfvenues.groupby(['Neighborhood']).count()

Unnamed: 0_level_0,Latitude,Longitude,Venuename,Venuelat,Venuelong,Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Allerton and Hunts Cross,64,64,64,64,64,64
Anfield,56,56,56,56,56,56
Belle Vale,38,38,38,38,38,38
Central,58,58,58,58,58,58
Childwall,96,96,96,96,96,96
Church,48,48,48,48,48,48
Clubmoor,86,86,86,86,86,86
County,76,76,76,76,76,76
Cressington,40,40,40,40,40,40
Croxteth,23,23,23,23,23,23


In [29]:
dfvenues['Category'].unique()

array(['Pub', 'Historic Site', 'Movie Theater', 'Garden Center',
       'History Museum', 'Coffee Shop', 'Grocery Store', 'Hotel',
       'Discount Store', 'Asian Restaurant', 'Gym / Fitness Center',
       'Fast Food Restaurant', 'Sandwich Place', 'Furniture / Home Store',
       'Airport', 'Gas Station', 'Train Station', 'Clothing Store',
       'Pizza Place', 'Pet Store', 'Supermarket', 'Pharmacy',
       'Toy / Game Store', 'Pool', 'Soccer Field', 'Bookstore',
       'Hardware Store', 'Airport Lounge', 'Warehouse Store',
       'Shoe Store', 'Duty-free Shop', 'Shopping Plaza', 'Gym',
       'Outdoor Sculpture', 'Café', 'Gift Shop', 'Athletics & Sports',
       'Soccer Stadium', 'Souvenir Shop', 'Museum', 'Park',
       'Deli / Bodega', 'Bar', 'English Restaurant', 'Bowling Alley',
       'Mexican Restaurant', 'Steakhouse', 'Restaurant', 'Rock Club',
       'Golf Driving Range', 'Bus Stop', 'Ice Cream Shop',
       'Indian Restaurant', 'Snack Place', 'Dessert Shop',
       'Food & D

### Analyze Neighborhoods

In [30]:
liverpoolonehot=pd.get_dummies(dfvenues['Category'],prefix='',prefix_sep='')
liverpoolonehot['Neighborhood']=dfvenues['Neighborhood']

fixedcolumns=[liverpoolonehot.columns[-1]]+list(liverpoolonehot.columns[:-1])
liverpoolonehot=liverpoolonehot[fixedcolumns]

liverpoolonehot.head()


Unnamed: 0,Neighborhood,Airport,Airport Lounge,American Restaurant,Arcade,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Beach,Beer Bar,Beer Garden,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Bus Station,Bus Stop,Café,Caribbean Restaurant,Chinese Restaurant,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Football Field,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cricket Ground,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Donut Shop,Duty-free Shop,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gift Shop,Golf Course,Golf Driving Range,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hardware Store,Historic Site,History Museum,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nightclub,Other Great Outdoors,Outdoor Sculpture,Outdoor Supply Store,Park,Pedestrian Plaza,Pet Store,Pharmacy,Pizza Place,Platform,Plaza,Pool,Pub,Racecourse,Restaurant,Road,Rock Club,Rugby Pitch,Sandwich Place,Sculpture Garden,Shoe Store,Shopping Mall,Shopping Plaza,Snack Place,Soccer Field,Soccer Stadium,Souvenir Shop,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Thai Restaurant,Theater,Toy / Game Store,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Warehouse Store,Wine Bar,Yoga Studio
0,Allerton and Hunts Cross,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Allerton and Hunts Cross,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Allerton and Hunts Cross,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Allerton and Hunts Cross,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Allerton and Hunts Cross,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [31]:
lvpgrouped=liverpoolonehot.groupby(["Neighborhood"]).mean().reset_index()
lvpgrouped

Unnamed: 0,Neighborhood,Airport,Airport Lounge,American Restaurant,Arcade,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Beach,Beer Bar,Beer Garden,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Bus Station,Bus Stop,Café,Caribbean Restaurant,Chinese Restaurant,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Football Field,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cricket Ground,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Donut Shop,Duty-free Shop,English Restaurant,Fast Food Restaurant,Fish & Chips Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gift Shop,Golf Course,Golf Driving Range,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hardware Store,Historic Site,History Museum,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nightclub,Other Great Outdoors,Outdoor Sculpture,Outdoor Supply Store,Park,Pedestrian Plaza,Pet Store,Pharmacy,Pizza Place,Platform,Plaza,Pool,Pub,Racecourse,Restaurant,Road,Rock Club,Rugby Pitch,Sandwich Place,Sculpture Garden,Shoe Store,Shopping Mall,Shopping Plaza,Snack Place,Soccer Field,Soccer Stadium,Souvenir Shop,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Thai Restaurant,Theater,Toy / Game Store,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Warehouse Store,Wine Bar,Yoga Studio
0,Allerton and Hunts Cross,0.015625,0.015625,0.0,0.0,0.0,0.0,0.015625,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.046875,0.0,0.09375,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.046875,0.0,0.015625,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.046875,0.0,0.015625,0.03125,0.0,0.015625,0.0,0.0,0.046875,0.015625,0.015625,0.0,0.0,0.015625,0.046875,0.015625,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.015625,0.03125,0.015625,0.0,0.0,0.015625,0.078125,0.0,0.0,0.0,0.0,0.0,0.046875,0.0,0.015625,0.0,0.015625,0.0,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.015625,0.0,0.015625,0.0,0.0,0.0,0.0,0.015625,0.0,0.0
1,Anfield,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.053571,0.0,0.0,0.017857,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.107143,0.017857,0.017857,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.053571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.017857,0.017857,0.0,0.0,0.0,0.0,0.071429,0.0,0.017857,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.053571,0.017857,0.0,0.0,0.0,0.017857,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0
2,Belle Vale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.078947,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.026316,0.0,0.0,0.0,0.026316,0.078947,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.078947,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.157895,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.026316,0.0,0.0
3,Central,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.017241,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.103448,0.0,0.017241,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.137931,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.12069,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Childwall,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010417,0.0,0.0,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.010417,0.010417,0.0,0.010417,0.0,0.0,0.020833,0.010417,0.104167,0.0,0.010417,0.0,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.010417,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010417,0.020833,0.010417,0.0,0.0,0.0,0.09375,0.010417,0.010417,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.03125,0.0,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.03125,0.020833,0.0,0.0,0.010417,0.145833,0.0,0.0,0.010417,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.010417,0.010417,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010417,0.0,0.0
5,Church,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.020833,0.020833,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0625,0.0,0.041667,0.041667,0.0,0.0625,0.020833,0.0,0.020833,0.0,0.020833,0.0,0.020833,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.020833,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.020833,0.0,0.0,0.041667,0.0,0.0,0.041667,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833
6,Clubmoor,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.023256,0.0,0.0,0.011628,0.0,0.034884,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.081395,0.0,0.0,0.011628,0.05814,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.104651,0.011628,0.023256,0.0,0.0,0.011628,0.0,0.0,0.011628,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.011628,0.0,0.011628,0.0,0.011628,0.0,0.0,0.011628,0.0,0.069767,0.0,0.0,0.011628,0.011628,0.0,0.0,0.0,0.151163,0.0,0.011628,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.011628,0.046512,0.011628,0.011628,0.0,0.011628,0.0,0.05814,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0
7,County,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.039474,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.013158,0.013158,0.0,0.026316,0.013158,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.065789,0.0,0.0,0.0,0.0,0.0,0.0,0.013158,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.039474,0.0,0.013158,0.0,0.0,0.013158,0.0,0.0,0.0,0.0,0.013158,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013158,0.0,0.0,0.0,0.0,0.013158,0.013158,0.039474,0.0,0.0,0.039474,0.026316,0.0,0.0,0.0,0.131579,0.0,0.026316,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.013158,0.013158,0.0,0.013158,0.0,0.078947,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013158,0.0,0.039474,0.0,0.0
8,Cressington,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.05,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.025,0.075,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.025,0.0,0.0,0.025,0.075,0.0,0.0,0.0,0.025,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.025,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.0,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0
9,Croxteth,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.173913,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.130435,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.130435,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.043478,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.043478,0.0,0.0


In [32]:
#create a df for fastfoods

lvpfastfoods=lvpgrouped[["Neighborhood",'Fast Food Restaurant']]
lvpfastfoods.head()

Unnamed: 0,Neighborhood,Fast Food Restaurant
0,Allerton and Hunts Cross,0.03125
1,Anfield,0.107143
2,Belle Vale,0.105263
3,Central,0.103448
4,Childwall,0.083333


### Clustering 

In [33]:
k=3

lvpclustering=lvpfastfoods.drop(["Neighborhood"],1)

kmeans=KMeans(n_clusters=k, random_state=0).fit(lvpclustering)

kmeans.labels_[0:10]

array([0, 1, 1, 1, 2, 0, 2, 2, 2, 1], dtype=int32)

In [34]:
lvpmerged=lvpfastfoods.copy()
lvpmerged["Cluster Labels"]=kmeans.labels_

In [91]:
lvpmerged

Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels
0,Allerton and Hunts Cross,0.03125,0
1,Anfield,0.107143,1
2,Belle Vale,0.105263,1
3,Central,0.103448,1
4,Childwall,0.083333,2
5,Church,0.0,0
6,Clubmoor,0.05814,2
7,County,0.065789,2
8,Cressington,0.075,2
9,Croxteth,0.173913,1


In [35]:
lvpmerged=lvpmerged.join(dfnbh.set_index("Neighborhood"),on="Neighborhood")

lvpmerged.head()

Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels,Latitude,Longitude
0,Allerton and Hunts Cross,0.03125,0,53.35987,-2.85618
1,Anfield,0.107143,1,53.43054,-2.94747
2,Belle Vale,0.105263,1,53.39044,-2.85288
3,Central,0.103448,1,28.65225,77.18306
4,Childwall,0.083333,2,53.39581,-2.88925


In [36]:
mapclusters=folium.Map(location=[latitude,longitude],zoom_start=12)

x=np.arange(k)
ys=[i+x+(i*x)**2 for i in range (k)]
colorarray=cm.rainbow(np.linspace(0,1,len(ys)))
rainbow=[colors.rgb2hex(i) for i in colorarray]

markerscolor=[]
for lat,lng,poi,cluster in zip(lvpmerged['Latitude'],lvpmerged['Longitude'],lvpmerged['Neighborhood'],lvpmerged['Cluster Labels']):
    label=folium.Popup(str(poi)+'- Cluster' + str(cluster),parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(mapclusters)
       
mapclusters

In [37]:
mapclusters.save('mapclusters.html')

### Examine Clusters

In [38]:
lvpmerged.loc[lvpmerged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels,Latitude,Longitude
0,Allerton and Hunts Cross,0.03125,0,53.35987,-2.85618
5,Church,0.0,0,43.10388,-76.20653
12,Greenbank,0.022222,0,53.473088,-3.020253
13,Kensington and Fairfield,0.01,0,53.41202,-2.950375
16,Mossley Hill,0.041096,0,53.38022,-2.91348
18,Old Swan,0.02381,0,53.41349,-2.91274
19,Picton,0.02,0,53.39979,-2.9258
20,Princes Park,0.01,0,53.39268,-2.95455
22,Speke-Garston,0.034483,0,53.351154,-2.885148
26,Wavertree,0.02,0,53.39738,-2.92815


In [40]:
print ("Sum of occurance of fastfood restaurants is:{}".format(lvpmerged.groupby("Cluster Labels")['Fast Food Restaurant'].sum()[0]))

Sum of occurance of fastfood restaurants is:0.24362962583262535


In [107]:
lvpmerged.loc[lvpmerged['Cluster Labels'] == 1]


Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels,Latitude,Longitude
1,Anfield,0.107143,1,53.43054,-2.94747
2,Belle Vale,0.105263,1,53.39044,-2.85288
3,Central,0.103448,1,28.65225,77.18306
9,Croxteth,0.173913,1,53.46182,-2.89537
11,Fazakerley,0.111111,1,53.4691,-2.91527
21,Riverside,0.114286,1,53.44059,-2.885208
27,West Derby,0.113636,1,53.43272,-2.90977


In [41]:
lvpmerged.loc[lvpmerged['Cluster Labels'] == 2]


Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels,Latitude,Longitude
4,Childwall,0.083333,2,53.39581,-2.88925
6,Clubmoor,0.05814,2,53.43463,-2.93364
7,County,0.065789,2,53.443,-2.97072
8,Cressington,0.075,2,53.35883,-2.91193
10,Everton,0.071429,2,53.42949,-2.96739
14,Kirkdale,0.054545,2,53.43436,-2.98555
15,Knotty Ash,0.058824,2,53.41772,-2.8894
17,Norris Green,0.085106,2,53.44209,-2.91886
23,St Michaels,0.045455,2,53.36333,-2.88513
24,Tuebrook and Stoneycroft,0.058824,2,53.41966,-2.91448


In [44]:
for i in range (k):
    print ("Sum of occurance of fastfood restaurants in cluster {} is:{}".format(i,lvpmerged.groupby("Cluster Labels")['Fast Food Restaurant'].sum()[i]))

Sum of occurance of fastfood restaurants in cluster 0 is:0.24362962583262535
Sum of occurance of fastfood restaurants in cluster 1 is:0.8288005234111129
Sum of occurance of fastfood restaurants in cluster 2 is:0.7797840135761118


### At this point, cluster 0 has the lowest density of fastfood resturants-0.24. Thus a ideal location a new mcdonald's would be in cluster 0. Now I want to narrow the area, by a more detail the clustering. 


In [46]:
k2=4

lvpclustering2=lvpfastfoods.drop(["Neighborhood"],1)

kmeans2=KMeans(n_clusters=k2, random_state=0).fit(lvpclustering)

kmeans2.labels_[0:10]

array([0, 1, 1, 1, 2, 0, 2, 2, 2, 3], dtype=int32)

In [47]:
lvpmerged2=lvpfastfoods.copy()
lvpmerged2["Cluster Labels"]=kmeans2.labels_

In [48]:
lvpmerged2

Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels
0,Allerton and Hunts Cross,0.03125,0
1,Anfield,0.107143,1
2,Belle Vale,0.105263,1
3,Central,0.103448,1
4,Childwall,0.083333,2
5,Church,0.0,0
6,Clubmoor,0.05814,2
7,County,0.065789,2
8,Cressington,0.075,2
9,Croxteth,0.173913,3


In [50]:
lvpmerged2=lvpmerged2.join(dfnbh.set_index("Neighborhood"),on="Neighborhood")

lvpmerged2.head()

Unnamed: 0,Neighborhood,Fast Food Restaurant,Cluster Labels,Latitude,Longitude
0,Allerton and Hunts Cross,0.03125,0,53.35987,-2.85618
1,Anfield,0.107143,1,53.43054,-2.94747
2,Belle Vale,0.105263,1,53.39044,-2.85288
3,Central,0.103448,1,28.65225,77.18306
4,Childwall,0.083333,2,53.39581,-2.88925


In [51]:
mapclusters2=folium.Map(location=[latitude,longitude],zoom_start=12)

x=np.arange(k2)
ys=[i+x+(i*x)**2 for i in range (k2)]
colorarray2=cm.rainbow(np.linspace(0,1,len(ys)))
rainbow2=[colors.rgb2hex(i) for i in colorarray2]

markerscolor=[]
for lat,lng,poi,cluster in zip(lvpmerged['Latitude'],lvpmerged['Longitude'],lvpmerged2['Neighborhood'],lvpmerged2['Cluster Labels']):
    label=folium.Popup(str(poi)+'- Cluster' + str(cluster),parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow2[cluster-1],
        fill=True,
        fill_color=rainbow2[cluster-1],
        fill_opacity=0.7).add_to(mapclusters2)
       
mapclusters2

In [53]:
for i in range (k2):
    print ("Sum of occurance of fastfood restaurants in cluster {} is:{}".format(i,lvpmerged2.groupby("Cluster Labels")['Fast Food Restaurant'].sum()[i]))

Sum of occurance of fastfood restaurants in cluster 0 is:0.24362962583262535
Sum of occurance of fastfood restaurants in cluster 1 is:0.654887479932852
Sum of occurance of fastfood restaurants in cluster 2 is:0.7797840135761118
Sum of occurance of fastfood restaurants in cluster 3 is:0.17391304347826086


### Now I know that cluster 0 and cluster 2 didn't change. cluster 3 was a part of cluster, which is recognized as a different cluster after I changed the k to 4. With the lowest occurance of fastfood restaurants, cluster 3, ie. croxteth represnts a great oppournity and potenial for openin