## Final Assignment IBM-Data Science-Coursera-Course: The Battle of Neighborhoods (Week 1)

### Title: "Basis for an investment decision on establishing a shopping mall in Oxford, Great Britain"

#### 1. Scenario (Introduction, Business Problem)

Suppose an investor plans to establish a shopping mall in Oxford. One of the key questions at the start of the development process is how many competitors (existing shopping malls) there are in each area of Oxford.

The task is to recommend, on the basis of existing data, in which area the investor should open the shopping mall.

Approach: the decision criterion is the number of competitors in the area. An area with a high number of competitors is a "no-go area". On the other hand, an area with no shopping malls at all could imply that there is no demand for one. The most interesting area is therefore the one "in between".

#### 2. Data

First, I build a dataframe of neighborhoods in Oxford by scraping the data from Wikipedia. Then I get the geographical coordinates of these neighborhoods and obtain the corresponding venue data from the Foursquare API.

#### 3. Model-Approach: Clustering

Finally, I run a k-means algorithm to group the neighborhoods in Oxford into 3 clusters. On this basis I give a recommendation to the investor.

### (1) Import libraries

In [1]:
import numpy as np 
import pandas as pd 
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json 
from geopy.geocoders import Nominatim
import geocoder
import requests
from bs4 import BeautifulSoup

from pandas import json_normalize  # top-level import path since pandas 1.0

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

import folium 

print("Libraries imported.")

Libraries imported.


### (2) Get Data and build dataframe

In [2]:
data = requests.get("https://en.wikipedia.org/wiki/Category:Areas_of_Oxford").text

In [3]:
soup = BeautifulSoup(data, 'html.parser')

In [4]:
neighborhoodList = []

In [5]:
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    neighborhoodList.append(row.text)

In [6]:
kl_df = pd.DataFrame({"Neighborhood": neighborhoodList})
kl_df.head()

Unnamed: 0,Neighborhood
0,Oxford (UK Parliament constituency)
1,Oxford East (UK Parliament constituency)
2,Oxford West and Abingdon (UK Parliament consti...
3,"Barton, Oxfordshire"
4,"Binsey, Oxfordshire"
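The scraped list mixes real neighborhoods with other category entries such as parliamentary constituencies (rows 0 and 1 above). A minimal sketch of one way to filter them out, assuming the marker text `(UK Parliament constituency)` identifies the rows to drop. `is_neighborhood` is a hypothetical helper and not part of the original notebook, which keeps all 58 entries.

```python
# Hypothetical helper: drop category entries that are not neighborhoods.
# Assumption: constituency entries always contain this marker text.
CONSTITUENCY_MARKER = "(UK Parliament constituency)"

def is_neighborhood(name):
    """Return True unless the entry is a parliamentary constituency."""
    return CONSTITUENCY_MARKER not in name

sample = [
    "Oxford (UK Parliament constituency)",
    "Oxford East (UK Parliament constituency)",
    "Barton, Oxfordshire",
    "Binsey, Oxfordshire",
]
filtered = [name for name in sample if is_neighborhood(name)]
print(filtered)
```

Applied to the notebook, the same predicate could be used while filling `neighborhoodList`, which would also change the row counts reported further down.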


In [7]:
kl_df.shape

(58, 1)

In [8]:
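# Geocode a neighborhood name with ArcGIS; returns [latitude, longitude].
# Note: the loop retries until a result arrives and never gives up.
def get_latlng(neighborhood):
    lat_lng_coords = None
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Oxford, Great Britain'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords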
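Because the loop above retries forever, a name that can never be geocoded would hang the notebook. The following is a sketch of a bounded variant, not the notebook's own method. `get_latlng_bounded`, `max_attempts` and `delay_seconds` are hypothetical names, and the geocoder is passed in as a callable so the example runs without network access.

```python
import time

def get_latlng_bounded(neighborhood, geocode, max_attempts=5, delay_seconds=0.0):
    """Retry a geocoding callable a limited number of times.

    `geocode` is any callable that takes an address string and returns
    [lat, lng] or None; it stands in for `geocoder.arcgis(...).latlng`.
    """
    address = '{}, Oxford, Great Britain'.format(neighborhood)
    for _ in range(max_attempts):
        coords = geocode(address)
        if coords is not None:
            return coords
        time.sleep(delay_seconds)
    return None  # the caller decides how to handle a failed lookup

# Usage with a stand-in geocoder that fails twice before answering.
calls = {"n": 0}

def fake_geocode(address):
    calls["n"] += 1
    return [51.7, -1.2] if calls["n"] >= 3 else None

result = get_latlng_bounded("Boars Hill", fake_geocode)
never = get_latlng_bounded("Nowhere", lambda address: None, max_attempts=2)
print(result, never)
```

In the notebook, the callable could be `lambda address: geocoder.arcgis(address).latlng`; rows that come back as `None` would then need to be dropped or fixed by hand before mapping.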

In [9]:
coords = [ get_latlng(neighborhood) for neighborhood in kl_df["Neighborhood"].tolist() ]

In [10]:
coords

[[51.756290000000035, -1.2595099999999775],
 [51.756290000000035, -1.2595099999999775],
 [51.73443900000001, -1.332828000000005],
 [51.76426112003864, -1.2035406366771342],
 [51.76518000000004, -1.2888299999999617],
 [51.72415282980344, -1.2086661199400672],
 [51.71624000000003, -1.2926099999999678],
 [51.750410000000045, -1.3030299999999784],
 [51.735870000000034, -1.2089199999999778],
 [51.73378000000008, -1.2099699999999416],
 [51.788079753113124, -1.2703304937737556],
 [51.74569579080694, -1.3099329267357462],
 [51.74087000000003, -1.2309699999999566],
 [51.734898000000015, -1.2254185340547725],
 [51.784365003607896, -1.288094811432409],
 [51.74330000000003, -1.2592499999999518],
 [51.756290000000035, -1.2595099999999775],
 [51.74134824936639, -1.2887829179648909],
 [51.75795000000005, -1.2180399999999736],
 [51.75455277506092, -1.226168816704663],
 [51.75795000000005, -1.2180399999999736],
 [51.72517423241277, -1.264455033599848],
 [51.75498615102831, -1.2518055606259113],
 [51.72

In [11]:
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [12]:
kl_df['Latitude'] = df_coords['Latitude']
kl_df['Longitude'] = df_coords['Longitude']

In [13]:
print(kl_df.shape)
kl_df

(58, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Oxford (UK Parliament constituency),51.75629,-1.25951
1,Oxford East (UK Parliament constituency),51.75629,-1.25951
2,Oxford West and Abingdon (UK Parliament consti...,51.734439,-1.332828
3,"Barton, Oxfordshire",51.764261,-1.203541
4,"Binsey, Oxfordshire",51.76518,-1.28883
5,Blackbird Leys,51.724153,-1.208666
6,Boars Hill,51.71624,-1.29261
7,"Botley, Oxfordshire",51.75041,-1.30303
8,Church Cowley,51.73587,-1.20892
9,"Cowley, Oxfordshire",51.73378,-1.20997
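Rows 0 and 1 in the table above share exactly the same coordinates. Identical coordinates lead to identical Foursquare queries, so such rows carry the same venue data under different names. A small sketch for flagging them, with `shared_coordinates` as a hypothetical helper and the sample rows copied from the table:

```python
from collections import defaultdict

def shared_coordinates(rows, ndigits=5):
    """Group neighborhood names by rounded (lat, lng); keep groups of 2+."""
    groups = defaultdict(list)
    for name, lat, lng in rows:
        groups[(round(lat, ndigits), round(lng, ndigits))].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}

# Rows copied from the table above.
rows = [
    ("Oxford (UK Parliament constituency)", 51.75629, -1.25951),
    ("Oxford East (UK Parliament constituency)", 51.75629, -1.25951),
    ("Barton, Oxfordshire", 51.764261, -1.203541),
    ("Binsey, Oxfordshire", 51.76518, -1.28883),
]
duplicates = shared_coordinates(rows)
print(duplicates)
```

On the full dataframe the rows could be produced with `kl_df.itertuples(index=False)`, assuming the column order Neighborhood, Latitude, Longitude shown above.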


In [14]:
# CSV 
kl_df.to_csv("kl_df.csv", index=False)

In [15]:
# Coordinates 
address = 'Oxford, Great Britain'
geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The coordinates of Oxford, Great Britain are {}, {}.'.format(latitude, longitude))

The coordinates of Oxford, Great Britain are 51.7520131, -1.2578499.


### (3) Mapping

In [16]:
map_kl = folium.Map(location=[latitude, longitude], zoom_start=11)

for lat, lng, neighborhood in zip(kl_df['Latitude'], kl_df['Longitude'], kl_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_kl)  
    
map_kl



In [17]:
# HTML file
map_kl.save('map_kl.html')

### (4) Foursquare data and extended dataframe

In [18]:
CLIENT_ID = 'WRGFNYARCKALJ0TIBY2OXQ3CJPGOTRDJBN0X5OGFJNMW5TRQ'
CLIENT_SECRET = '0KREPQ4BXGHEQ3TPTKG03ZPSFMMQSPGQVFON2WPDTNH1BEA3'
VERSION = '20191229'

print('My credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

My credentials:
CLIENT_ID: WRGFNYARCKALJ0TIBY2OXQ3CJPGOTRDJBN0X5OGFJNMW5TRQ
CLIENT_SECRET:0KREPQ4BXGHEQ3TPTKG03ZPSFMMQSPGQVFON2WPDTNH1BEA3
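Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. A hedged sketch of an alternative is to read them from environment variables and mask them when logging. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions for illustration, not something the original notebook or Foursquare prescribes.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read credentials from environment variables (names are assumptions)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.")
    return client_id, client_secret

def mask(value, visible=4):
    """Show only the first few characters when logging a secret."""
    return value[:visible] + "*" * max(len(value) - visible, 0)

# Usage with a stand-in mapping instead of the real environment.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id-123",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret-456",
}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", mask(demo_id))
```

Passing the mapping as a parameter keeps the function easy to try out without touching the real environment.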


In [19]:
radius = 2000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(kl_df['Latitude'], kl_df['Longitude'], kl_df['Neighborhood']):
    
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
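The loop above indexes straight into the response, so a neighborhood without results or a venue without a category would raise an exception. The sketch below shows a more tolerant way to flatten one response. `extract_venues` is a hypothetical helper, and the sample payload is a stand-in shaped like the structure the loop accesses (`response`, `groups`, `items`, `venue`), not a real API reply.

```python
def extract_venues(payload, neighborhood, lat, lng):
    """Flatten one explore response into tuples, tolerating missing fields."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or [{}]
        rows.append((
            neighborhood, lat, lng,
            venue.get("name"),
            location.get("lat"), location.get("lng"),
            categories[0].get("name"),  # None when no category is attached
        ))
    return rows

# Stand-in payload shaped like the response the loop above indexes into.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "The Ashmolean Museum",
               "location": {"lat": 51.755246, "lng": -1.260079},
               "categories": [{"name": "History Museum"}]}},
    {"venue": {"name": "Unnamed venue", "location": {}, "categories": []}},
]}]}}
rows = extract_venues(sample_payload, "Oxford", 51.75629, -1.25951)
empty = extract_venues({"response": {}}, "Oxford", 51.75629, -1.25951)
print(rows[0], len(empty))
```

Rows with a missing category come back with `None` in the last position and can be filtered out before the one-hot encoding step.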

In [20]:
venues_df = pd.DataFrame(venues)

venues_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(3962, 7)


Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Oxford (UK Parliament constituency),51.75629,-1.25951,The Ashmolean Museum,51.755246,-1.260079,History Museum
1,Oxford (UK Parliament constituency),51.75629,-1.25951,Thirsty Meeples Board Game Cafe,51.754165,-1.261674,Gaming Cafe
2,Oxford (UK Parliament constituency),51.75629,-1.25951,Blackwell's,51.754635,-1.255517,Bookstore
3,Oxford (UK Parliament constituency),51.75629,-1.25951,Waterstones,51.754102,-1.258802,Bookstore
4,Oxford (UK Parliament constituency),51.75629,-1.25951,Oxford University Museum of Natural History,51.75869,-1.255595,History Museum


In [21]:
venues_df.groupby(["Neighborhood"]).count()

Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
"Barton, Oxfordshire",39,39,39,39,39,39
"Binsey, Oxfordshire",57,57,57,57,57,57
Blackbird Leys,41,41,41,41,41,41
Boars Hill,5,5,5,5,5,5
"Botley, Oxfordshire",16,16,16,16,16,16
Church Cowley,51,51,51,51,51,51
"Cowley, Oxfordshire",56,56,56,56,56,56
Cutteslowe,30,30,30,30,30,30
"Dean Court, Oxfordshire",25,25,25,25,25,25
"Donnington, Oxfordshire",100,100,100,100,100,100


In [22]:
print('Number of unique categories: {}'.format(len(venues_df['VenueCategory'].unique())))

Number of unique categories: 137


In [23]:
venues_df['VenueCategory'].unique()[:50]

array(['History Museum', 'Gaming Cafe', 'Bookstore', 'Coffee Shop',
       'Hotel', 'Pizza Place', 'Art Gallery', 'Ice Cream Shop',
       'Cocktail Bar', 'Dessert Shop', 'Restaurant', 'French Restaurant',
       'Pub', 'Park', 'Plaza', 'Bakery', 'Candy Store', 'Sandwich Place',
       'Science Museum', 'Bridge', 'Thai Restaurant', 'Canal', 'Café',
       'Theater', 'Indian Restaurant', 'Italian Restaurant', 'Roof Deck',
       'Juice Bar', 'Market', 'Brazilian Restaurant', 'Shopping Mall',
       'Department Store', 'Monument / Landmark', 'Chinese Restaurant',
       'Portuguese Restaurant', 'Field', 'Pie Shop',
       'Vegetarian / Vegan Restaurant', 'Burger Joint',
       'Caribbean Restaurant', 'Middle Eastern Restaurant',
       'Sushi Restaurant', 'Movie Theater', 'Wine Bar', 'Farmers Market',
       'Coworking Space', 'English Restaurant', 'Lebanese Restaurant',
       'Bar', 'Steakhouse'], dtype=object)

In [24]:
"Shopping Mall" in venues_df['VenueCategory'].unique()

True

In [25]:
kl_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")
kl_onehot['Neighborhoods'] = venues_df['Neighborhood'] 

fixed_columns = [kl_onehot.columns[-1]] + list(kl_onehot.columns[:-1])
kl_onehot = kl_onehot[fixed_columns]

print(kl_onehot.shape)
kl_onehot.head()

(3962, 138)


Unnamed: 0,Neighborhoods,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Automotive Shop,Bakery,Bar,Beer Bar,Bookstore,Botanical Garden,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Burger Joint,Burrito Place,Bus Station,Bus Stop,Café,Campground,Canal,Canal Lock,Candy Store,Caribbean Restaurant,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,College Gym,College Library,Concert Hall,Construction & Landscaping,Convenience Store,Coworking Space,Deli / Bodega,Department Store,Dessert Shop,Eastern European Restaurant,Electronics Store,English Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Food Truck,Forest,French Restaurant,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,Gift Shop,Golf Course,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Hardware Store,History Museum,Hookah Bar,Hostel,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Juice Bar,Lebanese Restaurant,Liquor Store,Market,Mediterranean Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Noodle House,Outdoor Supply Store,Park,Parking,Performing Arts Venue,Pet Store,Pharmacy,Pie Shop,Pizza Place,Platform,Plaza,Pool,Portuguese Restaurant,Pub,Record Shop,Recreation Center,Rental Car Location,Rest Area,Restaurant,River,Roof Deck,Sandwich Place,Scandinavian Restaurant,Science Museum,Sculpture Garden,Seafood Restaurant,Shopping Mall,Skating Rink,Soccer Field,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Sports Club,Stationery Store,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tennis Court,Thai Restaurant,Theater,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop
0,Oxford (UK Parliament constituency),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Oxford (UK Parliament constituency),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Oxford (UK Parliament constituency),0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Oxford (UK Parliament constituency),0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Oxford (UK Parliament constituency),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [26]:
kl_grouped = kl_onehot.groupby(["Neighborhoods"]).mean().reset_index()

print(kl_grouped.shape)
kl_grouped

(58, 138)


Unnamed: 0,Neighborhoods,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Automotive Shop,Bakery,Bar,Beer Bar,Bookstore,Botanical Garden,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Burger Joint,Burrito Place,Bus Station,Bus Stop,Café,Campground,Canal,Canal Lock,Candy Store,Caribbean Restaurant,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,College Gym,College Library,Concert Hall,Construction & Landscaping,Convenience Store,Coworking Space,Deli / Bodega,Department Store,Dessert Shop,Eastern European Restaurant,Electronics Store,English Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Food Truck,Forest,French Restaurant,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,Gift Shop,Golf Course,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Hardware Store,History Museum,Hookah Bar,Hostel,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Juice Bar,Lebanese Restaurant,Liquor Store,Market,Mediterranean Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Noodle House,Outdoor Supply Store,Park,Parking,Performing Arts Venue,Pet Store,Pharmacy,Pie Shop,Pizza Place,Platform,Plaza,Pool,Portuguese Restaurant,Pub,Record Shop,Recreation Center,Rental Car Location,Rest Area,Restaurant,River,Roof Deck,Sandwich Place,Scandinavian Restaurant,Science Museum,Sculpture Garden,Seafood Restaurant,Shopping Mall,Skating Rink,Soccer Field,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Sports Club,Stationery Store,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tennis Court,Thai Restaurant,Theater,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop
0,"Barton, Oxfordshire",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.076923,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.128205,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.025641,0.0,0.128205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Binsey, Oxfordshire",0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.017544,0.0,0.0,0.017544,0.017544,0.0,0.017544,0.035088,0.0,0.0,0.0,0.017544,0.017544,0.017544,0.017544,0.035088,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.035088,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.035088,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.192982,0.0,0.0,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.017544,0.0
2,Blackbird Leys,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.02439,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.04878,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.097561,0.02439,0.073171,0.0,0.02439,0.0,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.097561,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.097561,0.0,0.0,0.0,0.0,0.0,0.073171,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Boars Hill,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Botley, Oxfordshire",0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0
5,Church Cowley,0.0,0.0,0.0,0.019608,0.019608,0.0,0.019608,0.019608,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.039216,0.019608,0.058824,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.039216,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.098039,0.019608,0.058824,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.078431,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.058824,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.019608,0.019608,0.0,0.0,0.0,0.019608
6,"Cowley, Oxfordshire",0.017857,0.0,0.0,0.017857,0.017857,0.0,0.017857,0.017857,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.053571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.089286,0.017857,0.053571,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.053571,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.089286,0.0,0.0,0.0,0.017857,0.0,0.053571,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857
7,Cutteslowe,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.033333,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0
8,"Dean Court, Oxfordshire",0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04
9,"Donnington, Oxfordshire",0.0,0.01,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.04,0.01,0.02,0.01,0.0,0.0,0.01,0.0,0.05,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.01,0.18,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01


In [27]:
len(kl_grouped[kl_grouped["Shopping Mall"] > 0])

36

In [28]:
kl_mall = kl_grouped[["Neighborhoods","Shopping Mall"]]

In [29]:
kl_mall #.head()

Unnamed: 0,Neighborhoods,Shopping Mall
0,"Barton, Oxfordshire",0.0
1,"Binsey, Oxfordshire",0.017544
2,Blackbird Leys,0.02439
3,Boars Hill,0.0
4,"Botley, Oxfordshire",0.0625
5,Church Cowley,0.019608
6,"Cowley, Oxfordshire",0.017857
7,Cutteslowe,0.0
8,"Dean Court, Oxfordshire",0.04
9,"Donnington, Oxfordshire",0.01
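The `Shopping Mall` column is the mean of a one-hot indicator, so it is the share of a neighborhood's returned venues that are shopping malls, not a count. Donnington returned 100 venues and has a value of 0.01, while Botley returned 16 and has 0.0625, and both values correspond to a single mall. A small sketch of that arithmetic, with `category_share` as a hypothetical helper and an illustrative venue list:

```python
from collections import Counter

def category_share(categories, target):
    """Fraction of a neighborhood's venues that belong to `target`."""
    if not categories:
        return 0.0
    return Counter(categories)[target] / len(categories)

# Assumption for illustration: 16 venues, one of them a shopping mall,
# which is consistent with the 16 venues counted for Botley above.
botley_like = ["Shopping Mall"] + ["Pub"] * 15
share = category_share(botley_like, "Shopping Mall")
print(share)  # 0.0625
```

This matters for the interpretation below: a high share can come from a small number of venues overall rather than from many malls.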


### (5) Clustering

In [30]:
kclusters = 3
kl_clustering = kl_mall.drop(columns=["Neighborhoods"])
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(kl_clustering)
kmeans.labels_[0:10]

array([0, 2, 2, 0, 1, 2, 2, 0, 1, 2], dtype=int32)
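The label numbers that k-means assigns are arbitrary, so label 1 is not necessarily the "middle" cluster. A minimal sketch of ranking the clusters by their mean mall share, so that low, middle and high can be read off directly. `rank_clusters` is a hypothetical helper, and the sample values are the first five shares and labels from the outputs above.

```python
def rank_clusters(values, labels):
    """Map each raw cluster label to its rank by mean value (0 = lowest)."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    ordered = sorted(means, key=means.get)
    return {label: rank for rank, label in enumerate(ordered)}

# First five mall shares and cluster labels from the outputs above.
values = [0.0, 0.017544, 0.02439, 0.0, 0.0625]
labels = [0, 2, 2, 0, 1]
ranks = rank_clusters(values, labels)
print(ranks)  # {0: 0, 2: 1, 1: 2}
```

For these rows, label 0 has the lowest mean share, label 2 sits in between, and label 1 has the highest.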

In [31]:
kl_merged = kl_mall.copy()
kl_merged["Cluster Labels"] = kmeans.labels_

In [32]:
kl_merged.rename(columns={"Neighborhoods": "Neighborhood"}, inplace=True)
kl_merged.head()

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels
0,"Barton, Oxfordshire",0.0,0
1,"Binsey, Oxfordshire",0.017544,2
2,Blackbird Leys,0.02439,2
3,Boars Hill,0.0,0
4,"Botley, Oxfordshire",0.0625,1


In [33]:
kl_merged = kl_merged.join(kl_df.set_index("Neighborhood"), on="Neighborhood")
print(kl_merged.shape)
kl_merged.head()

(58, 5)


Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,"Barton, Oxfordshire",0.0,0,51.764261,-1.203541
1,"Binsey, Oxfordshire",0.017544,2,51.76518,-1.28883
2,Blackbird Leys,0.02439,2,51.724153,-1.208666
3,Boars Hill,0.0,0,51.71624,-1.29261
4,"Botley, Oxfordshire",0.0625,1,51.75041,-1.30303


In [34]:
print(kl_merged.shape)
kl_merged.sort_values(["Cluster Labels"], inplace=True)
kl_merged

(58, 5)


Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,"Barton, Oxfordshire",0.0,0,51.764261,-1.203541
55,Wolvercote,0.0,0,51.78394,-1.29087
52,Sunnymead,0.0,0,51.738731,-1.162294
51,"Summertown, Oxford",0.0,0,51.77624,-1.26357
45,"Sandhills, Oxfordshire",0.0,0,51.7642,-1.18093
37,Oxford West and Abingdon (UK Parliament consti...,0.0,0,51.734439,-1.332828
33,Old Marston,0.0,0,51.76644,-1.23514
32,Old Headington,0.0,0,51.753396,-1.213181
31,"Northway, Oxford",0.0,0,51.763113,-1.19703
56,Wolvercote Common,0.0,0,51.784658,-1.283562


### (6) Result Mapping

In [35]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lon, poi, cluster in zip(kl_merged['Latitude'], kl_merged['Longitude'], kl_merged['Neighborhood'], kl_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
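The lookup `rainbow[cluster-1]` shifts every cluster by one position, and for cluster 0 the index -1 wraps around to the last color. The sketch below reproduces that lookup. The three hex values are an assumption: they are what I expect `cm.rainbow` to yield for three evenly spaced points, hard-coded so the example runs without matplotlib.

```python
# Assumption: hex values expected from cm.rainbow at 0, 0.5 and 1.
rainbow = ['#8000ff', '#80ffb4', '#ff0000']  # purple, green, red

def marker_color(cluster, palette):
    """Reproduce the notebook's `palette[cluster-1]` lookup."""
    return palette[cluster - 1]  # cluster 0 -> index -1 -> last entry

color_by_cluster = {c: marker_color(c, rainbow) for c in range(3)}
print(color_by_cluster)
```

Under that assumption, cluster 0 is drawn in red, cluster 1 in purple and cluster 2 in green.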

In [36]:
# HTML
map_clusters.save('map_clusters.html')

In [37]:
kl_merged.loc[kl_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
0,"Barton, Oxfordshire",0.0,0,51.764261,-1.203541
55,Wolvercote,0.0,0,51.78394,-1.29087
52,Sunnymead,0.0,0,51.738731,-1.162294
51,"Summertown, Oxford",0.0,0,51.77624,-1.26357
45,"Sandhills, Oxfordshire",0.0,0,51.7642,-1.18093
37,Oxford West and Abingdon (UK Parliament consti...,0.0,0,51.734439,-1.332828
33,Old Marston,0.0,0,51.76644,-1.23514
32,Old Headington,0.0,0,51.753396,-1.213181
31,"Northway, Oxford",0.0,0,51.763113,-1.19703
56,Wolvercote Common,0.0,0,51.784658,-1.283562


In [38]:
kl_merged.loc[kl_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
14,Harcourt Hill,0.030303,1,51.741348,-1.288783
4,"Botley, Oxfordshire",0.0625,1,51.75041,-1.30303
24,New Botley,0.0625,1,51.75041,-1.30303
8,"Dean Court, Oxfordshire",0.04,1,51.745696,-1.309933
29,North Hinksey,0.032258,1,51.750286,-1.291099


In [39]:
kl_merged.loc[kl_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Shopping Mall,Cluster Labels,Latitude,Longitude
43,Risinghurst,0.01,2,51.75629,-1.25951
44,"Rose Hill, Oxfordshire",0.010526,2,51.73245,-1.22886
46,"Science Area, Oxford",0.01,2,51.75629,-1.25951
48,St John Street area,0.01,2,51.75616,-1.261843
42,"Redbridge, Oxford",0.01,2,51.75629,-1.25951
49,"St Thomas', Oxford",0.01,2,51.751757,-1.2659
50,St. Ebbes,0.01,2,51.750433,-1.259375
2,Blackbird Leys,0.02439,2,51.724153,-1.208666
53,Walton Manor,0.01,2,51.761108,-1.267151
54,"Waterways, Oxford",0.01,2,51.75629,-1.25951


### (7) Recommendation





Cluster 0 (red dots on the map) covers the areas where the share of shopping malls among the venues is zero or negligible. They lie in the north and east and, to a lesser extent, in the south and west. Here the investor should be cautious about opening a shopping mall: there may not be sufficient demand.

Cluster 2 (green dots on the map) is the middle of Oxford, for example St. Ebbes and the Science Area, with a mall share of roughly 0.01 to 0.02 in the rows shown above. This is the area "in between" and seems to offer the main potential for making money with a shopping mall: not many competitors, but apparently sufficient demand.

Cluster 1 (purple dots on the map) is a small region in the west of Oxford (Botley, New Botley, Dean Court, North Hinksey, Harcourt Hill) with the highest mall share, about 0.03 to 0.06. Competition is strongest here, but demand is apparently also the highest. If the investor succeeds in cluster 2 and builds a reputation, a further mall in this cluster could follow at a later time.