<h1 align=center><font size = 5>The Battle of Neighborhoods</font></h1> 

<h2 align=center><font size = 4>Finding business target clusters in Boston</font></h2>

## Introduction

Choosing where to launch a business or a marketing campaign can be cumbersome. This project aims to ease part of that pre-launch research process.

I used a dataset of venues in central Boston to identify clusters suitable for launching a business or a marketing campaign. Specifically, I looked for areas with few or no Chinese businesses, where a Chinese-oriented business could fill a gap. Likewise, I looked for areas dense in Chinese venues, where a Chinese-oriented marketing campaign could be effective.

## Data

The dataset comes from the Foursquare API. Venues within a 5 km radius of a downtown Boston address were retrieved with the search query "Chinese". This returned 50 venues in and around downtown Boston.

In [1]:
# install the geocoding and mapping dependencies before importing them
!pip install geopy
!pip install folium==0.5.0

import requests  # HTTP calls to the Foursquare API
import pandas as pd
import numpy as np
import random

from geopy.geocoders import Nominatim  # address -> latitude/longitude
from IPython.display import Image
from IPython.display import HTML
from pandas import json_normalize  # flattens nested JSON into a DataFrame
import folium  # interactive maps

import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans



In [2]:
# The code was removed by Watson Studio for sharing.

In [3]:
address = '24 Beacon St, Boston, MA'

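# geocode the reference address; geocode() returns None when nothing matches
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
if location is None:
    raise ValueError('Could not geocode address: {}'.format(address))
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)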

42.35860195 -71.06387508501135


In [4]:
search_query = 'Chinese'  # keyword matched against venues
radius = 5000  # search radius in metres around the reference address

In [5]:
# CLIENT_ID, CLIENT_SECRET, ACCESS_TOKEN, VERSION and LIMIT are presumably defined in the hidden cell In [2]
url = (
    'https://api.foursquare.com/v2/venues/search'
    '?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'
).format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)
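
The URL above is assembled by plain string formatting, so none of the values are percent-encoded. As a minimal sketch of an alternative, the query string can be built with the standard library's `urlencode`. The helper name `build_search_url` is hypothetical, and the credentials, version string, and limit below are placeholders, since the real values sit in the hidden cell.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'https://api.foursquare.com/v2/venues/search'  # endpoint used in this notebook


def build_search_url(lat, lng, query, radius, limit,
                     client_id, client_secret, oauth_token, version):
    """Hypothetical helper: return the search URL with every value percent-encoded."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'oauth_token': oauth_token,
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    return BASE_URL + '?' + urlencode(params)


example_url = build_search_url(
    42.35860195, -71.06387508501135, 'Chinese', 5000,
    limit=50,                          # placeholder: the notebook's LIMIT is not shown
    client_id='YOUR_CLIENT_ID',        # placeholder credential
    client_secret='YOUR_CLIENT_SECRET',
    oauth_token='YOUR_ACCESS_TOKEN',
    version='YYYYMMDD',                # placeholder: the notebook's VERSION is not shown
)

# decode the query string again to confirm the parameters survive encoding
decoded = parse_qs(urlsplit(example_url).query)
print(decoded['ll'], decoded['query'], decoded['radius'])
```

The resulting string can be passed to `requests.get` exactly like `url`. Encoding matters mostly for search terms that contain spaces or ampersands.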

In [6]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5ffcab1855e19a665735f5f6'},
 'notifications': [{'type': 'notificationTray', 'item': {'unreadCount': 0}}],
 'response': {'venues': [{'id': '4bcf2b3d77b29c7466828882',
    'name': '68 Chinese',
    'location': {'address': '48 Winter St',
     'crossStreet': 'Tremont Street',
     'lat': 42.35590744018555,
     'lng': -71.06196594238281,
     'labeledLatLngs': [{'label': 'display',
       'lat': 42.35590744018555,
       'lng': -71.06196594238281}],
     'distance': 338,
     'postalCode': '02108',
     'cc': 'US',
     'city': 'Boston',
     'state': 'MA',
     'country': 'United States',
     'formattedAddress': ['48 Winter St (Tremont Street)',
      'Boston, MA 02108',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d145941735',
      'name': 'Chinese Restaurant',
      'pluralName': 'Chinese Restaurants',
      'shortName': 'Chinese',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/asian_',
       'suffix': 

In [7]:
venues = results['response']['venues']

dataframe = json_normalize(venues)



In [8]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
df = dataframe.loc[:, filtered_columns].copy()  # copy so later assignments do not touch `dataframe`

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # fallback for responses that nest the field under 'venue.'
        categories_list = row['venue.categories']

    # a venue may have no category at all
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

# filter the category for each row
df['categories'] = df.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df.columns = [column.split('.')[-1] for column in df.columns]

df

,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,id
0,68 Chinese,Chinese Restaurant,48 Winter St,Tremont Street,42.355907,-71.061966,"[{'label': 'display', 'lat': 42.35590744018555...",338,02108,US,Boston,MA,United States,"[48 Winter St (Tremont Street), Boston, MA 021...",4bcf2b3d77b29c7466828882
1,Boston Chinese Evangelical Church,Church,249 Harrison Ave,Harrison Avenue & Pine Street,42.34731,-71.063232,"[{'label': 'display', 'lat': 42.34730970235004...",1258,02111,US,Boston,MA,United States,[249 Harrison Ave (Harrison Avenue & Pine Stre...,4b783a50f964a520f3bd2ee3
2,Gene's Chinese Flatbread Cafe,Chinese Restaurant,86 Bedford St,,42.353332,-71.059379,"[{'label': 'display', 'lat': 42.35333234922326...",693,02111,US,Boston,MA,United States,"[86 Bedford St, Boston, MA 02111, United States]",51eff56c498e1fe71b259d44
3,Chinese Acupunture & Herb Services,Acupuncturist,320 Washington St.,,42.356838,-71.058678,"[{'label': 'display', 'lat': 42.35683822631836...",470,,US,Boston,MA,United States,"[320 Washington St., Boston, MA, United States]",55f41721498e1449d3e8870a
4,Chinese food,Chinese Restaurant,,,42.362379,-71.064544,"[{'label': 'display', 'lat': 42.36237928407758...",424,02114,US,Boston,MA,United States,"[Boston, MA 02114, United States]",4fb51dd7e4b087193af7d10a
5,Chinese Food Truck,Food Truck,,,42.363691,-71.068583,"[{'label': 'display', 'lat': 42.36369070063822...",686,,US,Boston,MA,United States,"[Boston, MA, United States]",51f00006498ec2c34c860ff4
6,Chinese Spaghetti Factory,Food,73 Essex St,,42.352403,-71.060738,"[{'label': 'display', 'lat': 42.352403, 'lng':...",736,02111,US,Boston,MA,United States,"[73 Essex St, Boston, MA 02111, United States]",4f4357b919834bc91f561d20
7,The Chinese-American Fine Art Society,Art Gallery,11 Edinboro St,,42.352191,-71.059602,"[{'label': 'display', 'lat': 42.352191, 'lng':...",795,02111,US,Boston,MA,United States,"[11 Edinboro St, Boston, MA 02111, United States]",4f32ceca19836c91c7fce7cd
8,Chinese Consolidated Benevolent Association,Event Space,90 Tyler St,,42.348854,-71.061265,"[{'label': 'display', 'lat': 42.34885399999999...",1106,02111,US,Boston,MA,United States,"[90 Tyler St, Boston, MA 02111, United States]",4be2030218ab2d7f04b05cb4
9,Chinese Gourmet Express,Chinese Restaurant,8 Park Plz,,42.351376,-71.068741,"[{'label': 'display', 'lat': 42.35137557983398...",898,02116,US,Boston,MA,United States,"[8 Park Plz, Boston, MA 02116, United States]",4e4c3aa2bd413c4cc667d0ee
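
The `distance` column appears to hold each venue's distance in metres from the query point. A minimal sketch of how that reading can be checked uses the haversine formula on the geocoded address and the first row of the table. The function name `haversine_m` and the Earth radius constant are assumptions made for this illustration, not part of the Foursquare response.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres, an approximation


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


reference = (42.35860195, -71.06387508501135)    # geocoded address printed by In [3]
venue = (42.35590744018555, -71.06196594238281)  # '68 Chinese', first row of the table

distance_m = haversine_m(*reference, *venue)
print(round(distance_m))
```

The computed value lands within a few metres of the 338 reported for that row, which supports reading the column as metres from the reference address.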


In [9]:
# centre the map on the geocoded reference address
map_boston = folium.Map(location=[latitude, longitude], zoom_start=10)

# add one marker per venue, labelled with its name and category
for lat, lng, name, categories in zip(df['lat'], df['lng'], df['name'], df['categories']):
    label = '{}, {}'.format(name, categories)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_boston)
map_boston

In [10]:
k = 5  # number of clusters
# cluster on the venue coordinates only
df_clustering = df[['lat', 'lng']]
kmeans = KMeans(n_clusters=k, random_state=0).fit(df_clustering)
# store each venue's cluster label as the first column
df.insert(0, 'Cluster Labels', kmeans.labels_)

df["Cluster Labels"].value_counts()

1    31
0     7
2     6
3     4
4     2
Name: Cluster Labels, dtype: int64
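
To connect these counts back to the two goals in the introduction, the clusters can be ranked by their share of venues. The sketch below is only a rough reading: it copies the counts from the output above and ignores how much area each cluster covers, so a small count is not proof of low density.

```python
# venue counts per cluster, copied from the value_counts() output above
cluster_counts = {1: 31, 0: 7, 2: 6, 3: 4, 4: 2}

total = sum(cluster_counts.values())
shares = {label: count / total for label, count in cluster_counts.items()}

# candidate for a marketing campaign: the cluster holding the most venues
densest = max(cluster_counts, key=cluster_counts.get)
# candidate for a new business: the cluster holding the fewest venues
sparsest = min(cluster_counts, key=cluster_counts.get)

# list clusters from largest to smallest share
for label, count in sorted(cluster_counts.items(), key=lambda item: item[1], reverse=True):
    print('cluster {}: {:2d} venues ({:.0%})'.format(label, count, shares[label]))
```

On these counts, cluster 1 holds most of the venues and cluster 4 the fewest.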

In [11]:
# create clustermap
map_clusters = folium.Map(location=[latitude, longitude],zoom_start=12)

# set color scheme: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, k))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers, indexing the palette directly by the cluster label (0 to k-1)
for lat, lng, name, cluster in zip(df['lat'], df['lng'], df['name'], df['Cluster Labels']):
    label = folium.Popup(' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters