# San Francisco Housing Choice For Kiki

## Scenario: 
#### Kiki got a new job in San Francisco, so she is moving to the city. Public transportation in SF is great and Kiki can afford the time for a longer commute, so she does not need to live near her company. However, she needs to walk her dogs in a park and she also likes to jog there, so she prefers a neighborhood with the most parks nearby.

## Goal: 
#### Help Kiki find the neighborhood with the most parks or gardens.

Libraries imported:

In [9]:
import json
import urllib.request # provides urllib.request.urlopen, used to download the dataset

import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
from pandas import json_normalize # flattens nested JSON into a dataframe (pandas >= 1.0)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)


import requests # library to handle requests

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

## 1. The Dataset

In [10]:
url = 'https://data.sfgov.org/api/views/a8z7-xscr/rows.json?accessType=DOWNLOAD'

In [11]:
json_obj = json.load(urllib.request.urlopen(url))

In [12]:
with open('SanFrancisco.Neighborhoods.json') as f:
    data = json.load(f)

In [13]:
# assign relevant part of JSON to venues
venues = data['features']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head(5)

Unnamed: 0,id,type,geometry.geometries,geometry.type,properties.id,properties.neighborhood
0,94105,Feature,"[{'type': 'Polygon', 'coordinates': [[[-122.39...",GeometryCollection,94105,Rincon Hill
1,94107,Feature,"[{'type': 'Polygon', 'coordinates': [[[-122.38...",GeometryCollection,94107,South Beach
2,94108,Feature,"[{'type': 'Polygon', 'coordinates': [[[-122.40...",GeometryCollection,94108,Chinatown
3,94109,Feature,"[{'type': 'Polygon', 'coordinates': [[[-122.42...",GeometryCollection,94109,Nob Hill
4,94112,Feature,"[{'type': 'Polygon', 'coordinates': [[[-122.42...",GeometryCollection,94112,Ingleside


### Cleaning of the dataset:

In [14]:
def get_first_geometry_field(row, field):
    """Return `field` ('coordinates' or 'type') of the row's first geometry, or None if it has none."""
    geometries = row['geometry.geometries']
    if len(geometries) == 0:
        return None
    return geometries[0][field]

# vertex rings of each neighborhood's first geometry
dataframe['coordinates'] = dataframe.apply(get_first_geometry_field, axis=1, field='coordinates')

# geometry kind of each neighborhood's first geometry ('Polygon' or 'MultiPolygon')
dataframe['coordinates_type'] = dataframe.apply(get_first_geometry_field, axis=1, field='type')

In [15]:
dataframe = dataframe.drop(columns=['geometry.geometries','id','type','geometry.type'])
dataframe = dataframe.rename(columns={'properties.id':'PostalCode','properties.neighborhood':'Neighborhood'})
dataframe = dataframe.sort_values(by='Neighborhood')

In [16]:
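# keep only single-polygon neighborhoods; avg() below expects Polygon nesting (rings of points)
dataframe = dataframe[dataframe.coordinates_type != 'MultiPolygon'].copy()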

In [17]:
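def avg(d):
    """Approximate a polygon's center as the mean of its vertices.

    d is a list of rings; each ring is a list of [longitude, latitude] points.
    """
    n = 0
    x = 0
    y = 0
    for ring in d:
        for point in ring:
            x += point[0]
            y += point[1]
            n += 1
    return [x / n, y / n]

dataframe['coordinates'] = dataframe['coordinates'].apply(avg)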
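
The `avg` helper above reduces each neighborhood polygon to one point by averaging its vertices. The sketch below applies the same idea to a made-up square; the ring and the name `vertex_mean` are illustrative and not part of the dataset. It suggests one caveat: a vertex mean only approximates the area centroid, and because GeoJSON rings repeat their first vertex at the end, that vertex is counted twice.

```python
def vertex_mean(rings):
    """Mean of all [lon, lat] vertices across the rings of one polygon."""
    points = [pt for ring in rings for pt in ring]
    lon = sum(pt[0] for pt in points) / len(points)
    lat = sum(pt[1] for pt in points) / len(points)
    return [lon, lat]

# Hypothetical closed ring in GeoJSON order: [longitude, latitude].
# The first vertex appears again at the end, so it is averaged twice.
square = [[[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0], [0.0, 0.0]]]
center = vertex_mean(square)
print(center)
```

For this square the geometric center is (1, 1), but the vertex mean lands closer to the repeated corner. For neighborhood-sized polygons with many vertices the offset is likely small, though it may matter for a 500 m search radius.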

In [18]:
dataframe['Latitude']= dataframe['coordinates'].apply(lambda x: x[1])
dataframe['Longitude']= dataframe['coordinates'].apply(lambda x: x[0])

In [19]:
SF_data = dataframe.drop(columns=['coordinates','coordinates_type'])

In [20]:
SF_data.head()

Unnamed: 0,PostalCode,Neighborhood,Latitude,Longitude
22,94134,Bayshore,37.718559,-122.412709
7,94124,Bayview,37.726869,-122.375455
5,94114,Castro,37.757239,-122.443479
2,94108,Chinatown,37.792016,-122.407486
15,94111,Financial District,37.801225,-122.399718


## 2. Clustering

### (1). Geographical coordinates of SF

In [21]:
from geopy.geocoders import Nominatim

address = 'San Francisco, California'

geolocator = Nominatim(user_agent="SF_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of San Francisco are {}, {}.'.format(latitude, longitude))

The geographical coordinates of San Francisco are 37.7790262, -122.4199061.


In [22]:
import folium # map rendering library

# create map of SF using latitude and longitude values
map_SF = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(SF_data['Latitude'], SF_data['Longitude'], SF_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_SF)  
    
map_SF

### (2). Foursquare API for SF

In [23]:
CLIENT_ID = 'ZZBPAF0R0X2C2U0C0IHS55CD0J2JB5JAWN4OX5NL3EG435W5' # your Foursquare ID
CLIENT_SECRET = 'XD2XTKDO3ZEDBRYBEKKNA53NW4VYHDKRNW4U3GJPHV5XGUKF' # your Foursquare Secret
VERSION = '20180604' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ZZBPAF0R0X2C2U0C0IHS55CD0J2JB5JAWN4OX5NL3EG435W5
CLIENT_SECRET:XD2XTKDO3ZEDBRYBEKKNA53NW4VYHDKRNW4U3GJPHV5XGUKF


In [24]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius

In [25]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
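
The list comprehension in `getNearbyVenues` assumes that every response contains `response` → `groups[0]` → `items` and that every venue has at least one category. The sketch below performs the same flattening defensively on a hand-written payload. The dictionary only mimics the keys the function reads and is not real Foursquare output, and `flatten_items` is a hypothetical helper name.

```python
def flatten_items(name, lat, lng, payload):
    """Turn one explore-style payload into a list of venue tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        # Fall back to None instead of raising IndexError on an empty list
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    return rows

# Hand-written stand-in for a response; all values are made up
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Park", "location": {"lat": 37.7, "lng": -122.4},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Uncategorized Spot", "location": {"lat": 37.8, "lng": -122.5},
               "categories": []}},
]}]}}

rows = flatten_items("Example Neighborhood", 37.75, -122.45, fake_payload)
empty_rows = flatten_items("Nowhere", 0.0, 0.0, {"response": {}})
print(rows)
```

A payload without groups yields an empty list rather than an exception, so one failed request would not abort the loop over all neighborhoods.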

In [26]:
SF_Neighborhoods = SF_data

SF_venues = getNearbyVenues(names=SF_Neighborhoods['Neighborhood'],
                            latitudes=SF_Neighborhoods['Latitude'],
                            longitudes=SF_Neighborhoods['Longitude']
                             )

Bayshore
Bayview
Castro
Chinatown
Financial District
Ingleside
Inner Richmond
Lower Pacific Heights
Marina
Mission
Nob Hill
Panhandle
Portola
Rincon Hill
SoMa
South Beach
Sunset
Taraval
Tenderloin
Twin Peaks


In [27]:
SF_venues.groupby('Neighborhood')['Venue Category'].count().head(3)

Neighborhood
Bayshore     5
Bayview      4
Castro      15
Name: Venue Category, dtype: int64

In [28]:
print('There are {} unique categories.'.format(len(SF_venues['Venue Category'].unique())))

There are 208 unique categories.


### (3). Analyze Each Neighborhood

In [29]:
# one hot encoding
SF_onehot = pd.get_dummies(SF_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
SF_onehot['Neighborhood'] =SF_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [SF_onehot.columns[-1]] + list(SF_onehot.columns[:-1])
SF_onehot = SF_onehot[fixed_columns]

SF_onehot.head(5)

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Austrian Restaurant,Auto Garage,Automotive Shop,Bakery,Bank,Bar,Baseball Field,Basketball Stadium,Beer Bar,Beer Garden,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Business Service,Cafeteria,Café,Camera Store,Cantonese Restaurant,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Concert Hall,Convenience Store,Cooking School,Cosmetics Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Gastropub,Gay Bar,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Health & Beauty Service,Herbs & Spices Store,Hill,Historic Site,History Museum,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Hunan Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Lawyer,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Marijuana Dispensary,Market,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Motel,Motorcycle Shop,Moving Target,Museum,Music Store,Music Venue,New American Restaurant,Nightclub,Office,Organic Grocery,Outdoor Sculpture,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Pier,Pizza Place,Playground,Plaza,Poke Place,Pool,Pop-Up Shop,Pub,Public Art,Ramen Restaurant,Record Shop,Reservoir,Restaurant,Road,Rock Club,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Steakhouse,Street Food Gathering,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Tiki Bar,Toy / Game Store,Trade School,Trail,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Winery,Women's Store,Yoga Studio
0,Bayshore,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Bayshore,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Bayshore,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Bayshore,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Bayshore,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0


In [30]:
SF_onehot.shape

(839, 209)

In [31]:
SF_grouped = SF_onehot.groupby('Neighborhood').mean().reset_index()
SF_grouped.head(5)

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Austrian Restaurant,Auto Garage,Automotive Shop,Bakery,Bank,Bar,Baseball Field,Basketball Stadium,Beer Bar,Beer Garden,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Business Service,Cafeteria,Café,Camera Store,Cantonese Restaurant,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Concert Hall,Convenience Store,Cooking School,Cosmetics Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Gastropub,Gay Bar,German Restaurant,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Health & Beauty Service,Herbs & Spices Store,Hill,Historic Site,History Museum,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Hunan Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Lawyer,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Marijuana Dispensary,Market,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Motel,Motorcycle Shop,Moving Target,Museum,Music Store,Music Venue,New American Restaurant,Nightclub,Office,Organic Grocery,Outdoor Sculpture,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Pier,Pizza Place,Playground,Plaza,Poke Place,Pool,Pop-Up Shop,Pub,Public Art,Ramen Restaurant,Record Shop,Reservoir,Restaurant,Road,Rock Club,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Steakhouse,Street Food Gathering,Supermarket,Supplement Shop,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Tiki Bar,Toy / Game Store,Trade School,Trail,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Winery,Women's Store,Yoga Studio
0,Bayshore,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bayview,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Castro,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667
3,Chinatown,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.05,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.03,0.04,0.07,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.06,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.01
4,Financial District,0.0,0.0,0.056604,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.056604,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.037736,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037736,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.075472,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.018868,0.0,0.018868,0.018868,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.037736,0.0,0.0,0.018868,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037736,0.018868,0.0,0.037736,0.037736,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.018868,0.0,0.0,0.0,0.018868,0.037736,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037736,0.0,0.0,0.037736,0.0,0.0,0.0,0.0,0.0


In [32]:
SF_grouped.shape

(20, 209)
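
Taking the mean of one-hot columns per neighborhood yields the share of that neighborhood's venues in each category, which is what `SF_grouped` holds. A small sketch with made-up venues (the neighborhood names and categories are illustrative only):

```python
import pandas as pd

# Made-up venues: neighborhood "A" has 4 venues, "B" has 2
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Park", "Park", "Café", "Garden", "Café", "Bar"],
})

# One 0/1 indicator column per category
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# Mean of the indicators = fraction of the neighborhood's venues in each category
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

Because these are shares rather than counts, a neighborhood with few venues can show a high park share from a single park. That is worth keeping in mind when comparing rows such as Bayshore (5 venues) with Chinatown.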

In [33]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [34]:
num_top_venues = 20

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    if ind < len(indicators):
        # 1st, 2nd, 3rd
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    else:
        # 4th through 20th
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = SF_grouped['Neighborhood']

for ind in np.arange(SF_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(SF_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(5)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Bayshore,Park,Garden,Baseball Field,Trail,Fried Chicken Joint,French Restaurant,Fountain,Football Stadium,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Filipino Restaurant,Fast Food Restaurant,Yoga Studio,Furniture / Home Store,Eye Doctor,Exhibit,Event Space
1,Bayview,Public Art,Bakery,Motorcycle Shop,Spa,Donut Shop,Filipino Restaurant,French Restaurant,Fountain,Football Stadium,Dive Bar,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Fast Food Restaurant,Dry Cleaner,Farmers Market,Fried Chicken Joint,Exhibit
2,Castro,Park,Scenic Lookout,Trail,Yoga Studio,Grocery Store,Reservoir,Road,Café,Speakeasy,Garden,Playground,Hill,Food,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Eye Doctor,Exhibit,Historic Site,Event Space
3,Chinatown,Coffee Shop,Hotel,Bakery,Cocktail Bar,Bubble Tea Shop,Gym,American Restaurant,Clothing Store,Café,Chinese Restaurant,Boutique,Sushi Restaurant,French Restaurant,Restaurant,Dim Sum Restaurant,Gym / Fitness Center,Vietnamese Restaurant,Men's Store,Jewelry Store,Ramen Restaurant
4,Financial District,Food Truck,American Restaurant,Café,Coffee Shop,Exhibit,Street Food Gathering,Vietnamese Restaurant,Shipping Store,Trail,Pier,Seafood Restaurant,Scenic Lookout,Burrito Place,Science Museum,Plaza,Mexican Restaurant,Clothing Store,Spanish Restaurant,Pub,Moving Target


In [35]:
neighborhoods_venues_sorted.shape

(20, 21)
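
Most neighborhoods have fewer than 20 distinct categories, so the tail of each top-20 list is filled with zero-frequency categories in no meaningful order. Bayshore's row in `SF_grouped`, for example, has only four nonzero categories. The sketch below shows one way to keep only categories that actually occur; `top_nonzero_categories` and the sample row are hypothetical and not part of the notebook.

```python
import pandas as pd

def top_nonzero_categories(row, num_top_venues):
    """Return up to num_top_venues category names with frequency > 0, highest first."""
    freqs = row.iloc[1:].astype(float)  # skip the 'Neighborhood' label
    freqs = freqs[freqs > 0].sort_values(ascending=False)
    return list(freqs.index[:num_top_venues])

# Made-up frequency row in the same layout as SF_grouped
row = pd.Series({"Neighborhood": "Example", "Bar": 0.0, "Garden": 0.2,
                 "Park": 0.4, "Trail": 0.2, "Zoo": 0.0})
top = top_nonzero_categories(row, 20)
print(top)
```

The result is shorter than 20 entries when a neighborhood has few categories, so it would need padding (for instance with `None`) before being written into a fixed-width table like `neighborhoods_venues_sorted`.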

### (4). Cluster Neighborhoods

In [36]:
# set number of clusters
kclusters = 3

SF_grouped_clustering = SF_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(SF_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 0, 2, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
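
K-means groups the rows of the frequency table by Euclidean distance between their category shares. A minimal sketch on a made-up four-row matrix follows. The numbers are illustrative, and `n_init=10` is set explicitly only so the behaviour does not depend on the scikit-learn version's default.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up category-share rows: the first two are dominated by columns 0-1,
# the last two by columns 2-3
freqs = np.array([
    [0.6, 0.4, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.1, 0.5, 0.4],
])

kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(freqs)
labels = kmeans.labels_
print(labels)
```

The label numbers themselves are arbitrary; only which rows share a label carries meaning. With only 20 neighborhoods and 208 sparse columns, a row built from a handful of venues can sit far from all others and end up in a cluster of its own, so the clusters are best read as a rough grouping.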

In [37]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

SF_merged = SF_data

# join the cluster labels and top venues onto SF_data, which holds latitude/longitude for each neighborhood
SF_merged = SF_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

SF_merged.head(3) # check the last columns!

Unnamed: 0,PostalCode,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
22,94134,Bayshore,37.718559,-122.412709,2,Park,Garden,Baseball Field,Trail,Fried Chicken Joint,French Restaurant,Fountain,Football Stadium,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Filipino Restaurant,Fast Food Restaurant,Yoga Studio,Furniture / Home Store,Eye Doctor,Exhibit,Event Space
7,94124,Bayview,37.726869,-122.375455,0,Public Art,Bakery,Motorcycle Shop,Spa,Donut Shop,Filipino Restaurant,French Restaurant,Fountain,Football Stadium,Dive Bar,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Fast Food Restaurant,Dry Cleaner,Farmers Market,Fried Chicken Joint,Exhibit
5,94114,Castro,37.757239,-122.443479,2,Park,Scenic Lookout,Trail,Yoga Studio,Grocery Store,Reservoir,Road,Café,Speakeasy,Garden,Playground,Hill,Food,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Eye Doctor,Exhibit,Historic Site,Event Space


In [38]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(SF_merged['Latitude'], SF_merged['Longitude'], SF_merged['Neighborhood'], SF_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### (5). Examine Clusters

In [39]:
SF_merged.loc[SF_merged['Cluster Labels'] == 0, SF_merged.columns[[1] + list(range(5, SF_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
7,Bayview,Public Art,Bakery,Motorcycle Shop,Spa,Donut Shop,Filipino Restaurant,French Restaurant,Fountain,Football Stadium,Dive Bar,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Fast Food Restaurant,Dry Cleaner,Farmers Market,Fried Chicken Joint,Exhibit


In [40]:
SF_merged.loc[SF_merged['Cluster Labels'] == 1, SF_merged.columns[[1] + list(range(5, SF_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
2,Chinatown,Coffee Shop,Hotel,Bakery,Cocktail Bar,Bubble Tea Shop,Gym,American Restaurant,Clothing Store,Café,Chinese Restaurant,Boutique,Sushi Restaurant,French Restaurant,Restaurant,Dim Sum Restaurant,Gym / Fitness Center,Vietnamese Restaurant,Men's Store,Jewelry Store,Ramen Restaurant
15,Financial District,Food Truck,American Restaurant,Café,Coffee Shop,Exhibit,Street Food Gathering,Vietnamese Restaurant,Shipping Store,Trail,Pier,Seafood Restaurant,Scenic Lookout,Burrito Place,Science Museum,Plaza,Mexican Restaurant,Clothing Store,Spanish Restaurant,Pub,Moving Target
4,Ingleside,Mexican Restaurant,Bakery,Latin American Restaurant,Vietnamese Restaurant,Filipino Restaurant,Grocery Store,Sandwich Place,Restaurant,Bank,Chinese Restaurant,Taco Place,Coffee Shop,Fast Food Restaurant,Salad Place,Mobile Phone Shop,Japanese Restaurant,Thai Restaurant,Bar,Thrift / Vintage Store,Deli / Bodega
19,Inner Richmond,Sushi Restaurant,Yoga Studio,Grocery Store,Dry Cleaner,Dance Studio,Park,New American Restaurant,Coffee Shop,Scenic Lookout,Chinese Restaurant,Fountain,Skating Rink,Burger Joint,Garden,Gift Shop,Southern / Soul Food Restaurant,Playground,Bakery,Korean Restaurant,Japanese Restaurant
16,Lower Pacific Heights,Chinese Restaurant,Yoga Studio,Café,Cosmetics Shop,Bakery,Sandwich Place,Furniture / Home Store,Spa,Gym / Fitness Center,Health & Beauty Service,Sushi Restaurant,Cafeteria,Pizza Place,Salon / Barbershop,Business Service,Bus Station,Church,Coffee Shop,Pet Store,Deli / Bodega
21,Marina,Art Gallery,Theater,Food Truck,Café,Arts & Crafts Store,Coffee Shop,Harbor / Marina,Gym / Fitness Center,Bookstore,Vegetarian / Vegan Restaurant,Cocktail Bar,Beer Garden,Monument / Landmark,German Restaurant,Lawyer,Museum,Farmers Market,Science Museum,Street Food Gathering,Grocery Store
14,Mission,Park,Deli / Bodega,New American Restaurant,Restaurant,Café,Food,Bus Station,Seafood Restaurant,Brewery,Sandwich Place,Sushi Restaurant,Event Space,Grocery Store,Dog Run,Mexican Restaurant,Playground,American Restaurant,Performing Arts Venue,Intersection,Pool
3,Nob Hill,Chocolate Shop,Wine Bar,Park,Pharmacy,Sushi Restaurant,Plaza,Clothing Store,Burger Joint,Cantonese Restaurant,Football Stadium,Scenic Lookout,Sandwich Place,Liquor Store,Lounge,Rock Club,Cooking School,Coffee Shop,Japanese Restaurant,Playground,Pizza Place
18,Panhandle,Coffee Shop,Boutique,Clothing Store,Breakfast Spot,Convenience Store,Bar,Gift Shop,Ice Cream Shop,Italian Restaurant,Dog Run,Park,Café,Bus Station,Gym,Thrift / Vintage Store,Thai Restaurant,Light Rail Station,Cheese Shop,Trail,Pizza Place
0,Rincon Hill,Coffee Shop,Gym,Scenic Lookout,Lounge,Spa,Italian Restaurant,Deli / Bodega,Gym / Fitness Center,Art Gallery,Café,American Restaurant,Market,Sandwich Place,Salon / Barbershop,Pet Store,Park,Outdoor Sculpture,New American Restaurant,Music Venue,Mediterranean Restaurant


In [41]:
SF_merged.loc[SF_merged['Cluster Labels'] == 2, SF_merged.columns[[1] + list(range(5, SF_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
22,Bayshore,Park,Garden,Baseball Field,Trail,Fried Chicken Joint,French Restaurant,Fountain,Football Stadium,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Filipino Restaurant,Fast Food Restaurant,Yoga Studio,Furniture / Home Store,Eye Doctor,Exhibit,Event Space
5,Castro,Park,Scenic Lookout,Trail,Yoga Studio,Grocery Store,Reservoir,Road,Café,Speakeasy,Garden,Playground,Hill,Food,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Eye Doctor,Exhibit,Historic Site,Event Space
8,Portola,Garden,Trail,Bus Line,Dive Bar,Dog Run,Filipino Restaurant,Fried Chicken Joint,French Restaurant,Fountain,Football Stadium,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Fast Food Restaurant,Donut Shop,Farmers Market,Furniture / Home Store,Exhibit
10,Twin Peaks,Scenic Lookout,Bus Station,Theater,Bus Line,Park,Hill,Trail,Food & Drink Shop,Food Court,Food Truck,Food,Football Stadium,Flower Shop,Fountain,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Yoga Studio,Farmers Market,Eye Doctor
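
The cluster tables rank categories by share, not by count, so a direct count of park-like venues per neighborhood may answer the stated goal more plainly. The sketch below assumes a frame shaped like `SF_venues`. The rows are made up, and treating only `Park` and `Garden` as green space is an assumption that could be widened (for example to `Dog Run` or `Trail`).

```python
import pandas as pd

# Made-up venues in the same layout as SF_venues (only the two columns used here)
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B", "C"],
    "Venue Category": ["Park", "Garden", "Café", "Park", "Bar", "Café"],
})

park_like = {"Park", "Garden"}  # assumption: which categories count as green space

park_counts = (
    venues[venues["Venue Category"].isin(park_like)]
    .groupby("Neighborhood")
    .size()
    # keep neighborhoods with no park-like venue, with a count of 0
    .reindex(venues["Neighborhood"].unique(), fill_value=0)
    .sort_values(ascending=False)
)
print(park_counts)
```

Applied to the real `SF_venues`, the counts would still be limited by the 500 m radius and the 100-venue cap per request, so they describe the area around each neighborhood's center point and not the whole neighborhood.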
