<h1 align=center><font size = 5>Where to Set-Up Shop in Los Angeles?</font></h1>

## Table of contents
* [Introduction](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction <a name="introduction"></a>

In this project we will try to find an optimal location for a restaurant and cafe. Specifically, this report will be targeted to stakeholders interested in opening an **Mexican restaurant** and a **Coffee Shop** in **Los Angeles, California**.

In this lab, we will convert addresses into their equivalent latitude and longitude values. Also, we will use the Foursquare API to explore neighborhoods in Los Angeles. We will use the explore function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters. We will use the *k*-means clustering algorithm to complete this task. We will use the Folium library to visualize the neighborhoods in Los Angeles and their emerging clusters. Finally, we will decide which neighborhood would be an ideal place to open up an Mexican Restaurant, and which neighborhood to open a Coffee Shop. 

Utilizing location and feature data to determine the ideal location for a new shop is beneficial for the shop owner to ensure they place their shop in the correct market. An ill-placed shop could mean inadequate costumer base and revenue steams, causing the shop to go out of business.

## Data  <a name="data"></a>

The data required for this project will come from two sources. The Neighborhoods in LA proper will come from zipdatamaps.com. Then using arcgis geocoder we will get the coordinates of the centers of each of these neighborhoods. This will label and provide the location of each neighborhood. The venue and shop data will come from foursquare. With the neighborhood location data we can find the top and most common venues in each neighborhood and consequently label each neighborhood accordingly. 

Based on definition of our problem, factors that will influence our decision are:
* number of existing restaurants in the neighborhood (any type of restaurant)
* number of and distance to Mexican restaurants or Coffee Shops in the neighborhood, if any
* distance of neighborhood from city center

Link for Neighborhood data:

https://www.zipdatamaps.com/list-of-zip-codes-in-california.php

### Neighborhood Locations

First, import all required libraries:

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

!pip install geocoder
import geocoder
import geopy

!conda install lxml --yes
import lxml

!pip install shapely
import shapely.geometry

!pip install pyproj
import pyproj

import math

print('Libraries imported.')

Solving environment: done


  current version: 4.5.11
  latest version: 4.8.0

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.11.28         |           py36_0         149 KB  conda-forge
    scikit-learn-0.20.1        |   py36h22eb022_0         5.7 MB
    liblapack-3.8.0            |      11_openblas          10 KB  conda-forge
    liblapacke-3.8.0           |      11_openblas          10 KB  conda-forge
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    libopenblas-0.3.6          |       h5a2b251_2         7.7 MB
    numpy-1.17.3               |   py36h95a1406_0         5.2 MB  conda-forge
    scipy-1.4.1                |   py36h921218d_0        

In [2]:
# get list of zip codes from website for california
df = pd.read_html('https://www.zipdatamaps.com/list-of-zip-codes-in-california.php')[0]

In [3]:
df.head()

Unnamed: 0,Zip Code,Zip Code Type,Zip Code Name,County
0,90001,Non-Unique,Los Angeles,Los Angeles
1,90002,Non-Unique,Los Angeles,Los Angeles
2,90003,Non-Unique,Los Angeles,Los Angeles
3,90004,Non-Unique,Los Angeles,Los Angeles
4,90005,Non-Unique,Los Angeles,Los Angeles


In [4]:
# Keep only zip codes in Los Angeles county
df = df[df['County']=='Los Angeles']

In [5]:
#Rename and format dataframe
df.drop('Zip Code Type',axis=1, inplace = True)
df.rename(columns={"Zip Code Name":"Neighborhood"}, inplace = True)

In [6]:
#Group the data by postcode and join the neighbourhoods in one row
df=df.groupby('Neighborhood',as_index=False).agg({'County' : 'first','Zip Code' : ', '.join })

In [7]:
df.drop('County',axis=1,inplace=True)

In [8]:
df.head()

Unnamed: 0,Neighborhood,Zip Code
0,Acton,93510
1,Agoura Hills,"91301, 91376"
2,Alhambra,"91801, 91802, 91803, 91804, 91896, 91899"
3,Altadena,"91001, 91003"
4,Arcadia,"91006, 91007, 91066, 91077"


Now that we have the list of all the Neighborhoods in LA County, let's get the coordinates for each neighborhood center:

In [9]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Los Angeles, California'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords

In [10]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(neighborhood) for neighborhood in df["Neighborhood"].tolist() ]
coords

[[34.46815000000004, -118.19512999999995],
 [34.14584000000008, -118.77757999999994],
 [34.09370000000007, -118.12726999999995],
 [34.18549000000007, -118.13151999999997],
 [34.13614000000007, -118.03886999999997],
 [33.86108000000007, -118.07967999999994],
 [33.34411000000006, -118.32138999999995],
 [34.13361000000003, -117.90588999999994],
 [34.08518000000004, -117.96028999999999],
 [33.97977000000003, -118.18883999999997],
 [33.969940000000065, -118.14903999999996],
 [33.88274000000007, -118.12228999999996],
 [34.07346000000007, -118.40031999999997],
 [34.18182000000007, -118.30775999999997],
 [34.15778000000006, -118.63841999999994],
 [34.202210000000036, -118.60155999999995],
 [34.419170000000065, -118.43907999999999],
 [33.83161000000007, -118.26208999999994],
 [34.49538000000007, -118.62483999999995],
 [33.86854000000005, -118.06369999999998],
 [34.257250000000056, -118.59093999999999],
 [34.023360000000025, -117.95660999999996],
 [34.096100000000035, -117.71639999999996],
 [33.

In [11]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

# merge the coordinates into the original dataframe
df['Latitude'] = df_coords['Latitude']
df['Longitude'] = df_coords['Longitude']

In [12]:
df.head()

Unnamed: 0,Neighborhood,Zip Code,Latitude,Longitude
0,Acton,93510,34.46815,-118.19513
1,Agoura Hills,"91301, 91376",34.14584,-118.77758
2,Alhambra,"91801, 91802, 91803, 91804, 91896, 91899",34.0937,-118.12727
3,Altadena,"91001, 91003",34.18549,-118.13152
4,Arcadia,"91006, 91007, 91066, 91077",34.13614,-118.03887


In [13]:
df.shape

(131, 4)

To accurately calculate distances we need to create our grid of locations in Cartesian 2D coordinate system which allows us to calculate distances in meters (not in latitude/longitude degrees). So let's create functions to convert between WGS84 spherical coordinate system (latitude/longitude degrees) and UTM Cartesian coordinate system (X/Y coordinates in meters).

In [89]:
def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

### Define Foursquare Credentials and Version

In [17]:
CLIENT_ID = 'LP4HBWZLSJDP2RAXRBYEXE5OSKKRZCMBRFFBRQ4YO4CW32PZ' # your Foursquare ID
CLIENT_SECRET = '42YOOHIZECOMYGNPM2TRM5MGBM54WWMCNQE4CMPZ5GTPANM0' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

LIMIT = 100

Set city center for folium map:

In [18]:
address = 'Los Angeles, California'

geolocator = Nominatim(user_agent="la_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Los Angeles are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Los Angeles are 34.0536909, -118.2427666.


Now lets calculate the distance of each neighborhood center to the city center. This is another variable that may help in determining where to set up shop.

In [238]:
i=0
count = df['Neighborhood'].nunique()
df['Distance from City Center (ft)']=np.nan
Citycenter_x, Citycenter_y = lonlat_to_xy(longitude,latitude) #LA city center

for i in range(0,count):    
    x, y = lonlat_to_xy(df['Longitude'][i], df['Latitude'][i])
    distance= calc_xy_distance(Citycenter_x,Citycenter_y,x,y)
    df['Distance from City Center (ft)'][i]=distance

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  if __name__ == '__main__':


In [241]:
df.head()

Unnamed: 0,Neighborhood,Zip Code,Latitude,Longitude,Distance from City Center (ft)
0,Acton,93510,34.46815,-118.19513,57866.43304
1,Agoura Hills,"91301, 91376",34.14584,-118.77758,63044.810537
2,Alhambra,"91801, 91802, 91803, 91804, 91896, 91899",34.0937,-118.12727,14492.071107
3,Altadena,"91001, 91003",34.18549,-118.13152,22409.359777
4,Arcadia,"91006, 91007, 91066, 91077",34.13614,-118.03887,26262.126713


### Venues

In [19]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [20]:
def getNearbyVenues(names, latitudes, longitudes, radius=2000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [21]:
LA_venues = getNearbyVenues(names=df['Neighborhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude'])

Acton
Agoura Hills
Alhambra
Altadena
Arcadia
Artesia
Avalon
Azusa
Baldwin Park
Bell
Bell Gardens
Bellflower
Beverly Hills
Burbank
Calabasas
Canoga Park
Canyon Country
Carson
Castaic
Cerritos
Chatsworth
City of Industry
Claremont
Compton
Covina
Culver City
Diamond Bar
Dodgertown
Downey
Duarte
El Monte
El Segundo
Encino
Gardena
Glendale
Glendora
Granada Hills
Hacienda Heights
Harbor City
Hawaiian Gardens
Hawthorne
Hermosa Beach
Huntington Park
Inglewood
La Canada Flintridge
La Crescenta
La Mirada
La Puente
La Verne
Lake Hughes
Lakewood
Lancaster
Lawndale
Littlerock
Llano
Lomita
Long Beach
Los Angeles
Lynwood
Malibu
Manhattan Beach
Marina Del Rey
Maywood
Mission Hills
Monrovia
Montebello
Monterey Park
Montrose
Mount Wilson
Newhall
North Hills
North Hollywood
Northridge
Norwalk
Pacific Palisades
Pacoima
Palmdale
Palos Verdes Estates
Panorama City
Paramount
Pasadena
Pearblossom
Pico Rivera
Playa Del Rey
Pomona
Porter Ranch
Rancho Palos Verdes
Redondo Beach
Reseda
Rosemead
Rowland Heights
Sa

In [22]:
# one hot encoding
LA_onehot = pd.get_dummies(LA_venues[['Venue Category']], prefix="", prefix_sep="")
LA_onehot.drop('Neighborhood',axis=1,inplace=True)

# add neighborhood column back to dataframe
LA_onehot['Neighborhood'] = LA_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [LA_onehot.columns[-1]] + list(LA_onehot.columns[:-1])
LA_onehot = LA_onehot[fixed_columns]

LA_onehot.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,African Restaurant,Airport,Airport Service,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Court,Basketball Stadium,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bike Trail,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Camera Store,Campground,Canal,Candy Store,Cantonese Restaurant,Car Wash,Caribbean Restaurant,Casino,Castle,Cave,Cheese Shop,Child Care Service,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,City Hall,Climbing Gym,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Cafeteria,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Coworking Space,Credit Union,Creperie,Cruise,Cuban Restaurant,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Dentist's Office,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Distillery,Dive Bar,Dive Shop,Dive Spot,Doctor's Office,Dog Run,Donburi Restaurant,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Film Studio,Financial or Legal Service,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,Gift Shop,Go Kart Track,Golf Course,Golf Driving Range,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gun Range,Gun Shop,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,High School,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hospital,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Laundry Service,Lawyer,Leather Goods Store,Library,Light Rail Station,Lighthouse,Lingerie Store,Liquor Store,Locksmith,Lounge,Mac & Cheese Joint,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Mongolian Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Night Market,Nightclub,Nightlife Spot,Noodle House,North Indian Restaurant,Notary,Observatory,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Supply Store,Outdoors & Recreation,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Piano Bar,Pie Shop,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Pool Hall,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Reservoir,Resort,Rest Area,Restaurant,River,Rock Club,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Sculpture Garden,Seafood Restaurant,Shaanxi Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,Stadium,State / Provincial Park,Stationery Store,Steakhouse,Storage Facility,Summer Camp,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tiki Bar,Tourist Information Center,Toy / Game Store,Track,Trade School,Trail,Train Station,Travel Lounge,Tunnel,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Volleyball Court,Warehouse,Warehouse Store,Watch Shop,Water Park,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Acton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Acton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Acton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Acton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Acton,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [23]:
LA_grouped = LA_onehot.groupby('Neighborhood').mean().reset_index()
LA_grouped.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,African Restaurant,Airport,Airport Service,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Court,Basketball Stadium,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bike Trail,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Cambodian Restaurant,Camera Store,Campground,Canal,Candy Store,Cantonese Restaurant,Car Wash,Caribbean Restaurant,Casino,Castle,Cave,Cheese Shop,Child Care Service,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,City Hall,Climbing Gym,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Cafeteria,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Coworking Space,Credit Union,Creperie,Cruise,Cuban Restaurant,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Dentist's Office,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Distillery,Dive Bar,Dive Shop,Dive Spot,Doctor's Office,Dog Run,Donburi Restaurant,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Film Studio,Financial or Legal Service,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,Gift Shop,Go Kart Track,Golf Course,Golf Driving Range,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gun Range,Gun Shop,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,High School,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hospital,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indonesian Restaurant,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Laundry Service,Lawyer,Leather Goods Store,Library,Light Rail Station,Lighthouse,Lingerie Store,Liquor Store,Locksmith,Lounge,Mac & Cheese Joint,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts Dojo,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Mongolian Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Night Market,Nightclub,Nightlife Spot,Noodle House,North Indian Restaurant,Notary,Observatory,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Supply Store,Outdoors & Recreation,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Piano Bar,Pie Shop,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Pool Hall,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Reservoir,Resort,Rest Area,Restaurant,River,Rock Club,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Sculpture Garden,Seafood Restaurant,Shaanxi Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,Stadium,State / Provincial Park,Stationery Store,Steakhouse,Storage Facility,Summer Camp,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Szechuan Restaurant,Taco Place,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tiki Bar,Tourist Information Center,Toy / Game Store,Track,Trade School,Trail,Train Station,Travel Lounge,Tunnel,Turkish Restaurant,Udon Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Volleyball Court,Warehouse,Warehouse Store,Watch Shop,Water Park,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Acton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Agoura Hills,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Alhambra,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0
3,Altadena,0.0,0.0,0.0,0.0,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.050847,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.016949,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.067797,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.033898,0.016949,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.067797,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Arcadia,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.05,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.05,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [129]:
#function to sort and return most common venues in neighborhood
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = LA_grouped['Neighborhood']

for ind in np.arange(LA_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(LA_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Acton,Grocery Store,Park,Athletics & Sports,Pet Store,Construction & Landscaping,Event Space,Nature Preserve,Factory,Dumpling Restaurant,Eastern European Restaurant
1,Agoura Hills,Fast Food Restaurant,Mexican Restaurant,Pizza Place,Chinese Restaurant,Italian Restaurant,Grocery Store,Indian Restaurant,Burger Joint,ATM,Home Service
2,Alhambra,Mexican Restaurant,Convenience Store,Burger Joint,Chinese Restaurant,Bakery,Café,Sandwich Place,Park,Grocery Store,Dessert Shop
3,Altadena,Grocery Store,Pizza Place,American Restaurant,Bakery,Mexican Restaurant,Video Store,Burger Joint,Diner,Dive Bar,Liquor Store
4,Arcadia,Bakery,Clothing Store,American Restaurant,Mexican Restaurant,Chinese Restaurant,Cosmetics Shop,Snack Place,Coffee Shop,Dessert Shop,Shopping Mall


### Creating Clusters

In [26]:
# set number of clusters
kclusters = 10

LA_grouped_clustering = LA_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(LA_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:200] 

array([7, 5, 5, 5, 0, 0, 0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 5, 5, 2, 0, 5, 2,
       0, 2, 2, 0, 5, 8, 0, 2, 2, 0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 0, 2, 2,
       0, 5, 2, 2, 2, 1, 5, 5, 5, 2, 9, 0, 0, 0, 2, 0, 0, 0, 2, 2, 2, 2,
       5, 0, 6, 2, 5, 0, 5, 5, 8, 2, 5, 8, 5, 2, 0, 3, 2, 0, 2, 5, 8, 0,
       5, 5, 5, 5, 2, 5, 0, 0, 0, 2, 0, 0, 0, 5, 2, 2, 0, 0, 0, 5, 5, 2,
       0, 5, 0, 0, 0, 2, 0, 0, 0, 4, 5, 0, 0, 5, 5, 5, 0, 0, 2, 5, 0],
      dtype=int32)

In [104]:
# add clustering labels
neighborhoods_venues_sorted.drop('Cluster Labels',axis=1,inplace=True)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

LA_merged = df

# merge LA_grouped with LA_data to add latitude/longitude and distances for each neighborhood
LA_merged = LA_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

LA_merged.head()

Unnamed: 0,Neighborhood,Zip Code,Latitude,Longitude,Distance from City Center (m),Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Acton,93510,34.46815,-118.19513,57866.43304,7,Grocery Store,Park,Athletics & Sports,Pet Store,Construction & Landscaping,Event Space,Nature Preserve,Factory,Dumpling Restaurant,Eastern European Restaurant
1,Agoura Hills,"91301, 91376",34.14584,-118.77758,63044.810537,5,Fast Food Restaurant,Mexican Restaurant,Pizza Place,Chinese Restaurant,Italian Restaurant,Grocery Store,Indian Restaurant,Burger Joint,ATM,Home Service
2,Alhambra,"91801, 91802, 91803, 91804, 91896, 91899",34.0937,-118.12727,14492.071107,5,Mexican Restaurant,Convenience Store,Burger Joint,Chinese Restaurant,Bakery,Café,Sandwich Place,Park,Grocery Store,Dessert Shop
3,Altadena,"91001, 91003",34.18549,-118.13152,22409.359777,5,Grocery Store,Pizza Place,American Restaurant,Bakery,Mexican Restaurant,Video Store,Burger Joint,Diner,Dive Bar,Liquor Store
4,Arcadia,"91006, 91007, 91066, 91077",34.13614,-118.03887,26262.126713,0,Bakery,Clothing Store,American Restaurant,Mexican Restaurant,Chinese Restaurant,Cosmetics Shop,Snack Place,Coffee Shop,Dessert Shop,Shopping Mall


In [105]:
LA_merged=LA_merged.dropna()

### Map Clusters

In [30]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(LA_merged['Latitude'], LA_merged['Longitude'], LA_merged['Neighborhood'], LA_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster-1)],
        fill=True,
        fill_color=rainbow[int(cluster-1)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [130]:
# save the map as HTML file
map_clusters.save('map_LA_clusters.html')

### Examine and Label Clusters

In [106]:
#Examine Clusters for distinguishing characteristics
LA_merged.loc[LA_merged['Cluster Labels'] == 0, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,"91006, 91007, 91066, 91077",0,Bakery,Clothing Store,American Restaurant,Mexican Restaurant,Chinese Restaurant,Cosmetics Shop,Snack Place,Coffee Shop,Dessert Shop,Shopping Mall
5,"90701, 90702",0,Indian Restaurant,Café,Grocery Store,Chinese Restaurant,Bakery,Ice Cream Shop,Korean Restaurant,Thai Restaurant,Coffee Shop,Asian Restaurant
6,90704,0,Hotel,Boat or Ferry,Harbor / Marina,Seafood Restaurant,Bar,Pizza Place,Ice Cream Shop,American Restaurant,Beach,Mexican Restaurant
12,"90209, 90210, 90211, 90212, 90213",0,Boutique,Italian Restaurant,American Restaurant,Hotel,Clothing Store,Coffee Shop,Park,Men's Store,Sushi Restaurant,Cosmetics Shop
13,"91501, 91502, 91503, 91504, 91505, 91506, 9150...",0,American Restaurant,Sandwich Place,Burger Joint,Mexican Restaurant,Bakery,Japanese Restaurant,Grocery Store,Pizza Place,Ice Cream Shop,Sushi Restaurant
14,"91302, 91372",0,Coffee Shop,Grocery Store,Pizza Place,Fast Food Restaurant,Seafood Restaurant,Juice Bar,Frozen Yogurt Shop,Sushi Restaurant,Cosmetics Shop,Shopping Mall
15,"91303, 91304, 91305, 91309",0,Mexican Restaurant,Burger Joint,Indian Restaurant,American Restaurant,Sandwich Place,Pet Store,Donut Shop,Clothing Store,Cosmetics Shop,Furniture / Home Store
19,90703,0,Indian Restaurant,Sandwich Place,Coffee Shop,Grocery Store,Filipino Restaurant,Chinese Restaurant,Bakery,Ice Cream Shop,Café,Bubble Tea Shop
22,91711,0,American Restaurant,Mexican Restaurant,Coffee Shop,Pizza Place,Bakery,Dessert Shop,Garden,Sandwich Place,Mediterranean Restaurant,Italian Restaurant
25,"90230, 90231, 90232, 90233",0,Pizza Place,Coffee Shop,American Restaurant,Grocery Store,Italian Restaurant,Gastropub,New American Restaurant,Taco Place,Ice Cream Shop,Indonesian Restaurant


In [107]:
#create dataframe for cluster 0 data and get just venues in a list
cluster0 = LA_merged[LA_merged['Cluster Labels'] ==0]

#calculate average distance of neighborhood to city center
meandistance0 = cluster0['Distance from City Center (m)'].mean()

cluster0.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster0.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster0['Venues'].append([cluster0['2nd Most Common Venue'],cluster0['3rd Most Common Venue'],cluster0['4th Most Common Venue'],cluster0['5th Most Common Venue'],cluster0['6th Most Common Venue'],cluster0['7th Most Common Venue'],cluster0['8th Most Common Venue'],cluster0['9th Most Common Venue'],cluster0['10th Most Common Venue']])
cluster0.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [108]:
#search for most common venues across the cluster
Cluster0Labels = cluster0['Venues'].value_counts()[:5].index.tolist()
Cluster0Labels, meandistance0


(['Coffee Shop',
  'American Restaurant',
  'Sushi Restaurant',
  'Mexican Restaurant',
  'Japanese Restaurant'],
 32268.67362495291)

In [34]:
LA_merged.loc[LA_merged['Cluster Labels'] == 1, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
49,93532,Pub,Trail,American Restaurant,Arcade,Farm,Falafel Restaurant,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant


In [109]:
#create dataframe for cluster 1 data and get just venues in a list
cluster1 = LA_merged[LA_merged['Cluster Labels'] ==1]

#calculate average distance of neighborhood to city center
meandistance1 = cluster1['Distance from City Center (m)'].mean()

cluster1.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster1.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster1['Venues'].append([cluster1['2nd Most Common Venue'],cluster1['3rd Most Common Venue'],cluster1['4th Most Common Venue'],cluster1['5th Most Common Venue'],cluster1['6th Most Common Venue'],cluster1['7th Most Common Venue'],cluster1['8th Most Common Venue'],cluster1['9th Most Common Venue'],cluster1['10th Most Common Venue']])
cluster1.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [110]:
#search for most common venues across the cluster
Cluster1Labels = cluster1['Venues'].value_counts()[:5].index.tolist()
Cluster1Labels, meandistance1

(['Pub'], 89987.09945661186)

In [37]:
LA_merged.loc[LA_merged['Cluster Labels'] == 2, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,91702,Mexican Restaurant,Fast Food Restaurant,Coffee Shop,Burger Joint,Pizza Place,Sandwich Place,Park,Convenience Store,Liquor Store,Theater
8,91706,Mexican Restaurant,Fast Food Restaurant,Pizza Place,Burger Joint,Grocery Store,Liquor Store,Discount Store,Bank,Convenience Store,Gas Station
9,90201,Mexican Restaurant,Pizza Place,Fast Food Restaurant,Burger Joint,Convenience Store,Park,Seafood Restaurant,Grocery Store,Diner,Coffee Shop
10,90202,Mexican Restaurant,Convenience Store,Fast Food Restaurant,Pizza Place,Seafood Restaurant,Coffee Shop,Sandwich Place,Burger Joint,Food,American Restaurant
11,"90706, 90707",Fast Food Restaurant,Mexican Restaurant,Convenience Store,Burger Joint,Sandwich Place,Pizza Place,BBQ Joint,Grocery Store,Bar,Sporting Goods Shop
18,"91310, 91384",Fast Food Restaurant,Gas Station,Pharmacy,Mexican Restaurant,Diner,Video Store,Campground,Thai Restaurant,Pizza Place,Burger Joint
21,"91714, 91715, 91716",Mexican Restaurant,Burger Joint,Pizza Place,Chinese Restaurant,Convenience Store,Sandwich Place,Sushi Restaurant,Asian Restaurant,Seafood Restaurant,Fast Food Restaurant
23,"90220, 90221, 90222, 90223, 90224",Fast Food Restaurant,Pizza Place,Fried Chicken Joint,Sandwich Place,Discount Store,Video Game Store,Burger Joint,Mexican Restaurant,Convenience Store,Pharmacy
24,"91722, 91723, 91724",Mexican Restaurant,Burger Joint,Coffee Shop,American Restaurant,Pizza Place,Grocery Store,Pharmacy,Ice Cream Shop,Spa,Fast Food Restaurant
29,"91008, 91009, 91010",Fast Food Restaurant,Mexican Restaurant,Coffee Shop,Pizza Place,Gym / Fitness Center,Convenience Store,Chinese Restaurant,Pharmacy,Donut Shop,Park


In [111]:
#create dataframe for cluster 2 data and get just venues in a list
cluster2 = LA_merged[LA_merged['Cluster Labels'] ==2]

#calculate average distance of neighborhood to city center
meandistance2 = cluster2['Distance from City Center (m)'].mean()

cluster2.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster2.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster2['Venues'].append([cluster2['2nd Most Common Venue'],cluster2['3rd Most Common Venue'],cluster2['4th Most Common Venue'],cluster2['5th Most Common Venue'],cluster2['6th Most Common Venue'],cluster2['7th Most Common Venue'],cluster2['8th Most Common Venue'],cluster2['9th Most Common Venue'],cluster2['10th Most Common Venue']])
cluster2.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [112]:
#search for most common venues across the cluster
Cluster2Labels = cluster2['Venues'].value_counts()[:5].index.tolist()
Cluster2Labels, meandistance2

(['Mexican Restaurant',
  'Fast Food Restaurant',
  'Sandwich Place',
  'Pizza Place',
  'Park'],
 32483.687826558)

In [209]:
LA_merged.loc[LA_merged['Cluster Labels'] == 3, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,90704,Boat or Ferry,Hotel,Harbor / Marina,Seafood Restaurant,Bar,Pizza Place,American Restaurant,Mexican Restaurant,Ice Cream Shop,Beach
12,"90209, 90210, 90211, 90212, 90213",Boutique,Italian Restaurant,Hotel,American Restaurant,Clothing Store,Coffee Shop,Sushi Restaurant,Bakery,Men's Store,Dessert Shop
25,"90230, 90231, 90232, 90233",Coffee Shop,Pizza Place,Italian Restaurant,Grocery Store,American Restaurant,Gastropub,New American Restaurant,Taco Place,Indonesian Restaurant,Burger Joint
31,90245,Sandwich Place,Mexican Restaurant,Diner,Coffee Shop,Burger Joint,Park,Pet Store,American Restaurant,Hotel,Sports Bar
41,90254,Mexican Restaurant,Beach,American Restaurant,Italian Restaurant,Seafood Restaurant,Hotel,Coffee Shop,Board Shop,Sushi Restaurant,Sandwich Place
56,"90801, 90802, 90803, 90804, 90805, 90806, 9080...",Hotel,American Restaurant,Bar,Coffee Shop,Mexican Restaurant,Park,Seafood Restaurant,Vegetarian / Vegan Restaurant,Pizza Place,Breakfast Spot
57,"90001, 90002, 90003, 90004, 90005, 90006, 9000...",Coffee Shop,Bar,Mexican Restaurant,Ice Cream Shop,Seafood Restaurant,Gastropub,Bookstore,Sushi Restaurant,Plaza,Japanese Restaurant
59,"90263, 90264, 90265",Coffee Shop,Beach,Surf Spot,Park,Pharmacy,Shopping Mall,American Restaurant,Ice Cream Shop,Juice Bar,Food
60,"90266, 90267",Mexican Restaurant,Seafood Restaurant,American Restaurant,Pizza Place,Restaurant,Ice Cream Shop,Sandwich Place,Gastropub,Bar,Italian Restaurant
61,"90292, 90295",American Restaurant,Hotel,Seafood Restaurant,Restaurant,New American Restaurant,Italian Restaurant,Café,Sushi Restaurant,Ice Cream Shop,Cosmetics Shop


In [113]:
#create dataframe for cluster 3 data and get just venues in a list
cluster3 = LA_merged[LA_merged['Cluster Labels'] ==3]

#calculate average distance of neighborhood to city center
meandistance3 = cluster3['Distance from City Center (m)'].mean()

cluster3.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster3.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster3['Venues'].append([cluster3['2nd Most Common Venue'],cluster3['3rd Most Common Venue'],cluster3['4th Most Common Venue'],cluster3['5th Most Common Venue'],cluster3['6th Most Common Venue'],cluster3['7th Most Common Venue'],cluster3['8th Most Common Venue'],cluster3['9th Most Common Venue'],cluster3['10th Most Common Venue']])
cluster3.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [114]:
#search for most common venues across the cluster
Cluster3Labels = cluster3['Venues'].value_counts()[:5].index.tolist()
Cluster3Labels,meandistance3

(['Steakhouse'], 73066.0669393608)

In [212]:
LA_merged.loc[LA_merged['Cluster Labels'] == 4, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
27,90090,Trail,Mountain,Park,Scenic Lookout,Café,BBQ Joint,Grocery Store,Sculpture Garden,Coffee Shop,Reservoir


In [115]:
#create dataframe for cluster 4 data and get just venues in a list
cluster4 = LA_merged[LA_merged['Cluster Labels'] ==4]

#calculate average distance of neighborhood to city center
meandistance4 = cluster4['Distance from City Center (m)'].mean()

cluster4.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster4.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster4['Venues'].append([cluster4['2nd Most Common Venue'],cluster4['3rd Most Common Venue'],cluster4['4th Most Common Venue'],cluster4['5th Most Common Venue'],cluster4['6th Most Common Venue'],cluster4['7th Most Common Venue'],cluster4['8th Most Common Venue'],cluster4['9th Most Common Venue'],cluster4['10th Most Common Venue']])
cluster4.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [116]:
#search for most common venues across the cluster
Cluster4Labels = cluster4['Venues'].value_counts()[:5].index.tolist()
Cluster4Labels,meandistance4

(['Campground'], 70818.6509797232)

In [215]:
LA_merged.loc[LA_merged['Cluster Labels'] == 5, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
49,93532,Farm,Pub,Summer Camp,Trail,American Restaurant,Campground,Arcade,Factory,Dry Cleaner,Dumpling Restaurant


In [118]:
#create dataframe for cluster 5 data and get just venues in a list
cluster5 = LA_merged[LA_merged['Cluster Labels'] ==5]

#calculate average distance of neighborhood to city center
meandistance5 = cluster5['Distance from City Center (m)'].mean()

cluster5.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster5.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster5['Venues'].append([cluster5['2nd Most Common Venue'],cluster5['3rd Most Common Venue'],cluster5['4th Most Common Venue'],cluster5['5th Most Common Venue'],cluster5['6th Most Common Venue'],cluster5['7th Most Common Venue'],cluster5['8th Most Common Venue'],cluster5['9th Most Common Venue'],cluster5['10th Most Common Venue']])
cluster5.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [119]:
#search for most common venues across the cluster
Cluster5Labels = cluster5['Venues'].value_counts()[:5].index.tolist()
Cluster5Labels,meandistance5

(['Fast Food Restaurant',
  'Chinese Restaurant',
  'Pizza Place',
  'Mexican Restaurant',
  'Convenience Store'],
 37582.08283112825)

In [218]:
LA_merged.loc[LA_merged['Cluster Labels'] == 6, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
68,91023,Mountain,Forest,Observatory,Snack Place,Zoo,Fabric Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store


In [121]:
#create dataframe for cluster 6 data and get just venues in a list
cluster6 = LA_merged[LA_merged['Cluster Labels'] ==6]

#calculate average distance of neighborhood to city center
meandistance6 = cluster6['Distance from City Center (m)'].mean()

cluster6.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster6.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster6['Venues'].append([cluster6['2nd Most Common Venue'],cluster6['3rd Most Common Venue'],cluster6['4th Most Common Venue'],cluster6['5th Most Common Venue'],cluster6['6th Most Common Venue'],cluster6['7th Most Common Venue'],cluster6['8th Most Common Venue'],cluster6['9th Most Common Venue'],cluster6['10th Most Common Venue']])
cluster6.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [122]:
#search for most common venues across the cluster
Cluster6Labels = cluster6['Venues'].value_counts()[:5].index.tolist()
Cluster6Labels,meandistance6

(['Mountain'], 31550.253946703695)

In [221]:
LA_merged.loc[LA_merged['Cluster Labels'] == 7, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
81,93553,Steakhouse,Gift Shop,American Restaurant,Sandwich Place,Deli / Bodega,Factory,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store


In [123]:
#create dataframe for cluster 7 data and get just venues in a list
cluster7 = LA_merged[LA_merged['Cluster Labels'] ==7]

#calculate average distance of neighborhood to city center
meandistance7 = cluster7['Distance from City Center (m)'].mean()

cluster7.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster7.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster7['Venues'].append([cluster7['2nd Most Common Venue'],cluster7['3rd Most Common Venue'],cluster7['4th Most Common Venue'],cluster7['5th Most Common Venue'],cluster7['6th Most Common Venue'],cluster7['7th Most Common Venue'],cluster7['8th Most Common Venue'],cluster7['9th Most Common Venue'],cluster7['10th Most Common Venue']])
cluster7.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [124]:
#search for most common venues across the cluster
Cluster7Labels = cluster7['Venues'].value_counts()[:5].index.tolist()
Cluster7Labels,meandistance7

(['Grocery Store'], 57866.4330397741)

In [224]:
LA_merged.loc[LA_merged['Cluster Labels'] == 8, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
81,93553,Steakhouse,Gift Shop,American Restaurant,Sandwich Place,Deli / Bodega,Factory,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store


In [125]:
#create dataframe for cluster 7 data and get just venues in a list
cluster8 = LA_merged[LA_merged['Cluster Labels'] ==8]

#calculate average distance of neighborhood to city center
meandistance8 = cluster8['Distance from City Center (m)'].mean()

cluster8.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster8.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster8['Venues'].append([cluster8['2nd Most Common Venue'],cluster8['3rd Most Common Venue'],cluster8['4th Most Common Venue'],cluster8['5th Most Common Venue'],cluster8['6th Most Common Venue'],cluster8['7th Most Common Venue'],cluster8['8th Most Common Venue'],cluster8['9th Most Common Venue'],cluster8['10th Most Common Venue']])
cluster8.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [126]:
#search for most common venues across the cluster
Cluster8Labels = cluster8['Venues'].value_counts()[:5].index.tolist()
Cluster8Labels,meandistance8

(['Scenic Lookout', 'Trail', 'Beach'], 33136.337428643)

In [232]:
LA_merged.loc[LA_merged['Cluster Labels'] == 9, LA_merged.columns[[1] + list(range(5, LA_merged.shape[1]))]]

Unnamed: 0,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,91706,Mexican Restaurant,Fast Food Restaurant,Pizza Place,Grocery Store,Burger Joint,Liquor Store,Bank,Discount Store,Convenience Store,Gas Station
9,90201,Mexican Restaurant,Fast Food Restaurant,Pizza Place,Convenience Store,Burger Joint,Park,Grocery Store,Seafood Restaurant,ATM,Bakery
10,90202,Mexican Restaurant,Convenience Store,Fast Food Restaurant,Pizza Place,Seafood Restaurant,Burger Joint,Sandwich Place,Coffee Shop,American Restaurant,Food
18,"91310, 91384",Fast Food Restaurant,Mexican Restaurant,Diner,Pharmacy,Video Store,Gas Station,Gastropub,Discount Store,Sandwich Place,Chinese Restaurant
21,"91714, 91715, 91716",Mexican Restaurant,Burger Joint,Sandwich Place,Sushi Restaurant,Chinese Restaurant,Convenience Store,Pizza Place,Seafood Restaurant,Asian Restaurant,Supermarket
24,"91722, 91723, 91724",Mexican Restaurant,Burger Joint,Pizza Place,Coffee Shop,American Restaurant,Grocery Store,Pharmacy,Ice Cream Shop,Fast Food Restaurant,Spa
42,90255,Mexican Restaurant,Burger Joint,Convenience Store,Pizza Place,Sandwich Place,Fast Food Restaurant,Grocery Store,Pharmacy,Discount Store,Mobile Phone Shop
47,"91744, 91746, 91747, 91749",Mexican Restaurant,Burger Joint,Sandwich Place,Pharmacy,Fast Food Restaurant,Fried Chicken Joint,Convenience Store,Supermarket,Coffee Shop,Chinese Restaurant
53,93543,Mexican Restaurant,Fast Food Restaurant,Discount Store,Diner,Food,Liquor Store,Pizza Place,Restaurant,Farmers Market,Zoo
62,90270,Mexican Restaurant,Pizza Place,Convenience Store,Park,Seafood Restaurant,Burger Joint,Fast Food Restaurant,Grocery Store,Intersection,Diner


In [127]:
#create dataframe for cluster 7 data and get just venues in a list
cluster9 = LA_merged[LA_merged['Cluster Labels'] ==9]

#calculate average distance of neighborhood to city center
meandistance9 = cluster9['Distance from City Center (m)'].mean()

cluster9.drop(['Zip Code','Latitude','Longitude','Neighborhood','Cluster Labels'],axis=1,inplace=True)
cluster9.rename(columns={"1st Most Common Venue":"Venues"}, inplace = True)
cluster9['Venues'].append([cluster9['2nd Most Common Venue'],cluster9['3rd Most Common Venue'],cluster9['4th Most Common Venue'],cluster9['5th Most Common Venue'],cluster9['6th Most Common Venue'],cluster9['7th Most Common Venue'],cluster9['8th Most Common Venue'],cluster9['9th Most Common Venue'],cluster9['10th Most Common Venue']])
cluster9.drop(['2nd Most Common Venue','3rd Most Common Venue','4th Most Common Venue','5th Most Common Venue','6th Most Common Venue','7th Most Common Venue','8th Most Common Venue','9th Most Common Venue','10th Most Common Venue'],axis=1,inplace=True)

In [128]:
#search for most common venues across the cluster
Cluster9Labels = cluster9['Venues'].value_counts()[:5].index.tolist()
Cluster9Labels,meandistance9

(['Motel'], 76127.62516961987)

## Methodology <a name="methodology"></a>

In this project we will direct our efforts on detecting areas of Los Angeles that have high restaurant density, particularly those with high numbers of Mexican Restaurants and Coffee shops. In addition we will favor clusters with locations closer to the City Center (higher density areas). These clusters and neighborhoods will indicate areas of success with high traffic for Mexican restaurants and Coffee shops, and thus is the location we will decide to set up shop.

In step one, we have collected the required data: location of every neighborhood and it's distance from the city center in LA. The second step was to identify the top venues in each of these neighborhoods within a 2000m radius.

The next step was to group each neighborhood into like clusters based on their top 10 venues via k-means clustering. We created 10 different clusters across LA. Next, we analyzed each cluster individually in order to determine and label the type of area each cluster was and how far on average each cluster is from the city center. 

With this analysis we are able to make an informed decision on where to set up shop in Los Angeles.

## Analysis <a name="analysis"></a>

Let's get all the pertinent information in one place to make our decision:

In [162]:
results = pd.DataFrame({'Cluster Label':[0,1,2,3,4,5,6,7,8,9],'Distance':[meandistance0,meandistance1,meandistance2,meandistance3,meandistance4,meandistance5,meandistance6,meandistance7,meandistance8,meandistance9]})
results['Top Venues'] = [Cluster0Labels,Cluster1Labels,Cluster2Labels,Cluster3Labels,Cluster4Labels,Cluster5Labels,Cluster6Labels,Cluster7Labels,Cluster8Labels,Cluster9Labels]

In [164]:
results.head(10)

Unnamed: 0,Cluster Label,Distance,Top Venues
0,0,32268.673625,"[Coffee Shop, American Restaurant, Sushi Resta..."
1,1,89987.099457,[Pub]
2,2,32483.687827,"[Mexican Restaurant, Fast Food Restaurant, San..."
3,3,73066.066939,[Steakhouse]
4,4,70818.65098,[Campground]
5,5,37582.082831,"[Fast Food Restaurant, Chinese Restaurant, Piz..."
6,6,31550.253947,[Mountain]
7,7,57866.43304,[Grocery Store]
8,8,33136.337429,"[Scenic Lookout, Trail, Beach]"
9,9,76127.62517,[Motel]


Fortunately, the choice is relatively easy for the which cluster we want to establish a Mexican Restaurant and which cluster to establish a Coffee Shop: Cluster 0 for the Coffee shop, and Cluster 2 for the Mexican Restaurant. 

Now the question is, which neighborhoods specifically?

Let's look at the Coffee Shop cluster first:

In [166]:
#create dataframe for cluster 0 data and get just venues in a list
cluster0 = LA_merged[LA_merged['Cluster Labels'] ==0]
cluster0.head()

Unnamed: 0,Neighborhood,Zip Code,Latitude,Longitude,Distance from City Center (m),Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Arcadia,"91006, 91007, 91066, 91077",34.13614,-118.03887,26262.126713,0,Bakery,Clothing Store,American Restaurant,Mexican Restaurant,Chinese Restaurant,Cosmetics Shop,Snack Place,Coffee Shop,Dessert Shop,Shopping Mall
5,Artesia,"90701, 90702",33.86108,-118.07968,32847.701004,0,Indian Restaurant,Café,Grocery Store,Chinese Restaurant,Bakery,Ice Cream Shop,Korean Restaurant,Thai Restaurant,Coffee Shop,Asian Restaurant
6,Avalon,90704,33.34411,-118.32139,99356.901007,0,Hotel,Boat or Ferry,Harbor / Marina,Seafood Restaurant,Bar,Pizza Place,Ice Cream Shop,American Restaurant,Beach,Mexican Restaurant
12,Beverly Hills,"90209, 90210, 90211, 90212, 90213",34.07346,-118.40032,18437.051666,0,Boutique,Italian Restaurant,American Restaurant,Hotel,Clothing Store,Coffee Shop,Park,Men's Store,Sushi Restaurant,Cosmetics Shop
13,Burbank,"91501, 91502, 91503, 91504, 91505, 91506, 9150...",34.18182,-118.30776,19336.619289,0,American Restaurant,Sandwich Place,Burger Joint,Mexican Restaurant,Bakery,Japanese Restaurant,Grocery Store,Pizza Place,Ice Cream Shop,Sushi Restaurant


In [171]:
#filter to only neighborhoods with Coffee shop as the most common venue
cluster0 = cluster0[cluster0['1st Most Common Venue']=="Coffee Shop"]

#sort neighborhoods by those closest to the city center
cluster0 = cluster0.sort_values(by=['Distance from City Center (m)'])

cluster0.head()

Unnamed: 0,Neighborhood,Zip Code,Latitude,Longitude,Distance from City Center (m),Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
34,Glendale,"91201, 91202, 91203, 91204, 91205, 91206, 9120...",34.14633,-118.24864,12904.216308,0,Coffee Shop,Burger Joint,Bakery,Department Store,Toy / Game Store,Gym / Fitness Center,Middle Eastern Restaurant,Café,Mediterranean Restaurant,Japanese Restaurant
104,South Pasadena,"91030, 91031",34.1158,-118.15213,13597.390623,0,Coffee Shop,Pizza Place,American Restaurant,Garden,Park,Hotel,Restaurant,Ice Cream Shop,Pet Store,Tea Room
28,Downey,"90239, 90240, 90241, 90242",33.94041,-118.12794,20648.500198,0,Coffee Shop,Mexican Restaurant,Burger Joint,Pizza Place,Sushi Restaurant,American Restaurant,Department Store,Grocery Store,Breakfast Spot,Lingerie Store
67,Montrose,"91020, 91021",34.20639,-118.22424,21346.79165,0,Coffee Shop,Sushi Restaurant,Italian Restaurant,Bakery,Garden,Asian Restaurant,Playground,Park,Pet Store,New American Restaurant
44,La Canada Flintridge,"91011, 91012",34.20766,-118.20725,21807.697769,0,Coffee Shop,Italian Restaurant,Pizza Place,Sandwich Place,Mexican Restaurant,Bakery,Grocery Store,Garden,Sushi Restaurant,Spa


In [177]:
cluster0.shape

(13, 16)

Based on our methodology and analysis it appears like **Glendale** will be the best location to set up a Coffee shop!

Now let's  take a look at the Mexican Restaurant cluster:

In [174]:
#create dataframe for cluster 2 data and get just venues in a list
cluster2 = LA_merged[LA_merged['Cluster Labels'] ==2]

#filter to only neighborhoods with Mexican Restaurant as the most common venue
cluster2 = cluster2[cluster2['1st Most Common Venue']=="Mexican Restaurant"]

#sort neighborhoods by those closest to the city center
cluster2 = cluster2.sort_values(by=['Distance from City Center (ft)'])

cluster2.head()

Unnamed: 0,Neighborhood,Zip Code,Latitude,Longitude,Distance from City Center (m),Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
42,Huntington Park,90255,33.9814,-118.21914,10428.742908,2,Mexican Restaurant,Burger Joint,Sandwich Place,Convenience Store,Pizza Place,Grocery Store,Pharmacy,Fast Food Restaurant,Mobile Phone Shop,Discount Store
62,Maywood,90270,33.98757,-118.18981,11064.162066,2,Mexican Restaurant,Pizza Place,Convenience Store,Seafood Restaurant,Burger Joint,Fast Food Restaurant,Park,Grocery Store,Intersection,Discount Store
9,Bell,90201,33.97977,-118.18884,12041.23351,2,Mexican Restaurant,Pizza Place,Fast Food Restaurant,Burger Joint,Convenience Store,Park,Seafood Restaurant,Grocery Store,Diner,Coffee Shop
103,South Gate,90280,33.9563,-118.20577,14220.977445,2,Mexican Restaurant,Fast Food Restaurant,Pharmacy,Burger Joint,Convenience Store,Pizza Place,Seafood Restaurant,Sandwich Place,Supplement Shop,Bank
65,Montebello,90640,34.01959,-118.11643,15396.209314,2,Mexican Restaurant,Grocery Store,Pizza Place,Burger Joint,Chinese Restaurant,Pharmacy,Convenience Store,Coffee Shop,Fast Food Restaurant,Bank


In [176]:
cluster2.shape

(20, 16)

Based on our methodology and analysis it appears like **Huntington Park** will be the best location to set up a Mexican Restaurant!


## Results and Discussion <a name="results"></a>

Our analysis of k-means clustering clearly showed which clusters were favorable to set up shop for a Mexican Restaurant and which to set up a Coffee shop. Within the Coffee shop cluster, there were 13 neighborhoods where a Coffee shop was the most common venue, and within the Mexican Restaurant cluster, there were 20 neighborhoods where a Mexican Restaurant was the most common venue.

There are neighborhoods with high concentrations of Mexican Restaurants and Coffee Shops all over Los Angeles County, but by considering distance to the Los Angeles City center, and thus the high tourist areas with ease of public transportation, we were able to find the best neighborhoods for each shop. Glendale for the Coffee Shop, and Huntington Park for the Mexican Restaurant.

Let's plot the 2 neighborhoods on a map:

In [228]:
locations = df[['Neighborhood','Latitude','Longitude']]
neighborhoods = ['Glendale','Huntington Park']
venues = ['Coffee Shop','Mexican Restaurant']
locations = locations[locations['Neighborhood'].isin(neighborhoods)]
location1 = locations[['Latitude','Longitude']]
locationlist = location1.values.tolist()

In [237]:
# create map
map_results = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to the map
for point in range(0, len(locationlist)):
    folium.CircleMarker(locationlist[point], radius = 25, popup=neighborhoods[point] + " - " + venues[point]).add_to(map_results)

folium.Marker([latitude,longitude], popup="City Center").add_to(map_results)
       
map_results

## Conclusion <a name="conclusion"></a>

Purpose of this project was to identify Los Angeles areas close to center with high number of venues (particularly Mexican restaurants and Coffee shops) in order to aid stakeholders in narrowing down the search for optimal location for a new restaurant/shop. Clustering of those locations was then performed in order to create major zones of interest (containing greatest number of potential locations) and then to be used as starting points (Cluster 0 and Cluster 2) for exploration by stakeholders. By clustering neighborhoods by restaurant distribution from Foursquare data we have first identified general boroughs that justify further analysis.

Final decision on optimal restaurant location was based on location to city center to indicate neighborhoods on convenient access and higher traffic.