### Applied Data Science Capstone Project

## The Battle of Neighborhoods (Week2)

# Topic: Finding an Optimal Location for a New Sushi Restaurant in San Diego, CA

## Table of contents
* [Introduction:](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Results](#results)
* [Discussion](#discussion)
* [Conclusion](#conclusion)

## 1. Introduction<a name="introduction"></a>

In this project, I will try to help my friend who is looking for an optimal location for his sushi restaurant in beautiful San Diego, CA.

Since there are already many sushi restaurants in San Diego, he is looking for a location where there are various kinds of business stores but few sushi or Japanese restaurants in vicinity. If other conditions are the same, he prefers a location which is near the ocean.

I will use my data science skills to detect the optimal location for his new sushi restaurant.

Before getting the data and starting exploring it, all the necessary dependencies should be imported.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


## 2. Data <a name="data"></a>

I will download "<b>community_bounds.csv</b>" from https://data.world/san-diego/community-bounds, and use it as an input data.

The csv file contains the information of latitude & longitude coordinates of 63 neiborhoods in San Diego.

I will use **Foursquare API** to extract number of restaurants and their type and location in every neighborhood in San Diego.

In [2]:
#!wget -q -O 'community_bounds.csv' https://data.world/san-diego/community-bounds/file/community_bounds.csv
# Actually, I will use 'community_bounds.csv' after I download it from the given URL to my desktop.

data = pd.read_csv('community_bounds.csv')
df = pd.DataFrame(data)
df.head()

Unnamed: 0,CPCODE,CPNAME,LOWER,UPPER,LEFT,RIGHT
0,1,Balboa Park,32.719314,32.741089,-117.159398,-117.133637
1,2,Barrio Logan,32.682282,32.705214,-117.151681,-117.111361
2,13,Black Mountain Ranch,32.977449,33.021842,-117.169853,-117.096847
3,3,Carmel Mountain Ranch,32.964175,32.993332,-117.095309,-117.066342
4,21,Carmel Valley,32.91968,32.967813,-117.245205,-117.189487


Using the dataframe, create another dataframe whose columns are 'Neighborhood', 'Latitude' and 'Longitude'

In [3]:
df['Latitude']  = (df['LOWER']+df['UPPER']) / 2
df['Longitude'] = (df['LEFT']+df['RIGHT']) / 2
df.rename(columns={'CPNAME': 'Neighborhood'}, inplace=True)

df.head()

Unnamed: 0,CPCODE,Neighborhood,LOWER,UPPER,LEFT,RIGHT,Latitude,Longitude
0,1,Balboa Park,32.719314,32.741089,-117.159398,-117.133637,32.730202,-117.146518
1,2,Barrio Logan,32.682282,32.705214,-117.151681,-117.111361,32.693748,-117.131521
2,13,Black Mountain Ranch,32.977449,33.021842,-117.169853,-117.096847,32.999645,-117.13335
3,3,Carmel Mountain Ranch,32.964175,32.993332,-117.095309,-117.066342,32.978753,-117.080826
4,21,Carmel Valley,32.91968,32.967813,-117.245205,-117.189487,32.943747,-117.217346


In [4]:
sd_data = df[['Neighborhood', 'Latitude', 'Longitude']]
print(sd_data.shape)
sd_data.head()

(63, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Balboa Park,32.730202,-117.146518
1,Barrio Logan,32.693748,-117.131521
2,Black Mountain Ranch,32.999645,-117.13335
3,Carmel Mountain Ranch,32.978753,-117.080826
4,Carmel Valley,32.943747,-117.217346


<b><font size=+1>Clean up the data:</font></b>

In the data, 63 neighborhoods are displayed. However, 'Military Facilities' and 'Reserve' occur multiple times, and they should be removed since no restaurants can be run in such neighborhoods.

In [5]:
print('The dataframe has {} unique neighborhoods.'.format(
        len(sd_data['Neighborhood'].unique()),
    )
)
sd_data = sd_data[sd_data.Neighborhood != 'Military Facilities']
sd_data = sd_data[sd_data.Neighborhood != 'Reserve']
sd_data.shape

The dataframe has 57 unique neighborhoods.


(55, 3)

## 3. Methodology <a name="methodology"></a>

I will use <b>geopy</b> library to get the latitude and longitude values of San Diego.

In order to define an instance of the <b>geocoder</b>, I will define a user_agent, which will be named 'SD_explorer', as shown below.

In [6]:
address = 'San Diego, CA'

geolocator = Nominatim(user_agent="SD_explorer")
location   = geolocator.geocode(address)
latitude   = location.latitude
longitude  = location.longitude
print('The geograpical coordinates of San Diego are {}, {}.'.format(latitude, longitude))

The geograpical coordinates of San Diego are 32.7174209, -117.1627714.


The following will show a map of San Diego with neighborhoods superimposed on top. **Folium** is a great visualization library. Feel free to zoom into the above map and click on each circle mark to reveal the name of the neighborhood.

In [7]:
# create map of San Diego using latitude and longitude values from 'sd_data' dataframe
map_sandiego = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(sd_data['Latitude'], 
                                  sd_data['Longitude'], 
                                  sd_data['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_sandiego)  
    
map_sandiego

Next, I will start utilizing the <b>Foursquare API</b> to explore the neighborhoods and segment them.

#### Definition of Foursquare Credentials and Version

In [8]:
CLIENT_ID = '5SLJIJWQVYFKWPM00E0CRGNTTJO2IFR11FXDQGO5U3OOSQ5O' # your Foursquare ID
CLIENT_SECRET = '443OSXEXCZOHN1U5SY1ZGPTE15NKXGBEX4MDKCBCR03TBJYV' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: 5SLJIJWQVYFKWPM00E0CRGNTTJO2IFR11FXDQGO5U3OOSQ5O
CLIENT_SECRET:443OSXEXCZOHN1U5SY1ZGPTE15NKXGBEX4MDKCBCR03TBJYV


<b>Exploration of neighborhoods in San Diego</b>

The following are two functions: 
* *get_category_type* function which is borrowed from the Foursquare lab. 
* *getNearbyVenues* function which uses items key, which contains all the necessary venue information of a relevant neighborhood within a radius of 500 meters.

In [9]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [10]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

#### Now I will write the code to run the above functions on each neighborhood and create a new dataframe called *sd_venues*.

In [11]:
LIMIT = 1000 # limit of number of venues returned by Foursquare API

# type your answer here
sd_venues = getNearbyVenues(names=sd_data['Neighborhood'],
                            latitudes=sd_data['Latitude'],
                            longitudes=sd_data['Longitude']
                           )

print(sd_venues.shape)
sd_venues.head()

(767, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Balboa Park,32.730202,-117.146518,Balboa Park Fountain,32.731453,-117.146809,Fountain
1,Balboa Park,32.730202,-117.146518,San Diego Model Railroad Museum,32.731132,-117.148365,Museum
2,Balboa Park,32.730202,-117.146518,San Diego Natural History Museum,32.732239,-117.147395,History Museum
3,Balboa Park,32.730202,-117.146518,San Diego History Center,32.731205,-117.148279,History Museum
4,Balboa Park,32.730202,-117.146518,Balboa Park Visitor's Center,32.731143,-117.149919,Gift Shop


<b> Analysis of Each Neighborhood

First, I will check which kind of venue categories are detected in each neighborhood.

In [12]:
# one hot encoding
sd_onehot = pd.get_dummies(sd_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
sd_onehot['Neighborhood'] = sd_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [sd_onehot.columns[-1]] + list(sd_onehot.columns[:-1])
sd_onehot = sd_onehot[fixed_columns]

print(sd_onehot.shape)
sd_onehot.head()

(767, 211)


Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Terminal,American Restaurant,Amphitheater,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Stadium,Beach,Beach Bar,Beer Bar,Beer Store,Big Box Store,Bike Shop,Bike Trail,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Buffet,Building,Burger Joint,Bus Stop,Business Service,Café,Cajun / Creole Restaurant,Camera Store,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Community College,Construction & Landscaping,Convenience Store,Cosmetics Shop,Credit Union,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distillery,Distribution Center,Dive Bar,Dog Run,Donut Shop,Electronics Store,Event Service,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fondue Restaurant,Food & Drink Shop,Food Truck,Football Stadium,Fountain,French Restaurant,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,General Entertainment,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hotel,Hotel Pool,Ice Cream Shop,Indie Theater,Indoor Play Area,Irish Pub,Island,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kids Store,Laundromat,Leather Goods Store,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Massage Studio,Mattress Store,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Mongolian Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nail Salon,National Park,New American Restaurant,Noodle House,Optical Shop,Outdoors & Recreation,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Piano Bar,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Post Office,Pub,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shawarma Place,Shipping Store,Shoe Store,Shopping Mall,Smoke Shop,Snack Place,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Stables,State / Provincial Park,Steakhouse,Street Food Gathering,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tanning Salon,Tattoo Parlor,Tea Room,Thai Restaurant,Theater,Theme Park Ride / Attraction,Theme Restaurant,Tour Provider,Toy / Game Store,Trail,Train Station,Tram Station,Vape Store,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo Exhibit
0,Balboa Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Balboa Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Balboa Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Balboa Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Balboa Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


211 kinds of venue categories were detected.

Second, I will check how many venues are detected in each neighborhood.

In [13]:
sd_grouped = sd_onehot.groupby('Neighborhood').sum().reset_index()
sd_grouped['Venues'] = sd_grouped.sum(axis=1)

# move neighborhood column to the first column
fixed_columns = [sd_grouped.columns[-1]] + list(sd_grouped.columns[:-1])
sd_grouped = sd_grouped[fixed_columns]

sd_grouped.head()

Unnamed: 0,Venues,Neighborhood,Accessories Store,Airport,Airport Terminal,American Restaurant,Amphitheater,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Stadium,Beach,Beach Bar,Beer Bar,Beer Store,Big Box Store,Bike Shop,Bike Trail,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Buffet,Building,Burger Joint,Bus Stop,Business Service,Café,Cajun / Creole Restaurant,Camera Store,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Community College,Construction & Landscaping,Convenience Store,Cosmetics Shop,Credit Union,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distillery,Distribution Center,Dive Bar,Dog Run,Donut Shop,Electronics Store,Event Service,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fondue Restaurant,Food & Drink Shop,Food Truck,Football Stadium,Fountain,French Restaurant,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,General Entertainment,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hostel,Hotel,Hotel Pool,Ice Cream Shop,Indie Theater,Indoor Play Area,Irish Pub,Island,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kids Store,Laundromat,Leather Goods Store,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Massage Studio,Mattress Store,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Mongolian Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nail Salon,National Park,New American Restaurant,Noodle House,Optical Shop,Outdoors & Recreation,Paper / Office Supplies Store,Park,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Piano Bar,Pier,Pilates Studio,Pizza Place,Playground,Plaza,Poke Place,Pool,Post Office,Pub,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shawarma Place,Shipping Store,Shoe Store,Shopping Mall,Smoke Shop,Snack Place,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Stables,State / Provincial Park,Steakhouse,Street Food Gathering,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tanning Salon,Tattoo Parlor,Tea Room,Thai Restaurant,Theater,Theme Park Ride / Attraction,Theme Restaurant,Tour Provider,Toy / Game Store,Trail,Train Station,Tram Station,Vape Store,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo Exhibit
0,57,Balboa Park,0,0,0,1,1,0,4,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,3,0,1,0,0,0,0,0,5,0,0,1,0,2,0,0,0,0,0,1,0,0,0,0,3,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,1,0,0,0,0,0,1,0,2,0,2,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,2,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,4
1,24,Barrio Logan,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
2,51,Carmel Mountain Ranch,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,0,1,0,0,0,2,0,0,1,0,0,0,0,2,1,0,0,0,0,0,4,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,3,0,0,1,0,2,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,3,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,3,0,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0
3,7,Carmel Valley,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,1,Clairemont Mesa,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


## 4. Results <a name="results"></a>

Now, I will sort out the neighborhoods on the basis of *Sum*, and check how many Sushi or Japanese restaurants or even Seafood restaurants in each neighborhood. Index will be reset as well.

In [14]:
sd_grouped_new = sd_grouped[['Neighborhood', 'Venues', 'Sushi Restaurant', 'Japanese Restaurant', 'Seafood Restaurant']]
sd_grouped_new = sd_grouped_new.sort_values(by='Venues', ascending=False)
sd_grouped_new.reset_index(drop=True, inplace=True)
sd_grouped_new.head(10)

Unnamed: 0,Neighborhood,Venues,Sushi Restaurant,Japanese Restaurant,Seafood Restaurant
0,Downtown,100,3,0,4
1,Old Town San Diego,69,1,0,0
2,Greater North Park,63,2,0,1
3,Balboa Park,57,0,0,0
4,Pacific Beach,57,1,0,1
5,Carmel Mountain Ranch,51,0,1,0
6,Mission Valley,43,0,1,1
7,Mid-City:Normal Heights,37,0,1,1
8,Barrio Logan,24,0,0,0
9,Greater Golden Hill,22,0,0,0


I will also check which neighborhoods have neither Suchi Restaurants, Japanese Restaurants, nor even Seafood Restaurants.

In [15]:
sd_grouped_new.rename(columns={'Sushi Restaurant': 'Sushi_Restaurant', \
                               'Japanese Restaurant': 'Japanese_Restaurant', \
                               'Seafood Restaurant': 'Seafood_Restaurant'}, inplace=True)

print(sd_grouped_new.shape)
sd_neighborhoods = sd_grouped_new[sd_grouped_new.Sushi_Restaurant == 0]
print(sd_neighborhoods.shape)
sd_neighborhoods = sd_neighborhoods[sd_neighborhoods.Japanese_Restaurant == 0]
print(sd_neighborhoods.shape)
sd_neighborhoods = sd_neighborhoods[sd_neighborhoods.Seafood_Restaurant == 0]
print(sd_neighborhoods.shape)

sd_neighborhoods.reset_index(drop=True, inplace=True)
sd_neighborhoods.head()

(48, 5)
(42, 5)
(38, 5)
(36, 5)


Unnamed: 0,Neighborhood,Venues,Sushi_Restaurant,Japanese_Restaurant,Seafood_Restaurant
0,Balboa Park,57,0,0,0
1,Barrio Logan,24,0,0,0
2,Greater Golden Hill,22,0,0,0
3,Peninsula,21,0,0,0
4,Midway-Pacific Highway,15,0,0,0


## 5. Discussion <a name="discussion"></a>

This result shows that *Balboa Park* is the optimal location to open a new Sushi restaurant in San Diego, since there the greatest number of venues among neighborhood which do not have sushi, Japanese or seafood restaurants.

However, since even if there is one sushi restaurant in *Old Town San Diego*, there are 21% more venues than *Balboa Park*, which means the former is much busier neighborhood than the latter.

Of course, there must be many other factors to find an optimal location for a new restaurant such as availability, rent price, easy accessibility, convenience of parking, etc. In this project, all such factors are ignored.

## 6. Conclusion <a name="conclusion"></a>

I will give the following two neighborhood names to my friend:

     Balboa Park - having the greatest number of venues among neighborhoods without any sushi, Japanese, or seafood restaurants.

     Old Town San Diego - having 21% more venues than *Balboa Park* but there is one sushi restaurant

I think he will choose one of them, considering other factors.