# Capstone Data Science Project - with Foursquare API

## Introduction/Business Problem

Although modern vegan and vegetarian cuisine is already popular in San Diego, CA, and keeps gaining ground, there still seems to be plenty of room for fine-dining plant-based restaurants, and some areas of San Diego County could be a good fit for that idea.

The purpose of this project is to analyze selected areas in San Diego County to find out where it would be best to open an upscale vegetarian/vegan restaurant.

While searching for a good area to start this kind of business, we consider:
+ the number of restaurants nearby
+ the number of vegetarian/vegan restaurants nearby
+ the median household income in the neighborhood
+ the distance from the main street/central area or other interesting attractions

## Data

The following resources will be used to extract the information needed:

+ **Google Maps Geocoding API** to find the geolocations of points of interest
+ **Foursquare API** for exploring neighborhoods, their venues, restaurants and attractions
+ **Median Household Income for San Diego County**, Census Bureau data from the datausa.io website - a csv including the census geoid and median household income
+ **FCC API** to convert geolocations to census geoids, to extract the neighborhoods of interest from the Median Household Income csv
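
The geolocation-to-geoid step could be sketched as below. This is only a sketch under stated assumptions: the endpoint URL, the `Block`/`FIPS` response layout, and the `14000US` tract prefix for the csv's geoid column are assumptions, not details taken from this notebook, and the helper names are hypothetical. The one part that is standard is that the first 11 digits of a 15-digit census block FIPS code identify the tract.

```python
import json
import urllib.parse
import urllib.request

# Assumed FCC block-lookup endpoint (not taken from the notebook).
FCC_BLOCK_URL = 'https://geo.fcc.gov/api/census/block/find'


def build_fcc_url(lat, lng):
    """Build the block-lookup URL for one coordinate pair."""
    query = urllib.parse.urlencode(
        {'latitude': lat, 'longitude': lng, 'format': 'json'})
    return '{}?{}'.format(FCC_BLOCK_URL, query)


def tract_geoid(block_fips, prefix='14000US'):
    """Turn a 15-digit census block FIPS code into a tract-level geoid.

    The first 11 digits (state + county + tract) identify the tract.
    The '14000US' prefix is an assumption about the csv's geoid format.
    """
    if len(block_fips) != 15 or not block_fips.isdigit():
        raise ValueError('expected a 15-digit block FIPS code')
    return prefix + block_fips[:11]


def lookup_tract_geoid(lat, lng):
    """Query the FCC API (network call) and return the tract geoid."""
    with urllib.request.urlopen(build_fcc_url(lat, lng)) as resp:
        payload = json.load(resp)
    # Assumed response layout: {"Block": {"FIPS": "..."}, ...}
    return tract_geoid(payload['Block']['FIPS'])


# Hypothetical block code in San Diego County (state 06, county 073).
example_geoid = tract_geoid('060730083051000')
```

The resulting geoid could then be matched against the geoid column of the Median Household Income csv to pick out the rows for the candidate neighborhoods.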


In [4]:
areas = ['West F Street, Encinitas, California',
'13th St, Del Mar, California',
'Girard Avenue, Village of La Jolla, California',
'Mission Blvd, Pacific Beach, California',
'University Av, Hillcrest, San Diego, California',
'Orange Ave, Coronado, California',
'North Park Way, North Park, San Diego, California',
'Rosecrans St, Point Loma, San Diego, California',
'Plaza St, Solana Beach, California',
'300 Mission Ave, Oceanside, California',
'600 Carlsbad Village Drive, Carlsbad, California',
'600 Fifth Avenue, San Diego, CA',
'1900 India Street, San Diego, CA']

In [164]:
import requests
import pandas as pd
import folium
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0

In [124]:
GOOGLE_API_KEY=''

In [125]:
neighborhoods = pd.DataFrame({'col1': [], 'col2': [], 'col3': [], 'col4': [], 'col5': [], 'col6': [], 'col7': [], 'lat': [], 'lng': []})    

In [126]:
def get_coords(areas, neighborhoods):
    """Geocode each area and add its address components and coordinates."""
    rows = []
    for area in areas:
        # pass the address via params so requests URL-encodes it
        response = requests.get(
            'https://maps.googleapis.com/maps/api/geocode/json',
            params={'key': GOOGLE_API_KEY, 'address': area}).json()
        result = response['results'][0]
        components = [c['short_name'] for c in result['address_components']]
        # the zip code sits at index 5 or 6, depending on how many components come back
        zipcode = components[5] if len(components) < 7 else components[6]
        row = {'col{}'.format(i + 1): components[i] for i in range(6)}
        row.update({'col7': zipcode,
                    'lat': result['geometry']['location']['lat'],
                    'lng': result['geometry']['location']['lng']})
        rows.append(row)
    # DataFrame.append was removed in pandas 2.0; concatenate instead
    return pd.concat([neighborhoods, pd.DataFrame(rows)], ignore_index=True)
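
`get_coords` picks address components by position, which is why the Oceanside and Carlsbad rows need manual fixes later on. A possible alternative, sketched here under assumptions, is to look components up by the `types` list that each geocoding component carries. The helper name `component_by_type` and the sample result are hypothetical and only illustrate the lookup.

```python
def component_by_type(result, wanted_type, field='short_name'):
    """Return the first address component tagged with wanted_type, or None."""
    for component in result.get('address_components', []):
        if wanted_type in component.get('types', []):
            return component.get(field)
    return None


# Hypothetical, trimmed geocoding result used only to show the lookup.
sample_result = {
    'address_components': [
        {'long_name': 'Orange Avenue', 'short_name': 'Orange Ave',
         'types': ['route']},
        {'long_name': 'Coronado', 'short_name': 'Coronado',
         'types': ['locality', 'political']},
        {'long_name': '92118', 'short_name': '92118',
         'types': ['postal_code']},
    ],
}

street = component_by_type(sample_result, 'route')
city = component_by_type(sample_result, 'locality')
zipcode = component_by_type(sample_result, 'postal_code')
```

With a lookup like this, a result that has more or fewer components than expected would yield `None` for the missing field instead of a value from the wrong position.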

In [127]:
df_neighborhoods = get_coords(areas, neighborhoods)

In [74]:
df_neighborhoods.drop(['col6', 'col5'], axis=1, inplace=True)

In [93]:
df_neighborhoods.iat[9, 0] = '300 Mission Ave'
df_neighborhoods.iat[9, 1] = 'Oceanside'
df_neighborhoods.iat[10, 0] = '600 Carlsbad Village Dr'
df_neighborhoods.iat[10, 1] = 'Carlsbad'

In [95]:
df_neighborhoods.drop(['col3', 'col4'], axis=1, inplace=True)

In [99]:
df_neighborhoods.rename(columns={'col1': 'street', 'col2': 'neighborhood', 'col7': 'zipcode'}, inplace=True)

In [158]:
map_sd = folium.Map(location=[df_neighborhoods.iloc[3]['lat'], df_neighborhoods.iloc[3]['lng']], zoom_start=10)

for lat, lng, neighborhood in zip(df_neighborhoods['lat'], df_neighborhoods['lng'], df_neighborhoods['neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_sd)  
    
map_sd

In [159]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [167]:
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    df_neighborhoods.iloc[0]['lat'], 
    df_neighborhoods.iloc[0]['lng'], 
    1500, 
    LIMIT)

In [168]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '603e7a5d51ca0668feb43360'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'},
    {'name': '$-$$$$', 'key': 'price'}]},
  'headerLocation': 'Encinitas',
  'headerFullLocation': 'Encinitas',
  'headerLocationGranularity': 'city',
  'totalResults': 108,
  'suggestedBounds': {'ne': {'lat': 33.056715513500016,
    'lng': -117.27886975598021},
   'sw': {'lat': 33.02971548649999, 'lng': -117.31101924401979}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '5751f7a9498e788997456512',
       'name': 'The Taco Stand',
       'location': {'address': '642 S Coast Highway 101',
        'lat': 33.04408749805209,
        'lng': -117.29372664619973,
        'labeledLatLngs': [{'label'

In [169]:
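def get_category_type(row):
    """Return the name of the first category of a venue row, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        # before the column names are cleaned, the key still has the 'venue.' prefix
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']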

In [170]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]



In [179]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
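
`getNearbyVenues` indexes `categories[0]` directly, so a venue with an empty category list would raise an `IndexError`. One way to guard against that is sketched below; the helper name `venue_row` and the sample items are hypothetical, and the item layout is assumed to match the explore response shown earlier.

```python
def venue_row(name, lat, lng, item):
    """Flatten one explore item into a tuple, tolerating a missing category."""
    venue = item['venue']
    categories = venue.get('categories') or []
    category = categories[0]['name'] if categories else None
    return (name, lat, lng, venue['name'],
            venue['location']['lat'], venue['location']['lng'], category)


# Hypothetical items shaped like the entries of the explore response.
items = [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': 33.04, 'lng': -117.29},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorized Spot',
               'location': {'lat': 33.05, 'lng': -117.30},
               'categories': []}},
]

rows = [venue_row('Encinitas', 33.043216, -117.294944, item) for item in items]
```

Rows built this way have the same seven fields as the tuples in `getNearbyVenues`, with `None` in the category slot when a venue has no category.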

In [180]:
sd_venues = getNearbyVenues(names=df_neighborhoods['neighborhood'],
                            latitudes=df_neighborhoods['lat'],
                            longitudes=df_neighborhoods['lng'])

Encinitas
Del Mar
Village of La Jolla
Pacific Beach
Hillcrest
Coronado
North Park
Point Loma
Solana Beach
Oceanside
Carlsbad
Gaslamp Quarter
Little Italy


In [185]:
sd_venues.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Carlsbad | 100 | 100 | 100 | 100 | 100 | 100 |
| Coronado | 37 | 37 | 37 | 37 | 37 | 37 |
| Del Mar | 38 | 38 | 38 | 38 | 38 | 38 |
| Encinitas | 43 | 43 | 43 | 43 | 43 | 43 |
| Gaslamp Quarter | 100 | 100 | 100 | 100 | 100 | 100 |
| Hillcrest | 54 | 54 | 54 | 54 | 54 | 54 |
| Little Italy | 100 | 100 | 100 | 100 | 100 | 100 |
| North Park | 58 | 58 | 58 | 58 | 58 | 58 |
| Oceanside | 75 | 75 | 75 | 75 | 75 | 75 |
| Pacific Beach | 59 | 59 | 59 | 59 | 59 | 59 |
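
Some of the counts above are exactly 100, which equals `LIMIT`. That likely means the request returned a capped result rather than every venue within the radius, so those neighborhoods may have more venues than the table shows. A small sketch for flagging them follows; the `venue_counts` dict is copied by hand from the table above and `capped` is a hypothetical name.

```python
LIMIT = 100  # same value as the notebook's LIMIT

# Venue counts copied from the groupby output above.
venue_counts = {
    'Carlsbad': 100, 'Coronado': 37, 'Del Mar': 38, 'Encinitas': 43,
    'Gaslamp Quarter': 100, 'Hillcrest': 54, 'Little Italy': 100,
    'North Park': 58, 'Oceanside': 75, 'Pacific Beach': 59,
}

# Neighborhoods whose count reached the limit are probably truncated.
capped = sorted(name for name, count in venue_counts.items() if count >= LIMIT)
print(capped)
```

For the flagged neighborhoods, comparing raw venue counts with the other areas should be done with care, since the true totals are unknown.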


In [205]:
sd_venues[sd_venues['Venue Category'] == 'Bakery']

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 33 | Encinitas | 33.043216 | -117.294944 | Darshan Bakery, Coffee, and Tea | 33.039689 | -117.293857 | Bakery |
| 85 | Village of La Jolla | 32.843179 | -117.273384 | Sugar and Scribe | 32.843129 | -117.275004 | Bakery |
| 330 | North Park | 32.747411 | -117.127709 | Panchita's Kitchen & Bakery | 32.747576 | -117.124877 | Bakery |
| 400 | Point Loma | 32.724732 | -117.229103 | Sweetaly Bakery & Bistro | 32.721643 | -117.231845 | Bakery |
| 462 | Oceanside | 33.195067 | -117.381072 | Petite Madeline | 33.196598 | -117.380269 | Bakery |
| 568 | Carlsbad | 33.160211 | -117.347692 | Cafe Elysa | 33.15719 | -117.350444 | Bakery |
| 680 | Gaslamp Quarter | 32.711697 | -117.16042 | Le Parfait Paris | 32.712566 | -117.159572 | Bakery |


In [209]:
sd_venues[sd_venues['Venue Category'] == 'Breakfast Spot']

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 11 | Encinitas | 33.043216 | -117.294944 | Honey's | 33.044162 | -117.293732 | Breakfast Spot |
| 24 | Encinitas | 33.043216 | -117.294944 | St. Tropez Bistro | 33.040338 | -117.292728 | Breakfast Spot |
| 59 | Del Mar | 32.957393 | -117.265067 | Pacifica Breeze Cafe | 32.959962 | -117.26502 | Breakfast Spot |
| 82 | Village of La Jolla | 32.843179 | -117.273384 | The Cottage | 32.843411 | -117.2748 | Breakfast Spot |
| 111 | Village of La Jolla | 32.843179 | -117.273384 | Richard Walker's Pancake House La Jolla | 32.846155 | -117.275393 | Breakfast Spot |
| 112 | Village of La Jolla | 32.843179 | -117.273384 | Coffee Cup | 32.846999 | -117.273074 | Breakfast Spot |
| 163 | Pacific Beach | 32.79368 | -117.254593 | Breakfast Republic | 32.79586 | -117.255144 | Breakfast Spot |
| 169 | Pacific Beach | 32.79368 | -117.254593 | IHOP | 32.795068 | -117.254147 | Breakfast Spot |
| 334 | North Park | 32.747411 | -117.127709 | Swami's Cafe | 32.748666 | -117.130843 | Breakfast Spot |
| 363 | North Park | 32.747411 | -117.127709 | North Park Breakfast Company | 32.74862 | -117.12586 | Breakfast Spot |


In [210]:
sd_venues[sd_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 5 | Encinitas | 33.043216 | -117.294944 | Lotus Cafe & Juice Bar | 33.041832 | -117.293072 | Vegetarian / Vegan Restaurant |
| 10 | Encinitas | 33.043216 | -117.294944 | EVE Encinitas | 33.045069 | -117.29361 | Vegetarian / Vegan Restaurant |
| 338 | North Park | 32.747411 | -117.127709 | Moncai Vegan | 32.748526 | -117.126423 | Vegetarian / Vegan Restaurant |
| 750 | Little Italy | 32.724423 | -117.168588 | Cafe Gratitude | 32.724236 | -117.169579 | Vegetarian / Vegan Restaurant |
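
Two of the criteria from the introduction, the number of restaurants and the number of vegetarian/vegan restaurants nearby, could be derived from a frame shaped like `sd_venues` roughly as follows. This is a sketch, not the notebook's method: the function name `restaurant_counts` is hypothetical, the small `sample` frame reuses a few rows from the outputs above, and treating every category that contains the word "Restaurant" as a restaurant is an assumption that misses categories such as "Taco Place" or "Pizza Place".

```python
import pandas as pd

# A few rows taken from the outputs above, trimmed to the columns needed.
sample = pd.DataFrame({
    'Neighborhood': ['Encinitas', 'Encinitas', 'Encinitas',
                     'North Park', 'North Park', 'Little Italy'],
    'Venue': ['Lotus Cafe & Juice Bar', 'EVE Encinitas', "Honey's",
              'Moncai Vegan', "Swami's Cafe", 'Cafe Gratitude'],
    'Venue Category': ['Vegetarian / Vegan Restaurant',
                       'Vegetarian / Vegan Restaurant',
                       'Breakfast Spot',
                       'Vegetarian / Vegan Restaurant',
                       'Breakfast Spot',
                       'Vegetarian / Vegan Restaurant'],
})


def restaurant_counts(venues):
    """Count restaurants and vegetarian/vegan restaurants per neighborhood."""
    # Assumption: a category containing 'Restaurant' marks a restaurant.
    is_restaurant = venues['Venue Category'].str.contains('Restaurant', na=False)
    is_vegan = venues['Venue Category'] == 'Vegetarian / Vegan Restaurant'
    flags = pd.DataFrame({
        'Neighborhood': venues['Neighborhood'],
        'restaurants': is_restaurant.astype(int),
        'vegan_restaurants': is_vegan.astype(int),
    })
    return flags.groupby('Neighborhood').sum()


counts = restaurant_counts(sample)
```

Calling `restaurant_counts(sd_venues)` on the full frame would give one row per neighborhood, which could later be joined with the income data.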


In [191]:
sd_venues['Venue Category'].unique()

array(['Taco Place', 'Pizza Place', 'Coffee Shop', 'Asian Restaurant',
       'Gastropub', 'Vegetarian / Vegan Restaurant', 'Diner', 'Dive Bar',
       'South American Restaurant', 'Breakfast Spot', 'Café',
       'Electronics Store', 'Italian Restaurant', 'Beach',
       'New American Restaurant', 'Wine Bar', 'Seafood Restaurant',
       'Art Gallery', 'Bar', 'Bookstore', 'Thrift / Vintage Store',
       'Vietnamese Restaurant', 'Brewery', 'Restaurant', 'Ice Cream Shop',
       'Gourmet Shop', 'Bakery', 'American Restaurant', 'Theater',
       'Burrito Place', 'Thai Restaurant', 'Lounge', 'Frozen Yogurt Shop',
       'Mediterranean Restaurant', 'Latin American Restaurant',
       'Sandwich Place', 'Mexican Restaurant', 'Deli / Bodega',
       'Shopping Mall', 'Hotel Bar', 'Hotel', 'Board Shop', 'Park',
       'Chinese Restaurant', 'Farmers Market', 'Clothing Store',
       'Sushi Restaurant', 'Shipping Store', 'Post Office', 'Sports Bar',
       'Accessories Store', 'Pet Store', 'Gift

In [178]:
# get the bakeries, vegan restaurants and breakfast spots in each area
# analyze each neighborhood - cluster - income?
# get how many restaurants there are in each candidate neighborhood in general
# get income for each candidate - add to clustering / cluster by income


| | name | categories | lat | lng |
|---|---|---|---|---|
| 50 | Via Italia Trattoria | Italian Restaurant | 33.045236 | -117.293577 |
| 51 | The Crack Shack Encinitas | Fried Chicken Joint | 33.047844 | -117.28431 |
| 52 | Buona Forchetta | Italian Restaurant | 33.053593 | -117.296676 |
| 53 | Best Nails | Cosmetics Shop | 33.046482 | -117.283371 |
| 54 | UNIV | Boutique | 33.038252 | -117.292726 |
| 55 | GOODONYA Organic Restaurant | New American Restaurant | 33.038555 | -117.29267 |
| 56 | Surfdog's Java Hut | Coffee Shop | 33.037553 | -117.292769 |
| 57 | Rancho Coastal Humane Society | Pet Store | 33.042537 | -117.284166 |
| 58 | Beachside Bar & Grill | American Restaurant | 33.041817 | -117.293427 |
| 59 | Raul's Mexican Food | Burrito Place | 33.04606 | -117.294172 |
