Capstone Project - The Battle of the Neighborhoods (Week 2)

In this project, we will determine the best location to open a Chinese restaurant in Thousand Oaks, CA. The ideal location has no other Chinese restaurants nearby. An Asian or Chinese grocery store next to or near the location could also boost sales. Finally, the restaurant should be near the city center, a busy commercial district, or a heavy-traffic area.

To find the best location for the Chinese restaurant, we need to look at several statistics:
•	The number of Chinese restaurants in the neighborhood 
•	Any Asian grocery stores near the location
•	The distance of the location from city center

To gather this data, we will use several libraries and services. Folium renders the maps of the city, Nominatim (through geopy) reverse-geocodes candidate locations into street addresses, and the Foursquare API provides the restaurant locations. Together, these sources let us find the locations best suited for a Chinese restaurant.


In [1]:
import requests

In [2]:
!pip install shapely
import shapely.geometry

!pip install pyproj
import pyproj

city_center = ['34.186999252', '-118.87166318']
import math

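def lonlat_to_xy(lon, lat):
    # Project WGS84 longitude/latitude to UTM x/y coordinates (meters).
    # NOTE: UTM zone 33 covers central Europe; Thousand Oaks (longitude -118.87)
    # lies in UTM zone 11, so X/Y values and distances computed with zone=33 are distorted.
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    # Inverse of lonlat_to_xy; it must use the same UTM zone.
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]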

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)


print('City center longitude={}, latitude={}'.format(city_center[1], city_center[0]))
x, y = lonlat_to_xy(city_center[1], city_center[0])
lo, la = xy_to_lonlat(x, y)
print('City center longitude={}, latitude={}'.format(lo, la))

Collecting shapely
Installing collected packages: shapely
Successfully installed shapely-1.6.4.post2
Collecting pyproj
Installing collected packages: pyproj
Successfully installed pyproj-2.3.0
City center longitude=-118.87166318, latitude=34.186999252
City center longitude=-118.87166318, latitude=34.186999252
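
The projection helpers above hard-code a UTM zone. As a cross-check, the standard zone number can be derived from the longitude, because UTM zones are 6 degrees wide and start at 180 degrees west. The sketch below is only illustrative: the helper name `utm_zone_for_longitude` is hypothetical (it is not part of pyproj), and it ignores the special-case zones around Norway and Svalbard.

```python
import math

def utm_zone_for_longitude(lon_deg):
    """Return the standard UTM zone number (1-60) for a longitude in degrees."""
    # Zones are 6 degrees wide; zone 1 starts at 180 degrees west.
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1

print(utm_zone_for_longitude(-118.87166318))  # Thousand Oaks city center -> 11
print(utm_zone_for_longitude(13.4))           # Berlin, for comparison -> 33
```

Under this rule the city center falls in zone 11, while zone 33 corresponds to longitudes around Berlin.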


In [12]:
city_center_x, city_center_y = lonlat_to_xy(city_center[1], city_center[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = city_center_x - 6000
x_step = 600
y_min = city_center_y - 6000 - (int(21/k)*k*600 - 12000)/2
y_step = 600 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(city_center_x, city_center_y, x, y)
        if (distance_from_center <= 6001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

364 candidate neighborhood centers generated.
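
The grid construction above can be restated without any projection library, which makes the geometry easier to inspect. This is a minimal sketch, assuming a local Cartesian frame in meters; the function name `hex_grid` and its parameters are hypothetical. Rows are spaced `step * sqrt(3) / 2` apart and every other row is shifted by half a step, so each point sits one step away from its six neighbors.

```python
import math

def hex_grid(center_x, center_y, radius=6000.0, step=600.0):
    """Return (x, y) points of a hexagonal grid inside a circle around the center."""
    k = math.sqrt(3) / 2                      # vertical spacing factor between rows
    n_cols = int(2 * radius / step) + 1       # 21 columns for the defaults
    n_rows = int(n_cols / k)                  # 24 rows for the defaults
    x_min = center_x - radius
    # Center the rows vertically around center_y
    y_min = center_y - radius - (n_rows * k * step - 2 * radius) / 2
    points = []
    for i in range(n_rows):
        y = y_min + i * step * k
        x_offset = step / 2 if i % 2 == 0 else 0.0   # shift every other row
        for j in range(n_cols):
            x = x_min + j * step + x_offset
            # Keep only points inside the circle (1 m of tolerance, as above)
            if math.hypot(x - center_x, y - center_y) <= radius + 1:
                points.append((x, y))
    return points

print(len(hex_grid(0.0, 0.0)))  # 364 with the default radius and step
```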


In [13]:
!pip install folium
import folium
map_thousand_oaks = folium.Map(location=city_center, zoom_start=13)
folium.Marker(city_center, popup='Thousand Oaks').add_to(map_thousand_oaks)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=240, color='blue', fill=False).add_to(map_thousand_oaks) #300
map_thousand_oaks



In [14]:
!pip install geopy



In [15]:
import geopy 


In [16]:
from geopy.geocoders import Nominatim

In [17]:
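def get_address(latitude, longitude):
    # Reverse-geocode a coordinate pair with Nominatim; returns None when nothing is found
    z = (latitude, longitude)
    geolocator = Nominatim(user_agent="specify_your_app_name_here")
    location = geolocator.reverse(z)
    if location is None:
        return None
    print(location.address)
    return location.address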

addr = get_address(city_center[0], city_center[1])
print(addr)
print('Reverse geocoding check')
print('-----------------------')
print('Address of [{}, {}] is: {}'.format(city_center[0], city_center[1], addr))

199, East Wilbur Road, Thousand Oaks, Ventura County, California, 91360, USA
199, East Wilbur Road, Thousand Oaks, Ventura County, California, 91360, USA
Reverse geocoding check
-----------------------
Address of [34.186999252, -118.87166318] is: 199, East Wilbur Road, Thousand Oaks, Ventura County, California, 91360, USA


In [18]:
print('Obtaining location addresses: ', end='')
addresses = []
for lat, lon in zip(latitudes, longitudes):
    address = get_address(lat, lon)
    if address is None:
        address = 'NO ADDRESS'
    addresses.append(address)
    print(' .', end='')

Obtaining location addresses: Lang Ranch Open Space, Chaucer Place, Thousand Oaks, Ventura County, California, 91362, USA
 .2808, Parkview Drive, Thousand Oaks, Ventura County, California, 91362, USA
 .3180, Morningside Drive, Thousand Oaks, Ventura County, California, 91362, USA
 .3172, Montagne Way, Thousand Oaks, Ventura County, California, 91362, USA
 .Caraway Court, Thousand Oaks, Ventura County, California, 91360, USA
 .3423, Clarendon Place, Thousand Oaks, Ventura County, California, 91360, USA
 .3621, Field Crest Court, Thousand Oaks, Ventura County, California, 91360, USA
 .North Westlake Boulevard, Thousand Oaks, Ventura County, California, 91362, USA
 .2529, Northpark Street, Thousand Oaks, Ventura County, California, 91362, USA
 .2234, Green Oak Court, Thousand Oaks, Ventura County, California, 91362, USA
 .2767, Rikkard Drive, Thousand Oaks, Ventura County, California, 91362, USA
 .1770, Sweet Briar Place, Thousand Oaks, Ventura County, California, 91362, USA
 .Holly Court

 .1081, Valley High Avenue, Thousand Oaks, Ventura County, California, 91362, USA
 .1534, El Verano Drive, Thousand Oaks, Ventura County, California, 91362, USA
 .Conejo Creek Park South Trail, Thousand Oaks, Ventura County, California, 91360, USA
 .1076, East Janss Road, Thousand Oaks, Ventura County, California, 91360, USA
 .1735, Colgate Drive, Thousand Oaks, Ventura County, California, 91360, USA
 .1855, Burleson Avenue, Thousand Oaks, Ventura County, California, 91360, USA
 .58, West Columbia Road, Thousand Oaks, Ventura County, California, 91360, USA
 .109, West Sidlee Street, Thousand Oaks, Ventura County, California, 91360, USA
 .274, Siesta Avenue, Thousand Oaks, Ventura County, California, 91360, USA
 .2400, Dillon Court, Thousand Oaks, Ventura County, California, 91360, USA
 .Fort Trail, Thousand Oaks, Ventura County, California, 91360, USA
 .2683, Roca Avenue, Thousand Oaks, Ventura County, California, 91360, USA
 .Moonridge Trail, Thousand Oaks, Ventura County, California,

 .sage and cactus, Fireworks Hill Access Road, Thousand Oaks, Ventura County, California, 911360, USA
 .Conejo Valley Botanic Garden, Botanical Garden Nature Trail, Thousand Oaks, Ventura County, California, 91360, USA
 .Tree Top Lane, Thousand Oaks, Ventura County, California, 911360, USA
 .987, Calle Yucca, Thousand Oaks, Ventura County, California, 91360, USA
 .1113, Calle Las Trancas, Newbury Park, Ventura County, California, 91360, USA
 .Rockwell International Science Center Library, Calle Pecos, Newbury Park, Ventura County, California, CA 91320, USA
 .Arroyo Conejo Open Space, Lawrence Drive, Thousand Oaks, Ventura County, California, 91320, USA
 .Calle Yucca Trail, Thousand Oaks, Ventura County, California, 91320, USA
 .126, Triunfo Canyon Road, Thousand Oaks, Ventura County, California, 91361, USA
 .430, Hampshire Road, Thousand Oaks, Ventura County, California, 91361, USA
 .Skyline Open Space, Los Robles Trail, Thousand Oaks, Ventura County, California, CA 91362, USA
 .347, S

 .201, Larch Crest Court, Thousand Oaks, Ventura County, California, 91320, USA
 .143, Marjori Avenue, Thousand Oaks, Ventura County, California, 91320, USA
 .Lynn Road, Thousand Oaks, Ventura County, California, 91320, USA
 .198, Saint James Court, Thousand Oaks, Ventura County, California, 91320, USA
 .298, Twin Falls Court, Thousand Oaks, Ventura County, California, 91320, USA
 .Roadrunner, Newbury Road, Thousand Oaks, Ventura County, California, CA 91320, United States of America
 .34, West Hillcrest Drive, Thousand Oaks, Ventura County, California, 91320, United States of America
 .South Ranch Open Space, Bridgegate Trail, Thousand Oaks, Ventura County, California, 91361, USA
 .South Ranch Open Space, Bridgegate Trail, Thousand Oaks, Ventura County, California, 91361, USA
 .South Ranch Open Space, Bridgegate Trail, Thousand Oaks, Ventura County, California, 91361, USA
 .White Horse Canyon Trail, Thousand Oaks, Ventura County, California, CA 91362, USA
 .Janss Fire Road, Greenwich 
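
The loop above sends one reverse-geocoding request per candidate location in quick succession. The public Nominatim service publishes a usage policy that limits request rates, so spacing the calls out is a reasonable precaution. geopy ships a `RateLimiter` helper in `geopy.extra.rate_limiter` for this purpose; the sketch below shows the same idea with the standard library only. The names `throttled` and `fake_get_address` are hypothetical, and the stand-in function replaces `get_address` so the example runs offline.

```python
import time

def throttled(func, min_delay_seconds=1.0, clock=time.monotonic, sleep=time.sleep):
    """Wrap func so that successive calls are at least min_delay_seconds apart."""
    last_call = [None]  # time of the previous call, None before the first call

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_delay_seconds - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        last_call[0] = clock()
        return func(*args, **kwargs)

    return wrapper

# Hypothetical stand-in for get_address, so no network access is needed here
def fake_get_address(lat, lon):
    return 'address of ({:.3f}, {:.3f})'.format(lat, lon)

slow_get_address = throttled(fake_get_address, min_delay_seconds=0.01)
demo_addresses = [slow_get_address(lat, lon) or 'NO ADDRESS'
                  for lat, lon in [(34.187, -118.872), (34.216, -118.833)]]
print(demo_addresses)
```

With the real `get_address`, a delay of about one second per request would be a cautious choice.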

In [19]:
addresses[10:40]

['2767, Rikkard Drive, Thousand Oaks, Ventura County, California, 91362, USA',
 '1770, Sweet Briar Place, Thousand Oaks, Ventura County, California, 91362, USA',
 'Holly Court, Thousand Oaks, Ventura County, California, 91360, United States of America',
 '3168, Silver Maple Circle, Thousand Oaks, Ventura County, California, 91360, USA',
 '1032, Uppingham Drive, Thousand Oaks, Ventura County, California, 91360, USA',
 '3309, Camino Calandria, Thousand Oaks, Ventura County, California, 91360, USA',
 'East Olsen Road, Thousand Oaks, Ventura County, California, 91360, USA',
 'Oakbrook Edison Road, Thousand Oaks, Ventura County, California, CA 91362-4357, USA',
 'North Westlake Boulevard, Thousand Oaks, Ventura County, California, 91362, USA',
 'Santa Bella Place, Thousand Oaks, Ventura County, California, 91362, USA',
 '2180, Flintridge Court, Thousand Oaks, Ventura County, California, 91362, USA',
 '2000, Shady Brook Drive, Thousand Oaks, Ventura County, California, 91362, USA',
 '1691, S

In [20]:
import pandas as pd
df_locations = pd.DataFrame({'Address': addresses, 'Latitude': latitudes, 'Longitude': longitudes, 'X': xs, 'Y': ys, 'Distance from center': distances_from_center})

df_locations.head(10)

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center
0,"Lang Ranch Open Space, Chaucer Place, Thousand...",34.216078,-118.832964,-3889918.0,15065240.0,5992.495307
1,"2808, Parkview Drive, Thousand Oaks, Ventura C...",34.218276,-118.83747,-3889318.0,15065240.0,5840.3767
2,"3180, Morningside Drive, Thousand Oaks, Ventur...",34.220473,-118.841976,-3888718.0,15065240.0,5747.173218
3,"3172, Montagne Way, Thousand Oaks, Ventura Cou...",34.222671,-118.846482,-3888118.0,15065240.0,5715.767665
4,"Caraway Court, Thousand Oaks, Ventura County, ...",34.224869,-118.85099,-3887518.0,15065240.0,5747.173218
5,"3423, Clarendon Place, Thousand Oaks, Ventura ...",34.227066,-118.855497,-3886918.0,15065240.0,5840.3767
6,"3621, Field Crest Court, Thousand Oaks, Ventur...",34.229264,-118.860005,-3886318.0,15065240.0,5992.495307
7,"North Westlake Boulevard, Thousand Oaks, Ventu...",34.20954,-118.828497,-3890818.0,15065760.0,5855.766389
8,"2529, Northpark Street, Thousand Oaks, Ventura...",34.211738,-118.833002,-3890218.0,15065760.0,5604.462508
9,"2234, Green Oak Court, Thousand Oaks, Ventura ...",34.213935,-118.837508,-3889618.0,15065760.0,5408.326913


In [21]:
df_locations.to_pickle('./locations.pkl')    

In [22]:
foursquare_client_id = '#'
foursquare_client_secret = '#'

In [23]:
food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

chinese_restaurant_categories = ['4bf58dd8d48988d145941735','52af3a5e3cf9994f4e043bea','52af3a723cf9994f4e043bec',
                                 '52af3a7c3cf9994f4e043bed','58daa1558bbb0b01f18ec1d3','52af3a673cf9994f4e043beb',
                                 '52af3a903cf9994f4e043bee','4bf58dd8d48988d1f5931735','52af3a9f3cf9994f4e043bef',
                                 '52af3aaa3cf9994f4e043bf0','52af3ab53cf9994f4e043bf1','52af3abe3cf9994f4e043bf2',
                                 '52af3ac83cf9994f4e043bf3','52af3ad23cf9994f4e043bf4','52af3add3cf9994f4e043bf5',
                                 '52af3af23cf9994f4e043bf7','52af3ae63cf9994f4e043bf6','52af3afc3cf9994f4e043bf8',
                                 '52af3b053cf9994f4e043bf9','52af3b213cf9994f4e043bfa','52af3b293cf9994f4e043bfb',
                                 '52af3b343cf9994f4e043bfc','52af3b3b3cf9994f4e043bfd','52af3b463cf9994f4e043bfe',
                                 '52af3b633cf9994f4e043c01','52af3b513cf9994f4e043bff','52af3b593cf9994f4e043c00',
                                 '52af3b6e3cf9994f4e043c02','52af3b773cf9994f4e043c03','52af3b813cf9994f4e043c04',
                                 '52af3b893cf9994f4e043c05','52af3b913cf9994f4e043c06','52af3b9a3cf9994f4e043c07', '52af3ba23cf9994f4e043c08']

def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        if 'fast food' in category_name:
            restaurant = False
        if specific_filter is not None and category_id in specific_filter:
            specific = True
            restaurant = True
    return restaurant, specific

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', Deutschland', '')
    address = address.replace(', Germany', '')
    return address

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=500, limit=100):
    version = '20180724'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lon, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
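    except (requests.RequestException, KeyError, IndexError, TypeError, ValueError):
        # Network failure or an unexpected response structure: treat as "no venues"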
        venues = []
    return venues
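
To see how the classification behaves, the logic of `is_restaurant` is restated below so the example runs on its own, and it is applied to a few hand-made category lists. The category names and the `hypothetical-id-*` values are made up for illustration; only `4bf58dd8d48988d145941735` is taken from `chinese_restaurant_categories` above.

```python
def is_restaurant(categories, specific_filter=None):
    """Restatement of the function above for (name, id) category tuples."""
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse']
    restaurant = False
    specific = False
    for name, category_id in categories:
        name = name.lower()
        if any(word in name for word in restaurant_words):
            restaurant = True
        if 'fast food' in name:
            restaurant = False
        if specific_filter is not None and category_id in specific_filter:
            specific = True
            restaurant = True
    return restaurant, specific

chinese_ids = {'4bf58dd8d48988d145941735'}
chinese = ('Chinese Restaurant', '4bf58dd8d48988d145941735')
fast_food = ('Fast Food Restaurant', 'hypothetical-id-2')

print(is_restaurant([('Sushi Restaurant', 'hypothetical-id-1')], chinese_ids))  # (True, False)
print(is_restaurant([fast_food], chinese_ids))                                  # (False, False)
print(is_restaurant([chinese], chinese_ids))                                    # (True, True)
print(is_restaurant([fast_food, chinese], chinese_ids))                         # (True, True)
print(is_restaurant([chinese, fast_food], chinese_ids))                         # (False, True)
```

The last two calls differ only in the order of the categories yet classify the venue differently, which may be worth checking against how the API orders a venue's categories.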

In [24]:
import pickle

def get_restaurants(lats, lons):
    restaurants = {}
    chinese_restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        venues = get_venues_near_location(lat, lon, food_category, foursquare_client_id, foursquare_client_secret, radius=350, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_chinese = is_restaurant(venue_categories, specific_filter=chinese_restaurant_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_chinese, x, y)
                if venue_distance<=300:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_chinese:
                    chinese_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, chinese_restaurants, location_restaurants

restaurants = {}
chinese_restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('restaurants_350.pkl', 'rb') as f:
        restaurants = pickle.load(f)
    with open('chinese_restaurants_350.pkl', 'rb') as f:
        chinese_restaurants = pickle.load(f)
    with open('location_restaurants_350.pkl', 'rb') as f:
        location_restaurants = pickle.load(f)
    print('Restaurant data loaded.')
    loaded = True
except (OSError, pickle.UnpicklingError, EOFError):
    # No usable cache yet; fall through and query the API
    pass

# If loading failed, use the Foursquare API to get the data
if not loaded:
    restaurants, chinese_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    # Persist the results in the local file system
    with open('restaurants_350.pkl', 'wb') as f:
        pickle.dump(restaurants, f)
    with open('chinese_restaurants_350.pkl', 'wb') as f:
        pickle.dump(chinese_restaurants, f)
    with open('location_restaurants_350.pkl', 'wb') as f:
        pickle.dump(location_restaurants, f)
        

Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.
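
The load-or-fetch pattern in the cell above can be factored into a small helper so that each cached object is read and written under a single file name. This is a sketch under stated assumptions: `load_or_compute` and `expensive` are hypothetical names, and the demonstration uses a temporary directory rather than the notebook's real cache files. Pickle files should only be loaded from trusted sources.

```python
import os
import pickle
import tempfile

def load_or_compute(path, compute):
    """Return the pickled object at path, or call compute() and cache its result."""
    try:
        with open(path, 'rb') as f:
            return pickle.load(f)
    except (OSError, pickle.UnpicklingError, EOFError):
        result = compute()
        with open(path, 'wb') as f:
            pickle.dump(result, f)
        return result

# Demonstration with a hypothetical, cheap stand-in for the API calls
calls = []
def expensive():
    calls.append(1)
    return {'venue-id': ('Example Venue', 34.2, -118.8)}

with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, 'restaurants_demo.pkl')
    first = load_or_compute(cache_path, expensive)    # computes and writes the cache
    second = load_or_compute(cache_path, expensive)   # served from the cache

print(first == second, len(calls))  # True 1
```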


In [25]:
import numpy as np

print('Total number of restaurants:', len(restaurants))
print('Total number of Chinese restaurants:', len(chinese_restaurants))
print('Percentage of Chinese restaurants: {:.2f}%'.format(len(chinese_restaurants) / len(restaurants) * 100))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 115
Total number of Chinese restaurants: 13
Percentage of Chinese restaurants: 11.30%
Average number of restaurants in neighborhood: 0.4175824175824176


In [26]:
print('List of all restaurants')
print('-----------------------')
for r in list(restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(restaurants))

List of all restaurants
-----------------------
('4bbf9220b492d13a20e2a260', 'Pacific Fresh', 34.210142554644506, -118.84012659118915, '2060 E Avenida de Los Arboles, Thousand Oaks, CA 91362, United States', 349, False, -3889624.195659887, 15066364.923957342)
('4e4e4116bd4101d0d7a60ceb', 'Panda Express', 34.21028034832754, -118.84115235329998, '2048 E Avenida de Los Arboles, Thousand Oaks, CA 91362, United States', 326, True, -3889512.9595384793, 15066408.127779774)
('5c77508841b6c90025de3c56', 'Habanero Mexican Grill', 34.210372, -118.84185, '2024 E Avenida de los Arboles, Thousand Oaks, CA 91362, United States', 325, False, -3889437.449959904, 15066437.75678927)
('584f2195a370b9190de71f57', 'American Fusion Grill', 34.210497466187604, -118.84161734211868, '2024 E Avenida de los Arboles (Erbes Rd.), Thousand Oaks, CA 91362, United States', 341, False, -3889451.7139249016, 15066409.274778338)
('588ab21751666a083ea461de', 'Cinqo', 34.21042, -118.84179, '2024 E Avenida de los Arboles #8,

In [27]:
print('List of Chinese restaurants')
print('---------------------------')
for r in list(chinese_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(chinese_restaurants))

List of Chinese restaurants
---------------------------
('4e4e4116bd4101d0d7a60ceb', 'Panda Express', 34.21028034832754, -118.84115235329998, '2048 E Avenida de Los Arboles, Thousand Oaks, CA 91362, United States', 326, True, -3889512.9595384793, 15066408.127779774)
('4c571caa6201e21e35e0406e', 'Chang 101', 34.210539631152706, -118.84176383990258, '2024 E Avenida de Los Arboles, Thousand Oaks, CA 91362, United States', 345, True, -3889434.2541823154, 15066412.764187222)
('4b889296f964a520b90132e3', "Han's Chinese", 34.214738455299695, -118.86060334848734, '1032 E Avenida de Los Arboles, Thousand Oaks, CA 91360, United States', 318, True, -3887274.7518669087, 15067007.01448557)
('4b6ba125f964a52058132ce3', 'Imperial Garden', 34.218500053536474, -118.87036312907834, '355 Avenida de los Arboles (Moorpark Rd), Thousand Oaks, CA 91360, United States', 304, True, -3886045.276485581, 15067125.488393527)
('4a1371d5f964a520df771fe3', 'Panda Express', 34.20103953137223, -118.86789815703833, '174

In [28]:
print('Restaurants around location')
print('---------------------------')
for i in range(100, 120):
    rs = location_restaurants[i][:8]
    names = ', '.join([r[1] for r in rs])
    print('Restaurants around location {}: {}'.format(i+1, names))

Restaurants around location
---------------------------
Restaurants around location 101: 
Restaurants around location 102: 
Restaurants around location 103: 
Restaurants around location 104: 
Restaurants around location 105: 
Restaurants around location 106: Minato Sushi, Panda Express, Greenhouse Cafe
Restaurants around location 107: Panda Express, Pip's Place
Restaurants around location 108: 
Restaurants around location 109: 
Restaurants around location 110: 
Restaurants around location 111: 
Restaurants around location 112: 
Restaurants around location 113: 
Restaurants around location 114: 
Restaurants around location 115: 
Restaurants around location 116: 
Restaurants around location 117: 
Restaurants around location 118: 
Restaurants around location 119: 
Restaurants around location 120: 


In [29]:
map_thousand_oaks = folium.Map(location=city_center, zoom_start=13)
folium.Marker(city_center, popup='Thousand Oaks').add_to(map_thousand_oaks)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_chinese = res[6]
    color = 'red' if is_chinese else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_thousand_oaks)
map_thousand_oaks

In [30]:
location_restaurants_count = [len(res) for res in location_restaurants]

df_locations['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius=300m:', np.array(location_restaurants_count).mean())

df_locations.head(10)

Average number of restaurants in every area with radius=300m: 0.4175824175824176


Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center,Restaurants in area
0,"Lang Ranch Open Space, Chaucer Place, Thousand...",34.216078,-118.832964,-3889918.0,15065240.0,5992.495307,0
1,"2808, Parkview Drive, Thousand Oaks, Ventura C...",34.218276,-118.83747,-3889318.0,15065240.0,5840.3767,0
2,"3180, Morningside Drive, Thousand Oaks, Ventur...",34.220473,-118.841976,-3888718.0,15065240.0,5747.173218,0
3,"3172, Montagne Way, Thousand Oaks, Ventura Cou...",34.222671,-118.846482,-3888118.0,15065240.0,5715.767665,0
4,"Caraway Court, Thousand Oaks, Ventura County, ...",34.224869,-118.85099,-3887518.0,15065240.0,5747.173218,0
5,"3423, Clarendon Place, Thousand Oaks, Ventura ...",34.227066,-118.855497,-3886918.0,15065240.0,5840.3767,0
6,"3621, Field Crest Court, Thousand Oaks, Ventur...",34.229264,-118.860005,-3886318.0,15065240.0,5992.495307,0
7,"North Westlake Boulevard, Thousand Oaks, Ventu...",34.20954,-118.828497,-3890818.0,15065760.0,5855.766389,0
8,"2529, Northpark Street, Thousand Oaks, Ventura...",34.211738,-118.833002,-3890218.0,15065760.0,5604.462508,0
9,"2234, Green Oak Court, Thousand Oaks, Ventura ...",34.213935,-118.837508,-3889618.0,15065760.0,5408.326913,0


In [31]:
distances_to_chinese_restaurant = []

for area_x, area_y in zip(xs, ys):
    min_distance = 10000
    for res in chinese_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<min_distance:
            min_distance = d
    distances_to_chinese_restaurant.append(min_distance)

df_locations['Distance to Chinese restaurant'] = distances_to_chinese_restaurant

In [32]:
df_locations.head(10)

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center,Restaurants in area,Distance to Chinese restaurant
0,"Lang Ranch Open Space, Chaucer Place, Thousand...",34.216078,-118.832964,-3889918.0,15065240.0,5992.495307,0,1235.54332
1,"2808, Parkview Drive, Thousand Oaks, Ventura C...",34.218276,-118.83747,-3889318.0,15065240.0,5840.3767,0,1177.59403
2,"3180, Morningside Drive, Thousand Oaks, Ventur...",34.220473,-118.841976,-3888718.0,15065240.0,5747.173218,0,1373.342479
3,"3172, Montagne Way, Thousand Oaks, Ventura Cou...",34.222671,-118.846482,-3888118.0,15065240.0,5715.767665,0,1762.217758
4,"Caraway Court, Thousand Oaks, Ventura County, ...",34.224869,-118.85099,-3887518.0,15065240.0,5747.173218,0,1782.796657
5,"3423, Clarendon Place, Thousand Oaks, Ventura ...",34.227066,-118.855497,-3886918.0,15065240.0,5840.3767,0,1801.749984
6,"3621, Field Crest Court, Thousand Oaks, Ventur...",34.229264,-118.860005,-3886318.0,15065240.0,5992.495307,0,1904.229836
7,"North Westlake Boulevard, Thousand Oaks, Ventu...",34.20954,-118.828497,-3890818.0,15065760.0,5855.766389,0,1457.00915
8,"2529, Northpark Street, Thousand Oaks, Ventura...",34.211738,-118.833002,-3890218.0,15065760.0,5604.462508,0,957.425691
9,"2234, Green Oak Court, Thousand Oaks, Ventura ...",34.213935,-118.837508,-3889618.0,15065760.0,5408.326913,0,656.088595


In [33]:
print('Average distance to closest Chinese restaurant from each area center:', df_locations['Distance to Chinese restaurant'].mean())

Average distance to closest Chinese restaurant from each area center: 1855.404989459351


In [34]:
from folium import plugins
from folium.plugins import HeatMap
restaurant_latlons = [[res[2], res[3]] for res in restaurants.values()]

chinese_latlons = [[res[2], res[3]] for res in chinese_restaurants.values()]

In [35]:
# A map of all restaurant locations
map_thousand_oaks = folium.Map(location=city_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_thousand_oaks) 
HeatMap(restaurant_latlons).add_to(map_thousand_oaks)
folium.Marker(city_center).add_to(map_thousand_oaks)
map_thousand_oaks

In [36]:
# A map of Chinese Restaurant locations
map_thousand_oaks = folium.Map(location=city_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_thousand_oaks) #cartodbpositron cartodbdark_matter
HeatMap(chinese_latlons).add_to(map_thousand_oaks)
folium.Marker(city_center).add_to(map_thousand_oaks)
map_thousand_oaks

In [38]:
roi_x_min = city_center_x - 2000
roi_y_max = city_center_y + 1000
roi_width = 5000
roi_height = 5000
roi_center_x = roi_x_min + 2500
roi_center_y = roi_y_max - 2500
roi_center_lon, roi_center_lat = xy_to_lonlat(roi_center_x, roi_center_y)
roi_center = [roi_center_lat, roi_center_lon]

map_thousand_oaks = folium.Map(location=roi_center, zoom_start=14)
HeatMap(restaurant_latlons).add_to(map_thousand_oaks)
folium.Marker(city_center).add_to(map_thousand_oaks)
map_thousand_oaks

In [39]:
k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_step = 100
y_step = 100 * k 
roi_y_min = roi_center_y - 2500

roi_latitudes = []
roi_longitudes = []
roi_xs = []
roi_ys = []
for i in range(0, int(51/k)):
    y = roi_y_min + i * y_step
    x_offset = 50 if i%2==0 else 0
    for j in range(0, 51):
        x = roi_x_min + j * x_step + x_offset
        d = calc_xy_distance(roi_center_x, roi_center_y, x, y)
        if (d <= 2501):
            lon, lat = xy_to_lonlat(x, y)
            roi_latitudes.append(lat)
            roi_longitudes.append(lon)
            roi_xs.append(x)
            roi_ys.append(y)

In [40]:
def count_restaurants_nearby(x, y, restaurants, radius=525):    
    count = 0
    for res in restaurants.values():
        res_x = res[7]; res_y = res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d<=radius:
            count += 1
    return count

def find_nearest_restaurant(x, y, restaurants):
    d_min = 100000
    for res in restaurants.values():
        res_x = res[7]; res_y = res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d<=d_min:
            d_min = d
    return d_min

roi_restaurant_counts = []
roi_chinese_distances = []

print('Generating data on location candidates... ', end='')
for x, y in zip(roi_xs, roi_ys):
    count = count_restaurants_nearby(x, y, restaurants, radius=250)
    roi_restaurant_counts.append(count)
    distance = find_nearest_restaurant(x, y, chinese_restaurants)
    roi_chinese_distances.append(distance)
print('done.')

Generating data on location candidates... done.
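
The two helpers above use a large sentinel value as the starting minimum. A variant that avoids the sentinel is sketched below; it assumes plain `(x, y)` tuples instead of the restaurant tuples used in this notebook, and the names `count_within` and `nearest_distance` are hypothetical. An empty collection yields infinity rather than an arbitrary cap.

```python
import math

def count_within(x, y, points, radius):
    """Count the points whose distance to (x, y) is at most radius."""
    return sum(1 for px, py in points if math.hypot(px - x, py - y) <= radius)

def nearest_distance(x, y, points):
    """Distance from (x, y) to the closest point, or infinity if there are none."""
    return min((math.hypot(px - x, py - y) for px, py in points), default=math.inf)

demo_points = [(0.0, 0.0), (300.0, 400.0), (1000.0, 0.0)]
print(count_within(0.0, 0.0, demo_points, radius=600))   # 2
print(nearest_distance(600.0, 800.0, demo_points))       # 500.0
print(nearest_distance(0.0, 0.0, []))                    # inf
```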


In [41]:
df_roi_locations = pd.DataFrame({'Latitude':roi_latitudes,
                                 'Longitude':roi_longitudes,
                                 'X':roi_xs,
                                 'Y':roi_ys,
                                 'Restaurants nearby':roi_restaurant_counts,
                                 'Distance to Chinese restaurant':roi_chinese_distances})

df_roi_locations.head(10)

Unnamed: 0,Latitude,Longitude,X,Y,Restaurants nearby,Distance to Chinese restaurant
0,34.213612,-118.857426,-3887668.0,15066960.0,0,396.59164
1,34.213978,-118.858177,-3887568.0,15066960.0,0,297.671144
2,34.211058,-118.853676,-3888218.0,15067040.0,0,944.080872
3,34.211424,-118.854428,-3888118.0,15067040.0,0,844.163416
4,34.211791,-118.855179,-3888018.0,15067040.0,0,744.268133
5,34.212157,-118.85593,-3887918.0,15067040.0,0,644.405334
6,34.212523,-118.856681,-3887818.0,15067040.0,0,544.59289
7,34.212889,-118.857432,-3887718.0,15067040.0,0,444.864694
8,34.213255,-118.858184,-3887618.0,15067040.0,0,345.293755
9,34.213621,-118.858935,-3887518.0,15067040.0,2,246.071043


In [42]:
good_res_count = np.array((df_roi_locations['Restaurants nearby']<=3))
print('Locations with no more than 3 restaurants nearby:', good_res_count.sum())

good_chinese_distance = np.array(df_roi_locations['Distance to Chinese restaurant']>=800)
print('Locations with no Chinese restaurants within 800m:', good_chinese_distance.sum())

good_locations = np.logical_and(good_res_count, good_chinese_distance)
print('Locations with both conditions met:', good_locations.sum())

df_good_locations = df_roi_locations[good_locations]

Locations with no more than 3 restaurants nearby: 2227
Locations with no Chinese restaurants within 800m: 1819
Locations with both conditions met: 1795
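
Thresholds in a filter and the numbers quoted in its printed labels are easy to let drift apart. One way to keep them in sync is to define each threshold once and format it into the message. The sketch below assumes the thresholds 3 and 800 from the comparisons above; the constant names and the three sample rows are hypothetical stand-ins for `df_roi_locations`.

```python
MAX_RESTAURANTS_NEARBY = 3      # same value as the comparison above
MIN_CHINESE_DISTANCE_M = 800    # same value as the comparison above

# Hypothetical rows standing in for df_roi_locations
rows = [
    {'Restaurants nearby': 0, 'Distance to Chinese restaurant': 944.1},
    {'Restaurants nearby': 2, 'Distance to Chinese restaurant': 246.1},
    {'Restaurants nearby': 5, 'Distance to Chinese restaurant': 1200.0},
]

good = [r for r in rows
        if r['Restaurants nearby'] <= MAX_RESTAURANTS_NEARBY
        and r['Distance to Chinese restaurant'] >= MIN_CHINESE_DISTANCE_M]

print('Locations with no more than {} restaurants nearby and no Chinese '
      'restaurant within {}m: {}'.format(
          MAX_RESTAURANTS_NEARBY, MIN_CHINESE_DISTANCE_M, len(good)))
```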


In [44]:
good_latitudes = df_good_locations['Latitude'].values
good_longitudes = df_good_locations['Longitude'].values
from sklearn.cluster import KMeans

number_of_clusters = 15

good_xys = df_good_locations[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_thousand_oaks = folium.Map(location=roi_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_thousand_oaks)
HeatMap(restaurant_latlons).add_to(map_thousand_oaks)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.4).add_to(map_thousand_oaks)
folium.Marker(city_center).add_to(map_thousand_oaks)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=True, fill_opacity=0.25).add_to(map_thousand_oaks) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_thousand_oaks)
map_thousand_oaks
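
The cell above delegates the clustering to scikit-learn's `KMeans`. For readers who want to see the idea behind it, the following is a plain-Python Lloyd iteration. It is only an illustration, not scikit-learn's implementation: the function name `kmeans_sketch` is hypothetical, the initial centers are passed in explicitly instead of being chosen by k-means++, and the demo points are made up.

```python
import math

def kmeans_sketch(points, centers, iterations=20):
    """Lloyd iteration: assign each point to its nearest center, then re-average."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.hypot(p[0] - centers[i][0],
                                                   p[1] - centers[i][1]))
            groups[nearest].append(p)
        new_centers = []
        for center, group in zip(centers, groups):
            if group:
                new_centers.append((sum(x for x, _ in group) / len(group),
                                    sum(y for _, y in group) / len(group)))
            else:
                new_centers.append(center)  # keep the center of an empty cluster
        if new_centers == centers:          # converged
            break
        centers = new_centers
    return centers

# Two well-separated groups of hypothetical candidate locations (meters)
demo_xys = [(0, 0), (0, 100), (100, 0), (100, 100),
            (2000, 2000), (2000, 2100), (2100, 2000), (2100, 2100)]
print(kmeans_sketch(demo_xys, centers=[(0, 0), (2100, 2100)]))
# [(50.0, 50.0), (2050.0, 2050.0)]
```

Each returned center is the mean of the locations assigned to it, which is why the cluster centers in this notebook serve as representative points for zones of good candidate locations.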

In [47]:
candidate_area_addresses = []
print('==============================================================')
print('Addresses of centers of areas recommended for further analysis')
print('==============================================================\n')
for lon, lat in cluster_centers:
    addr = get_address(lat, lon)
    candidate_area_addresses.append(addr)    
    x, y = lonlat_to_xy(lon, lat)
    d = calc_xy_distance(x, y, city_center_x, city_center_y)
    print('{}{} => {:.1f}km from the city center'.format(addr, ' '*(50-len(addr)), d/1000))

Addresses of centers of areas recommended for further analysis

50, Doone Street, Thousand Oaks, Ventura County, California, 91360, USA
50, Doone Street, Thousand Oaks, Ventura County, California, 91360, USA => 1.2km from the city center
1030, Windsor Drive, Thousand Oaks, Ventura County, California, 91360, USA
1030, Windsor Drive, Thousand Oaks, Ventura County, California, 91360, USA => 0.8km from the city center
East Avenida de Las Flores, Thousand Oaks, Ventura County, California, 91360, USA
East Avenida de Las Flores, Thousand Oaks, Ventura County, California, 91360, USA => 3.3km from the city center
174, Siesta Avenue, Thousand Oaks, Ventura County, California, 91360, USA
174, Siesta Avenue, Thousand Oaks, Ventura County, California, 91360, USA => 3.3km from the city center
coastal sage, Mayflower Street, Thousand Oaks, Ventura County, California, 91360, USA
coastal sage, Mayflower Street, Thousand Oaks, Ventura County, California, 91360, USA => 0.8km from the city center
399, Que

In [49]:
map_thousand_oaks = folium.Map(location=roi_center, zoom_start=14)
folium.Circle(city_center, radius=50, color='red', fill=True, fill_color='red', fill_opacity=1).add_to(map_thousand_oaks)
for lonlat, addr in zip(cluster_centers, candidate_area_addresses):
    folium.Marker([lonlat[1], lonlat[0]], popup=addr).add_to(map_thousand_oaks) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_thousand_oaks)
map_thousand_oaks