# The Battle of Neighborhoods - European Youth Capital Winner - Amiens
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)



## Introduction: Business Problem <a name="introduction"></a>

**Amiens** is the **European Youth Capital winner of 2020**. With a cultural events programme combining music, cinema and theatre festivals, some of which take place in the streets, Amiens fully deserves its place as European Youth Capital. Although it is a historic city, home to the UNESCO-listed Cathédrale Notre-Dame (which marks its 800th anniversary in 2020), Amiens also has a lively youth scene thanks to the university campus **“La Citadelle”**, which is located in the centre of the town [1]. This report is addressed to students who want to find an optimal location to stay in the city of **Amiens**.

Since there are lots of venues in Amiens we will try to detect **locations crowded with venues that appeal more towards students**. We are also particularly interested in **areas that are as close to the university as possible**. So an area that satisfies both of the above statements is going to be considered as an optimal location for a student to live in.

We will use data science techniques to generate a few of the most promising neighborhoods based on these criteria. The advantages of each area will then be clearly expressed so that students interested in living in Amiens can choose the best possible final location.

1. https://www.youthforum.org/european-youth-capital-winner-amiens-2020

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* the number of existing venues in the neighborhood.
* the weighted sum of all existing venues in the neighborhood.
* the distance of each neighborhood from the university as well as from the city center.
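
The weighted sum in the second factor can be sketched as follows. This is only an illustration: the weights in `CATEGORY_WEIGHTS` and the function name `weighted_venue_score` are hypothetical and are not taken from the analysis itself.

```python
# Hypothetical weights: how much each venue category might matter to a student.
CATEGORY_WEIGHTS = {
    "Food": 1.0,
    "Arts & Entertainment": 0.5,
    "Nightlife Spot": 0.75,
    "Outdoors & Recreation": 0.5,
}


def weighted_venue_score(counts, weights=CATEGORY_WEIGHTS):
    """Return the weighted sum of per-category venue counts for one neighborhood."""
    return sum(weights.get(category, 0.0) * n for category, n in counts.items())


# Venue counts for one example neighborhood (4 food places, 1 nightlife spot).
example_counts = {
    "Food": 4,
    "Arts & Entertainment": 0,
    "Nightlife Spot": 1,
    "Outdoors & Recreation": 0,
}
score = weighted_venue_score(example_counts)
print(score)
```

Categories missing from the weight table contribute nothing, so the weights can be tuned without touching the counting code.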

We decided to use a regularly spaced grid of locations, centered on the city center, to define our neighborhoods.

The following data sources will be needed to extract or generate the required information:
* the centers of candidate areas will be generated algorithmically, and approximate addresses of those centers will be obtained using **Google Maps API reverse geocoding**
* the number of venues in every neighborhood, along with their type and location, will be obtained using the **Foursquare API**
* the coordinates of Ctre Ville, which we are going to use as our center point, will be obtained using **Google Maps API geocoding**
* the coordinates of Université de Picardie Jules Verne will be obtained the same way, in order to find the distance of each location from the university.

## Methodology <a name="methodology"></a>

### Neighborhood Candidates

Let's create latitude & longitude coordinates for the centroids of our candidate neighborhoods. We will create a grid of cells covering our area of interest, which extends approx. 5 kilometers in every direction from Ctre Ville.

Let's first find the latitude & longitude of Ctre Ville and of Université de Picardie Jules Verne using the Google Maps geocoding API.

In [1]:
from getkeys import get_gkey, get_fsqkey
google_api_key = get_gkey()
client_id , client_secret = get_fsqkey()

In [2]:
import requests

def get_coordinates(api_key, address, verbose=False):
    """Geocode an address; return [lat, lon], or [None, None] on any failure."""
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json'
        # Pass key and address as params so that requests URL-encodes them
        response = requests.get(url, params={'key': api_key, 'address': address}).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        geographical_data = results[0]['geometry']['location'] # get geographical coordinates
        lat = geographical_data['lat']
        lon = geographical_data['lng']
        return [lat, lon]
    except Exception:
        return [None, None]
    
address = 'Ctre Ville, Amiens, France'
amiens_center = get_coordinates(google_api_key, address)
latitude = amiens_center[0]
longitude = amiens_center[1]
print(f'The geographical coordinates of {address} are {amiens_center}.')

The geographical coordinates of Ctre Ville, Amiens, France are [49.892168, 2.2994263].


In [3]:
address_ = 'Université de Picardie Jules Verne'
amiens_university = get_coordinates(google_api_key,address_)
latitude_ = amiens_university[0]
longitude_ = amiens_university[1]
print(f'The geographical coordinates of {address_} are {amiens_university}.')

The geographical coordinates of Université de Picardie Jules Verne are [49.8761492, 2.2655255].
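
Addresses such as "Université de Picardie Jules Verne" contain spaces and accented characters, so it helps to see how a geocoding request can be encoded and how the result is read. The following is a minimal sketch, not the notebook's own code: `build_geocode_url` and `parse_location` are hypothetical helper names, and `sample_payload` is a hand-written stand-in that only mimics the fields the geocoding helper above reads.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(api_key, address):
    """Build a geocoding URL with the key and address safely percent-encoded."""
    return f"{GEOCODE_ENDPOINT}?{urlencode({'key': api_key, 'address': address})}"


def parse_location(payload):
    """Return [lat, lon] from a decoded response, or [None, None] if it has no results."""
    results = payload.get("results") or []
    if not results:
        return [None, None]
    location = results[0]["geometry"]["location"]
    return [location["lat"], location["lng"]]


# Hand-written payload (an assumption), shaped like the fields read above.
sample_payload = {
    "results": [{"geometry": {"location": {"lat": 49.892168, "lng": 2.2994263}}}]
}

url = build_geocode_url("YOUR_API_KEY", "Université de Picardie Jules Verne")
coords = parse_location(sample_payload)
empty = parse_location({"results": []})
print(url)
print(coords, empty)
```

Separating URL construction from response parsing also makes both parts testable without network access or an API key.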


Now let's create a grid of equally spaced area candidates, centered on the city center and within ~5km of Ctre Ville. Our neighborhoods will be defined as circular areas with a radius of 300 meters, so our neighborhood centers will be 600 meters apart.

To calculate distances accurately, we need to create our grid of locations in a 2D Cartesian coordinate system, which lets us calculate distances in meters (not in latitude/longitude degrees). We will then project those coordinates back to latitude/longitude degrees so they can be shown on a Folium map. So let's create functions to convert between the WGS84 spherical coordinate system (latitude/longitude degrees) and the UTM Cartesian coordinate system (X/Y coordinates in meters).

In [53]:
import shapely.geometry

import pyproj

import math

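# NOTE: pyproj.transform is deprecated in recent pyproj releases in favor of pyproj.Transformer.
# NOTE: Amiens (longitude ~2.3°E) lies in UTM zone 31. Zone 33 is kept here because the
# X/Y values recorded in the outputs below were computed with it; expect some distortion.
def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    # Euclidean distance in meters between two projected points
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)


# University position in UTM X/Y (meters); despite their names, latt and lonn are not degrees
latt,lonn = lonlat_to_xy(longitude_,latitude_)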
print('\n\t\t\t-------------------------------')
print('\t\t\tCoordinate transformation check')
print('\t\t\t-------------------------------\n')
print(f'Université de Picardie Jules Verne longitude={longitude_}, latitude={latitude_}.')
x, y = lonlat_to_xy(longitude_, latitude_)
print(f'Université de Picardie Jules Verne UTM X={x}, Y={y}')
lo, la = xy_to_lonlat(x, y)
print(f'Université de Picardie Jules Verne longitude={lo}, latitude={la}.')


			-------------------------------
			Coordinate transformation check
			-------------------------------

Université de Picardie Jules Verne longitude=2.2655255, latitude=49.8761492.
Université de Picardie Jules Verne UTM X=-413657.6949054776, Y=5603090.73454994
Université de Picardie Jules Verne longitude=2.2655255000000016, latitude=49.8761492.
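
As an independent sanity check on projected distances, the great-circle distance between two latitude/longitude points can be computed directly with the haversine formula. The sketch below is not part of the original pipeline; `haversine_m` is a hypothetical helper and the spherical Earth radius is an approximation. For the two points geocoded above it gives a distance of about 3 km.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


center = (49.892168, 2.2994263)       # Ctre Ville, as geocoded above
university = (49.8761492, 2.2655255)  # Université de Picardie Jules Verne
d = haversine_m(*center, *university)
print(f"{d:.0f} m between Ctre Ville and the university")
```

If a projected distance differs noticeably from this value, the projection settings (for example the UTM zone) are worth a second look.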


Let's create a **hexagonal grid of cells**: we offset every other row and adjust the vertical row spacing so that **every cell center is equally distant from all its neighbors**.

In [5]:
amiens_center_x, amiens_center_y = lonlat_to_xy(longitude, latitude) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = amiens_center_x - 5000
x_step = 600
y_min = amiens_center_y - 5000 - (int(21/k)*k*600 - 10000)/2
y_step = 600 * k 

latitudes = []
longitudes = []
distances_from_center = []
distances_from_university = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0  # shift every other row by half a step
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(amiens_center_x, amiens_center_y, x, y)
        distance_from_university = calc_xy_distance(latt, lonn, x, y)
        if (distance_from_center <= 5001):  # keep only cells within ~5km of the center
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            distances_from_university.append(distance_from_university)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

251 candidate neighborhood centers generated.
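
The equal-distance property of the hexagonal layout can be checked in isolation. The sketch below uses only the standard library; `hex_grid` is a hypothetical helper that places one cell exactly on the center, so its layout (and cell count) need not match the grid generated above.

```python
import math


def hex_grid(center_x, center_y, radius, spacing):
    """Return hexagonally packed (x, y) points within `radius` of the center."""
    k = math.sqrt(3) / 2              # row height as a fraction of the spacing
    row_step = spacing * k
    n_rows = int(radius / row_step)
    n_cols = int(radius / spacing) + 1
    points = []
    for i in range(-n_rows, n_rows + 1):
        y = center_y + i * row_step
        x_offset = spacing / 2 if i % 2 else 0.0  # shift odd rows by half a step
        for j in range(-n_cols, n_cols + 1):
            x = center_x + j * spacing + x_offset
            if math.hypot(x - center_x, y - center_y) <= radius:
                points.append((x, y))
    return points


def nearest_neighbor_distance(p, pts):
    """Distance from p to the closest other point in pts."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in pts if q != p)


points = hex_grid(0.0, 0.0, radius=5000, spacing=600)
nn = nearest_neighbor_distance((0.0, 0.0), points)
print(len(points), "points; nearest neighbor of the center cell at", round(nn, 3), "m")
```

Neighbors in the same row are one `spacing` apart, and neighbors in adjacent rows sit at half a step sideways and `spacing * sqrt(3) / 2` up or down, which also works out to exactly one `spacing`.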


Let's visualize the candidate neighborhoods we have generated so far.

In [6]:
import folium

map_amiens = folium.Map(location=[latitude,longitude], zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_amiens)
map_amiens

In [7]:
def get_address(google_api_key, latitude, longitude, verbose=False):
    """Reverse-geocode a point; return its formatted address, or None on any failure."""
    try:
        url = f'https://maps.googleapis.com/maps/api/geocode/json?key={google_api_key}&latlng={latitude},{longitude}'
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except Exception:
        return None

addr = get_address(google_api_key, amiens_center[0], amiens_center[1])

Now let's create a Pandas DataFrame with the above data and save it with pickle for later use.

In [8]:
import pickle
import pandas as pd

loaded = False
try:
    with open('locations.pkl', 'rb') as f:
        df_locations = pickle.load(f)
        addresses = [address for address in df_locations['Address']]
        loaded = True
except Exception:
    # No usable cache yet; fall through and query the API
    pass

if not loaded:
    print('Obtaining location addresses: ', end='')
    addresses = []
    for lat, lon in zip(latitudes, longitudes):
        address = get_address(google_api_key,lat, lon)
        if address is None or 'amien' not in address.lower():
            address = 'NO ADDRESS'
        address = address.replace(', France', '')
        addresses.append(address)
        print(' .', end='')
    print(' done.')

    df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center,
                             'Distance from University': distances_from_university})
    df_locations.to_pickle('./locations.pkl')   
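
The load-or-query pattern used in this cell can be factored into a small helper. This is a sketch under stated assumptions: `load_or_compute` and `expensive_lookup` are hypothetical names, and a temporary directory stands in for the notebook's working directory.

```python
import os
import pickle
import tempfile


def load_or_compute(path, compute):
    """Return the object pickled at `path`; if it is missing or unreadable, compute and cache it."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (FileNotFoundError, EOFError, pickle.UnpicklingError):
        result = compute()
        with open(path, "wb") as f:
            pickle.dump(result, f)
        return result


calls = []


def expensive_lookup():
    """Stand-in for a slow API call; records how often it actually runs."""
    calls.append(1)
    return ["24 Rue Duminy, 80000 Amiens"]


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "addresses.pkl")
    first = load_or_compute(cache_path, expensive_lookup)   # computes and writes the cache
    second = load_or_compute(cache_path, expensive_lookup)  # served from the cache

print(first == second, len(calls))
```

Catching only the errors that indicate a missing or corrupt cache keeps genuine bugs in the compute step visible instead of silently hiding them.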

Let's print some addresses

In [9]:
df_locations['Address'][101:110]

101                                          NO ADDRESS
102                                          NO ADDRESS
103    291 Grande Rue du Petit Saint-Jean, 80480 Amiens
104                      661 Rue de Rouen, 80000 Amiens
105                      447 Rue de Rouen, 80000 Amiens
106                      86 Rue du Bellay, 80000 Amiens
107                        20 Rue Dheilly, 80000 Amiens
108                         24 Rue Duminy, 80000 Amiens
109             21 Rue de la Contrescarpe, 80000 Amiens
Name: Address, dtype: object

Some locations were given the placeholder address `NO ADDRESS` (no address was found, or the address is not in Amiens); these need to be dropped.

In [10]:
df_locations = df_locations[df_locations['Address'] != 'NO ADDRESS'].copy()  # copy so later column assignments act on an independent frame

Let's see our DataFrame

In [11]:
df_locations.head(10)

| | Address | Latitude | Longitude | X | Y | Distance from center | Distance from University |
|---|---|---|---|---|---|---|---|
| 11 | A29, 80090 Amiens | 49.856752 | 2.318649 | -410227.17343 | 5600288.0 | 4215.447782 | 4429.634107 |
| 12 | Rue Wasse, 80090 Amiens | 49.857657 | 2.326792 | -409627.17343 | 5600288.0 | 4355.456348 | 4909.000315 |
| 16 | 31A Route d'Amiens, 80480 Dury | 49.856321 | 2.272649 | -413527.17343 | 5600808.0 | 4471.017781 | 2286.465879 |
| 20 | Le montjoie, Rue Saint-Fuscien, 80090 Amiens | 49.859952 | 2.30522 | -411127.17343 | 5600808.0 | 3642.80112 | 3407.994908 |
| 21 | D7, 80680 Amiens | 49.860858 | 2.313363 | -410527.17343 | 5600808.0 | 3659.234893 | 3874.410286 |
| 23 | 40 Rue Wasse, 80090 Amiens | 49.862669 | 2.329652 | -409327.17343 | 5600808.0 | 3973.663297 | 4895.335188 |
| 24 | 9 Rue de Montréal, 80090 Amiens | 49.863574 | 2.337797 | -408727.17343 | 5600808.0 | 4250.882261 | 5433.316885 |
| 29 | 91 Route d'Amiens, 80480 Dury | 49.861334 | 2.275503 | -413227.17343 | 5601328.0 | 3874.274126 | 1814.923891 |
| 32 | 23 Clos des Châtaigniers, 80090 Amiens | 49.864058 | 2.299933 | -411427.17343 | 5601328.0 | 3157.530681 | 2843.207 |
| 33 | 3 Allée du Montjoie, 80090 Amiens | 49.864964 | 2.308077 | -410827.17343 | 5601328.0 | 3119.294792 | 3334.734144 |


In [12]:
map_amiens = folium.Map(location=[latitude,longitude], zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(df_locations['Latitude'],df_locations['Longitude']):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_amiens)
map_amiens


### Foursquare
Now that we have our location candidates, let's use the Foursquare API to get info on existing venues in each of these categories:
* Arts & Entertainment
* Food
* Nightlife Spot
* Outdoors & Recreation

We're interested in venues in all four categories; each one is queried separately using its Foursquare category ID.

In [14]:
adict = {"Arts & Entertainment" : '4d4b7104d754a06370d81259',
"Food" : '4d4b7105d754a06374d81259',
"Nightlife Spot" : '4d4b7105d754a06376d81259',
"Outdoors & Recreation" : '4d4b7105d754a06377d81259',}

In [15]:
import json, requests

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', France', '')
    return address

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=400, limit=50):
    # category is a (name, Foursquare category ID) pair, e.g. one item of adict.items()
    url = 'https://api.foursquare.com/v2/venues/explore'

    params = dict(
        client_id=client_id,
        client_secret=client_secret,
        v='20200505',
        ll=f'{lat},{lon}',
        limit=limit,
        radius=radius,
        categoryId=category[1]
    )
    resp = requests.get(url=url, params=params)
    try:
        data = json.loads(resp.text)
        results = data['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   category[0],
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]
    except (KeyError, IndexError, TypeError, ValueError):
        # Malformed or empty response: treat it as "no venues found"
        venues = []
    return venues
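
The nested structure that this function reads (`response` → `groups` → `items` → `venue`) can be exercised offline. In the sketch below, `sample` is a hand-written, hypothetical payload that only mimics the fields used above, and `extract_venues` is an illustrative helper rather than the notebook's function.

```python
def extract_venues(data, category_name, max_distance=350):
    """Return (id, name, category, (lat, lng), distance) for venues within max_distance meters."""
    groups = data.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    venues = []
    for item in items:
        venue = item["venue"]
        loc = venue["location"]
        if loc["distance"] <= max_distance:
            venues.append((venue["id"], venue["name"], category_name,
                           (loc["lat"], loc["lng"]), loc["distance"]))
    return venues


# Hypothetical payload: one venue inside the 350 m cut-off and one outside it.
sample = {"response": {"groups": [{"items": [
    {"venue": {"id": "a1", "name": "Cafe A",
               "location": {"lat": 49.89, "lng": 2.30, "distance": 120}}},
    {"venue": {"id": "b2", "name": "Bar B",
               "location": {"lat": 49.90, "lng": 2.31, "distance": 390}}},
]}]}}

nearby = extract_venues(sample, "Food")
nothing = extract_venues({"meta": {"code": 400}}, "Food")
print(nearby)
print(nothing)
```

Using `.get` with defaults for the outer levels means an error response yields an empty list without needing a broad `except` clause.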

In [16]:
import pickle

def get_venues(lats, lons):
    ven = {}
    # For each category: one list of nearby venues per candidate location, in input order
    vens_by_category = {name: [] for name in adict}

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        for cat in adict.items():
            venues = get_venues_near_location(lat, lon, cat, client_id, client_secret)
            area_vens = []
            for venue in venues:
                venue_id, venue_name, venue_categorie, venue_latlon, venue_address, venue_distance = venue
                # Project the venue's own position to UTM X/Y
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                avenue = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, x, y, venue_categorie)
                if venue_distance<=350:
                    area_vens.append(avenue)
                ven[venue_id] = avenue
            vens_by_category[cat[0]].append(area_vens)
        print(' .', end='')
    print(' done.')
    return (ven,
            vens_by_category['Food'],
            vens_by_category['Arts & Entertainment'],
            vens_by_category['Nightlife Spot'],
            vens_by_category['Outdoors & Recreation'])

loaded = False
try:
    with open('venues.pkl', 'rb') as f:
        some_venues = pickle.load(f)
    with open('food.pkl', 'rb') as f:
        food_vens = pickle.load(f)
    with open('arts.pkl', 'rb') as f:
        arts_vens = pickle.load(f)
    with open('nightlife.pkl', 'rb') as f:
        night_vens = pickle.load(f)
    with open('outdoor.pkl', 'rb') as f:
        outdoor_vens = pickle.load(f)
    print('Venue data around neighborhoods loaded.')
    loaded = True
except:
    pass

# If load failed use the Foursquare API to get the data
if not loaded:
    some_venues, food_vens, arts_vens, night_vens, outdoor_vens = get_venues(df_locations['Latitude'],df_locations['Longitude'])    
    #Let's persists this in local file system
    with open('venues.pkl', 'wb') as f:
        pickle.dump(some_venues, f)
    with open('food.pkl', 'wb') as f:
        pickle.dump(food_vens, f)
    with open('arts.pkl', 'wb') as f:
        pickle.dump(arts_vens, f)
    with open('nightlife.pkl', 'wb') as f:
        pickle.dump(night_vens, f)
    with open('outdoor.pkl', 'wb') as f:
        pickle.dump(outdoor_vens, f)
# some_venues, food_vens, arts_vens, events_vens, night_vens, outdoor_vens = get_venues(df_locations['Latitude'],df_locations['Longitude'])

Venue data around neighborhoods loaded.


In [17]:
map_amiens = folium.Map(location=amiens_center, zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for ven in some_venues.values():
    lat = ven[2]; lon = ven[3]
    cat = ven[8]
    if cat =='Food':
        color = 'red'
    elif cat == 'Arts & Entertainment':
        color = 'green'
    elif cat == 'Nightlife Spot':
        color = 'blue'
    else:
        color = 'yellow'
    # color = 'red' if cat =='Food' 'orange' elif cat == 'Event' else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_amiens)
map_amiens

In [18]:
import numpy as np

food_count = [len(i) for i in food_vens]
arts_count = [len(i) for i in arts_vens]
nightlife_count = [len(i) for i in night_vens]
outdoor_count = [len(i) for i in outdoor_vens]

df_locations['Food Places in Area'] = food_count
df_locations['Arts & Entertainment in Area'] = arts_count
df_locations['Nightlife Spots in Area'] = nightlife_count
df_locations['Outdoors & Recreation in Area'] = outdoor_count
df_locations.reset_index(inplace=True)
# df_locations.drop(columns='level_0',inplace=True)
df_locations[10:15]

| | index | Address | Latitude | Longitude | X | Y | Distance from center | Distance from University | Food Places in Area | Arts & Entertainment in Area | Nightlife Spots in Area | Outdoors & Recreation in Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 34 | Espace mistral, 13 Rue de Redon, 80090 Amiens | 49.865871 | 2.316222 | -410227.17343 | 5601328.0 | 3195.309062 | 3857.081485 | 0 | 0 | 0 | 0 |
| 11 | 35 | 6 Rue Soufflot, 80090 Amiens | 49.866776 | 2.324367 | -409627.17343 | 5601328.0 | 3377.869151 | 4399.284414 | 0 | 0 | 0 | 1 |
| 12 | 36 | 722 Rue de Cagny, 80090 Amiens | 49.867681 | 2.332512 | -409027.17343 | 5601328.0 | 3651.027253 | 4954.828869 | 0 | 0 | 0 | 0 |
| 13 | 37 | 105 Marais de Cagny, 80090 Amiens | 49.868586 | 2.340658 | -408427.17343 | 5601328.0 | 3996.248241 | 5519.687935 | 0 | 0 | 0 | 0 |
| 14 | 43 | 116 Route d'Amiens, 80480 Dury | 49.866347 | 2.278357 | -412927.17343 | 5601847.0 | 3278.719262 | 1442.210543 | 4 | 0 | 1 | 0 |


In [19]:
venues_latlons = [[ven[2], ven[3]] for ven in some_venues.values()]

Let's visualize the data with a HeatMap

In [20]:
from folium import plugins
from folium.plugins import HeatMap

map_amiens = folium.Map(location=amiens_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_amiens) #cartodbpositron cartodbdark_matter
HeatMap(venues_latlons).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
folium.Circle(amiens_center, radius=1000, fill=False, color='white').add_to(map_amiens)
folium.Circle(amiens_center, radius=2000, fill=False, color='white').add_to(map_amiens)
folium.Circle(amiens_center, radius=3000, fill=False, color='white').add_to(map_amiens)
map_amiens

In [21]:
# df_locations = df_locations[df_locations['Food Places in Area']+df_locations['Arts & Entertainment in Area']+df_locations['Nightlife Spots in Area']+df_locations['Outdoors & Recreation in Area'] > 2]
# df_locations.head()

In [22]:
# df_locations['Food Places in Area'] = df_locations['Food Places in Area']/ np.asarray(df_locations['Food Places in Area']).max()
# df_locations['Arts & Entertainment in Area'] = df_locations['Arts & Entertainment in Area'] / np.asarray(df_locations['Arts & Entertainment in Area']).max()
# df_locations['Nightlife Spots in Area'] = df_locations['Nightlife Spots in Area'] / np.asarray(df_locations['Nightlife Spots in Area']).max()
# df_locations['Outdoors & Recreation in Area'] = df_locations['Outdoors & Recreation in Area'] / np.asarray(df_locations['Outdoors & Recreation in Area']).max()
# df_locations['Distance from center'] = df_locations['Distance from center'] / np.asarray(df_locations['Distance from center']).max()

# df_locations[10:15]

Let's take a look at these locations on the map

In [23]:
map_amiens = folium.Map(location=[latitude,longitude], zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(df_locations['Latitude'],df_locations['Longitude']):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_amiens)
map_amiens

In [24]:
roi_x_min = amiens_center_x - 2000
roi_y_max = amiens_center_y + 1000
roi_width = 5000
roi_height = 5000
roi_center_x = roi_x_min + 2500
roi_center_y = roi_y_max - 2500
roi_center_lon, roi_center_lat = xy_to_lonlat(roi_center_x, roi_center_y)
roi_center = [roi_center_lat, roi_center_lon]

In [25]:
k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_step = 100
y_step = 100 * k 
roi_y_min = roi_center_y - 2500

roi_latitudes = []
roi_longitudes = []
roi_xs = []
roi_ys = []
for i in range(0, int(51/k)):
    y = roi_y_min + i * y_step
    x_offset = 50 if i%2==0 else 0
    for j in range(0, 51):
        x = roi_x_min + j * x_step + x_offset
        d = calc_xy_distance(roi_center_x, roi_center_y, x, y)
        if (d <= 2501):
            lon, lat = xy_to_lonlat(x, y)
            roi_latitudes.append(lat)
            roi_longitudes.append(lon)
            roi_xs.append(x)
            roi_ys.append(y)

print(len(roi_latitudes), 'candidate neighborhood centers generated.')

2261 candidate neighborhood centers generated.


In [26]:
def count_(x, y, df_locations, radius=250):
    # Take over the per-category venue counts of a coarse grid cell whose center lies
    # within `radius` meters of (x, y). Counts are copied, not summed, so if several
    # cells are in range, the last one with a non-zero count wins.
    food_count_ = 0
    art_count_ = 0
    night_count_ = 0
    out_count_ = 0
    for index,ven in df_locations.iterrows():
        d = calc_xy_distance(x, y, ven['X'], ven['Y'])
        if d > radius:
            continue
        if ven['Food Places in Area'] > 0:
            food_count_ = ven['Food Places in Area']
        if ven['Arts & Entertainment in Area'] > 0:
            art_count_ = ven['Arts & Entertainment in Area']
        if ven['Nightlife Spots in Area'] > 0:
            night_count_ = ven['Nightlife Spots in Area']
        if ven['Outdoors & Recreation in Area'] > 0:
            out_count_ = ven['Outdoors & Recreation in Area']
    return food_count_,art_count_,night_count_,out_count_

def calc_distance_(x, y,latt,lonn):
    d = calc_xy_distance(x, y, latt, lonn)
    return d
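
One way to make the lookup above independent of row order is to pick the single nearest coarse cell explicitly. The sketch below illustrates that alternative with the standard library only; it is not the notebook's method, and `nearest_cell_counts`, `CATEGORIES` and the sample `cells` are hypothetical.

```python
import math

CATEGORIES = ("Food", "Arts", "Nightlife", "Outdoors")


def nearest_cell_counts(x, y, cells, radius=300.0):
    """Return the venue counts of the nearest cell within `radius` meters, or zeros if none is in range.

    `cells` is a list of (cell_x, cell_y, counts_dict) tuples in projected meters.
    """
    best_counts = {c: 0 for c in CATEGORIES}
    best_distance = radius
    for cell_x, cell_y, counts in cells:
        d = math.hypot(cell_x - x, cell_y - y)
        if d <= best_distance:
            best_distance = d
            best_counts = {c: counts.get(c, 0) for c in CATEGORIES}
    return best_counts


# Two coarse cells 600 m apart, as in the neighborhood grid.
cells = [
    (0.0, 0.0, {"Food": 4, "Nightlife": 1}),
    (600.0, 0.0, {"Food": 2, "Arts": 3}),
]
near_first = nearest_cell_counts(100.0, 0.0, cells)
near_second = nearest_cell_counts(400.0, 0.0, cells)
far_away = nearest_cell_counts(300.0, 2000.0, cells)
print(near_first, near_second, far_away)
```

With cell centers 600 m apart and a 300 m radius, most points have at most one cell in range, so the two approaches should only differ for points lying on a boundary between cells.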




In [37]:
food_venues_counts = []
art_venues_counts = []
night_venues_counts = []
out_venues_counts = []
roi_cen_dist = []
roi_uni_dist=[]
print('Generating data on location candidates........ ', end='')

loaded = False
try:
    with open('roi_locations.pkl', 'rb') as f:
        df_roi_locations= pickle.load(f)
        loaded = True
except:
    pass

if loaded == False:
    

    for x, y in zip(roi_xs, roi_ys):
        food_count_,art_count_,night_count_,out_count_ = count_(x, y, df_locations, radius=300)
        food_venues_counts.append(food_count_)
        art_venues_counts.append(art_count_)
        night_venues_counts.append(night_count_)
        out_venues_counts.append(out_count_)
        distance_ = calc_distance_(x,y,latt,lonn)
        roi_uni_dist.append(distance_)
        distance = calc_distance_(x,y,amiens_center_x,amiens_center_y)
        roi_cen_dist.append(distance)
    print('done.')
    # Let's put this into dataframe
    df_roi_locations = pd.DataFrame({'Latitude':roi_latitudes,
                                    'Longitude':roi_longitudes,
                                    'X':roi_xs,
                                    'Y':roi_ys,
                                    'Food Venues nearby':food_venues_counts,
                                    'Arts venues nearby':art_venues_counts,
                                    'Nightlife venues nearby':night_venues_counts,
                                    'Outdoor & Recreation venues nearby':out_venues_counts,
                                    'Distance to the University':roi_uni_dist,
                                    'Distance to the Center':roi_cen_dist,
                                    })
    df_roi_locations.to_pickle('./roi_locations.pkl')  

Generating data on location candidates........done.


In [39]:
df_roi_locations.head(10)

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49.857751 | 2.314889 | -410477.17343 | 5600445.0 | 0 | 0 | 0 | 0 | 4136.909576 | 4025.232913 |
| 1 | 49.857902 | 2.316246 | -410377.17343 | 5600445.0 | 0 | 0 | 0 | 0 | 4214.276347 | 4037.635447 |
| 2 | 49.85768 | 2.307222 | -411027.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3669.774539 | 3914.674913 |
| 3 | 49.857832 | 2.308579 | -410927.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3742.10495 | 3913.39746 |
| 4 | 49.857983 | 2.309937 | -410827.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3815.685227 | 3914.674913 |
| 5 | 49.858134 | 2.311294 | -410727.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3890.444454 | 3918.504776 |
| 6 | 49.858285 | 2.312651 | -410627.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3966.315966 | 3924.879575 |
| 7 | 49.858436 | 2.314008 | -410527.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 4043.237149 | 3933.786938 |
| 8 | 49.858587 | 2.315366 | -410427.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 4121.149225 | 3945.209713 |
| 9 | 49.858738 | 2.316723 | -410327.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 4199.997051 | 3959.126125 |


In [40]:
good_food_count = np.array((df_roi_locations['Food Venues nearby']>=1))
good_art_count = np.array((df_roi_locations['Arts venues nearby']>=1))
good_night_count = np.array((df_roi_locations['Nightlife venues nearby']>=1))
good_out_count = np.array((df_roi_locations['Outdoor & Recreation venues nearby']>=1))
good_uni_distance = np.array(df_roi_locations['Distance to the University']<=4000)
good_center_distance = np.array(df_roi_locations['Distance to the Center']<=4000)

# A location qualifies only if all six conditions hold
good_locations = np.logical_and.reduce([good_food_count, good_art_count, good_night_count,
                                        good_out_count, good_uni_distance, good_center_distance])
print('Locations with all conditions met:', good_locations.sum())

df_good_locations = df_roi_locations[good_locations]

Locations with all conditions met: 182


In [41]:
good_latitudes = df_good_locations['Latitude'].values
good_longitudes = df_good_locations['Longitude'].values

good_locations = [[lat, lon] for lat, lon in zip(good_latitudes, good_longitudes)]

map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_amiens)
HeatMap(venues_latlons).add_to(map_amiens)
folium.Circle(amiens_center, radius=2500, color='white', fill=True, fill_opacity=0.6).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens) 
# folium.GeoJson(amiens_boroughs, style_function=boroughs_style, name='geojson').add_to(map_amiens)
map_amiens

In [42]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
HeatMap(good_locations, radius=25).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
# folium.GeoJson(amiens_boroughs, style_function=boroughs_style, name='geojson').add_to(map_amiens)
map_amiens

In [43]:
from sklearn.cluster import KMeans

number_of_clusters = 8

good_xys = df_good_locations[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_amiens)
HeatMap(venues_latlons).add_to(map_amiens)
folium.Circle(amiens_center, radius=2000, color='white', fill=True, fill_opacity=0.4).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=True, fill_opacity=0.25).add_to(map_amiens) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
# folium.GeoJson(amiens_boroughs, style_function=boroughs_style, name='geojson').add_to(map_amiens)
map_amiens
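
The clustering above relies on scikit-learn's `KMeans`. To illustrate what such a clustering does, here is a minimal pure-Python sketch of Lloyd's iteration on 2D points. It is a teaching aid under simplifying assumptions (fixed, hand-picked initial centers and hypothetical sample points), not a replacement for the library call.

```python
import math


def kmeans(points, initial_centers, iterations=20):
    """Run Lloyd's algorithm on 2D points; return (centers, labels)."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: math.hypot(p[0] - centers[k][0], p[1] - centers[k][1]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, labels


# Two well-separated groups of projected (X, Y) points, in meters.
points = [(0, 0), (100, 0), (0, 100), (5000, 5000), (5100, 5000), (5000, 5100)]
centers, labels = kmeans(points, initial_centers=[(0, 0), (5000, 5000)])
print(centers)
print(labels)
```

Running the clustering on projected X/Y coordinates, as the notebook does, means the Euclidean distances in the assignment step are distances in meters.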

In [44]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#00000000', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_amiens) 
# folium.GeoJson(amiens_boroughs, style_function=boroughs_style, name='geojson').add_to(map_amiens)
map_amiens

In [45]:
candidate_area_addresses = []
print('\n\t==============================================================')
print('\tAddresses of centers of areas recommended for further analysis')
print('\t==============================================================\n')
for lon, lat in cluster_centers:
    # Fall back to a placeholder if reverse geocoding returns None
    addr = (get_address(google_api_key, lat, lon) or 'NO ADDRESS').replace(', France', '')
    candidate_area_addresses.append(addr)
    x, y = lonlat_to_xy(lon, lat)
    d = calc_xy_distance(x, y, latt, lonn)  # distance to the university (UTM meters)
    d_ = calc_xy_distance(x, y, amiens_center_x, amiens_center_y)  # distance to Ctre Ville
    print(f'{addr} => {d/1000:.2f}km from Université de Picardie Jules Verne')
    print(f'{addr} => {d_/1000:.2f}km from Ctre Ville')


	Addresses of centers of areas recommended for further analysis

22 Rue Maurice Thédié, 80000 Amiens => 2.67km from Université de Picardie Jules Verne
22 Rue Maurice Thédié, 80000 Amiens => 0.88km from Ctre Ville
3 Rue Vincent Auriol, 80000 Amiens => 3.14km from Université de Picardie Jules Verne
3 Rue Vincent Auriol, 80000 Amiens => 0.38km from Ctre Ville
15 Rue Gaudissart, 80000 Amiens => 3.70km from Université de Picardie Jules Verne
15 Rue Gaudissart, 80000 Amiens => 0.73km from Ctre Ville
16 Rue Saint-Germain, 80000 Amiens => 3.15km from Université de Picardie Jules Verne
16 Rue Saint-Germain, 80000 Amiens => 0.57km from Ctre Ville
24 Rue Duminy, 80000 Amiens => 2.65km from Université de Picardie Jules Verne
24 Rue Duminy, 80000 Amiens => 0.57km from Ctre Ville
17 Rue des 3 Cailloux, 80000 Amiens => 3.08km from Université de Picardie Jules Verne
17 Rue des 3 Cailloux, 80000 Amiens => 0.06km from Ctre Ville
66 Rue du Hocquet, 80000 Amiens => 3.50km from Université de Picardie Jule

In [46]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lonlat, addr in zip(cluster_centers, candidate_area_addresses):
    folium.Marker([lonlat[1], lonlat[0]], popup=addr).add_to(map_amiens) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_amiens)
map_amiens

In [47]:
#todo add results and ways to improve this report