# The Battle of Neighborhoods - European Youth Capital Winner - Amiens
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Data Analysis](#analysis)
* [Results and Discussion](#results)
* [Future directions](#future)



## Introduction: Business Problem <a name="introduction"></a>

**Amiens** is the **European Youth Capital winner of 2020**. With a cultural events programme combining music, cinema and theatre festivals, some of which take place in the streets, Amiens fully deserves its place as European Youth Capital. Although it is a historical city, home to the UNESCO-listed Cathedrale Notre Dame (which marks its 800th anniversary in 2020), Amiens also has a wonderful youth dynamic thanks to the university **“La Citadelle”**, which is located in the centre of the town [1]. This report is addressed to students who want to find an optimal location to stay in the city of **Amiens**.

Since there are lots of venues in Amiens, we will try to detect **locations crowded with venues that appeal to students**. We are also particularly interested in **areas that are as close to the university as possible**. An area that satisfies both criteria is considered an optimal location for a student to live in.

We will use data science techniques to generate a few of the most promising neighborhoods based on these criteria. The advantages of each area will then be clearly expressed so that students interested in living in Amiens can choose the best possible final location.

1. https://www.youthforum.org/european-youth-capital-winner-amiens-2020

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* the number of existing venues in the neighborhood
* a weighted sum of all existing venues in the neighborhood
* the distance of each neighborhood from the university as well as from the city center
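
The weighted sum in the second factor can be sketched as follows. The weights and the `weighted_venue_score` helper are hypothetical and chosen only for illustration; the report itself does not fix specific values.

```python
# Hypothetical weights: how much each venue category matters to a student.
CATEGORY_WEIGHTS = {
    "Food": 1.0,
    "Arts & Entertainment": 0.5,
    "Nightlife Spot": 0.75,
    "Outdoors & Recreation": 0.5,
}

def weighted_venue_score(counts, weights=CATEGORY_WEIGHTS):
    """Return the weighted sum of per-category venue counts for one neighborhood."""
    return sum(weights.get(category, 0.0) * n for category, n in counts.items())

# Illustrative counts for a single neighborhood (not real data)
example_counts = {
    "Food": 3,
    "Arts & Entertainment": 0,
    "Nightlife Spot": 2,
    "Outdoors & Recreation": 1,
}
print(weighted_venue_score(example_counts))
```

Categories missing from the weight table contribute nothing, so the score stays defined even if a new category is queried later.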

We decided to use a regularly spaced grid of locations, centered on the city center, to define our neighborhoods.

The following data sources will be needed to extract/generate the required information:
* centers of candidate areas will be generated algorithmically, and approximate addresses of those centers will be obtained using **Google Maps API reverse geocoding**
* the number of venues in every neighborhood, with their type and location, will be obtained using the **Foursquare API**
* the coordinates of Ctre Ville, which we use as our center point, will be obtained using **Google Maps API geocoding**
* the coordinates of Université de Picardie Jules Verne will be obtained the same way, in order to find the distance of each location from the university

## Methodology<a name="methodology"></a>

### Neighborhood Candidates

Let's create latitude & longitude coordinates for the centroids of our candidate neighborhoods. We will create a grid of cells covering our area of interest, which extends approx. 5 kilometers in every direction from Ctre Ville.

Let's first find the latitude & longitude of Ctre Ville and of Université de Picardie Jules Verne using the Google Maps geocoding API.
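
Addresses containing spaces or accented characters have to be percent-encoded before they go into a query string. The sketch below only builds the request URL and makes no network call; `build_geocode_url` and the placeholder key are hypothetical names used for illustration.

```python
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def build_geocode_url(api_key, address):
    """Build a geocoding request URL with the key and address percent-encoded."""
    query = urlencode({"key": api_key, "address": address})
    return f"{GEOCODE_URL}?{query}"

# "YOUR_API_KEY" is a placeholder, not a real credential
url = build_geocode_url("YOUR_API_KEY", "Université de Picardie Jules Verne")
print(url)
```

Passing a `params` dictionary to `requests.get` performs the same encoding automatically.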

In [2]:
from getkeys import get_gkey, get_fsqkey
google_api_key = get_gkey()
client_id , client_secret = get_fsqkey()

In [3]:
import requests

def get_coordinates(api_key, address, verbose=False):
    """Geocode an address with the Google Maps API; return [lat, lon] or [None, None]."""
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json'
        # Use the api_key argument and let requests URL-encode the address
        response = requests.get(url, params={'key': api_key, 'address': address}).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        geographical_data = results[0]['geometry']['location'] # get geographical coordinates
        lat = geographical_data['lat']
        lon = geographical_data['lng']
        return [lat, lon]
    except Exception:
        # Network failure, empty result list or unexpected JSON layout
        return [None, None]
    
address = 'Ctre Ville, Amiens, France'
amiens_center = get_coordinates(google_api_key, address)
latitude = amiens_center[0]
longitude = amiens_center[1]
print(f'The geographical coordinates of {address} are {amiens_center}.')

The geographical coordinates of Ctre Ville, Amiens, France are [49.892168, 2.2994263].


In [4]:
address_ = 'Université de Picardie Jules Verne'
amiens_university = get_coordinates(google_api_key, address_)
latitude_ = amiens_university[0]
longitude_ = amiens_university[1]
print(f'The geographical coordinates of {address_} are {amiens_university}.')

The geographical coordinates of Université de Picardie Jules Verne are [49.8761492, 2.2655255].


Now let's create a grid of equally spaced area candidates, centered on the city center and within ~5 km of Ctre Ville. Our neighborhoods will be defined as circular areas with a radius of 300 meters, so our neighborhood centers will be 600 meters apart.

To calculate distances accurately, we need to create our grid of locations in a 2D Cartesian coordinate system, which lets us work in meters rather than in latitude/longitude degrees. We then project those coordinates back to latitude/longitude degrees to show them on a Folium map. So let's create functions to convert between the WGS84 spherical coordinate system (latitude/longitude degrees) and the UTM Cartesian coordinate system (X/Y coordinates in meters).
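
As an independent cross-check on the projected distances, a great-circle distance can be computed directly from latitude/longitude with the haversine formula. This is an alternative to the UTM approach, not what the notebook uses, and it assumes a spherical Earth of radius 6,371 km. The same sketch shows why raw degrees cannot be treated as meters: at this latitude, a degree of longitude is much shorter than a degree of latitude.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Coordinates as returned by the geocoding cells above
center = (49.892168, 2.2994263)        # Ctre Ville
university = (49.8761492, 2.2655255)   # Université de Picardie Jules Verne
distance = haversine_m(*center, *university)
print(round(distance), "m between the city center and the university")

# Length of one degree along each axis at the latitude of Amiens
m_per_deg_lat = math.pi * EARTH_RADIUS_M / 180
m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(center[0]))
print(round(m_per_deg_lat), "m per degree of latitude")
print(round(m_per_deg_lon), "m per degree of longitude")
```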

In [13]:
import shapely.geometry

import pyproj

import math

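# NOTE: Amiens (longitude ~2.3°E) lies in UTM zone 31 (0° to 6°E). zone=33 is kept here
# because the X/Y values shown in the tables below were produced with it.
def lonlat_to_xy(lon, lat):
    """Project WGS84 longitude/latitude (degrees) to UTM X/Y (meters)."""
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    """Project UTM X/Y (meters) back to WGS84 longitude/latitude (degrees)."""
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]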

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)


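# NOTE: despite their names, latt/lonn and latt_/lonn_ hold projected X/Y values in meters
latt, lonn = lonlat_to_xy(longitude_, latitude_)  # Université de Picardie Jules Verne
# longitude__/latitude__ (Citadel University) are assumed to be set in a cell not shown here
latt_, lonn_ = lonlat_to_xy(longitude__, latitude__)  # Citadel University
# Round-trip sanity check: project to X/Y and back to longitude/latitude
x, y = lonlat_to_xy(longitude_, latitude_)
lo, la = xy_to_lonlat(x, y)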

Let's create a **hexagonal grid of cells**: we offset every other row, and adjust the vertical row spacing so that **every cell center is equally distant from all its neighbors**.
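
The equal-distance property can be checked on a tiny grid. The sketch below is a simplified stand-in for the notebook's grid code (the `hex_grid` helper is hypothetical): rows are spaced by `spacing * sqrt(3) / 2`, every other row is shifted by half the spacing, and the six nearest neighbors of the middle cell all come out at exactly the grid spacing.

```python
import math

def hex_grid(rows, cols, spacing):
    """Return (x, y) centers of a hexagonal grid with the given center spacing."""
    k = math.sqrt(3) / 2  # row height as a fraction of the spacing
    points = []
    for i in range(rows):
        x_offset = spacing / 2 if i % 2 == 0 else 0  # shift every other row
        for j in range(cols):
            points.append((j * spacing + x_offset, i * spacing * k))
    return points

points = hex_grid(rows=3, cols=3, spacing=600)
middle = points[4]  # center cell of the 3x3 block
dists = sorted(math.dist(middle, p) for p in points if p != middle)
print([round(d) for d in dists[:6]])  # the six nearest neighbors
```

With a square grid the diagonal neighbors would be about 41% farther away than the horizontal and vertical ones, which is why the rows are offset.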

In [14]:
amiens_center_x, amiens_center_y = lonlat_to_xy(longitude, latitude) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = amiens_center_x - 5000
x_step = 600
y_min = amiens_center_y - 5000 - (int(21/k)*k*600 - 10000)/2
y_step = 600 * k 

latitudes = []
longitudes = []
latitudes_ =[]
longitudes_ = []
latitudes__ =[]
longitudes__ = []
distances_from_center = []
distances_from_university=[]
distances_from_citadel=[]
xs = []
ys = []
xs_=[]
ys_=[]
xs__=[]
ys__=[]
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(amiens_center_x, amiens_center_y, x, y)
        distance_from_university = calc_xy_distance(latt,lonn,x,y)
        distance_from_citadel = calc_xy_distance(latt_,lonn_,x,y)
        if (distance_from_center <= 5001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)
            lon_, lat_ = xy_to_lonlat(x, y)
            latitudes_.append(lat_)
            longitudes_.append(lon_)
            distances_from_university.append(distance_from_university)
            xs_.append(x)
            ys_.append(y)
            lon__, lat__ = xy_to_lonlat(x, y)
            latitudes__.append(lat__)   # Citadel lists (these were appended to latitudes_ by mistake)
            longitudes__.append(lon__)
            distances_from_citadel.append(distance_from_citadel)
            xs__.append(x)
            ys__.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

251 candidate neighborhood centers generated.


Let's visualize the candidate neighborhoods we have generated so far.

In [15]:
import folium

map_amiens = folium.Map(location=[latitude,longitude], zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_amiens)
map_amiens

In [16]:
def get_address(google_api_key, latitude, longitude, verbose=False):
    try:
        url = f'https://maps.googleapis.com/maps/api/geocode/json?key={google_api_key}&latlng={latitude},{longitude}'
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except Exception:  # network failure, empty result list or unexpected JSON layout
        return None

addr = get_address(google_api_key, amiens_center[0], amiens_center[1])

Now let's create a Pandas DataFrame with the above data and save it with pickle for later use.
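
The notebook repeats the same caching pattern several times: try to load a pickle, and fall back to the slow computation if the file is missing. A minimal sketch of that pattern as a reusable helper is shown here; `load_or_compute` is a hypothetical name, and the sample record only stands in for the real DataFrame.

```python
import os
import pickle
import tempfile

def load_or_compute(path, compute):
    """Return the object pickled at `path`; on a cache miss compute, pickle and return it."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        result = compute()
        with open(path, "wb") as f:
            pickle.dump(result, f)
        return result

calls = []

def expensive():
    """Stand-in for the slow step, e.g. reverse geocoding every grid cell."""
    calls.append(1)
    return {"Address": ["86 Rue du Bellay, 80000 Amiens"]}

with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "locations.pkl")
    first = load_or_compute(cache, expensive)   # cache miss: runs expensive()
    second = load_or_compute(cache, expensive)  # cache hit: reads the pickle

print(len(calls), "call(s) to the expensive step")
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.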

In [17]:
import pickle
import pandas as pd

loaded = False
try:
    with open('locations.pkl', 'rb') as f:
        df_locations = pickle.load(f)
        addresses = [address for address in df_locations['Address']]
        loaded = True
except Exception:
    pass  # no usable cache; the addresses are fetched below

if not loaded:
    print('Obtaining location addresses: ', end='')
    addresses = []
    for lat, lon in zip(latitudes, longitudes):
        address = get_address(google_api_key,lat, lon)
        if address is None or 'amien' not in address.lower():
            address = 'NO ADDRESS'
        address = address.replace(', France', '')
        addresses.append(address)
        print(' .', end='')
    print(' done.')

    df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center,
                             'Distance from University': distances_from_university})
    df_locations.to_pickle('./locations.pkl')   

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


Let's print some addresses

In [18]:
df_locations['Address'][101:110]

101                                          NO ADDRESS
102                                          NO ADDRESS
103    291 Grande Rue du Petit Saint-Jean, 80480 Amiens
104                      661 Rue de Rouen, 80000 Amiens
105                      447 Rue de Rouen, 80000 Amiens
106                      86 Rue du Bellay, 80000 Amiens
107                        20 Rue Dheilly, 80000 Amiens
108                         24 Rue Duminy, 80000 Amiens
109             21 Rue de la Contrescarpe, 80000 Amiens
Name: Address, dtype: object

Some locations carry the placeholder address NO ADDRESS (no address was returned, or the address lies outside Amiens); these need to be dropped.

In [19]:
df_locations = df_locations[df_locations['Address'] != 'NO ADDRESS']

Let's see our DataFrame

In [20]:
df_locations.reset_index(inplace=True,drop=True)
df_locations.head(10)

| | Address | Latitude | Longitude | X | Y | Distance from center | Distance from University |
|---|---|---|---|---|---|---|---|
| 0 | A29, 80090 Amiens | 49.856752 | 2.318649 | -410227.17343 | 5600288.0 | 4215.447782 | 4429.634107 |
| 1 | Rue Wasse, 80090 Amiens | 49.857657 | 2.326792 | -409627.17343 | 5600288.0 | 4355.456348 | 4909.000315 |
| 2 | 31A Route d'Amiens, 80480 Dury | 49.856321 | 2.272649 | -413527.17343 | 5600808.0 | 4471.017781 | 2286.465879 |
| 3 | Le montjoie, Rue Saint-Fuscien, 80090 Amiens | 49.859952 | 2.30522 | -411127.17343 | 5600808.0 | 3642.80112 | 3407.994908 |
| 4 | D7, 80680 Amiens | 49.860858 | 2.313363 | -410527.17343 | 5600808.0 | 3659.234893 | 3874.410286 |
| 5 | 40 Rue Wasse, 80090 Amiens | 49.862669 | 2.329652 | -409327.17343 | 5600808.0 | 3973.663297 | 4895.335188 |
| 6 | 9 Rue de Montréal, 80090 Amiens | 49.863574 | 2.337797 | -408727.17343 | 5600808.0 | 4250.882261 | 5433.316885 |
| 7 | 91 Route d'Amiens, 80480 Dury | 49.861334 | 2.275503 | -413227.17343 | 5601328.0 | 3874.274126 | 1814.923891 |
| 8 | 23 Clos des Châtaigniers, 80090 Amiens | 49.864058 | 2.299933 | -411427.17343 | 5601328.0 | 3157.530681 | 2843.207 |
| 9 | 3 Allée du Montjoie, 80090 Amiens | 49.864964 | 2.308077 | -410827.17343 | 5601328.0 | 3119.294792 | 3334.734144 |


In [21]:
map_amiens = folium.Map(location=[latitude,longitude], zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(df_locations['Latitude'],df_locations['Longitude']):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_amiens)
map_amiens

### Foursquare
Now that we have our location candidates, let's use the Foursquare API to get info on existing venues in each of these categories:
* Arts & Entertainment
* Food
* Nightlife Spot
* Outdoors & Recreation


We query each of these four categories separately, using its Foursquare category ID.
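
To see what a single per-category request looks like, the sketch below assembles the query string for the `venues/explore` endpoint without calling the API. The `build_explore_url` helper and the placeholder credentials are hypothetical; the parameter names mirror those passed to `requests.get` in the notebook.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(lat, lon, category_id, client_id, client_secret,
                      radius=400, limit=50, version="20200505"):
    """Build the request URL for venues of one category around (lat, lon)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lon}",   # latitude,longitude of the search center
        "radius": radius,       # search radius in meters
        "limit": limit,         # maximum number of venues returned
        "categoryId": category_id,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"

# Placeholder credentials; the category ID is the "Food" entry used in this notebook
url = build_explore_url(49.892168, 2.2994263, "4d4b7105d754a06374d81259",
                        "CLIENT_ID", "CLIENT_SECRET")
print(url)
```

The search radius (400 m) is deliberately larger than the 300 m neighborhood radius, so venues just outside a circle are still collected for the overall venue map.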

In [22]:
adict = {"Arts & Entertainment" : '4d4b7104d754a06370d81259',
"Food" : '4d4b7105d754a06374d81259',
"Nightlife Spot" : '4d4b7105d754a06376d81259',
"Outdoors & Recreation" : '4d4b7105d754a06377d81259',}

In [23]:
import json, requests

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', France', '')
    return address

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=400, limit=50):
    url = 'https://api.foursquare.com/v2/venues/explore'

    params = dict(
    client_id=client_id,
    client_secret=client_secret,
    v='20200505',
    ll=f'{lat},{lon}',
    limit=limit,
    radius=radius,
    categoryId=category[1]
    )
    resp = requests.get(url=url, params=params)
    data = json.loads(resp.text)
    try:
        results = data['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   category[0],
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
    except Exception:
        venues = []  # empty or unexpected response: treat as no venues
    return venues

In [24]:
import pickle

def get_venues(lats, lons):

    ven = {}
    location_vens = []
    food_vens = []
    arts_vens = []
    night_vens = []
    outdoor_vens = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        for cat in adict.items():
            # print(cat[0])
            venues = get_venues_near_location(lat, lon, cat, client_id, client_secret)
            area_vens = []
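            for venue in venues:
                venue_id, venue_name, venue_categorie, venue_latlon, venue_address, venue_distance = venue
                # Project the venue's own position to X/Y (meters) instead of relying on global x, y
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                avenue = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, x, y, venue_categorie)

                # Only venues inside the 300 m neighborhood circle count for this area
                if venue_distance<=300:
                    area_vens.append(avenue)
                # Every returned venue is kept (deduplicated by id) for the overall venue map
                ven[venue_id] = avenue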
        # print(cat[0])
            if cat[0] =='Food':
                food_vens.append(area_vens)
            elif cat[0] == 'Arts & Entertainment':
                arts_vens.append(area_vens)
            elif cat[0] == 'Nightlife Spot':
                night_vens.append(area_vens)
            elif cat[0] == 'Outdoors & Recreation':
                outdoor_vens.append(area_vens)
        print(' .', end='')
    print(' done.')
    return ven , food_vens, arts_vens, night_vens, outdoor_vens

loaded = False
try:
    with open('venues.pkl', 'rb') as f:
        some_venues = pickle.load(f)
    with open('food.pkl', 'rb') as f:
        food_vens = pickle.load(f)
    with open('arts.pkl', 'rb') as f:
        arts_vens = pickle.load(f)
    with open('nightlife.pkl', 'rb') as f:
        night_vens = pickle.load(f)
    with open('outdoor.pkl', 'rb') as f:
        outdoor_vens = pickle.load(f)
    print('Venue data around neighborhoods loaded.')
    loaded = True
except:
    pass

# If load failed use the Foursquare API to get the data
if not loaded:
    some_venues, food_vens, arts_vens, night_vens, outdoor_vens = get_venues(df_locations['Latitude'],df_locations['Longitude'])    
    #Let's pickle them
    with open('venues.pkl', 'wb') as f:
        pickle.dump(some_venues, f)
    with open('food.pkl', 'wb') as f:
        pickle.dump(food_vens, f)
    with open('arts.pkl', 'wb') as f:
        pickle.dump(arts_vens, f)
    with open('nightlife.pkl', 'wb') as f:
        pickle.dump(night_vens, f)
    with open('outdoor.pkl', 'wb') as f:
        pickle.dump(outdoor_vens, f)


Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [25]:
map_amiens = folium.Map(location=amiens_center, zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
category_colors = {'Food': 'red', 'Arts & Entertainment': 'green', 'Nightlife Spot': 'blue'}
for ven in some_venues.values():
    lat, lon, cat = ven[2], ven[3], ven[8]
    color = category_colors.get(cat, 'yellow')  # any other category (Outdoors & Recreation) is yellow
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_amiens)
map_amiens

In [26]:
import numpy as np

food_count = [len(i) for i in food_vens]
arts_count = [len(i) for i in arts_vens]
nightlife_count = [len(i) for i in night_vens]
outdoor_count = [len(i) for i in outdoor_vens]

df_locations['Food Places in Area'] = food_count
df_locations['Arts & Entertainment in Area'] = arts_count
df_locations['Nightlife Spots in Area'] = nightlife_count
df_locations['Outdoors & Recreation in Area'] = outdoor_count
df_locations.reset_index(inplace=True,drop=True)
df_locations[10:15]

| | Address | Latitude | Longitude | X | Y | Distance from center | Distance from University | Food Places in Area | Arts & Entertainment in Area | Nightlife Spots in Area | Outdoors & Recreation in Area |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | Espace mistral, 13 Rue de Redon, 80090 Amiens | 49.865871 | 2.316222 | -410227.17343 | 5601328.0 | 3195.309062 | 3857.081485 | 0 | 0 | 0 | 0 |
| 11 | 6 Rue Soufflot, 80090 Amiens | 49.866776 | 2.324367 | -409627.17343 | 5601328.0 | 3377.869151 | 4399.284414 | 0 | 0 | 0 | 1 |
| 12 | 722 Rue de Cagny, 80090 Amiens | 49.867681 | 2.332512 | -409027.17343 | 5601328.0 | 3651.027253 | 4954.828869 | 0 | 0 | 0 | 0 |
| 13 | 105 Marais de Cagny, 80090 Amiens | 49.868586 | 2.340658 | -408427.17343 | 5601328.0 | 3996.248241 | 5519.687935 | 0 | 0 | 0 | 0 |
| 14 | 116 Route d'Amiens, 80480 Dury | 49.866347 | 2.278357 | -412927.17343 | 5601847.0 | 3278.719262 | 1442.210543 | 3 | 0 | 0 | 0 |


In [27]:
venues_latlons = [[ven[2], ven[3]] for ven in some_venues.values()]

Let's visualize the data with a HeatMap

In [28]:
from folium import plugins
from folium.plugins import HeatMap

map_amiens = folium.Map(location=amiens_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_amiens) #cartodbpositron cartodbdark_matter
HeatMap(venues_latlons).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
folium.Circle(amiens_center, radius=1000, fill=False, color='white').add_to(map_amiens)
folium.Circle(amiens_center, radius=2000, fill=False, color='white').add_to(map_amiens)
folium.Circle(amiens_center, radius=3000, fill=False, color='white').add_to(map_amiens)
map_amiens

Let's take a look at these locations on the map

In [29]:
map_amiens = folium.Map(location=[latitude,longitude], zoom_start=13)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(df_locations['Latitude'],df_locations['Longitude']):
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_amiens)
map_amiens

## Data Analysis <a name="analysis"></a>

In [30]:
roi_x_min = amiens_center_x - 2000
roi_y_max = amiens_center_y + 1000
roi_width = 5000
roi_height = 5000
roi_center_x = roi_x_min + 2500
roi_center_y = roi_y_max - 2500
roi_center_lon, roi_center_lat = xy_to_lonlat(roi_center_x, roi_center_y)
roi_center = [roi_center_lat, roi_center_lon]

In [31]:
loaded = False
roi_latitudes = []
roi_longitudes = []
roi_xs = []
roi_ys = []
try:
    with open('roi_latitudes.pkl', 'rb') as f:
        roi_latitudes = pickle.load(f)
    with open('roi_longitudes.pkl', 'rb') as f:
        roi_longitudes = pickle.load(f)
    with open('roi_xs.pkl', 'rb') as f:
        roi_xs = pickle.load(f)
    with open('roi_ys.pkl', 'rb') as f:
        roi_ys= pickle.load(f)
    loaded = True
except:
    pass

if not loaded:
    k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
    x_step = 100
    y_step = 100 * k 
    roi_y_min = roi_center_y - 2500


    for i in range(0, int(51/k)):
        y = roi_y_min + i * y_step
        x_offset = 50 if i%2==0 else 0
        for j in range(0, 51):
            x = roi_x_min + j * x_step + x_offset
            d = calc_xy_distance(roi_center_x, roi_center_y, x, y)
            if (d <= 2501):
                lon, lat = xy_to_lonlat(x, y)
                roi_latitudes.append(lat)
                roi_longitudes.append(lon)
                roi_xs.append(x)
                roi_ys.append(y)
    with open('roi_latitudes.pkl', 'wb') as f:
        pickle.dump(roi_latitudes,f)
    with open('roi_longitudes.pkl', 'wb') as f:
        pickle.dump(roi_longitudes,f)
    with open('roi_xs.pkl', 'wb') as f:
        pickle.dump(roi_xs ,f)
    with open('roi_ys.pkl', 'wb') as f:
        pickle.dump(roi_ys,f)

print(len(roi_latitudes), 'candidate neighborhood centers generated.')

2261 candidate neighborhood centers generated.
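
Each of these fine-grained points is then matched against the coarse candidate cells that lie within a given radius. The sketch below shows that lookup in isolation, using plain dictionaries instead of a DataFrame; `cells_within` and the sample cells are hypothetical.

```python
import math

def cells_within(x, y, cells, radius):
    """Return the cells whose center lies within `radius` meters of (x, y)."""
    return [c for c in cells if math.hypot(c["X"] - x, c["Y"] - y) <= radius]

# Three illustrative cells on one row of a 600 m grid (not real data)
cells = [
    {"X": 0.0, "Y": 0.0, "Food Places in Area": 3},
    {"X": 600.0, "Y": 0.0, "Food Places in Area": 1},
    {"X": 1200.0, "Y": 0.0, "Food Places in Area": 0},
]

near_first = cells_within(250.0, 0.0, cells, radius=300)   # only the first cell is in range
on_boundary = cells_within(300.0, 0.0, cells, radius=300)  # exactly between two cells
print(len(near_first), len(on_boundary))
```

Because cell centers are 600 m apart and the radius is 300 m, a point usually falls inside a single cell, and two cells match only when the point sits exactly on their shared boundary.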


In [32]:
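def count_(x, y, df_locations, radius=250):    
    """Return (food, arts, nightlife, outdoor) venue counts for the point (x, y).

    Counts are copied from candidate cells whose center lies within `radius` meters.
    If several cells qualify, the last one with a non-zero count wins (no summing).
    """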
    food_count_ = 0
    art_count_ = 0
    night_count_ = 0
    out_count_ = 0
    
    for index,ven in df_locations.iterrows():
        ven_x = ven['X']
        ven_y = ven['Y']
        d = calc_xy_distance(x, y, ven_x, ven_y)
        
        if d<=radius and ven['Food Places in Area'] > 0:
            food_count_ = ven['Food Places in Area']
        if d<=radius and ven['Arts & Entertainment in Area'] > 0:
            art_count_ = ven['Arts & Entertainment in Area']
        if d<=radius and ven['Nightlife Spots in Area'] > 0:
            night_count_ = ven['Nightlife Spots in Area']
        if d<=radius and ven['Outdoors & Recreation in Area'] > 0:
            out_count_ = ven['Outdoors & Recreation in Area']
    return food_count_,art_count_,night_count_,out_count_

def calc_distance_(x, y, ref_x, ref_y):
    # Both points are projected X/Y coordinates in meters, not latitude/longitude
    d = calc_xy_distance(x, y, ref_x, ref_y)
    return d



In [33]:
food_venues_counts = []
art_venues_counts = []
night_venues_counts = []
out_venues_counts = []
roi_cen_dist = []
roi_uni_dist=[]
print('Generating data on location candidates........ ', end='')

loaded = False
try:
    with open('roi_locations.pkl', 'rb') as f:
        df_roi_locations= pickle.load(f)
        loaded = True
except:
    pass

if not loaded:
    for x, y in zip(roi_xs, roi_ys):
        food_count_,art_count_,night_count_,out_count_ = count_(x, y, df_locations, radius=300)
        food_venues_counts.append(food_count_)
        art_venues_counts.append(art_count_)
        night_venues_counts.append(night_count_)
        out_venues_counts.append(out_count_)
        distance_ = calc_distance_(x,y,latt,lonn)
        roi_uni_dist.append(distance_)
        distance = calc_distance_(x,y,amiens_center_x,amiens_center_y)
        roi_cen_dist.append(distance)
    print('done.')
    # Let's put this into dataframe
    df_roi_locations = pd.DataFrame({'Latitude':roi_latitudes,
                                    'Longitude':roi_longitudes,
                                    'X':roi_xs,
                                    'Y':roi_ys,
                                    'Food Venues nearby':food_venues_counts,
                                    'Arts venues nearby':art_venues_counts,
                                    'Nightlife venues nearby':night_venues_counts,
                                    'Outdoor & Recreation venues nearby':out_venues_counts,
                                    'Distance to the University':roi_uni_dist,
                                    'Distance to the Center':roi_cen_dist,
                                    })
    df_roi_locations.to_pickle('./roi_locations.pkl')  

Generating data on location candidates........ done.


In [34]:
df_roi_locations.head(10)

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49.857751 | 2.314889 | -410477.17343 | 5600445.0 | 0 | 0 | 0 | 0 | 4136.909576 | 4025.232913 |
| 1 | 49.857902 | 2.316246 | -410377.17343 | 5600445.0 | 0 | 0 | 0 | 0 | 4214.276347 | 4037.635447 |
| 2 | 49.85768 | 2.307222 | -411027.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3669.774539 | 3914.674913 |
| 3 | 49.857832 | 2.308579 | -410927.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3742.10495 | 3913.39746 |
| 4 | 49.857983 | 2.309937 | -410827.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3815.685227 | 3914.674913 |
| 5 | 49.858134 | 2.311294 | -410727.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3890.444454 | 3918.504776 |
| 6 | 49.858285 | 2.312651 | -410627.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3966.315966 | 3924.879575 |
| 7 | 49.858436 | 2.314008 | -410527.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 4043.237149 | 3933.786938 |
| 8 | 49.858587 | 2.315366 | -410427.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 4121.149225 | 3945.209713 |
| 9 | 49.858738 | 2.316723 | -410327.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 4199.997051 | 3959.126125 |


In [35]:
df_roi_sorted = df_roi_locations.sort_values(by='Distance to the Center',ascending=True)
df_roi_sorted.reset_index(inplace=True,drop=True)
df_roi_sorted

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49.892101 | 2.300143 | -410877.17343 | 5.604429e+06 | 15 | 5 | 6 | 3 | 3085.823928 | 52.584605 |
| 1 | 49.891950 | 2.298785 | -410977.17343 | 5.604429e+06 | 15 | 5 | 6 | 3 | 2996.031545 | 52.584605 |
| 2 | 49.892785 | 2.299262 | -410927.17343 | 5.604516e+06 | 15 | 5 | 6 | 3 | 3079.944012 | 70.319398 |
| 3 | 49.891265 | 2.299667 | -410927.17343 | 5.604342e+06 | 15 | 5 | 6 | 3 | 3003.740871 | 102.885683 |
| 4 | 49.892936 | 2.300620 | -410827.17343 | 5.604516e+06 | 15 | 5 | 6 | 3 | 3168.936637 | 122.248999 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2256 | 49.863739 | 2.333560 | -409027.17343 | 5.600878e+06 | 0 | 0 | 0 | 0 | 5131.912203 | 4041.459932 |
| 2257 | 49.862753 | 2.331726 | -409177.17343 | 5.600792e+06 | 0 | 0 | 0 | 0 | 5035.927767 | 4051.076241 |
| 2258 | 49.860629 | 2.326701 | -409577.17343 | 5.600619e+06 | 0 | 0 | 0 | 0 | 4771.011988 | 4057.937820 |
| 2259 | 49.859492 | 2.323509 | -409827.17343 | 5.600532e+06 | 0 | 0 | 0 | 0 | 4606.571035 | 4065.055925 |


In [36]:
good_food_count = np.array((df_roi_locations['Food Venues nearby']>=1))
good_art_count = np.array((df_roi_locations['Arts venues nearby']>=1))
good_night_count = np.array((df_roi_locations['Nightlife venues nearby']>=1))
good_out_count = np.array((df_roi_locations['Outdoor & Recreation venues nearby']>=1))
good_uni_distance = np.array(df_roi_locations['Distance to the University']<=4000)
good_center_distance = np.array(df_roi_locations['Distance to the Center']<=4000)

# A location is good only if all six conditions hold
good_locations = np.logical_and.reduce([
    good_food_count, good_art_count, good_night_count,
    good_out_count, good_uni_distance, good_center_distance,
])
print('Locations with all conditions met:', good_locations.sum())

df_good_locations = df_roi_locations[good_locations]

Locations with all conditions met: 160
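
The filter above keeps a location only if it has at least one venue of every category and lies within 4000 m of both the university and the city center. The same rule, written for plain dictionaries, is sketched below; `is_good_location`, the short key names and the sample rows are hypothetical and only illustrate the logic.

```python
MAX_DISTANCE_M = 4000  # same threshold as used for both distances in the notebook

def is_good_location(loc, max_distance=MAX_DISTANCE_M):
    """True if every venue category is present and both distances are within range."""
    has_all_venues = all(loc[key] >= 1 for key in ("food", "arts", "nightlife", "outdoor"))
    close_enough = loc["uni_dist"] <= max_distance and loc["center_dist"] <= max_distance
    return has_all_venues and close_enough

# Illustrative rows only (not taken from the real results)
locations = [
    {"food": 15, "arts": 5, "nightlife": 6, "outdoor": 3, "uni_dist": 3085.8, "center_dist": 52.6},
    {"food": 3, "arts": 0, "nightlife": 1, "outdoor": 1, "uni_dist": 1500.0, "center_dist": 3300.0},
    {"food": 2, "arts": 1, "nightlife": 1, "outdoor": 1, "uni_dist": 4200.0, "center_dist": 900.0},
]
good = [loc for loc in locations if is_good_location(loc)]
print(len(good), "of", len(locations), "locations meet all conditions")
```

The second row fails because it has no arts venue, and the third because it is more than 4000 m from the university.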


In [37]:
good_latitudes = df_good_locations['Latitude'].values
good_longitudes = df_good_locations['Longitude'].values

good_locations = [[lat, lon] for lat, lon in zip(good_latitudes, good_longitudes)]

map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_amiens)
HeatMap(venues_latlons).add_to(map_amiens)
folium.Circle(amiens_center, radius=2500, color='white', fill=True, fill_opacity=0.6).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
map_amiens

In [38]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
HeatMap(good_locations, radius=25).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)

map_amiens

In [39]:
from sklearn.cluster import KMeans

number_of_clusters = 8

good_xys = df_good_locations[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_amiens)
HeatMap(venues_latlons).add_to(map_amiens)
folium.Circle(amiens_center, radius=2000, color='white', fill=True, fill_opacity=0.4).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=True, fill_opacity=0.25).add_to(map_amiens) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
map_amiens

In [40]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#00000000', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_amiens)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_amiens)
map_amiens

In [41]:
candidate_area_addresses = []
print('\n\t==============================================================')
print('\tAddresses of centers of areas recommended for further analysis')
print('\t==============================================================\n')
for lon, lat in cluster_centers:
    addr = get_address(google_api_key, lat, lon).replace(', France', '')
    candidate_area_addresses.append(addr)    
    x, y = lonlat_to_xy(lon, lat)
    d = calc_xy_distance(x, y, latt, lonn)
    d_ = calc_xy_distance(x, y, amiens_center_x, amiens_center_y)
    print(f'{addr} => {"%.2f" %  (d/1000) }km from Université de Picardie Jules Verne')
    print(f'{addr} => {"%.2f" %  (d_/1000) }km from Ctre Ville')


	Addresses of centers of areas recommended for further analysis

20 Rue Cormont, 80000 Amiens => 3.34km from Université de Picardie Jules Verne
20 Rue Cormont, 80000 Amiens => 0.30km from Ctre Ville
60 Rue des Jacobins, 80000 Amiens => 3.04km from Université de Picardie Jules Verne
60 Rue des Jacobins, 80000 Amiens => 0.21km from Ctre Ville
6 Rue Saint-Patrice, 80000 Amiens => 3.09km from Université de Picardie Jules Verne
6 Rue Saint-Patrice, 80000 Amiens => 0.33km from Ctre Ville
7 Rue Vulfran Warmé, 80000 Amiens => 3.37km from Université de Picardie Jules Verne
7 Rue Vulfran Warmé, 80000 Amiens => 0.67km from Ctre Ville
31 Rue Millevoye, 80000 Amiens => 2.64km from Université de Picardie Jules Verne
31 Rue Millevoye, 80000 Amiens => 0.59km from Ctre Ville
19 Rue Motte, 80000 Amiens => 3.68km from Université de Picardie Jules Verne
19 Rue Motte, 80000 Amiens => 0.70km from Ctre Ville
12 Rue Gribeauval, 80000 Amiens => 3.15km from Université de Picardie Jules Verne
12 Rue Gribeauval,

In [42]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lonlat, addr in zip(cluster_centers, candidate_area_addresses):
    folium.Marker([lonlat[1], lonlat[0]], popup=addr).add_to(map_amiens) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_amiens)
map_amiens

In [43]:
df_roi_locations.head()

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49.857751 | 2.314889 | -410477.17343 | 5600445.0 | 0 | 0 | 0 | 0 | 4136.909576 | 4025.232913 |
| 1 | 49.857902 | 2.316246 | -410377.17343 | 5600445.0 | 0 | 0 | 0 | 0 | 4214.276347 | 4037.635447 |
| 2 | 49.85768 | 2.307222 | -411027.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3669.774539 | 3914.674913 |
| 3 | 49.857832 | 2.308579 | -410927.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3742.10495 | 3913.39746 |
| 4 | 49.857983 | 2.309937 | -410827.17343 | 5600532.0 | 0 | 0 | 0 | 0 | 3815.685227 | 3914.674913 |
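
The `X` and `Y` columns appear to be projected coordinates in metres, which is what lets the notebook measure distances with plain geometry. As a minimal sketch, assuming straight-line (Euclidean) distance, the spacing between the first two grid points of the table above can be checked as follows. The helper name `euclidean_distance_m` is hypothetical; it is not the notebook's `calc_xy_distance`, whose definition lies outside this section.

```python
import math


def euclidean_distance_m(x1, y1, x2, y2):
    """Straight-line distance between two points given in projected metres."""
    return math.hypot(x2 - x1, y2 - y1)


# X/Y of rows 0 and 1, copied from the df_roi_locations.head() output.
row0 = (-410477.17343, 5600445.0)
row1 = (-410377.17343, 5600445.0)

spacing = euclidean_distance_m(*row0, *row1)
print(f'{spacing:.1f} m between the first two grid points')
```

The same call with the coordinates of a reference point (university or center) in place of `row1` would give a distance of the kind stored in the last two columns, under the same assumption.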


Let's personalize the notebook further so that the proposed housing locations better fit each student. This gives students more options to experiment with, letting them find their “perfect” housing location according to their needs without having to search manually for venues around each location.

In [44]:
from ipywidgets import interactive
import ipywidgets as widgets
from IPython.display import display

def f(x):
    # Identity function: interactive() exposes the current slider value as .result
    return x

def threshold_slider(column, description, step=1, start_at_max=False):
    # Build and display an integer slider spanning the observed range of `column`
    col_min = np.asarray(df_roi_locations[column]).min()
    col_max = np.asarray(df_roi_locations[column]).max()
    inter = interactive(f, x=widgets.IntSlider(
        value=col_max if start_at_max else col_min,
        min=col_min,
        max=col_max,
        step=step,
        description=description,
        continuous_update=False,
        readout_format='d',
    ))
    display(inter)
    return inter

print('\n\t====================================================================')
print('\tSelecting number of venues and distances according to personal needs')
print('\t====================================================================\n')
# Venue counts are minimum requirements, so each slider starts at the smallest value
food_inter = threshold_slider('Food Venues nearby', 'Food:')
art_inter = threshold_slider('Arts venues nearby', 'Arts:')
night_inter = threshold_slider('Nightlife venues nearby', 'Nightlife:')
out_inter = threshold_slider('Outdoor & Recreation venues nearby', 'Outdoors:')
# Distances are upper limits, so each slider starts at the largest value
uni_dis_inter = threshold_slider('Distance to the University', 'Uni Dist:', step=10, start_at_max=True)
center_dis_inter = threshold_slider('Distance to the Center', 'Center Dist:', step=10, start_at_max=True)


	Selecting number of venues and distances according to personal needs



interactive(children=(IntSlider(value=0, continuous_update=False, description='Food:', max=15), Output()), _do…

interactive(children=(IntSlider(value=0, continuous_update=False, description='Arts:', max=5), Output()), _dom…

interactive(children=(IntSlider(value=0, continuous_update=False, description='Nightlife:', max=7), Output()),…

interactive(children=(IntSlider(value=0, continuous_update=False, description='Outdoors:', max=5), Output()), …

interactive(children=(IntSlider(value=5732, continuous_update=False, description='Uni Dist:', max=5732, min=74…

interactive(children=(IntSlider(value=4068, continuous_update=False, description='Center Dist:', max=4068, min…

In [45]:
food_num = food_inter.result
art_num = art_inter.result
night_num = night_inter.result
out_num = out_inter.result
uni_dis_num = uni_dis_inter.result
center_dis_num = center_dis_inter.result

In [46]:
# One boolean mask per criterion: venue counts are lower bounds, distances are upper bounds
good_food_count = np.array((df_roi_locations['Food Venues nearby']>=food_num))
good_art_count = np.array((df_roi_locations['Arts venues nearby']>=art_num))
good_night_count = np.array((df_roi_locations['Nightlife venues nearby']>=night_num))
good_out_count = np.array((df_roi_locations['Outdoor & Recreation venues nearby']>=out_num))
good_uni_distance = np.array(df_roi_locations['Distance to the University']<=uni_dis_num)
good_center_distance = np.array(df_roi_locations['Distance to the Center']<=center_dis_num)

# A location qualifies only if all six conditions hold
good_locations = np.logical_and.reduce([good_food_count, good_art_count, good_night_count,
                                        good_out_count, good_uni_distance, good_center_distance])
print('Locations with all conditions met:', good_locations.sum())

# Take an explicit copy so that later column assignments do not raise SettingWithCopyWarning
df_roi_locations1 = df_roi_locations[good_locations].copy()

Locations with all conditions met: 160
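
The filtering step keeps only the grid points that satisfy every threshold at once. The following is a minimal sketch of that idea on a made-up three-row frame; the values and the threshold variables (`min_food`, `min_night`, `max_uni_dist`) are illustrative stand-ins, not taken from the Amiens grid or the sliders.

```python
import numpy as np
import pandas as pd

# Toy candidate grid; the numbers are illustrative only.
toy = pd.DataFrame({
    'Food Venues nearby': [0, 5, 9],
    'Nightlife venues nearby': [0, 2, 1],
    'Distance to the University': [4100.0, 2900.0, 3300.0],
})

# Hypothetical thresholds standing in for the slider values.
min_food, min_night, max_uni_dist = 3, 1, 3000.0

masks = [
    toy['Food Venues nearby'] >= min_food,
    toy['Nightlife venues nearby'] >= min_night,
    toy['Distance to the University'] <= max_uni_dist,
]
# A row is kept only if every mask is True for it.
keep = np.logical_and.reduce(masks)

# .copy() makes the subset independent of `toy`.
selected = toy[keep].copy()
selected['Score'] = 1.0  # modifies the copy only; `toy` is unchanged
print('Locations with all conditions met:', int(keep.sum()))
```

Working on an explicit copy is one way to avoid pandas' `SettingWithCopyWarning` when new columns are later assigned to the filtered frame.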


In [47]:
# Scale venue counts to [0, 1] by dividing by the maximum over the full grid
for col in ['Food Venues nearby', 'Arts venues nearby', 'Nightlife venues nearby',
            'Outdoor & Recreation venues nearby']:
    df_roi_locations1[col] = df_roi_locations[col]/np.asarray(df_roi_locations[col]).max()
# Invert the scaled distances so that closer locations receive higher values
for col in ['Distance to the University', 'Distance to the Center']:
    df_roi_locations1[col] = (1 - df_roi_locations[col]/np.asarray(df_roi_locations[col]).max())
df_roi_locations1.head()

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center |
|---|---|---|---|---|---|---|---|---|---|---|
| 1795 | 49.888838 | 2.291924 | -411527.17343 | 5604169.0 | 0.466667 | 0.4 | 0.285714 | 0.0 | 0.583409 | 0.837644 |
| 1796 | 49.888989 | 2.293282 | -411427.17343 | 5604169.0 | 0.466667 | 0.4 | 0.285714 | 0.0 | 0.567772 | 0.859598 |
| 1797 | 49.889141 | 2.29464 | -411327.17343 | 5604169.0 | 0.466667 | 0.4 | 0.285714 | 0.0 | 0.552002 | 0.880525 |
| 1801 | 49.889746 | 2.300073 | -410927.17343 | 5604169.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.487832 | 0.932132 |
| 1802 | 49.889897 | 2.301431 | -410827.17343 | 5604169.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.471568 | 0.927817 |
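
The scaled columns follow from dividing each venue count by its column maximum and inverting the distances, so that a larger value always means "better". Below is a minimal sketch with shortened column names. The counts in row 0 are chosen so that, with the maxima of 15, 5 and 7 reported by the slider output above, they reproduce 0.466667, 0.4 and 0.285714; the distances are made up.

```python
import pandas as pd

# Illustrative inputs; only the column maxima (15, 5, 7) mirror the slider output.
df = pd.DataFrame({
    'Food': [7, 15, 0],
    'Arts': [2, 5, 0],
    'Nightlife': [2, 6, 7],
    'Distance': [2400.0, 600.0, 4800.0],
})

scaled = pd.DataFrame(index=df.index)
for col in ['Food', 'Arts', 'Nightlife']:
    # Benefit criteria: more venues is better, so divide by the maximum.
    scaled[col] = df[col] / df[col].max()
# Cost criterion: shorter is better, so flip it to 1 - d / d_max.
scaled['Distance'] = 1 - df['Distance'] / df['Distance'].max()

print(scaled.round(6))
```

With this scaling the farthest point receives a distance score of 0, so a distance weight has no effect on it.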


In [48]:
print('\n\t=============================================')
print('\tSelecting weights according to personal needs')
print('\t=============================================\n')
def weight_slider(description):
    # Build and display a weight slider in [0, 4] with a neutral default of 1.0
    inter = interactive(f, x=widgets.FloatSlider(
        value=1.0,
        min=0.0,
        max=4.0,
        step=0.1,
        description=description,
        continuous_update=False,
        readout_format='.1f',
    ))
    display(inter)
    return inter

# The *_inter names are reused here: from now on they hold weights, not thresholds
food_inter = weight_slider('Food:')
art_inter = weight_slider('Arts:')
night_inter = weight_slider('Nightlife:')
out_inter = weight_slider('Outdoors:')
uni_dis_inter = weight_slider('Uni Dist:')
center_dis_inter = weight_slider('Center Dist:')


	Selecting weights according to personal needs



interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Food:', max=4.0, readout_fo…

interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Arts:', max=4.0, readout_fo…

interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Nightlife:', max=4.0, reado…

interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Outdoors:', max=4.0, readou…

interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Uni Dist:', max=4.0, readou…

interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Center Dist:', max=4.0, rea…

In [49]:
food_weight = food_inter.result
art_weight = art_inter.result
night_weight = night_inter.result
out_weight = out_inter.result
uni_dis_weight = uni_dis_inter.result
center_dis_weight = center_dis_inter.result

weights = np.asarray([food_weight,art_weight,night_weight,out_weight,uni_dis_weight,center_dis_weight])

calc = np.asarray(df_roi_locations1[['Food Venues nearby','Arts venues nearby','Nightlife venues nearby',
                                    'Outdoor & Recreation venues nearby','Distance to the University','Distance to the Center']])

weighted_sums = np.multiply(calc,weights)
weighted_sums = np.sum(weighted_sums,axis=1)
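
The element-wise multiplication followed by a row sum is equivalent to a matrix-vector product. A minimal sketch with illustrative scores and weights (not the values chosen in the notebook):

```python
import numpy as np

# Rows: candidate locations; columns: six normalised criteria in [0, 1].
scores = np.array([
    [1.0, 1.0, 0.8, 0.6, 0.5, 0.9],
    [0.4, 0.2, 0.3, 0.0, 0.9, 0.7],
])
# One illustrative weight per criterion (0.0 ignores a criterion entirely).
weights = np.array([2.0, 1.0, 1.0, 0.0, 3.0, 1.0])

# Broadcast-multiply each row by the weights, then sum across each row.
by_multiply = np.sum(np.multiply(scores, weights), axis=1)
# Equivalent matrix-vector product.
by_matmul = scores @ weights

print(by_matmul)
```

Because every criterion is scaled to [0, 1], the largest possible weighted sum equals the sum of the weights, which makes scores comparable across weight choices only after dividing by that sum.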

In [50]:
df_roi_locations1['Weighted Sums'] = weighted_sums
df_roi_locations1.sort_values(by='Weighted Sums',ascending=False,inplace=True)
df_roi_locations1.reset_index(inplace=True,drop=True)


In [51]:
df_roi_locations1.head(10)

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center | Weighted Sums |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49.891798 | 2.297427 | -411077.17343 | 5604429.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.492871 | 0.962911 | 8.266535 |
| 1 | 49.891114 | 2.298309 | -411027.17343 | 5604342.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.491785 | 0.964731 | 8.264334 |
| 2 | 49.89043 | 2.299191 | -410977.17343 | 5604256.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.490103 | 0.951826 | 8.245207 |
| 3 | 49.89195 | 2.298785 | -410977.17343 | 5604429.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.477323 | 0.987074 | 8.233168 |
| 4 | 49.892634 | 2.297904 | -411027.17343 | 5604516.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.478088 | 0.969949 | 8.218873 |
| 5 | 49.889746 | 2.300073 | -410927.17343 | 5604169.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.487832 | 0.932132 | 8.217111 |
| 6 | 49.891265 | 2.299667 | -410927.17343 | 5604342.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.475978 | 0.974709 | 8.215827 |
| 7 | 49.893318 | 2.297022 | -411077.17343 | 5604602.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.47827 | 0.946637 | 8.196237 |
| 8 | 49.890581 | 2.300549 | -410877.17343 | 5604256.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.474057 | 0.951826 | 8.185838 |
| 9 | 49.892101 | 2.300143 | -410877.17343 | 5604429.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.461658 | 0.987074 | 8.175208 |


In [53]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_amiens)
HeatMap(venues_latlons).add_to(map_amiens)
folium.Circle(amiens_center, radius=2500, color='white', fill=True, fill_opacity=0.6).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(df_roi_locations1['Latitude'], df_roi_locations1['Longitude']):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
map_amiens

In [54]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
HeatMap(zip(df_roi_locations1['Latitude'], df_roi_locations1['Longitude']), radius=25).add_to(map_amiens)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon in zip(df_roi_locations1['Latitude'], df_roi_locations1['Longitude']):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
map_amiens

In [55]:
# number_of_clusters = 8

# good_xys = df_roi_locations1[['X', 'Y']].values
# kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

# cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

# map_amiens = folium.Map(location=amiens_center, zoom_start=14)
# folium.TileLayer('cartodbpositron').add_to(map_amiens)
# HeatMap(venues_latlons).add_to(map_amiens)
# folium.Circle(amiens_center, radius=2000, color='white', fill=True, fill_opacity=0.4).add_to(map_amiens)
# folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
# folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
# for lon, lat in cluster_centers:
#     folium.Circle([lat, lon], radius=500, color='green', fill=True, fill_opacity=0.25).add_to(map_amiens) 
# for lat, lon in zip(df_roi_locations1['Latitude'], df_roi_locations1['Longitude']):
#     folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_amiens)
# map_amiens

In [56]:
df_weighted = df_roi_locations1[0:8]
df_weighted

| | Latitude | Longitude | X | Y | Food Venues nearby | Arts venues nearby | Nightlife venues nearby | Outdoor & Recreation venues nearby | Distance to the University | Distance to the Center | Weighted Sums |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49.891798 | 2.297427 | -411077.17343 | 5604429.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.492871 | 0.962911 | 8.266535 |
| 1 | 49.891114 | 2.298309 | -411027.17343 | 5604342.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.491785 | 0.964731 | 8.264334 |
| 2 | 49.89043 | 2.299191 | -410977.17343 | 5604256.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.490103 | 0.951826 | 8.245207 |
| 3 | 49.89195 | 2.298785 | -410977.17343 | 5604429.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.477323 | 0.987074 | 8.233168 |
| 4 | 49.892634 | 2.297904 | -411027.17343 | 5604516.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.478088 | 0.969949 | 8.218873 |
| 5 | 49.889746 | 2.300073 | -410927.17343 | 5604169.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.487832 | 0.932132 | 8.217111 |
| 6 | 49.891265 | 2.299667 | -410927.17343 | 5604342.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.475978 | 0.974709 | 8.215827 |
| 7 | 49.893318 | 2.297022 | -411077.17343 | 5604602.0 | 1.0 | 1.0 | 0.857143 | 0.6 | 0.47827 | 0.946637 | 8.196237 |


In [57]:
candidate_area_addresses = []
print('\n\t==============================================================')
print('\tAddresses of centers of areas recommended for further analysis')
print('\t==============================================================\n')
for lat,lon in zip(df_weighted['Latitude'],df_weighted['Longitude']):
    addr = get_address(google_api_key, lat, lon).replace(', France', '')
    candidate_area_addresses.append(addr)    
    x, y = lonlat_to_xy(lon, lat)
    d = calc_xy_distance(x, y, latt, lonn)
    d_ = calc_xy_distance(x, y, amiens_center_x, amiens_center_y)
    print(f'{addr} => {"%.2f" %  (d/1000) }km from Université de Picardie Jules Verne')
    print(f'{addr} => {"%.2f" %  (d_/1000) }km from Ctre Ville')


	Addresses of centers of areas recommended for further analysis

19 Rue de la République, 80000 Amiens => 2.91km from Université de Picardie Jules Verne
19 Rue de la République, 80000 Amiens => 0.15km from Ctre Ville
24 Rue Alphonse Paillat, 80000 Amiens => 2.91km from Université de Picardie Jules Verne
24 Rue Alphonse Paillat, 80000 Amiens => 0.14km from Ctre Ville
3 Rue Vivien, 80000 Amiens => 2.92km from Université de Picardie Jules Verne
3 Rue Vivien, 80000 Amiens => 0.20km from Ctre Ville
32 Rue des Jacobins, 80000 Amiens => 3.00km from Université de Picardie Jules Verne
32 Rue des Jacobins, 80000 Amiens => 0.05km from Ctre Ville
8 Rue des 3 Cailloux, 80000 Amiens => 2.99km from Université de Picardie Jules Verne
8 Rue des 3 Cailloux, 80000 Amiens => 0.12km from Ctre Ville
22 Rue Emile Zola, 80000 Amiens => 2.94km from Université de Picardie Jules Verne
22 Rue Emile Zola, 80000 Amiens => 0.28km from Ctre Ville
42 Rue des Jacobins, 80000 Amiens => 3.00km from Université de Picardi

In [58]:
map_amiens = folium.Map(location=amiens_center, zoom_start=14)
folium.Marker([latitude,longitude], popup='Ctre Ville', icon=folium.Icon(color='red', icon='home')).add_to(map_amiens)
folium.Marker([latitude__,longitude__], popup='Citadel University', icon=folium.Icon(color='green', icon='info-sign')).add_to(map_amiens)
folium.Marker([latitude_,longitude_], popup='Université de Picardie Jules Verne', icon=folium.Icon(color='black', icon='info-sign')).add_to(map_amiens)
for lat, lon, addr in zip(df_weighted['Latitude'], df_weighted['Longitude'], candidate_area_addresses):
    folium.Marker([lat, lon], popup=addr).add_to(map_amiens)
for lat, lon in zip(df_weighted['Latitude'],df_weighted['Longitude']):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_amiens)
map_amiens

## Results and Discussion <a name="results"></a>

The maps above show that most of the recommended housing locations lie close to the city center and close to one another. This is to be expected, since in most towns the city center is packed with more venues than the surrounding areas. Trying different combinations of weights and of venue and distance thresholds also produces interesting results. In the end, it all comes down to each student’s priorities and needs. What is clear, though, is that if a student’s requirements lean towards having more venues around their housing location, the recommended area will be near the city center. If, on the other hand, venues matter little to a student and the distance from the university is the main factor, the choice of housing location can be more flexible.
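
To illustrate how the ranking depends on a student's priorities, here is a minimal sketch with two made-up candidate locations and two hypothetical weight profiles. The names and scores are invented for illustration; only the weighted-sum idea comes from the analysis above.

```python
# Normalised criteria per location: (venue score, closeness to the university).
# Both locations and all numbers are hypothetical.
candidates = {
    'central': (0.9, 0.3),
    'near_campus': (0.2, 0.95),
}


def rank(weights):
    """Return location names sorted by descending weighted sum."""
    def score(name):
        return sum(w * v for w, v in zip(weights, candidates[name]))
    return sorted(candidates, key=score, reverse=True)


venue_lover = (3.0, 1.0)  # cares mostly about venues
commuter = (0.5, 3.0)     # cares mostly about a short trip to class

print(rank(venue_lover))
print(rank(commuter))
```

The two profiles put different locations first, which is the behaviour described above: heavy venue weights favour the center, while a heavy university-distance weight favours locations near campus.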

## Future directions <a name="future"></a>

We were able to achieve interesting results with both clustering and normalized weighted sums. Both models were based on the distances to our two reference points and on the number of venues around each candidate location. However, many other factors can affect how well a location meets a student’s needs. For example, one of the biggest factors that was not taken into account, in our opinion, is the student’s budget. Using a specific ceiling price as a budget could eliminate many candidate locations. In addition, distances may matter less once the number of bus stops around each housing location, and the distance to them, is taken into account. Furthermore, we could refine the models by considering each student’s hobbies or activities. We could also collect additional data through Google Forms in order to adapt the models to very specific needs. To sum up, all these data are harder to obtain and quantify, but incorporating them could bring significant improvements to our final model.
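
As one example of the budget idea, a rent ceiling could be applied as an extra filter before ranking. The sketch below is purely hypothetical: the notebook has no rent data, so the `Monthly rent (EUR)` column, its values, the placeholder addresses, and the `budget_ceiling` variable are all invented for illustration.

```python
import pandas as pd

# Hypothetical candidates; rents and scores are invented placeholders.
candidates = pd.DataFrame({
    'Address': ['A', 'B', 'C', 'D'],
    'Weighted Sums': [8.2, 7.9, 6.5, 5.1],
    'Monthly rent (EUR)': [720, 540, 480, 390],
})

budget_ceiling = 550  # hypothetical student budget in EUR per month

# Drop locations above the ceiling, then rank the rest by weighted sum.
affordable = (
    candidates[candidates['Monthly rent (EUR)'] <= budget_ceiling]
    .sort_values('Weighted Sums', ascending=False)
    .reset_index(drop=True)
)
print(affordable)
```

In this toy example the top-scoring location is removed by the ceiling, which shows how a budget constraint could change the recommendation.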