The core of the method is to score locations across the city by how densely they are surrounded by venues of interest (either sights or places to eat). Each location is encoded in terms of the nearby venues fetched through the Foursquare API, so its final score reflects the frequency of the desired venue categories in its vicinity. The locations can then be compared, and their scores are used to build the final heatmap for tourists.

1. Mark the city with a grid of spots (locations) that will be evaluated and used to build the heatmap.
2. Make an API request for each spot to extract the popular venues in its vicinity, along with basic data about them.
3. Process the acquired data: encode the venue categories, then calculate the frequency of the categories of interest for each spot.
4. Build the final table of all spots with their scores (i.e. weights), which measure the concentration of places of interest for tourists.
5. Create a heatmap from the spots' scores, highlighting the zones with higher scores (and therefore higher concentration).
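The scoring idea in steps 3 and 4 can be sketched on a toy table. This is only an illustration: the venue rows and the `categories_of_interest` list are hypothetical, and the `share` column is an assumed way of expressing a relative frequency next to the raw count.

```python
import pandas as pd

# Hypothetical toy data: two spots and the categories of the venues found near them.
venues = pd.DataFrame({
    "lat":  [55.75, 55.75, 55.75, 55.65, 55.65],
    "long": [37.61, 37.61, 37.61, 37.52, 37.52],
    "venue_cat": ["Restaurant", "Coffee Shop", "Park", "Gym", "Restaurant"],
})
categories_of_interest = ["Restaurant", "Coffee Shop"]  # assumed selection

# One-hot encode the categories, then sum per spot to get category counts.
encoded = pd.get_dummies(venues["venue_cat"], dtype=int)
encoded["lat"] = venues["lat"]
encoded["long"] = venues["long"]
counts = encoded.groupby(["lat", "long"]).sum()

# Raw score: number of venues of interest near each spot.
score = counts[categories_of_interest].sum(axis=1)
# Relative frequency: share of venues of interest among all venues near the spot.
share = score / counts.sum(axis=1)

summary = pd.DataFrame({"score": score, "share": share}).reset_index()
print(summary)
```

The raw count rewards spots with many venues of interest, while the share rewards spots where such venues dominate; which of the two suits a heatmap better is a design choice.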

Import Libraries

In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0; older versions import it from pandas.io.json)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [104]:
from folium.plugins import HeatMap

## Part 1: Locations generation around the city of interest

Input the city of interest and use the geolocator to identify its coordinates.

In [3]:
address = 'Moscow, Russia'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Moscow, Russia are 55.7504461, 37.6174943.


Then we create the spots: locations around the city centre that we will score and use in the modeling below.

In [43]:
def generate_coordinates(num=20):
    # build a num x num grid spanning +/-0.1 degrees around the city centre
    coordinates=[]
    for i in np.linspace(-0.1,0.1,num=num):
        for j in np.linspace(-0.1,0.1,num=num):
            coordinates.append((latitude+i,longitude+j))
    coordinates=pd.DataFrame(coordinates,columns=['lat','long'])
    return(coordinates)

In [44]:
coordinates=generate_coordinates()

In [45]:
print('We have generated {} locations scattered around Moscow'.format(len(coordinates)))

We have generated 400 locations scattered around Moscow
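It can help to translate the grid step from degrees into metres and compare it with the search radius used later. The sketch below takes the grid extent, grid size, city latitude and radius from the notebook; the conversion factor of about 111.32 km per degree of latitude is a common approximation and is an assumption here, not part of the original code.

```python
import math

# Values taken from the notebook: grid extent, grid size, city latitude, search radius.
HALF_EXTENT_DEG = 0.1
NUM = 20
CITY_LAT = 55.7504461
RADIUS_M = 300

# Approximation (assumption): one degree of latitude is about 111.32 km,
# and one degree of longitude shrinks by cos(latitude).
METERS_PER_DEG_LAT = 111_320.0

step_deg = 2 * HALF_EXTENT_DEG / (NUM - 1)
lat_spacing_m = step_deg * METERS_PER_DEG_LAT
lon_spacing_m = lat_spacing_m * math.cos(math.radians(CITY_LAT))

print(f"north-south spacing: {lat_spacing_m:.0f} m")
print(f"east-west spacing:   {lon_spacing_m:.0f} m")
print(f"search circles overlap: {min(lat_spacing_m, lon_spacing_m) < 2 * RADIUS_M}")
```

Under this approximation the spots are further apart than twice the 300 m radius in both directions, so the grid samples the city rather than covering it completely.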


In [339]:
def map_out_coordinates(coordinates):
    # create a map centred on the city of interest using its latitude and longitude
    map_plain = folium.Map(location=[latitude, longitude], zoom_start=10)

    # add markers to map
    for lat, lng in zip(coordinates['lat'], coordinates['long']):
        label = '{}, {}'.format(lat, lng)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=2,
            popup=label,
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(map_plain)  
    return(map_plain)

In [340]:
map_out_coordinates(coordinates)

## Part 2: API Requests to Foursquare service to extract useful data about the generated locations

In [55]:
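LIMIT=100   # maximum number of venues returned per request
radius=300  # search radius around each spot, in metres
# CLIENT_ID, CLIENT_SECRET and VERSION (Foursquare credentials and API version) must be defined before this cell is run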
def getNearbyVenues(latitudes, longitudes, radius):
    
    venues_list=[]
    for lat, lng in zip(latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
                  'lat', 
                  'long', 
                  'venue', 
                  'venue_lat', 
                  'venue_long', 
                  'venue_cat']
    
    return(nearby_venues)

In [56]:
spots_venues=getNearbyVenues(coordinates.lat,coordinates.long, radius=radius)
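The function above indexes straight into the response, so a spot with no recommendations or a venue without a category would raise an error partway through the loop. A more defensive way to read one decoded response might look like the sketch below. The helper name `extract_venues` and the sample payload are hypothetical; the payload only mimics the fields the notebook reads and is not real API output.

```python
def extract_venues(payload, lat, lng):
    """Return (lat, lng, name, venue_lat, venue_lng, category) tuples from one decoded response."""
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []  # no recommendations for this spot
    rows = []
    for item in groups[0].get("items", []):
        venue = item.get("venue", {})
        categories = venue.get("categories", [])
        location = venue.get("location", {})
        if not categories or "lat" not in location or "lng" not in location:
            continue  # skip venues that lack a category or coordinates
        rows.append((lat, lng, venue.get("name"),
                     location["lat"], location["lng"], categories[0]["name"]))
    return rows


# Hypothetical payload, shaped like the fields the notebook reads.
sample = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Venue A",
                   "location": {"lat": 55.6525, "lng": 37.5278},
                   "categories": [{"name": "Dumpling Restaurant"}]}},
        {"venue": {"name": "Venue B",
                   "location": {"lat": 55.6521, "lng": 37.5282},
                   "categories": []}},  # no category: skipped
    ]}]}
}
rows = extract_venues(sample, 55.650446, 37.528021)
print(rows)
```

Keeping the parsing separate from the HTTP call also makes it possible to check the logic without network access or credentials.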

## Part 3: Process the acquired data

In [57]:
encoded=pd.get_dummies(spots_venues[['venue_cat']], prefix="", prefix_sep="")

Let's observe all the category names:

In [365]:
print(spots_venues.venue_cat.unique())

['Beer Store' 'Tea Room' 'Pizza Place' 'Café' 'Wine Shop'
 'Dumpling Restaurant' 'Park' 'Eastern European Restaurant'
 'Asian Restaurant' 'Hobby Shop' 'Pool Hall' 'Middle Eastern Restaurant'
 'Restaurant' 'Nightclub' 'Burger Joint' 'Gym' 'Caucasian Restaurant'
 'Smoke Shop' 'Border Crossing' 'Japanese Restaurant' 'Bus Station'
 'Irish Pub' 'Dog Run' 'Convenience Store' 'Playground' 'Auto Workshop'
 'Supermarket' 'Fast Food Restaurant' 'Skating Rink' 'Russian Restaurant'
 'Beer Bar' 'Clothing Store' 'Salon / Barbershop' 'Gym / Fitness Center'
 'Farmers Market' 'Plaza' 'Pharmacy' 'Flower Shop' 'Other Repair Shop'
 'Furniture / Home Store' 'Bookstore' 'Office' 'Department Store'
 'Electronics Store' 'Leather Goods Store' 'Cafeteria' 'Fountain'
 'Racetrack' 'Athletics & Sports' 'Bus Stop' 'Go Kart Track' 'Beach Bar'
 'Food & Drink Shop' 'Bar' 'Spa' 'Coffee Shop' 'Bakery' 'Donut Shop'
 'French Restaurant' 'Italian Restaurant' 'Soccer Field'
 'Health Food Store' 'Paper / Office Supplies Stor

In [81]:
restaurant_categories=spots_venues.venue_cat[spots_venues.venue_cat.str.contains('|'.join(['staura','afe','Gastro','Creperie','offee','andwic','Bistro','Food']))].unique()

By creating the relevant categories, we filter out our dataset to extract the needed venues for the tourists.

In [368]:
spots_venues[spots_venues.venue_cat.isin(restaurant_categories)].head()

| Unnamed: 0 | lat | long | venue | venue_lat | venue_long | venue_cat |
|---|---|---|---|---|---|---|
| 5 | 55.650446 | 37.528021 | золотой родник | 55.652519 | 37.52783 | Dumpling Restaurant |
| 7 | 55.650446 | 37.528021 | Бутлер | 55.652158 | 37.528247 | Eastern European Restaurant |
| 8 | 55.650446 | 37.528021 | Алтан Булаг | 55.652489 | 37.527652 | Asian Restaurant |
| 11 | 55.650446 | 37.538547 | Чайхона № 1 | 55.649648 | 37.534465 | Middle Eastern Restaurant |
| 12 | 55.650446 | 37.538547 | Boobo NeoGeo | 55.649768 | 37.539455 | Restaurant |
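Because the category filter works on substrings, it is worth checking which names it actually catches. The sketch below applies the notebook's pattern list to a handful of category names from the output above. It assumes the accented "Café" uses the precomposed character U+00E9, which is why it is written with an explicit escape; how the API really encodes it is not confirmed here.

```python
import pandas as pd

# Pattern list copied from the notebook.
patterns = ['staura', 'afe', 'Gastro', 'Creperie', 'offee', 'andwic', 'Bistro', 'Food']

# A few category names from the notebook output; "Caf\u00e9" is written with an
# explicit precomposed accented e (assumption about how the API encodes it).
categories = pd.Series([
    'Dumpling Restaurant', 'Coffee Shop', 'Caf\u00e9', 'Cafeteria',
    'Health Food Store', 'Food & Drink Shop', 'Park',
])

mask = categories.str.contains('|'.join(patterns))
matched = categories[mask].tolist()
skipped = categories[~mask].tolist()
print('matched:', matched)
print('skipped:', skipped)
```

Under that assumption, the pattern 'afe' catches "Cafeteria" but not the accented "Café", while 'Food' also pulls in shops such as "Health Food Store". Whether either effect is wanted depends on what should count as a place for dining.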


In [370]:
encoded['lat']=spots_venues.lat
encoded['long']=spots_venues.long

## Part 4: Score the locations

In [371]:
scores=encoded.groupby(['lat','long']).sum()
scores=scores.loc[:,restaurant_categories]

In [372]:
scores=scores.sum(axis=1).reset_index()
scores.columns=['lat','long','score']
# cm.get_cmap was removed in Matplotlib 3.9; the colormap is also available as a module attribute
scores['colors']=[colors.rgb2hex(i) for i in cm.RdYlBu_r(scores.score/scores.score.max())]
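One detail of the grouping step: a spot for which the API returned no venues at all never appears in `spots_venues`, so it is also missing from `scores` instead of receiving a score of 0. A possible way to restore such spots is a left merge against the full grid, sketched below on hypothetical toy tables (`grid` stands in for `coordinates`, `scored` for `scores`).

```python
import pandas as pd

# Hypothetical toy grid of three spots; the last one returned no venues at all.
grid = pd.DataFrame({"lat": [55.65, 55.65, 55.75], "long": [37.52, 37.53, 37.61]})
scored = pd.DataFrame({"lat": [55.65, 55.65], "long": [37.52, 37.53], "score": [3, 0]})

# Keep every grid spot; spots without a match get NaN, which is replaced by 0.
full = grid.merge(scored, on=["lat", "long"], how="left")
full["score"] = full["score"].fillna(0).astype(int)
print(full)
```

The merge relies on exact floating-point equality of the coordinates, which holds when both tables come from the same in-memory grid. If the coordinates have been written to a file and read back, rounding both sides first may be safer.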

## Part 5: HeatMap

In [373]:
def map_out_scores(scores):
    scores=scores.copy()
    # create a map centred on the city of interest using its latitude and longitude
    map_plain = folium.Map(location=[latitude, longitude], zoom_start=10)
    #normalize the scores
    scores_init=scores.score
    scores.score=scores.score/scores.score.max()
    # add markers to map
    for lat, lng, color,sc in zip(scores['lat'], scores['long'],scores['colors'],scores_init):
        label = '{}, {}: score: {}'.format(lat, lng,sc)
        label = folium.Popup(label, parse_html=True)
        folium.CircleMarker(
            [lat, lng],
            radius=3,
            popup=label,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=0.5,
            parse_html=False).add_to(map_plain)  
    map_plain.add_child(HeatMap(scores.values[:,0:3], radius=15,max_val=0.6,min_val=0.05,max_zoom=1))
    return(map_plain)

map_out_scores(scores)
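The heatmap layer receives rows of latitude, longitude and weight. Since the scores table also carries a string column of colours, slicing `.values` yields an object array; selecting the numeric columns by name is one way to hand over plain floats. The table below is hypothetical (including the colour codes) and only mirrors the notebook's column layout.

```python
import pandas as pd

# Hypothetical scores table in the notebook's layout (lat, long, score, colors).
scores = pd.DataFrame({
    "lat": [55.75, 55.65],
    "long": [37.61, 37.52],
    "score": [22, 0],
    "colors": ["#a50026", "#313695"],
})

# Normalise by the maximum score, guarding against an all-zero table.
max_score = scores["score"].max()
weights = scores["score"] / max_score if max_score > 0 else scores["score"] * 0.0

# Rows of [lat, long, weight] as plain Python floats.
heat_data = scores[["lat", "long"]].assign(weight=weights).astype(float).values.tolist()
print(heat_data)
```

A list built this way could be passed as the first argument of `HeatMap` in place of `scores.values[:,0:3]`.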

The map shows clear zones of interest for tourists who want to visit a cafe or restaurant and have a variety of options close by. As expected, the most significant areas lie in the city centre, but some less obvious zones also stand out (for example, a spot in the west with the highest concentration: 22 restaurants within 300 m).

The outskirts of the city, by contrast, often have no restaurants nearby at all, with a few minor exceptions.

## 4. Results

As the notebook above shows, the method produces meaningful results, identifying the places of greatest interest to time-constrained tourists.
The map shows blobs of high concentration right in the centre of Moscow, and further blobs near the residential areas around the centre, where some cafes are located. Overall, the city shows a natural radial structure: a dense inner core, with the concentration of venues gradually decreasing towards the outskirts.

On a side note, the same process could be applied to any other city and to practically any other category (e.g. parks), so the tool has wider applications than the one presented here.
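To illustrate how the scoring could be reused for other categories, the sketch below wraps it in a function that takes the substrings to look for. The function name `score_spots`, the toy venues and both pattern lists are hypothetical; patterns are escaped so they are matched literally rather than as regular expressions.

```python
import re

import pandas as pd


def score_spots(venues, patterns):
    """Count, per spot, the venues whose category contains any of the given substrings."""
    regex = "|".join(re.escape(p) for p in patterns)
    hit = venues["venue_cat"].str.contains(regex)
    return (venues.assign(hit=hit.astype(int))
                  .groupby(["lat", "long"])["hit"].sum()
                  .rename("score")
                  .reset_index())


# Hypothetical toy data with two spots.
venues = pd.DataFrame({
    "lat":  [55.75, 55.75, 55.75, 55.65, 55.65],
    "long": [37.61, 37.61, 37.61, 37.52, 37.52],
    "venue_cat": ["Park", "Restaurant", "Plaza", "Coffee Shop", "Playground"],
})

park_scores = score_spots(venues, ["Park", "Playground", "Plaza"])
food_scores = score_spots(venues, ["staura", "offee"])
print(park_scores)
print(food_scores)
```

Switching the city would then only require a different `address` for the geolocator, while switching the theme only requires a different pattern list.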