# Part I - Explanation of the problem and why it is interesting

## 1 - Problem

Indication: Clearly define a problem or an idea of your choice, where you would need to leverage the Foursquare location data to solve or execute. Remember that data science problems always target an audience and are meant to help a group of stakeholders solve a problem, so make sure that you explicitly describe your audience and why they would care about your problem.

Israel occupies a distinctive place in the Middle East, with very specific needs and demand. Since its creation on 14 May 1948, following the UN vote, it has been strongly influenced by the international community, and by Western culture in particular.
Nevertheless, Israel remains part of the region and is affected by the surrounding Arab culture.

This combination of cultures makes Israeli culture and consumer behaviour extremely difficult to understand.

Several international companies would like to understand this context better in order to plan a strategy for setting up operations there.

#### Nota bene: there is nothing political about this notebook; its purpose is purely scientific.

## 2 - Data

Indication: Describe the data that you will be using to solve the problem or execute your idea. Remember that you will need to use the Foursquare location data to solve the problem or execute your idea. You can absolutely use other datasets in combination with the Foursquare location data. So make sure that you provide adequate explanation and discussion, with examples, of the data that you will be using, even if it is only Foursquare location data.

We would like to benchmark Tel Aviv, Israel's economic hub, against two well-known cities to characterise the consumption behaviour of its population.

For the benchmark, we have selected New York City and Amman. We will use Foursquare location data to compare the venues of the three cities in terms of shops, cafes, restaurants and so on. By building a venue profile for each city from Foursquare, we should be able to characterise Tel Aviv and then compare it to Amman and New York City. We will then investigate the differences and conclude by identifying the better proxy.

To keep the comparison consistent, the number of venues will be divided by the population of each city.
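The per-capita normalisation can be sketched as follows. This is only an illustration: the city names, venue counts and populations are made-up placeholders, and `venues_per_100k` is a hypothetical helper that does not appear elsewhere in this notebook.

```python
# Hypothetical venue counts and populations, for illustration only.
# Real values would come from the Foursquare responses and from census data.
venue_counts = {"City A": 120, "City B": 300}
populations = {"City A": 400_000, "City B": 2_000_000}


def venues_per_100k(counts, pops):
    """Scale raw venue counts to venues per 100,000 inhabitants."""
    return {city: counts[city] * 100_000 / pops[city] for city in counts}


normalized = venues_per_100k(venue_counts, populations)
for city, rate in normalized.items():
    print(f"{city}: {rate:.1f} venues per 100,000 inhabitants")
```

Expressing the result per 100,000 inhabitants rather than per person is just a readability choice; any common scale gives the same ranking.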

If neither of the two cities is close enough, we will iterate with other cities among the 50 largest.
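One possible way to decide which candidate city is "close enough" is to compare venue-category profiles with a similarity score. The sketch below uses cosine similarity, which is an assumption of this example and not a method stated in the notebook; the profiles are invented numbers, and `cosine_similarity` and `best_proxy` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {category: value} profiles."""
    categories = set(a) | set(b)
    dot = sum(a.get(c, 0.0) * b.get(c, 0.0) for c in categories)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is similar to nothing
    return dot / (norm_a * norm_b)


def best_proxy(target, candidates):
    """Return (name, score) of the candidate most similar to the target."""
    scores = {name: cosine_similarity(target, profile)
              for name, profile in candidates.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Made-up profiles (venues per 100,000 inhabitants, by category).
target = {"Café": 3.0, "Restaurant": 4.0}
candidates = {
    "Candidate 1": {"Café": 6.0, "Restaurant": 8.0},  # same mix, larger scale
    "Candidate 2": {"Café": 4.0, "Park": 3.0},        # different mix
}

name, score = best_proxy(target, candidates)
print(f"Best proxy: {name} (similarity {score:.2f})")
```

Cosine similarity ignores overall scale, so it compares the mix of venues rather than their density. If density matters as well, a distance on the per-capita values themselves may be a better fit, and a minimum score could serve as the cut-off for trying further cities.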

# Part II - Method to solve it

## 0 - Import of libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')


In [None]:
CLIENT_ID = 'DCPMWEGCH0C2PZR13QN0IB0KEJSDLZMIPARV415TX5Y52FQ1' # your Foursquare ID
CLIENT_SECRET = 'W3BY4ARSZPO4YLRLD4BHCEWHJKLZW5UJYHEHEVDPKVXGRZXX' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)
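Because credentials written into a notebook are easy to share by accident, they could instead be read from the environment. The sketch below is one way to do that; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this example, not a Foursquare convention.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from a mapping of environment variables.

    The variable names are hypothetical; use whatever names your setup defines.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


def mask(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with a plain dict standing in for os.environ.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-client-id",
    "FOURSQUARE_CLIENT_SECRET": "demo-client-secret",
}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID: " + mask(client_id))
print("CLIENT_SECRET: " + mask(client_secret))
```

Calling `load_foursquare_credentials()` with no argument reads from the real process environment, and masking the values before printing keeps them out of saved notebook output.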

In [None]:
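# function that extracts the category of the venue
def get_category_type(row):
    # the key is 'categories' or 'venue.categories', depending on the response shape
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue without any category has no type
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']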

## 1 - Explore Amman

In [None]:
address = 'Amman'

geolocator = Nominatim(user_agent="my access")
location = geolocator.geocode(address)
amman_latitude = location.latitude
amman_longitude = location.longitude
print('The geographical coordinates of Amman are {}, {}.'.format(amman_latitude, amman_longitude))

In [None]:
LIMIT = 1000 # limit of number of venues returned by Foursquare API

radius = 10000 # define radius in meters

# create URL

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    amman_latitude, 
    amman_longitude, 
    radius, 
    LIMIT)
amman_results = requests.get(url).json() # send the GET request and parse the JSON response
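The same explore URL is assembled by hand for each city, so a small helper could build it once. The sketch below is an assumption-laden illustration: `build_explore_url` is a hypothetical function, the endpoint and parameter names are taken from the URL used in this notebook, and the credentials and coordinates are placeholders.

```python
from urllib.parse import urlencode

# Endpoint as used in this notebook.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version,
                      latitude, longitude, radius=10000, limit=1000):
    """Build the explore URL for a point, a radius in meters and a result limit."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{latitude},{longitude}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma of the "ll" pair unescaped for readability.
    return f"{EXPLORE_ENDPOINT}?{urlencode(params, safe=',')}"


# Placeholder credentials and coordinates, for illustration only.
url = build_explore_url("ID", "SECRET", "20180605", 1.5, 2.5)
print(url)
```

With such a helper, each city section would only need its own latitude and longitude, and `urlencode` takes care of escaping any special characters in the parameter values.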

## 2 - Explore NYC

In [None]:
address = 'New York City'

geolocator = Nominatim(user_agent="n")
location = geolocator.geocode(address)
nyc_latitude = location.latitude
nyc_longitude = location.longitude
print('The geographical coordinates of NYC are {}, {}.'.format(nyc_latitude, nyc_longitude))

In [None]:
LIMIT = 1000 # limit of number of venues returned by Foursquare API

radius = 10000 # define radius in meters

# create URL

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    nyc_latitude, 
    nyc_longitude, 
    radius, 
    LIMIT)
nyc_results = requests.get(url).json()

In [None]:
venues = nyc_results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()
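The parsing steps in the cell above could be wrapped in one function and reused for every city. The sketch below does this with plain dictionaries instead of a dataframe so that it runs without any dependency. `extract_venues` is a hypothetical helper, and the mock response only mimics the fields this notebook reads (`response.groups[0].items`, `venue.name`, `venue.categories`, `venue.location`); it is not real Foursquare data.

```python
from collections import Counter


def extract_venues(results):
    """Flatten an explore-style response into a list of venue dicts."""
    items = results["response"]["groups"][0]["items"]
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        rows.append({
            "name": venue["name"],
            # keep the first category name, or None when the list is empty
            "categories": categories[0]["name"] if categories else None,
            "lat": venue["location"]["lat"],
            "lng": venue["location"]["lng"],
        })
    return rows


# Mock response with invented venues, for illustration only.
mock_results = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Café", "categories": [{"name": "Café"}],
               "location": {"lat": 1.0, "lng": 2.0}}},
    {"venue": {"name": "Second Café", "categories": [{"name": "Café"}],
               "location": {"lat": 1.1, "lng": 2.1}}},
    {"venue": {"name": "Unlabelled Spot", "categories": [],
               "location": {"lat": 3.0, "lng": 4.0}}},
]}]}}

rows = extract_venues(mock_results)
category_counts = Counter(r["categories"] for r in rows if r["categories"])
print(category_counts.most_common())
```

The resulting counts per category are the kind of venue profile that can then be normalised and compared between cities.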

## 3 - Explore Israel

In [None]:
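address = 'Tel Aviv, Israel' # the Data section benchmarks Tel Aviv; geocoding 'Israel' would return a country-level point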

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
israel_latitude = location.latitude
israel_longitude = location.longitude
print('The geographical coordinates of Israel are {}, {}.'.format(israel_latitude, israel_longitude))

In [None]:
LIMIT = 1000 # limit of number of venues returned by Foursquare API

radius = 10000 # define radius in meters

# create URL

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    israel_latitude, 
    israel_longitude, 
    radius, 
    LIMIT)