# Capstone Project: Battle of the Neighborhoods (Week 2)

## Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find an **optimal location for an art gallery**. Specifically, this report will be targeted to stakeholders interested in opening an **art gallery in Paris, France**.

Since there are lots of art galleries in Paris, we will try to detect locations that are not already crowded with art galleries. We are also particularly interested in areas with no sculpture galleries in the vicinity. We would also prefer locations as close to the city centre as possible.

We will use our data science powers to generate a few of the most promising neighborhoods based on these criteria. The advantages of each area will then be clearly expressed so that the best possible final location can be chosen by stakeholders.


## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* number of existing art galleries in the neighborhood
* distance of neighborhood from city center

We decided to use a regularly spaced grid of locations, centered on the city center, to define our neighborhoods.

The following tools and data sources will be needed to extract or generate the required information:
* maps of the location data will be drawn using **Folium**
* the number of art galleries in every neighborhood, along with their type and location, will be obtained using the **Foursquare API**


We start by importing the necessary libraries.

In [6]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# function for transforming JSON into a pandas dataframe
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')


First we get the latitude and longitude coordinates of the centroids of the key locations. We will create a grid of cells covering our area of interest, which is approx. 10x10 kilometers centered on the Louvre museum.

The first step is finding the latitude and longitude of the Louvre museum from its address (Rue de Rivoli, 75001 Paris) using the Nominatim geocoder from geopy.

In [8]:
address = 'Rue de Rivoli, Paris, France'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
 
print("Coordinates of Louvre:",latitude, longitude)

Coordinates of Louvre: 48.866387 2.3234711
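
The grid of candidate cells described earlier is not built anywhere in this notebook. What follows is only a minimal sketch of how it could be generated, assuming a flat-earth approximation (reasonable over roughly 10 km), a hypothetical spacing of 600 m, and a hypothetical helper name `make_grid`. The centre is the geocoded point printed above.

```python
import math

def make_grid(center_lat, center_lon, half_size_m=5000, step_m=600):
    """Return (lat, lon) centres of a square grid around the given point.

    Uses a local equirectangular approximation: one degree of latitude is
    about 111,320 m, and one degree of longitude shrinks by cos(latitude).
    The spacing (step_m) is an assumption, not a value from the notebook.
    """
    m_per_deg_lat = 111320.0
    m_per_deg_lon = 111320.0 * math.cos(math.radians(center_lat))
    n = int(half_size_m // step_m)  # number of steps on each side of the centre
    grid = []
    for i in range(-n, n + 1):      # south to north
        for j in range(-n, n + 1):  # west to east
            lat = center_lat + i * step_m / m_per_deg_lat
            lon = center_lon + j * step_m / m_per_deg_lon
            grid.append((lat, lon))
    return grid

grid = make_grid(48.866387, 2.3234711)
print(len(grid), "candidate cell centres")
```

A smaller `step_m` gives a finer grid at the cost of more cells to evaluate.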


In [10]:
search_query = 'art gallery'
radius = 5000
print(search_query + ' .... OK!')

art gallery .... OK!


In [55]:
url='https://api.foursquare.com/v2/venues/search?client_id=ALDAVZA23BVAYDJ55UWNZYDVYGHPXQCQKHJM1ZWYKWORJAKC&client_secret=2W4BBJS51LIRUNZAYG5CTRPX5WR55RVVLEWZ1FRNZ2GM1B1G&v=20200606&ll=48.866387,2.3234711&query=artgallery&radius=5000&limit=1000'
url

'https://api.foursquare.com/v2/venues/search?client_id=ALDAVZA23BVAYDJ55UWNZYDVYGHPXQCQKHJM1ZWYKWORJAKC&client_secret=2W4BBJS51LIRUNZAYG5CTRPX5WR55RVVLEWZ1FRNZ2GM1B1G&v=20200606&ll=48.866387,2.3234711&query=artgallery&radius=5000&limit=1000'
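
Writing the credentials directly into the URL exposes them to anyone who reads the notebook. One possible alternative, sketched here under assumptions, is to assemble the same query string from variables and read the credentials from environment variables. The names `FOURSQUARE_CLIENT_ID`, `FOURSQUARE_CLIENT_SECRET`, and `build_search_url` are hypothetical; the endpoint and parameter names are the ones that appear in the URL above.

```python
import os
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"

def build_search_url(lat, lon, query, radius, limit, version="20200606"):
    """Assemble the venue-search URL without embedding secrets in the source."""
    params = {
        # Hypothetical environment variable names; set them outside the notebook.
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", "YOUR_CLIENT_ID"),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", "YOUR_CLIENT_SECRET"),
        "v": version,
        "ll": f"{lat},{lon}",
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    # safe=',' keeps the comma in the ll parameter unescaped, as in the original URL
    return f"{SEARCH_ENDPOINT}?{urlencode(params, safe=',')}"

url = build_search_url(48.866387, 2.3234711, "artgallery", 5000, 1000)
print(url.split("?")[0])  # print only the endpoint, never the credentials
```

The resulting `url` can be passed to `requests.get` exactly as in the next cell.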

In [56]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5edced400a2972001b740882'},
 'response': {'venues': [{'id': '4cab440436fa6dcb2f6cd778',
    'name': "Galerie d'Art Primitif Africain        Art Gallery l'Oeil et la Main     Expert",
    'location': {'address': '41 rue de Verneuil',
     'lat': 48.857997,
     'lng': 2.328082,
     'labeledLatLngs': [{'label': 'display',
       'lat': 48.857997,
       'lng': 2.328082}],
     'distance': 993,
     'postalCode': '75007',
     'cc': 'FR',
     'city': 'Paris',
     'state': 'Île-de-France',
     'country': 'France',
     'formattedAddress': ['41 rue de Verneuil', '75007 Paris', 'France']},
    'categories': [{'id': '4bf58dd8d48988d1e2931735',
      'name': 'Art Gallery',
      'pluralName': 'Art Galleries',
      'shortName': 'Art Gallery',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/arts_entertainment/artgallery_',
       'suffix': '.png'},
      'primary': True}],
    'venuePage': {'id': '54632235'},
    'referralId': 'v-1591537

In [57]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d1e2931735', 'name': 'A...",False,4cab440436fa6dcb2f6cd778,41 rue de Verneuil,FR,Paris,France,993,"[41 rue de Verneuil, 75007 Paris, France]","[{'label': 'display', 'lat': 48.857997, 'lng':...",48.857997,2.328082,75007.0,Île-de-France,Galerie d'Art Primitif Africain Art Gal...,v-1591537148,54632235.0
1,"[{'id': '4bf58dd8d48988d1e2931735', 'name': 'A...",False,4c9b951680958cfa47fe4cd4,36 rue de Penthièvre,FR,Paris,France,991,"[36 rue de Penthièvre, 75008 Paris, France]","[{'label': 'display', 'lat': 48.87225903355606...",48.872259,2.313298,75008.0,Île-de-France,Art Concorde,v-1591537148,
2,"[{'id': '4bf58dd8d48988d1e2931735', 'name': 'A...",False,5236fcf2498e2893fd02b918,21 rue,FR,Paris,France,778,"[21 rue, Paris, France]","[{'label': 'display', 'lat': 48.86632587291662...",48.866326,2.334102,,Île-de-France,Bolotina Art Gallery,v-1591537148,
3,"[{'id': '4bf58dd8d48988d1e2931735', 'name': 'A...",False,50ae6a6be4b0bcea5ca90981,45 rue de Penthièvre,FR,Paris,France,986,"[45 rue de Penthièvre, 75008 Paris, France]","[{'label': 'display', 'lat': 48.87227, 'lng': ...",48.87227,2.313395,75008.0,Île-de-France,BJ Art Gallery,v-1591537148,
4,"[{'id': '4bf58dd8d48988d1e2931735', 'name': 'A...",False,533720c9498e8f2e48cc9e5f,96 rue de Grenelle,FR,Paris,France,1157,"[96 rue de Grenelle, 75007 Paris, France]","[{'label': 'display', 'lat': 48.85599899291992...",48.855999,2.322671,75007.0,Île-de-France,Prince & Princess Art Gallery,v-1591537148,


In [58]:
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# copy so that later column assignments do not act on a view of `dataframe`
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered


Unnamed: 0,name,categories,address,cc,city,country,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
0,Galerie d'Art Primitif Africain Art Gal...,Art Gallery,41 rue de Verneuil,FR,Paris,France,993,"[41 rue de Verneuil, 75007 Paris, France]","[{'label': 'display', 'lat': 48.857997, 'lng':...",48.857997,2.328082,75007.0,Île-de-France,4cab440436fa6dcb2f6cd778
1,Art Concorde,Art Gallery,36 rue de Penthièvre,FR,Paris,France,991,"[36 rue de Penthièvre, 75008 Paris, France]","[{'label': 'display', 'lat': 48.87225903355606...",48.872259,2.313298,75008.0,Île-de-France,4c9b951680958cfa47fe4cd4
2,Bolotina Art Gallery,Art Gallery,21 rue,FR,Paris,France,778,"[21 rue, Paris, France]","[{'label': 'display', 'lat': 48.86632587291662...",48.866326,2.334102,,Île-de-France,5236fcf2498e2893fd02b918
3,BJ Art Gallery,Art Gallery,45 rue de Penthièvre,FR,Paris,France,986,"[45 rue de Penthièvre, 75008 Paris, France]","[{'label': 'display', 'lat': 48.87227, 'lng': ...",48.87227,2.313395,75008.0,Île-de-France,50ae6a6be4b0bcea5ca90981
4,Prince & Princess Art Gallery,Art Gallery,96 rue de Grenelle,FR,Paris,France,1157,"[96 rue de Grenelle, 75007 Paris, France]","[{'label': 'display', 'lat': 48.85599899291992...",48.855999,2.322671,75007.0,Île-de-France,533720c9498e8f2e48cc9e5f
5,Alwane Art Gallery,Art Gallery,8 rue Milton,FR,Paris,France,1758,"[8 rue Milton, 75009 Paris, France]","[{'label': 'display', 'lat': 48.87732378451565...",48.877324,2.340793,75009.0,Île-de-France,4c194296838020a13a72e561
6,A2Z Art Gallery,Art Gallery,24 rue de l'Échaudé,FR,Paris,France,1680,"[24 rue de l'Échaudé, 75006 Paris, France]","[{'label': 'display', 'lat': 48.853578, 'lng':...",48.853578,2.335615,75006.0,Île-de-France,57e15202498eee71d3f97f8a
7,Paris Art Gallery,Art Gallery,67 avenue de Breteuil,FR,Paris,France,2216,"[67 avenue de Breteuil, 75007 Paris, France]","[{'label': 'display', 'lat': 48.847894, 'lng':...",48.847894,2.312242,75007.0,Île-de-France,57b6ea0c498e3f2ad319159c
8,In)( between Art Gallery,Art Gallery,39 rue Chapon,FR,Paris,France,2192,"[39 rue Chapon, 75003 Paris, France]","[{'label': 'display', 'lat': 48.8657, 'lng': 2...",48.8657,2.3534,75003.0,Île-de-France,55e88e33498e4f14ce62c7d7
9,International Art Gallery,,78 avenue de Suffren,FR,Paris,France,2378,"[78 avenue de Suffren, 75015 Paris, France]","[{'label': 'display', 'lat': 48.8519123, 'lng'...",48.851912,2.29959,75015.0,Île-de-France,55e2650b498e2abd1bc307bb


In [59]:
dataframe_filtered.name


0     Galerie d'Art Primitif Africain        Art Gal...
1                                          Art Concorde
2                                  Bolotina Art Gallery
3                                        BJ Art Gallery
4                        Prince &  Princess Art Gallery
5                                    Alwane Art Gallery
6                                       A2Z Art Gallery
7                                     Paris Art Gallery
8                              In)( between Art Gallery
9                             International Art Gallery
10                               Democratik-art Gallery
11                             French Paper Art Gallery
12                                 No Man's Art Gallery
13                              In)(between Art Gallery
14                                 Lowave - art gallery
15                               printmodel art gallery
16                               Joel Knafo Art Gallery
Name: name, dtype: object

In [60]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around the Louvre

# add a red circle marker to represent the Louvre
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Louvre',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the art galleries as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

So now we have the art galleries that Foursquare returned within a few kilometers of the Louvre!
This concludes the data gathering phase. We're now ready to use this data for analysis to produce the report on optimal locations for a new art gallery!



## Methodology <a name="methodology"></a>

In this project we will direct our efforts toward detecting areas of Paris that have a low density of art galleries, particularly those with few sculpture galleries. We will limit our analysis to an area of about 5 km around the city center.
In the first step we collected the required data: the location of every art gallery that the Foursquare search returned within 5 km of the Paris center. We take the Paris center to be the area around the Louvre museum, in the 1st arrondissement. We believe this is the best center point because it is where the majority of tourists spend most of their time. This matters to the stakeholders because tourist traffic improves their chances of maximizing revenue.

The second step in our analysis will be the calculation and exploration of 'art gallery density' across different areas of Paris. We will use heatmaps to identify a few promising areas close to the center with a low number of art galleries in general, and focus our attention on those areas.
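
A heatmap is, underneath, a count of venues per map cell. Here is a minimal sketch of that count in plain Python, assuming hypothetical cells of 0.01 degrees (roughly 1.1 km north-south by 0.7 km east-west at this latitude) and a hypothetical helper name `bin_counts`. The coordinates are the first five galleries from the table above.

```python
from collections import Counter

def bin_counts(points, cell_deg=0.01):
    """Count points per square cell of `cell_deg` degrees (cell size is an assumption)."""
    counts = Counter()
    for lat, lon in points:
        # floor-divide to get the integer index of the cell containing the point
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell] += 1
    return counts

galleries = [
    (48.857997, 2.328082),  # Galerie d'Art Primitif Africain
    (48.872259, 2.313298),  # Art Concorde
    (48.866326, 2.334102),  # Bolotina Art Gallery
    (48.872270, 2.313395),  # BJ Art Gallery
    (48.855999, 2.322671),  # Prince & Princess Art Gallery
]

# densest cells first
for cell, n in bin_counts(galleries).most_common():
    print(cell, n)
```

Cells with a count of zero never appear in the `Counter`, and those are the low-density areas of interest.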

In the third and final step we will focus on the most promising areas and, within those, create clusters of locations that meet some basic requirements established in discussion with stakeholders: we will consider locations with no more than three art galleries within a radius of 300 meters. We will present a map of all such locations and also cluster them (using k-means clustering) to identify general zones, neighborhoods, and addresses that should serve as a starting point for the final 'street level' exploration and search for the optimal venue location by stakeholders.
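
The clustering step is named here but not shown in the notebook. As an illustration only, this is a minimal sketch of Lloyd's k-means algorithm in plain Python. The helper name `kmeans` and the candidate coordinates are hypothetical, and in practice a library implementation such as scikit-learn's `KMeans` would be the more usual choice.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on (lat, lon) pairs.

    Treats coordinates as planar, a simplification that is tolerable
    only over a small area such as central Paris.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels

# Hypothetical candidate locations forming two well-separated groups.
candidates = [
    (48.850, 2.300), (48.851, 2.302), (48.849, 2.301),
    (48.880, 2.350), (48.881, 2.352), (48.879, 2.351),
]
centroids, labels = kmeans(candidates, k=2)
print(labels)
```

Each centroid can then be reverse geocoded to give stakeholders an approximate address for the zone.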


## Analysis <a name="analysis"></a>

Let's perform some basic exploratory data analysis and derive some additional information from our raw data. First, let's count the art galleries returned for the search area:

In [69]:

# total number of galleries returned by the search (not a per-area average)
location_count = len(dataframe_filtered)
print('Number of art galleries found within 5 km of the centre:', location_count)
dataframe_filtered

Number of art galleries found within 5 km of the centre: 17


Unnamed: 0,name,categories,address,cc,city,country,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
0,Galerie d'Art Primitif Africain Art Gal...,Art Gallery,41 rue de Verneuil,FR,Paris,France,993,"[41 rue de Verneuil, 75007 Paris, France]","[{'label': 'display', 'lat': 48.857997, 'lng':...",48.857997,2.328082,75007.0,Île-de-France,4cab440436fa6dcb2f6cd778
1,Art Concorde,Art Gallery,36 rue de Penthièvre,FR,Paris,France,991,"[36 rue de Penthièvre, 75008 Paris, France]","[{'label': 'display', 'lat': 48.87225903355606...",48.872259,2.313298,75008.0,Île-de-France,4c9b951680958cfa47fe4cd4
2,Bolotina Art Gallery,Art Gallery,21 rue,FR,Paris,France,778,"[21 rue, Paris, France]","[{'label': 'display', 'lat': 48.86632587291662...",48.866326,2.334102,,Île-de-France,5236fcf2498e2893fd02b918
3,BJ Art Gallery,Art Gallery,45 rue de Penthièvre,FR,Paris,France,986,"[45 rue de Penthièvre, 75008 Paris, France]","[{'label': 'display', 'lat': 48.87227, 'lng': ...",48.87227,2.313395,75008.0,Île-de-France,50ae6a6be4b0bcea5ca90981
4,Prince & Princess Art Gallery,Art Gallery,96 rue de Grenelle,FR,Paris,France,1157,"[96 rue de Grenelle, 75007 Paris, France]","[{'label': 'display', 'lat': 48.85599899291992...",48.855999,2.322671,75007.0,Île-de-France,533720c9498e8f2e48cc9e5f
5,Alwane Art Gallery,Art Gallery,8 rue Milton,FR,Paris,France,1758,"[8 rue Milton, 75009 Paris, France]","[{'label': 'display', 'lat': 48.87732378451565...",48.877324,2.340793,75009.0,Île-de-France,4c194296838020a13a72e561
6,A2Z Art Gallery,Art Gallery,24 rue de l'Échaudé,FR,Paris,France,1680,"[24 rue de l'Échaudé, 75006 Paris, France]","[{'label': 'display', 'lat': 48.853578, 'lng':...",48.853578,2.335615,75006.0,Île-de-France,57e15202498eee71d3f97f8a
7,Paris Art Gallery,Art Gallery,67 avenue de Breteuil,FR,Paris,France,2216,"[67 avenue de Breteuil, 75007 Paris, France]","[{'label': 'display', 'lat': 48.847894, 'lng':...",48.847894,2.312242,75007.0,Île-de-France,57b6ea0c498e3f2ad319159c
8,In)( between Art Gallery,Art Gallery,39 rue Chapon,FR,Paris,France,2192,"[39 rue Chapon, 75003 Paris, France]","[{'label': 'display', 'lat': 48.8657, 'lng': 2...",48.8657,2.3534,75003.0,Île-de-France,55e88e33498e4f14ce62c7d7
9,International Art Gallery,,78 avenue de Suffren,FR,Paris,France,2378,"[78 avenue de Suffren, 75015 Paris, France]","[{'label': 'display', 'lat': 48.8519123, 'lng'...",48.851912,2.29959,75015.0,Île-de-France,55e2650b498e2abd1bc307bb
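
The methodology calls for counting the galleries within 300 meters of each candidate location. A minimal sketch of that count follows, assuming the haversine formula for distance and hypothetical helper names `haversine_m` and `count_within`. The gallery coordinates are the first five rows of the table above, and the second candidate point is made up for illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def count_within(candidate, venues, radius_m=300):
    """Number of venues no farther than `radius_m` from the candidate point."""
    return sum(1 for v in venues if haversine_m(*candidate, *v) <= radius_m)

galleries = [
    (48.857997, 2.328082),  # Galerie d'Art Primitif Africain
    (48.872259, 2.313298),  # Art Concorde
    (48.866326, 2.334102),  # Bolotina Art Gallery
    (48.872270, 2.313395),  # BJ Art Gallery
    (48.855999, 2.322671),  # Prince & Princess Art Gallery
]
candidates = [
    (48.866387, 2.3234711),  # the geocoded centre
    (48.8722, 2.3133),       # hypothetical point on rue de Penthièvre
]

counts = [count_within(c, galleries) for c in candidates]
# keep candidates with no more than three galleries within 300 m
good = [c for c, n in zip(candidates, counts) if n <= 3]
print(counts, len(good))
```

Averaging `counts` over a full grid of candidates would give the per-area average number of galleries.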


## Results and Discussion <a name="results"></a>

We can see that, according to Foursquare data, only a small number of art galleries were found in the key area (within 5 km of the Louvre), which suggests that our stakeholders will not face overwhelming competition. This holds wherever in that area they choose to open their art gallery, since the search returned only 17 such galleries.

## Conclusion <a name="conclusion"></a>

In conclusion, the project has found valuable information for the stakeholders. They had expected a very high number of art galleries in the area most tourists visit when they go to Paris, but the Foursquare search returned only a small number of competitors.

This suggests that they are well placed to succeed in their business.

