## Capstone Project - The Battle of Neighborhoods 

### WEEK 1

## 1. Business Problem

Aim: analyse the restaurant market portfolio in Segovia to find a suitable location for a new restaurant.
Context: in this small town near Madrid there is not a wide variety of restaurants. A new Chinese restaurant is planning to open. The owners know that Segovians really like this type of food, but it is not one of their first choices when thinking of a place to go for a meal. Segovia is a very traditional town and its people are used to eating typical Spanish food. The new Chinese restaurant therefore wants to capture the essence of Segovians' behaviour and use it to attract customers.

- *What is the essence of Segovians' behaviour?* 
    Although Segovia is a very small town, its residents always meet in the historic center of the city. The usual meeting point is the iconic Roman aqueduct, and the surrounding area has a medieval atmosphere, with medieval walls, Romanesque churches, a former royal palace and a Gothic cathedral.

- *What is the objective?*
    The new Chinese restaurant is deciding which location would best capture the essence of Segovians' behaviour. The owners are considering a site outside the walls but near the historic center, but they are not sure whether this is the best place.
    
Therefore, we are going to perform data analysis with Python and Foursquare data to obtain objective data about the restaurant market portfolio and, combined with the qualitative data we have already gathered about Segovians, find the best location for this new restaurant to succeed.

## 2. Data

- The data will cover two neighbourhoods: inside the city walls (historic center) and outside the city walls (urban area).
- We are going to analyse the venues in these two neighbourhoods:
    - Frequency of restaurants, to see where most of the restaurants are located.
        - If many restaurants are concentrated in one neighbourhood, this might be the right neighbourhood.
        - What types of restaurants are there in the most populated area?
        - Do these restaurants match the essence of Segovians' behaviour?
    - Location of current Chinese restaurants, to analyse the competition.
        - Is there an area where most Chinese (or Asian) restaurants are concentrated?
        - Can we find Chinese restaurants in the most populated area?
        - Do these Chinese restaurants match the essence of Segovians' behaviour?
    - Trigger venues (venues such as bars or pubs that can be decisive when choosing where to go for a meal, i.e. people might choose a restaurant because it is close to those venues).
        - Is there a specific leisure area within either of the neighbourhoods?
        - Which venues match the essence of Segovians' behaviour?
        - In which neighbourhood are the matching venues located?

- This quantitative data will later be interpreted in the context of the qualitative data we have available about the habits and behaviour of Segovians.


### WEEK 2

## 3. Methodology

### DATA PREPARATION

In [139]:
#Import all libraries needed to treat the data. We are considering only 2 neighbourhoods in Segovia: "CENTRO" and "LA GRANJA"
import pandas as pd
import numpy as np
import requests
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # install geopy; comment this line out if it is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, available since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # install folium; comment this line out if it is already installed
import folium # map rendering library

print('Libraries imported.')

Libraries imported.

###### Define Foursquare Credentials and Version

In [140]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20200811' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET


#### SEGOVIA CENTRO NEIGHBOURHOOD

#### In the next step we gather all the venues located in the Segovia Centro neighbourhood. Our main focus is on bars, restaurants and trigger venues.

In [141]:
#Obtain the geographical coordinates of Segovia Centro
address = 'Segovia, Spain'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude_centro = location.latitude
longitude_centro = location.longitude
print('The geographical coordinates of Segovia are {}, {}.'.format(latitude_centro, longitude_centro))

The geographical coordinates of Segovia are 40.9502159, -4.1241494.


In [142]:
#GET request URL
LIMIT_centro = 100
radius_centro = 500
 
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID,CLIENT_SECRET, VERSION, latitude_centro, longitude_centro, radius_centro, LIMIT_centro)
results = requests.get(url).json()
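
The two requests in this notebook differ only in their coordinates, radius and limit, so the URL construction can be factored into a helper. The following is a minimal sketch, not the notebook's original code: the function name `build_explore_url` is hypothetical, the credentials are placeholders to be replaced with your own, and the standard library's `urlencode` is used so that every parameter is escaped correctly.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Return the 'explore' URL for a circle of `radius` metres around (lat, lng)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),  # latitude and longitude, comma separated
        'radius': radius,
        'limit': limit,
    }
    # urlencode escapes each value (the comma in `ll` becomes %2C)
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials (assumption): replace with your own before sending the request
url_centro = build_explore_url('YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET', '20200811',
                               40.9502159, -4.1241494, radius=500, limit=100)
print(url_centro)
```

The resulting string can be passed to `requests.get(...).json()` in the same way as the `url` built in the cell above.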

In [143]:
results

{'meta': {'code': 200, 'requestId': '5f36b2ad6dd9a7231df4b0d0'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Segovia',
  'headerFullLocation': 'Segovia',
  'headerLocationGranularity': 'city',
  'totalResults': 54,
  'suggestedBounds': {'ne': {'lat': 40.954715904500006,
    'lng': -4.118202457957227},
   'sw': {'lat': 40.9457158955, 'lng': -4.130096342042774}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bdc7e7a3904a5938e704e9e',
       'name': 'Catedral de Segovia',
       'location': {'address': 'Pl. Mayor',
        'crossStreet': 'C. Marqués de Arco',
        'lat': 40.95012928546704,
        'lng': -4.125484228134155,
        'labeledLatLngs': [{'label': 'display',
   

###### We found 55 venues in Segovia Centro
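
The cell that builds `nearby_venues` from `results` is not shown above. A minimal sketch of that step follows, assuming the usual shape of an `explore` response, in which each item holds a `venue` with a `name`, a `location` and a `categories` list. The helper name `venues_to_rows` and the sample response are illustrative, not the original code, and the `categories` field is an assumption, since the printed output above is truncated before it appears.

```python
def venues_to_rows(response_json):
    """Flatten a Foursquare 'explore' response into one dict per venue."""
    items = response_json['response']['groups'][0]['items']
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        rows.append({
            'name': venue['name'],
            # Use the first category name, or None when the venue has no category
            'categories': categories[0]['name'] if categories else None,
            'lat': venue['location']['lat'],
            'lng': venue['location']['lng'],
        })
    return rows

# Tiny sample mirroring the structure of the response printed above
sample = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Catedral de Segovia',
                   'location': {'lat': 40.95012928546704, 'lng': -4.125484228134155},
                   'categories': [{'name': 'Church'}]}},
    ]}]}
}

rows = venues_to_rows(sample)
print(rows)
```

In the notebook, `pd.DataFrame(venues_to_rows(results))` would then give a dataframe with the `name`, `categories`, `lat` and `lng` columns used in the following cells.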


In [144]:
nearby_venues.head(5)


Unnamed: 0,name,categories,lat,lng
0,Catedral de Segovia,Church,40.950129,-4.125484
1,Restaurante José María,Spanish Restaurant,40.950509,-4.122548
2,Plaza Mayor,Plaza,40.950176,-4.124078
5,Hotel Eurostars Convento Capuchinos 5*,Hotel,40.952068,-4.123042
6,El Redebal,Spanish Restaurant,40.948256,-4.121843


In [145]:
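# We are going to drop every "shop" category venue, as we do not consider shops trigger venues.
# First, we replace any category name ending in "Shop" with the single label "Shop".
# The result is assigned back to the dataframe instead of relying on an in-place edit of a column view.
nearby_venues['categories'] = nearby_venues['categories'].replace('.*Shop', 'Shop', regex=True)
categories = nearby_venues['categories']
categories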



0                            Church
1                Spanish Restaurant
2                             Plaza
5                             Hotel
6                Spanish Restaurant
7                               Bar
8                              Park
9                Spanish Restaurant
11               Spanish Restaurant
12                 Tapas Restaurant
13                              Bar
14                            Plaza
15                              Bar
16                              Bar
18                    Historic Site
19               Spanish Restaurant
20                       Restaurant
21                Indian Restaurant
22    Vegetarian / Vegan Restaurant
23                         Wine Bar
24                        Rock Club
25                          Theater
26               Spanish Restaurant
27                       Restaurant
28               Spanish Restaurant
29                   Scenic Lookout
30                       Restaurant
31               Spanish Res

In [146]:
#Exclude the "Shop" category venues.
nearby_venues = nearby_venues[nearby_venues.categories != "Shop"]
#Exclude the single "Grocery Store" venue:
nearby_venues = nearby_venues[nearby_venues.categories != "Grocery Store"]
nearby_venues

Unnamed: 0,name,categories,lat,lng
0,Catedral de Segovia,Church,40.950129,-4.125484
1,Restaurante José María,Spanish Restaurant,40.950509,-4.122548
2,Plaza Mayor,Plaza,40.950176,-4.124078
5,Hotel Eurostars Convento Capuchinos 5*,Hotel,40.952068,-4.123042
6,El Redebal,Spanish Restaurant,40.948256,-4.121843
7,El Sitio,Bar,40.949792,-4.123003
8,Paseo Del Salón De Isabel II,Park,40.948641,-4.123335
9,Cueva de San Esteban,Spanish Restaurant,40.951251,-4.124472
11,Restaurante Villena,Spanish Restaurant,40.95217,-4.122966
12,El Fogón Sefardí,Tapas Restaurant,40.949629,-4.12428


#### LA GRANJA NEIGHBOURHOOD

#### Now we are going to obtain the same information about the venues through Foursquare for the La Granja neighbourhood. Our main focus is on bars, restaurants and trigger venues.

In [147]:
#Obtain the geographical coordinates of La Granja (El Real Sitio de San Ildefonso)
address = 'El Real Sitio de San Ildefonso, Spain'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude_nueva = location.latitude
longitude_nueva = location.longitude
print('The geographical coordinates of La Granja are {}, {}.'.format(latitude_nueva, longitude_nueva))

The geographical coordinates of La Granja are 40.9006043, -4.0054657.


In [148]:
#GET request URL
LIMIT_nueva = 100
radius_nueva = 1000
 
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID,CLIENT_SECRET, VERSION, latitude_nueva, longitude_nueva, radius_nueva, LIMIT_nueva)
results2 = requests.get(url).json()

In [149]:
results2

{'meta': {'code': 200, 'requestId': '5f36b3f82bbd33138045604c'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'San Ildefonso',
  'headerFullLocation': 'San Ildefonso',
  'headerLocationGranularity': 'city',
  'totalResults': 21,
  'suggestedBounds': {'ne': {'lat': 40.90960430900001,
    'lng': -3.9935807416149527},
   'sw': {'lat': 40.891604290999986, 'lng': -4.017350658385047}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b5b21bbf964a5205be528e3',
       'name': 'Casa Zaca',
       'location': {'address': 'Calle Embajadores, 6, BAJO',
        'lat': 40.90016528,
        'lng': -4.00724579,
        'labeledLatLngs': [{'label': 'display',
          'lat': 40.90016528,
       

##### We found 24 venues in La Granja 

In [150]:
nearby_venues2.categories


0     Spanish Restaurant
1                  Hotel
2                 Palace
3                 Garden
4             Restaurant
5             Restaurant
6                  Hotel
7     Italian Restaurant
8                   Café
9                    Bar
10                  Café
11            Restaurant
12    Spanish Restaurant
13            Restaurant
14    Spanish Restaurant
15            Restaurant
16    Spanish Restaurant
17      Sculpture Garden
18    Spanish Restaurant
19    Spanish Restaurant
20              Fountain
21              Fountain
22     Paella Restaurant
23              Fountain
Name: categories, dtype: object

###  DATA ANALYSIS

#### In this step we are going to perform an Exploratory Data Analysis

In [151]:
categories_count_LG= nearby_venues2['categories'].value_counts()
categories_count_LG

Spanish Restaurant    6
Restaurant            5
Fountain              3
Hotel                 2
Café                  2
Paella Restaurant     1
Palace                1
Italian Restaurant    1
Bar                   1
Garden                1
Sculpture Garden      1
Name: categories, dtype: int64

In [152]:
categories_count_C = nearby_venues['categories'].value_counts()
categories_count_C

Spanish Restaurant               14
Bar                               7
Restaurant                        4
Hotel                             4
Tapas Restaurant                  3
Plaza                             2
Historic Site                     2
Indian Restaurant                 1
Beer Garden                       1
Sculpture Garden                  1
Vegetarian / Vegan Restaurant     1
American Restaurant               1
Park                              1
Museum                            1
Rock Club                         1
Wine Bar                          1
Theater                           1
Scenic Lookout                    1
Paella Restaurant                 1
Church                            1
Name: categories, dtype: int64
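
To compare the two neighbourhoods on the same scale, the counts above can be turned into shares. The sketch below copies the counts from the two outputs into plain dictionaries and treats every category whose name contains "Restaurant" as a restaurant. That rule and the helper name `restaurant_share` are assumptions made for illustration, not part of the original analysis.

```python
# Category counts copied from the two value_counts() outputs above
counts_la_granja = {
    'Spanish Restaurant': 6, 'Restaurant': 5, 'Fountain': 3, 'Hotel': 2, 'Café': 2,
    'Paella Restaurant': 1, 'Palace': 1, 'Italian Restaurant': 1, 'Bar': 1,
    'Garden': 1, 'Sculpture Garden': 1,
}
counts_centro = {
    'Spanish Restaurant': 14, 'Bar': 7, 'Restaurant': 4, 'Hotel': 4,
    'Tapas Restaurant': 3, 'Plaza': 2, 'Historic Site': 2, 'Indian Restaurant': 1,
    'Beer Garden': 1, 'Sculpture Garden': 1, 'Vegetarian / Vegan Restaurant': 1,
    'American Restaurant': 1, 'Park': 1, 'Museum': 1, 'Rock Club': 1, 'Wine Bar': 1,
    'Theater': 1, 'Scenic Lookout': 1, 'Paella Restaurant': 1, 'Church': 1,
}

def restaurant_share(counts):
    """Return (restaurants, total venues, share of restaurants) for one neighbourhood."""
    total = sum(counts.values())
    # Assumption: a category is a restaurant if its name contains "Restaurant"
    restaurants = sum(n for category, n in counts.items() if 'Restaurant' in category)
    return restaurants, total, restaurants / total

for label, counts in [('La Granja', counts_la_granja), ('Centro', counts_centro)]:
    restaurants, total, share = restaurant_share(counts)
    print('{}: {} of {} venues are restaurants ({:.0%})'.format(label, restaurants, total, share))
```

With the dataframes themselves, `nearby_venues['categories'].str.contains('Restaurant').mean()` should give the same share without retyping the counts.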

#### Additionally, we are going to produce a map to get a visual impression of each neighbourhood and its venues

In [153]:
map_centro = folium.Map(location=[latitude_centro, longitude_centro], zoom_start=27)
map_centro

In [154]:
map_lagranja = folium.Map(location=[latitude_nueva, longitude_nueva], zoom_start=18)
map_lagranja
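
The two cells above create the base maps only. To see the venues themselves, one marker per venue has to be added. The following is a sketch of one way to do that, assuming rows with the `name`, `categories`, `lat` and `lng` fields shown earlier. The helper names `marker_specs` and `build_venue_map`, the zoom level and the marker styling are illustrative choices, not the original code.

```python
def marker_specs(rows):
    """Return one (location, label) pair per venue."""
    return [([row['lat'], row['lng']], '{} ({})'.format(row['name'], row['categories']))
            for row in rows]

def build_venue_map(center, rows, zoom_start=16):
    """Create a folium map centred on `center` with one circle marker per venue."""
    import folium  # imported here so that marker_specs can be used without folium installed
    venue_map = folium.Map(location=center, zoom_start=zoom_start)
    for location, label in marker_specs(rows):
        folium.CircleMarker(
            location,
            radius=5,
            popup=label,
            color='blue',
            fill=True,
            fill_opacity=0.7,
        ).add_to(venue_map)
    return venue_map

# Two venues taken from the Centro table above
sample_rows = [
    {'name': 'Catedral de Segovia', 'categories': 'Church', 'lat': 40.950129, 'lng': -4.125484},
    {'name': 'Restaurante José María', 'categories': 'Spanish Restaurant', 'lat': 40.950509, 'lng': -4.122548},
]
specs = marker_specs(sample_rows)
print(specs)
```

In the notebook, `build_venue_map([latitude_centro, longitude_centro], nearby_venues.to_dict('records'))` would display the Centro venues, and the same call with `latitude_nueva`, `longitude_nueva` and `nearby_venues2` would display those of La Granja.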

## 4. Results and Discussion


As we can see, in the **LA GRANJA** neighbourhood there are fewer venues overall. Among these venues: 
1) There is only one restaurant serving foreign cuisine: an Italian restaurant. There are no Chinese restaurants. However, from the qualitative analysis conducted previously we know there are a few Chinese restaurants. We therefore infer that these restaurants are not relevant enough for Foursquare to include them. 

We do not focus on the relevance of Chinese restaurants, as we have qualitative data showing that Segovians like Chinese food. Instead, we focus on the competitors: 
2) The main competitors are Spanish restaurants, which is in line with the kind of restaurants Segovians often visit. However, the fact that there are few trigger venues (palace, garden, sculpture garden, fountain) suggests that this neighbourhood is not aligned with the essence of Segovians' behaviour. 

3) On the map we can observe that the venues are spread fairly evenly around the neighbourhood. This suggests it is an area in daily use with no specific agglomerations.

On the other hand, exploring the **CENTRO** neighbourhood, we can observe the following:
1) A greater variety of foreign food restaurants (American, Indian, Vegetarian), which suggests that customers eating in this neighbourhood are inclined to eat Spanish food but also non-Spanish food.

2) Competitors are mainly Spanish restaurants, which matches the essence of Segovians' behaviour. 

3) More trigger venues. 

4) On the map we can observe that the venues are concentrated, with many of them very close to each other. This shows it is an area with specific agglomerations, which matches Segovians' behaviour.
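
The visual impression of concentration can be backed by a number, for instance the average distance from each venue to the neighbourhood's reference point. The sketch below is one possible check rather than part of the original analysis: it uses the haversine formula with a mean Earth radius of 6,371 km and, as an example, the Centro reference coordinates and the first three venues listed earlier. The function names are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def mean_distance_m(center, points):
    """Average distance in metres from `center` to each point in `points`."""
    return sum(haversine_m(center[0], center[1], lat, lng) for lat, lng in points) / len(points)

# Centro reference point and the first three venues from the table above
centro_center = (40.9502159, -4.1241494)
centro_sample = [(40.950129, -4.125484), (40.950509, -4.122548), (40.950176, -4.124078)]

print(round(mean_distance_m(centro_center, centro_sample)), 'm')
```

Applying the same function to the full venue list of each neighbourhood gives two numbers to compare. Note that the two queries used different radii (500 m for Centro and 1000 m for La Granja), which by itself affects such a comparison.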

**CONTRADICTION:**

However, based on qualitative data we know there are no Chinese restaurants in CENTRO but there are some in LA GRANJA. Then, what is the answer to our initial question?

The main characteristic of this new Chinese restaurant is not its food (which is implicit in the business) but its objective of fusing the essence of Segovians' behaviour with the restaurant. Although the Chinese restaurants in LA GRANJA (qualitative data) are succeeding, they are not succeeding in capturing that essence. For example, Segovians go to LA GRANJA Chinese restaurants for informal occasions and at specific times, and they do not consider them among their usual restaurants although they love the food. This new restaurant wants to attract customers during their frequent social meetings and make them feel that it is one of their usual restaurants.

## 5. Conclusion

This study allows us to analyse with more precision the potential location of the new Chinese restaurant in Segovia. 
A previous qualitative study was conducted by interviewing Segovians to learn about their habits within the city, their social life and their predisposition to go to a Chinese restaurant. The results showed that they were very willing to go to a Chinese restaurant, but that doing so was not part of their usual habit of socializing in the Segovia Centro neighbourhood. 

Therefore, we conducted this quantitative data analysis to get a clear view of the market in Segovia, considering the urban area (La Granja) and the medieval city center (Centro). We obtained the data by calling the Foursquare API with the latitude and longitude of each area to get the most relevant venues within it. Next, we prepared the data, removing venues that were not of interest and could distort the results. Finally, we analysed the frequency of each venue category in each neighbourhood so that we could draw conclusions about competitors, top venues and tendencies, all in the context of the qualitative results. Additionally, we produced a map of each neighbourhood to see the distribution of the venues and conclude which characteristics match the essence of Segovians' behaviour.  

Finally, we conclude that the most suitable location is within the Centro neighbourhood: although it will be a challenge because people are not used to Chinese restaurants in that area, this location has the advantage of incorporating the key aspect of the essence of Segovians' behaviour. 