# Capstone Project - The Battle of Neighborhoods Week 2

## Requirements

A full report consisting of all of the following components:
1. Introduction where you discuss the business problem and who would be interested in this project.
2. Data where you describe the data that will be used to solve the problem and the source of the data.
3. Methodology section, which represents the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, if any, and which machine learning techniques were used and why.
4. Results section where you discuss the results.
5. Discussion section where you discuss any observations you noted and any recommendations you can make based on the results.
6. Conclusion section where you conclude the report.

## Problem definition & Data

The problem to be solved is where to open a gym in the city of Valencia, Spain. More precisely, I am looking for a good location near the city center.

People usually go to the gym closest to their neighborhood, so I will try to find a place with a low density of gyms.

For the data, I am going to use Foursquare location data to find out:

1. Where the gyms are located
2. How clients rate them

This will allow me to find a suitable place for my gym.

### Problem Resolution

### Importing the required libraries to solve the problem

In [4]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transforming JSON data into a pandas dataframe
from pandas import json_normalize # pandas.io.json.json_normalize is deprecated since pandas 1.0

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')


### Foursquare Credentials

In [6]:
CLIENT_ID = 'QVN1NW1F0IXCPMH3GUHNAHLKLUG1VYSJ3GOWQ024H3SZKP3W' # your Foursquare ID
CLIENT_SECRET = '2KQ1TITO5KU4UOXE3N2MDLLRQ42YQ5FVCEZ0CV23ZUOX4NZR' # your Foursquare Secret
ACCESS_TOKEN = 'F1QZJ5NS1TFY3YI0WQSWTICPGKYKU4UTIJYMGFU20GR15YIP' # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: QVN1NW1F0IXCPMH3GUHNAHLKLUG1VYSJ3GOWQ024H3SZKP3W
CLIENT_SECRET: 2KQ1TITO5KU4UOXE3N2MDLLRQ42YQ5FVCEZ0CV23ZUOX4NZR
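
Hard-coding credentials in a notebook makes them easy to leak when the notebook is shared. As a minimal sketch of an alternative, the credentials could be read from environment variables and masked before printing. The variable names (`FOURSQUARE_CLIENT_ID` and so on) and the helper functions are hypothetical and not part of the original notebook.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read Foursquare credentials from environment variables.

    The variable names below are hypothetical; use whatever names you export.
    """
    names = (
        "FOURSQUARE_CLIENT_ID",
        "FOURSQUARE_CLIENT_SECRET",
        "FOURSQUARE_ACCESS_TOKEN",
    )
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


def mask(secret, visible=4):
    """Show only the first few characters of a secret when printing it."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

With this in place, `print(mask(creds["FOURSQUARE_CLIENT_SECRET"]))` would confirm that a value was loaded without revealing it in the cell output.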


### We define the initial location of the neighborhood to start the gym business (center and radius)

For our gym business, we center the search on the "Mercado Central de Valencia", the main market in the city center:

Coordinates of "Mercado Central de Valencia": 39°28'25.5"N 0°22'46.9"W
* Latitude: 39.473757
* Longitude: -0.379694

We also set a radius of 2 kilometers for the analysis:

* Radius: 2 km

In [16]:
## Variable definition:

latitude = 39.473757
longitude = -0.379694
radius = 2000 # search radius in meters
search_query = 'Gym'

In [20]:
## Url definition:

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude,ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=QVN1NW1F0IXCPMH3GUHNAHLKLUG1VYSJ3GOWQ024H3SZKP3W&client_secret=2KQ1TITO5KU4UOXE3N2MDLLRQ42YQ5FVCEZ0CV23ZUOX4NZR&ll=39.473757,-0.379694&oauth_token=F1QZJ5NS1TFY3YI0WQSWTICPGKYKU4UTIJYMGFU20GR15YIP&v=20180604&query=Gym&radius=2000&limit=30'
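
Building the URL with `str.format` works for a simple query such as `Gym`, but a query containing spaces or `&` would need escaping. A minimal sketch, using only the standard library, that assembles the same query string with `urllib.parse.urlencode`; the function name `build_search_url` and the placeholder credentials are assumptions for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, oauth_token, lat, lng,
                     query, radius, limit, version="20180604"):
    """Build the venue search URL with properly encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": "{},{}".format(lat, lng),
        "oauth_token": oauth_token,
        "v": version,
        "query": query,
        "radius": radius,  # meters
        "limit": limit,
    }
    # safe="," keeps the comma in the "ll" value readable
    return BASE_URL + "?" + urlencode(params, safe=",")


# Placeholder credentials; substitute your own values
url = build_search_url("YOUR_ID", "YOUR_SECRET", "YOUR_TOKEN",
                       39.473757, -0.379694, "Gym", 2000, 30)
print(url)
```

Alternatively, `requests.get` accepts a `params` dictionary and performs the encoding itself.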

### We send the GET Request and examine the results

In [21]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '6007efa60d7dc976f73d9770'},
 'notifications': [{'type': 'notificationTray', 'item': {'unreadCount': 0}}],
 'response': {'venues': [{'id': '4fbbfd7ce4b0d314f1767484',
    'name': 'gym24',
    'location': {'lat': 39.48004615438323,
     'lng': -0.3928476145314537,
     'labeledLatLngs': [{'label': 'display',
       'lat': 39.48004615438323,
       'lng': -0.3928476145314537}],
     'distance': 1329,
     'cc': 'ES',
     'city': 'Valencia',
     'state': 'Comunidad Valenciana',
     'country': 'España',
     'formattedAddress': ['Valencia Comunidad Valenciana']},
    'categories': [{'id': '4bf58dd8d48988d176941735',
      'name': 'Gym',
      'pluralName': 'Gyms',
      'shortName': 'Gym',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/gym_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1611132839',
    'hasPerk': False},
   {'id': '5de8ca521016f8000844d266',
    'name': 'Gym Boutique Alameda',
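
The response above carries its own status in `meta.code`. As a hedged sketch, a small helper can check that code before the venues are used, so that a failed request does not surface later as a confusing `KeyError`. The helper name `extract_venues` is hypothetical, and the sample dictionary only mirrors the fields visible in the output above.

```python
def extract_venues(results):
    """Return the list of venues, or raise if the API reported an error."""
    code = results.get("meta", {}).get("code")
    if code != 200:
        raise ValueError("Foursquare request failed with meta code {}".format(code))
    return results.get("response", {}).get("venues", [])


# Sample shaped like the response shown above (most fields omitted)
sample = {
    "meta": {"code": 200},
    "response": {"venues": [{"id": "4fbbfd7ce4b0d314f1767484", "name": "gym24"}]},
}
venues = extract_venues(sample)
print(len(venues), venues[0]["name"])
```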
   

#### Get relevant part of JSON and transform it into a _pandas_ dataframe

In [25]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()


|  | id | name | categories | referralId | hasPerk | location.lat | location.lng | location.labeledLatLngs | location.distance | location.cc | location.city | location.state | location.country | location.formattedAddress | location.address | location.postalCode | location.crossStreet |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4fbbfd7ce4b0d314f1767484 | gym24 | [{'id': '4bf58dd8d48988d176941735', 'name': 'G... | v-1611132839 | False | 39.480046 | -0.392848 | [{'label': 'display', 'lat': 39.48004615438323... | 1329 | ES | Valencia | Comunidad Valenciana | España | [Valencia Comunidad Valenciana] |  |  |  |
| 1 | 5de8ca521016f8000844d266 | Gym Boutique Alameda | [{'id': '4bf58dd8d48988d175941735', 'name': 'G... | v-1611132839 | False | 39.475635 | -0.365503 | [{'label': 'display', 'lat': 39.475635, 'lng':... | 1237 | ES | Valencia | Comunidad Valenciana | España | [Paseo de La Alameda, 4, 46010 Valencia Comuni... | Paseo de La Alameda, 4 | 46010.0 |  |
| 2 | 51379782e4b0536567a71523 | Gym & Tonic | [{'id': '4bf58dd8d48988d176941735', 'name': 'G... | v-1611132839 | False | 39.46257 | -0.37074 | [{'label': 'display', 'lat': 39.46257, 'lng': ... | 1463 | ES | Valencia | Comunidad Valenciana | España | [Doctor Sumsi 13, Valencia Comunidad Valenciana] | Doctor Sumsi 13 |  |  |
| 3 | 5182b1b4e4b0ea1baed06438 | Master Gym | [{'id': '4bf58dd8d48988d176941735', 'name': 'G... | v-1611132839 | False | 39.470801 | -0.391775 | [{'label': 'display', 'lat': 39.4708005498919,... | 1089 | ES | Valencia | Comunidad Valenciana | España | [Martin El Humano 11, 46008 Valencia Comunidad... | Martin El Humano 11 | 46008.0 |  |
| 4 | 4eb938d4f5b94bd85d61fdbb | Venice Gym | [{'id': '4bf58dd8d48988d176941735', 'name': 'G... | v-1611132839 | False | 39.48144 | -0.372702 | [{'label': 'display', 'lat': 39.48143993086462... | 1045 | ES | Valencia | Comunidad Valenciana | España | [Calle del Poeta Bodria, 4, 46010 Valencia, 46... | Calle del Poeta Bodria, 4, 46010 Valencia | 46010.0 |  |
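
The dotted column names such as `location.lat` come from flattening the nested `location` dictionary of each venue, while list values such as `categories` stay in a single column. The following plain-Python sketch illustrates that idea; it is not pandas' actual implementation, and the `flatten` helper is hypothetical.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one dict with dotted keys; lists are kept as-is."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# A trimmed copy of the first venue in the response
venue = {
    "id": "4fbbfd7ce4b0d314f1767484",
    "name": "gym24",
    "location": {"lat": 39.48004615438323, "lng": -0.3928476145314537, "distance": 1329},
    "categories": [{"name": "Gym"}],
}
row = flatten(venue)
print(sorted(row))
```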


#### Define information of interest and filter dataframe

Process to be followed:

1. We keep only columns that include venue name, and anything that is associated with location
2. We make a function that extracts the category of the venue
3. We filter the category for each row
4. We clean column names
5. We show the filtered dataframe

In [29]:
# keep only columns that include venue name, and anything that is associated with location

filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# explicit copy, so the assignments below do not touch the original dataframe
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row

dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term

dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

|  | name | categories | lat | lng | labeledLatLngs | distance | cc | city | state | country | formattedAddress | address | postalCode | crossStreet | id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | gym24 | Gym | 39.480046 | -0.392848 | [{'label': 'display', 'lat': 39.48004615438323... | 1329 | ES | Valencia | Comunidad Valenciana | España | [Valencia Comunidad Valenciana] |  |  |  | 4fbbfd7ce4b0d314f1767484 |
| 1 | Gym Boutique Alameda | Gym / Fitness Center | 39.475635 | -0.365503 | [{'label': 'display', 'lat': 39.475635, 'lng':... | 1237 | ES | Valencia | Comunidad Valenciana | España | [Paseo de La Alameda, 4, 46010 Valencia Comuni... | Paseo de La Alameda, 4 | 46010.0 |  | 5de8ca521016f8000844d266 |
| 2 | Gym & Tonic | Gym | 39.46257 | -0.37074 | [{'label': 'display', 'lat': 39.46257, 'lng': ... | 1463 | ES | Valencia | Comunidad Valenciana | España | [Doctor Sumsi 13, Valencia Comunidad Valenciana] | Doctor Sumsi 13 |  |  | 51379782e4b0536567a71523 |
| 3 | Master Gym | Gym | 39.470801 | -0.391775 | [{'label': 'display', 'lat': 39.4708005498919,... | 1089 | ES | Valencia | Comunidad Valenciana | España | [Martin El Humano 11, 46008 Valencia Comunidad... | Martin El Humano 11 | 46008.0 |  | 5182b1b4e4b0ea1baed06438 |
| 4 | Venice Gym | Gym | 39.48144 | -0.372702 | [{'label': 'display', 'lat': 39.48143993086462... | 1045 | ES | Valencia | Comunidad Valenciana | España | [Calle del Poeta Bodria, 4, 46010 Valencia, 46... | Calle del Poeta Bodria, 4, 46010 Valencia | 46010.0 |  | 4eb938d4f5b94bd85d61fdbb |
| 5 | Westin Gym | Gym / Fitness Center | 39.473121 | -0.361336 | [{'label': 'display', 'lat': 39.473121, 'lng':... | 1579 | ES | Valencia | Comunidad Valenciana | España | [46010 Valencia Comunidad Valenciana] |  | 46010.0 |  | 5aa40d2c2b98442cbed460a6 |
| 6 | campus gym | Gym | 39.481848 | -0.364016 | [{'label': 'display', 'lat': 39.48184824501116... | 1620 | ES |  |  | España |  |  |  |  | 4f9ed278e4b09fef554c8167 |
| 7 | Mö Gym Studio | Gym | 39.482729 | -0.363171 | [{'label': 'display', 'lat': 39.482729, 'lng':... | 1735 | ES | Valencia | Comunidad Valenciana | España | [Calle Bachiller, 7, 46010 Valencia Comunidad ... | Calle Bachiller, 7 | 46010.0 |  | 4ea9ca06d3e3846cbc62a969 |
| 8 | Sala de Abdominales Club Metropolitan Gym | Gym | 39.456852 | -0.375063 | [{'label': 'display', 'lat': 39.45685204525932... | 1923 | ES | Valencia | Comunidad Valenciana | España | [Calle Filipinas (Peris Y Valero), 46006 Valen... | Calle Filipinas | 46006.0 | Peris Y Valero | 4ff5e7efe4b033b23af2b53d |
| 9 | Sala de Maquinas Club Metropolitan Gym | Gym | 39.457224 | -0.375661 | [{'label': 'display', 'lat': 39.45722426248265... | 1872 | ES | Valencia | Comunidad Valenciana | España | [Calle Filipinas (Peris Y Valero), 46006 Valen... | Calle Filipinas | 46006.0 | Peris Y Valero | 4ff5de46e4b03705cdae3d49 |
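
Before mapping, a quick summary of the filtered results helps quantify how far the gyms are from the market. The sketch below copies the `name`, `categories` and `distance` values from the filtered dataframe above into a plain list (an assumption made so the example runs without the API) and counts categories and distances. The 1,500 m threshold is an arbitrary illustration.

```python
from collections import Counter

# (name, category, distance in meters) copied from the filtered dataframe above
gyms = [
    ("gym24", "Gym", 1329),
    ("Gym Boutique Alameda", "Gym / Fitness Center", 1237),
    ("Gym & Tonic", "Gym", 1463),
    ("Master Gym", "Gym", 1089),
    ("Venice Gym", "Gym", 1045),
    ("Westin Gym", "Gym / Fitness Center", 1579),
    ("campus gym", "Gym", 1620),
    ("Mö Gym Studio", "Gym", 1735),
    ("Sala de Abdominales Club Metropolitan Gym", "Gym", 1923),
    ("Sala de Maquinas Club Metropolitan Gym", "Gym", 1872),
]

# How many venues fall in each Foursquare category
category_counts = Counter(category for _, category, _ in gyms)

# Closest gym to the search center, and gyms within an illustrative 1,500 m
nearest = min(gyms, key=lambda gym: gym[2])
within_1500 = [name for name, _, distance in gyms if distance <= 1500]

print(category_counts)
print("Nearest gym:", nearest[0], "at", nearest[2], "m")
print("Gyms within 1500 m:", len(within_1500))
```

On these values, the nearest gym is Venice Gym at 1,045 m, and five of the ten gyms lie within 1,500 m of the market.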


#### Now we can visualize the gyms near the "Mercado Central de Valencia"

Process to be followed:

1. We generate the map centered on our search location
2. We add a marker at the "Mercado Central de Valencia"
3. We add blue markers for the gyms that are inside the radius
4. We display the map

In [41]:
# We generate the map centered on the "Mercado Central de Valencia"

venues_map = folium.Map(location=[latitude, longitude], zoom_start=15) 

# We add the "Mercado Central de Valencia" to the map as a red circle marker

folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    popup='Mercado_Central',
    fill=True,
    color='red',
    fill_color='red',
    fill_opacity=0.6
    ).add_to(venues_map)

# add the gyms to the map as blue circle markers (the popup shows the category)
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.6
        ).add_to(venues_map)

# display map
venues_map

#### Area to establish a gym near the "Mercado Central de Valencia"

As the map shows, there are no gyms in the immediate neighborhood of the market: the closest one in the search results is about 1 km away (see the `distance` column). We redraw the map, adding a green circular area around the market that contains no gyms.



In [42]:
# The red market marker and the blue gym markers were already added to
# venues_map in the previous cell, so here we only add the green area.
# Note: CircleMarker radius is measured in pixels, so the ground size of
# the green area depends on the zoom level.
folium.features.CircleMarker(
    [latitude, longitude],
    radius=200,
    popup='Mercado_Central',
    fill=True,
    color='green',
    fill_color='green',
    fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map
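
To check that the green area really contains no gyms, the pixel radius can be converted to meters and compared with the distance from the market to each gym. The sketch below is an approximation under stated assumptions: it uses the haversine formula with a mean Earth radius, the standard Web Mercator resolution for 256 px tiles, the initial zoom level of 15, and coordinates copied from the filtered dataframe. The helper names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pixels_to_meters(pixels, lat, zoom):
    """Approximate ground length of a pixel distance on a Web Mercator map (256 px tiles)."""
    meters_per_pixel = 156543.03392 * math.cos(math.radians(lat)) / (2 ** zoom)
    return pixels * meters_per_pixel


center = (39.473757, -0.379694)  # Mercado Central de Valencia

# (lat, lng) copied from the filtered dataframe
gyms = {
    "gym24": (39.480046, -0.392848),
    "Gym Boutique Alameda": (39.475635, -0.365503),
    "Gym & Tonic": (39.46257, -0.37074),
    "Master Gym": (39.470801, -0.391775),
    "Venice Gym": (39.48144, -0.372702),
    "Westin Gym": (39.473121, -0.361336),
    "campus gym": (39.481848, -0.364016),
    "Mö Gym Studio": (39.482729, -0.363171),
    "Sala de Abdominales Club Metropolitan Gym": (39.456852, -0.375063),
    "Sala de Maquinas Club Metropolitan Gym": (39.457224, -0.375661),
}

distances = {name: haversine_m(center[0], center[1], lat, lng)
             for name, (lat, lng) in gyms.items()}
nearest_name = min(distances, key=distances.get)

# Ground radius of the 200 px green marker at the initial zoom level
green_radius_m = pixels_to_meters(200, center[0], zoom=15)
gym_free = all(d > green_radius_m for d in distances.values())

print("Nearest gym:", nearest_name, round(distances[nearest_name]), "m")
print("Green area radius at zoom 15:", round(green_radius_m), "m")
print("Green area contains no gyms:", gym_free)
```

If the area should keep a fixed ground size at every zoom level, `folium.Circle`, whose radius is given in meters, may be a better fit than `CircleMarker`.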

## Conclusion section

The city center of Valencia is a good area to open a gym. The Foursquare search returned no gyms close to the "Mercado Central" (the nearest one is about 1 km away), so there is a clear business opportunity. The next step is to explore venues with the following characteristics:

* Sufficient venue size (square meters)
* Good price for the venue in euros per square meter
* Good accessibility for potential clients
* Enough parking places in the surroundings
* Etc.