## Table of Contents

<div>
        <a href="#item1">1. Business Statement</a>
        <br>
        <a href="#item2">2. Data Source</a>
        <br>
        <a href="#item3">3. Analyze Data</a>
        <br>
        <a href="#item4">4. Visualize Data</a>
        <br>
        <a href="#item5">5. Examine Clusters</a>
        <br>
        <a href="#item6">6. Conclusions</a>
        <br>
</div>

## 1. Business Statement

A new fitness chain would like to expand into Denmark. Opening the first gym will give the stakeholders an idea of how profitable their business model is, and how it is received may dictate future plans for expansion in the Danish market.

Data analysis should be used to find the best location, based on a few criteria:

- Proximity to high-traffic areas.
- A certain distance from existing fitness centers.
- A location with a lower density of gym facilities.
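
The distance criterion above can be checked with a great-circle distance. The following is a minimal sketch, assuming the criterion means a minimum distance in kilometres from every existing gym; the function names `haversine_km` and `is_far_enough` and the threshold are hypothetical and not part of the original analysis.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def is_far_enough(candidate, gyms, min_km):
    """True if the candidate (lat, lon) is at least min_km away from every gym."""
    return all(haversine_km(*candidate, *gym) >= min_km for gym in gyms)
```

A candidate site would then be kept only if `is_far_enough(site, gym_coordinates, min_km)` holds for a threshold agreed with the stakeholders.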




## 2. Data Source

The main data source will be the Foursquare API, used to identify existing gyms and map them over the central Copenhagen zip codes.
Data from Statistics Denmark will be used to identify higher-density areas and popular venues.
People are more likely to choose a gym close to a shopping center than one in a remote area, so this should be taken into account.

Knowing the existing gyms and their areas of coverage, unsupervised learning techniques will be applied to determine the optimal location for opening a new fitness center.

Using Foursquare, let's fetch the fitness venues registered for Copenhagen.

In [1]:
### A. Let's import the fitness gyms that are registered for Copenhagen from Foursquare.


import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

# libraries for displaying images
from IPython.display import Image
from IPython.core.display import HTML

# transforming a JSON file into a pandas dataframe
try:
    from pandas import json_normalize # pandas 1.0 and later
except ImportError:
    from pandas.io.json import json_normalize # older pandas versions

print('Libraries imported.')


Libraries imported.


In [2]:
## Adding Foursquare credentials:

CLIENT_ID = 'FZF2SWDMAVWXMFYAXUYYDKCLZYHQX0WESCXZFEU5ZNQ2DLLL' # your Foursquare ID
CLIENT_SECRET = 'WSRM2GTQKNXLREEGDXDR4ZWE4LNVILAKOPPVRPVY5H2OL3AO' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)


# Let's geocode the address to get the centre of the search (requires the geopy package):
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

address = 'Copenhagen, Denmark'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)


# Let's define the search based on location

search_query = 'Fitness'

radius = 70000 # search radius in metres
print(search_query + ' .... OK!')


# Let's build the url to fetch the venues
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

# Let's make a request and store the results:

results = requests.get(url).json()


Your credentials:
CLIENT_ID: FZF2SWDMAVWXMFYAXUYYDKCLZYHQX0WESCXZFEU5ZNQ2DLLL
CLIENT_SECRET:WSRM2GTQKNXLREEGDXDR4ZWE4LNVILAKOPPVRPVY5H2OL3AO

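
The search request in this cell can also be assembled without hard-coding the credentials in the notebook. The sketch below is one possible variant, assuming the credentials are stored in the environment variables `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (hypothetical names); `build_search_url` is likewise a hypothetical helper, and it uses only the query parameters that already appear in the cell above.

```python
import os
from urllib.parse import urlencode

def build_search_url(latitude, longitude, query, radius_m, limit, version="20180604"):
    """Build a Foursquare v2 venue search URL with properly encoded parameters."""
    params = {
        # Assumption: credentials are read from environment variables, not the notebook.
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", ""),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", ""),
        "ll": f"{latitude},{longitude}",
        "v": version,
        "query": query,
        "radius": radius_m,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)

# Illustrative coordinates near central Copenhagen, not the geocoder output.
example_url = build_search_url(55.68, 12.57, "Fitness", 70000, 30)
```

The resulting string could be passed to `requests.get(...)` in the same way as `url` above; `urlencode` takes care of escaping characters such as the comma in `ll`.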

## 3. Analyze Data


#### Clean the Data and Transform It into a DataFrame:

In [None]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.shape



# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy() # copy so later assignments do not touch the original dataframe

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]



In [None]:
#Let's see the list of gyms:

dataframe_filtered
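
To make the category extraction step easier to follow, here is a small standalone sketch of the same idea on plain dictionaries. The sample records are invented for illustration and are not Foursquare results; `first_category_name` is a hypothetical helper that mirrors what `get_category_type` does per row.

```python
def first_category_name(venue):
    """Return the name of the first category of a venue, or None if it has none."""
    categories = venue.get("categories") or []
    return categories[0]["name"] if categories else None

# Hypothetical venue records shaped like the 'categories' field used above.
sample_venues = [
    {"name": "Example Gym", "categories": [{"name": "Gym / Fitness Center"}]},
    {"name": "Uncategorised Venue", "categories": []},
]

category_names = [first_category_name(v) for v in sample_venues]
```

Venues without any category end up as `None`, which makes them easy to drop or inspect before plotting.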

## 4. Visualize Data


In [None]:
# Plotting the Gym addresses on the Map

In [None]:
# Let's see the gyms on a map:
!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values



print('Folium installed')


venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around Copenhagen

# add a red circle marker to represent the centre of Copenhagen
folium.CircleMarker(
    [latitude, longitude],
    radius=30,
    color='red',
    popup='Copenhagen',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the gyms as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=50,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.1
    ).add_to(venues_map)

# display map
venues_map

## 5. Examine Clusters

Examine Clusters
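
The Data Source section mentions unsupervised learning without naming a technique. As one possible illustration, the sketch below groups gym coordinates with a basic k-means loop. Choosing k-means is an assumption, `kmeans` is a hypothetical helper, and latitude/longitude are treated as planar coordinates, which is only a rough approximation over a city-sized area.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels
```

With the dataframe from section 3, the input could be built as `list(zip(dataframe_filtered.lat, dataframe_filtered.lng))`. Areas far from every centroid would then be candidates with a lower density of existing gyms.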

## 6. Conclusions

Conclusions