# Capstone Project

## Business Problem

Fitness is a growing trend, especially across the Middle East. The pressure to look good on social media, combined with a broader shift towards healthy living, has led to a significant increase in the number of people joining gyms, many of whom would not traditionally have done so. In addition, as COVID-19 lockdowns begin to ease, people are likely to flock to gyms and fitness centres to shed the pounds gained while stuck at home. Location is central to which gym people choose: nobody wants a half-hour commute for a workout. As such, I will use Foursquare API data to find the areas of Beirut, the capital of Lebanon, that are most underserved by fitness centres, and use that as a proxy for the optimal location of a new gym.

## Data 

Complete data on Lebanon is scarce; few people collect or update it. As such, my ability to work with the Foursquare API is limited, as many locations have no category or are missing some other entry. That said, the most complete list I could find was that of gyms and fitness centres, so that is the data I will use.

## Process

To figure out which areas in Beirut are most underserved, I'll first plot all fitness centres on a map and use it to qualitatively judge which areas would be best served by a new gym.

In [45]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# function for transforming a JSON file into a pandas dataframe
from pandas import json_normalize # top-level import, available in pandas >= 1.0

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


In [46]:
CLIENT_ID = 'ZF1BTD4OS3INYB1FQLKAK53W1S0SFVVCGZIPNKEHGQ1OHVME' # your Foursquare ID
CLIENT_SECRET = '3QVK0RDV12UMU3RX3BOXUYC0OTB2HXOB3OWNETO3ET0OK43N' # your Foursquare Secret
VERSION = '20200000'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ZF1BTD4OS3INYB1FQLKAK53W1S0SFVVCGZIPNKEHGQ1OHVME
CLIENT_SECRET:3QVK0RDV12UMU3RX3BOXUYC0OTB2HXOB3OWNETO3ET0OK43N


In [47]:
address = 'Mar Antonios Street, Beirut'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

33.8942938 35.5129043


In [48]:
search_query = 'Fitness'
radius = 10000000
print(search_query + ' .... OK!')

Fitness .... OK!


In [49]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=ZF1BTD4OS3INYB1FQLKAK53W1S0SFVVCGZIPNKEHGQ1OHVME&client_secret=3QVK0RDV12UMU3RX3BOXUYC0OTB2HXOB3OWNETO3ET0OK43N&ll=33.8942938,35.5129043&v=20200000&query=Fitness&radius=10000000&limit=30'
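
The request URL above embeds the credentials directly in the notebook. A minimal sketch of an alternative, assuming the keys are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (hypothetical names, not part of the original notebook), builds the same query string with the standard library so that special characters in the parameters are escaped:

```python
import os
from urllib.parse import urlencode

# Hypothetical environment variable names; placeholders are used if they are unset.
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID')
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET')


def build_search_url(latitude, longitude, query, radius, limit, version='20200000'):
    """Build a venues/search URL with the same parameters as the cell above."""
    params = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'll': f'{latitude},{longitude}',
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-escapes each value, e.g. the comma in `ll`
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params)


url = build_search_url(33.8942938, 35.5129043, 'Fitness', 10000000, 30)
```

The resulting `url` can be passed to `requests.get` exactly as in the next cell.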

In [50]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5ed4cbc47828ae001b3b7c69'},
 'response': {'venues': [{'id': '5641bb4738fa012315fc9840',
    'name': 'Fitness Zone at The Souks',
    'location': {'address': 'Beirut Souks',
     'lat': 33.89952691454219,
     'lng': 35.50350666585211,
     'labeledLatLngs': [{'label': 'display',
       'lat': 33.89952691454219,
       'lng': 35.50350666585211}],
     'distance': 1045,
     'cc': 'LB',
     'city': 'Beirut Souks - Beirut Central District | Downtown',
     'country': 'لبنان',
     'formattedAddress': ['Beirut Souks',
      'Beirut Souks - Beirut Central District | Downtown',
      'لبنان']},
    'categories': [{'id': '4bf58dd8d48988d175941735',
      'name': 'Gym / Fitness Center',
      'pluralName': 'Gyms or Fitness Centers',
      'shortName': 'Gym / Fitness',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/gym_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1591004188',
    'hasPerk': False},
  

In [51]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues).head(100)

In [52]:
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy() # copy so the column assignment below does not act on a view

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head(100)

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,cc,city,country,formattedAddress,crossStreet,state,id
0,Fitness Zone at The Souks,Gym / Fitness Center,Beirut Souks,33.899527,35.503507,"[{'label': 'display', 'lat': 33.89952691454219...",1045,LB,Beirut Souks - Beirut Central District | Downtown,لبنان,"[Beirut Souks, Beirut Souks - Beirut Central D...",,,5641bb4738fa012315fc9840
1,GT fitness,Gym,,33.89365,35.51223,"[{'label': 'display', 'lat': 33.89365011885825...",94,LB,,لبنان,[لبنان],,,52655d77498e67a4e2929aae
2,cross fitness,Gym,gemayzeh,33.894222,35.511616,"[{'label': 'display', 'lat': 33.89422225952148...",119,LB,,لبنان,"[gemayzeh, لبنان]",,,523da48b11d23025a7aba955
3,Fitness Zone Cofé,Café,Crown Plaza,33.896134,35.479479,"[{'label': 'display', 'lat': 33.89613373627912...",3095,LB,بيروت,لبنان,"[Crown Plaza (Hamra), بيروت, لبنان]",Hamra,محافظة بيروت,4f9a9edce4b0edc560069fb3
4,Fitness Zone At ABC,Gym / Fitness Center,,33.888545,35.519445,"[{'label': 'display', 'lat': 33.888545, 'lng':...",880,LB,بيروت,لبنان,"[بيروت, لبنان]",,محافظة بيروت,58d8e18bd7b47305c43f348d
5,Fitness Zone At ABC,Gym / Fitness Center,,33.888653,35.519371,"[{'label': 'display', 'lat': 33.888653, 'lng':...",866,LB,بيروت,لبنان,"[بيروت, لبنان]",,محافظة بيروت,58d8e0d7ca10707e5751e05c
6,Tactical Fitness,Gym / Fitness Center,Rafic Salloum Street,33.884712,35.544267,"[{'label': 'display', 'lat': 33.884712, 'lng':...",3088,LB,بيروت,لبنان,"[Rafic Salloum Street, بيروت, لبنان]",,محافظة بيروت,5b2666a71ffe97002c155007
7,Fitness Zone,Gym,,33.895597,35.496118,"[{'label': 'display', 'lat': 33.89559742654037...",1557,LB,,لبنان,[لبنان],,محافظة بيروت,4de321a28877bcb68655f3c5
8,180 Degrees Fitness & Spa,Gym / Fitness Center,BirHassan - Marriott Jnah Highway(Next To Oman...,33.87116,35.486589,"[{'label': 'display', 'lat': 33.8711602580186,...",3542,LB,BirHassan - Jnah 💪,لبنان,[BirHassan - Marriott Jnah Highway(Next To Oma...,BirHassan Highway,,547de6d9498ef97a10cfed87
9,Hi-tec fitness. Starco,Athletics & Sports,Starco,33.898606,35.500808,"[{'label': 'display', 'lat': 33.898606, 'lng':...",1216,LB,,لبنان,"[Starco, لبنان]",,,54662dbd498e7e8a014dbbd9
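
The table lists "Fitness Zone At ABC" twice (rows 4 and 5) with different IDs but nearly identical coordinates, so the same gym may be counted twice. Below is a minimal sketch of one way to collapse such entries, assuming that venues sharing a name and coordinates rounded to three decimal places (roughly 100 m) are the same place. This rule and the `deduplicate` helper are illustrative assumptions, not part of the original analysis.

```python
# (name, lat, lng) for rows 4, 5 and 6 of the table above
venues = [
    ('Fitness Zone At ABC', 33.888545, 35.519445),
    ('Fitness Zone At ABC', 33.888653, 35.519371),
    ('Tactical Fitness', 33.884712, 35.544267),
]


def deduplicate(venues, decimals=3):
    """Keep the first venue for each (lower-cased name, rounded lat, rounded lng) key."""
    seen = set()
    unique = []
    for name, lat, lng in venues:
        key = (name.strip().lower(), round(lat, decimals), round(lng, decimals))
        if key not in seen:
            seen.add(key)
            unique.append((name, lat, lng))
    return unique


unique_venues = deduplicate(venues)
print(len(venues), 'listed,', len(unique_venues), 'after deduplication')
```

Rounding can still separate two nearby points that fall on either side of a rounding boundary, so a distance threshold would be a more robust rule for a larger dataset.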


In [53]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around the address

# add the fitness centres as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

In [54]:
dataframe_filtered['categories'].value_counts()

Gym                     14
Gym / Fitness Center     9
Athletics & Sports       2
Office                   1
Juice Bar                1
Pilates Studio           1
Café                     1
Bathing Area             1
Name: categories, dtype: int64
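
Not every venue returned by the `Fitness` query is a gym: the counts above include a café, a juice bar and an office. The sketch below tallies the fitness-related venues from these counts, assuming that only the categories named in `FITNESS_CATEGORIES` count as fitness centres (a choice made here for illustration).

```python
# Category counts copied from the value_counts() output above
category_counts = {
    'Gym': 14,
    'Gym / Fitness Center': 9,
    'Athletics & Sports': 2,
    'Office': 1,
    'Juice Bar': 1,
    'Pilates Studio': 1,
    'Café': 1,
    'Bathing Area': 1,
}

# Assumption: only these categories are treated as fitness centres
FITNESS_CATEGORIES = {'Gym', 'Gym / Fitness Center', 'Athletics & Sports'}

total_venues = sum(category_counts.values())
fitness_venues = sum(
    count for category, count in category_counts.items()
    if category in FITNESS_CATEGORIES
)

print(fitness_venues, 'of', total_venues, 'venues are fitness centres')
```

Adding `'Pilates Studio'` to the set would raise the tally by one. Note also that the total equals the `LIMIT` of 30 set earlier, so the search may have been capped.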

## Results

As the category counts above show, the search returned approximately 25 fitness centres in Beirut. The number of gyms in an area appears to correlate with its level of poverty, although this cannot be verified because data on poverty levels by area is unavailable. Judging from the map, the areas with the fewest listed gyms are Chiyah, Ghobeiry, Sin el Fil, and Corniche el Mazraa.
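
The qualitative reading of the map can be complemented with a simple distance measure. The sketch below, which is not part of the original notebook, computes the great-circle (haversine) distance from a candidate point to its nearest listed gym; a large value suggests an underserved spot. The gym coordinates are taken from the results table, and the candidate point is simply the search centre used earlier, chosen only for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lng2 - lng1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# (name, lat, lng) for three gyms from the results table
gyms = [
    ('Fitness Zone at The Souks', 33.89952691454219, 35.50350666585211),
    ('Tactical Fitness', 33.884712, 35.544267),
    ('180 Degrees Fitness & Spa', 33.87116, 35.486589),
]


def nearest_gym(lat, lng, gyms):
    """Return (name, distance in metres) of the gym closest to the given point."""
    return min(
        ((name, haversine_m(lat, lng, g_lat, g_lng)) for name, g_lat, g_lng in gyms),
        key=lambda pair: pair[1],
    )


# Candidate point: the geocoded search centre from earlier in the notebook
name, distance = nearest_gym(33.8942938, 35.5129043, gyms)
print(name, round(distance), 'm')
```

For this point the result should be close to the `distance` value Foursquare reports for the same venue in the table. Running `nearest_gym` over a set of candidate points, one per neighbourhood, would give a rough ranking of how underserved each one is.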

## Discussion

There is the obvious issue that many of these areas are poorly covered by Foursquare's data. It is likely that gyms do exist in these areas but are simply not listed. However, if we take a Foursquare listing as a proxy for the quality of a gym, we can conclude that the areas listed above are the most underserved by quality gyms.

It can thus be concluded that a high-quality gym set up in one of those areas, with a business model that properly targets local residents, would likely be a success.

## Conclusion

In conclusion, I used the Foursquare API to find the areas of Beirut with the fewest fitness centres and, on that basis, recommend potential locations for a new gym.