# Capstone Project - The Battle of the Neighborhoods 
### Optimal location for my healthy food brand in Doha

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

This project is based on a real-life problem. I run a healthy snacks company from home and I am looking for a potential spot to set up shop here in Doha, Qatar. Since I am personally invested, I will try to be as practical as I can with this decision.

The idea of healthy eating is gaining popularity here in Qatar. Slowly but surely, people are making an effort to eat responsibly, especially when it comes to their kids. In this project we want to focus on localities close to schools since that is a place of particular interest. It would also help our cause if no other cafes are in the vicinity since we primarily sell snacks, not fully-plated food.

Using all my data science experience, I set forth to discover the best location possible for this investment.

## Data <a name="data"></a>

The most important factors to consider are:
* areas close to schools
* number of and distance to cafes in the neighborhood, if any
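
One way to put a number on the "distance to cafes" factor is the great-circle distance between two coordinate pairs. The sketch below is a minimal haversine implementation, assuming a spherical Earth with a mean radius of 6371 km. The function name `haversine_km` and the sample points are illustrative and not part of the original notebook.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative points inside Doha (stand-ins for a zone centre and a cafe)
zone_centre = (25.2865, 51.5296)
cafe = (25.2818, 51.5275)
print(round(haversine_km(*zone_centre, *cafe), 3), "km")
```

For distances of a few hundred metres inside one city, the spherical approximation should be more than accurate enough for ranking neighbourhoods.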

Qatar is divided into eight municipalities, and each municipality is further divided into zones.
We will focus this research on the zones under the Doha municipality, which will serve as our defined neighbourhoods.

I am relying on the following sources for the data I need:
* coordinates of Doha using Google Maps
* coordinates of the zones using latlong.net
* number of schools and their locations in every zone using the Foursquare API

In [53]:
import requests # library to handle HTTP requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image, HTML

# transforming a JSON file into a pandas dataframe
from pandas import json_normalize # top-level import; pandas.io.json.json_normalize is deprecated since pandas 1.0

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')


Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


### Defining the neighbourhoods

Let's create a dataframe with the latitude and longitude of each zone we want to focus on.

In [84]:
from bs4 import BeautifulSoup

# Fetch the Wikipedia page on the zones of Qatar and collect its tables for reference
res = requests.get("https://en.wikipedia.org/wiki/Zones_of_Qatar")
soup = BeautifulSoup(res.content,'lxml')

tables = soup.find_all('table', class_='wikitable')

# Zone data for the Doha municipality, entered directly (coordinates looked up on latlong.net)
d = {
    'Zone': [3,4,14,15,16,17,22,24,25,26,27,30,31,32,33,34,35,36,37,38,40,41,42,45,46,61,63,64,67,68],
    'Districts': ['Fereej Mohamed Bin Jasim','Mushayrib','Fereej Abdel Aziz','Ad Dawhah al Jadidah','Old Al Ghanim','Al Rufaa','Fereej Bin Mahmoud','Rawdat Al Khail','Fereej Bin Durham','Najma','Umm Ghuwailina','Duhail','Umm Lekhba','Madinat Khalifa North','Al Markhiya','Madinat Khalifa South','Fereej Kulaib','Al Messila','Fereej Bin Omran','Al Sadd','New Salatah','Nuaija','Al Hilal','Old Airport','Al Thumama','Al Dafna','Onaiza','Lejbailat','Hazm Al Markhiya','Jelaiah'],
    'Population': [4886,28069,15706,15920,16334,6026,28327,18200,37082,28228,33262,7705,11897,12364,6242,38247,6507,6803,26121,41673,16086,33379,11671,48525,21367,4022,37461,4151,8967,5521],
    'Latitude': [25.2865,25.2818,25.2777,25.2776,25.28,25.2853,25.2803,25.286,25.2693,25.2683,25.2766,25.3477,25.3477,25.329,25.3388,25.3156,25.3138,25.3006,25.3038,25.2838,25.2623,25.2467,25.2599,25.2481,25.2316,25.3077,25.3469,25.3212,25.3388,25.3522],
    'Longitude': [51.5296,51.5275,51.5242,51.5321,51.54,51.5444,51.5124,51.5142,51.5295,51.5387,51.5492,51.4675,51.4675,51.4756,51.4992,51.4808,51.4914,51.4808,51.4953,51.4914,51.5094,51.5334,51.5439,51.5544,51.5413,51.5163,51.5176,51.5032,51.4992,51.4861],
}
df = pd.DataFrame(data=d)
df



| | Zone | Districts | Population | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | 3 | Fereej Mohamed Bin Jasim | 4886 | 25.2865 | 51.5296 |
| 1 | 4 | Mushayrib | 28069 | 25.2818 | 51.5275 |
| 2 | 14 | Fereej Abdel Aziz | 15706 | 25.2777 | 51.5242 |
| 3 | 15 | Ad Dawhah al Jadidah | 15920 | 25.2776 | 51.5321 |
| 4 | 16 | Old Al Ghanim | 16334 | 25.28 | 51.54 |
| 5 | 17 | Al Rufaa | 6026 | 25.2853 | 51.5444 |
| 6 | 22 | Fereej Bin Mahmoud | 28327 | 25.2803 | 51.5124 |
| 7 | 24 | Rawdat Al Khail | 18200 | 25.286 | 51.5142 |
| 8 | 25 | Fereej Bin Durham | 37082 | 25.2693 | 51.5295 |
| 9 | 26 | Najma | 28228 | 25.2683 | 51.5387 |
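
Because the coordinates were entered by hand, a quick sanity check can catch zones that ended up with identical latitude/longitude pairs. The helper below is a minimal sketch in plain Python. The name `find_shared_coordinates` is hypothetical, and the three sample rows are copied from the zone dictionary above.

```python
from collections import defaultdict


def find_shared_coordinates(rows):
    """Group district names by (lat, lon) and keep only coordinates used more than once."""
    by_coord = defaultdict(list)
    for name, lat, lon in rows:
        by_coord[(lat, lon)].append(name)
    return {coord: names for coord, names in by_coord.items() if len(names) > 1}


# Three rows copied from the zone dictionary above
sample = [
    ("Duhail", 25.3477, 51.4675),
    ("Umm Lekhba", 25.3477, 51.4675),
    ("Madinat Khalifa North", 25.329, 51.4756),
]
shared = find_shared_coordinates(sample)
print(shared)
```

On the full table, the same helper could be fed `df[['Districts', 'Latitude', 'Longitude']].itertuples(index=False, name=None)`. Any pair it reports is worth looking up again before running location-based searches, since two zones with one shared centre would return identical results.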


### Foursquare
Now that we have the latitude and longitude of each neighbourhood, let's use Foursquare to check for nearby schools, which are our priority.

In [90]:
CLIENT_ID = 'WBOFWONU5OGJYAT2FA5MCFEFGRTX1VYNQTCFMK2HKJVHONFB' # your Foursquare ID
CLIENT_SECRET = 'WWJELALMLDLOWY5RK1AIIIB3SMWWBOIU4ECBE4045IZCWAK0' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: WBOFWONU5OGJYAT2FA5MCFEFGRTX1VYNQTCFMK2HKJVHONFB
CLIENT_SECRET:WWJELALMLDLOWY5RK1AIIIB3SMWWBOIU4ECBE4045IZCWAK0
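
Hard-coding and printing API keys in a shared notebook exposes them to anyone who reads it. A common alternative is to read them from environment variables. The sketch below assumes two hypothetical variable names, `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`. The helper `load_foursquare_credentials` is illustrative and not part of the original notebook.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from an environment-style mapping.

    The variable names used here are hypothetical; pick whatever suits your setup.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Demonstration with a plain dict standing in for os.environ
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", client_id)  # the secret is deliberately not printed
```

In the notebook itself, calling `load_foursquare_credentials()` with no argument would read from the real environment, so the keys never appear in the cell source or its output.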


In [91]:
address = 'Doha'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

25.2856329 51.5264162


In [99]:
search_query = 'schools'
radius = 500

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

results = requests.get(url).json()
results


# Category IDs corresponding to schools were taken from the Foursquare website (https://developer.foursquare.com/docs/resources/categories):
schools = '4bf58dd8d48988d13b941735' # category ID for schools

school_categories = ['4f4533804b9074f6e4fb0105','4bf58dd8d48988d13d941735','4f4533814b9074f6e4fb0106', 
                        '4f4533814b9074f6e4fb0107', '52e81612bcbc57f1066b7a45', '52e81612bcbc57f1066b7a46']


{'meta': {'code': 200, 'requestId': '5e4a5144923935001bd5bb53'},
 'response': {'venues': []}}
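
The free-text query around the city centre returned no venues. One possible next step, offered here as an assumption rather than the notebook's stated method, is to search around each zone centre and filter by the school category IDs instead of a text query. The sketch below only builds the request URL. It assumes the v2 `venues/search` endpoint accepts a comma-separated `categoryId` parameter, and the helper name `build_school_search_url` and the placeholder credentials are hypothetical.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_school_search_url(lat, lon, client_id, client_secret, category_ids,
                            version="20180604", radius=500, limit=30):
    """Build a venues/search URL that filters by category ID around one point."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lon}",
        "categoryId": ",".join(category_ids),  # assumed: comma-separated category IDs
        "radius": radius,
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Placeholder credentials and the Zone 3 centre from the table
url = build_school_search_url(
    25.2865, 51.5296, "YOUR_ID", "YOUR_SECRET",
    ["4bf58dd8d48988d13b941735", "4bf58dd8d48988d13d941735"],
)
print(url)
```

The resulting URL can be passed to `requests.get(url).json()` exactly as in the cell above, once per row of `df`, using `CLIENT_ID`, `CLIENT_SECRET`, and `school_categories` in place of the placeholders.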