# Capstone Project - The Battle of the Neighborhoods 
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find an optimal location for a restaurant. Specifically, this report is targeted at stakeholders interested in opening a multi-cuisine restaurant in Hyderabad, India.

We need a neighborhood with a steady stream of customers. Since there are many restaurants in Hyderabad, we will try to detect locations that are not already crowded with restaurants. We are also particularly interested in areas with no multi-cuisine restaurants in the vicinity. We would also prefer locations as close to the city center as possible, assuming that the first two conditions are met.

We will use our data science powers to generate a few of the most promising neighborhoods based on these criteria.

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:

   1. frequency of restaurants in the neighborhood (any type of restaurant)
   2. number of and distance to multi-cuisine restaurants in the neighborhood, if any
   3. stream of customers visiting the neighborhood
   4. distance of the neighborhood from the city center

We will use the Foursquare API to fetch venue information and analyze neighborhoods.
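
The distance factors in the list above can be computed directly from latitude and longitude pairs. The following is a minimal sketch, assuming a spherical Earth and the haversine formula; the function name `haversine_m` and the constant `EARTH_RADIUS_M` are illustrative and not part of the original notebook. It uses the city-center coordinates printed by the geocoding step and one venue from the example data, and the result should land close to the `location.distance` value that Foursquare reports for that venue.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# City-center coordinates as printed by the geocoding step in this report.
CENTER = (17.38878595, 78.4610647345315)

# Coordinates of one venue from the example data (Nimrah Restaurant And Bakery).
venue = (17.392616, 78.468165)

distance = haversine_m(CENTER[0], CENTER[1], venue[0], venue[1])
print(round(distance), "meters from the city center")
```

The same function could be applied to a neighborhood centroid instead of a single venue to obtain factor 4.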

### Example data

In [11]:
# importing libraries

import requests
import pandas as pd
import numpy as np
import random
import folium
from geopy.geocoders import Nominatim

#from IPython.display import Image
#from IPython.core.display import HTML

# pandas.io.json.json_normalize is deprecated since pandas 1.0; use the top-level function
from pandas import json_normalize
print("imported libraries")

#API Credentials
CLIENT_ID = '2GY2DGZ2WOXNNKBHZIDRTCTVLHYZJGY2214GF4O1WPMWXDHC' # your Foursquare ID
CLIENT_SECRET = 'SE1DCOXXYGQOULSZ2VEPYAV0SJ2JRNUNWGBDOEMD1XXUF0KA' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100

# Get latitude and longitude
address = 'Hyderabad'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

#location = geolocator.reverse("{} {}".format(latitude, longitude))
#print(location.address)


# search query
search_query = 'restaurant'
radius = 1000
url = 'https://api.foursquare.com/v2/venues/search'
# pass the query string as a dict so that requests handles the URL encoding
params = {
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
    'll': '{},{}'.format(latitude, longitude),
    'v': VERSION,
    'query': search_query,
    'radius': radius,
    'limit': LIMIT,
}
results = requests.get(url, params=params).json()
#print(results)
venues=results['response']['venues']
dataframe = json_normalize(venues)
dataframe

imported libraries
17.38878595 78.4610647345315


Unnamed: 0,id,name,categories,referralId,hasPerk,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.address,location.crossStreet
0,5a21602775a6ea748fd77429,Nimrah Restaurant And Bakery,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",v-1571232771,False,17.392616,78.468165,"[{'label': 'display', 'lat': 17.392616, 'lng':...",866,500004.0,IN,Hyderabad,TG,India,"[Hyderabad 500004, TG, India]",,
1,50d1b77ee4b0d5316b4e0bff,Voulga Restaurant,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",v-1571232771,False,17.382614,78.465828,"[{'label': 'display', 'lat': 17.38261412350797...",853,,IN,Hyderabad,Telangana,India,"[Darus Salaam, Hyderabad, Telangana, India]",Darus Salaam,
2,5823557cda82023188157a2f,Grills restaurant,"[{'id': '503287a291d4c4b30a586d65', 'name': 'F...",v-1571232771,False,17.386126,78.456796,"[{'label': 'display', 'lat': 17.38612646647625...",541,,IN,Hyderabad,Telangana,India,"[Hyderabad, Telangana, India]",,
3,5686bc36498ece24b0f97da8,Zoha Restaurant,"[{'id': '54135bf5e4b08f3d2429dfe6', 'name': 'H...",v-1571232771,False,17.393055,78.457642,"[{'label': 'display', 'lat': 17.393055, 'lng':...",598,,IN,,,India,[India],,
4,5a1ade2ae96d0c5d8b3dc3da,Sri Anupama Family Restaurant,"[{'id': '54135bf5e4b08f3d2429dfe5', 'name': 'A...",v-1571232771,False,17.384757,78.45589,"[{'label': 'display', 'lat': 17.38475703996001...",709,500073.0,IN,Hyderabad,Telangana,India,"[Ahmed Commercial Complex, Ameerpet Main Road,...","Ahmed Commercial Complex, Ameerpet Main Road, ...","Nagarjuna Nagar Colony, Yella Reddy Guda"
5,50ba59a4e4b077f48d74046b,Azizia Restaurant,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",v-1571232771,False,17.39258,78.467255,"[{'label': 'display', 'lat': 17.39258040421995...",781,,IN,,,India,[India],,


In [15]:
dataframe.describe()

| | location.lat | location.lng | location.distance |
|---|---|---|---|
| count | 6.0 | 6.0 | 6.0 |
| mean | 17.388625 | 78.461929 | 724.666667 |
| std | 0.004659 | 0.005721 | 133.896477 |
| min | 17.382614 | 78.45589 | 541.0 |
| 25% | 17.385099 | 78.457007 | 625.75 |
| 50% | 17.389353 | 78.461735 | 745.0 |
| 75% | 17.392607 | 78.466898 | 835.0 |
| max | 17.393055 | 78.468165 | 866.0 |


In [18]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]
dataframe_filtered
#filtered_columns

Unnamed: 0,name,categories,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.address,location.crossStreet,id
0,Nimrah Restaurant And Bakery,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",17.392616,78.468165,"[{'label': 'display', 'lat': 17.392616, 'lng':...",866,500004.0,IN,Hyderabad,TG,India,"[Hyderabad 500004, TG, India]",,,5a21602775a6ea748fd77429
1,Voulga Restaurant,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",17.382614,78.465828,"[{'label': 'display', 'lat': 17.38261412350797...",853,,IN,Hyderabad,Telangana,India,"[Darus Salaam, Hyderabad, Telangana, India]",Darus Salaam,,50d1b77ee4b0d5316b4e0bff
2,Grills restaurant,"[{'id': '503287a291d4c4b30a586d65', 'name': 'F...",17.386126,78.456796,"[{'label': 'display', 'lat': 17.38612646647625...",541,,IN,Hyderabad,Telangana,India,"[Hyderabad, Telangana, India]",,,5823557cda82023188157a2f
3,Zoha Restaurant,"[{'id': '54135bf5e4b08f3d2429dfe6', 'name': 'H...",17.393055,78.457642,"[{'label': 'display', 'lat': 17.393055, 'lng':...",598,,IN,,,India,[India],,,5686bc36498ece24b0f97da8
4,Sri Anupama Family Restaurant,"[{'id': '54135bf5e4b08f3d2429dfe5', 'name': 'A...",17.384757,78.45589,"[{'label': 'display', 'lat': 17.38475703996001...",709,500073.0,IN,Hyderabad,Telangana,India,"[Ahmed Commercial Complex, Ameerpet Main Road,...","Ahmed Commercial Complex, Ameerpet Main Road, ...","Nagarjuna Nagar Colony, Yella Reddy Guda",5a1ade2ae96d0c5d8b3dc3da
5,Azizia Restaurant,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",17.39258,78.467255,"[{'label': 'display', 'lat': 17.39258040421995...",781,,IN,,,India,[India],,,50ba59a4e4b077f48d74046b
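
The `categories` column in the output above holds a list of dictionaries per venue, which is awkward to filter on. A minimal sketch of flattening it to a single category name follows. The helper `primary_category` and the sample records are hypothetical: the records only mimic the shape of the Foursquare field and deliberately use placeholder names rather than the truncated values shown in the table.

```python
def primary_category(categories):
    """Return the name of the first category entry, or None if there is none."""
    if not isinstance(categories, list) or not categories:
        return None
    return categories[0].get('name')


# Hypothetical records mimicking the shape of the `categories` field.
sample_venues = [
    {'name': 'Venue A', 'categories': [{'id': 'x1', 'name': 'Category One'}]},
    {'name': 'Venue B', 'categories': []},
]

for v in sample_venues:
    print(v['name'], '->', primary_category(v['categories']))
```

On the dataframe itself, the same helper could be applied with `dataframe_filtered['categories'].apply(primary_category)`, assuming every cell holds a list as in the output above.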


## Methodology <a name="methodology"></a>

After fetching venue and neighborhood data for Hyderabad, we filter out the venues we are interested in and perform data wrangling to create a dataframe. We then visualize the restaurant data by neighborhood, compute the frequency of restaurants in each neighborhood, cluster the neighborhoods accordingly, and analyze the results to pick the best possible location.
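
As an illustration of the frequency step, the sketch below counts restaurants per neighborhood and ranks the neighborhoods. It is an assumption-laden simplification: it does not perform the clustering step, it substitutes a plain sort (fewest restaurants first, ties broken by distance to the city center), and all venue names, neighborhood names, and distances are hypothetical placeholders rather than data from this report.

```python
from collections import Counter

# Hypothetical (venue name, neighborhood) pairs; in the report these would
# come from the wrangled Foursquare dataframe.
venues = [
    ('Venue A', 'Neighborhood 1'),
    ('Venue B', 'Neighborhood 1'),
    ('Venue C', 'Neighborhood 2'),
    ('Venue D', 'Neighborhood 1'),
    ('Venue E', 'Neighborhood 3'),
    ('Venue F', 'Neighborhood 3'),
]

# Hypothetical distance (in meters) of each neighborhood from the city center.
distance_to_center = {
    'Neighborhood 1': 500,
    'Neighborhood 2': 2500,
    'Neighborhood 3': 1200,
}

# Frequency of restaurants per neighborhood.
restaurant_counts = Counter(neighborhood for _, neighborhood in venues)

# Rank: fewest restaurants first, ties broken by closeness to the center.
ranking = sorted(
    distance_to_center,
    key=lambda n: (restaurant_counts[n], distance_to_center[n]),
)

for n in ranking:
    print(n, restaurant_counts[n], distance_to_center[n])
```

A lexicographic sort like this always favors low restaurant density over proximity; a weighted score would be an alternative if the two criteria should trade off against each other.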

## Analysis <a name="analysis"></a>

## Results and Discussion <a name="results"></a>

## Conclusion <a name="conclusion"></a>