# The Restaurant Finder | Business Problem | Introduction  
"London is satisfied, Paris is resigned, but **New York** is always hopeful. Always it believes that something good is about to come off, and it must hurry to meet it."

## Introduction  

According to recent statistics, New York City alone has over 30,000 restaurants serving a wide range of cuisines. It would take one person years to visit them all. The restaurant business therefore looks like one of the most lucrative industries in New York City, since the city is home to people of many different races and cultures.  
New York's diversity suggests that a restaurant can be more profitable here than most other businesses. This project serves as a guide to opening an Indian restaurant in New York City, based on a demographic analysis of venues in the city's neighborhoods.

## Business Problem

New York is famous for restaurants serving cuisines from all over the world, a food culture shaped by the city's large immigrant population. Opening a restaurant here therefore requires weighing several factors:
1. New York City's location and demographics.
2. The venues in the different neighborhoods of New York City.
3. The competitors in the chosen location.
4. The categories of venues in each neighborhood.

As a data scientist, my job is to give clients deep insight into these factors and help them make this critical business decision. That means showing graphically which area would be best for opening a new restaurant and explaining in full why it is the best area.

## Data Source

The NYC population and demographic data were collected from:
1. https://en.wikipedia.org/wiki/New_York_City
2. https://en.wikipedia.org/wiki/Demographics_of_New_York_City
3. https://cocl.us/new_york_dataset

Python libraries such as requests, BeautifulSoup, and pandas were used to extract this data. 
The Foursquare API was used to analyze the neighborhoods and their existing Indian restaurants, along with reviews and suggestions from users. 
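
To make the Foursquare step more concrete, here is a minimal sketch of how a venue search request could be assembled for one neighborhood. It assumes the legacy Foursquare v2 `venues/explore` endpoint, which may not be the endpoint this project calls. The helper name `build_explore_url`, the placeholder credentials, and the query, radius, limit, and version values are all illustrative assumptions.

```python
from urllib.parse import urlencode

# Assumption: the legacy Foursquare v2 "explore" endpoint.
EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(lat, lon, client_id, client_secret,
                      query='Indian Restaurant', radius=1000, limit=50,
                      version='20180605'):
    """Return a venue-search URL for one location (hypothetical helper)."""
    params = {
        'client_id': client_id,          # placeholder credentials
        'client_secret': client_secret,
        'v': version,                    # API version date, YYYYMMDD
        'll': f'{lat},{lon}',            # "latitude,longitude"
        'query': query,
        'radius': radius,                # search radius in meters
        'limit': limit,
    }
    return f'{EXPLORE_ENDPOINT}?{urlencode(params)}'

# Example coordinates for a single neighborhood; no request is sent here.
url = build_explore_url(40.894705, -73.847201,
                        'YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET')
print(url)
```

With real credentials, the URL could be passed to `requests.get(url).json()` and the response inspected for venues in that neighborhood.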

#### Let's begin by importing all the libraries we will need to perform the analysis.

In [7]:
import pandas as pd
import numpy as np
import requests # Handles HTTP requests
import random # Generates random numbers

# Install geopy to convert an address into latitude and longitude.
!pip install geopy
from geopy.geocoders import Nominatim

# Libraries to display images and HTML
from IPython.display import Image
from IPython.display import HTML
from IPython.display import display_html

# Converts nested JSON into a flat pandas DataFrame
from pandas import json_normalize

# Library to plot the results on a map
!pip install folium==0.5.0
import folium 

from bs4 import BeautifulSoup
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors


print('Libraries Imported Successfully')


Libraries Imported Successfully


In [12]:
def get_new_york_data():
    url='https://cocl.us/new_york_dataset'
    resp=requests.get(url).json()
    # all data is present in features label
    features=resp['features']
    
    # define the dataframe columns
    column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 
    # collect one row per neighborhood (DataFrame.append was removed in pandas 2.0)
    rows = []
    
    for data in features:
        borough = data['properties']['borough'] 
        neighborhood_name = data['properties']['name']
        
        # coordinates are stored as [longitude, latitude]
        neighborhood_latlon = data['geometry']['coordinates']
        neighborhood_lat = neighborhood_latlon[1]
        neighborhood_lon = neighborhood_latlon[0]
    
        rows.append({'Borough': borough,
                     'Neighborhood': neighborhood_name,
                     'Latitude': neighborhood_lat,
                     'Longitude': neighborhood_lon})
    
    # build the dataframe once from the collected rows
    new_york_data = pd.DataFrame(rows, columns=column_names)
    return new_york_data

In [13]:
new_york_data = get_new_york_data()
new_york_data.head()

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Bronx | Wakefield | 40.894705 | -73.847201 |
| 1 | Bronx | Co-op City | 40.874294 | -73.829939 |
| 2 | Bronx | Eastchester | 40.887556 | -73.827806 |
| 3 | Bronx | Fieldston | 40.895437 | -73.905643 |
| 4 | Bronx | Riverdale | 40.890834 | -73.912585 |
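
As an alternative to the explicit loop in `get_new_york_data`, the same flattening can be sketched with `json_normalize`, which the import cell already loads. The example runs offline on two hand-written features shaped like the ones the loop reads. The names `sample_features`, `flat`, `coords`, and `sample_df` are illustrative, and the values mirror the first two rows of the output.

```python
import pandas as pd

# Two hand-written features in the shape the loop reads;
# coordinates are stored as [longitude, latitude].
sample_features = [
    {'geometry': {'coordinates': [-73.847201, 40.894705]},
     'properties': {'borough': 'Bronx', 'name': 'Wakefield'}},
    {'geometry': {'coordinates': [-73.829939, 40.874294]},
     'properties': {'borough': 'Bronx', 'name': 'Co-op City'}},
]

# Flatten nested keys into dotted column names such as 'properties.name'.
flat = pd.json_normalize(sample_features)

# Split each [longitude, latitude] pair into two columns.
coords = pd.DataFrame(flat['geometry.coordinates'].tolist(),
                      columns=['Longitude', 'Latitude'])

sample_df = pd.DataFrame({
    'Borough': flat['properties.borough'],
    'Neighborhood': flat['properties.name'],
    'Latitude': coords['Latitude'],
    'Longitude': coords['Longitude'],
})
print(sample_df)
```

This version builds the table in one pass instead of row by row. Either approach should give the same four columns.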


In [9]:
# Download the Wikipedia demographics page and parse it
response_obj = requests.get('https://en.wikipedia.org/wiki/Demographics_of_New_York_City').text
soup = BeautifulSoup(response_obj,'lxml')
# Use a CSS selector to target the census table; its position (5) depends on the page's current layout
Population_Census_Table = soup.select_one('.wikitable:nth-of-type(5)')

jurisdictions = []
rows = Population_Census_Table.select("tbody > tr")[3:8]
for row in rows:
    jurisdiction = {}
    tds = row.select('td')
    jurisdiction["jurisdiction"] = tds[0].text.strip()
    jurisdiction["population_census"] = tds[1].text.strip()
    jurisdiction["%_white"] = float(tds[2].text.strip().replace(",",""))
    jurisdiction["%_black_or_african_amercian"] = float(tds[3].text.strip().replace(",",""))
    jurisdiction["%_Asian"] = float(tds[4].text.strip().replace(",",""))
    jurisdiction["%_other"] = float(tds[5].text.strip().replace(",",""))
    jurisdiction["%_mixed_race"] = float(tds[6].text.strip().replace(",",""))
    jurisdiction["%_hispanic_latino_of_other_race"] = float(tds[7].text.strip().replace(",",""))
    jurisdiction["%_catholic"] = float(tds[10].text.strip().replace(",",""))
    jurisdiction["%_jewish"] = float(tds[12].text.strip().replace(",",""))
    jurisdictions.append(jurisdiction)

print(jurisdictions)

[{'jurisdiction': 'Queens', 'population_census': '2,229,379', '%_white': 44.1, '%_black_or_african_amercian': 20.0, '%_Asian': 17.6, '%_other': 12.3, '%_mixed_race': 6.1, '%_hispanic_latino_of_other_race': 25.0, '%_catholic': 37.0, '%_jewish': 5.0}, {'jurisdiction': 'Manhattan', 'population_census': '1,537,195', '%_white': 54.4, '%_black_or_african_amercian': 17.4, '%_Asian': 9.4, '%_other': 14.7, '%_mixed_race': 4.1, '%_hispanic_latino_of_other_race': 27.2, '%_catholic': 11.0, '%_jewish': 9.0}, {'jurisdiction': 'Bronx', 'population_census': '1,332,650', '%_white': 29.9, '%_black_or_african_amercian': 35.6, '%_Asian': 3.0, '%_other': 25.7, '%_mixed_race': 5.8, '%_hispanic_latino_of_other_race': 48.4, '%_catholic': 14.0, '%_jewish': 5.0}, {'jurisdiction': 'Staten Island', 'population_census': '443,728', '%_white': 77.6, '%_black_or_african_amercian': 9.7, '%_Asian': 5.7, '%_other': 4.3, '%_mixed_race': 2.7, '%_hispanic_latino_of_other_race': 12.1, '%_catholic': 11.0, '%_jewish': 5.0}, {

In [10]:
df = pd.DataFrame(jurisdictions, columns=["jurisdiction","%_white", "%_black_or_african_amercian", "%_Asian", "%_other", "%_mixed_race", "%_hispanic_latino_of_other_race"])
df.head()

| | jurisdiction | %_white | %_black_or_african_amercian | %_Asian | %_other | %_mixed_race | %_hispanic_latino_of_other_race |
|---|---|---|---|---|---|---|---|
| 0 | Queens | 44.1 | 20.0 | 17.6 | 12.3 | 6.1 | 25.0 |
| 1 | Manhattan | 54.4 | 17.4 | 9.4 | 14.7 | 4.1 | 27.2 |
| 2 | Bronx | 29.9 | 35.6 | 3.0 | 25.7 | 5.8 | 48.4 |
| 3 | Staten Island | 77.6 | 9.7 | 5.7 | 4.3 | 2.7 | 12.1 |
| 4 | NYC Total | 44.7 | 26.6 | 9.8 | 14.0 | 4.9 | 27.0 |
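
One way to read this table for the business question is to rank the boroughs by a single column. The sketch below uses the `%_Asian` share as a rough proxy for potential demand for an Indian restaurant. That proxy is an assumption made only for illustration, since the census category is much broader than the Indian community. The records are copied from the scraped output rather than fetched again, and the names `boroughs` and `ranked` are illustrative.

```python
# Borough records copied from the scraped output; "NYC Total" is left out
# because it is an aggregate, not a borough.
boroughs = [
    {'jurisdiction': 'Queens', 'population_census': '2,229,379', '%_Asian': 17.6},
    {'jurisdiction': 'Manhattan', 'population_census': '1,537,195', '%_Asian': 9.4},
    {'jurisdiction': 'Bronx', 'population_census': '1,332,650', '%_Asian': 3.0},
    {'jurisdiction': 'Staten Island', 'population_census': '443,728', '%_Asian': 5.7},
]

for b in boroughs:
    # The population is scraped as text with thousands separators,
    # so convert it to an integer before doing any arithmetic.
    b['population'] = int(b['population_census'].replace(',', ''))

# Sort from the highest to the lowest share.
ranked = sorted(boroughs, key=lambda b: b['%_Asian'], reverse=True)
for b in ranked:
    print(f"{b['jurisdiction']:<14} {b['%_Asian']:>5.1f}%  {b['population']:>9,}")
```

On these figures Queens ranks first. A fuller analysis would combine this with the venue and competitor data for each neighborhood.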
