# The Battle of Neighborhoods

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

Many Muslims who eat meat usually eat Halal Meat. What is "Halal"? The word "Halal" literally means 
permissible. Now to make meat Halal or permissible, an animal or poultry has to be slaughtered in a 
ritual way known "Zabihah". To make it readily comprehended halal is somewhat like Jewish kosher. 
Zabihah require animals to be alive and healthy at the time of slaughter. 
Basically it must be Slaughtered a certain way which the flowing blood must be drained out of the 
carcass, as blood is forbidden.

The Problem:

Say there is a Muslim family who wants to travel to another city in the US for vacation. 
They landed on to 3 choices, New York City, Los Angeles, and Chicago. 
They want to know which city has more venues that offer Halal.

My Target Audience:
Muslims who want to go to any of these cites or live in these cities. 
This would allow them to have an idea of what is around and the density of halal food in their 
respective city.

## Data <a name="data"></a>

Based on definition of our problem, factors that will influence our decission are:
* number of existing restaurants by neighborhood or ZIP Code of our given Cites (Halal restaurants)

The following data is Scraped from websites containing all Neighborhoods and ZIP Codes for the following 
Cities

Tools Used in Project
* **Geopy Nominatim** used to generate Geo Coordinates of our dataset
* number of restaurants and location in every neighborhood will be obtained using **Foursquare API**

In [1]:
import pandas as pd # Data Analysis
import numpy as np # Handles Data in a Vectorized Manner
import requests #handles requests
import folium # Plotting
from geopy.geocoders import Nominatim # address to Latitude and Longitude
from geopy.extra.rate_limiter import RateLimiter # reduces the "Too Many Requests Error"
from bs4 import BeautifulSoup # web scraping


print('Modules Imported!')

Modules Imported!


In [2]:
# Web Sources for scraping

NYC_url = 'https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm'
LA_url = 'http://www.laalmanac.com/communications/cm02_communities.php'
CHI_url = 'https://www.seechicagorealestate.com/chicago-zip-codes-by-neighborhood.php'

## Web Scraping and setting up DataFrames

New York City

In [3]:
source1 = requests.get(NYC_url).text
soup1 = BeautifulSoup(source1,'lxml')
table1 = soup1.find('table')

readme_html1 = pd.read_html(str(table1))
NYC_df = pd.DataFrame(readme_html1[0])
print(NYC_df.shape, '\n', NYC_df.head())

(42, 3) 
   Borough                Neighborhood                   ZIP Codes
0   Bronx               Central Bronx         10453, 10457, 10460
1   Bronx      Bronx Park and Fordham         10458, 10467, 10468
2   Bronx  High Bridge and Morrisania         10451, 10452, 10456
3   Bronx  Hunts Point and Mott Haven  10454, 10455, 10459, 10474
4   Bronx   Kingsbridge and Riverdale                10463, 10471


Los Angeles 

In [4]:
source2 = requests.get(LA_url).text
soup2 = BeautifulSoup(source2, 'lxml')
table2 = soup2.find('table')

readme_html2 = pd.read_html(str(table2))
LA_df = pd.DataFrame(readme_html2[0])
print(LA_df.shape, '\n', LA_df.head())

(643, 2) 
             City/Community   Zip Code(s)
0                    Acton         93510
1             Agoura Hills         91301
2  Agoura Hills (PO Boxes)         91376
3               Agua Dulce         91390
4                 Alhambra  91801, 91803


Chicago

In [5]:
source3 = requests.get(CHI_url).text
soup3 = BeautifulSoup(source3, 'lxml')
table3 = soup3.find('table')

readme_html3 = pd.read_html(str(table3))
CHI_df = pd.DataFrame(readme_html3[0])
print(CHI_df.shape, '\n', CHI_df.head())

(199, 2) 
                     0             1
0            Downtown      Zip Code
1  Cathedral District         60611
2     Central Station         60605
3       Dearborn Park         60605
4          Gold Coast  60610, 60611


Renaming columns to be consistent through each DataFrame

In [6]:
NYC_df.rename(columns={'ZIP Codes': 'zip_codes'}, inplace=True)
LA_df.rename(columns={'City/Community': 'Neighborhood', 'Zip Code(s)': 'zip_codes'}, inplace=True)
CHI_df.columns = ['Neighborhood','zip_codes']

print(list(NYC_df.columns))
print(list(LA_df.columns))
print(list(CHI_df.columns))

['Borough', 'Neighborhood', 'zip_codes']
['Neighborhood', 'zip_codes']
['Neighborhood', 'zip_codes']


## Cleaning DataFrames

In [7]:
#Checking NA and NULLs in each DataFrame

print(NYC_df.isna().sum())
print(NYC_df.isnull().sum())
print('\n')
print(LA_df.isna().sum())
print(LA_df.isnull().sum())
print('\n')
print(CHI_df.isna().sum())
print(CHI_df.isnull().sum())

Borough         0
Neighborhood    0
zip_codes       0
dtype: int64
Borough         0
Neighborhood    0
zip_codes       0
dtype: int64


Neighborhood    0
zip_codes       0
dtype: int64
Neighborhood    0
zip_codes       0
dtype: int64


Neighborhood    7
zip_codes       7
dtype: int64
Neighborhood    7
zip_codes       7
dtype: int64


In [8]:
#Investigating the CHI DataFrame

#pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
#pd.set_option('display.width', None)
#pd.set_option('display.max_colwidth', -1)
#CHI_df

In [9]:
CHI_df.dropna(inplace=True)
print(CHI_df.isna().sum())
print(CHI_df.isnull().sum())

Neighborhood    0
zip_codes       0
dtype: int64
Neighborhood    0
zip_codes       0
dtype: int64


## Methodology <a name="methodology"></a>

#### Step 1:

In this Project, I discovered that there are muliple zipcodes for each Neighborhood.
I directed my effort in exploding our datasets so that we could view each ZIP Code by its 
neighborhood. As for my reason for exploding the datasets, it is so that the geo locator can go through each
ZIP Code for the best possible results. 

#### Step 2:

After Exploding my dataset and finding the geo coordinates to each of my ZIP Code, I would then have to explode
my datasets again. This being so instead of being a Tuple in my dataset I would have dedicated columns for each
of my Latitude and Longitude Coordinates.

#### Step 3:

This Step would be the most crucial as I am to look up my Resturants for each of my city and get back my results.
I would **Limit my search to 100** for each ZIP Code with a **Radius of 500 Meters**. A bit excessive, but the possibility
of missing venues because my radius was too small would be bothersome. I would then take my results find the
dimensions of my datasets and compare them with each other to see which was bigger.

#### Extra Step:

I decided to map them so that I could view them visually.

### Exploding DataFrame by its ZIP Code

since there are multiple zip codes for each neighborhood, I need to separate them so the the geo locator can locate each zipcode

In [10]:
new_df1 = pd.DataFrame(NYC_df.zip_codes.str.split(',').tolist(), index= NYC_df.Neighborhood).stack()
new_df1 = new_df1.reset_index([0, 'Neighborhood'])
NYC_df2 = new_df1
NYC_df2.columns = ['Neighborhood','zip_codes']
NYC_df2

Unnamed: 0,Neighborhood,zip_codes
0,Central Bronx,10453
1,Central Bronx,10457
2,Central Bronx,10460
3,Bronx Park and Fordham,10458
4,Bronx Park and Fordham,10467
...,...,...
173,South Shore,10312
174,Stapleton and St. George,10301
175,Stapleton and St. George,10304
176,Stapleton and St. George,10305


In [11]:
new_df2 = pd.DataFrame(LA_df.zip_codes.str.split(',').tolist(), index= LA_df.Neighborhood).stack()
new_df2 = new_df2.reset_index([0, 'Neighborhood'])
LA_df2 = new_df2
LA_df2.columns = ['Neighborhood','zip_codes']
LA_df2

Unnamed: 0,Neighborhood,zip_codes
0,Acton,93510
1,Agoura Hills,91301
2,Agoura Hills (PO Boxes),91376
3,Agua Dulce,91390
4,Alhambra,91801
...,...,...
975,Woodland Hills (Los Angeles),91364
976,Woodland Hills (Los Angeles),91367
977,Woodland Hills (PO Boxes) (Los Angeles),91365
978,Woodland Hills (PO Boxes) (Los Angeles),91372


In [12]:
new_df3 = pd.DataFrame(CHI_df.zip_codes.str.split(',').tolist(), index= CHI_df.Neighborhood).stack()
new_df3 = new_df3.reset_index([0, 'Neighborhood'])
new_df3.drop(new_df3.index[0], inplace=True)
new_df3.reset_index(inplace=True)
CHI_df2 = new_df3
CHI_df2.columns = ['old index' , 'Neighborhood','zip_codes']
CHI_df2.drop(['old index'], axis=1, inplace= True)
CHI_df2

Unnamed: 0,Neighborhood,zip_codes
0,Cathedral District,60611
1,Central Station,60605
2,Dearborn Park,60605
3,Gold Coast,60610
4,Gold Coast,60611
...,...,...
355,Washington Park,60609
356,Washington Park,60615
357,Washington Park,60621
358,Washington Park,60637


## Finding Geo Coordinates for each ZIP Code

In [13]:
#geolocator = Nominatim(user_agent='google_agent')
#geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)
#NYC_df2['location'] = NYC_df2['zip_codes'].apply(geocode)

#NYC_df2['point'] = NYC_df2['location'].apply(lambda loc: tuple(loc.point) if loc else None)

#print('DONE!')

In [14]:
#LA_df2['location'] = LA_df2['zip_codes'].apply(geocode)

#LA_df2['point'] = LA_df2['location'].apply(lambda loc: tuple(loc.point) if loc else None)

#print('DONE!')

In [15]:
#CHI_df2['location'] = CHI_df2['zip_codes'].apply(geocode)

#CHI_df2['point'] = CHI_df2['location'].apply(lambda loc: tuple(loc.point) if loc else None)

#print('DONE!')

Saving results so that I dont have to re run the code each time. Takes a while to find to reduce the "Too Many Requests Error"

In [16]:
#NYC_df2.to_csv('NYC_df2', index=False)
#LA_df2.to_csv('LA_df2', index=False)
#CHI_df2.to_csv('CHI_df2', index=False)

#print('DONE!')

In [17]:
NYC_df2 = pd.read_csv('NYC_df2')
LA_df2 = pd.read_csv('LA_df2')
CHI_df2 = pd.read_csv('CHI_df2')

Cleaning DataFrames Again. Because of there were coordinates and location outside of the US. I believe this happened because of the limitations of the Nominatim Function.

In [18]:
NYC_df2 = NYC_df2[NYC_df2.location.str.contains('New York')]
NYC_df2

Unnamed: 0,Neighborhood,zip_codes,location,point
0,Central Bronx,10453,"The Bronx, Bronx County, New York, 10453, Unit...","(40.85234385146179, -73.91196955537043, 0.0)"
1,Central Bronx,10457,"The Bronx, Bronx County, New York, 10457, Unit...","(40.84796417510888, -73.89770955547682, 0.0)"
3,Bronx Park and Fordham,10458,"The Bronx, Bronx County, New York, 10458, Unit...","(40.86156201215534, -73.88877720559567, 0.0)"
4,Bronx Park and Fordham,10467,"The Bronx, Bronx County, New York, 10467, Unit...","(40.87452493805371, -73.86794642958414, 0.0)"
5,Bronx Park and Fordham,10468,"The Bronx, Bronx County, New York, 10468, Unit...","(40.8710838, -73.8941052, 0.0)"
...,...,...,...,...
173,South Shore,10312,"Staten Island, Richmond County, New York, 1031...","(40.5436525, -74.16830055, 0.0)"
174,Stapleton and St. George,10301,"Staten Island, Richmond County, New York, 1030...","(40.63979875, -74.07574570091533, 0.0)"
175,Stapleton and St. George,10304,"Todt Hill, Staten Island, Richmond County, New...","(40.606031, -74.086625, 0.0)"
176,Stapleton and St. George,10305,"Todt Hill, Staten Island, Richmond County, New...","(40.609344, -74.076293, 0.0)"


In [19]:
CHI_df2 = CHI_df2[CHI_df2.location.str.contains('Chicago')]
CHI_df2

Unnamed: 0,Neighborhood,zip_codes,location,point
0,Cathedral District,60611,"Near North Side, Chicago, Cook County, Illinoi...","(41.89513283566048, -87.62281879221906, 0.0)"
1,Central Station,60605,"Near South Side, Chicago, Cook County, Illinoi...","(41.8658506, -87.6099423, 0.0)"
2,Dearborn Park,60605,"Near South Side, Chicago, Cook County, Illinoi...","(41.8658506, -87.6099423, 0.0)"
3,Gold Coast,60610,"Near North Side, Chicago, Cook County, Illinoi...","(41.8884984, -87.6292815, 0.0)"
4,Gold Coast,60611,"Near North Side, Chicago, Cook County, Illinoi...","(41.89513283566048, -87.62281879221906, 0.0)"
...,...,...,...,...
355,Washington Park,60609,"Bridgeport, Chicago, Cook County, Illinois, 60...","(41.8249226, -87.6381346, 0.0)"
356,Washington Park,60615,"Hyde Park, Chicago, Cook County, Illinois, 606...","(41.80110300957798, -87.59432360924957, 0.0)"
357,Washington Park,60621,"Englewood, Chicago, Cook County, Illinois, 606...","(41.7775939545224, -87.63663948181107, 0.0)"
358,Washington Park,60637,"Hyde Park, Chicago, Cook County, Illinois, 606...","(41.7884978, -87.5793199, 0.0)"


Due to NA/NULL values the same operation of filtering the data didnt work. So I had to drop them again. Possibly due to the limitations of Nominatim.


In [20]:
print(LA_df2.isna().sum())
print(LA_df2.isnull().sum())

Neighborhood     0
zip_codes        0
location        88
point           88
dtype: int64
Neighborhood     0
zip_codes        0
location        88
point           88
dtype: int64


In [21]:
LA_df2.dropna(inplace=True)
print(LA_df2.isna().sum())
print(LA_df2.isnull().sum())

Neighborhood    0
zip_codes       0
location        0
point           0
dtype: int64
Neighborhood    0
zip_codes       0
location        0
point           0
dtype: int64


In [22]:
LA_df2 = LA_df2[LA_df2.location.str.contains('Los Angeles')]
LA_df2


Unnamed: 0,Neighborhood,zip_codes,location,point
0,Acton,93510,"Los Angeles County, California, 93510, United ...","(34.46103834645381, -118.21138549680546, 0.0)"
1,Agoura Hills,91301,"Malibu Junction, Agoura Hills, Los Angeles Cou...","(34.15295555, -118.75932802244274, 0.0)"
5,Alhambra,91803,"Alhambra, Los Angeles County, California, 9180...","(34.07170388325506, -118.1458605938974, 0.0)"
7,Alhambra (PO Boxes),91802,"Alhambra, Los Angeles County, California, 9180...","(34.0927825, -118.1263605, 0.0)"
10,Altadena,91001,"Linda Vista, Altadena, Los Angeles County, Cal...","(34.1900323, -118.1325732, 0.0)"
...,...,...,...,...
956,Wilmington (Los Angeles),90744,"Thenard, Los Angeles, Los Angeles County, Cali...","(33.7933964, -118.2400379, 0.0)"
958,Wilshire Center (Los Angeles),90004,"Los Angeles, Los Angeles County, California, 9...","(34.071844381995426, -118.3023158073664, 0.0)"
966,Wilshire-La Brea (Los Angeles),90036,"Los Angeles, Los Angeles County, California, 9...","(34.06737818754923, -118.352266061733, 0.0)"
968,Windsor Hills (Los Angeles),90043,"Los Angeles, Los Angeles County, California, 9...","(33.992140810110676, -118.33125048227424, 0.0)"


In [23]:
NYC_df2.drop_duplicates(subset= 'zip_codes', keep= False, inplace=True)
LA_df2.drop_duplicates(subset= 'zip_codes', keep= False, inplace=True)
CHI_df2.drop_duplicates(subset= 'zip_codes', keep= False, inplace=True)

print('Duplicates Removed!')

Duplicates Removed!


Setting up for column split insteads of "point" it'll be  "Latitude" & "Longitude"

In [24]:
NYC_df2=NYC_df2.replace('\(','',regex=True).astype(str)
NYC_df2=NYC_df2.replace('\)','',regex=True).astype(str)
NYC_df2=NYC_df2.replace('\, 0.0','',regex=True).astype(str)

LA_df2=LA_df2.replace('\(','',regex=True).astype(str)
LA_df2=LA_df2.replace('\)','',regex=True).astype(str)
LA_df2=LA_df2.replace('\, 0.0','',regex=True).astype(str)

CHI_df2=CHI_df2.replace('\(','',regex=True).astype(str)
CHI_df2=CHI_df2.replace('\)','',regex=True).astype(str)
CHI_df2=CHI_df2.replace('\, 0.0','',regex=True).astype(str)

In [25]:
NYC_df2.reset_index(inplace=True)
LA_df2.reset_index(inplace=True)
CHI_df2.reset_index(inplace=True)

Made new DataFrame with Lat and Long values

In [26]:
NYC_df2.drop(['index'], axis=1, inplace= True)
LA_df2.drop(['index'], axis=1, inplace= True)
CHI_df2.drop(['index'], axis=1, inplace= True)

In [27]:
split_df1 = NYC_df2['point'].str.split(",")
data1 = split_df1.to_list()
names1 = ['Latitude', 'Longitude']
M1_df = pd.DataFrame(data1, columns=names1)

In [28]:
split_df2 = LA_df2['point'].str.split(",")
data2 = split_df2.to_list()
names2 = ['Latitude', 'Longitude']
M2_df = pd.DataFrame(data2, columns=names2)

In [29]:
split_df3 = CHI_df2['point'].str.split(",")
data3 = split_df3.to_list()
names3 = ['Latitude', 'Longitude']
M3_df = pd.DataFrame(data3, columns=names3)

In [30]:
NYC_df3 = pd.concat([NYC_df2, M1_df], axis=1, join='inner')
LA_df3 = pd.concat([LA_df2, M2_df], axis=1, join='inner')
CHI_df3 = pd.concat([CHI_df2, M3_df], axis=1, join='inner')

print('Concatenated objects')

Concatenated objects


In [31]:
NYC_df3.drop(['point'], axis=1, inplace= True)
LA_df3.drop(['point'], axis=1, inplace= True)
CHI_df3.drop(['point'], axis=1, inplace= True)

print("Columns Dropped!")

Columns Dropped!


In [32]:
convert_dict = {"zip_codes": int, "Latitude": float, "Longitude": float}
NYC_df3 = NYC_df3.astype(convert_dict)
LA_df3 = LA_df3.astype(convert_dict)
CHI_df3 = CHI_df3.astype(convert_dict)

In [33]:
print(NYC_df3.dtypes)
print('\n')
print(LA_df3.dtypes)
print('\n')
print(CHI_df3.dtypes)

Neighborhood     object
zip_codes         int64
location         object
Latitude        float64
Longitude       float64
dtype: object


Neighborhood     object
zip_codes         int64
location         object
Latitude        float64
Longitude       float64
dtype: object


Neighborhood     object
zip_codes         int64
location         object
Latitude        float64
Longitude       float64
dtype: object


In [34]:
address1 = 'New York City, NY'

geolocator1 = Nominatim(user_agent="explorer1")
location = geolocator1.geocode(address1)
latitude1 = location.latitude
longitude1 = location.longitude
print('The geograpical coordinate of New York City, NY are {}, {}.'.format(latitude1, longitude1))

The geograpical coordinate of New York City, NY are 40.7127281, -74.0060152.


In [35]:
address2 = 'Los Angeles, CA'

geolocator2 = Nominatim(user_agent="explorer2")
location = geolocator2.geocode(address2)
latitude2 = location.latitude
longitude2 = location.longitude
print('The geograpical coordinate of Los Angeles, CA are {}, {}.'.format(latitude2, longitude2))

The geograpical coordinate of Los Angeles, CA are 34.0536909, -118.2427666.


In [36]:
address3 = 'Chicago, IL'

geolocator3 = Nominatim(user_agent="explorer3")
location = geolocator3.geocode(address3)
latitude3 = location.latitude
longitude3 = location.longitude
print('The geograpical coordinate of Chicago, IL {}, {}.'.format(latitude3, longitude3))

The geograpical coordinate of Chicago, IL 41.8755616, -87.6244212.


In [37]:
map_NYC = folium.Map(location=[latitude1, longitude1], zoom_start=11)

for lat, lng, label in zip(NYC_df3['Latitude'], NYC_df3['Longitude'], NYC_df3['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_NYC)  
    
map_NYC

In [38]:
map_LA = folium.Map(location=[latitude2, longitude2], zoom_start=11)

for lat, lng, label in zip(LA_df3['Latitude'], LA_df3['Longitude'], LA_df3['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_LA)  
    
map_LA

In [39]:
map_CHI = folium.Map(location=[latitude3, longitude3], zoom_start=11)

for lat, lng, label in zip(CHI_df3['Latitude'], CHI_df3['Longitude'], CHI_df3['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_CHI)  
    
map_CHI

In [40]:
CLIENT_ID = 'CCU2NDZIW42VAGVMA0ABT3MIBGES1AII4R5MG0DGWE3F55V4' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

In [41]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [43]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION,
            latitude3, 
            longitude3, 
            radius,
            LIMIT,
            "52e81612bcbc57f1066b79ff") #HALAL Resturant ID
        
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['zip_codes', 
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [44]:
# LA_venues = getNearbyVenues(names = LA_df3['zip_codes'],
#                              latitudes = LA_df3['Latitude'],
#                              longitudes = LA_df3['Longitude'])
# 
# 

In [45]:
# LA_venues.head()
# 

In [46]:
# CHI_venues = getNearbyVenues(names = CHI_df3['zip_codes'],
#                              latitudes = CHI_df3['Latitude'],
#                              longitudes = CHI_df3['Longitude'])
# 

In [47]:
# CHI_venues.head()
# 

In [48]:
# NYC_venues = getNearbyVenues(names = NYC_df3['zip_codes'],
#                              latitudes = NYC_df3['Latitude'],
#                              longitudes = NYC_df3['Longitude'], 
#                              radius = 500)
# 

In [49]:
# NYC_venues.head()

In [50]:
# NYC_venues.to_csv('NYC_venues', index=False)
# LA_venues.to_csv('LA_venues', index=False)
# CHI_venues.to_csv('CHI_venues', index=False)

In [51]:
NYC_venues = pd.read_csv('NYC_venues')
LA_venues = pd.read_csv('LA_venues')
CHI_venues = pd.read_csv('CHI_venues')

## Analysis <a name="analysis"></a>

The Code for making my analysis on the datasets.

In [52]:
print('There are {} uniques categories.'.format(len(NYC_venues['Venue Category'].unique())))
print('There are {} uniques categories.'.format(len(LA_venues['Venue Category'].unique())))
print('There are {} uniques categories.'.format(len(CHI_venues['Venue Category'].unique())))

There are 3 uniques categories.
There are 3 uniques categories.
There are 3 uniques categories.


In [53]:
print('New YorK City', NYC_venues.shape)
print('Los Angeles', LA_venues.shape)
print('Chicago', CHI_venues.shape)

New YorK City (411, 7)
Los Angeles (447, 7)
Chicago (87, 7)


In [54]:
locations1 = NYC_venues[['Venue Latitude', 'Venue Longitude']]
locationlist1 = locations1.values.tolist()

for point in range(0, len(locationlist1)):
    folium.Marker(locationlist1[point], popup=NYC_venues['Venue'][point]).add_to(map_NYC)
    
map_NYC

In [55]:
locations2 = LA_venues[['Venue Latitude', 'Venue Longitude']]
locationlist2 = locations2.values.tolist()

for point in range(0, len(locationlist2)):
    folium.Marker(locationlist2[point], popup=LA_venues['Venue'][point]).add_to(map_LA)
    
map_LA

In [56]:
locations3 = CHI_venues[['Venue Latitude', 'Venue Longitude']]
locationlist3 = locations3.values.tolist()

for point in range(0, len(locationlist3)):
    folium.Marker(locationlist3[point], popup=CHI_venues['Venue'][point]).add_to(map_CHI)
    
map_CHI

## Results and Discussion <a name="results"></a>

#### Results:

In our Analysis, shows that Los Angeles has more Halal resturants and resturants that serve Halal food.
New York City had about 411 venues, Los Angelels had about 447 venues, and Chicago with 87 venues. It Shocked me
the most that Chicago would have such a low number for a incredibly large city compared to New York City and
Los Angeles.

#### Discussion:

Throughout the project I've thought about using clustering to map my results. However with my search being very
specific and narrow, it didnt make sense to do so. Ive tried at the end as an attempt to get something feasable, 
but the clusters in my array ended up being unusable to map. With that in mind, I decied to remove the clustering from
this project entirely.

## Conclusion <a name="conclusion"></a>

Final decsion on which city would be **best choice for Halal food is Los Angeles**. However, New York City is fairly close behind.
Chicago would be the last choice compared to the other two big cities.

The purpose of this project was to start something small and leave a pathway to expand it further. For example
if I had downloaded or Scraped Crime data from each city, the possibily of analyzing the results with the crime
data could make good analysis to open resturants in a safe Neighborhood. Another would have been finding a good
location to open a resturant up within the high traffic areas of our target audience would frequent to alot, but
also hold a competitive advantage in location and type of food served. Although the possibilties are endless, it
an awesome project to work on nonetheless.
