
# Capstone Project - The Battle of the Neighborhoods

Applied Data Science Capstone by IBM/Coursera

<b>1.Introduction</b>

<b>1.1 Background</b>

New York City is the most populous city in the US state of New York, with over 8 million residents. It is also very diverse, as its residents come from various backgrounds and ethnicities. This diverse population and busy lifestyle create many business opportunities for New York residents, including in the coffee shop industry.


<b>1.2 Business statement</b>


<b>Opening a new coffee shop in a city such as New York</b> requires weighing many factors: financing, the right location, a catchy name, a unique environment, etc. New York consists of five boroughs: Manhattan, The Bronx, Brooklyn, Queens, and Staten Island. Among these five boroughs, Manhattan stands out, as it is often described as the cultural and financial center of the world. (1) Therefore, in this report I will focus on the coffee shop business opportunities that Manhattan has to offer.
In this report, I will assess whether a neighborhood is a promising place to open a coffee shop based on the following criteria:
<ul style="list-style-type:disc;">
<li>the number of coffee shops in each neighborhood of Manhattan</li>
<li>which neighborhoods of Manhattan have the smallest number of coffee shops</li>
<li>the potential for a new coffee shop to succeed in a particular neighborhood of Manhattan</li></ul>












<b>1.3 Target Audience</b>

This research will be helpful for businessmen and businesswomen thinking of opening a new coffee shop in Manhattan, NYC. The objective is to recommend the best neighborhood to start such a business.


<b>2.Data</b>

The research in this report draws on the following data sources:
1. A New York City dataset that covers all five boroughs and gives the latitude and longitude of each neighborhood. The dataset allows me to explore the neighborhoods in Manhattan, NYC.
2. The Foursquare location dataset, which I will use to identify the coffee shops in Manhattan neighborhoods.
3. The Foursquare API, which I will use to collect the venue information needed to find the neighborhoods with the smallest number of existing coffee shops.




<b>3.Methodology</b>

The objective of this project is to find which neighborhood of Manhattan, NYC is a good choice for opening a new coffee shop. For the data analysis I will be using a New York City dataset that lists the neighborhoods in Manhattan together with the latitude and longitude of each neighborhood. The dataset lists 40 Manhattan neighborhoods, and the goal of my report is to cluster these neighborhoods based on the number of coffee shops that each of them has.


Exploratory Data Analysis:


In [60]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


Load the New York City dataset, which covers all five boroughs of New York City and gives the latitude and longitude of each neighborhood. The dataset allows me to explore the neighborhoods in Manhattan, NYC and create a map of New York with neighborhoods using the folium Python library.

In [61]:
!wget -q -O 'newyork_data.json' https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs/newyork_data.json
print('Data downloaded!')

Data downloaded!


In [62]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [63]:
newyork_data

{'bbox': [-74.2492599487305,
  40.5033187866211,
  -73.7061614990234,
  40.9105606079102],
 'crs': {'properties': {'name': 'urn:ogc:def:crs:EPSG::4326'}, 'type': 'name'},
 'features': [{'geometry': {'coordinates': [-73.84720052054902,
     40.89470517661],
    'type': 'Point'},
   'geometry_name': 'geom',
   'id': 'nyu_2451_34572.1',
   'properties': {'annoangle': 0.0,
    'annoline1': 'Wakefield',
    'annoline2': None,
    'annoline3': None,
    'bbox': [-73.84720052054902,
     40.89470517661,
     -73.84720052054902,
     40.89470517661],
    'borough': 'Bronx',
    'name': 'Wakefield',
    'stacked': 1},
   'type': 'Feature'},
  {'geometry': {'coordinates': [-73.82993910812398, 40.87429419303012],
    'type': 'Point'},
   'geometry_name': 'geom',
   'id': 'nyu_2451_34572.2',
   'properties': {'annoangle': 0.0,
    'annoline1': 'Co-op',
    'annoline2': 'City',
    'annoline3': None,
    'bbox': [-73.82993910812398,
     40.87429419303012,
     -73.82993910812398,
     40.874294193

In [64]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [65]:
neighborhoods_data = newyork_data['features']

In [66]:
neighborhoods_data[0]

{'geometry': {'coordinates': [-73.84720052054902, 40.89470517661],
  'type': 'Point'},
 'geometry_name': 'geom',
 'id': 'nyu_2451_34572.1',
 'properties': {'annoangle': 0.0,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661],
  'borough': 'Bronx',
  'name': 'Wakefield',
  'stacked': 1},
 'type': 'Feature'}

In [68]:
# collect one row per neighborhood, then build the dataframe in one step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_lon, neighborhood_lat = data['geometry']['coordinates']

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)

In [69]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [70]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.
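
The loop above flattens the GeoJSON features one record at a time. As an alternative sketch, `pandas.json_normalize` can flatten the nested records in a single call. The two sample features below are copied from the dataset output shown earlier (trimmed to the keys used here); the helper name `features_to_frame` is hypothetical and not part of the original notebook.

```python
import pandas as pd

# Two features copied from the newyork_data.json output above,
# trimmed to the keys that are actually used.
sample_features = [
    {'geometry': {'coordinates': [-73.84720052054902, 40.89470517661],
                  'type': 'Point'},
     'properties': {'borough': 'Bronx', 'name': 'Wakefield'}},
    {'geometry': {'coordinates': [-73.82993910812398, 40.87429419303012],
                  'type': 'Point'},
     'properties': {'borough': 'Bronx', 'name': 'Co-op City'}},
]


def features_to_frame(features):
    """Flatten GeoJSON point features into a neighborhood dataframe."""
    # Nested dict keys become dotted column names, e.g. 'properties.name'.
    flat = pd.json_normalize(features)
    coords = flat['geometry.coordinates']
    return pd.DataFrame({
        'Borough': flat['properties.borough'],
        'Neighborhood': flat['properties.name'],
        # GeoJSON stores coordinates as [longitude, latitude]
        'Latitude': coords.apply(lambda c: c[1]),
        'Longitude': coords.apply(lambda c: c[0]),
    })


sample_neighborhoods = features_to_frame(sample_features)
print(sample_neighborhoods)
```

Applied to `newyork_data['features']`, the same function should yield the four columns used in the rest of this notebook.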


In [71]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


In [72]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

Slice the original New York City dataframe to create a new dataframe containing only the Manhattan neighborhoods. Visualize the Manhattan neighborhoods using the folium Python library.


In [73]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [74]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


In [75]:
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

In [76]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (redacted; never publish real credentials)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Credentials loaded.')

Credentials loaded.


In [77]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [78]:
# collect nearby venues for every Manhattan neighborhood
manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude']
                                  )

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards


Use the Foursquare API to explore the Manhattan neighborhoods and identify the venues in each of them.
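
`getNearbyVenues` indexes straight into the JSON response, so a venue without a category, or a response without any groups, would raise an exception. The following is a minimal sketch of a more defensive parsing step, assuming the response has the same shape that `getNearbyVenues` indexes into. The function `extract_venue_rows` and the sample payload are hypothetical and for illustration only; the first venue reuses a row from the table in this notebook, the second is made up to show the missing-category case.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Turn one explore-style JSON payload into a list of venue tuples.

    Missing keys produce None values (or an empty list) instead of errors.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []

    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        category = categories[0].get('name') if categories else None
        rows.append((name, lat, lng,
                     venue.get('name'),
                     location.get('lat'),
                     location.get('lng'),
                     category))
    return rows


# Hypothetical payload in the shape the original function expects.
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Bikram Yoga',
                   'location': {'lat': 40.876844, 'lng': -73.906204},
                   'categories': [{'name': 'Yoga Studio'}]}},
        # illustrative venue with no category information
        {'venue': {'name': 'Uncategorized Venue',
                   'location': {'lat': 40.8765, 'lng': -73.9100},
                   'categories': []}},
    ]}]}
}

sample_rows = extract_venue_rows('Marble Hill', 40.876551, -73.91066, sample_payload)
empty_rows = extract_venue_rows('Marble Hill', 40.876551, -73.91066, {})
print(sample_rows)
print(empty_rows)
```

Separating parsing from the HTTP request also makes it possible to check the parsing logic without calling the API.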


In [79]:
print(manhattan_venues.shape)
manhattan_venues.head()

(2999, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Bikram Yoga,40.876844,-73.906204,Yoga Studio
1,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place
2,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner
3,Marble Hill,40.876551,-73.91066,Rite Aid,40.875467,-73.908906,Pharmacy
4,Marble Hill,40.876551,-73.91066,Subway,40.874667,-73.909586,Sandwich Place


Filter the Manhattan venues by Venue Category to identify the coffee shops in each neighborhood.


In [80]:
manhattan_coffee = manhattan_venues[manhattan_venues['Venue Category'] == 'Coffee Shop']
manhattan_coffee.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
7,Marble Hill,40.876551,-73.91066,Starbucks,40.877531,-73.905582,Coffee Shop
66,Chinatown,40.715618,-73.994279,Little Canal,40.714317,-73.990361,Coffee Shop
119,Chinatown,40.715618,-73.994279,Oliver Coffee,40.712986,-73.998106,Coffee Shop
134,Washington Heights,40.851903,-73.9369,Forever Coffee Bar,40.850433,-73.936607,Coffee Shop
183,Washington Heights,40.851903,-73.9369,Starbucks,40.850961,-73.93833,Coffee Shop


Count the total number of coffee shops in Manhattan. 

In [89]:
manhattan_venues[manhattan_venues['Venue Category'] == 'Coffee Shop'].count()

Neighborhood              115
Neighborhood Latitude     115
Neighborhood Longitude    115
Venue                     115
Venue Latitude            115
Venue Longitude           115
Venue Category            115
dtype: int64

Use the data to calculate the number of coffee shops in each neighborhood of Manhattan.

In [130]:
manhattan_coffee.groupby('Neighborhood').count().fillna(0)

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Battery Park City,4,4,4,4,4,4
Carnegie Hill,5,5,5,5,5,5
Chelsea,5,5,5,5,5,5
Chinatown,2,2,2,2,2,2
Civic Center,6,6,6,6,6,6
Clinton,6,6,6,6,6,6
East Village,3,3,3,3,3,3
Financial District,9,9,9,9,9,9
Flatiron,3,3,3,3,3,3
Gramercy,3,3,3,3,3,3


Filter the dataset to display the Neighborhood and number of Coffee Shops. 

In [131]:
# one hot encoding
manhattan_count = pd.get_dummies(manhattan_coffee[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_count['Neighborhood'] = manhattan_coffee['Neighborhood'] 
# move neighborhood column to the first column
fixed_columns = [manhattan_count.columns[-1]] + list(manhattan_count.columns[:-1])
manhattan_count = manhattan_count [fixed_columns]
manhattan_count.columns = ['Neighborhood' ,'Venue Category']
manhattan_count.head()

Unnamed: 0,Neighborhood,Venue Category
7,Marble Hill,1
66,Chinatown,1
119,Chinatown,1
134,Washington Heights,1
183,Washington Heights,1


Sort the dataset in ascending order to identify which neighborhoods of Manhattan have the smallest number of coffee shops.


In [132]:
manhattan_grouped = manhattan_count.groupby('Neighborhood').sum().reset_index()
manhattan_grouped.sort_values(by=['Venue Category'], ascending=True)

Unnamed: 0,Neighborhood,Venue Category
13,Inwood,1
27,Stuyvesant Town,1
19,Marble Hill,1
30,Tudor City,1
28,Sutton Place,2
29,Tribeca,2
17,Manhattan Valley,2
15,Lincoln Square,2
25,Roosevelt Island,2
12,Hudson Yards,2
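
Grouping only the coffee shop rows leaves out neighborhoods that have no coffee shops at all, although those are the ones with the least competition. One possible way to keep them, sketched below on a toy table, is to reindex the counts against the full neighborhood list. The function name `coffee_shop_counts` and the 'Coffee Shops' column name are assumptions; the toy rows reuse venues from the tables above.

```python
import pandas as pd

# Toy subset of the venues table (rows taken from the outputs above).
toy_venues = pd.DataFrame({
    'Neighborhood': ['Marble Hill', 'Marble Hill', 'Chinatown', 'Chinatown'],
    'Venue': ['Bikram Yoga', 'Starbucks', 'Little Canal', 'Oliver Coffee'],
    'Venue Category': ['Yoga Studio', 'Coffee Shop', 'Coffee Shop', 'Coffee Shop'],
})

# Full list of neighborhoods, including one with no coffee shop rows.
all_neighborhoods = ['Marble Hill', 'Chinatown', 'Central Harlem']


def coffee_shop_counts(venues, neighborhoods, category='Coffee Shop'):
    """Count venues of one category per neighborhood, keeping zero counts."""
    is_match = venues['Venue Category'] == category
    counts = venues.loc[is_match, 'Neighborhood'].value_counts()
    # Neighborhoods absent from the filtered rows get an explicit 0.
    counts = counts.reindex(neighborhoods, fill_value=0)
    counts = counts.rename('Coffee Shops').rename_axis('Neighborhood')
    # Ascending order puts the least served neighborhoods first.
    return counts.sort_values(kind='stable').reset_index()


toy_counts = coffee_shop_counts(toy_venues, all_neighborhoods)
print(toy_counts)
```

With the real data, `manhattan_venues` and `manhattan_data['Neighborhood']` would take the place of the toy inputs.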


Cluster the neighborhoods by their number of coffee shops using the k-means clustering algorithm.


In [133]:
# set number of clusters
kclusters = 5

manhattan_grouped_clustering = manhattan_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([4, 2, 2, 0, 2, 2, 3, 1, 3, 3], dtype=int32)
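
The label numbers that k-means assigns are arbitrary, which makes the cluster tables harder to compare. A minimal sketch of one way to make them easier to read is to renumber the clusters by their center, so that label 0 holds the neighborhoods with the fewest coffee shops. The counts below are the per-neighborhood values printed in the 'Venue Category' output of this notebook (including the neighborhoods filled with 0), so the grouping may differ from the run above. `relabel_by_center` is a hypothetical helper, and `n_init=10` is set explicitly as an assumption to keep behavior stable across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Coffee shop counts per Manhattan neighborhood, copied from the
# 'Venue Category' output printed in this notebook.
counts = np.array([1, 2, 2, 1, 2, 2, 0, 0, 3, 4, 3, 2, 3, 2, 6, 4, 5, 5, 3, 3,
                   0, 2, 3, 3, 3, 2, 3, 3, 4, 9, 5, 4, 6, 3, 2, 3, 1, 1, 3, 2])


def relabel_by_center(kmeans):
    """Return labels renumbered so that 0 is the cluster with the smallest center."""
    # order[r] is the original label of the cluster with the r-th smallest center
    order = np.argsort(kmeans.cluster_centers_.ravel())
    # rank[original_label] is the position of that cluster in sorted order
    rank = np.empty_like(order)
    rank[order] = np.arange(order.size)
    return rank[kmeans.labels_]


# k-means expects a 2-D array, so the counts become a single feature column.
kmeans = KMeans(n_clusters=5, random_state=0, n_init=10).fit(counts.reshape(-1, 1))
ordered_labels = relabel_by_center(kmeans)

# Show which count values ended up in each renumbered cluster.
for label in range(5):
    print(label, sorted(set(counts[ordered_labels == label].tolist())))
```

After renumbering, a higher cluster label always corresponds to a higher coffee shop count, which makes the map legend and the per-cluster tables easier to interpret.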

In [135]:
# add clustering labels
#manhattan_grouped.drop(['Cluster Labels'], axis=1, inplace=True)
manhattan_grouped.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data

# merge manhattan_grouped with manhattan_data to add latitude/longitude for each neighborhood
manhattan_merged = manhattan_merged.join(manhattan_grouped.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,Venue Category
0,Manhattan,Marble Hill,40.876551,-73.91066,0.0,1.0
1,Manhattan,Chinatown,40.715618,-73.994279,0.0,2.0
2,Manhattan,Washington Heights,40.851903,-73.9369,0.0,2.0
3,Manhattan,Inwood,40.867684,-73.92121,0.0,1.0
4,Manhattan,Hamilton Heights,40.823604,-73.949688,0.0,2.0


In [136]:
manhattan_merged = manhattan_merged.fillna(0)
print(manhattan_merged['Cluster Labels'].astype(int))
print(manhattan_merged['Venue Category'].astype(int))



0     0
1     0
2     0
3     0
4     0
5     0
6     0
7     0
8     3
9     4
10    3
11    0
12    3
13    0
14    2
15    4
16    2
17    2
18    3
19    3
20    0
21    0
22    3
23    3
24    3
25    0
26    3
27    3
28    4
29    1
30    2
31    4
32    2
33    3
34    0
35    3
36    0
37    0
38    3
39    0
Name: Cluster Labels, dtype: int64
0     1
1     2
2     2
3     1
4     2
5     2
6     0
7     0
8     3
9     4
10    3
11    2
12    3
13    2
14    6
15    4
16    5
17    5
18    3
19    3
20    0
21    2
22    3
23    3
24    3
25    2
26    3
27    3
28    4
29    9
30    5
31    4
32    6
33    3
34    2
35    3
36    1
37    1
38    3
39    2
Name: Venue Category, dtype: int64


Visualize the clusters on a map using the folium Python library:

In [137]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
#cluster = int(cluster)
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.9).add_to(map_clusters)
map_clusters       

Examine Clusters

Now I examine each cluster and the number of coffee shops (shown in the Venue Category column) in its neighborhoods. Neighborhoods with no coffee shops were filled with 0 above and therefore appear under cluster label 0.

Cluster 1

In [138]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Venue Category
0,Marble Hill,1.0
1,Chinatown,2.0
2,Washington Heights,2.0
3,Inwood,1.0
4,Hamilton Heights,2.0
5,Manhattanville,2.0
6,Central Harlem,0.0
7,East Harlem,0.0
11,Roosevelt Island,2.0
13,Lincoln Square,2.0


Cluster 2

In [139]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Venue Category
29,Financial District,9.0


Cluster 3

In [140]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Venue Category
14,Clinton,6.0
16,Murray Hill,5.0
17,Chelsea,5.0
30,Carnegie Hill,5.0
32,Civic Center,6.0


Cluster 4

In [141]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Venue Category
8,Upper East Side,3.0
10,Lenox Hill,3.0
12,Upper West Side,3.0
18,Greenwich Village,3.0
19,East Village,3.0
22,Little Italy,3.0
23,Soho,3.0
24,West Village,3.0
26,Morningside Heights,3.0
27,Gramercy,3.0


Cluster 5

In [142]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Venue Category
9,Yorkville,4.0
15,Midtown,4.0
28,Battery Park City,4.0
31,Noho,4.0


<b>4.Results</b>



I clustered the Manhattan neighborhoods into 5 clusters using the k-means clustering algorithm. Using the Foursquare location data, I extracted each neighborhood's location and the number of coffee shops in it. Applying k-means to the number of coffee shops per neighborhood partitions the dataset into 5 clusters, each grouping neighborhoods with a similar number of coffee shops.

<b>5.Discussions</b>

Based on the analysis, the following recommendations can be made:
By observing the cluster representation on the map, I can say that cluster 1 (red) has the smallest number of coffee shops, so there is good scope for opening a new coffee shop in these neighborhoods.
According to the dataset, neighborhoods such as Central Harlem, East Harlem, and Lower East Side have 0 coffee shops among the venues returned by Foursquare (within the 500 m search radius and 100-venue limit used above). The other neighborhoods in cluster 1 have 1 or 2 coffee shops, which suggests good potential for opening a new coffee shop in one of them.
According to the results, neighborhoods with a large number of coffee shops, such as Financial District, Chelsea, or Carnegie Hill, are not recommended for opening a new coffee shop.


<b>6.Conclusion</b>

New York City is a big city whose residents come from various backgrounds. It is also a busy city that never sleeps, and opening a new coffee shop there sounds like a good idea. However, many factors have to be taken into consideration, such as location and demand. In this analysis, I focused on one borough of New York City, Manhattan, and on the locations with the smallest number of coffee shops. The analysis can be extended by looking further at the types of coffee shops, cafes, and bakeries in specific areas: for instance, whether a venue is a traditional coffee shop or one with unique brewing styles, and whether it offers tea or pastries. Based on the current results, the neighborhoods on the north and northeast sides of Manhattan appear to have strong potential for a successful coffee shop business.


<b>7.References</b>

(1) https://en.wikipedia.org/wiki/New_York_City