# Capstone Project - Week 1

## Irish Pubs in Boston

### Introduction

Boston is the capital and most populous city of the Commonwealth of Massachusetts, and one of the oldest municipalities in the United States. Those are the textbook facts, but Boston as a city is so much more. It was integral to the fight for American independence as the site of the Boston Tea Party. Today, the Boston area is home to the New England Patriots and the Boston Red Sox, and to two of the best universities in the world, Harvard and MIT. Martin Scorsese's The Departed featured Boston as a character that was arguably more important than the protagonists.

I had the pleasure of living in Boston for a few years while I completed my studies, which led me to the topic of this Capstone Project. Boston is a bustling, cosmopolitan city with a vibrant multicultural attitude. Among the many cultures that make up Boston, few are as prominent as the Irish. Across the city lie many Irish pubs that serve good beer and food to go with it. The Irish pub is clearly a popular kind of establishment, and many people enjoy what it has to offer. Whether it is college students looking for weekend celebrations, weary colleagues enjoying a post-work drink, or even teetotalers who come for the unhealthy menu, a broad demographic shapes the business of an Irish pub.

Of course, with an ongoing pandemic and in the age of social distancing, the hospitality industry has suffered significantly. On an optimistic note, I expect that when people return to their pre-Covid lifestyle, the Irish pub will regain its popularity and its place as a mainstay of the Boston hospitality industry. In this project I have tried to stay true to current times and to use recent, applicable data that takes these events into account. In order to proceed, however, I have had to make some assumptions, which I will mention in the methodology section of the project. The aim of this project is to turn a curiosity and an interest into a potential business opportunity and problem.

### Business Problem

If an entrepreneur with an interest and prior experience in the hospitality/food and beverage industry wishes to open an Irish pub in Boston, where would be the best place to situate it?

This business problem, while hypothetical, is still plausible. The audience is any individual able to turn an interest into a plan to open an Irish pub. That audience will vary in age, income, capital, education, and risk aversion. Starting an Irish pub, or any other business for that matter, requires above all an entrepreneurial mindset alongside the factors just mentioned. The first assumption I have made is that the individual looking to open an Irish pub has the means to do so.

Another assumption I have made is that transaction costs are negligible or non-existent. In other words, the land, labour, and permits needed to open an Irish pub are available, and macroeconomic factors such as stock markets, inflation, or fiscal policy do not interfere. I also assume that the owner would set prices based on market research conducted once the pub has opened.

The main consideration is that the individual would want to open the pub where it would be most profitable. Opening it in a place with many other pubs would leave it vulnerable to competition, but opening it in a secluded place, or an unsuitable one such as a school district, would mean that it goes unnoticed and is inconvenient for people to reach. The population and average income of each neighbourhood would therefore have to be used in the analysis, along with the other services, attractions, and landmarks in the area.

The analysis I will be using is k-means clustering, because segmenting and clustering the neighbourhoods should make it easier to find the best location for the pub. The clustering would use defining factors such as population, average income, and age to group similar neighbourhoods together.
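As a rough illustration of the clustering idea, here is a minimal sketch that groups five neighbourhoods by per capita income and population. The figures are copied from the preview of the geospatial data shown later in this report. The choice of two clusters, the seed rows, and the helper names `standardise` and `kmeans` are assumptions made only for this example. It is written in plain Python to keep the steps visible; the notebook itself imports `KMeans` from scikit-learn for the real analysis.

```python
import math

# (neighbourhood, per capita income in dollars, population), taken from the data preview
rows = [
    ("Financial District", 152007, 1486),
    ("Prudential Center", 151060, 1290),
    ("Fort Point", 93078, 1905),
    ("North End", 88921, 4277),
    ("Back Bay/Bay Village", 81458, 21318),
]

def standardise(column):
    """Scale a column to zero mean and unit variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column]

def kmeans(points, seed_indices, iterations=20):
    """Plain Lloyd's algorithm; the rows in seed_indices are the starting centroids."""
    centroids = [list(points[i]) for i in seed_indices]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

income = standardise([r[1] for r in rows])
population = standardise([r[2] for r in rows])
points = list(zip(income, population))

# Seeding with rows 0 and 2 is an arbitrary choice for this example.
labels = kmeans(points, seed_indices=(0, 2))
for (name, _, _), label in zip(rows, labels):
    print(name, label)
```

The features are standardised first because otherwise the one with the larger numeric range would dominate the distance calculation.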

## Data

I have decided to use multiple datasets for this project, including the Foursquare location data.
The first dataset includes the coordinates, ZIP codes, and neighbourhoods of Boston. It will serve as the basis for searching for venues with the Foursquare location data. Boston does not have boroughs and is divided simply into neighbourhoods, so searching for the right location becomes slightly easier.

I scraped the geospatial data for the coordinates and the neighbourhoods from Wikipedia and converted it into a CSV file, which I have uploaded in the data section. A dataset that I would like to add later covers demographics rather than just geography, because age and culture in the neighbourhoods could be defining factors in solving the business problem.

The geospatial data also includes per capita income and population, which I believe will be important later in the project.
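In the data preview further down, the income column holds text such as "$152,007" and the ZIP codes appear without their leading zero (2110 rather than 02110). Before these columns can be used in an analysis they need a little cleaning. The sketch below shows one way to do it; the helper names `parse_income` and `format_zip` are hypothetical and not part of the notebook.

```python
def parse_income(text):
    """Turn a string such as "$152,007" into the integer 152007."""
    return int(text.replace("$", "").replace(",", "").strip())

def format_zip(code):
    """Restore the leading zero that is lost when a ZIP code is read as a number."""
    return str(code).zfill(5)

print(parse_income("$152,007"))  # 152007
print(format_zip(2110))          # 02110
```

With pandas, helpers like these could be applied to a whole column with `Series.map`, for example on the income column of `bostondata`.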

I have provided some code for the geospatial data, along with an example of the Foursquare data for one of the neighbourhoods, to show how the project will begin.

In [10]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

import requests # library to handle HTTP requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means for the clustering stage
from sklearn.cluster import KMeans
!pip install geopy
from geopy.geocoders import Nominatim # converts an address into latitude and longitude values

print('Libraries imported.')

Libraries imported.


In [27]:
# loading the basic geographical data set for Boston from cloud object storage
# The credentials were removed before submission, so `body` (the file object read below) is not defined in this listing
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

bostondata = pd.read_csv(body)
bostondata.head()


| Unnamed: 0 | ZIP code | Neighbourhood | Per capitaincome | Population | Latitude | Longitude |
|---|---|---|---|---|---|---|
| 0 | 2110 | Financial District | $152,007 | 1486 | 42.3613 | -71.0483 |
| 1 | 2199 | Prudential Center | $151,060 | 1290 | 42.3465 | -71.0832 |
| 2 | 2210 | Fort Point | $93,078 | 1905 | 42.3517 | -71.0409 |
| 3 | 2109 | North End | $88,921 | 4277 | 42.3661 | -71.0483 |
| 4 | 2116 | Back Bay/Bay Village | $81,458 | 21318 | 42.3531 | -71.0765 |


In [24]:
!pip install folium==0.5.0
import folium # map rendering library



In [25]:
#  Creating Map of Boston using latitude and longitude values
map_boston = folium.Map(location=[42.3601, -71.0589], zoom_start=10)

# map markers
for lat, lng, neighbourhood in zip(bostondata['Latitude'], bostondata['Longitude'], bostondata['Neighbourhood']):
    label = '{}'.format(neighbourhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_boston)

map_boston

In [22]:
# Finding places within Allston
# The latitude and longitude values of Allston are 42.3593, -71.1270
# CLIENT_ID and CLIENT_SECRET hold my Foursquare API credentials, which were removed prior to submission
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    '20180605',
    42.3593,
    -71.1270,
    500,
    100)

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5fa73cd0418d9541e76f9a62'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'North Allston',
  'headerFullLocation': 'North Allston, Boston',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 7,
  'suggestedBounds': {'ne': {'lat': 42.3638000045, 'lng': -71.12092151180006},
   'sw': {'lat': 42.354799995499995, 'lng': -71.13307848819993}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '50494123e4b070328f93047a',
       'name': 'Boston Liquors',
       'location': {'address': '225 Cambridge St',
        'crossStreet': 'N. Harvard St.',
        'lat': 42.35802701394661,
        'lng': -71.12679283992392,
        'labeledLatLngs': [{'l