Capstone Project - Battle of Neighborhoods (Week 2)
Applied Data Science Capstone
Introduction to Business Problem
Opening a new Italian Restaurant in Atlanta, Georgia
The objective of this report is to determine the best location to open an Italian restaurant in Atlanta. The analysis considers the city's localities, the geographic distribution of already established Italian restaurants, and how easily the largest number of people can reach each location, with the goal of maximizing revenue from the new venture.


In [1]:
#Importing required libraries
import numpy as np
import pandas as pd

from geopy.geocoders import Nominatim
try:
    import geocoder
except ImportError:
    # Install geocoder on demand if the environment does not provide it
    !pip install geocoder
    import geocoder

import requests
from bs4 import BeautifulSoup

try:
    import folium
except ImportError:
    # Install folium on demand if the environment does not provide it
    !pip install folium
    import folium

from sklearn.cluster import KMeans

from sklearn import preprocessing

from sklearn.model_selection import train_test_split

from sklearn.neighbors import KNeighborsClassifier

from sklearn import metrics

import matplotlib as mpl
import matplotlib.pyplot as plt


Collecting geocoder
  Downloading geocoder-1.38.1-py2.py3-none-any.whl (98 kB)
Collecting ratelim
  Downloading ratelim-0.1.6-py2.py3-none-any.whl (4.0 kB)
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6
Collecting folium
  Downloading folium-0.11.0-py2.py3-none-any.whl (93 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.1-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.1 folium-0.11.0


In [2]:
# Install wordcloud
!pip install wordcloud
# Import the package and its set of stopwords
from wordcloud import WordCloud, STOPWORDS

print('Wordcloud is installed and imported!')


Collecting wordcloud
  Downloading wordcloud-1.8.1-cp37-cp37m-manylinux1_x86_64.whl (366 kB)
Installing collected packages: wordcloud
Successfully installed wordcloud-1.8.1
Wordcloud is installed and imported!


In [3]:
#Getting the location of Atlanta using the geocoder package
g = geocoder.arcgis('Atlanta, Georgia, USA')
blr_lat = g.latlng[0]
blr_lng = g.latlng[1]
print("The Latitude and Longitude of the City of Atlanta is {} and {}".format(blr_lat, blr_lng))


The Latitude and Longitude of the City of Atlanta is 33.74831000000006 and -84.39110999999997


In [4]:
#Scraping the Wikipedia page for the list of localities in the Atlanta, Georgia, USA metropolitan area
neig = requests.get("https://en.wikipedia.org/wiki/Atlanta_metropolitan_area").text

In [5]:
#parsing the scraped content
soup = BeautifulSoup(neig, 'html.parser')

In [6]:
#Creating a list to store neighborhood data
neighborhoodlist = []

In [8]:
#Collecting the link text inside the first div with class 'category' and appending each entry to the neighborhood list
for i in soup.find_all('div', class_='category')[0].find_all('a'):
    neighborhoodlist.append(i.text)

#Creating a dataframe from the list
neig_df = pd.DataFrame({"Locality": neighborhoodlist})
neig_df.head()

            Locality
0  Metropolitan area


In [9]:
#Shape of dataframe neig_df
neig_df.shape

(1, 1)
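
A shape of (1, 1) means the selector produced a single entry. The sketch below illustrates, on a small made-up HTML snippet, what "the links inside the first div with class 'category'" selects. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages; the class name `LinkTextCollector` and the sample markup are assumptions for illustration, not the structure of the live Wikipedia page.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect the text of every <a> inside the first <div> with a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.div_depth = 0    # > 0 while inside the target div (tracks nested divs)
        self.done = False     # True once the first target div has closed
        self.in_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.div_depth > 0:
                self.div_depth += 1
            elif self.target_class in classes:
                self.div_depth = 1
        elif tag == "a" and self.div_depth > 0:
            self.in_link = True
            self.links.append("")

    def handle_endtag(self, tag):
        if self.done:
            return
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
            if self.div_depth == 0:
                self.done = True
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link and not self.done:
            self.links[-1] += data


# Made-up sample markup: two divs share the class, only the first one is read
SAMPLE_HTML = """
<div class="category"><a href="/a">Midtown</a> <a href="/b">Buckhead</a></div>
<div class="category"><a href="/c">Ignored</a></div>
"""

collector = LinkTextCollector("category")
collector.feed(SAMPLE_HTML)
print(collector.links)
```

Checking `len(collector.links)` (or `neig_df.shape[0]` in the notebook) against the number of localities expected is a cheap way to notice when a selector matches less than intended.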

In [10]:
#Defining a function to get the coordinates of a single locality
def get_location(locality):
    # Returns [latitude, longitude]; the result may be empty or None if ArcGIS finds no match
    g = geocoder.arcgis('{}, Atlanta, Georgia'.format(locality))
    return g.latlng
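
Geocoding services can fail intermittently, so a lookup that comes back empty is sometimes worth retrying. The following is a minimal sketch of that idea, not part of the original notebook: `geocode_with_retry`, its parameters, and `fake_lookup` are hypothetical names, and the lookup is passed in as a function so the example runs without network access.

```python
import time
from typing import Callable, List, Optional


def geocode_with_retry(
    query: str,
    lookup: Callable[[str], Optional[List[float]]],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> Optional[List[float]]:
    """Call lookup(query) up to `attempts` times; return the first non-empty result or None."""
    for attempt in range(attempts):
        latlng = lookup(query)
        if latlng:  # treats both None and [] as "no result"
            return latlng
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None


# Stand-in for a real lookup; it fails on the first call and succeeds on the second
calls = {"n": 0}


def fake_lookup(query):
    calls["n"] += 1
    return None if calls["n"] == 1 else [33.75, -84.39]


result = geocode_with_retry("Midtown, Atlanta, Georgia", fake_lookup)
print(result, "after", calls["n"], "calls")
```

In the notebook, the `lookup` argument could be the function defined above, for example `geocode_with_retry('Midtown', get_location, delay_seconds=1.0)`.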

In [14]:
#Creating an empty list
coordinates = []
#Getting the coordinates of each locality using the function defined above
for i in neig_df["Locality"].tolist():
    coordinates.append(get_location(i))
print(coordinates)

[[33.724883586114544, -84.40789176943848]]


In [15]:
coordinates[:5]

[[33.724883586114544, -84.40789176943848]]
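
One way to sanity-check geocoded coordinates is to measure how far each point lies from the city center found earlier. The sketch below uses the haversine formula with the two coordinate pairs printed above; the function name `haversine_km` and the 100 km threshold are assumptions chosen for illustration.

```python
import math


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two (lat, lng) points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Coordinates taken from the outputs above
city_center = (33.74831, -84.39111)
locality = (33.724883586114544, -84.40789176943848)

# Hypothetical threshold: flag anything farther than 100 km from the city center
MAX_DISTANCE_KM = 100.0

distance = haversine_km(*city_center, *locality)
is_plausible = distance <= MAX_DISTANCE_KM
print("{:.1f} km from the city center, plausible: {}".format(distance, is_plausible))
```

A point that lands far outside the threshold usually indicates that the geocoder matched a different place with a similar name.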

In [16]:
#Creating a dataframe from the list of location coordinates
coordinates_df = pd.DataFrame(coordinates, columns=['Latitudes', 'Longitudes'])

In [17]:
#Adding coordinates of localities to neig_df dataframe
neig_df["Latitudes"] = coordinates_df["Latitudes"]
neig_df["Longitudes"] = coordinates_df["Longitudes"]
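
The column assignment above relies on both dataframes sharing the same row order. A possible alternative, sketched here in plain Python, is to pair each name with its coordinates in one pass and set aside any lookups that returned nothing. `build_rows` is a hypothetical helper, and "Unresolved place" is a made-up entry that simulates a failed lookup.

```python
def build_rows(names, coordinates):
    """Pair each locality name with its coordinates, skipping failed lookups."""
    rows = []
    skipped = []
    for name, latlng in zip(names, coordinates):
        if latlng:  # None or [] means the lookup returned nothing
            rows.append({"Locality": name,
                         "Latitudes": latlng[0],
                         "Longitudes": latlng[1]})
        else:
            skipped.append(name)
    return rows, skipped


# "Metropolitan area" and its coordinates come from the outputs above;
# "Unresolved place" is invented to show how a failed lookup is handled.
names = ["Metropolitan area", "Unresolved place"]
coordinates = [[33.724883586114544, -84.40789176943848], None]

rows, skipped = build_rows(names, coordinates)
print(rows)
print(skipped)
```

Passing `rows` to `pd.DataFrame(rows)` would give a dataframe with the same three columns as `neig_df`, while `skipped` lists the names that still need attention.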

In [18]:
print("The shape of neig_df is {}".format(neig_df.shape))
neig_df.head()

The shape of neig_df is (1, 3)


            Locality  Latitudes  Longitudes
0  Metropolitan area  33.724884  -84.407892


In [19]:
#Creating a map
blr_map = folium.Map(location=[blr_lat, blr_lng],zoom_start=11)

#Marker for the city center; the marker color is set through folium.Icon
folium.Marker([blr_lat, blr_lng], popup='<i>Atlanta</i>', icon=folium.Icon(color='red'), tooltip="Click to see").add_to(blr_map)

#markers for localities
for latitude,longitude,name in zip(neig_df["Latitudes"], neig_df["Longitudes"], neig_df["Locality"]):
    folium.CircleMarker(
        [latitude, longitude],
        radius=6,
        color='blue',
        popup=name,
        fill=True,
        fill_color='#3186ff'
    ).add_to(blr_map)

blr_map