## Capstone Project - Explore Atlanta
### Table of contents

    1. Introduction: Business Problem
    2. Data
    3. Methodology
    4. Analysis
    5. Results
   
### Introduction: Business Problem 
In this project, we explore Atlanta to find out which kinds of restaurant are already popular in a target neighborhood. The results can give insight to people who want to open a restaurant in Atlanta. Assuming they have not yet decided what kind of restaurant to open, we can show them which kinds are already popular in an area, so that they can avoid opening the same kind.

We need to know the features of each neighborhood in order to decide which neighborhoods are suitable for opening a restaurant. Based on the features of the selected neighborhoods, we then search for the most popular kinds of restaurant in these areas.

### Data
Based on the definition of our problem, the points that will influence our decision are:

    1. features of each neighborhood
    2. what kind of restaurant has already opened in the selected neighborhood

### The following data sources will be used:

    1. neighborhood list of Atlanta, from the City of Atlanta website
    2. venues in each neighborhood, obtained using the Foursquare API

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
!pip install geopy
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values


import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0; older versions expose it as pandas.io.json.json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!pip install folium
#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library
from IPython.display import HTML
print('Libraries imported.')

Libraries imported.


In [2]:
# import beautifulsoup
import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl
print('Beautifulsoup imported')

Beautifulsoup imported


In [3]:
# ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = 'https://www.atlantaga.gov/government/departments/city-planning/office-of-zoning-development/neighborhood-planning-unit-npu/npu-by-neighborhood'
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html,'html.parser')

# retrieve the table
table = soup.tbody
table_rows = table.find_all('tr')
table_tag = table.find_all('th')
table_tags=[j.text for j in table_tag]

# collect the cell text of every table row
df = []
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    df.append(row)

# skip the first two rows and the last row of the scraped table
df_n = df[2:len(df)-1]

#convert to pd dataframe
df_n = pd.DataFrame(df_n,columns=table_tags)
df_n.shape

(245, 2)

In [4]:
# drop the row without information
df_n.drop(df_n.loc[df_n['Neighborhood']=='\n'].index,inplace=True)
df_n.reset_index(drop=True,inplace=True)
print(df_n.shape)
df_n

(243, 2)


Unnamed: 0,Neighborhood,NPU
0,Adair Park,V
1,Adams Park,R
2,Adamsville,H
3,Almond Park,G
4,Amal Heights,Y
5,Ansley Park,E
6,Arden/Habersham,C
7,Ardmore,E
8,Argonne Forest,C
9,Arlington Estates,P


In [6]:
# check if there is any unwanted characters in cells
df_n['Neighborhood'].str.contains('\n').any()

True

In [7]:
df_n[df_n['Neighborhood'].str.contains('\n')]

Unnamed: 0,Neighborhood,NPU
195,Rosalie H. Wright Community Council\r\n ...,I


In [8]:
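# assign with a single .loc call; chained indexing (df_n.loc[195]['Neighborhood'] = ...) may write to a copy
df_n.loc[195, 'Neighborhood'] = 'Rosalie H. Wright Community Council'
print(df_n.loc[195, 'Neighborhood'])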

Rosalie H. Wright Community Council
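The cell above repairs one row by its index label. A more general approach, sketched here under the assumption that the stray characters are always whitespace (`\r`, `\n`, or padding spaces), is to normalize the whole column in one pass. The sample values are illustrative and are not the scraped table.

```python
import pandas as pd

# illustrative sample; the second entry mimics a scraped cell with trailing "\r\n" and padding
names = pd.Series([
    "Adair Park",
    "Rosalie H. Wright Community Council\r\n        ",
    "\n",
])

# collapse any run of whitespace into one space, then trim both ends
cleaned = names.str.replace(r"\s+", " ", regex=True).str.strip()

# cells that held only whitespace are now empty strings and can be dropped
cleaned = cleaned[cleaned != ""].reset_index(drop=True)

print(cleaned.tolist())
```

With this approach the empty-row check and the newline check collapse into a single cleaning step, and no row index has to be hard-coded.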


In [9]:
# remove duplicated neighborhood names; keep=False drops every copy, including the first occurrence
df_n.drop_duplicates(subset='Neighborhood',keep=False, inplace= True)
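The `keep` argument decides what happens to the first occurrence of a repeated name. The following sketch compares `keep=False` with `keep="first"` on a small made-up frame; the names and NPU letters are placeholders, not values from the city table.

```python
import pandas as pd

# made-up sample in which the name "B" appears twice
df = pd.DataFrame({
    "Neighborhood": ["A", "B", "B", "C"],
    "NPU": ["V", "K", "J", "E"],
})

# keep=False discards every row whose name occurs more than once
dropped_all = df.drop_duplicates(subset="Neighborhood", keep=False)

# keep="first" retains the first occurrence of each name
kept_first = df.drop_duplicates(subset="Neighborhood", keep="first")

print(dropped_all["Neighborhood"].tolist())
print(kept_first["Neighborhood"].tolist())
```

Which variant is appropriate depends on whether a repeated name signals an unreliable entry (drop all copies) or a harmless repeat (keep one).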

In [10]:
df_n = df_n.apply(lambda x: x.str.replace('/',', '))
print(df_n.shape)
df_n

(241, 2)


Unnamed: 0,Neighborhood,NPU
0,Adair Park,V
1,Adams Park,R
2,Adamsville,H
3,Almond Park,G
4,Amal Heights,Y
5,Ansley Park,E
6,"Arden, Habersham",C
7,Ardmore,E
8,Argonne Forest,C
9,Arlington Estates,P


In [11]:
# set ID and secret
CLIENT_ID = 'REMGQS3NDAZHXQHJYUEBIT521TKTDTQOR0RM4TOPHETXBZQU' # your Foursquare ID
CLIENT_SECRET = 'UEACYCXE53L2JTWBI3MKK5OXE0EV2H2KCEVM0ZU0GZEZPJ4X' # your Foursquare Secret
VERSION = '20191223'
LIMIT = 1
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: REMGQS3NDAZHXQHJYUEBIT521TKTDTQOR0RM4TOPHETXBZQU
CLIENT_SECRET:UEACYCXE53L2JTWBI3MKK5OXE0EV2H2KCEVM0ZU0GZEZPJ4X


In [12]:
df_lon=[]
df_lat=[]
geolocator = Nominatim(user_agent="foursquare_agent")
for nei in df_n['Neighborhood']:
    address = nei + ', Atlanta, GA'
    location = geolocator.geocode(address)
    if location is None:  # the geocoder found no match for this address
        latitude='NaN'
        longitude='NaN'
    else:
        latitude = location.latitude
        longitude = location.longitude
    df_lon.append(longitude)
    df_lat.append(latitude)
    print(address, latitude, longitude)

Adair Park, Atlanta, GA 33.72468455 -84.4111459736292
Adams Park, Atlanta, GA 33.7120523 -84.4568734
Adamsville, Atlanta, GA 33.7592737 -84.505209
Almond Park, Atlanta, GA NaN NaN
Amal Heights, Atlanta, GA 33.7074991 -84.3951324
Ansley Park, Atlanta, GA 33.7945497 -84.3763154
Arden, Habersham, Atlanta, GA NaN NaN
Ardmore, Atlanta, GA 33.8062822 -84.4000278
Argonne Forest, Atlanta, GA NaN NaN
Arlington Estates, Atlanta, GA NaN NaN
Ashley Courts, Atlanta, GA NaN NaN
Ashview Heights, Atlanta, GA NaN NaN
Atkins Park, Atlanta, GA 33.776144 -84.35268
Atlanta Industrial Park, Atlanta, GA 33.1162131 -94.1663493
Atlanta University Center, Atlanta, GA 33.7515434 -84.4135965377069
Atlantic Station, Atlanta, GA 33.79262625 -84.3962243623586
Audubon Forest, Atlanta, GA NaN NaN
Audubon Forest West, Atlanta, GA NaN NaN
Baker Hills, Atlanta, GA NaN NaN
Bakers Ferry, Atlanta, GA 33.7604668 -84.5078772
Bankhead, Atlanta, GA 33.7722351 -84.4288824
Bankhead Courts, Atlanta, GA NaN NaN
Bankhead, Bolton, At

Ridgecrest Forest, Atlanta, GA NaN NaN
Ridgedale Park, Atlanta, GA NaN NaN
Ridgewood Heights, Atlanta, GA 33.8278826 -84.4457625
Riverside, Atlanta, GA 33.8123276 -84.46743
Rockdale, Atlanta, GA 33.7870502 -84.4349287
Rosedale Heights, Atlanta, GA 33.6820526 -84.3810371
Rosalie H. Wright Community Council, Atlanta, GA NaN NaN
Rue Royal, Atlanta, GA NaN NaN
Sandlewood Estates, Atlanta, GA NaN NaN
Scotts Crossing, Atlanta, GA NaN NaN
Sherwood Forest, Atlanta, GA 36.2087307 -86.5954377
South Atlanta, Atlanta, GA 33.7137749 -84.350005
South River Gardens, Atlanta, GA NaN NaN
South Tuxedo Park, Atlanta, GA NaN NaN
Southwest, Atlanta, GA 33.7490987 -84.3901849
Springlake, Atlanta, GA 33.8094729 -84.4097168
State Facility, Atlanta, GA NaN NaN
Summerhill, Atlanta, GA 33.7378846 -84.384371
Swallow Circle, Baywood, Atlanta, GA NaN NaN
Sweet Auburn, Atlanta, GA 33.7560339 -84.3797582
Sylvan Hills, Atlanta, GA 33.7092744 -84.4177054
Tampa Park, Atlanta, GA NaN NaN
The Villages at Carver, Atlanta, 

In [13]:
df_lon = pd.DataFrame(df_lon,columns=['Longitude'])
df_lat = pd.DataFrame(df_lat,columns=['Latitude'])
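# df_n still carries index gaps from drop_duplicates; reset it so the join pairs rows by position
df_n_ll = df_n.reset_index(drop=True).join(df_lat).join(df_lon)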
df_n_ll

Unnamed: 0,Neighborhood,NPU,Latitude,Longitude
0,Adair Park,V,33.7247,-84.4111
1,Adams Park,R,33.7121,-84.4569
2,Adamsville,H,33.7593,-84.5052
3,Almond Park,G,,
4,Amal Heights,Y,33.7075,-84.3951
5,Ansley Park,E,33.7945,-84.3763
6,"Arden, Habersham",C,,
7,Ardmore,E,33.8063,-84.4
8,Argonne Forest,C,,
9,Arlington Estates,P,,


In [14]:
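# drop neighborhoods that could not be geocoded (marked with the string 'NaN' or left missing)
df_n_ll.drop(df_n_ll[df_n_ll['Latitude']=='NaN'].index, inplace = True)
df_n_ll.dropna(axis=0,inplace=True)
# the original index labels are kept, so gaps in the index mark the dropped rows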
print(df_n_ll.shape)
df_n_ll

(121, 4)


Unnamed: 0,Neighborhood,NPU,Latitude,Longitude
0,Adair Park,V,33.7247,-84.4111
1,Adams Park,R,33.7121,-84.4569
2,Adamsville,H,33.7593,-84.5052
4,Amal Heights,Y,33.7075,-84.3951
5,Ansley Park,E,33.7945,-84.3763
7,Ardmore,E,33.8063,-84.4
12,Atkins Park,F,33.7761,-84.3527
13,Atlanta Industrial Park,G,33.1162,-94.1663
14,Atlanta University Center,T,33.7515,-84.4136
15,Atlantic Station,E,33.7926,-84.3962


In [15]:
# since we do not need NPU, it can be dropped
df_n_ll.drop('NPU',axis=1, inplace=True)
df_n_ll

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Adair Park,33.7247,-84.4111
1,Adams Park,33.7121,-84.4569
2,Adamsville,33.7593,-84.5052
4,Amal Heights,33.7075,-84.3951
5,Ansley Park,33.7945,-84.3763
7,Ardmore,33.8063,-84.4
12,Atkins Park,33.7761,-84.3527
13,Atlanta Industrial Park,33.1162,-94.1663
14,Atlanta University Center,33.7515,-84.4136
15,Atlantic Station,33.7926,-84.3962


In [16]:
address = 'Atlanta, GA, USA'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Atlanta are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Atlanta are 33.7490987, -84.3901849.


In [17]:
# create map of Atlanta using latitude and longitude values
map_Atlanta = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(df_n_ll['Latitude'], df_n_ll['Longitude'], df_n_ll['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Atlanta)  
    
map_Atlanta

In [18]:
LIMIT=300
def getNearbyVenues(names, latitudes, longitudes, radius=2000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
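`getNearbyVenues` reads `categories[0]` for every venue, which raises an `IndexError` if a venue arrives with an empty category list. The sketch below shows a defensive version of the flattening step only. The helper name `flatten_items` and the sample payload are hypothetical; the payload is hand-written to mimic just the fields that the function above reads, and it assumes an empty `categories` list is possible.

```python
def flatten_items(name, lat, lng, items):
    """Turn response 'items' into flat tuples, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        if not categories:
            continue  # categories[0] would raise IndexError for this venue
        location = venue.get("location", {})
        rows.append((
            name, lat, lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0]["name"],
        ))
    return rows

# hand-written payload; only the first venue is taken from the notebook output
sample_items = [
    {"venue": {"name": "Monday Night Garage",
               "location": {"lat": 33.729407, "lng": -84.418303},
               "categories": [{"name": "Brewery"}]}},
    {"venue": {"name": "Uncategorized Place",  # hypothetical venue with no category
               "location": {"lat": 33.73, "lng": -84.41},
               "categories": []}},
]

rows = flatten_items("Adair Park", 33.724685, -84.411146, sample_items)
print(rows)
```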

In [19]:
Atlanta_venues = getNearbyVenues(names=df_n_ll['Neighborhood'],
                                   latitudes=df_n_ll['Latitude'],
                                   longitudes=df_n_ll['Longitude'])

Adair Park
Adams Park
Adamsville
Amal Heights
Ansley Park
Ardmore
Atkins Park
Atlanta Industrial Park
Atlanta University Center
Atlantic Station
Bakers Ferry
Bankhead
Beecher Hills
Ben Hill
Ben Hill Forest
Ben Hill Terrace
Benteen Park
Bolton
Boulder Park
Brandon
Brentwood
Briar Glen
Brookhaven
Brookwood
Brookwood Hills
Browns Mill Park
Buckhead Forest
Buckhead Village
Cabbagetown
Campbellton Road
Candler Park
Capitol View 
Capitol View Manor
Carey Park
Carroll Heights
Cascade Heights
Castleberry Hill
Castlewood
Center Hill
Chattahoochee
Chosewood Park
Collier Heights
Collier Hills
Colonial Homes
Deerwood
Downtown
Druid Hills
East Atlanta
East Lake
Edgewood
English Park
Fairburn
Fairburn Heights
Fernleaf
Fort McPherson
Georgia Tech
Grant Park
Greenbriar
Grove Park
Hammond Park
Hills Park
Huntington
Inman Park
Joyland
Kingswood
Kirkwood
Lake Clair
Lakewood
Lakewood Heights
Lenox
Lincoln Homes
Loring Heights
Margaret Mitchell
Mays
Mechanicsville
Mellwood
Midtown
Morningside, Lenox Park
M

In [20]:
print(Atlanta_venues.shape)
Atlanta_venues.head(20)

(7549, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Adair Park,33.724685,-84.411146,Monday Night Garage,33.729407,-84.418303,Brewery
1,Adair Park,33.724685,-84.411146,Effect Fitness,33.720842,-84.40685,Gym
2,Adair Park,33.724685,-84.411146,Adair Park One,33.730525,-84.412837,Park
3,Adair Park,33.724685,-84.411146,Jamrock Jerk Center,33.721405,-84.407696,Caribbean Restaurant
4,Adair Park,33.724685,-84.411146,Boxcar Atl,33.730106,-84.418582,Gastropub
5,Adair Park,33.724685,-84.411146,Atlanta BeltLine Corridor under Lee/Murphy,33.727205,-84.417238,Trail
6,Adair Park,33.724685,-84.411146,Glitter's Fitness Club,33.736542,-84.402209,Gym / Fitness Center
7,Adair Park,33.724685,-84.411146,Krispy Kreme Doughnuts,33.737829,-84.41588,Donut Shop
8,Adair Park,33.724685,-84.411146,Tassili's Raw Reality,33.738453,-84.422394,Vegetarian / Vegan Restaurant
9,Adair Park,33.724685,-84.411146,Healthful Essence,33.737381,-84.416463,Vegetarian / Vegan Restaurant


In [21]:
Atlanta_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Adair Park,79,79,79,79,79,79
Adams Park,35,35,35,35,35,35
Adamsville,35,35,35,35,35,35
Amal Heights,41,41,41,41,41,41
Ansley Park,100,100,100,100,100,100
Ardmore,100,100,100,100,100,100
Atkins Park,100,100,100,100,100,100
Atlanta Industrial Park,21,21,21,21,21,21
Atlanta University Center,100,100,100,100,100,100
Atlantic Station,100,100,100,100,100,100


In [22]:
print('There are {} unique categories.'.format(len(Atlanta_venues['Venue Category'].unique())))

There are 320 unique categories.


### Methodology 

In this project, we detect venues within a radius of about 2 km around each neighborhood center.

In the first step, we collected the required data:

    1. neighborhood list in Atlanta
    2. venues around each neighborhood
    
In the next step, we explore these neighborhoods by the categories of their venues. We use K-means to cluster the neighborhoods and find which cluster is suitable for opening a restaurant.

In the final step, we dig deeper into the selected cluster. Each neighborhood in this cluster is analyzed and discussed to find out which kind of restaurant could be profitable to open in a specific neighborhood. We present the results on a map.
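As a rough illustration of the clustering step, the sketch below runs K-means on a tiny made-up matrix of category frequencies. The four neighborhoods, the three category columns, and the choice of two clusters are all assumptions for the toy data; they are not results from this project.

```python
import numpy as np
from sklearn.cluster import KMeans

# made-up category frequencies (rows: neighborhoods, columns: venue categories)
neighborhoods = ["A", "B", "C", "D"]
freqs = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.0, 0.2, 0.8],
])

# n_clusters=2 is an assumed value that suits the toy data
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(freqs)

# map each neighborhood to its cluster label
labels = dict(zip(neighborhoods, kmeans.labels_))
print(labels)
```

Neighborhoods with similar category profiles end up with the same label; the label numbers themselves carry no meaning.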

### Analysis 

Let's explore these neighborhoods by venue category!

In [23]:
# one hot encoding
Atlanta_onehot = pd.get_dummies(Atlanta_venues[['Venue Category']], prefix="", prefix_sep="")

# 'Neighborhood' is also one of the venue categories; drop that dummy column
# so that it does not clash with the neighborhood name column added below
Atlanta_onehot.drop(columns=['Neighborhood'], inplace=True)
# add the neighborhood name as the first column
Atlanta_onehot.insert(0, 'Neighborhood', Atlanta_venues['Neighborhood'])

Atlanta_onehot.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Basketball Stadium,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cemetery,Chinese Restaurant,Chocolate Shop,Circus,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Basketball Court,College Bookstore,College Rec Center,College Theater,College Track,Comedy Club,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Costume Shop,Cuban Restaurant,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Drive-in Theater,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Truck,Football Stadium,Fountain,Frame Store,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,General Travel,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gun Shop,Gym,Gym / Fitness Center,Gym Pool,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,High School,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Theater,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Leather Goods Store,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Motel,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Nightclub,Non-Profit,Noodle House,Office,Optical Shop,Other Great Outdoors,Other Repair Shop,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Paintball Field,Paper / Office Supplies Store,Park,Parking,Pawn Shop,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Piercing Parlor,Pizza Place,Planetarium,Playground,Plaza,Poke Place,Pool,Pub,Public Art,Racetrack,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Resort,Restaurant,Road,Rock Club,Roller Rink,Rugby Pitch,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Ski Trail,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Storage Facility,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Taxi,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train,Train Station,Travel & Transport,Tree,Tunnel,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waste Facility,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Adair Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Adair Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Adair Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Adair Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Adair Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [25]:
#And let's examine the new dataframe size.
Atlanta_onehot.shape

(7549, 320)

In [26]:
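# number of venues per category in each neighborhood
Altanta_g = Atlanta_onehot.groupby('Neighborhood').sum().reset_index()
# share of each category among the venues of a neighborhood
Atlanta_grouped = Atlanta_onehot.groupby('Neighborhood').mean().reset_index()
Altanta_g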

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Terminal,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Basketball Stadium,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cemetery,Chinese Restaurant,Chocolate Shop,Circus,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Basketball Court,College Bookstore,College Rec Center,College Theater,College Track,Comedy Club,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Costume Shop,Cuban Restaurant,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Drive-in Theater,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Eye Doctor,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Truck,Football Stadium,Fountain,Frame Store,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,General Travel,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gun Shop,Gym,Gym / Fitness Center,Gym Pool,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,High School,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Theater,Insurance Office,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Leather Goods Store,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Motel,Motorcycle Shop,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Nightclub,Non-Profit,Noodle House,Office,Optical Shop,Other Great Outdoors,Other Repair Shop,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Paintball Field,Paper / Office Supplies Store,Park,Parking,Pawn Shop,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Piercing Parlor,Pizza Place,Planetarium,Playground,Plaza,Poke Place,Pool,Pub,Public Art,Racetrack,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Resort,Restaurant,Road,Rock Club,Roller Rink,Rugby Pitch,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Ski Trail,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Storage Facility,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Taxi,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train,Train Station,Travel & Transport,Tree,Tunnel,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waste Facility,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Adair Park,0,0,0,0,0,0,1,0,0,0,0,0,3,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,2,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,2,0,2,0,0,0,0,0,0,0,0,0,0,1,1,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,4,1,0,0,0,1,0,0,0,0,0,0,2,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,3,0,0,0,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,4,0,0,3,0,1,0,0,0,0,0,0,1,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,3,0,0,0,0,0,4,0,1,0,0,0,0,0,0,1,2,0,0,0
1,Adams Park,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,5,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
2,Adamsville,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,2,0,0,0,0,0,0,0,0,3,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0
3,Amal Heights,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,2,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,3,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Ansley Park,0,0,1,0,0,0,4,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,2,0,0,0,0,0,1,1,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,1,0,0,2,0,1,0,1,0,0,0,0,4,0,0,0,2,0,0,0,0,0,2,0,1,0,2,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,5,1,0,0,1,1,0,0,0,1,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,3,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,2,0,0,1,0,0,0,2,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,4,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,1,4,1,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0
5,Ardmore,0,0,0,0,0,0,5,0,0,0,0,0,1,2,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,2,3,0,0,0,0,0,1,3,0,0,0,0,0,0,0,0,0,0,1,1,0,5,0,0,0,0,0,0,0,0,0,0,1,0,2,0,0,1,0,0,0,2,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,0,1,2,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,2,0,0,3,1,0,0,0,2,0,0,1,0,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,1,0,0,2,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,1,2,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,2,0,0,0,0,0,0,0,2,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,2,0,0
6,Atkins Park,0,0,0,1,0,0,3,0,2,0,0,0,1,0,0,0,1,0,0,0,1,0,1,1,10,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,2,1,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,1,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,3,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,2,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,3,0,0,0,0,0,0,0,0,0,1,0,2,1,0,0,0,0,0,2,0,1,0,0,0,0,3,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,3,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,2,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,2,0,0,0,0,1,0,0,6,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0
7,Atlanta Industrial Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,Atlanta University Center,0,0,0,0,0,0,1,0,0,0,0,0,5,1,0,0,0,0,0,0,3,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,1,0,0,2,0,1,0,0,0,0,4,0,0,0,0,0,0,0,0,1,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,2,0,0,1,0,0,0,2,0,0,0,0,0,2,0,1,0,0,0,1,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,2,0,2,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,1,5,0,0,0,0,0,0,0,0,0,4,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,3,0,0,3,0,0,0,1,0,0,0,0,1,0,0,0,0,0,4,1,0,0,0,0,1,2,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,0
9,Atlantic Station,0,0,0,0,0,0,5,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,2,0,3,0,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,4,2,0,0,0,1,0,0,0,0,0,0,1,0,7,0,3,0,1,0,0,0,0,2,2,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,2,0,0,0,0,2,0,0,1,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,2,0,0,1,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,3,0,0,0,0,1,0,0,0,0,0,0,0,0,0,3,0,2,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,2,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0


In [27]:
# Let's confirm the new size
Atlanta_grouped.shape

(121, 320)

In [28]:
# Return the names of the most frequent venue categories in a row, most frequent first.

def return_most_common_venues(row, num_top_venues):
    # the first entry of the row is the neighborhood name, so skip it
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]
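
A quick check of the function on a made-up row may help show what it returns. The row below (`toy_row`, with invented venue counts) is an assumption for illustration only and is not taken from `Atlanta_grouped`:

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name), then rank the counts
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# hypothetical row: one neighborhood with four venue categories
toy_row = pd.Series({
    'Neighborhood': 'Example Park',
    'Bar': 1,
    'Park': 3,
    'Pizza Place': 5,
    'Zoo': 0,
})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

The counts in the toy row are all different on purpose: when two categories tie, their relative order in the result depends on the sort and should not be read as a ranking.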

In [29]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Atlanta_grouped['Neighborhood']

for ind in np.arange(Atlanta_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Atlanta_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(20)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adair Park,Sandwich Place,Vegetarian / Vegan Restaurant,Gas Station,Seafood Restaurant,Discount Store,Trail,Pizza Place,Park,Southern / Soul Food Restaurant,Art Gallery
1,Adams Park,Gas Station,Seafood Restaurant,Chinese Restaurant,Golf Course,American Restaurant,Discount Store,Fast Food Restaurant,Fried Chicken Joint,Liquor Store,Non-Profit
2,Adamsville,Gas Station,Fried Chicken Joint,Discount Store,Fast Food Restaurant,Food,Intersection,Gym / Fitness Center,Clothing Store,Liquor Store,Baseball Field
3,Amal Heights,Gas Station,Trail,Sandwich Place,Discount Store,Seafood Restaurant,Intersection,Park,Thrift / Vintage Store,BBQ Joint,Coffee Shop
4,Ansley Park,Hotel,Southern / Soul Food Restaurant,Park,Thai Restaurant,American Restaurant,Garden,Seafood Restaurant,Music Venue,Food Truck,Café
5,Ardmore,Coffee Shop,American Restaurant,Pizza Place,Gym / Fitness Center,Park,Burrito Place,Mexican Restaurant,Brewery,New American Restaurant,Art Museum
6,Atkins Park,Bar,Trail,Coffee Shop,Pub,Dessert Shop,Italian Restaurant,American Restaurant,Burger Joint,Grocery Store,Event Space
7,Atlanta Industrial Park,Fast Food Restaurant,Gas Station,Pizza Place,Gym / Fitness Center,Bank,Coffee Shop,Seafood Restaurant,Grocery Store,Big Box Store,Mexican Restaurant
8,Atlanta University Center,Park,Art Gallery,Fast Food Restaurant,Southern / Soul Food Restaurant,Pizza Place,Coffee Shop,Seafood Restaurant,BBQ Joint,Trail,Sandwich Place
9,Atlantic Station,Hotel,American Restaurant,Pizza Place,Gym,Southern / Soul Food Restaurant,Seafood Restaurant,Ice Cream Shop,Coffee Shop,Furniture / Home Store,Monument / Landmark
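
The column names above are built from a three-item suffix list with a fallback to 'th'. That works for the top 10, but it would produce '21th' if `num_top_venues` were ever raised above 20. A minimal sketch of a more general suffix helper follows; `ordinal` is a hypothetical name and not part of the original notebook:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    # 11, 12 and 13 (and 111, 112, ...) always take 'th'
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```

For `num_top_venues = 10` this yields the same column names as the loop in the cell above.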


In [30]:
# set number of clusters
kclusters = 10

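# keep only the venue columns; passing the axis positionally to drop() is not supported in recent pandas
Atlanta_grouped_clustering = Atlanta_grouped.drop(columns='Neighborhood')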

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, n_init=1200, init="k-means++").fit(Atlanta_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([2, 0, 5, 7, 1, 1, 1, 2, 1, 1, 5, 1, 0, 6, 6, 6, 8, 2, 0, 4, 1, 2,
       1, 1, 1, 9, 1, 1, 1, 6, 1, 7, 7, 0, 5, 0, 1, 4, 6, 7, 8, 5, 1, 1,
       6, 1, 1, 1, 1, 1, 0, 5, 5, 2, 0, 1, 8, 6, 7, 2, 1, 1, 1, 7, 1, 1,
       2, 7, 7, 1, 7, 1, 1, 0, 1, 6, 1, 1, 7, 3, 3, 1, 5, 1, 7, 7, 9, 8,
       1, 1, 8, 0, 2, 2, 2, 0, 1, 2, 2, 1, 9, 2, 2, 1, 1, 8, 1, 0, 2, 1,
       0, 1, 1, 7, 7, 7, 7, 2, 7, 1, 1], dtype=int32)
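
Before looking at the clusters one by one, it can be useful to see how many neighborhoods fall into each. The sketch below copies the label array printed above and tallies it with the standard library; `labels` and `cluster_sizes` are illustrative names, not variables from the notebook:

```python
from collections import Counter

# cluster labels copied from the kmeans.labels_ output above
labels = [
    2, 0, 5, 7, 1, 1, 1, 2, 1, 1, 5, 1, 0, 6, 6, 6, 8, 2, 0, 4, 1, 2,
    1, 1, 1, 9, 1, 1, 1, 6, 1, 7, 7, 0, 5, 0, 1, 4, 6, 7, 8, 5, 1, 1,
    6, 1, 1, 1, 1, 1, 0, 5, 5, 2, 0, 1, 8, 6, 7, 2, 1, 1, 1, 7, 1, 1,
    2, 7, 7, 1, 7, 1, 1, 0, 1, 6, 1, 1, 7, 3, 3, 1, 5, 1, 7, 7, 9, 8,
    1, 1, 8, 0, 2, 2, 2, 0, 1, 2, 2, 1, 9, 2, 2, 1, 1, 8, 1, 0, 2, 1,
    0, 1, 1, 7, 7, 7, 7, 2, 7, 1, 1,
]

cluster_sizes = Counter(labels)
for cluster in sorted(cluster_sizes):
    print('Cluster {}: {} neighborhoods'.format(cluster, cluster_sizes[cluster]))
```

Inside the notebook, `np.bincount(kmeans.labels_)` would give the same counts directly. Clusters 3 and 4 contain only two neighborhoods each, which agrees with the two-row tables shown for them further down.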

In [31]:
# add clustering labels
#neighborhoods_venues_sorted.drop(columns=['Cluster Labels'],axis=1,inplace=True)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Atlanta_merged = df_n_ll

# join the cluster labels and top venues onto the Atlanta neighborhood coordinates (latitude/longitude)
Atlanta_merged = Atlanta_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')


print(Atlanta_merged['Cluster Labels'].unique())
Atlanta_merged.head()

[2 0 5 7 1 6 8 4 9 3]


Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adair Park,33.7247,-84.4111,2,Sandwich Place,Vegetarian / Vegan Restaurant,Gas Station,Seafood Restaurant,Discount Store,Trail,Pizza Place,Park,Southern / Soul Food Restaurant,Art Gallery
1,Adams Park,33.7121,-84.4569,0,Gas Station,Seafood Restaurant,Chinese Restaurant,Golf Course,American Restaurant,Discount Store,Fast Food Restaurant,Fried Chicken Joint,Liquor Store,Non-Profit
2,Adamsville,33.7593,-84.5052,5,Gas Station,Fried Chicken Joint,Discount Store,Fast Food Restaurant,Food,Intersection,Gym / Fitness Center,Clothing Store,Liquor Store,Baseball Field
4,Amal Heights,33.7075,-84.3951,7,Gas Station,Trail,Sandwich Place,Discount Store,Seafood Restaurant,Intersection,Park,Thrift / Vintage Store,BBQ Joint,Coffee Shop
5,Ansley Park,33.7945,-84.3763,1,Hotel,Southern / Soul Food Restaurant,Park,Thai Restaurant,American Restaurant,Garden,Seafood Restaurant,Music Venue,Food Truck,Café
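
`DataFrame.join` with `on='Neighborhood'` is a left join, so any neighborhood in the coordinate table that has no row in `neighborhoods_venues_sorted` would get `NaN` in 'Cluster Labels' and in every venue column. Whether `df_n_ll` contains such neighborhoods is not shown here, so the following is only a sketch on made-up data (`coords` and `labelled` are hypothetical frames) of how to detect and drop them before mapping:

```python
import pandas as pd

# hypothetical coordinate table with three neighborhoods
coords = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [33.70, 33.71, 33.72],
    'Longitude': [-84.40, -84.41, -84.42],
})

# hypothetical clustering result that covers only two of them
labelled = pd.DataFrame({
    'Cluster Labels': [2, 0],
    'Neighborhood': ['A', 'C'],
})

merged = coords.join(labelled.set_index('Neighborhood'), on='Neighborhood')

# rows that found no match carry NaN and would break int(cluster) later
unmatched = merged[merged['Cluster Labels'].isna()]
print(unmatched['Neighborhood'].tolist())

# keep matched rows only and restore an integer label column
merged_clean = merged.dropna(subset=['Cluster Labels']).copy()
merged_clean['Cluster Labels'] = merged_clean['Cluster Labels'].astype(int)
print(merged_clean)
```

In the output above, `unique()` lists only the ten integer labels, which suggests every row of `Atlanta_merged` found a match in this run.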


In [32]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Atlanta_merged['Latitude'], Atlanta_merged['Longitude'], Atlanta_merged['Neighborhood'], Atlanta_merged['Cluster Labels']):
    cluster=int(cluster)
    
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [33]:
Cluster_0 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 0, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster_0

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Adams Park,Gas Station,Seafood Restaurant,Chinese Restaurant,Golf Course,American Restaurant,Discount Store,Fast Food Restaurant,Fried Chicken Joint,Liquor Store,Non-Profit
23,Beecher Hills,Gas Station,Chinese Restaurant,Golf Course,American Restaurant,Fast Food Restaurant,Stadium,Gym / Fitness Center,Liquor Store,Seafood Restaurant,BBQ Joint
36,Boulder Park,Gas Station,Breakfast Spot,Rental Car Location,Nightclub,Gym / Fitness Center,Photography Studio,Sandwich Place,Storage Facility,Intersection,Motel
56,Carey Park,Gas Station,Convenience Store,Park,Scenic Lookout,Fast Food Restaurant,Liquor Store,American Restaurant,Discount Store,Breakfast Spot,Nightclub
61,Cascade Heights,Gas Station,American Restaurant,Discount Store,Fast Food Restaurant,Chinese Restaurant,Golf Course,Convenience Store,Southern / Soul Food Restaurant,Sandwich Place,Electronics Store
88,English Park,Breakfast Spot,Food,Gas Station,Pizza Place,Park,Seafood Restaurant,Scenic Lookout,Soccer Field,Fast Food Restaurant,Waste Facility
97,Fort McPherson,Gas Station,Art Gallery,Breakfast Spot,Vegetarian / Vegan Restaurant,Food,Cajun / Creole Restaurant,Metro Station,Nightclub,Fast Food Restaurant,Seafood Restaurant
142,Mays,Gas Station,American Restaurant,Chinese Restaurant,Discount Store,Golf Course,Fast Food Restaurant,Fried Chicken Joint,Caribbean Restaurant,Mobile Phone Shop,Park
176,Perkerson,Gas Station,Convenience Store,Thrift / Vintage Store,Trail,Park,Art Gallery,Fast Food Restaurant,Breakfast Spot,Non-Profit,Sandwich Place
182,Polar Rock,Gas Station,Fast Food Restaurant,Wings Joint,Discount Store,Pharmacy,Seafood Restaurant,Pizza Place,Sandwich Place,Road,Dry Cleaner
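
The column selection `[0] + list(range(4, ...))` keeps 'Neighborhood' and the ranked venue columns while skipping 'Latitude', 'Longitude' and 'Cluster Labels'. The same selection can be written with column names, which is easier to read and does not depend on column order. This is a sketch on made-up data; `cluster_view` and `toy_merged` are hypothetical names:

```python
import pandas as pd

def cluster_view(merged, label):
    """Rows of one cluster, without the coordinate and label columns."""
    hidden = ['Latitude', 'Longitude', 'Cluster Labels']
    return merged.loc[merged['Cluster Labels'] == label].drop(columns=hidden)

# hypothetical stand-in for Atlanta_merged with a single venue column
toy_merged = pd.DataFrame({
    'Neighborhood': ['Adair Park', 'Adams Park', 'Adamsville'],
    'Latitude': [33.7247, 33.7121, 33.7593],
    'Longitude': [-84.4111, -84.4569, -84.5052],
    'Cluster Labels': [2, 0, 5],
    '1st Most Common Venue': ['Sandwich Place', 'Gas Station', 'Gas Station'],
})

view = cluster_view(toy_merged, 0)
print(view)
```

With such a helper, each per-cluster cell would reduce to a call like `cluster_view(Atlanta_merged, 0)`.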


In [36]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster_0.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Breakfast Spot     1
Gas Station       11
dtype: int64

2nd Most Common Venue
American Restaurant     2
Art Gallery             2
Breakfast Spot          1
Chinese Restaurant      1
Convenience Store       2
Fast Food Restaurant    1
Food                    1
Seafood Restaurant      2
dtype: int64

3rd Most Common Venue
Breakfast Spot            1
Chinese Restaurant        2
Discount Store            1
Fast Food Restaurant      1
Gas Station               1
Golf Course               1
Park                      1
Rental Car Location       1
Thrift / Vintage Store    1
Trail                     1
Wings Joint               1
dtype: int64

4th Most Common Venue
American Restaurant              1
Discount Store                   3
Fast Food Restaurant             1
Golf Course                      1
Nightclub                        1
Park                             1
Pizza Place                      1
Scenic Lookout                   1
Trail                           
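
The summary cell above is repeated almost unchanged for every cluster. One way to avoid the repetition is a small function that counts venue categories for each ranked column of a cluster table. The sketch below uses made-up data; `summarize_cluster` and `toy_cluster` are hypothetical names, not part of the original notebook:

```python
import pandas as pd

def summarize_cluster(cluster_df):
    """Map each ranked venue column to the number of neighborhoods per category."""
    venue_columns = [c for c in cluster_df.columns if c.endswith('Most Common Venue')]
    return {col: cluster_df.groupby(col).size() for col in venue_columns}

# hypothetical cluster table with two ranked venue columns
toy_cluster = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    '1st Most Common Venue': ['Gas Station', 'Gas Station', 'Park'],
    '2nd Most Common Venue': ['Trail', 'Park', 'Trail'],
})

summary = summarize_cluster(toy_cluster)
for col, counts in summary.items():
    print(counts)
    print('')
```

Because the function reads the column names from the table itself, it does not need `num_top_venues` or the suffix list to be redefined in every cell.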

In [37]:
Cluster1 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 1, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster1

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Ansley Park,Hotel,Southern / Soul Food Restaurant,Park,Thai Restaurant,American Restaurant,Garden,Seafood Restaurant,Music Venue,Food Truck,Café
7,Ardmore,Coffee Shop,American Restaurant,Pizza Place,Gym / Fitness Center,Park,Burrito Place,Mexican Restaurant,Brewery,New American Restaurant,Art Museum
12,Atkins Park,Bar,Trail,Coffee Shop,Pub,Dessert Shop,Italian Restaurant,American Restaurant,Burger Joint,Grocery Store,Event Space
14,Atlanta University Center,Park,Art Gallery,Fast Food Restaurant,Southern / Soul Food Restaurant,Pizza Place,Coffee Shop,Seafood Restaurant,BBQ Joint,Trail,Sandwich Place
15,Atlantic Station,Hotel,American Restaurant,Pizza Place,Gym,Southern / Soul Food Restaurant,Seafood Restaurant,Ice Cream Shop,Coffee Shop,Furniture / Home Store,Monument / Landmark
20,Bankhead,Art Gallery,Gym,Gas Station,Mexican Restaurant,Pizza Place,Gastropub,Restaurant,Coffee Shop,Furniture / Home Store,Convenience Store
39,Brentwood,Italian Restaurant,Steakhouse,Spa,Pizza Place,New American Restaurant,American Restaurant,Hotel,Tapas Restaurant,Breakfast Spot,Coffee Shop
41,Brookhaven,Department Store,American Restaurant,Sporting Goods Shop,Cosmetics Shop,Shopping Mall,Pizza Place,Italian Restaurant,Kitchen Supply Store,Sandwich Place,Burger Joint
43,Brookwood,Pizza Place,American Restaurant,Coffee Shop,Park,Gym / Fitness Center,Mexican Restaurant,New American Restaurant,Brewery,Burrito Place,Shopping Plaza
44,Brookwood Hills,American Restaurant,Pizza Place,Park,Gym / Fitness Center,Coffee Shop,Trail,New American Restaurant,Mexican Restaurant,Music Venue,Art Museum


In [38]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster1.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
American Restaurant                3
Art Gallery                        2
BBQ Joint                          2
Bar                                4
Brewery                            1
Clothing Store                     1
Coffee Shop                        5
Department Store                   1
Furniture / Home Store             1
Gym                                1
Gym / Fitness Center               1
Hotel                              5
Italian Restaurant                 5
Mexican Restaurant                 2
Park                               6
Pizza Place                        4
Southern / Soul Food Restaurant    1
Trail                              3
dtype: int64

2nd Most Common Venue
American Restaurant                7
Aquarium                           1
Art Gallery                        2
BBQ Joint                          1
Bar                                2
Breakfast Spot                     1
Café                               1
Coffee Shop      

In [39]:
Cluster2 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 2, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster2

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adair Park,Sandwich Place,Vegetarian / Vegan Restaurant,Gas Station,Seafood Restaurant,Discount Store,Trail,Pizza Place,Park,Southern / Soul Food Restaurant,Art Gallery
13,Atlanta Industrial Park,Fast Food Restaurant,Gas Station,Pizza Place,Gym / Fitness Center,Bank,Coffee Shop,Seafood Restaurant,Grocery Store,Big Box Store,Mexican Restaurant
34,Bolton,Pizza Place,Sandwich Place,Mexican Restaurant,Train Station,Brewery,Café,Gym / Fitness Center,Juice Bar,Recording Studio,Discount Store
40,Briar Glen,Discount Store,Fast Food Restaurant,American Restaurant,Department Store,Gas Station,Wings Joint,Sandwich Place,Shoe Store,Cosmetics Shop,Spa
95,Fernleaf,Pizza Place,Sandwich Place,Brewery,Gym / Fitness Center,Café,Pharmacy,Discount Store,American Restaurant,Gas Station,Taco Place
108,Hammond Park,Gas Station,Discount Store,Wings Joint,Sandwich Place,Fried Chicken Joint,Pizza Place,Seafood Restaurant,Fast Food Restaurant,Intersection,Pharmacy
128,Lake Clair,Fast Food Restaurant,Pub,Convenience Store,Video Store,General Entertainment,BBQ Joint,Park,Multiplex,Liquor Store,Gas Station
178,Piedmont Heights,Fast Food Restaurant,Discount Store,Southern / Soul Food Restaurant,Breakfast Spot,Grocery Store,Clothing Store,Supermarket,Fried Chicken Joint,Pharmacy,Music Venue
180,Pittsburgh,Gas Station,Park,Brewery,Discount Store,Trail,Vegetarian / Vegan Restaurant,Chinese Restaurant,Pizza Place,Convenience Store,Restaurant
181,Pleasant Hill,Mexican Restaurant,Sandwich Place,Fast Food Restaurant,Pizza Place,Discount Store,Breakfast Spot,Convenience Store,Gas Station,Business Service,Indian Restaurant


In [40]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster2.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Discount Store          1
Fast Food Restaurant    3
Gas Station             2
Grocery Store           1
Gym / Fitness Center    1
Mexican Restaurant      1
Park                    1
Pizza Place             5
Sandwich Place          1
dtype: int64

2nd Most Common Venue
Brewery                          1
Convenience Store                1
Discount Store                   3
Fast Food Restaurant             1
Gas Station                      2
Gym / Fitness Center             1
Park                             1
Pub                              1
Salon / Barbershop               1
Sandwich Place                   3
Vegetarian / Vegan Restaurant    1
dtype: int64

3rd Most Common Venue
American Restaurant                1
Brewery                            2
Convenience Store                  1
Discount Store                     1
Fast Food Restaurant               2
Gas Station                        1
Italian Restaurant                 1
Juice Bar                   

In [41]:
Cluster3 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 3, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster3

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
155,Niskey Cove,Pizza Place,Pharmacy,Spa,Rugby Pitch,Park,Seafood Restaurant,Gym / Fitness Center,Chinese Restaurant,Fast Food Restaurant,Theater
156,Niskey Lake,Chinese Restaurant,Pizza Place,Spa,Pharmacy,Seafood Restaurant,Gym / Fitness Center,Rugby Pitch,Park,Wings Joint,Music Store


In [42]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster3.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Chinese Restaurant    1
Pizza Place           1
dtype: int64

2nd Most Common Venue
Pharmacy       1
Pizza Place    1
dtype: int64

3rd Most Common Venue
Spa    2
dtype: int64

4th Most Common Venue
Pharmacy       1
Rugby Pitch    1
dtype: int64

5th Most Common Venue
Park                  1
Seafood Restaurant    1
dtype: int64

6th Most Common Venue
Gym / Fitness Center    1
Seafood Restaurant      1
dtype: int64

7th Most Common Venue
Gym / Fitness Center    1
Rugby Pitch             1
dtype: int64

8th Most Common Venue
Chinese Restaurant    1
Park                  1
dtype: int64

9th Most Common Venue
Fast Food Restaurant    1
Wings Joint             1
dtype: int64

10th Most Common Venue
Music Store    1
Theater        1
dtype: int64



In [43]:
Cluster4 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 4, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster4

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
38,Brandon,Golf Course,Park,Gym / Fitness Center,Pool,General Travel,Playground,Tennis Court,Ski Trail,New American Restaurant,Pet Store
63,Castlewood,Park,Non-Profit,Pet Store,Ski Trail,Dessert Shop,American Restaurant,New American Restaurant,Soccer Stadium,Eye Doctor,Falafel Restaurant


In [44]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster4.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Golf Course    1
Park           1
dtype: int64

2nd Most Common Venue
Non-Profit    1
Park          1
dtype: int64

3rd Most Common Venue
Gym / Fitness Center    1
Pet Store               1
dtype: int64

4th Most Common Venue
Pool         1
Ski Trail    1
dtype: int64

5th Most Common Venue
Dessert Shop      1
General Travel    1
dtype: int64

6th Most Common Venue
American Restaurant    1
Playground             1
dtype: int64

7th Most Common Venue
New American Restaurant    1
Tennis Court               1
dtype: int64

8th Most Common Venue
Ski Trail         1
Soccer Stadium    1
dtype: int64

9th Most Common Venue
Eye Doctor                 1
New American Restaurant    1
dtype: int64

10th Most Common Venue
Falafel Restaurant    1
Pet Store             1
dtype: int64



In [45]:
Cluster5 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 5, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster5

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Adamsville,Gas Station,Fried Chicken Joint,Discount Store,Fast Food Restaurant,Food,Intersection,Gym / Fitness Center,Clothing Store,Liquor Store,Baseball Field
19,Bakers Ferry,Gas Station,Fried Chicken Joint,Intersection,Discount Store,Food,Liquor Store,Fast Food Restaurant,Gym / Fitness Center,Sandwich Place,Baseball Field
57,Carroll Heights,Gas Station,Discount Store,Liquor Store,Airport Terminal,Fried Chicken Joint,Chinese Restaurant,Fast Food Restaurant,Intersection,Sandwich Place,Laundromat
70,Collier Heights,Gas Station,Liquor Store,Discount Store,Fried Chicken Joint,Intersection,Airport Terminal,Fast Food Restaurant,Food,Roller Rink,Laundromat
89,Fairburn,Gas Station,Liquor Store,Fast Food Restaurant,Discount Store,Fried Chicken Joint,Airport Terminal,Intersection,Seafood Restaurant,Airport,American Restaurant
90,Fairburn Heights,Gas Station,Liquor Store,Fast Food Restaurant,Discount Store,Fried Chicken Joint,Airport Terminal,Intersection,Seafood Restaurant,Airport,American Restaurant
159,Oakcliff,Discount Store,Fried Chicken Joint,Fast Food Restaurant,Gas Station,ATM,Roller Rink,Food,Chinese Restaurant,Clothing Store,Seafood Restaurant


In [46]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster5.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Discount Store    1
Gas Station       6
dtype: int64

2nd Most Common Venue
Discount Store         1
Fried Chicken Joint    3
Liquor Store           3
dtype: int64

3rd Most Common Venue
Discount Store          2
Fast Food Restaurant    3
Intersection            1
Liquor Store            1
dtype: int64

4th Most Common Venue
Airport Terminal        1
Discount Store          3
Fast Food Restaurant    1
Fried Chicken Joint     1
Gas Station             1
dtype: int64

5th Most Common Venue
ATM                    1
Food                   2
Fried Chicken Joint    3
Intersection           1
dtype: int64

6th Most Common Venue
Airport Terminal      3
Chinese Restaurant    1
Intersection          1
Liquor Store          1
Roller Rink           1
dtype: int64

7th Most Common Venue
Fast Food Restaurant    3
Food                    1
Gym / Fitness Center    1
Intersection            2
dtype: int64

8th Most Common Venue
Chinese Restaurant      1
Clothing Store          1
F

In [47]:
Cluster6 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 6, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster6

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
24,Ben Hill,Gas Station,Discount Store,Cosmetics Shop,Park,Fast Food Restaurant,Fried Chicken Joint,Southern / Soul Food Restaurant,Bank,Tailor Shop,Bridal Shop
26,Ben Hill Forest,Gas Station,Discount Store,Cosmetics Shop,Park,Fast Food Restaurant,Fried Chicken Joint,Southern / Soul Food Restaurant,Bank,Tailor Shop,Bridal Shop
28,Ben Hill Terrace,Gas Station,Discount Store,Cosmetics Shop,Park,Fast Food Restaurant,Fried Chicken Joint,Southern / Soul Food Restaurant,Bank,Tailor Shop,Bridal Shop
51,Campbellton Road,Fast Food Restaurant,Gas Station,Bank,Park,Cosmetics Shop,Discount Store,Southern / Soul Food Restaurant,Pharmacy,Fried Chicken Joint,Accessories Store
64,Center Hill,Discount Store,Steakhouse,Pharmacy,Nightclub,Fast Food Restaurant,Gas Station,Chinese Restaurant,Clothing Store,Liquor Store,Park
76,Deerwood,Clothing Store,Discount Store,Fast Food Restaurant,Park,Gas Station,Cosmetics Shop,Department Store,BBQ Joint,Tailor Shop,Southern / Soul Food Restaurant
105,Greenbriar,Cosmetics Shop,Fast Food Restaurant,Discount Store,Clothing Store,Bank,Department Store,Seafood Restaurant,Pizza Place,Chinese Restaurant,Optical Shop
145,Mellwood,Fast Food Restaurant,Bank,Gas Station,Accessories Store,Pharmacy,Discount Store,Cosmetics Shop,Park,Fried Chicken Joint,Southern / Soul Food Restaurant


In [48]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster6.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Clothing Store          1
Cosmetics Shop          1
Discount Store          1
Fast Food Restaurant    2
Gas Station             3
dtype: int64

2nd Most Common Venue
Bank                    1
Discount Store          4
Fast Food Restaurant    1
Gas Station             1
Steakhouse              1
dtype: int64

3rd Most Common Venue
Bank                    1
Cosmetics Shop          3
Discount Store          1
Fast Food Restaurant    1
Gas Station             1
Pharmacy                1
dtype: int64

4th Most Common Venue
Accessories Store    1
Clothing Store       1
Nightclub            1
Park                 5
dtype: int64

5th Most Common Venue
Bank                    1
Cosmetics Shop          1
Fast Food Restaurant    4
Gas Station             1
Pharmacy                1
dtype: int64

6th Most Common Venue
Cosmetics Shop         1
Department Store       1
Discount Store         2
Fried Chicken Joint    3
Gas Station            1
dtype: int64

7th Most Common Venue

In [49]:
Cluster7 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 7, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster7

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Amal Heights,Gas Station,Trail,Sandwich Place,Discount Store,Seafood Restaurant,Intersection,Park,Thrift / Vintage Store,BBQ Joint,Coffee Shop
54,Capitol View,Gas Station,Seafood Restaurant,Liquor Store,Convenience Store,Park,Trail,Brewery,Sandwich Place,Chinese Restaurant,Non-Profit
55,Capitol View Manor,Gas Station,Seafood Restaurant,Liquor Store,Convenience Store,Park,Trail,Brewery,Sandwich Place,Chinese Restaurant,Non-Profit
68,Chattahoochee,Park,Hotel,Gym / Fitness Center,Yoga Studio,Rental Car Location,Intersection,Convenience Store,Restaurant,Gas Station,Scenic Lookout
107,Grove Park,Gas Station,Convenience Store,BBQ Joint,Mobile Phone Shop,Other Great Outdoors,Automotive Shop,Park,Smoke Shop,Discount Store,Pharmacy
122,Joyland,Sandwich Place,Gas Station,Seafood Restaurant,Park,Music Venue,Fish & Chips Shop,Bowling Alley,Café,Caribbean Restaurant,Non-Profit
130,Lakewood,Gas Station,Intersection,Discount Store,Restaurant,Toy / Game Store,Trail,Park,Fast Food Restaurant,Dog Run,Seafood Restaurant
131,Lakewood Heights,Park,Gas Station,Liquor Store,Bar,Paintball Field,Sandwich Place,Trail,Smoke Shop,Coffee Shop,Mexican Restaurant
135,Lincoln Homes,Park,Gas Station,Convenience Store,Recording Studio,Café,Seafood Restaurant,Scenic Lookout,Coffee Shop,Restaurant,Rental Car Location
151,Mozley Park,Gas Station,Trail,Park,Convenience Store,Discount Store,Fried Chicken Joint,Pizza Place,Wings Joint,Department Store,Tennis Court


In [51]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster7.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Convenience Store    2
Discount Store       1
Gas Station          8
Park                 3
Sandwich Place       1
Trail                2
dtype: int64

2nd Most Common Venue
Convenience Store     1
Discount Store        1
Gas Station           7
Hotel                 1
Intersection          1
Seafood Restaurant    2
Trail                 4
dtype: int64

3rd Most Common Venue
BBQ Joint               1
Convenience Store       1
Discount Store          2
Gym / Fitness Center    1
Liquor Store            3
Park                    5
Sandwich Place          1
Seafood Restaurant      1
Trail                   2
dtype: int64

4th Most Common Venue
Art Gallery           1
Bar                   1
Convenience Store     3
Discount Store        2
Mobile Phone Shop     1
Park                  3
Pizza Place           1
Recording Studio      1
Restaurant            1
Seafood Restaurant    1
Trail                 1
Yoga Studio           1
dtype: int64

5th Most Common Venue
Café  

In [52]:
Cluster8 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 8, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster8

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
29,Benteen Park,Zoo Exhibit,Mexican Restaurant,Pizza Place,Discount Store,Fast Food Restaurant,Trail,Fried Chicken Joint,Southern / Soul Food Restaurant,Grocery Store,Video Store
69,Chosewood Park,Zoo Exhibit,Coffee Shop,Mexican Restaurant,Trail,Music Venue,Theme Park,Sandwich Place,Chinese Restaurant,Paintball Field,Fast Food Restaurant
102,Grant Park,Zoo Exhibit,Coffee Shop,Trail,Park,Mexican Restaurant,Pizza Place,BBQ Joint,Breakfast Spot,Yoga Studio,Taco Place
166,Ormewood Park,Zoo Exhibit,Coffee Shop,Bar,Trail,Pizza Place,Grocery Store,Park,Restaurant,Gym / Fitness Center,Discount Store
175,Peoplestown,Zoo Exhibit,Historic Site,Coffee Shop,Gas Station,Theme Park,BBQ Joint,Music Venue,Burger Joint,Chinese Restaurant,Sandwich Place
206,Summerhill,Zoo Exhibit,Coffee Shop,Mexican Restaurant,BBQ Joint,Burger Joint,Seafood Restaurant,Fast Food Restaurant,Music Venue,Breakfast Spot,Bar


In [53]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster8.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Zoo Exhibit    6
dtype: int64

2nd Most Common Venue
Coffee Shop           4
Historic Site         1
Mexican Restaurant    1
dtype: int64

3rd Most Common Venue
Bar                   1
Coffee Shop           1
Mexican Restaurant    2
Pizza Place           1
Trail                 1
dtype: int64

4th Most Common Venue
BBQ Joint         1
Discount Store    1
Gas Station       1
Park              1
Trail             2
dtype: int64

5th Most Common Venue
Burger Joint            1
Fast Food Restaurant    1
Mexican Restaurant      1
Music Venue             1
Pizza Place             1
Theme Park              1
dtype: int64

6th Most Common Venue
BBQ Joint             1
Grocery Store         1
Pizza Place           1
Seafood Restaurant    1
Theme Park            1
Trail                 1
dtype: int64

7th Most Common Venue
BBQ Joint               1
Fast Food Restaurant    1
Fried Chicken Joint     1
Music Venue             1
Park                    1
Sandwich Place         

In [54]:
Cluster9 = Atlanta_merged.loc[Atlanta_merged['Cluster Labels'] == 9, Atlanta_merged.columns[[0] + list(range(4, Atlanta_merged.shape[1]))]]
Cluster9

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,Browns Mill Park,Intersection,Fast Food Restaurant,Gas Station,Fried Chicken Joint,Liquor Store,Sandwich Place,Grocery Store,Trail,Park,Dog Run
165,Orchard Knob,Intersection,Hotel,Fast Food Restaurant,Golf Course,Liquor Store,Shop & Service,Grocery Store,Park,Trail,Sandwich Place
194,Rosedale Heights,Intersection,Fast Food Restaurant,Chinese Restaurant,Storage Facility,Grocery Store,Sandwich Place,Gas Station,Shop & Service,Fried Chicken Joint,Construction & Landscaping


In [55]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

    print(Cluster9.groupby(columns[ind+1]).size())
    print('')

1st Most Common Venue
Intersection    3
dtype: int64

2nd Most Common Venue
Fast Food Restaurant    2
Hotel                   1
dtype: int64

3rd Most Common Venue
Chinese Restaurant      1
Fast Food Restaurant    1
Gas Station             1
dtype: int64

4th Most Common Venue
Fried Chicken Joint    1
Golf Course            1
Storage Facility       1
dtype: int64

5th Most Common Venue
Grocery Store    1
Liquor Store     2
dtype: int64

6th Most Common Venue
Sandwich Place    2
Shop & Service    1
dtype: int64

7th Most Common Venue
Gas Station      1
Grocery Store    2
dtype: int64

8th Most Common Venue
Park              1
Shop & Service    1
Trail             1
dtype: int64

9th Most Common Venue
Fried Chicken Joint    1
Park                   1
Trail                  1
dtype: int64

10th Most Common Venue
Construction & Landscaping    1
Dog Run                       1
Sandwich Place                1
dtype: int64
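
The per-cluster summary loop above is repeated verbatim for each cluster. As a minimal sketch, it could be wrapped in a helper so that only the cluster frame changes between calls. The names `ordinal` and `summarize_cluster` are hypothetical and do not appear in the notebook, and the toy frame copies only the first two venue columns of the Cluster 9 table.

```python
import pandas as pd

def ordinal(n):
    """Return n with an English ordinal suffix; adequate for the ranks 1-10 used here."""
    suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n, 'th')
    return '{}{}'.format(n, suffix)

def summarize_cluster(cluster_df, num_top_venues=10):
    """Map each 'Nth Most Common Venue' column to its category counts within one cluster."""
    summary = {}
    for rank in range(1, num_top_venues + 1):
        col = '{} Most Common Venue'.format(ordinal(rank))
        if col in cluster_df.columns:  # skip ranks the frame does not carry
            summary[col] = cluster_df.groupby(col).size()
    return summary

# Toy frame: first two venue columns of the Cluster 9 rows shown above.
demo = pd.DataFrame({
    'Neighborhood': ['Browns Mill Park', 'Orchard Knob', 'Rosedale Heights'],
    '1st Most Common Venue': ['Intersection', 'Intersection', 'Intersection'],
    '2nd Most Common Venue': ['Fast Food Restaurant', 'Hotel', 'Fast Food Restaurant'],
})

summary = summarize_cluster(demo)
for col, counts in summary.items():
    print(counts)
    print('')
```

With such a helper, each cluster's summary would reduce to a single call such as `summarize_cluster(Cluster9)`.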



## Based on the results above, we think Cluster 2 is well suited to opening a restaurant. It already has many restaurants, which suggests these neighborhoods are known as dining destinations, so people are likely to consider this area first when they want food and drinks.

Therefore, let's explore Cluster 2 more.
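
One way to put a number on that observation is to measure, for each cluster, what fraction of the top-venue slots is occupied by a restaurant category. The following is a minimal sketch rather than the notebook's method: `restaurant_share` is a hypothetical helper, the toy rows copy only the first three venue columns of a few neighborhoods from the tables in this notebook, and matching on the word "Restaurant" ignores food categories such as "Pizza Place".

```python
import pandas as pd

def restaurant_share(merged, label_col='Cluster Labels'):
    """Average fraction of 'Most Common Venue' slots per cluster that name a restaurant."""
    venue_cols = [c for c in merged.columns if c.endswith('Most Common Venue')]
    is_restaurant = merged[venue_cols].apply(
        lambda col: col.str.contains('Restaurant', na=False))
    per_neighborhood = is_restaurant.mean(axis=1)  # share of slots in each row
    return (per_neighborhood.groupby(merged[label_col])
            .mean()
            .sort_values(ascending=False))

# Toy rows taken from the cluster tables in this notebook (first three ranks only).
toy = pd.DataFrame({
    'Neighborhood': ['Adair Park', 'Briar Glen', 'Grant Park',
                     'Browns Mill Park', 'Orchard Knob'],
    'Cluster Labels': [2, 2, 8, 9, 9],
    '1st Most Common Venue': ['Sandwich Place', 'Discount Store', 'Zoo Exhibit',
                              'Intersection', 'Intersection'],
    '2nd Most Common Venue': ['Vegetarian / Vegan Restaurant', 'Fast Food Restaurant',
                              'Coffee Shop', 'Fast Food Restaurant', 'Hotel'],
    '3rd Most Common Venue': ['Gas Station', 'American Restaurant', 'Trail',
                              'Gas Station', 'Fast Food Restaurant'],
})

share = restaurant_share(toy)
print(share)
```

Applied to the full `Atlanta_merged` frame, the same call would rank all clusters by restaurant density, which makes the choice of cluster easier to justify.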

In [56]:
Cluster2_merged = Atlanta_merged.loc[Atlanta_merged['Cluster Labels']==2].reset_index()
print(Cluster2_merged.shape)
Cluster2_merged

(16, 15)


Unnamed: 0,index,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,Adair Park,33.7247,-84.4111,2,Sandwich Place,Vegetarian / Vegan Restaurant,Gas Station,Seafood Restaurant,Discount Store,Trail,Pizza Place,Park,Southern / Soul Food Restaurant,Art Gallery
1,13,Atlanta Industrial Park,33.1162,-94.1663,2,Fast Food Restaurant,Gas Station,Pizza Place,Gym / Fitness Center,Bank,Coffee Shop,Seafood Restaurant,Grocery Store,Big Box Store,Mexican Restaurant
2,34,Bolton,33.8143,-84.4533,2,Pizza Place,Sandwich Place,Mexican Restaurant,Train Station,Brewery,Café,Gym / Fitness Center,Juice Bar,Recording Studio,Discount Store
3,40,Briar Glen,33.6967,-84.2765,2,Discount Store,Fast Food Restaurant,American Restaurant,Department Store,Gas Station,Wings Joint,Sandwich Place,Shoe Store,Cosmetics Shop,Spa
4,95,Fernleaf,33.8206,-84.4417,2,Pizza Place,Sandwich Place,Brewery,Gym / Fitness Center,Café,Pharmacy,Discount Store,American Restaurant,Gas Station,Taco Place
5,108,Hammond Park,33.6751,-84.4013,2,Gas Station,Discount Store,Wings Joint,Sandwich Place,Fried Chicken Joint,Pizza Place,Seafood Restaurant,Fast Food Restaurant,Intersection,Pharmacy
6,128,Lake Clair,38.5049,-90.0036,2,Fast Food Restaurant,Pub,Convenience Store,Video Store,General Entertainment,BBQ Joint,Park,Multiplex,Liquor Store,Gas Station
7,178,Piedmont Heights,36.0412,-79.8072,2,Fast Food Restaurant,Discount Store,Southern / Soul Food Restaurant,Breakfast Spot,Grocery Store,Clothing Store,Supermarket,Fried Chicken Joint,Pharmacy,Music Venue
8,180,Pittsburgh,33.7285,-84.4044,2,Gas Station,Park,Brewery,Discount Store,Trail,Vegetarian / Vegan Restaurant,Chinese Restaurant,Pizza Place,Convenience Store,Restaurant
9,181,Pleasant Hill,33.9198,-84.1705,2,Mexican Restaurant,Sandwich Place,Fast Food Restaurant,Pizza Place,Discount Store,Breakfast Spot,Convenience Store,Gas Station,Business Service,Indian Restaurant
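
Several rows in this table carry coordinates far from the rest (for example longitude -94.1663 or latitude 38.5049), which suggests the geocoder matched a place with the same name outside Atlanta. A minimal sketch of a sanity check is given here. The reference point is the Midtown coordinate printed further down in this notebook, while the helper name `flag_suspect_coordinates` and the 0.5 degree tolerance are assumptions chosen for illustration.

```python
import pandas as pd

# Midtown, Atlanta, as geocoded later in this notebook.
REF_LAT, REF_LON = 33.7811275, -84.38636
# Hypothetical tolerance: 0.5 degrees is roughly 45 to 55 km at this latitude.
MAX_DEG = 0.5

def flag_suspect_coordinates(df, max_deg=MAX_DEG):
    """Return the rows whose latitude or longitude is more than max_deg from the reference."""
    off_lat = (df['Latitude'] - REF_LAT).abs() > max_deg
    off_lon = (df['Longitude'] - REF_LON).abs() > max_deg
    return df[off_lat | off_lon]

# Five rows copied from the Cluster 2 table above.
rows = pd.DataFrame({
    'Neighborhood': ['Adair Park', 'Atlanta Industrial Park', 'Bolton',
                     'Lake Clair', 'Piedmont Heights'],
    'Latitude': [33.7247, 33.1162, 33.8143, 38.5049, 36.0412],
    'Longitude': [-84.4111, -94.1663, -84.4533, -90.0036, -79.8072],
})

suspect = flag_suspect_coordinates(rows)
print(suspect['Neighborhood'].tolist())
```

Rows flagged this way could be geocoded again with a more specific query or dropped before clustering, since the venues fetched for them may not be in Atlanta.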


In [57]:
# create map
map_cluster2 = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
kclusters=kclusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Cluster2_merged['Latitude'], Cluster2_merged['Longitude'], Cluster2_merged['Neighborhood'], Cluster2_merged['Cluster Labels']):
    cluster=int(cluster)
    
    label = folium.Popup(str(poi), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_cluster2)
       
map_cluster2

In [58]:
# get the location of Gatech
address = 'Georgia Institute of Technology, Atlanta, GA, USA'
location = geolocator.geocode(address)
gt_latitude = location.latitude
gt_longitude = location.longitude
print('The geographical coordinates of Gatech are {}, {}.'.format(gt_latitude, gt_longitude))

The geographical coordinates of Gatech are 33.776033, -84.3988408600158.


In [59]:
# get the location of Atlantic station
address = 'Atlantic station, Atlanta, GA, USA'
location = geolocator.geocode(address)
as_latitude = location.latitude
as_longitude = location.longitude
print('The geographical coordinates of Atlantic Station are {}, {}.'.format(as_latitude, as_longitude))

The geographical coordinates of Atlantic Station are 33.79262625, -84.3962243623586.


In [60]:
# get the location of Midtown
address = 'Midtown, Atlanta, GA, USA'
location = geolocator.geocode(address)
mt_latitude = location.latitude
mt_longitude = location.longitude
print('The geographical coordinates of Midtown are {}, {}.'.format(mt_latitude, mt_longitude))

The geographical coordinates of Midtown are 33.7811275, -84.38636.


In [61]:
# return the string 'True' only when a point lies within 3 km of Georgia Tech, Atlantic Station, and Midtown
import geopy.distance

def distance(lat,lon):
    coords = [lat,lon]
    coords_gt = [gt_latitude, gt_longitude]
    coords_as = [as_latitude, as_longitude]
    coords_mt = [mt_latitude, mt_longitude]
    
    dis_gt = geopy.distance.geodesic(coords,coords_gt).km
    dis_as = geopy.distance.geodesic(coords,coords_as).km
    dis_mt = geopy.distance.geodesic(coords,coords_mt).km
    
    if dis_gt < 3 and dis_as < 3 and dis_mt < 3:
        return 'True'
    else:
        return 'False'
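
The same criterion can be written so that it returns a real boolean and keeps the landmarks in one place. The sketch below is an illustration under stated assumptions, not the notebook's implementation: it swaps `geopy.distance.geodesic` for a plain haversine (great-circle) formula so that it runs with the standard library alone, and the names `LANDMARKS`, `haversine_km` and `within_all` are hypothetical. The landmark coordinates are the ones printed by the cells above.

```python
import math

# Landmark coordinates as printed by the geocoding cells above.
LANDMARKS = {
    'Georgia Tech': (33.776033, -84.3988408600158),
    'Atlantic Station': (33.79262625, -84.3962243623586),
    'Midtown': (33.7811275, -84.38636),
}

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def within_all(lat, lon, max_km=3.0):
    """True when (lat, lon) is closer than max_km to every landmark."""
    return all(haversine_km((lat, lon), point) < max_km for point in LANDMARKS.values())

# Midtown itself satisfies the criterion; Adair Park (33.7247, -84.4111) does not.
print(within_all(33.7811275, -84.38636))
print(within_all(33.7247, -84.4111))
```

With a boolean flag, the later filter can be written as `df[df['Flag']]` instead of comparing against the string `'True'`. Over a few kilometers the haversine result differs only slightly from the geodesic distance, but neighborhoods close to the 3 km limit could fall on the other side of the threshold.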

In [62]:
df_criteria = []
for lat,lon in zip(Cluster2_merged['Latitude'], Cluster2_merged['Longitude']):
    df_criteria.append(distance(lat,lon))

df_criteria = pd.DataFrame(df_criteria,columns=['Flag'])
df_criteria.head()

Unnamed: 0,Flag
0,False
1,False
2,False
3,False
4,False


In [63]:
Cluster2_filter = Cluster2_merged.copy()  # copy so the Flag column is not also added to Cluster2_merged
Cluster2_filter['Flag'] = df_criteria['Flag']
Cluster2_filter.head()

Unnamed: 0,index,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Flag
0,0,Adair Park,33.7247,-84.4111,2,Sandwich Place,Vegetarian / Vegan Restaurant,Gas Station,Seafood Restaurant,Discount Store,Trail,Pizza Place,Park,Southern / Soul Food Restaurant,Art Gallery,False
1,13,Atlanta Industrial Park,33.1162,-94.1663,2,Fast Food Restaurant,Gas Station,Pizza Place,Gym / Fitness Center,Bank,Coffee Shop,Seafood Restaurant,Grocery Store,Big Box Store,Mexican Restaurant,False
2,34,Bolton,33.8143,-84.4533,2,Pizza Place,Sandwich Place,Mexican Restaurant,Train Station,Brewery,Café,Gym / Fitness Center,Juice Bar,Recording Studio,Discount Store,False
3,40,Briar Glen,33.6967,-84.2765,2,Discount Store,Fast Food Restaurant,American Restaurant,Department Store,Gas Station,Wings Joint,Sandwich Place,Shoe Store,Cosmetics Shop,Spa,False
4,95,Fernleaf,33.8206,-84.4417,2,Pizza Place,Sandwich Place,Brewery,Gym / Fitness Center,Café,Pharmacy,Discount Store,American Restaurant,Gas Station,Taco Place,False


In [64]:
Cluster2_filter = Cluster2_filter[Cluster2_filter['Flag']=='True']

Cluster2_filter

Unnamed: 0,index,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Flag


In [65]:
# create map
map_cluster2_filter = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
kclusters=kclusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Cluster2_filter['Latitude'], Cluster2_filter['Longitude'], Cluster2_filter['Neighborhood'], Cluster2_filter['Cluster Labels']):
    cluster=int(cluster)
    
    label = folium.Popup(str(poi), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_cluster2_filter)
       
map_cluster2_filter

In [66]:
Ansley_venues = Atlanta_venues[Atlanta_venues['Neighborhood']=='Ansley Park'].reset_index()
Ansley_venues

Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,190,Ansley Park,33.79455,-84.376315,Atlanta Botanical Garden,33.790112,-84.373023,Botanical Garden
1,191,Ansley Park,33.79455,-84.376315,The Cook's Warehouse,33.798724,-84.371601,Food & Drink Shop
2,192,Ansley Park,33.79455,-84.376315,Fuqua Orchid Center,33.78886,-84.37538,Garden
3,193,Ansley Park,33.79455,-84.376315,Cascades Garden,33.791002,-84.373586,Garden
4,194,Ansley Park,33.79455,-84.376315,Jiao,33.795864,-84.368896,Massage Studio
5,195,Ansley Park,33.79455,-84.376315,Varuni-Napoli,33.796621,-84.368724,Pizza Place
6,196,Ansley Park,33.79455,-84.376315,Piedmont Driving Club,33.787868,-84.377264,Golf Course
7,197,Ansley Park,33.79455,-84.376315,Ansley Golf Club,33.800246,-84.375906,Golf Course
8,198,Ansley Park,33.79455,-84.376315,Piedmont Park Legacy Fountain,33.791064,-84.371519,Fountain
9,199,Ansley Park,33.79455,-84.376315,Skyline Garden,33.788347,-84.375116,Garden


In [67]:
Ansley_restaurant = Ansley_venues[Ansley_venues['Venue Category'].str.contains('Restaurant')]
Ansley_restaurant

Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
16,206,Ansley Park,33.79455,-84.376315,Atmosphere,33.798765,-84.368272,French Restaurant
18,208,Ansley Park,33.79455,-84.376315,Longleaf,33.790094,-84.373721,Restaurant
28,218,Ansley Park,33.79455,-84.376315,Bantam & Biddy,33.798146,-84.370878,Southern / Soul Food Restaurant
32,222,Ansley Park,33.79455,-84.376315,South City Kitchen,33.785973,-84.384329,Southern / Soul Food Restaurant
35,225,Ansley Park,33.79455,-84.376315,Bangkok Thai Restaurant,33.795677,-84.370838,Thai Restaurant
38,228,Ansley Park,33.79455,-84.376315,The Nook on Piedmont Park,33.786036,-84.378383,American Restaurant
41,231,Ansley Park,33.79455,-84.376315,Joy Cafe,33.784921,-84.382922,Southern / Soul Food Restaurant
45,235,Ansley Park,33.79455,-84.376315,Lure,33.784972,-84.384408,Seafood Restaurant
49,239,Ansley Park,33.79455,-84.376315,Nan Thai Fine Dining,33.791736,-84.38941,Thai Restaurant
51,241,Ansley Park,33.79455,-84.376315,HOBNOB,33.796767,-84.368932,American Restaurant


In [68]:
AP_cate = Ansley_restaurant['Venue Category'].unique()
print(AP_cate)
len(AP_cate)

['French Restaurant' 'Restaurant' 'Southern / Soul Food Restaurant'
 'Thai Restaurant' 'American Restaurant' 'Seafood Restaurant'
 'New American Restaurant' 'Indian Restaurant' 'Tapas Restaurant'
 'Mediterranean Restaurant' 'Italian Restaurant' 'Tex-Mex Restaurant']


12

In [69]:
# create map
map_ap_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
kclusters=len(AP_cate)
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, cate in zip(Ansley_restaurant['Venue Latitude'], Ansley_restaurant['Venue Longitude'], Ansley_restaurant['Venue Category']):
    colorcode=np.where(AP_cate == cate)[0][0]
   
    label = folium.Popup(str(cate), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        color=rainbow[colorcode],
        fill=True,
        fill_color=rainbow[colorcode],
        fill_opacity=0.8).add_to(map_ap_restaurant)
       
map_ap_restaurant
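
In the map cell above, `ys` is used only for its length, and `np.where(AP_cate == cate)` repeats the color lookup for every marker. A minimal sketch of an alternative builds the category-to-color lookup once as a dictionary. It is a stand-in rather than a drop-in replacement: it uses the standard-library `colorsys` module in place of matplotlib's `cm.rainbow`, so the colors will differ, and `category_colors` is a hypothetical name.

```python
import colorsys

def category_colors(categories):
    """Map each distinct category to a hex color spaced evenly around the hue wheel."""
    unique = list(dict.fromkeys(categories))  # keep first-seen order, drop repeats
    lookup = {}
    for i, cate in enumerate(unique):
        r, g, b = colorsys.hsv_to_rgb(i / len(unique), 0.85, 0.9)
        lookup[cate] = '#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255))
    return lookup

# Categories taken from the Ansley Park restaurant table above.
palette = category_colors([
    'French Restaurant', 'Restaurant', 'Southern / Soul Food Restaurant',
    'Southern / Soul Food Restaurant', 'Thai Restaurant',
])
print(palette)
```

Inside the marker loop, `palette[cate]` would then take the place of `rainbow[colorcode]` for both `color` and `fill_color`.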

In [70]:
print(Ansley_restaurant['Venue Category'].value_counts())

Southern / Soul Food Restaurant    4
Thai Restaurant                    4
American Restaurant                4
Seafood Restaurant                 3
New American Restaurant            2
French Restaurant                  1
Mediterranean Restaurant           1
Tapas Restaurant                   1
Indian Restaurant                  1
Tex-Mex Restaurant                 1
Restaurant                         1
Italian Restaurant                 1
Name: Venue Category, dtype: int64
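
The same three steps (select one neighborhood, keep the categories containing "Restaurant", count them) recur for every neighborhood examined in the following cells. A minimal sketch of a helper that bundles them is shown here; `restaurant_counts` is a hypothetical name, and the toy frame holds a handful of rows copied from the venue tables in this notebook rather than the full `Atlanta_venues` data.

```python
import pandas as pd

def restaurant_counts(venues, neighborhood):
    """Count restaurant categories among the venues listed for one neighborhood."""
    in_area = venues[venues['Neighborhood'] == neighborhood]
    is_restaurant = in_area['Venue Category'].str.contains('Restaurant', na=False)
    return in_area.loc[is_restaurant, 'Venue Category'].value_counts()

# A few rows copied from the Ansley Park and Atlantic Station tables.
toy_venues = pd.DataFrame({
    'Neighborhood': ['Ansley Park'] * 5 + ['Atlantic Station'] * 2,
    'Venue': ['Atlanta Botanical Garden', 'Varuni-Napoli', 'Bantam & Biddy',
              'South City Kitchen', 'Bangkok Thai Restaurant', 'Yard House', 'Target'],
    'Venue Category': ['Botanical Garden', 'Pizza Place',
                       'Southern / Soul Food Restaurant',
                       'Southern / Soul Food Restaurant', 'Thai Restaurant',
                       'American Restaurant', 'Big Box Store'],
})

counts = restaurant_counts(toy_venues, 'Ansley Park')
print(counts)
```

Note that the substring match leaves out food venues whose category lacks the word "Restaurant", such as "Pizza Place", exactly as the cells in this notebook do.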


In [71]:
AS_venues = Atlanta_venues[Atlanta_venues['Neighborhood']=='Atlantic Station'].reset_index()
AS_venues

Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,611,Atlantic Station,33.792626,-84.396224,Kilwin's Chocolates & Ice Cream,33.793002,-84.397461,Ice Cream Shop
1,612,Atlantic Station,33.792626,-84.396224,Land of a Thousand Hills Coffee,33.793309,-84.395859,Coffee Shop
2,613,Atlantic Station,33.792626,-84.396224,Atlantic Station,33.791964,-84.396923,Shopping Plaza
3,614,Atlantic Station,33.792626,-84.396224,Cirque Du Soleil - Luzia,33.794722,-84.395228,Circus
4,615,Atlantic Station,33.792626,-84.396224,LOBBY at TWELVE,33.791786,-84.397718,Pizza Place
5,616,Atlantic Station,33.792626,-84.396224,Yard House,33.793421,-84.397394,American Restaurant
6,617,Atlantic Station,33.792626,-84.396224,Target,33.792608,-84.399644,Big Box Store
7,618,Atlantic Station,33.792626,-84.396224,Millennium Gate,33.790989,-84.399977,Monument / Landmark
8,619,Atlantic Station,33.792626,-84.396224,"Regal Atlantic Station ScreenX, IMAX, RPX & VIP",33.793484,-84.396207,Movie Theater
9,620,Atlantic Station,33.792626,-84.396224,Atlanta United Team Store,33.792912,-84.397442,Clothing Store


In [72]:
AS_restaurant = AS_venues[AS_venues['Venue Category'].str.contains('Restaurant')].reset_index()
print(AS_restaurant.shape)
AS_restaurant.head()


(25, 9)


Unnamed: 0,level_0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,5,616,Atlantic Station,33.792626,-84.396224,Yard House,33.793421,-84.397394,American Restaurant
1,12,623,Atlantic Station,33.792626,-84.396224,endive Publik house,33.794986,-84.400072,American Restaurant
2,16,627,Atlantic Station,33.792626,-84.396224,Nan Thai Fine Dining,33.791736,-84.38941,Thai Restaurant
3,27,638,Atlantic Station,33.792626,-84.396224,Wagaya Japanese Restaurant,33.786332,-84.398216,Japanese Restaurant
4,35,646,Atlantic Station,33.792626,-84.396224,Tuk Tuk Thai Food Loft,33.801657,-84.392464,Thai Restaurant


In [73]:
AS_cate = AS_restaurant['Venue Category'].unique()
print(AS_cate)
len(AS_cate)

['American Restaurant' 'Thai Restaurant' 'Japanese Restaurant'
 'Italian Restaurant' 'Mexican Restaurant'
 'Southern / Soul Food Restaurant' 'Seafood Restaurant'
 'Vegetarian / Vegan Restaurant' 'Indian Restaurant'
 'Fast Food Restaurant' 'Tapas Restaurant' 'Mediterranean Restaurant'
 'Spanish Restaurant']


13

In [74]:
#show on the map
# create map
map_as_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
kclusters=len(AS_cate)
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, cate in zip(AS_restaurant['Venue Latitude'], AS_restaurant['Venue Longitude'], AS_restaurant['Venue Category']):
    colorcode=np.where(AS_cate == cate)[0][0]
   
    label = folium.Popup(str(cate), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        color=rainbow[colorcode],
        fill=True,
        fill_color=rainbow[colorcode],
        fill_opacity=0.8).add_to(map_as_restaurant)
       
map_as_restaurant

In [75]:
print(AS_restaurant['Venue Category'].value_counts())

American Restaurant                5
Southern / Soul Food Restaurant    3
Seafood Restaurant                 3
Italian Restaurant                 2
Mexican Restaurant                 2
Thai Restaurant                    2
Japanese Restaurant                2
Vegetarian / Vegan Restaurant      1
Fast Food Restaurant               1
Tapas Restaurant                   1
Indian Restaurant                  1
Mediterranean Restaurant           1
Spanish Restaurant                 1
Name: Venue Category, dtype: int64


In [76]:
bw_venues = Atlanta_venues[Atlanta_venues['Neighborhood']=='Brookwood'].reset_index()
bw_venues


Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,1329,Brookwood,33.802605,-84.392983,Tuk Tuk Thai Food Loft,33.801657,-84.392464,Thai Restaurant
1,1330,Brookwood,33.802605,-84.392983,R. Thomas' Deluxe Grill,33.804326,-84.393649,Vegetarian / Vegan Restaurant
2,1331,Brookwood,33.802605,-84.392983,Egg Harbor Cafe,33.805171,-84.393905,Breakfast Spot
3,1332,Brookwood,33.802605,-84.392983,Chipotle Mexican Grill,33.801615,-84.392751,Mexican Restaurant
4,1333,Brookwood,33.802605,-84.392983,barre3,33.801617,-84.392393,Gym / Fitness Center
5,1334,Brookwood,33.802605,-84.392983,Mellow Mushroom,33.8024,-84.393222,Pizza Place
6,1335,Brookwood,33.802605,-84.392983,Starbucks,33.805975,-84.393536,Coffee Shop
7,1336,Brookwood,33.802605,-84.392983,Ted's Montana Grill,33.80623,-84.394343,American Restaurant
8,1337,Brookwood,33.802605,-84.392983,El Azteca,33.803372,-84.393284,Mexican Restaurant
9,1338,Brookwood,33.802605,-84.392983,Bell Street Burritos,33.80468,-84.39376,Burrito Place


In [77]:
bw_restaurant = bw_venues[bw_venues['Venue Category'].str.contains('Restaurant')].reset_index()
print(bw_restaurant.shape)
bw_restaurant.head()

(22, 9)


Unnamed: 0,level_0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,0,1329,Brookwood,33.802605,-84.392983,Tuk Tuk Thai Food Loft,33.801657,-84.392464,Thai Restaurant
1,1,1330,Brookwood,33.802605,-84.392983,R. Thomas' Deluxe Grill,33.804326,-84.393649,Vegetarian / Vegan Restaurant
2,3,1332,Brookwood,33.802605,-84.392983,Chipotle Mexican Grill,33.801615,-84.392751,Mexican Restaurant
3,7,1336,Brookwood,33.802605,-84.392983,Ted's Montana Grill,33.80623,-84.394343,American Restaurant
4,8,1337,Brookwood,33.802605,-84.392983,El Azteca,33.803372,-84.393284,Mexican Restaurant


In [78]:
bw_cate = bw_restaurant['Venue Category'].unique()
print(bw_cate)
len(bw_cate)

['Thai Restaurant' 'Vegetarian / Vegan Restaurant' 'Mexican Restaurant'
 'American Restaurant' 'Southern / Soul Food Restaurant'
 'Mediterranean Restaurant' 'New American Restaurant'
 'Fast Food Restaurant' 'Middle Eastern Restaurant' 'Japanese Restaurant']


10

In [79]:
#show on the map
# create map
map_bw_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
kclusters=len(bw_cate)
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, cate in zip(bw_restaurant['Venue Latitude'], bw_restaurant['Venue Longitude'], bw_restaurant['Venue Category']):
    colorcode=np.where(bw_cate == cate)[0][0]
   
    label = folium.Popup(str(cate), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        color=rainbow[colorcode],
        fill=True,
        fill_color=rainbow[colorcode],
        fill_opacity=0.8).add_to(map_bw_restaurant)
       
map_bw_restaurant

In [80]:
print(bw_restaurant['Venue Category'].value_counts())

American Restaurant                6
Mexican Restaurant                 3
New American Restaurant            3
Thai Restaurant                    2
Vegetarian / Vegan Restaurant      2
Mediterranean Restaurant           2
Southern / Soul Food Restaurant    1
Middle Eastern Restaurant          1
Fast Food Restaurant               1
Japanese Restaurant                1
Name: Venue Category, dtype: int64


In [81]:
gt_venues = Atlanta_venues[Atlanta_venues['Neighborhood']=='Georgia Tech'].reset_index()
gt_venues

Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,3250,Georgia Tech,33.776033,-84.398841,Ferst Center For The Arts,33.77482,-84.399194,College Theater
1,3251,Georgia Tech,33.776033,-84.398841,Campus Recreation Center (CRC),33.775532,-84.403383,College Rec Center
2,3252,Georgia Tech,33.776033,-84.398841,Spoiled Opulence Salon,33.776506,-84.407723,Salon / Barbershop
3,3253,Georgia Tech,33.776033,-84.398841,Sublime Doughnuts,33.781734,-84.40493,Donut Shop
4,3254,Georgia Tech,33.776033,-84.398841,Georgia Tech Aquatic Center,33.775535,-84.403547,Gym Pool
5,3255,Georgia Tech,33.776033,-84.398841,Delia's Chicken Sausage Stand,33.776406,-84.407408,Hot Dog Joint
6,3256,Georgia Tech,33.776033,-84.398841,Le Fat,33.778169,-84.409113,Vietnamese Restaurant
7,3257,Georgia Tech,33.776033,-84.398841,Coca-Cola Mainstreet @ AOC,33.77098,-84.398037,American Restaurant
8,3258,Georgia Tech,33.776033,-84.398841,Thumbs Up Diner,33.774854,-84.406714,Diner
9,3259,Georgia Tech,33.776033,-84.398841,Fred's Meat & Bread,33.777075,-84.389308,Burger Joint


In [82]:
gt_restaurant = gt_venues[gt_venues['Venue Category'].str.contains('Restaurant')].reset_index()
print(gt_restaurant.shape)
gt_restaurant.head()

(30, 9)


Unnamed: 0,level_0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,6,3256,Georgia Tech,33.776033,-84.398841,Le Fat,33.778169,-84.409113,Vietnamese Restaurant
1,7,3257,Georgia Tech,33.776033,-84.398841,Coca-Cola Mainstreet @ AOC,33.77098,-84.398037,American Restaurant
2,11,3261,Georgia Tech,33.776033,-84.398841,bartaco,33.77873,-84.40932,Mexican Restaurant
3,17,3267,Georgia Tech,33.776033,-84.398841,The Optimist,33.779892,-84.410597,Seafood Restaurant
4,19,3269,Georgia Tech,33.776033,-84.398841,Gio's Chicken Amalfitano,33.784875,-84.405933,Italian Restaurant


In [83]:
gt_cate = gt_restaurant['Venue Category'].unique()
print(gt_cate)
len(gt_cate)

['Vietnamese Restaurant' 'American Restaurant' 'Mexican Restaurant'
 'Seafood Restaurant' 'Italian Restaurant' 'Japanese Restaurant'
 'Middle Eastern Restaurant' 'Mediterranean Restaurant' 'Asian Restaurant'
 'New American Restaurant' 'Sushi Restaurant' 'Fast Food Restaurant'
 'Caribbean Restaurant' 'Tapas Restaurant' 'Vegetarian / Vegan Restaurant'
 'Restaurant' 'Southern / Soul Food Restaurant' 'Spanish Restaurant']


18

In [84]:
#show on the map
# create map
map_gt_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
kclusters=len(gt_cate)
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, cate in zip(gt_restaurant['Venue Latitude'], gt_restaurant['Venue Longitude'], gt_restaurant['Venue Category']):
    colorcode=np.where(gt_cate == cate)[0][0]
   
    label = folium.Popup(str(cate), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        color=rainbow[colorcode],
        fill=True,
        fill_color=rainbow[colorcode],
        fill_opacity=0.8).add_to(map_gt_restaurant)
       
map_gt_restaurant


In [85]:
print(gt_restaurant['Venue Category'].value_counts())

American Restaurant                4
Seafood Restaurant                 4
New American Restaurant            3
Mexican Restaurant                 2
Asian Restaurant                   2
Japanese Restaurant                2
Mediterranean Restaurant           2
Sushi Restaurant                   1
Middle Eastern Restaurant          1
Caribbean Restaurant               1
Vegetarian / Vegan Restaurant      1
Southern / Soul Food Restaurant    1
Italian Restaurant                 1
Restaurant                         1
Tapas Restaurant                   1
Fast Food Restaurant               1
Vietnamese Restaurant              1
Spanish Restaurant                 1
Name: Venue Category, dtype: int64


In [86]:
lh_venues = Atlanta_venues[Atlanta_venues['Neighborhood']=='Loring Heights'].reset_index()
lh_venues

Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,4316,Loring Heights,33.798716,-84.397983,Cirque Du Soleil - Luzia,33.794722,-84.395228,Circus
1,4317,Loring Heights,33.798716,-84.397983,Tuk Tuk Thai Food Loft,33.801657,-84.392464,Thai Restaurant
2,4318,Loring Heights,33.798716,-84.397983,Kilwin's Chocolates & Ice Cream,33.793002,-84.397461,Ice Cream Shop
3,4319,Loring Heights,33.798716,-84.397983,endive Publik house,33.794986,-84.400072,American Restaurant
4,4320,Loring Heights,33.798716,-84.397983,Land of a Thousand Hills Coffee,33.793309,-84.395859,Coffee Shop
5,4321,Loring Heights,33.798716,-84.397983,Chipotle Mexican Grill,33.801615,-84.392751,Mexican Restaurant
6,4322,Loring Heights,33.798716,-84.397983,barre3,33.801617,-84.392393,Gym / Fitness Center
7,4323,Loring Heights,33.798716,-84.397983,Stoddard's Range and Guns,33.792995,-84.403804,Gun Shop
8,4324,Loring Heights,33.798716,-84.397983,Egg Harbor Cafe,33.805171,-84.393905,Breakfast Spot
9,4325,Loring Heights,33.798716,-84.397983,R. Thomas' Deluxe Grill,33.804326,-84.393649,Vegetarian / Vegan Restaurant


In [87]:
lh_restaurant = lh_venues[lh_venues['Venue Category'].str.contains('Restaurant')].reset_index()
print(lh_restaurant.shape)
lh_restaurant.head()

(24, 9)


Unnamed: 0,level_0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,1,4317,Loring Heights,33.798716,-84.397983,Tuk Tuk Thai Food Loft,33.801657,-84.392464,Thai Restaurant
1,3,4319,Loring Heights,33.798716,-84.397983,endive Publik house,33.794986,-84.400072,American Restaurant
2,5,4321,Loring Heights,33.798716,-84.397983,Chipotle Mexican Grill,33.801615,-84.392751,Mexican Restaurant
3,9,4325,Loring Heights,33.798716,-84.397983,R. Thomas' Deluxe Grill,33.804326,-84.393649,Vegetarian / Vegan Restaurant
4,14,4330,Loring Heights,33.798716,-84.397983,Yard House,33.793421,-84.397394,American Restaurant


In [88]:
lh_cate = lh_restaurant['Venue Category'].unique()
print(lh_cate)
len(lh_cate)

['Thai Restaurant' 'American Restaurant' 'Mexican Restaurant'
 'Vegetarian / Vegan Restaurant' 'Japanese Restaurant'
 'Southern / Soul Food Restaurant' 'Mediterranean Restaurant'
 'New American Restaurant' 'Fast Food Restaurant'
 'Middle Eastern Restaurant' 'Italian Restaurant']


11

In [89]:
#show on the map
# create map
map_lh_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# one color per restaurant category
colors_array = cm.rainbow(np.linspace(0, 1, len(lh_cate)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per restaurant, colored by its category
for lat, lon, cate in zip(lh_restaurant['Venue Latitude'], lh_restaurant['Venue Longitude'], lh_restaurant['Venue Category']):
    colorcode=np.where(lh_cate == cate)[0][0]
   
    label = folium.Popup(str(cate), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        color=rainbow[colorcode],
        fill=True,
        fill_color=rainbow[colorcode],
        fill_opacity=0.8).add_to(map_lh_restaurant)
       
map_lh_restaurant
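
The cell above looks up each marker's color by position, via `np.where(lh_cate == cate)[0][0]`. One alternative is to build a category-to-color dictionary once and index it by name. The sketch below is only an illustration: `category_colors` is a hypothetical helper, and it uses the standard-library `colorsys` module in place of `cm.rainbow`, so the exact colors differ from those on the map.

```python
import colorsys

def category_colors(categories):
    """Return {category: '#rrggbb'} with hues spread evenly over the color wheel."""
    categories = list(categories)
    n = max(len(categories), 1)
    palette = {}
    for i, cate in enumerate(categories):
        # Evenly spaced hue; fixed saturation and brightness.
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.85, 0.9)
        palette[cate] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return palette

# Example category list in the same form as lh_cate above.
palette = category_colors(["Thai Restaurant", "American Restaurant", "Mexican Restaurant"])
print(palette)
```

Inside the marker loop, `palette[cate]` would then take the place of `rainbow[colorcode]`.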

In [90]:
print(lh_restaurant['Venue Category'].value_counts())

American Restaurant                5
Southern / Soul Food Restaurant    3
Mexican Restaurant                 3
Vegetarian / Vegan Restaurant      2
Thai Restaurant                    2
New American Restaurant            2
Fast Food Restaurant               2
Japanese Restaurant                2
Mediterranean Restaurant           1
Middle Eastern Restaurant          1
Italian Restaurant                 1
Name: Venue Category, dtype: int64
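
Several categories in the output above share the same count, so a fixed-size "top N" list depends on how ties are broken. The sketch below shows one way to make that explicit, using `Series.nlargest` with `keep="all"` on the counts copied from the output above. The name `lh_counts` is introduced here for illustration.

```python
import pandas as pd

# Counts copied from the value_counts() output above for Loring Heights.
lh_counts = pd.Series({
    "American Restaurant": 5,
    "Southern / Soul Food Restaurant": 3,
    "Mexican Restaurant": 3,
    "Vegetarian / Vegan Restaurant": 2,
    "Thai Restaurant": 2,
    "New American Restaurant": 2,
    "Fast Food Restaurant": 2,
    "Japanese Restaurant": 2,
    "Mediterranean Restaurant": 1,
    "Middle Eastern Restaurant": 1,
    "Italian Restaurant": 1,
})

# keep="all" returns every category tied with the n-th largest count,
# so the result can hold more than n rows.
top3 = lh_counts.nlargest(3, keep="all")
top4 = lh_counts.nlargest(4, keep="all")

print(top3)
print(len(top4))  # larger than 4 because five categories tie at a count of 2
```

The top three are unambiguous here, while asking for a fourth pulls in all five categories that appear twice.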


In [91]:
mt_venues = Atlanta_venues[Atlanta_venues['Neighborhood']=='Midtown'].reset_index()
mt_venues


Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,4674,Midtown,33.781127,-84.38636,Cafe Agora,33.780932,-84.38446,Mediterranean Restaurant
1,4675,Midtown,33.781127,-84.38636,Mac's Beer & Wine,33.780916,-84.387992,Liquor Store
2,4676,Midtown,33.781127,-84.38636,Ecco,33.778827,-84.385988,Mediterranean Restaurant
3,4677,Midtown,33.781127,-84.38636,Brazilian Wax by Andreia,33.780646,-84.387739,Spa
4,4678,Midtown,33.781127,-84.38636,Savi Provisions,33.78101,-84.384278,Gourmet Shop
5,4679,Midtown,33.781127,-84.38636,Sweet Hut Bakery & Cafe,33.780315,-84.3841,Bakery
6,4680,Midtown,33.781127,-84.38636,Empire State South,33.781374,-84.383662,Southern / Soul Food Restaurant
7,4681,Midtown,33.781127,-84.38636,Dancing Goats Coffee Bar,33.78081,-84.386653,Coffee Shop
8,4682,Midtown,33.781127,-84.38636,Exhale,33.783294,-84.383368,Spa
9,4683,Midtown,33.781127,-84.38636,Steamhouse Lounge,33.78341,-84.387644,Seafood Restaurant


In [92]:
mt_restaurant = mt_venues[mt_venues['Venue Category'].str.contains('Restaurant')].reset_index()
print(mt_restaurant.shape)
mt_restaurant.head()

(29, 9)


Unnamed: 0,level_0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,0,4674,Midtown,33.781127,-84.38636,Cafe Agora,33.780932,-84.38446,Mediterranean Restaurant
1,2,4676,Midtown,33.781127,-84.38636,Ecco,33.778827,-84.385988,Mediterranean Restaurant
2,6,4680,Midtown,33.781127,-84.38636,Empire State South,33.781374,-84.383662,Southern / Soul Food Restaurant
3,9,4683,Midtown,33.781127,-84.38636,Steamhouse Lounge,33.78341,-84.387644,Seafood Restaurant
4,12,4686,Midtown,33.781127,-84.38636,Marlow's Tavern,33.78011,-84.387566,New American Restaurant


In [93]:
mt_cate = mt_restaurant['Venue Category'].unique()
print(mt_cate)
len(mt_cate)

['Mediterranean Restaurant' 'Southern / Soul Food Restaurant'
 'Seafood Restaurant' 'New American Restaurant' 'Indian Restaurant'
 'American Restaurant' 'Mexican Restaurant' 'Japanese Restaurant'
 'Sushi Restaurant' 'Modern European Restaurant' 'Vietnamese Restaurant'
 'Middle Eastern Restaurant' 'Italian Restaurant' 'Tex-Mex Restaurant'
 'Korean Restaurant' 'Restaurant' 'Cuban Restaurant']


17

In [94]:
# create map
map_mt_restaurant = folium.Map(location=[latitude, longitude], zoom_start=12)

# one color per restaurant category
colors_array = cm.rainbow(np.linspace(0, 1, len(mt_cate)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per restaurant, colored by its category
for lat, lon, cate in zip(mt_restaurant['Venue Latitude'], mt_restaurant['Venue Longitude'], mt_restaurant['Venue Category']):
    colorcode=np.where(mt_cate == cate)[0][0]
   
    label = folium.Popup(str(cate), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        color=rainbow[colorcode],
        fill=True,
        fill_color=rainbow[colorcode],
        fill_opacity=0.8).add_to(map_mt_restaurant)
       
map_mt_restaurant

In [95]:
print(mt_restaurant['Venue Category'].value_counts())

Southern / Soul Food Restaurant    4
New American Restaurant            3
American Restaurant                3
Seafood Restaurant                 3
Mediterranean Restaurant           3
Italian Restaurant                 2
Sushi Restaurant                   1
Cuban Restaurant                   1
Mexican Restaurant                 1
Restaurant                         1
Tex-Mex Restaurant                 1
Modern European Restaurant         1
Indian Restaurant                  1
Middle Eastern Restaurant          1
Vietnamese Restaurant              1
Korean Restaurant                  1
Japanese Restaurant                1
Name: Venue Category, dtype: int64
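
The filter-then-count steps above are repeated cell by cell for each neighborhood. A possible way to express them once is sketched below. `restaurant_counts` is a hypothetical helper, and `toy_venues` holds only a few rows copied from the tables above, so the counts it prints are not the full results.

```python
import pandas as pd

def restaurant_counts(venues, neighborhood):
    """Count restaurant categories among the venues of one neighborhood."""
    in_area = venues["Neighborhood"] == neighborhood
    is_restaurant = venues["Venue Category"].str.contains("Restaurant")
    return venues.loc[in_area & is_restaurant, "Venue Category"].value_counts()

# A handful of rows taken from the tables above, standing in for Atlanta_venues.
toy_venues = pd.DataFrame({
    "Neighborhood": ["Midtown", "Midtown", "Midtown", "Loring Heights", "Loring Heights"],
    "Venue": ["Cafe Agora", "Exhale", "Ecco",
              "Tuk Tuk Thai Food Loft", "Land of a Thousand Hills Coffee"],
    "Venue Category": ["Mediterranean Restaurant", "Spa", "Mediterranean Restaurant",
                       "Thai Restaurant", "Coffee Shop"],
})

summary = {name: restaurant_counts(toy_venues, name)
           for name in ["Midtown", "Loring Heights"]}
for name, counts in summary.items():
    print(name, counts.to_dict())
```

With the full venue table, the same loop could run over all selected neighborhoods instead of one pair of cells per neighborhood.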


## Results 

Based on the clustering of neighborhoods, the areas in Cluster 2 look like good places to open a restaurant. Cluster 2 is concentrated around the center of Atlanta, so we focus on neighborhoods near Georgia Tech, Midtown, and Atlantic Station. The neighborhoods listed below satisfy our criteria:

    1. Ansley Park
    2. Atlantic Station
    3. Brookwood
    4. Georgia Tech
    5. Loring Heights
    6. Midtown
    
#### The most popular restaurant categories in each selected neighborhood are listed below:

    -> Ansley Park: Thai Restaurant, Southern / Soul Food Restaurant, American Restaurant, Seafood Restaurant
    -> Atlantic Station: American Restaurant, Seafood Restaurant, Southern / Soul Food Restaurant
    -> Brookwood: American Restaurant, New American Restaurant, Mexican Restaurant
    -> Georgia Tech: Seafood Restaurant, American Restaurant, New American Restaurant
    -> Loring Heights: American Restaurant, Southern / Soul Food Restaurant, Mexican Restaurant
    -> Midtown: Southern / Soul Food Restaurant, American Restaurant, New American Restaurant, Seafood Restaurant, Mediterranean Restaurant