## Business Problem

#### Jerry is a huge NHL fan who loves traveling to different cities to watch hockey games and explore each city during the trip. He would like to see everything in each NHL city, but he is always on a tight schedule, so he wants to know the top picks in each city.

## Data Description

#### First, fetch the list of NHL arenas and their cities from Wikipedia (https://en.wikipedia.org/wiki/List_of_National_Hockey_League_arenas) and geocode each arena. Then use the Foursquare API to find the top picks near each arena. Lastly, cluster the NHL arenas with k-means based on their nearby top picks.

In [1]:
import pandas as pd 
import numpy as np
import requests
from bs4 import BeautifulSoup

In [2]:
url = "https://en.wikipedia.org/wiki/List_of_National_Hockey_League_arenas"
g = requests.get(url)

In [3]:
soup = BeautifulSoup(g.text, "html5lib")
wiki_arena_list = soup.find("tbody")
wiki_column = wiki_arena_list.find("tr").text
wiki_column=wiki_column.split("\n")

In [4]:
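# Print each header entry and drop the empty strings left over from splitting on "\n".
# Popping while iterating shortens the list, so late indices raise IndexError and are skipped.
for i in range(len(wiki_column)): 
    try: 
        print(i, wiki_column[i])
        if wiki_column[i] == "": 
            wiki_column.pop(i)
    except IndexError:
        pass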

0 
1 
2 
3 
4 
5 
6 
7 
8 


In [5]:
wiki_column

['Image',
 'Arena',
 'Location',
 'Team',
 'Capacity',
 'Opened',
 'Season of first NHL game',
 'Ref(s)']

In [6]:
wiki_cells = wiki_arena_list.find_all("td")

In [7]:
col0=[]
col1=[]
col2=[]
col3=[]
col4=[]
col5=[]
col6=[]
col7=[]

for i in range(len(wiki_cells)): 
    if i%8 == 0: 
        col0.append(wiki_cells[i].text[0:-1])     
    elif i%8==1: 
        col1.append(wiki_cells[i].text[0:-1])
    elif i%8==2: 
        col2.append(wiki_cells[i].text[0:-1])
    elif i%8==3: 
        col3.append(wiki_cells[i].text[0:-1])
    elif i%8==4: 
        col4.append(wiki_cells[i].text[0:-1])
    elif i%8==5: 
        col5.append(wiki_cells[i].text[0:-1])
    elif i%8==6: 
        col6.append(wiki_cells[i].text[0:-1])
    else: 
        col7.append(wiki_cells[i].text[0:-1])
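
The modulo chain above distributes a flat list of cells into eight column lists. A more compact way to express the same idea is to slice the flat list into rows and then transpose the rows. The sketch below is only an illustration: `cells_to_columns` is a hypothetical helper, and the sample strings stand in for the real `wiki_cells` text.

```python
def cells_to_columns(cells, n_cols):
    """Group a flat, row-major list of cell values into n_cols column lists."""
    if n_cols <= 0 or len(cells) % n_cols != 0:
        raise ValueError("cell count must be a positive multiple of n_cols")
    # Slice the flat list into rows of n_cols cells each.
    rows = [cells[i:i + n_cols] for i in range(0, len(cells), n_cols)]
    # Transpose rows into columns.
    return [list(col) for col in zip(*rows)]


# Made-up sample with three columns (image, arena, location).
sample_cells = [
    "", "Amalie Arena", "Tampa, Florida",
    "", "American Airlines Center", "Dallas, Texas",
]
columns = cells_to_columns(sample_cells, n_cols=3)
print(columns[1])
```

The length check makes a change in the table layout (for example, a row with a merged cell) fail loudly instead of silently shifting values into the wrong column.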

In [8]:
df=pd.DataFrame({wiki_column[0]:col0,
                 wiki_column[1]:col1,
                wiki_column[2]:col2,
                wiki_column[3]:col3,
                wiki_column[4]:col4,
                wiki_column[5]:col5,
                wiki_column[6]:col6,
                wiki_column[7]:col7,})

In [9]:
df.head()

Unnamed: 0,Image,Arena,Location,Team,Capacity,Opened,Season of first NHL game,Ref(s)
0,,Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,19092,1996,1996–97,[1]
1,,American Airlines Center,"Dallas, Texas",Dallas Stars,18532,2001,2001–02,[2]
2,,Ball Arena,"Denver, Colorado",Colorado Avalanche,17809,1999,1999–2000,[3]
3,,BB&T Center,"Sunrise, Florida",Florida Panthers,19250,1998,1998–99,[4]
4,,Bell Centre,"Montreal, Quebec",Montreal Canadiens,21302,1996,1995–96,[5]


In [10]:
df.drop("Ref(s)", axis = 1 ,inplace = True)

In [11]:
import geocoder
from geopy.geocoders import Nominatim

In [12]:
address = 'North America'
geolocator = Nominatim(user_agent="google")

In [13]:
location = geolocator.geocode('{}, {}'.format(df["Arena"][0],df["Location"][0]))
latitude = location.latitude
longitude = location.longitude

In [14]:
latitude=[]
longitude=[]

for k in range(len(df)):
    try: 
        location = geolocator.geocode('{}, {}'.format(df["Arena"][k],df["Location"][k]))
        print(k,location, location.latitude,location.longitude)
        latitude.append(location.latitude)
        longitude.append(location.longitude)
    except Exception:
        # geocode() returns None when nothing is found, so the attribute access above fails
        print(k)
        latitude.append(np.nan)
        longitude.append(np.nan)

0 Amalie Arena, 401, Channelside Drive, Harbour Island, Tampa, Hillsborough County, Florida, 33602, United States 27.942704 -82.45189031562487
1 American Airlines Center, 2500, Victory Avenue, Harwood District, Dallas, Dallas County, Texas, 75219, United States 32.7905076 -96.81027213460834
2 Ball Arena, 1000, Chopper Circle, Auraria, Denver, Denver County, Colorado, 80204, United States 39.748683799999995 -105.00754401780362
3 BB&T Center, 1, Northwest 137th Way, Sunrise, Broward County, Florida, 33323, United States 26.15837015 -80.3254288749938
4 Centre Bell, Rue Drummond, René-Lévesque, Ville-Marie, Montréal, Agglomération de Montréal, Montréal (06), Québec, H3B 4W8, Canada 45.4960916 -73.56927782355446
5 Bridgestone Arena, 5th Avenue South, Nashville-Davidson, Davidson County, Tennessee, 37203, United States 36.1589806 -86.77838189265074
6 Canada Life, 60, Osborne Street North, Osborne Village, Roslyn, Fort Rouge–East Fort Garry, Winnipeg, Winnipeg (city), Manitoba, R3C1V3, Canada
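
The geocoding loop above has to cope with two failure modes: the geocoder can raise (for example on a timeout), or it can return `None` when the address is unknown. A minimal sketch of that fallback logic, separated from the network call, is shown here. `geocode_or_nan` and `fake_geocode` are hypothetical names used for illustration; in the notebook the callable would be `geolocator.geocode`.

```python
import math
from types import SimpleNamespace


def geocode_or_nan(geocode_fn, query):
    """Return (latitude, longitude) for query, or (nan, nan) if lookup fails."""
    try:
        location = geocode_fn(query)
    except Exception:
        # Treat service errors the same way as "not found".
        location = None
    if location is None:
        return math.nan, math.nan
    return location.latitude, location.longitude


def fake_geocode(query):
    """Stand-in for a real geocoder: knows one address, returns None otherwise."""
    known = {
        "Amalie Arena, Tampa, Florida": SimpleNamespace(
            latitude=27.942704, longitude=-82.45189
        ),
    }
    return known.get(query)


found = geocode_or_nan(fake_geocode, "Amalie Arena, Tampa, Florida")
missing = geocode_or_nan(fake_geocode, "UBS Arena, Elmont, New York")
print(found, missing)
```

Passing the geocoder in as an argument keeps the fallback testable without network access.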

In [15]:
# Row 28 (UBS Arena) could not be geocoded.
# A manual search on https://www.openstreetmap.org/ returned no results either.
df.iloc[28,:]

Image                                         
Arena                                UBS Arena
Location                      Elmont, New York
Team                        New York Islanders
Capacity                                17,113
Opened                                    2021
Season of first NHL game               2021–22
Name: 28, dtype: object

In [16]:
df["Latitude"]=latitude
df["Longitude"] = longitude

In [17]:
df.head()

Unnamed: 0,Image,Arena,Location,Team,Capacity,Opened,Season of first NHL game,Latitude,Longitude
0,,Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,19092,1996,1996–97,27.942704,-82.45189
1,,American Airlines Center,"Dallas, Texas",Dallas Stars,18532,2001,2001–02,32.790508,-96.810272
2,,Ball Arena,"Denver, Colorado",Colorado Avalanche,17809,1999,1999–2000,39.748684,-105.007544
3,,BB&T Center,"Sunrise, Florida",Florida Panthers,19250,1998,1998–99,26.15837,-80.325429
4,,Bell Centre,"Montreal, Quebec",Montreal Canadiens,21302,1996,1995–96,45.496092,-73.569278


In [18]:
# According to Wikipedia (https://en.wikipedia.org/wiki/UBS_Arena), the arena is located at 40.712094°N 73.727157°W,
# i.e. latitude 40.712094 and longitude -73.727157.
df.loc[28, "Latitude"] = 40.712094
df.loc[28, "Longitude"] = -73.727157
df.iloc[28,:]

Image                                         
Arena                                UBS Arena
Location                      Elmont, New York
Team                        New York Islanders
Capacity                                17,113
Opened                                    2021
Season of first NHL game               2021–22
Latitude                               40.7121
Longitude                             -73.7272
Name: 28, dtype: object

In [19]:
df.drop("Image",axis=1,inplace=True)
df.head()

Unnamed: 0,Arena,Location,Team,Capacity,Opened,Season of first NHL game,Latitude,Longitude
0,Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,19092,1996,1996–97,27.942704,-82.45189
1,American Airlines Center,"Dallas, Texas",Dallas Stars,18532,2001,2001–02,32.790508,-96.810272
2,Ball Arena,"Denver, Colorado",Colorado Avalanche,17809,1999,1999–2000,39.748684,-105.007544
3,BB&T Center,"Sunrise, Florida",Florida Panthers,19250,1998,1998–99,26.15837,-80.325429
4,Bell Centre,"Montreal, Quebec",Montreal Canadiens,21302,1996,1995–96,45.496092,-73.569278


In [20]:
# Export to JSON (to_json returns None when given a path, so no assignment is needed)
df.to_json(r"NHL_arena_Lat_Log.json")

In [21]:
df.columns.values

array(['Arena', 'Location', 'Team', 'Capacity', 'Opened',
       'Season of first NHL game', 'Latitude', 'Longitude'], dtype=object)

## Connect to the Foursquare API and Select Top Picks

In [22]:
import matplotlib.cm as cm 
import matplotlib.colors as colors 
import matplotlib.pyplot as plt 
from sklearn.cluster import KMeans
import folium
import seaborn as sns 
%matplotlib inline

In [23]:
address2 = 'United States of America'
geolocator2 = Nominatim(user_agent="google")
location2 = geolocator2.geocode(address2)
latitude2 = location2.latitude
longitude2 = location2.longitude

In [24]:
map_na = folium.Map(location=[latitude2,longitude2],zoom_start = 4,width = 1000)
for lat, lng, arena, city, team in zip(df["Latitude"], df["Longitude"],df["Arena"],df["Location"],df["Team"]):
    label = "Team: {}, \n Arena: {},  City: {}".format(team,arena,city)
    label = folium.Popup(label,parse_html=True, max_width=200)
    folium.CircleMarker(
        [lat,lng], 
        radius = 6 ,
        popup = label , 
        color = "red",
        fill=True, 
        fill_color = "red",
        fill_opacity = 0.7).add_to(map_na)  # fill_opacity, not fill_capacity
    
map_na

In [25]:
CLIENT_ID = 'T3LYSVVFWCXCPBIHPLDIXEPPS2JNTX2LACLOQU0MXTR5VYU5' # your Foursquare ID
CLIENT_SECRET = 'PBUOZ1LKV1D14RQKXICJZT2FXNXDZ43Q3VD1L4VO0HRLX2N2' # your Foursquare Secret
CODE = "KJ04IZ1JI3IAXIQ3BEPQ2JX2CJRC4PY1DMEFNI2B0T0J1XK3#_=_"
ACCESS_TOKEN = 'IEZX0QMM1LAALBLWIDNJTGMZZN5SZ10WMU4T40YFC5FWZRVR' # your FourSquare Access Token
VERSION = '20180604'
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: T3LYSVVFWCXCPBIHPLDIXEPPS2JNTX2LACLOQU0MXTR5VYU5
CLIENT_SECRET:PBUOZ1LKV1D14RQKXICJZT2FXNXDZ43Q3VD1L4VO0HRLX2N2
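
Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. One common alternative is to read them from environment variables. The sketch below assumes variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, which are hypothetical and not defined by Foursquare; `load_foursquare_credentials` is likewise an illustrative helper.

```python
import os


def load_foursquare_credentials(env=None):
    """Read the client ID and secret from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    names = ("FOURSQUARE_CLIENT_ID", "FOURSQUARE_CLIENT_SECRET")  # assumed names
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Demonstration with a plain dict instead of the real environment.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret",
}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID loaded:", bool(client_id))
```

Printing only whether a value was loaded, rather than the value itself, keeps secrets out of saved cell outputs.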


In [26]:
def get_category_type(row):
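    # The category list lives under 'categories' or 'venue.categories', depending on the frame
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']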
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [27]:
api_df = pd.DataFrame()

for c in range(len(df)): 
    aa = df["Latitude"][c]
    bb = df["Longitude"][c]
    radius=4000
    section = "topPicks"

    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&section={}'.format(CLIENT_ID, CLIENT_SECRET, aa, bb, VERSION, radius,section)
    results = requests.get(url).json()
    items = results["response"]["groups"][0]["items"]
    dataframe = pd.json_normalize(items) 

    filtered_columns = ["venue.name", "venue.categories"] + [col for col in dataframe.columns if col.startswith("venue.location.")]
    dataframe_filtered = dataframe.loc[:,filtered_columns].copy()  # copy to avoid SettingWithCopyWarning below
    dataframe_filtered['venue.categories'] = dataframe_filtered.apply(get_category_type, axis=1)
    dataframe_filtered.columns = [col.split('.')[-1] for col in dataframe_filtered.columns]
    dataframe_filtered["Arena"] = df["Arena"][c]
    dataframe_filtered["Location"]=df["Location"][c]
    dataframe_filtered["Team"] = df["Team"][c]
    
    api_df = pd.concat([api_df, dataframe_filtered])
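
The request loop above builds the query string by hand and indexes straight into `results["response"]["groups"][0]["items"]`, which raises if the API returns an error payload. A hedged sketch of a slightly more defensive version follows. The endpoint and parameter names are the ones used in the cell above; `build_explore_url` and `extract_items` are hypothetical helpers, and the credentials are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, lat, lng,
                      version="20180604", radius=4000, section="topPicks"):
    """Build the explore URL with properly encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "radius": radius,
        "section": section,
    }
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"


def extract_items(payload):
    """Return the venue items from a response dict, or [] if none are present."""
    groups = payload.get("response", {}).get("groups", [])
    return groups[0].get("items", []) if groups else []


url = build_explore_url("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", 27.942704, -82.45189)
print(url)
print(extract_items({"meta": {"code": 400}}))
```

With `extract_items`, an arena whose request fails contributes an empty list and can be skipped, instead of stopping the whole loop with a `KeyError`.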

In [28]:
api_df.head()

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,Arena,Location,Team,crossStreet,neighborhood
0,Double Decker,Karaoke Bar,1721 E 7th Ave,27.960307,-82.439717,"[{'label': 'display', 'lat': 27.96030709631222...",2296,33605,US,Tampa,FL,United States,"[1721 E 7th Ave, Tampa, FL 33605, United States]",Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,,
1,Edison’s Swigamajig,Fish & Chips Shop,,27.943044,-82.447524,"[{'label': 'display', 'lat': 27.943044, 'lng':...",431,33602,US,Tampa,FL,United States,"[Tampa, FL 33602, United States]",Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,,
2,Chill Bros,Ice Cream Shop,,27.960353,-82.437613,"[{'label': 'display', 'lat': 27.960353, 'lng':...",2414,33605,US,Tampa,FL,United States,"[Tampa, FL 33605, United States]",Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,,
3,Casa Santo Stefano,Italian Restaurant,1607 N 22nd St,27.959311,-82.434563,"[{'label': 'display', 'lat': 27.959311, 'lng':...",2514,33605,US,Tampa,FL,United States,"[1607 N 22nd St, Tampa, FL 33605, United States]",Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,,
4,Urban Kai Paddleboarding Rental,Board Shop,2305 N Willow Ave,27.963482,-82.474181,"[{'label': 'display', 'lat': 27.963482, 'lng':...",3186,33607,US,Tampa,FL,United States,"[2305 N Willow Ave, Tampa, FL 33607, United St...",Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,,


In [73]:
nearby_map = folium.Map(location=[latitude2, longitude2], zoom_start = 4,width = 1000)

for lat, lng, arena, city, team in zip(df["Latitude"], df["Longitude"],df["Arena"],df["Location"],df["Team"]):
    label = "Team: {},  Arena: {},  City: {}".format(team,arena,city)
    label = folium.Popup(label,parse_html=True, max_width=200)
    folium.CircleMarker(
        [lat,lng], 
        radius = 20 ,
        popup = label , 
        color = "red",
        fill=True, 
        fill_color = "red",
        fill_opacity = 0.7).add_to(nearby_map)
    
for lat, lng, lb,storename in zip(api_df["lat"], api_df["lng"], api_df["categories"],api_df["name"]):
    label="name: {},  category: {}".format(storename, lb)
    label=folium.Popup(label,parse_html=True,max_width=200)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.6).add_to(nearby_map)

nearby_map

## Cluster the Arenas Based on the Top Picks

In [30]:
api_df.columns

Index(['name', 'categories', 'address', 'lat', 'lng', 'labeledLatLngs',
       'distance', 'postalCode', 'cc', 'city', 'state', 'country',
       'formattedAddress', 'Arena', 'Location', 'Team', 'crossStreet',
       'neighborhood'],
      dtype='object')

In [31]:
api_df.shape

(373, 18)

In [32]:
len(api_df["categories"].unique())

142

In [33]:
cluster_onehot=pd.get_dummies(api_df[["categories"]],prefix="",prefix_sep="")
cluster_onehot.head()

Unnamed: 0,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,...,Trail,Vegetarian / Vegan Restaurant,Vehicle Inspection Station,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio,Zoo Exhibit
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [34]:
cluster_onehot.shape

(373, 142)

In [35]:
cluster_onehot["Arena"]=api_df["Arena"]
fixed_columns = [cluster_onehot.columns[-1]]+list(cluster_onehot.columns[:-1])

In [36]:
cluster_onehot = cluster_onehot[fixed_columns]
cluster_onehot.head()

Unnamed: 0,Arena,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,...,Trail,Vegetarian / Vegan Restaurant,Vehicle Inspection Station,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio,Zoo Exhibit
0,Amalie Arena,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Amalie Arena,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Amalie Arena,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Amalie Arena,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Amalie Arena,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [37]:
cluster = cluster_onehot.groupby("Arena").sum().reset_index()
cluster.shape

(32, 143)
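
The one-hot encoding followed by `groupby("Arena").sum()` produces, for each arena, a count of venues per category. The same computation can be sketched with the standard library to make the resulting matrix easier to picture. The rows below are a made-up toy sample, not the actual API results.

```python
from collections import Counter, defaultdict

# Toy (arena, category) pairs standing in for the rows of api_df.
venues = [
    ("Amalie Arena", "Brewery"),
    ("Amalie Arena", "Ice Cream Shop"),
    ("BB&T Center", "American Restaurant"),
    ("BB&T Center", "American Restaurant"),
    ("BB&T Center", "Park"),
]

# Count categories per arena (the equivalent of one-hot encoding plus a grouped sum).
counts_by_arena = defaultdict(Counter)
for arena, category in venues:
    counts_by_arena[arena][category] += 1

# Fix a column order, then build one count vector per arena.
categories = sorted({category for _, category in venues})
matrix = {
    arena: [counts[category] for category in categories]
    for arena, counts in counts_by_arena.items()
}

print(categories)
for arena, row in matrix.items():
    print(arena, row)
```

Each arena's row is the feature vector that k-means later compares, so arenas with similar mixes of venue categories end up close together.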

In [38]:
cluster.head()

Unnamed: 0,Arena,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,...,Trail,Vegetarian / Vegan Restaurant,Vehicle Inspection Station,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio,Zoo Exhibit
0,Amalie Arena,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,American Airlines Center,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,BB&T Center,3,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Ball Arena,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
4,Bell Centre,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [39]:
num_top_venues = 5

for hood in cluster['Arena']:
    print("----"+hood+"----")
    temp = cluster[cluster['Arena'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Amalie Arena----
                venue  freq
0      Ice Cream Shop   1.0
1          Board Shop   1.0
2   Fish & Chips Shop   1.0
3             Brewery   1.0
4  Italian Restaurant   1.0


----American Airlines Center----
                 venue  freq
0          Coffee Shop   2.0
1              Brewery   2.0
2  American Restaurant   1.0
3   Tex-Mex Restaurant   1.0
4                 Park   1.0


----BB&T Center----
                 venue  freq
0  American Restaurant   3.0
1                 Park   2.0
2        Shopping Mall   2.0
3       Breakfast Spot   1.0
4         Liquor Store   1.0


----Ball Arena----
                           venue  freq
0                    Pizza Place   2.0
1                          Diner   1.0
2  Vegetarian / Vegan Restaurant   1.0
3                           Café   1.0
4                         Lounge   1.0


----Bell Centre----
                venue  freq
0          Restaurant   2.0
1      Sandwich Place   1.0
2            Tea Room   1.0
3  Mexican Restau

In [40]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [41]:
num_top_venues = 20

indicators = ['st', 'nd', 'rd']

columns = ['Arena']
for ind in np.arange(num_top_venues):
    try:
        # The first three ranks use 'st', 'nd', 'rd'
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # All later ranks fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

arena_venues_sorted = pd.DataFrame(columns=columns)
arena_venues_sorted['Arena'] = cluster['Arena']

for ind in np.arange(cluster.shape[0]):
    arena_venues_sorted.iloc[ind, 1:] = return_most_common_venues(cluster.iloc[ind, :], num_top_venues)

arena_venues_sorted.head()

Unnamed: 0,Arena,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Amalie Arena,Brewery,Ice Cream Shop,Italian Restaurant,Karaoke Bar,Board Shop,Fish & Chips Shop,Fast Food Restaurant,Food & Drink Shop,Food,...,Fishing Store,Fish Market,Zoo Exhibit,Farmers Market,Food Truck,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop
1,American Airlines Center,Coffee Shop,Brewery,Park,Bar,Sports Bar,Shopping Mall,Seafood Restaurant,Tex-Mex Restaurant,Burger Joint,...,Japanese Restaurant,Music Venue,American Restaurant,Exhibit,Dosa Place,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Escape Room
2,BB&T Center,American Restaurant,Park,Shopping Mall,Café,Breakfast Spot,Caribbean Restaurant,Clothing Store,Liquor Store,Burger Joint,...,Italian Restaurant,Farmers Market,Fast Food Restaurant,Exhibit,Fish Market,Escape Room,Fishing Store,Flower Shop,Electronics Store,Dumpling Restaurant
3,Ball Arena,Pizza Place,Diner,Donut Shop,Café,Lounge,Vegetarian / Vegan Restaurant,Farmers Market,Fishing Store,Fish Market,...,Fast Food Restaurant,Escape Room,Exhibit,Food,Electronics Store,Dumpling Restaurant,Dosa Place,Dog Run,Flower Shop,Zoo Exhibit
4,Bell Centre,Restaurant,Sandwich Place,Poutine Place,Café,Tea Room,Coffee Shop,Mexican Restaurant,Exhibit,Fish Market,...,Fast Food Restaurant,Farmers Market,Zoo Exhibit,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop,Dog Run,Fishing Store
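
The column labels above rely on a three-element suffix list with a fallback to "th", which is correct for ranks 1 to 20 but would produce "21th" for larger values. A small helper that covers the general case could look like the sketch below; `ordinal` is a hypothetical function name, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th and the rest of the teens always take 'th'.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


labels = [f"{ordinal(i)} Most Common Venue" for i in range(1, 21)]
print(labels[:4])
print(ordinal(21), ordinal(112))
```

For the 20 columns used here, the labels match the ones generated by the cell above.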


In [42]:
kclusters = 4

a = cluster.drop(columns='Arena')  # positional axis argument is not accepted by recent pandas
kmeans = KMeans(n_clusters=kclusters, random_state=4).fit(a)
kmeans.labels_[0:10] 

array([2, 0, 2, 2, 2, 2, 0, 2, 0, 1])

In [43]:
arena_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

In [44]:
arena_venues_sorted.shape

(32, 22)

In [45]:
df.head()

Unnamed: 0,Arena,Location,Team,Capacity,Opened,Season of first NHL game,Latitude,Longitude
0,Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,19092,1996,1996–97,27.942704,-82.45189
1,American Airlines Center,"Dallas, Texas",Dallas Stars,18532,2001,2001–02,32.790508,-96.810272
2,Ball Arena,"Denver, Colorado",Colorado Avalanche,17809,1999,1999–2000,39.748684,-105.007544
3,BB&T Center,"Sunrise, Florida",Florida Panthers,19250,1998,1998–99,26.15837,-80.325429
4,Bell Centre,"Montreal, Quebec",Montreal Canadiens,21302,1996,1995–96,45.496092,-73.569278


In [46]:
df_merged = df
df_merged = pd.merge(left = df_merged, right = arena_venues_sorted, on='Arena',how = "left")
df_merged.head()

Unnamed: 0,Arena,Location,Team,Capacity,Opened,Season of first NHL game,Latitude,Longitude,Cluster Labels,1st Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Amalie Arena,"Tampa, Florida",Tampa Bay Lightning,19092,1996,1996–97,27.942704,-82.45189,2,Brewery,...,Fishing Store,Fish Market,Zoo Exhibit,Farmers Market,Food Truck,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop
1,American Airlines Center,"Dallas, Texas",Dallas Stars,18532,2001,2001–02,32.790508,-96.810272,0,Coffee Shop,...,Japanese Restaurant,Music Venue,American Restaurant,Exhibit,Dosa Place,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Escape Room
2,Ball Arena,"Denver, Colorado",Colorado Avalanche,17809,1999,1999–2000,39.748684,-105.007544,2,Pizza Place,...,Fast Food Restaurant,Escape Room,Exhibit,Food,Electronics Store,Dumpling Restaurant,Dosa Place,Dog Run,Flower Shop,Zoo Exhibit
3,BB&T Center,"Sunrise, Florida",Florida Panthers,19250,1998,1998–99,26.15837,-80.325429,2,American Restaurant,...,Italian Restaurant,Farmers Market,Fast Food Restaurant,Exhibit,Fish Market,Escape Room,Fishing Store,Flower Shop,Electronics Store,Dumpling Restaurant
4,Bell Centre,"Montreal, Quebec",Montreal Canadiens,21302,1996,1995–96,45.496092,-73.569278,2,Restaurant,...,Fast Food Restaurant,Farmers Market,Zoo Exhibit,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop,Dog Run,Fishing Store


In [47]:
df_merged.shape

(32, 29)

In [48]:
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
rainbow

['#8000ff', '#2adddd', '#d4dd80', '#ff0000']

In [49]:
cluster_map = folium.Map(location=[latitude2, longitude2], zoom_start = 4,width = 1000)

# Use cluster_label as the loop variable so the `cluster` DataFrame is not overwritten
for lat, lon, arena, team, cluster_label in zip(df_merged['Latitude'], df_merged['Longitude'], df_merged['Arena'],df_merged["Team"], df_merged['Cluster Labels']):
    label = folium.Popup(str(arena) +", Team: "+team+ ', Cluster: ' + str(cluster_label), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster_label)],
        fill=True,
        fill_color=rainbow[int(cluster_label)],
        fill_opacity=0.7).add_to(cluster_map)
       
cluster_map

In [50]:
df_merged.loc[df_merged['Cluster Labels'] == 0, df_merged.columns[[1] + list(range(8, df_merged.shape[1]))]]

Unnamed: 0,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
1,"Dallas, Texas",0,Coffee Shop,Brewery,Park,Bar,Sports Bar,Shopping Mall,Seafood Restaurant,Tex-Mex Restaurant,...,Japanese Restaurant,Music Venue,American Restaurant,Exhibit,Dosa Place,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Escape Room
6,"Winnipeg, Manitoba",0,Brewery,Garden Center,Modern European Restaurant,Deli / Bodega,Ice Cream Shop,Indian Restaurant,Vietnamese Restaurant,Breakfast Spot,...,Sandwich Place,Thai Restaurant,Cocktail Bar,Coffee Shop,Zoo Exhibit,Fish Market,Farmers Market,Fish & Chips Shop,Fast Food Restaurant,Flower Shop
8,"Washington, D.C.",0,Art Museum,Coffee Shop,Café,Wine Bar,Poke Place,Ice Cream Shop,Indian Restaurant,Sculpture Garden,...,Food Truck,Restaurant,Market,Garden,Zoo Exhibit,Fast Food Restaurant,Escape Room,Farmers Market,Exhibit,Fish Market
14,"Detroit, Michigan",0,Coffee Shop,Plaza,Mediterranean Restaurant,Greek Restaurant,Farmers Market,Flower Shop,Fishing Store,Fish Market,...,Zoo Exhibit,Exhibit,Food & Drink Shop,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop,Food,Food Truck
16,"Columbus, Ohio",0,Coffee Shop,Pizza Place,Bar,Brewery,Ice Cream Shop,Restaurant,Market,Theater,...,Flower Shop,Fishing Store,Fish Market,Fish & Chips Shop,Donut Shop,Farmers Market,Food,Dosa Place,Escape Room,Electronics Store
18,"Pittsburgh, Pennsylvania",0,Coffee Shop,History Museum,Scenic Lookout,Thai Restaurant,Fish Market,Diner,Brewery,Korean Restaurant,...,Performing Arts Venue,Plaza,Argentinian Restaurant,Pub,Exhibit,Fast Food Restaurant,Farmers Market,Zoo Exhibit,Escape Room,Electronics Store
24,"Calgary, Alberta",0,Zoo Exhibit,Salad Place,Bakery,Brewery,Burger Joint,Coffee Shop,Exhibit,French Restaurant,...,Pizza Place,Pub,Restaurant,Hotel Bar,Trail,Tex-Mex Restaurant,Grocery Store,Cuban Restaurant,Deli / Bodega,Dessert Shop
26,"Boston, Massachusetts",0,Coffee Shop,Brewery,Bakery,Concert Hall,Wine Shop,Wine Bar,Park,Ice Cream Shop,...,Beer Garden,Gastropub,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Zoo Exhibit,Farmers Market,Flower Shop,Exhibit,Escape Room
29,"Chicago, Illinois",0,Coffee Shop,Hotel,Theater,Rock Club,Yoga Studio,Deli / Bodega,Grocery Store,BBQ Joint,...,Liquor Store,Beer Bar,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Zoo Exhibit,Farmers Market,Flower Shop,Exhibit,Escape Room


In [51]:
df_merged.loc[df_merged['Cluster Labels'] == 1, df_merged.columns[[1] + list(range(8, df_merged.shape[1]))]]

Unnamed: 0,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
9,"Seattle, Washington",1,Karaoke Bar,Speakeasy,Park,Dumpling Restaurant,Vietnamese Restaurant,Pizza Place,French Restaurant,Sushi Restaurant,...,Beach,Cocktail Bar,Fast Food Restaurant,Fish Market,Fish & Chips Shop,Zoo Exhibit,Flower Shop,Farmers Market,Exhibit,Escape Room
20,"Vancouver, British Columbia",1,Hotel,Park,Bakery,Seafood Restaurant,Boxing Gym,Cocktail Bar,Lingerie Store,Gastropub,...,Waterfront,Dessert Shop,Farmers Market,Exhibit,Fast Food Restaurant,Fish & Chips Shop,Escape Room,Electronics Store,Fish Market,Dumpling Restaurant
22,"San Jose, California",1,Bar,Mexican Restaurant,Hotel,Art Gallery,Gastropub,Cocktail Bar,Pub,Donut Shop,...,Exhibit,Food & Drink Shop,Dumpling Restaurant,Fast Food Restaurant,Dosa Place,Fish & Chips Shop,Fish Market,Fishing Store,Flower Shop,Food
25,"Los Angeles, California",1,Bar,French Restaurant,Speakeasy,Food Truck,Lounge,Taco Place,Roof Deck,Korean Restaurant,...,Home Service,Gym / Fitness Center,Whisky Bar,Electronics Store,Escape Room,Dumpling Restaurant,Flower Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop


In [52]:
df_merged.loc[df_merged['Cluster Labels'] == 2, df_merged.columns[[1] + list(range(8, df_merged.shape[1]))]]

Unnamed: 0,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Tampa, Florida",2,Brewery,Ice Cream Shop,Italian Restaurant,Karaoke Bar,Board Shop,Fish & Chips Shop,Fast Food Restaurant,Food & Drink Shop,...,Fishing Store,Fish Market,Zoo Exhibit,Farmers Market,Food Truck,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop
2,"Denver, Colorado",2,Pizza Place,Diner,Donut Shop,Café,Lounge,Vegetarian / Vegan Restaurant,Farmers Market,Fishing Store,...,Fast Food Restaurant,Escape Room,Exhibit,Food,Electronics Store,Dumpling Restaurant,Dosa Place,Dog Run,Flower Shop,Zoo Exhibit
3,"Sunrise, Florida",2,American Restaurant,Park,Shopping Mall,Café,Breakfast Spot,Caribbean Restaurant,Clothing Store,Liquor Store,...,Italian Restaurant,Farmers Market,Fast Food Restaurant,Exhibit,Fish Market,Escape Room,Fishing Store,Flower Shop,Electronics Store,Dumpling Restaurant
4,"Montreal, Quebec",2,Restaurant,Sandwich Place,Poutine Place,Café,Tea Room,Coffee Shop,Mexican Restaurant,Exhibit,...,Fast Food Restaurant,Farmers Market,Zoo Exhibit,Escape Room,Electronics Store,Dumpling Restaurant,Dosa Place,Donut Shop,Dog Run,Fishing Store
5,"Nashville, Tennessee",2,Food Truck,Whisky Bar,Burger Joint,Vegetarian / Vegan Restaurant,Coffee Shop,Dosa Place,Dumpling Restaurant,Electronics Store,...,Food & Drink Shop,Exhibit,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Fishing Store,Dog Run,Flower Shop,Food
7,"Ottawa, Ontario",2,Chinese Restaurant,Food Truck,Sports Club,Electronics Store,Arts & Crafts Store,Asian Restaurant,Italian Restaurant,Greek Restaurant,...,Korean Restaurant,Optical Shop,Fast Food Restaurant,Fish Market,Fish & Chips Shop,Exhibit,Farmers Market,Flower Shop,Escape Room,Dumpling Restaurant
10,"St. Louis, Missouri",2,Steakhouse,Korean Restaurant,Donut Shop,Sandwich Place,Chinese Restaurant,Breakfast Spot,Bar,American Restaurant,...,History Museum,Dosa Place,Dumpling Restaurant,Electronics Store,Escape Room,Exhibit,Farmers Market,Hawaiian Restaurant,Fast Food Restaurant,Fish & Chips Shop
11,"Glendale, Arizona",2,Golf Course,American Restaurant,Mexican Restaurant,Playground,Clothing Store,Outdoor Sculpture,Construction & Landscaping,Restaurant,...,Vehicle Inspection Station,Cosmetics Shop,Hawaiian Restaurant,Fishing Store,Electronics Store,Fish Market,Dumpling Restaurant,Escape Room,Exhibit,Farmers Market
12,"Anaheim, California",2,Gift Shop,Gastropub,Pizza Place,Photography Studio,Ice Cream Shop,Candy Store,Vietnamese Restaurant,Plaza,...,Electronics Store,Farmers Market,Exhibit,Escape Room,Zoo Exhibit,Dumpling Restaurant,Dosa Place,Fish & Chips Shop,Donut Shop,Dog Run
13,"Buffalo, New York",2,Middle Eastern Restaurant,Fishing Store,Playground,Harbor / Marina,Café,Vietnamese Restaurant,Greek Restaurant,French Restaurant,...,Fish & Chips Shop,Fast Food Restaurant,Zoo Exhibit,Exhibit,Escape Room,Flower Shop,Dumpling Restaurant,Dosa Place,Donut Shop,Dog Run


In [53]:
df_merged.loc[df_merged['Cluster Labels'] == 3, df_merged.columns[[1] + list(range(8, df_merged.shape[1]))]]

Unnamed: 0,Location,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
27,"Paradise, Nevada",3,Nightclub,Resort,Japanese Restaurant,Shopping Mall,Seafood Restaurant,Lounge,Botanical Garden,Taco Place,...,Comedy Club,Electronics Store,Fishing Store,Escape Room,Exhibit,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Dumpling Restaurant,Dosa Place


## Results and Discussion
#### First, the project answered the initial question: what are the top picks near each arena? The top picks are based on information from the Foursquare API. Second, the clustering shows that the surroundings of some arenas differ markedly from others. For example, the Vegas Golden Knights' arena in Paradise, Nevada is the sole member of its cluster in the output above, and several West Coast arenas (Seattle, Vancouver, San Jose, and Los Angeles) form a cluster separate from the rest. These differences may be rooted in local culture. For further study, it would be worthwhile to add features such as the distance from each arena to downtown.

## Conclusion
#### The project achieved its goal. The results could also help the businesses behind these nearby top picks market themselves to sports tourists.