# Segmenting and Clustering Mining Towns in Canada (Capstone Project)


## Table of contents
* [Introduction: Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)


## Introduction: Problem <a name="introduction"></a>

Minerals and metals are the building blocks of modern society, and mining is one of the highest-paying industries in Canada. However, mining jobs exist only where there is a mine, so mining professionals usually have to move to a mining town or city, often in a remote region. Despite the progress made in giving miners fair living conditions, mining towns are not always enjoyable places to live.
    
In this project, I will **explore and categorize the condition of the mining towns in Canada.** Each town will be categorized based on its amenities. This will help mining professionals to make an informed choice on their work locations.
    

## Data <a name="data"></a>
Based on the problem definition, the factors that will influence our decision are:
- Number of amenities
- Type of amenities

The following data sources will be used to support my analysis.

1: Use the ArcGIS REST API to generate a dataframe of the mines as my initial starting point.
- Currently operating mines and their locations: ArcGIS REST API
     - This layer is maintained by the Government of Canada
     - https://maps-cartes.services.geo.ca/server_serveur/rest/services/NRCan/900A_and_top_100_en/MapServer
     - https://open.canada.ca/data/en/dataset/000183ed-8864-42f0-ae43-c4313a860720#wb-auto-6

2: Use the geocode package to find the address of each mine and its associated town.

3: Use the town data to call the Foursquare API to extract the amenities data.


In [1]:
'''
import all necessary libraries
'''

#Data processing libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

#Geo Data processing libraries
from pyproj import Transformer #Converting Spatial Reference 
from geopy.geocoders import Nominatim #Geo location search
import geocoder # import geocoder

# library to handle requests
import requests 

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import seaborn as sns

# map rendering library
import folium 

### Using the ArcGIS REST API to obtain all the producing mines and their geolocations
The mine locations are obtained from a layer maintained by the Canadian government.


In [2]:
dict_MineType = {'Metal mines':4,'Nonmetal mines':5,'Coal mines':6,'Oil sands mines':7}
mine_list=[]

for key,value in dict_MineType.items():
    url = ("https://maps-cartes.services.geo.ca/server_serveur/rest/services/NRCan/900A_and_top_100_en/MapServer/{}/query?where=&text=%25&objectIds=&time=&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&relationParam=&outFields=&returnGeometry=true&returnTrueCurves=false&maxAllowableOffset=&geometryPrecision=&outSR=&having=&returnIdsOnly=false&returnCountOnly=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&returnZ=false&returnM=false&gdbVersion=&historicMoment=&returnDistinctValues=false&resultOffset=&resultRecordCount=&queryByDistance=&returnExtentOnly=false&datumTransformation=&parameterValues=&rangeValues=&quantizationParameters=&featureEncoding=esriDefault&f=pjson".format(value))
    results = requests.get(url).json()
    for j in results['features']:
        mine_list.append([j['attributes']['operation_name_En'],j['geometry']['x'],j['geometry']['y'],key,value])
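
The query URL above carries many empty parameters. As a minimal sketch, the same request could be assembled from a small dictionary with the standard library. `build_query_url` is a hypothetical helper that is not part of the notebook, and the sketch assumes the service treats omitted parameters the same way as the empty ones in the original URL, which has not been verified here.

```python
from urllib.parse import urlencode

BASE_URL = ("https://maps-cartes.services.geo.ca/server_serveur/rest/services/"
            "NRCan/900A_and_top_100_en/MapServer")


def build_query_url(layer_id, **overrides):
    """Build a layer query URL (hypothetical helper, not part of the notebook)."""
    params = {
        "text": "%",               # same wildcard as text=%25 in the original URL
        "returnGeometry": "true",  # include the x/y point of each mine
        "f": "pjson",              # same response format as the original URL
    }
    params.update(overrides)
    # urlencode percent-escapes the values, so "%" becomes "%25"
    return "{}/{}/query?{}".format(BASE_URL, layer_id, urlencode(params))


url = build_query_url(4)  # layer 4 is 'Metal mines' in dict_MineType
print(url)
```

Keeping the parameters in a dictionary makes it easier to see which ones actually matter for the request.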


In [3]:
df_Mine = pd.DataFrame(mine_list,columns=['Mine Name','x','y','Mine Type','Mine Type ID'])

In [4]:
#Converting projected coordinates (EPSG:3978) to lat/lon (EPSG:4326) using pyproj
transformer = Transformer.from_crs('epsg:3978', 'epsg:4326')
#With pyproj's default axis order, EPSG:4326 results come back as (latitude, longitude)
lat, lon = transformer.transform(df_Mine['x'].tolist(), df_Mine['y'].tolist())
df_Mine['Latitude'] = lat
df_Mine['Longitude'] = lon

In [5]:
df_Mine.tail(5)

Unnamed: 0,Mine Name,x,y,Mine Type,Mine Type ID,Latitude,Longitude
204,Aurora North and South,-957298.573792,1037388.0,Oil sands mines,7,57.313399,-111.458001
205,Mildred Lake,-966517.171387,1004577.0,Oil sands mines,7,57.000001,-111.466707
206,Muskeg River,-962235.637512,1031173.0,Oil sands mines,7,57.2465,-111.512003
207,Fort Hills,-961488.884705,1047701.0,Oil sands mines,7,57.395199,-111.572001
208,Horizon,-974605.241577,1044809.0,Oil sands mines,7,57.338233,-111.774948


In [6]:
df_Mine['marker_color'] = pd.cut(df_Mine['Mine Type ID'], bins=4, labels=['yellow', 'green', 'blue', 'red'])

### Visualize all the producing mines in Canada
- Yellow : Metal mines
- Green : Nonmetal mines
- Blue : Coal mines
- Red : Oil sands mines
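
The `pd.cut` call gives the colours listed above only because the four layer IDs are consecutive, so each one lands in its own equal-width bin. A minimal sketch of a less fragile alternative is an explicit lookup. `MINE_TYPE_COLORS` and `df_demo` are illustrative names, and the toy frame only stands in for `df_Mine`.

```python
import pandas as pd

# Explicit layer-ID -> colour lookup (same pairs as the legend above)
MINE_TYPE_COLORS = {4: 'yellow', 5: 'green', 6: 'blue', 7: 'red'}

# Toy frame standing in for df_Mine
df_demo = pd.DataFrame({'Mine Type ID': [4, 7, 5, 6, 4]})

# map() looks each ID up in the dictionary; unknown IDs would become NaN
df_demo['marker_color'] = df_demo['Mine Type ID'].map(MINE_TYPE_COLORS)
print(df_demo)
```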

In [7]:
# create map
map_mine_in_canada = folium.Map(location=[57, -91], zoom_start=4)

# add markers to the map
for lat, lon, poi, mcolor,mtype in zip(df_Mine['Latitude'], df_Mine['Longitude'],df_Mine['Mine Name'], df_Mine['marker_color'],df_Mine['Mine Type']):
    label = folium.Popup(str(poi) + ' ' + str(mtype))
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=mcolor,
        fill=True,
        fill_color=mcolor,
        fill_opacity=0.7).add_to(map_mine_in_canada)
    
map_mine_in_canada

### Using the geocode package to find the address of each mine and its associated town

In [8]:
#https://developers.arcgis.com/rest/geocode/api-reference/geocoding-reverse-geocode.htm
#https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/reverseGeocode?f=pjson&featureTypes=&location=-111.340000,56.871900
l_results = []
for mine_name,lat, lonin in zip(df_Mine['Mine Name'],df_Mine['Latitude'], df_Mine['Longitude']):
    arcgis_url = ("https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/reverseGeocode?f=pjson&featureTypes=&location={},{}".format(lonin,lat))
    results = requests.get(arcgis_url).json()
    l_results.append([mine_name,results['address']['LongLabel'],results['address']['City'],results['address']['MetroArea'],results['address']['Subregion'],results['address']['Region'],results['address']['Territory']])
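
The loop above indexes `results['address']` directly, which raises a `KeyError` if a response comes back without an address. A minimal sketch of a more forgiving extraction follows. `extract_address` is a hypothetical helper, and both sample responses are hand-written stand-ins; the shape of the error response in particular is an assumption.

```python
ADDRESS_FIELDS = ('LongLabel', 'City', 'MetroArea', 'Subregion', 'Region', 'Territory')


def extract_address(response, fields=ADDRESS_FIELDS):
    """Return the requested address fields, using None for anything missing."""
    address = response.get('address') or {}
    return [address.get(field) for field in fields]


# Hand-written stand-ins for a successful and an unsuccessful lookup
found = {'address': {'City': 'Kamloops', 'Region': 'British Columbia'}}
not_found = {'error': {'code': 400, 'message': 'no address found'}}

print(extract_address(found))
print(extract_address(not_found))
```

Rows with missing fields can then be inspected or dropped after the dataframe is built, instead of stopping the loop partway through.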

In [9]:
df_Mine_Town = pd.DataFrame(l_results,columns=['Mine Name','LongLabel','City','MetroArea','Subregion','Region','Territory'])

In [10]:
df_Mine['Town address'] = df_Mine_Town['City'] +' , ' +df_Mine_Town['Region']

In [11]:
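l_town_lat = []
l_town_long = []
# create the geocoder once instead of on every iteration
address_converter = Nominatim(user_agent = 'foursquare_agent')
# iterate over the addresses directly; zip() on a single column yields 1-tuples
for town_address in df_Mine['Town address']:
    # note: geocode() returns None when no match is found
    location = address_converter.geocode(town_address)
    latitude = location.latitude
    longitude = location.longitude
    l_town_lat.append(latitude)
    l_town_long.append(longitude)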
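
Several mines share the same town address (the oil sands mines, for example, all resolve to Wood Buffalo, Alberta), so the loop above repeats identical lookups. A minimal sketch of geocoding each distinct address only once follows. `geocode_unique` is a hypothetical helper and `fake_geocode` is a stand-in for the real geocoder, so the coordinates it returns are dummies.

```python
def geocode_unique(addresses, geocode_fn):
    """Call geocode_fn once per distinct address and return one result per input."""
    cache = {}
    for address in addresses:
        if address not in cache:
            cache[address] = geocode_fn(address)
    return [cache[address] for address in addresses]


calls = []


def fake_geocode(address):
    """Stand-in geocoder that records each call and returns dummy coordinates."""
    calls.append(address)
    return (float(len(address)), -float(len(address)))


addresses = ['Wood Buffalo , Alberta',
             'Kamloops , British Columbia',
             'Wood Buffalo , Alberta']
coords = geocode_unique(addresses, fake_geocode)
print(len(calls), coords)
```

With the real geocoder, `geocode_fn` would be a small wrapper around `address_converter.geocode` that returns the latitude and longitude.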

In [12]:
df_Town_Coord = pd.DataFrame({'lat':l_town_lat,'long':l_town_long})
df_Mine['Town lat'] = df_Town_Coord['lat']
df_Mine['Town long'] = df_Town_Coord['long']

In [13]:
df_Mine.head()

Unnamed: 0,Mine Name,x,y,Mine Type,Mine Type ID,Latitude,Longitude,marker_color,Town address,Town lat,Town long
0,New Afton,-1745331.0,538990.0,Metal mines,4,50.661,-120.514,yellow,"Kamloops , British Columbia",50.675827,-120.339415
1,Copper Mountain,-1804148.0,403843.9,Metal mines,4,49.331001,-120.533002,yellow,"Fraser Valley , British Columbia",59.715297,-135.047995
2,Highland Valley,-1787566.0,535991.6,Metal mines,4,50.4855,-121.0483,yellow,"Logan Lake , British Columbia",50.494463,-120.813366
3,Gibraltar,-1773727.0,776445.3,Metal mines,4,52.528495,-122.28716,yellow,"Cariboo , British Columbia",49.247779,-122.889774
4,Mount Milligan,-1754343.0,1081143.0,Metal mines,4,55.118,-124.031,yellow,"Bulkley-Nechako , British Columbia",54.531617,-125.605626


### Using the town data to call the Foursquare API to extract the amenities data.
Here we will write a function around the Foursquare API and call it with the dataframe we built earlier.

In [14]:
import config #Foursquare API credentials
VERSION = '20180604'
LIMIT = 1000

In [15]:
def getNearbyVenues(names, latitudes, longitudes, radius=50000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
                   
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            config.FSQ_CLIENT_ID, 
            config.FSQ_CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    return(nearby_venues)
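
The list comprehension in `getNearbyVenues` assumes every venue has at least one category and a location. A minimal sketch of parsing the items more defensively follows. `parse_explore_items` is a hypothetical helper, the `'Uncategorized'` label is an assumption, and the two items are hand-written stand-ins that only mimic the fields the function reads.

```python
def parse_explore_items(name, lat, lng, items):
    """Turn a list of explore items into rows, tolerating missing fields."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        # fall back to a placeholder label when the category list is empty
        category = categories[0]['name'] if categories else 'Uncategorized'
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows


# Hand-written stand-ins for two items of a response
items = [
    {'venue': {'name': 'Cafe A', 'location': {'lat': 50.67, 'lng': -120.34},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Unnamed Spot', 'location': {'lat': 50.68, 'lng': -120.35},
               'categories': []}},
]
rows = parse_explore_items('New Afton', 50.675827, -120.339415, items)
print(rows)
```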

In [16]:
Mine_venues = getNearbyVenues(names=df_Mine['Mine Name'],
                                   latitudes=df_Mine['Town lat'],
                                   longitudes=df_Mine['Town long']
                                  )

## Methodology <a name="methodology"></a>
- Part 1 : Examine the mine data and the venue data together
- Part 2 : Look at each mine and its top 5 venue categories
- Part 3 : Look at each mine and its most common venues
- Part 4 : Use k-means to segment and cluster the mines based on the town venues

### Part 1 : Examine the mine data and the venue data together

In [17]:
# one hot encoding
Mine_onehot = pd.get_dummies(Mine_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Mine_onehot['Neighborhood'] = Mine_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Mine_onehot.columns[-1]] + list(Mine_onehot.columns[:-1])
Mine_onehot = Mine_onehot[fixed_columns]

Mine_grouped = Mine_onehot.groupby('Neighborhood').sum().reset_index()
Mine_grouped.head()

Unnamed: 0,Neighborhood,Zoo,ATM,Accessories Store,Airport,Airport Lounge,Airport Terminal,American Restaurant,Antique Shop,Apres Ski Bar,...,Warehouse Store,Water Park,Waterfall,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Yoga Studio
0,4J,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,777,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Acton Vale,0,0,0,0,0,0,1,0,0,...,1,0,0,0,0,0,1,0,0,0
3,Allan,1,0,0,0,0,0,1,0,0,...,1,0,0,0,0,0,0,0,0,0
4,Amaranth,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
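
To make the one-hot and group-by step concrete, here is a minimal sketch on a toy table. The mine names and venues are invented for illustration and are not taken from the Foursquare results.

```python
import pandas as pd

# Toy venue list: one row per venue found near a mine's town
venues = pd.DataFrame({
    'Neighborhood': ['Mine A', 'Mine A', 'Mine A', 'Mine B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Hotel', 'Gas Station'],
})

# One indicator column per category, with a 1 in the row's own category
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# Summing the indicators per mine gives the number of venues in each category
counts = onehot.groupby('Neighborhood').sum().reset_index()
print(counts)
```

Each row of `counts` is the kind of per-mine count vector that the later steps rank and cluster.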


Using the clustergrammer2 library, we can visualize the 200 by 300 matrix. This helps us explore the data quickly and identify trends.

In [51]:
Mine_grouped_copy = Mine_grouped.copy()
Mine_grouped_copy = Mine_grouped_copy.set_index('Neighborhood')

# import the widget
from clustergrammer2 import net

net1 = net

net1.load_df(Mine_grouped_copy)

net1.widget()


### Part 2 : Look at each mine and its top 5 venue categories

In [18]:
num_top_venues = 5

for hood in Mine_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Mine_grouped[Mine_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----4J----
                  venue  freq
0           Coffee Shop   4.0
1  Fast Food Restaurant   4.0
2         Grocery Store   3.0
3     Convenience Store   2.0
4        History Museum   2.0


----777----
                  venue  freq
0         Big Box Store   1.0
1           Gas Station   1.0
2  Fast Food Restaurant   1.0
3  Kitchen Supply Store   1.0
4           Outlet Mall   0.0


----Acton Vale----
                  venue  freq
0           Coffee Shop   7.0
1  Fast Food Restaurant   6.0
2        Sandwich Place   5.0
3                 Hotel   5.0
4            Restaurant   5.0


----Allan----
            venue  freq
0     Coffee Shop   5.0
1            Café   3.0
2  Breakfast Spot   3.0
3       Bookstore   3.0
4  Ice Cream Shop   3.0


----Amaranth----
                        venue  freq
0                 Gas Station   1.0
1                       Hotel   1.0
2  Construction & Landscaping   1.0
3             Automotive Shop   1.0
4                         Zoo   0.0


----Amaruq----
  

4       Paintball Field   0.0


----Diavik----
                          venue  freq
0  Theme Park Ride / Attraction  15.0
1                         Hotel  14.0
2                          Park   4.0
3                   Coffee Shop   4.0
4                      Mountain   4.0


----Donkin----
                venue  freq
0                Café   5.0
1              Bakery   4.0
2         Coffee Shop   4.0
3  Seafood Restaurant   4.0
4      Scenic Lookout   3.0


----Dundas----
           venue  freq
0           Park  19.0
1  Grocery Store  13.0
2     Restaurant  10.0
3           Café   9.0
4         Bakery   9.0


----Eagle (Dublin Gulch)----
                     venue  freq
0              Coffee Shop   5.0
1                    Hotel   3.0
2            Grocery Store   2.0
3  New American Restaurant   2.0
4     Fast Food Restaurant   2.0


----Eagle River----
                   venue  freq
0                  Trail   1.0
1              Nightclub   0.0
2        Paintball Field   0.0
3         

                  venue  freq
0          Liquor Store   1.0
1  Fast Food Restaurant   1.0
2         Grocery Store   1.0
3            Restaurant   1.0
4           Pizza Place   1.0


----Lac des Iles----
                   venue  freq
0                  Trail   1.0
1              Nightclub   0.0
2        Paintball Field   0.0
3            Outlet Mall   0.0
4  Outdoors & Recreation   0.0


----Lac-des-Îles----
                  venue  freq
0           Coffee Shop   3.0
1  Fast Food Restaurant   3.0
2         Grocery Store   2.0
3           Gas Station   2.0
4            Restaurant   2.0


----Lalor Lake----
             venue  freq
0    Moving Target   1.0
1          Airport   1.0
2             Lake   1.0
3       Non-Profit   0.0
4  Paintball Field   0.0


----Lamaque----
                  venue  freq
0           Gas Station   5.0
1         Grocery Store   3.0
2  Fast Food Restaurant   2.0
3           Coffee Shop   2.0
4                 Hotel   2.0


----Langlois----
                    

4        Hotel   2.0


----Renard----
                  venue  freq
0         Grocery Store   8.0
1           Coffee Shop   6.0
2  Fast Food Restaurant   6.0
3              Pharmacy   6.0
4                  Café   4.0


----Rocanville----
                  venue  freq
0           Gas Station   3.0
1           Coffee Shop   3.0
2        Ice Cream Shop   2.0
3  Fast Food Restaurant   2.0
4                   Inn   1.0


----Saint-Armand----
                 venue  freq
0  American Restaurant   5.0
1                Hotel   4.0
2              Brewery   4.0
3        Grocery Store   3.0
4           Restaurant   3.0


----Saint-Basile----
                  venue  freq
0         Grocery Store   7.0
1                  Park   6.0
2  Fast Food Restaurant   6.0
3            Restaurant   5.0
4                  Café   5.0


----Saint-Modeste----
                  venue  freq
0           Gas Station   8.0
1           Coffee Shop   5.0
2                 Hotel   4.0
3  Fast Food Restaurant   4.0
4      

### Part 3 : Look at each mine and its most common venues

In [19]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
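
Parts 2 and 3 can list tied categories in different orders (for mine 4J, Coffee Shop and Fast Food Restaurant swap places), most likely because the default sort is not stable. A minimal sketch of an alternative uses `Series.nlargest`, which keeps the first occurrence among ties. `top_venues` and the toy counts are illustrative, and the row must be numeric, so the mine name is kept in the index here.

```python
import pandas as pd

# Toy count table with the mine name in the index so each row is purely numeric
counts = pd.DataFrame(
    {'Coffee Shop': [4, 0], 'Fast Food Restaurant': [4, 1],
     'Grocery Store': [3, 0], 'Zoo': [0, 1]},
    index=pd.Index(['Mine A', 'Mine B'], name='Neighborhood'))


def top_venues(row, num_top_venues):
    """Return the category names with the largest counts, first occurrence wins ties."""
    return row.nlargest(num_top_venues).index.tolist()


top_a = top_venues(counts.loc['Mine A'], 3)
print(top_a)
```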

In [20]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # past 'st', 'nd', 'rd' the suffix falls back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Mine_grouped['Neighborhood']

for ind in np.arange(Mine_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Mine_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,4J,Fast Food Restaurant,Coffee Shop,Grocery Store,Pub,History Museum,Ski Lodge,Convenience Store,Resort,Café,New American Restaurant
1,777,Kitchen Supply Store,Fast Food Restaurant,Gas Station,Big Box Store,Yoga Studio,Fabric Shop,Duty-free Shop,Eastern European Restaurant,Electronics Store,Escape Room
2,Acton Vale,Coffee Shop,Fast Food Restaurant,Hotel,Sandwich Place,Restaurant,Brewery,Grocery Store,Pharmacy,Park,Ice Cream Shop
3,Allan,Coffee Shop,Breakfast Spot,Ice Cream Shop,Bookstore,Pub,Restaurant,Bakery,Café,Sandwich Place,Grocery Store
4,Amaranth,Construction & Landscaping,Automotive Shop,Gas Station,Hotel,Hockey Arena,Exhibit,Drugstore,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant
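
The column labels above rely on an out-of-range index to switch to the 'th' suffix, which works for ten columns but would mislabel larger ranks such as 21. A minimal sketch of a small ordinal helper follows. `ordinal` is a hypothetical function that is not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th and 13th are irregular
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


num_top_venues = 10
columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i))
                              for i in range(1, num_top_venues + 1)]
print(columns)
```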


### Part 4 : Use k-means to segment and cluster the mines based on the town venues

In [21]:
# import k-means from scikit-learn
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

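# drop the label column by name; the positional axis argument no longer works in recent pandas
Mine_grouped_clustering = Mine_grouped.drop(columns='Neighborhood')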

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Mine_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# rename on a copy so that df_Mine itself keeps its 'Mine Name' column
Mine_merged = df_Mine.rename(columns={"Mine Name": "Neighborhood"})

# merge the mine data with the sorted venues to add the cluster label and top venues for each mine
Mine_merged = pd.merge(Mine_merged, neighborhoods_venues_sorted, how="inner", on=["Neighborhood"])
Mine_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,x,y,Mine Type,Mine Type ID,Latitude,Longitude,marker_color,Town address,Town lat,...,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,New Afton,-1745331.0,538990.0,Metal mines,4,50.661,-120.514,yellow,"Kamloops , British Columbia",50.675827,...,Hotel,Gas Station,Coffee Shop,American Restaurant,Japanese Restaurant,Sushi Restaurant,Restaurant,Breakfast Spot,Brewery,Café
1,Copper Mountain,-1804148.0,403843.9,Metal mines,4,49.331001,-120.533002,yellow,"Fraser Valley , British Columbia",59.715297,...,Cruise Ship,Coffee Shop,Seafood Restaurant,Café,Train Station,Boat or Ferry,Breakfast Spot,Electronics Store,Tourist Information Center,Bar
2,Highland Valley,-1787566.0,535991.6,Metal mines,4,50.4855,-121.0483,yellow,"Logan Lake , British Columbia",50.494463,...,Gas Station,Coffee Shop,Sushi Restaurant,Restaurant,Café,Hotel,Brewery,Bed & Breakfast,Inn,Breakfast Spot
3,Gibraltar,-1773727.0,776445.3,Metal mines,4,52.528495,-122.28716,yellow,"Cariboo , British Columbia",49.247779,...,Park,Coffee Shop,Brewery,Hotel,Trail,Ice Cream Shop,Market,Bakery,Lake,Dessert Shop
4,Mount Milligan,-1754343.0,1081143.0,Metal mines,4,55.118,-124.031,yellow,"Bulkley-Nechako , British Columbia",54.531617,...,Fast Food Restaurant,Sandwich Place,Warehouse Store,Yoga Studio,Escape Room,Drive-in Theater,Drugstore,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant
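
The notebook fixes `kclusters = 5` without showing how that value was chosen. One common check, sketched below on synthetic data, is to compare silhouette scores across several values of k. The three blobs are generated purely for illustration and say nothing about the mine data; with the real data, `X` would be `Mine_grouped_clustering`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs of 30 points each
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [10, 10], [20, 0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest score is the best-separated clustering among those tried
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

A higher silhouette score means points sit closer to their own cluster than to the nearest other one, so the score gives a rough basis for comparing candidate values of k.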


In [63]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

import folium # map rendering library


# create map
map_clusters = folium.Map(location=[57, -91], zoom_start=4)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Mine_merged['Town lat'], Mine_merged['Town long'], Mine_merged['Neighborhood'], Mine_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters


In [64]:
from branca.element import Template, MacroElement

template = """
{% macro html(this, kwargs) %}

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>jQuery UI Draggable - Default functionality</title>
  <link rel="stylesheet" href="//code.jquery.com/ui/1.12.1/themes/base/jquery-ui.css">

  <script src="https://code.jquery.com/jquery-1.12.4.js"></script>
  <script src="https://code.jquery.com/ui/1.12.1/jquery-ui.js"></script>
  
  <script>
  $( function() {
    $( "#maplegend" ).draggable({
                    start: function (event, ui) {
                        $(this).css({
                            right: "auto",
                            top: "auto",
                            bottom: "auto"
                        });
                    }
                });
});

  </script>
</head>
<body>

 
<div id='maplegend' class='maplegend' 
    style='position: absolute; z-index:9999; border:2px solid grey; background-color:rgba(255, 255, 255, 0.8);
     border-radius:6px; padding: 10px; font-size:14px; right: 20px; bottom: 20px;'>
     
<div class='legend-title'>Legend (draggable!)</div>
<div class='legend-scale'>
  <ul class='legend-labels'>
    <li><span style='background:red;opacity:0.7;'></span>Cluster 0</li>
    <li><span style='background:purple;opacity:0.7;'></span>Cluster 1</li>
    <li><span style='background:Blue;opacity:0.7;'></span>Cluster 2</li>
    <li><span style='background:Aquamarine;opacity:0.7;'></span>Cluster 3</li>
    <li><span style='background:orange;opacity:0.7;'></span>Cluster 4</li>
    

  </ul>
</div>
</div>
 
</body>
</html>

<style type='text/css'>
  .maplegend .legend-title {
    text-align: left;
    margin-bottom: 5px;
    font-weight: bold;
    font-size: 90%;
    }
  .maplegend .legend-scale ul {
    margin: 0;
    margin-bottom: 5px;
    padding: 0;
    float: left;
    list-style: none;
    }
  .maplegend .legend-scale ul li {
    font-size: 80%;
    list-style: none;
    margin-left: 0;
    line-height: 18px;
    margin-bottom: 2px;
    }
  .maplegend ul.legend-labels li span {
    display: block;
    float: left;
    height: 16px;
    width: 30px;
    margin-right: 5px;
    margin-left: 0;
    border: 1px solid #999;
    }
  .maplegend .legend-source {
    font-size: 80%;
    color: #777;
    clear: both;
    }
  .maplegend a {
    color: #777;
    }
</style>
{% endmacro %}"""

macro = MacroElement()
macro._template = Template(template)

map_clusters.get_root().add_child(macro)

map_clusters

In [23]:
Mine_merged

Unnamed: 0,Neighborhood,x,y,Mine Type,Mine Type ID,Latitude,Longitude,marker_color,Town address,Town lat,...,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,New Afton,-1.745331e+06,5.389900e+05,Metal mines,4,50.661000,-120.514000,yellow,"Kamloops , British Columbia",50.675827,...,Hotel,Gas Station,Coffee Shop,American Restaurant,Japanese Restaurant,Sushi Restaurant,Restaurant,Breakfast Spot,Brewery,Café
1,Copper Mountain,-1.804148e+06,4.038439e+05,Metal mines,4,49.331001,-120.533002,yellow,"Fraser Valley , British Columbia",59.715297,...,Cruise Ship,Coffee Shop,Seafood Restaurant,Café,Train Station,Boat or Ferry,Breakfast Spot,Electronics Store,Tourist Information Center,Bar
2,Highland Valley,-1.787566e+06,5.359916e+05,Metal mines,4,50.485500,-121.048300,yellow,"Logan Lake , British Columbia",50.494463,...,Gas Station,Coffee Shop,Sushi Restaurant,Restaurant,Café,Hotel,Brewery,Bed & Breakfast,Inn,Breakfast Spot
3,Gibraltar,-1.773727e+06,7.764453e+05,Metal mines,4,52.528495,-122.287160,yellow,"Cariboo , British Columbia",49.247779,...,Park,Coffee Shop,Brewery,Hotel,Trail,Ice Cream Shop,Market,Bakery,Lake,Dessert Shop
4,Mount Milligan,-1.754343e+06,1.081143e+06,Metal mines,4,55.118000,-124.031000,yellow,"Bulkley-Nechako , British Columbia",54.531617,...,Fast Food Restaurant,Sandwich Place,Warehouse Store,Yoga Studio,Escape Room,Drive-in Theater,Drugstore,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
202,Aurora North and South,-9.572986e+05,1.037388e+06,Oil sands mines,7,57.313399,-111.458001,red,"Wood Buffalo , Alberta",57.652783,...,Coffee Shop,Hotel,Airport,Bus Station,Falafel Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Escape Room,Exhibit
203,Mildred Lake,-9.665172e+05,1.004577e+06,Oil sands mines,7,57.000001,-111.466707,red,"Wood Buffalo , Alberta",57.652783,...,Coffee Shop,Hotel,Airport,Bus Station,Falafel Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Escape Room,Exhibit
204,Muskeg River,-9.622356e+05,1.031173e+06,Oil sands mines,7,57.246500,-111.512003,red,"Wood Buffalo , Alberta",57.652783,...,Coffee Shop,Hotel,Airport,Bus Station,Falafel Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Escape Room,Exhibit
205,Fort Hills,-9.614889e+05,1.047701e+06,Oil sands mines,7,57.395199,-111.572001,red,"Wood Buffalo , Alberta",57.652783,...,Coffee Shop,Hotel,Airport,Bus Station,Falafel Restaurant,Duty-free Shop,Eastern European Restaurant,Electronics Store,Escape Room,Exhibit


In [24]:
MineMaster = pd.merge(Mine_merged, Mine_grouped, on="Neighborhood")

In [49]:
MineMaster_light = pd.merge(Mine_merged[['Neighborhood','Cluster Labels']], Mine_grouped, on="Neighborhood")
MineMaster_light = MineMaster_light.drop(['Neighborhood'],axis = 1)
MineMaster_light = MineMaster_light.groupby(['Cluster Labels']).mean()
MineMaster_light

Cluster Labels,Zoo,ATM,Accessories Store,Airport,Airport Lounge,Airport Terminal,American Restaurant,Antique Shop,Apres Ski Bar,Art Gallery,...,Warehouse Store,Water Park,Waterfall,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Yoga Studio
0,0.0,0.015038,0.0,0.24812,0.0,0.015038,0.218045,0.0,0.0,0.0,...,0.022556,0.0,0.0,0.0,0.0,0.0,0.015038,0.007519,0.045113,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0625,0.0,0.0,0.0,0.0,0.0,1.1875,0.0,0.0,0.0,...,0.9375,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.6875,0.0
3,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,2.0
4,0.433962,0.0,0.037736,0.09434,0.018868,0.0,1.075472,0.075472,0.056604,0.150943,...,0.716981,0.09434,0.018868,0.207547,0.018868,0.075472,0.075472,0.339623,0.188679,0.09434
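
With several hundred category columns, the cluster means are hard to read as a table. A minimal sketch of summarizing each cluster by its highest-mean categories follows. The toy frame holds only three columns for clusters 0 and 2, with values rounded from the table above, and `top_per_cluster` is an illustrative name.

```python
import pandas as pd

# Three columns for clusters 0 and 2, rounded from the table above
cluster_means = pd.DataFrame(
    {'Airport': [0.25, 0.0],
     'American Restaurant': [0.22, 1.19],
     'Warehouse Store': [0.02, 0.94]},
    index=pd.Index([0, 2], name='Cluster Labels'))

# For each cluster, keep the names of the two categories with the highest mean count
top_per_cluster = {
    cluster: row.nlargest(2).index.tolist()
    for cluster, row in cluster_means.iterrows()
}
print(top_per_cluster)
```

Applied to the full `MineMaster_light` table, the same loop would give a short profile of each cluster.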


In [54]:
MineMaster.to_csv(R"C:\Users\19029\Desktop\Personal\05 - Learning\Python Learning\IBM\Capstone\MineMaster.csv", index = False)
neighborhoods_venues_sorted.to_csv(R"C:\Users\19029\Desktop\Personal\05 - Learning\Python Learning\IBM\Capstone\neighborhoods_venues_sorted.csv", index = False)
# keep the index here: after the groupby it holds the 'Cluster Labels'
MineMaster_light.to_csv(R"C:\Users\19029\Desktop\Personal\05 - Learning\Python Learning\IBM\Capstone\MineMaster_light.csv", index = True)
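
The export above uses absolute paths that exist only on one machine. A minimal sketch of a more portable variant builds the paths with `pathlib`. `save_tables` is a hypothetical helper, the temporary directory stands in for a project folder, and the toy frame carries two values from the cluster table.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_tables(tables, output_dir):
    """Write each (dataframe, keep_index) pair to <output_dir>/<name>.csv."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for name, (df, keep_index) in tables.items():
        written[name] = output_dir / (name + '.csv')
        df.to_csv(written[name], index=keep_index)
    return written


# Toy stand-in for MineMaster_light, indexed by cluster label
light = pd.DataFrame({'Zoo': [0.0, 0.0625]},
                     index=pd.Index([0, 2], name='Cluster Labels'))

# A temporary directory stands in for the project's output folder
paths = save_tables({'MineMaster_light': (light, True)},
                    Path(tempfile.mkdtemp()) / 'output')
print(paths['MineMaster_light'])
```

Passing `keep_index=True` for the grouped table preserves the cluster labels in the file, while tables with a plain row index can pass `False`.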