# Project Name: Ireland Tourism Planning - Sites of Interest

## Project Description
<p>This project involves travel recommendations for a trip to Ireland. While Ireland is not a large country, it is diverse and has multiple areas of interest. To make the most of the time there, travelers need to know which locations have the top tourist venues, i.e. landmarks, historical sites, museums and parks. I plan to make use of the Ireland cities database downloaded from SimpleMaps.com. I will then use Foursquare to locate points of interest and K-means to cluster the data and identify the areas with the most venues of interest, so that the traveler's itinerary can be planned accordingly.

## Data Used
<p>I am using an Ireland cities CSV file (filename: ie.csv) downloaded from SimpleMaps.com. It contains city, county, geocoordinate and population information. I will strip out the population information for this exercise and keep only the city, county, latitude and longitude.

<p>I will feed this information into Foursquare and pull out tourism points of interest. This will be done by selecting certain Foursquare categories: Historic sites, Landmarks, Monuments and Memorial sites.

<p>I'll then run K-means clustering and highlight results on a map of Ireland.

In [360]:
#Import required Libraries

import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transforming json file into a pandas dataframe library
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


In [361]:
#Read table from SimpleMaps.com which contains Geocoordinates for top towns in Ireland. Only select latitude, longitude, 
#City and County from the source table
table1=pd.read_csv('ie.csv')
table1
ireland = table1[['city', 'lat','lng','admin']] 
ireland
ireland.columns = (['City', 'Latitude','Longitude','County'])
ireland.columns

Index(['City', 'Latitude', 'Longitude', 'County'], dtype='object')

In [362]:
#As we will land in Dublin, we will start our analysis there.

address = 'Dublin, Ireland'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

53.3497645 -6.2602732


### Create Map of Ireland with Cities
<p> Map of Ireland with 34 cities overlaid. You can see that it would be useful to have a tool to show which cities have the most tourist attractions as defined by our categories (historic sites, monuments, castles, landmarks).

In [363]:
# Create map of Ireland with cities laid on top,
# centered on the Dublin latitude and longitude values
map_ireland = folium.Map(location=[latitude, longitude], zoom_start=7)

# add markers to map
for lat, lng, city, county in zip(ireland['Latitude'], ireland['Longitude'], ireland['City'], ireland['County']):
    label = '{}, {}'.format(city, county)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='purple',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_ireland)  
    

print('Map of Ireland with Cities highlighted')

Map of Ireland with Cities highlighted


In [16]:
map_ireland

### First Foursquare query
<p> We will run our first Foursquare query to locate historic sites and landmarks/monuments in Dublin. Dublin is the capital city and where our flight lands, so it is the best place to start.

In [418]:
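LIMIT = 500 # limit of number of venues returned by Foursquare API
radius = 16000 # search radius in meters (roughly ten miles)
VERSION = '20180604' # Foursquare API version date, as it appears in the URL output below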
categoryId =  '4deefb944765f83613cdba6e,5642206c498e4bfca532186c,4bf58dd8d48988d12d941735'
CLIENT_ID = 'C1MTKBAJLE1JMCAM2NWPGBBO2KJUWMS45VOFBUOIEE4HH2KO'
CLIENT_SECRET = 'VTTK1FFJTFFG3P13L5FN3VOHTWUI04CENX5HNSEP53LRYLWH'
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latitude, 
    longitude, 
    radius,
    LIMIT,
    categoryId)
url 

'https://api.foursquare.com/v2/venues/explore?&client_id=C1MTKBAJLE1JMCAM2NWPGBBO2KJUWMS45VOFBUOIEE4HH2KO&client_secret=VTTK1FFJTFFG3P13L5FN3VOHTWUI04CENX5HNSEP53LRYLWH&v=20180604&ll=53.3497645,-6.2602732&radius=16000&limit=500&categoryId=4deefb944765f83613cdba6e,5642206c498e4bfca532186c,4bf58dd8d48988d12d941735'
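The URL above is assembled with positional `str.format` arguments, which makes it easy to swap two values by accident. As a rough alternative sketch, the same query string could be built from a dictionary with the standard library's `urlencode`. The function name `build_explore_url`, the constant `EXPLORE_ENDPOINT` and the placeholder credentials are illustrative assumptions, not part of the original notebook.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL printed above.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, lat, lng, category_ids,
                      radius=16000, limit=500, version="20180604"):
    """Return a venues/explore URL using the same parameters as the notebook."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
        "categoryId": ",".join(category_ids),
    }
    # safe="," keeps the commas in `ll` and `categoryId` unescaped and readable.
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")


# Placeholder credentials (hypothetical values, not real keys).
example_url = build_explore_url(
    "MY_CLIENT_ID", "MY_CLIENT_SECRET", 53.3497645, -6.2602732,
    ["4deefb944765f83613cdba6e", "5642206c498e4bfca532186c",
     "4bf58dd8d48988d12d941735"],
)
print(example_url)
```

One possible benefit of this shape is that the credentials become ordinary arguments, so they could be read from environment variables with `os.environ` instead of being written into the notebook.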

In [419]:
#Run query to get the results for Dublin
results = requests.get(url).json()
results


{'meta': {'code': 200, 'requestId': '5dea74b640a7ea450c653ee5'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Dublin',
  'headerFullLocation': 'Dublin',
  'headerLocationGranularity': 'city',
  'query': 'historic site',
  'totalResults': 61,
  'suggestedBounds': {'ne': {'lat': 53.493764644000144,
    'lng': -6.019488465128034},
   'sw': {'lat': 53.205764355999854, 'lng': -6.501057934871967}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4ade0f24f964a520387121e3',
       'name': 'Old City Area',
       'location': {'address': 'Between Parliament St. & Fishamble St.',
        'lat': 53.34434670745031,
        'lng': -6.268818108718381,
        'labeledLatLngs': [{'label': 'disp

In [420]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # fall back to the flattened column name produced by json_normalize
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [454]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head(10)

Unnamed: 0,name,categories,lat,lng
0,Old City Area,Historic Site,53.344347,-6.268818
1,"Phoenix Park, Park Gate St. Entrance",Historic Site,53.350219,-6.3002
2,Trinity College Old Library & The Book of Kell...,College Library,53.343692,-6.256907
3,Marsh's Library,Historic Site,53.339037,-6.270671
4,Liffey Boardwalk,Scenic Lookout,53.346797,-6.26181
5,The Spire of Dublin / An Túr Solais (The Spire...,Monument / Landmark,53.349805,-6.26026
6,Abbey Theatre,Theater,53.348542,-6.257492
7,Dublin City Wall,Historic Site,53.344186,-6.2745
8,Kilmainham Gaol,Museum,53.341849,-6.308478
9,General Post Office (GPO),Historic Site,53.349507,-6.260277


In [422]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

61 venues were returned by Foursquare.


### Results of Dublin query
<p> 61 attractions were found in Dublin alone! Top results include the Old City Area, Phoenix Park, Trinity College and the Book of Kells, the Dublin City Wall and the Spire of Dublin. Let's now run the query for all the towns!

In [423]:
def getNearbyVenues(names, latitudes, longitudes, radius=16000):
    # radius is in meters; the default of 16000 is roughly a ten mile radius
    LIMIT = 500 # limit of number of venues returned by Foursquare API
    categoryId =  '4deefb944765f83613cdba6e,5642206c498e4bfca532186c,4bf58dd8d48988d12d941735'
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            categoryId)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude',          
                  'Venue Category']
    
    return(nearby_venues)
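The function above indexes `["response"]['groups'][0]` and `['categories'][0]` directly, so it would raise an exception if a city came back with no groups or a venue had an empty category list. Whether the API returns such payloads for these particular queries is an assumption. The sketch below shows one defensive way to flatten a single response. `extract_venue_rows` and the sample payload are hypothetical: the first sample venue is trimmed from the Dublin output shown earlier, and the second is invented to exercise the empty-category case.

```python
def extract_venue_rows(city, city_lat, city_lng, payload):
    """Flatten one explore response into tuples matching the notebook's columns."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        # Use None instead of failing when a venue has no category.
        category = categories[0]["name"] if categories else None
        rows.append((city, city_lat, city_lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category))
    return rows


sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Old City Area",
                   "location": {"lat": 53.34434670745031,
                                "lng": -6.268818108718381},
                   "categories": [{"name": "Historic Site"}]}},
        # Hypothetical venue with no categories, for illustration only.
        {"venue": {"name": "Uncategorised venue",
                   "location": {"lat": 0.0, "lng": 0.0},
                   "categories": []}},
    ]}]}
}

rows = extract_venue_rows("Dublin", 53.333056, -6.248889, sample_payload)
empty = extract_venue_rows("Nowhere", 0.0, 0.0, {"response": {}})
print(rows)
print(empty)
```

The resulting tuples have the same seven fields as the `nearby_venues` columns, so they could be passed to `pd.DataFrame` in the same way.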

In [424]:
print(url)

https://api.foursquare.com/v2/venues/explore?&client_id=C1MTKBAJLE1JMCAM2NWPGBBO2KJUWMS45VOFBUOIEE4HH2KO&client_secret=VTTK1FFJTFFG3P13L5FN3VOHTWUI04CENX5HNSEP53LRYLWH&v=20180604&ll=53.3497645,-6.2602732&radius=16000&limit=500&categoryId=4deefb944765f83613cdba6e,5642206c498e4bfca532186c,4bf58dd8d48988d12d941735


In [425]:
# Run above function to print the names of the cities and retrieve venues

ireland_venues = getNearbyVenues(names=ireland['City'],
                                   latitudes=ireland['Latitude'],
                                   longitudes=ireland['Longitude']
                                  )



Dublin
Cork
Limerick
Galway
Waterford
Drogheda
Tralee
Kilkenny
Sligo
Killarney
Shannon
Monaghan
Ros Comáin
Donegal
Lifford
Carlow
Mullingar
An Cabhán
Wicklow
Dún Dealgan
Clonmel
Naas
Ennis
Port Laoise
Swords
Tallaght
Wexford
Trim
Tullamore
Castlebar
Nenagh
Dunleary
Longford
Carrick on Shannon


In [458]:
ireland_venues


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Dublin,53.333056,-6.248889,Old City Area,53.344347,-6.268818,Historic Site
1,Dublin,53.333056,-6.248889,Marsh's Library,53.339037,-6.270671,Historic Site
2,Dublin,53.333056,-6.248889,Trinity College Old Library & The Book of Kell...,53.343692,-6.256907,College Library
3,Dublin,53.333056,-6.248889,"Phoenix Park, Park Gate St. Entrance",53.350219,-6.300200,Historic Site
4,Dublin,53.333056,-6.248889,Liffey Boardwalk,53.346797,-6.261810,Scenic Lookout
...,...,...,...,...,...,...,...
319,Dunleary,53.292500,-6.128611,Prison Chill Mhaigneann,53.341782,-6.308023,Historic Site
320,Dunleary,53.292500,-6.128611,Magazine Fort,53.349725,-6.315452,Historic Site
321,Dunleary,53.292500,-6.128611,The Phoenix Monument,53.360734,-6.327082,Monument / Landmark
322,Dunleary,53.292500,-6.128611,Drimnagh Castle,53.324747,-6.332395,Castle


### Full city results
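<p> When run for all cities in Ireland, we get 324 results. Dublin still leads with 61, but there are additional historic sites in other cities such as Cork, Limerick, Galway and Dunleary. Note that Dunleary also returns 61 venues, and the rows shown for it include Dublin locations such as The Phoenix Monument, so its results appear to overlap with Dublin's because the two search radii overlap.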
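Dublin and Dunleary both return 61 venues, and the Dunleary rows shown above include Phoenix Park locations. A quick way to check whether two search areas overlap is to compare the distance between the city centers with the 16,000 m radius. The sketch below uses the haversine formula with the coordinates from the table above. The helper name `haversine_km` and the mean Earth radius constant are assumptions added for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two lat/lng points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# City coordinates as they appear in the ireland_venues table.
dublin = (53.333056, -6.248889)
dunleary = (53.292500, -6.128611)

distance_km = haversine_km(*dublin, *dunleary)
search_radius_km = 16.0  # the notebook's radius of 16000 m

# True when each city center lies inside the other's search circle.
centers_within_radius = distance_km < search_radius_km
print(round(distance_km, 1), centers_within_radius)
```

If the centers fall within each other's radius, the two queries can return largely the same venues, which would inflate the combined count.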


In [427]:
#Analyze each city
# one hot encoding
ireland_onehot = pd.get_dummies(ireland_venues[['Venue Category']], prefix="", prefix_sep="")

# add City column back to dataframe
ireland_onehot['City'] = ireland_venues['City'] 

# move City column to the first column
fixed_columns = [ireland_onehot.columns[-1]] + list(ireland_onehot.columns[:-1])
ireland_onehot = ireland_onehot[fixed_columns]

ireland_onehot

Unnamed: 0,City,Art Gallery,Capitol Building,Castle,College Library,Garden,Government Building,Historic Site,History Museum,Memorial Site,Monument / Landmark,Museum,Outdoor Sculpture,Park,Plaza,Road,Scenic Lookout,Sculpture Garden,Theater,Trail
0,Dublin,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
1,Dublin,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
2,Dublin,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Dublin,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
4,Dublin,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
319,Dunleary,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
320,Dunleary,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
321,Dunleary,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
322,Dunleary,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [428]:
ireland_onehot.shape


(324, 20)

In [429]:
ireland_grouped = ireland_onehot.groupby('City').mean().reset_index()
ireland_grouped

Unnamed: 0,City,Art Gallery,Capitol Building,Castle,College Library,Garden,Government Building,Historic Site,History Museum,Memorial Site,Monument / Landmark,Museum,Outdoor Sculpture,Park,Plaza,Road,Scenic Lookout,Sculpture Garden,Theater,Trail
0,An Cabhán,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Carlow,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Carrick on Shannon,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Castlebar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Clonmel,0.0,0.0,0.0,0.0,0.0,0.0,0.571429,0.0,0.142857,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Cork,0.0,0.0,0.0,0.0,0.0,0.0,0.636364,0.0,0.090909,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Donegal,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Drogheda,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Dublin,0.016393,0.016393,0.032787,0.016393,0.032787,0.032787,0.360656,0.016393,0.0,0.327869,0.016393,0.016393,0.0,0.032787,0.016393,0.032787,0.016393,0.016393,0.0
9,Dunleary,0.016393,0.016393,0.032787,0.016393,0.032787,0.032787,0.360656,0.016393,0.0,0.327869,0.016393,0.016393,0.0,0.032787,0.016393,0.032787,0.016393,0.016393,0.0
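Taking the mean of one-hot columns per city gives each category's share of that city's venues. The sketch below reproduces Clonmel's row in plain Python. The venue list is an assumption reconstructed from the displayed frequencies and the count of 7 venues (4 Historic Site, 2 Monument / Landmark, 1 Memorial Site), and `category_shares` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Assumed venue categories for Clonmel, reconstructed from the table above.
venues = (
    [("Clonmel", "Historic Site")] * 4
    + [("Clonmel", "Monument / Landmark")] * 2
    + [("Clonmel", "Memorial Site")]
)


def category_shares(city_category_pairs):
    """Return {city: {category: share of that city's venues}}."""
    counts = defaultdict(Counter)
    for city, category in city_category_pairs:
        counts[city][category] += 1
    return {
        city: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for city, counter in counts.items()
    }


shares = category_shares(venues)
for category, share in shares["Clonmel"].items():
    print(category, round(share, 6))
```

Because each row sums to 1, these values describe the mix of venue types in a city, not how many venues it has.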


In [430]:
print(ireland_venues.shape)
ireland_venues.head()

(324, 7)


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Dublin,53.333056,-6.248889,Old City Area,53.344347,-6.268818,Historic Site
1,Dublin,53.333056,-6.248889,Marsh's Library,53.339037,-6.270671,Historic Site
2,Dublin,53.333056,-6.248889,Trinity College Old Library & The Book of Kell...,53.343692,-6.256907,College Library
3,Dublin,53.333056,-6.248889,"Phoenix Park, Park Gate St. Entrance",53.350219,-6.3002,Historic Site
4,Dublin,53.333056,-6.248889,Liffey Boardwalk,53.346797,-6.26181,Scenic Lookout


In [431]:
ireland_venues.groupby('City').count()

City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
An Cabhán,2,2,2,2,2,2
Carlow,1,1,1,1,1,1
Carrick on Shannon,1,1,1,1,1,1
Castlebar,1,1,1,1,1,1
Clonmel,7,7,7,7,7,7
Cork,11,11,11,11,11,11
Donegal,2,2,2,2,2,2
Drogheda,7,7,7,7,7,7
Dublin,61,61,61,61,61,61
Dunleary,61,61,61,61,61,61


In [432]:
num_top_venues = 5

for hood in ireland_grouped['City']:
    print("----"+hood+"----")
    temp = ireland_grouped[ireland_grouped['City'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----An Cabhán----
                 venue  freq
0  Monument / Landmark   0.5
1        Historic Site   0.5
2               Museum   0.0
3              Theater   0.0
4     Sculpture Garden   0.0


----Carlow----
                 venue  freq
0  Monument / Landmark   1.0
1               Museum   0.0
2              Theater   0.0
3     Sculpture Garden   0.0
4       Scenic Lookout   0.0


----Carrick on Shannon----
              venue  freq
0     Historic Site   1.0
1       Art Gallery   0.0
2            Museum   0.0
3           Theater   0.0
4  Sculpture Garden   0.0


----Castlebar----
              venue  freq
0     Memorial Site   1.0
1       Art Gallery   0.0
2            Museum   0.0
3           Theater   0.0
4  Sculpture Garden   0.0


----Clonmel----
                 venue  freq
0        Historic Site  0.57
1  Monument / Landmark  0.29
2        Memorial Site  0.14
3    Outdoor Sculpture  0.00
4              Theater  0.00


----Cork----
                 venue  freq
0        Historic Si

In [433]:
print('There are {} unique categories.'.format(len(ireland_venues['Venue Category'].unique())))

There are 19 unique categories.


In [434]:
#put in dataframe
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [435]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
city_venues_sorted = pd.DataFrame(columns=columns)
city_venues_sorted['City'] = ireland_grouped['City']

for ind in np.arange(ireland_grouped.shape[0]):
    city_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ireland_grouped.iloc[ind, :], num_top_venues)

city_venues_sorted

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,An Cabhán,Monument / Landmark,Historic Site,Memorial Site,Capitol Building,Castle
1,Carlow,Monument / Landmark,Memorial Site,Capitol Building,Castle,College Library
2,Carrick on Shannon,Historic Site,Trail,Memorial Site,Capitol Building,Castle
3,Castlebar,Memorial Site,Trail,Capitol Building,Castle,College Library
4,Clonmel,Historic Site,Monument / Landmark,Memorial Site,Capitol Building,Castle
5,Cork,Historic Site,Monument / Landmark,Memorial Site,Capitol Building,Castle
6,Donegal,Monument / Landmark,Memorial Site,Capitol Building,Castle,College Library
7,Drogheda,Historic Site,Trail,Memorial Site,Capitol Building,Castle
8,Dublin,Historic Site,Monument / Landmark,Scenic Lookout,Castle,Plaza
9,Dunleary,Historic Site,Monument / Landmark,Scenic Lookout,Castle,Plaza
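In the table above, Carlow's only non-zero category is Monument / Landmark, yet its 2nd to 5th columns list Memorial Site, Capitol Building and so on. Those entries have a frequency of 0.0 and appear only because the sort has to fill five slots. One possible variation, sketched below, keeps only categories that actually occur. `top_nonzero_categories` is a hypothetical helper, and the dictionaries are trimmed subsets of the grouped rows for Carlow and Clonmel.

```python
def top_nonzero_categories(freqs, n=5):
    """Return up to n category names with frequency > 0, most frequent first."""
    ranked = sorted(freqs.items(), key=lambda pair: pair[1], reverse=True)
    return [name for name, freq in ranked if freq > 0][:n]


# Subsets of the ireland_grouped rows shown earlier.
carlow = {
    "Castle": 0.0,
    "Historic Site": 0.0,
    "Memorial Site": 0.0,
    "Monument / Landmark": 1.0,
}
clonmel = {
    "Castle": 0.0,
    "Historic Site": 0.571429,
    "Memorial Site": 0.142857,
    "Monument / Landmark": 0.285714,
}

carlow_top = top_nonzero_categories(carlow)
clonmel_top = top_nonzero_categories(clonmel)
print(carlow_top)
print(clonmel_top)
```

With this filter, a city with one kind of venue would show one entry instead of five.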


### City Breakdown of Venues
<p> The table above shows the top 5 venue categories by city. You can see our three categories have been broken down into 19 subcategories. Quite a few castles appear in this list.

### Run K-means clustering 
<p> We'll run K-means clustering to see if we can group the cities into types of venues to better allow tourists to decide where to visit.
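To give a feel for what K-means does with the per-city frequency vectors, here is a minimal pure-Python sketch of Lloyd's algorithm. It is not scikit-learn's implementation: it seeds the centroids with the first `k` points, whereas `KMeans` uses k-means++ initialization by default. The vectors are an assumed three-column subset (Historic Site, Monument / Landmark, Memorial Site) of six rows from the grouped table.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm, seeded with the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


cities = ["An Cabhán", "Carlow", "Carrick on Shannon",
          "Castlebar", "Donegal", "Drogheda"]
# (Historic Site, Monument / Landmark, Memorial Site) frequencies.
points = [(0.5, 0.5, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0),
          (0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

labels, centroids = kmeans(points, k=4)
for city, label in zip(cities, labels):
    print(city, label)
```

Cities with identical category mixes, such as Carlow and Donegal, end up with the same label, regardless of how many venues each one has.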

In [436]:
# import k-means from clustering stage
from sklearn.cluster import KMeans


In [437]:

#Run K means to cluster Cities
# set number of clusters
kclusters = 5

ireland_grouped_clustering = ireland_grouped.drop(columns='City')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ireland_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 4, 1, 2, 0, 0, 4, 1, 0, 0], dtype=int32)

In [385]:
ireland_grouped_clustering


Unnamed: 0,Art Gallery,Art Museum,Comedy Club,Concert Hall,Dance Studio,Exhibit,Go Kart Track,History Museum,Indie Movie Theater,Jazz Club,...,Piano Bar,Planetarium,Public Art,Racecourse,Rock Club,Rugby Stadium,Science Museum,Theater,Tour Provider,Zoo Exhibit
0,0.088235,0.0,0.0,0.0,0.0,0.0,0.058824,0.088235,0.0,0.0,...,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.147059,0.0,0.0
1,0.09,0.04,0.01,0.03,0.01,0.0,0.02,0.14,0.02,0.0,...,0.0,0.0,0.0,0.01,0.03,0.0,0.01,0.16,0.01,0.03
2,0.047619,0.0,0.0,0.0,0.047619,0.0,0.0,0.190476,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0
3,0.04,0.0,0.0,0.0,0.0,0.0,0.04,0.12,0.0,0.0,...,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.24,0.0,0.0
4,0.1,0.0,0.0,0.05,0.016667,0.0,0.033333,0.133333,0.0,0.0,...,0.016667,0.0,0.0,0.016667,0.0,0.033333,0.0,0.066667,0.0,0.0
5,0.054054,0.0,0.0,0.081081,0.027027,0.0,0.054054,0.081081,0.0,0.0,...,0.0,0.0,0.0,0.0,0.027027,0.027027,0.0,0.054054,0.0,0.0
6,0.083333,0.0,0.0,0.041667,0.0,0.0,0.041667,0.166667,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0
7,0.07,0.04,0.01,0.03,0.01,0.0,0.02,0.08,0.02,0.0,...,0.0,0.01,0.0,0.01,0.03,0.0,0.01,0.18,0.01,0.03
8,0.09,0.04,0.01,0.03,0.01,0.0,0.02,0.09,0.02,0.01,...,0.0,0.0,0.0,0.01,0.03,0.0,0.01,0.17,0.01,0.03
9,0.09,0.04,0.01,0.03,0.01,0.0,0.02,0.09,0.02,0.01,...,0.0,0.0,0.0,0.01,0.03,0.0,0.01,0.17,0.01,0.03


In [440]:
# Add clustering labels
# If re-running this cell, first drop the existing column:
# city_venues_sorted = city_venues_sorted.drop(columns='Cluster Labels')
city_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

ireland_merged = ireland

# merge ireland_grouped with venue data
ireland_merged = ireland_merged.join(city_venues_sorted.set_index('City'), on='City')

ireland_merged # check the last columns!


Unnamed: 0,City,Latitude,Longitude,County,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Dublin,53.333056,-6.248889,Dublin,0.0,Historic Site,Monument / Landmark,Scenic Lookout,Castle,Plaza
1,Cork,51.898611,-8.495833,Cork,0.0,Historic Site,Monument / Landmark,Memorial Site,Capitol Building,Castle
2,Limerick,52.664722,-8.623056,Limerick,0.0,Historic Site,Monument / Landmark,Museum,Memorial Site,Capitol Building
3,Galway,53.271944,-9.048889,Galway,0.0,Monument / Landmark,Historic Site,Memorial Site,Capitol Building,Castle
4,Waterford,52.258333,-7.111944,Waterford,0.0,Historic Site,Monument / Landmark,Park,Memorial Site,Capitol Building
5,Drogheda,53.718889,-6.347778,Louth,1.0,Historic Site,Trail,Memorial Site,Capitol Building,Castle
6,Tralee,52.266667,-9.716667,Kerry,0.0,Monument / Landmark,Historic Site,Memorial Site,Capitol Building,Castle
7,Kilkenny,52.654167,-7.252222,Kilkenny,1.0,Historic Site,Castle,Trail,Memorial Site,Capitol Building
8,Sligo,54.266667,-8.483333,Sligo,2.0,Historic Site,Memorial Site,Outdoor Sculpture,Trail,Capitol Building
9,Killarney,52.05,-9.516667,Kerry,0.0,Monument / Landmark,Castle,Historic Site,Memorial Site,Capitol Building


In [442]:
# Drop cities that have no cluster label (NaN), i.e. cities with no top venues, then verify.
ireland_merged = ireland_merged.dropna(subset=['Cluster Labels'])
ireland_merged.isnull()

Unnamed: 0,City,Latitude,Longitude,County,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False
5,False,False,False,False,False,False,False,False,False,False
6,False,False,False,False,False,False,False,False,False,False
7,False,False,False,False,False,False,False,False,False,False
8,False,False,False,False,False,False,False,False,False,False
9,False,False,False,False,False,False,False,False,False,False


In [443]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
rainbow
#add markers to the map
markers_colors = []
for lat, lon, city, cluster in zip(ireland_merged['Latitude'], ireland_merged['Longitude'], ireland_merged['City'], ireland_merged['Cluster Labels']):
    label = folium.Popup(str(city) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
      [lat, lon],
      radius=5,
      popup=label,
      color = rainbow[int(cluster)-1],
      fill=True,
      fill_color = rainbow[int(cluster)-1],
      fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
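In the cell above, `rainbow[int(cluster)-1]` relies on negative indexing, so cluster 0 picks up the last color in the list. That works, but a direct label-to-color lookup may be easier to follow. The sketch below builds one hex color per cluster using the standard library's `colorsys` in place of matplotlib's `cm.rainbow`. The function name `cluster_colors` and the saturation and brightness values are arbitrary choices for illustration.

```python
import colorsys


def cluster_colors(k):
    """Return k evenly spaced hex colors, one per cluster label 0..k-1."""
    palette = []
    for i in range(k):
        # Spread the hues evenly around the color wheel.
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_colors(5)
# Look up the color for a cluster label directly by its index.
color_for_cluster_2 = palette[2]
print(palette)
```

A marker could then use `palette[int(cluster)]` for both `color` and `fill_color`.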

### Results of Clustering: Map
<p> From the map above, we can see the results of the clustering. There are quite a few clustered cities surrounding Dublin, as expected, but there are also some around Limerick, Waterford, Cork and Galway. The next step is to analyze the clusters to see if they are what we want.

In [448]:
print("Cluster 1 Results")
ireland_merged.loc[ireland_merged['Cluster Labels'] == 0, ireland_merged.columns[[0] + list(range(5, ireland_merged.shape[1]))]]

Cluster 1 Results


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Dublin,Historic Site,Monument / Landmark,Scenic Lookout,Castle,Plaza
1,Cork,Historic Site,Monument / Landmark,Memorial Site,Capitol Building,Castle
2,Limerick,Historic Site,Monument / Landmark,Museum,Memorial Site,Capitol Building
3,Galway,Monument / Landmark,Historic Site,Memorial Site,Capitol Building,Castle
4,Waterford,Historic Site,Monument / Landmark,Park,Memorial Site,Capitol Building
6,Tralee,Monument / Landmark,Historic Site,Memorial Site,Capitol Building,Castle
9,Killarney,Monument / Landmark,Castle,Historic Site,Memorial Site,Capitol Building
17,An Cabhán,Monument / Landmark,Historic Site,Memorial Site,Capitol Building,Castle
19,Dún Dealgan,Historic Site,Monument / Landmark,Memorial Site,Capitol Building,Castle
20,Clonmel,Historic Site,Monument / Landmark,Memorial Site,Capitol Building,Castle


### Cluster 1 results
<p> Cluster 1 contains many of the venues we are interested in. Based on this, it would be well worth visiting Dublin, Cork, Limerick, Galway and Waterford.

In [449]:
print("Cluster 2 Results")
ireland_merged.loc[ireland_merged['Cluster Labels'] == 1, ireland_merged.columns[[0] + list(range(5, ireland_merged.shape[1]))]]

Cluster 2 Results


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
5,Drogheda,Historic Site,Trail,Memorial Site,Capitol Building,Castle
7,Kilkenny,Historic Site,Castle,Trail,Memorial Site,Capitol Building
10,Shannon,Historic Site,Trail,Memorial Site,Capitol Building,Castle
12,Ros Comáin,Historic Site,Trail,Memorial Site,Capitol Building,Castle
18,Wicklow,Historic Site,Trail,Memorial Site,Capitol Building,Castle
21,Naas,Historic Site,Trail,Memorial Site,Capitol Building,Castle
23,Port Laoise,Historic Site,Memorial Site,Trail,Capitol Building,Castle
33,Carrick on Shannon,Historic Site,Trail,Memorial Site,Capitol Building,Castle


### Cluster 2 results
<p> Cluster 2 contains some historic and memorial sites. There are no monuments/landmarks, but there are some historic sites and castles (though not many). Probably worth visiting, depending upon the length of your trip.

In [450]:
print("Cluster 3 Results")
ireland_merged.loc[ireland_merged['Cluster Labels'] == 2, ireland_merged.columns[[0] + list(range(5, ireland_merged.shape[1]))]]

Cluster 3 Results


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
8,Sligo,Historic Site,Memorial Site,Outdoor Sculpture,Trail,Capitol Building
22,Ennis,Memorial Site,Historic Site,Trail,Capitol Building,Castle
29,Castlebar,Memorial Site,Trail,Capitol Building,Castle,College Library


### Cluster 3, 4 and 5 results
<p> Overall there are fewer results here, though there are still some historic sites, monuments and castles to be found. It is probably not worth visiting Nenagh in Cluster 4, as the most common venue is a trail. Additional analysis would be needed on Clusters 3 and 5 to determine the value of visiting these sites (dependent upon what else is in your itinerary).

In [451]:
print("Cluster 4 Results")
ireland_merged.loc[ireland_merged['Cluster Labels'] == 3, ireland_merged.columns[[0] + list(range(5, ireland_merged.shape[1]))]]

Cluster 4 Results


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
30,Nenagh,Trail,Memorial Site,Capitol Building,Castle,College Library


In [452]:
print("Cluster 5 Results")
ireland_merged.loc[ireland_merged['Cluster Labels'] == 4, ireland_merged.columns[[0] + list(range(5, ireland_merged.shape[1]))]]

Cluster 5 Results


Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
13,Donegal,Monument / Landmark,Memorial Site,Capitol Building,Castle,College Library
15,Carlow,Monument / Landmark,Memorial Site,Capitol Building,Castle,College Library


### Conclusion
<p> Given the number of venues in Cluster 1, I would suggest that a tourist start here. Top cities to visit would be Dublin, Limerick, Cork, Galway and Waterford. As some Cluster 2 cities lie on the way to Cluster 1 cities, they may be worth a stop as well; these include Shannon and Kilkenny.