# Battle of the Neighbourhoods - Finding an ideal holiday destination

## Project Description:

I decided to use this opportunity to analyse and understand which holiday destinations would suit us the best. 

### Description of the problem:
My wife and I find it difficult to agree on holiday spots, and, as expected, she always has the upper hand. With this project I hope to search potential holiday destinations and score them for desirability against our requirements. For each city I will search for the top 100 attractions around a central point and assign each one a score. The scoring will be based on two key requirements:
1. A good city centre, or attractions near the city centre, for the kind of cultural break that I like to have.
2. A seaside that my wife likes.

### Methodology
First we'll use Geopy to identify the coordinates of 3 candidate holiday destinations, plus our original tried-and-tested destination, Devon, for comparison purposes:

In [1]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import warnings
warnings.filterwarnings('ignore')
address = 'Devon'

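# Recent geopy versions require a user_agent; the name used here is arbitrary
geolocator = Nominatim(user_agent='holiday_destination_explorer')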
location = geolocator.geocode(address)
latitude_x = location.latitude
longitude_y = location.longitude
print('The geographical coordinate of {} is {}, {}.'.format(address, latitude_x, longitude_y))


The geographical coordinate of Devon is 50.75, -3.75.
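
The same lookup can be repeated for every destination rather than for Devon alone. The sketch below is one possible way to do that, not the code that produced the coordinates used later in this notebook. The helper geocode_places is hypothetical, and it takes the geocoding function as an argument so that it works with geolocator.geocode from the cell above or with any stand-in.

```python
def geocode_places(names, geocode):
    """Return {name: [latitude, longitude]} for each place name.

    geocode is any callable that takes an address string and returns an
    object with .latitude and .longitude (or None when nothing is found),
    for example the geocode method of a geopy Nominatim geolocator.
    """
    coords = {}
    for name in names:
        location = geocode(name)
        if location is None:
            # Skip places the geocoder cannot resolve
            continue
        coords[name] = [location.latitude, location.longitude]
    return coords


# With the geolocator from the cell above this would be called as:
# geocode_places(['Copenhagen', 'Paris', 'Amsterdam', 'Devon'], geolocator.geocode)
```

Passing the geocoder in keeps the helper free of network calls of its own, which makes it easy to try out with canned results.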


In [2]:
import folium

In [21]:
m = folium.Map(
    location=[52.3745403, 4.89797550561798],
    zoom_start=5,
    tiles='Stamen Terrain'
)

folium.Marker(
    location=[55.6867243, 12.5700724],
    popup='Copenhagen',
    icon=folium.Icon(icon='flag')
).add_to(m)

folium.Marker(
    location=[48.85341, 2.3488],
    popup='Paris',
    icon=folium.Icon(color='green', icon='flag')
).add_to(m)

folium.Marker(
    location=[52.6843696, -1.8275286],
    popup='Home',
    icon=folium.Icon(color='red', icon='home')
).add_to(m)

folium.Marker(
    location=[52.3745403, 4.89797550561798],
    popup='Amsterdam',
    icon=folium.Icon(color='orange', icon='flag')
).add_to(m)

folium.Marker(
    location=[50.75, -3.75],
    popup='Devon',
    icon=folium.Icon(color='red', icon='heart')
).add_to(m)

m
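
The five marker calls above differ only in their coordinates, label, colour and icon, so they could also be driven from a small table. This is a sketch of that idea under my own naming (DESTINATIONS and add_markers are not part of folium); the values are copied from the cell above.

```python
# Marker settings copied from the cell above: (latitude, longitude, colour, icon).
# A colour of None means the folium default is used, as for Copenhagen.
DESTINATIONS = {
    'Copenhagen': (55.6867243, 12.5700724, None, 'flag'),
    'Paris': (48.85341, 2.3488, 'green', 'flag'),
    'Home': (52.6843696, -1.8275286, 'red', 'home'),
    'Amsterdam': (52.3745403, 4.89797550561798, 'orange', 'flag'),
    'Devon': (50.75, -3.75, 'red', 'heart'),
}


def add_markers(folium_map, destinations=DESTINATIONS):
    """Add one marker per destination to an existing folium map."""
    import folium  # imported here so the table above can be reused without folium

    for name, (lat, lng, colour, icon) in destinations.items():
        icon_kwargs = {'icon': icon}
        if colour is not None:
            icon_kwargs['color'] = colour
        folium.Marker(
            location=[lat, lng],
            popup=name,
            icon=folium.Icon(**icon_kwargs),
        ).add_to(folium_map)
    return folium_map
```

Calling add_markers(m) on the map created above should give the same five markers, and adding a destination then only needs one more row in the table.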

### Next steps
I am going to use these coordinates in a Foursquare query to find venues, which my wife and I (D and S) will then score, and put the results into a dataframe.

I will combine the data from Foursquare with the scores from the family to calculate the desirability of the 4 holiday destinations: Paris, Amsterdam, Copenhagen and Devon (the control group).

I will then use cluster analysis to identify the holiday destination that provides both a seaside and a city break, so that both of us will enjoy the holiday.

In [14]:
# find distinct venues in Paris as sample set
# perhaps filter for only holiday-relevant venues
# determine scoring from 4 parties
# describe calculations
# plot locations, based on desirability, back onto the map

Copenhagen_Coords = [55.6867243, 12.5700724]
Paris_Coords = [48.85341, 2.3488]
Amsterdam_Coords = [52.3745403, 4.89797550561798]
Devon_Coords = [50.725562, -3.5269108]

#### Import the libraries we're going to need:


In [15]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import json # library to handle JSON files
import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import since pandas 1.0)
# Matplotlib and associated plotting modules
import matplotlib.pyplot as plt
import matplotlib.colors as colors

In [16]:
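# Foursquare credentials, read from environment variables rather than hard-coded
# (the variable names FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are my own choice)
import os
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']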
VERSION = '20180605'


## First challenge - Filtering Foursquare results so that only relevant venues come back

The Foursquare documentation (https://developer.foursquare.com/docs/api/venues/recommendations) describes an 'intent' filter that specifies the top-level “intent” for a search. By setting it, we bring back only venues of that category: food, breakfast, brunch, lunch, coffee, dinner, dessert, drinks, shopping, fun, or sights.

For my holiday activities, I'm going to use 'fun'.

In [17]:
intent = 'fun'
radius = 2000
LIMIT = 100
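
These settings are sent to Foursquare as query-string parameters. As a rough illustration of how such a request URL can be put together with the standard library, the sketch below uses a hypothetical helper, build_explore_url, with placeholder credentials. It targets the explore endpoint that this notebook queries and takes its defaults from the values in the cells above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(coords, client_id, client_secret,
                      version='20180605', radius=2000, limit=100, intent='fun'):
    """Assemble an explore URL for a [latitude, longitude] pair."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(coords[0], coords[1]),
        'radius': radius,
        'limit': limit,
        'intent': intent,
    }
    # urlencode escapes each value, so the comma in ll becomes %2C
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


# MY_ID and MY_SECRET are placeholders, not real credentials
print(build_explore_url([48.85341, 2.3488], 'MY_ID', 'MY_SECRET'))
```

Building the query from a dictionary avoids stray separators and keeps each setting next to its parameter name.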

First we'll define a function to clean up the data received from Foursquare. It returns the name of a venue's first listed category, or None if the venue has no categories:

In [18]:
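def get_category_type(row):
    # The category list sits under 'categories' or, in flattened
    # explore results, under 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # Return the first category name, or None when the list is empty
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']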
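
To make the behaviour concrete, here is a self-contained copy of the function (catching only KeyError) run against three made-up rows. Plain dictionaries stand in for dataframe rows, and the category names are invented for illustration.

```python
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']


# Made-up rows covering the three cases the function handles
plain_row = {'categories': [{'name': 'Museum'}, {'name': 'Art Gallery'}]}
flattened_row = {'venue.categories': [{'name': 'Beach'}]}
empty_row = {'categories': []}

print(get_category_type(plain_row))      # Museum
print(get_category_type(flattened_row))  # Beach
print(get_category_type(empty_row))      # None
```

Only the first category is kept, so a venue listed as both a museum and an art gallery is treated as a museum.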

Then this function returns the required data for a given set of coordinates:

In [20]:
def get_holiday_fun(coords):
    # Build the explore request around the given [latitude, longitude] pair
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&intent={}'.format(
        CLIENT_ID, CLIENT_SECRET, VERSION, coords[0], coords[1], radius, LIMIT, intent)
    results = requests.get(url).json()
    venues = results['response']['groups'][0]['items']
    # Flatten the nested JSON into one row per venue
    nearby_venues = json_normalize(venues)
    # Keep only the name, category and location columns
    filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
    nearby_venues = nearby_venues.loc[:, filtered_columns].copy()
    # Reduce each category list to its first category name
    nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
    # Keep only the last part of each dotted column name, e.g. 'venue.location.lat' -> 'lat'
    nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
    return nearby_venues
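
The flattening and renaming steps are easier to follow on a small example that needs no API call. The sketch below runs them on a hand-written stand-in for the items list in the response. The two venues are made up, and only the fields the function reads are included.

```python
import pandas as pd

# Hand-written stand-in for results['response']['groups'][0]['items'];
# the venues below are made up purely for illustration.
sample_items = [
    {'venue': {'name': 'Example Museum',
               'categories': [{'name': 'Museum'}],
               'location': {'lat': 48.86, 'lng': 2.34}}},
    {'venue': {'name': 'Example Beach',
               'categories': [],
               'location': {'lat': 48.85, 'lng': 2.35}}},
]

# Nested keys become dotted column names such as 'venue.location.lat'
flat = pd.json_normalize(sample_items)
wanted = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
table = flat.loc[:, wanted].copy()

# Keep only the first category name, or None when the list is empty
table['venue.categories'] = table['venue.categories'].apply(
    lambda cats: cats[0]['name'] if cats else None)

# Keep the last part of each dotted name: name, categories, lat, lng
table.columns = [col.split('.')[-1] for col in table.columns]
print(table)
```

The result has one row per venue with the columns name, categories, lat and lng, which is the layout the later cells rely on.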

I'll call my function 4 times to pull the 'fun' attractions for each of the potential holiday destinations:

In [22]:
copenhagen_venues = get_holiday_fun(Copenhagen_Coords)
paris_venues = get_holiday_fun(Paris_Coords)
amsterdam_venues = get_holiday_fun(Amsterdam_Coords)
devon_venues = get_holiday_fun(Devon_Coords)

I now need a distinct list of attraction types so that I can take it to my audience and ask them which are most important for our holiday. To get there, I mark each dataset with its city name and then combine them into one:

In [23]:
# Combine Datasets
copenhagen_venues['City'] = 'Copenhagen'
paris_venues['City'] = 'Paris'
amsterdam_venues['City'] = 'Amsterdam'
devon_venues['City'] = 'Devon'
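# ignore_index gives the combined dataframe one continuous index instead of four repeated ones
df = pd.concat([copenhagen_venues, paris_venues, amsterdam_venues, devon_venues], ignore_index=True)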
df.columns = ['Name','Categories','Latitude','Longitude','City']

Using the .unique() function, I've found each distinct type of attraction and written the list to a CSV:



In [24]:
categories = df.Categories.unique()

dataset = pd.DataFrame({'Categories':categories})

dataset.sort_values(by=['Categories'], ascending = True).to_csv(r'categories.csv')

In week 2 I will use the scoring data to score the venues and pick out the ideal holiday location for us.
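
As a preview of how that scoring could work, the sketch below joins a score sheet to the venues and totals the scores per city. Everything in it is an assumption made for illustration: the venues and scores are made up, the scorer columns D and S follow the initials used earlier, and summing the two scores is only one possible way of combining them.

```python
import pandas as pd

# Made-up venues in the same layout as df above
venues = pd.DataFrame({
    'Name': ['Venue A', 'Venue B', 'Venue C', 'Venue D'],
    'Categories': ['Museum', 'Beach', 'Museum', 'Theater'],
    'City': ['Paris', 'Devon', 'Amsterdam', 'Amsterdam'],
})

# Hypothetical score sheet: one row per category, one column per scorer
scores = pd.DataFrame({
    'Categories': ['Museum', 'Beach', 'Theater'],
    'D': [3, 1, 2],
    'S': [1, 3, 1],
})

# Attach the category scores to every venue, then total them per city
scored = venues.merge(scores, on='Categories', how='left')
scored['Total'] = scored['D'] + scored['S']
desirability = scored.groupby('City')['Total'].sum().sort_values(ascending=False)
print(desirability)
```

With these invented numbers Amsterdam comes out on top simply because it has two scored venues. A real comparison would need to decide whether to total or average the scores when cities return different numbers of venues.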