## Golfing in Hawaii

### Problem Description and Discussion

In discovery discussions with the client on the **_business of golf in Hawaii_**, it soon became clear they were looking for sound data on which to base some significant business decisions. In their business community everyone claims to know the business, but the client wants to go beyond anecdotes and acquire a deep understanding of the customer.

There are many vacationers coming to Hawaii who also play some golf while they are here. But among those coming to Hawaii, many are serious golfers looking to maintain their game skills during the off season. As we often tell the mainlanders, "_It's never off season in Hawaii._"

**The casual golfer:** For the casual golfer there are many suitable services such as: resorts with splendid accommodations and golf courses; golf tour packages through online hotel reservation systems; a few independent tour guide services specializing in golf; and the occasional golf pro instructor with a small client base. Oftentimes the golf pro will work almost exclusively with one of the big resorts and have the occasional side job with a client. 

**The serious golfer:** For the serious golfer, finding an adequate service is more of a challenge. While they may enjoy some vacation time while here, their main focus is golf, serious golf. They don't want to be annoyed by casual golfers, and they don't want to be treated like tourists. They often come for an extended stay in Hawaii every winter for one main purpose: serious golf. While the golf courses at the resorts are quite good, they don't want to get stuck in a foursome with golf novices, because this is all about **_serious golf_**. This type of visitor is willing to pay **_serious money_**. These clients are looking for an upscale guide service which includes knowledgeable golf pros and caters to serious golfers like themselves. This type of customer is usually looking to play several golf courses during their stay. For this customer it's like collecting bragging rights for their golf buddies so they can say things like, "_Oh yeah, I golfed that course. Do you remember the 14th fairway?_"

#### Key Question
From an extensive review of golf services in Hawaii a key question came into focus... 

"**_How can we provide the best tour guide service for the serious golfer coming to Hawaii?_**" 

With this question there are several follow-on questions:
- Which courses are most likely to be on the serious golfer's list?
- Which courses should we strongly recommend?
- How long is the typical golfing vacation?
- How many days should we plan the tours?
- For a one week tour, which golf courses would a serious golf fan play?
- Two week tour?
- Three week tour?
- How do the accommodations fit into this equation?
- What is the short list of courses for us to spend time on gaining the intimate knowledge needed to give good recommendations on how to play each hole on the course?

#### Stakeholders
There are many potential stakeholders in these questions and the data needed to answer them:
- The resort and hotel industry
- The golf course design and construction industry
- The tour guide industry
- Golf pro instructors

#### Target Audience
The target audience for this data science study is a consortium of golf instructors. While they do okay as captive employees to the stereotypical resort, they can do much better as independent guides. While they often receive tips as a resort employee, as an independent guide they can build the tip into the service and still receive additional, lucrative tips which they don't have to share with the resort staff. While there is competition among the independent golf pros, they realize they can pool their knowledge and skills to offer a premium service with broad appeal. 

#### Why They Care
By building this premium service together, this like-minded group of golf pros is confident they can make an exceptionally good living while living in paradise and "working" at one of the best jobs in the world.

--------

## Data Description and Usage

#### Foursquare Data
For this project it is abundantly clear that the data available from [FOURSQUARE](http://foursquare.com/ "Foursquare") is of great value and can be used to:
- Develop a geo-location inventory of all the golf courses in Hawaii
- Continuously gather and evaluate reviews of all the golf courses
- Plan tours to play the most courses with the least amount of travel
- Develop a **short list** of the most highly recommended courses for the serious golf fan

The geo-location data is indispensable for this project. It can effectively be used to gather and develop a database of golf courses. This database can be used to recommend and tailor custom tours for each client. Island-wide maps of the golf courses will provide a nice visual appeal and help in planning the tour. The distance and location data will be very valuable. Readily available reviews from the tips data in Foursquare will be much better than the typical static reviews on the resort web sites. This is especially valuable for current reviews from real players on the conditions of each golf course. Additionally, the client can encourage reviews from their customers and develop a network of reviewers with a good reputation. Telling the serious golfer that his insights are valuable is not only helpful to the business, but a good compliment and selling point for the customer. Finally, the Foursquare data will come in quite handy for all the other accommodations needed for the tour such as hotels, restaurants and transportation. 
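
The idea of planning a tour with the least travel can be sketched with great-circle distances and a greedy nearest-neighbor ordering. This is only an illustration under stated assumptions: the coordinates are taken from the Big Island short list shown later in this notebook, the helper names `haversine_miles` and `nearest_neighbor_route` are hypothetical, straight-line distance ignores actual road travel, and the greedy heuristic does not guarantee the shortest possible tour.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def nearest_neighbor_route(courses, start):
    """Greedy ordering: always go to the closest course not yet played."""
    remaining = dict(courses)
    route = [start]
    current = remaining.pop(start)
    while remaining:
        name = min(remaining,
                   key=lambda n: haversine_miles(*current, *remaining[n]))
        current = remaining.pop(name)
        route.append(name)
    return route


# Coordinates copied from the short list output in this notebook
courses = {
    'Mauna Kea Golf Course': (20.006168, -155.823015),
    'The Club At Hokulia': (19.506256, -155.947365),
    'Nanea': (19.79518, -155.998184),
    'Kona Country Club: The Ocean Course': (19.559458, -155.964965),
    'Mauna Lani South Course': (19.938855, -155.867346),
}
route = nearest_neighbor_route(courses, start='The Club At Hokulia')
print(route)
```

Starting from the southernmost course, the greedy ordering works its way up the Kona and Kohala coast. A production version would likely substitute driving times for straight-line distances.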

#### Other Data, Missing Data and Data Wrangling
- If data is scarce for some locations and venues, a search for alternative sources should be conducted.
- Mean values can be filled in for missing numeric data as appropriate.
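
Filling missing numeric values with the column mean might look like the following in pandas. The course names and ratings here are made up purely for illustration; they are not Foursquare data.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings table; NaN marks a course with no rating yet
ratings = pd.DataFrame({
    'Golf Course': ['Course A', 'Course B', 'Course C'],
    'Rating': [8.0, np.nan, 9.0],
})

# Replace missing numeric values with the mean of the observed values
ratings['Rating'] = ratings['Rating'].fillna(ratings['Rating'].mean())
print(ratings)
```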

Per data science methodology, this study answers a yes/no question (_Should I play course ABC?_), so the **classification approach** will be used.

**Data Wrangling** will be a significant effort in providing quality data for this project. Because the source data is provided by end-user input, much of the location data needs to be fixed. There is a lack of consistent naming and there are multiple entries for the same golf courses. Specific Foursquare location data for "the 10th green" is not helpful. Likewise, some of the reviews can be flippant and non-informative.
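
A minimal sketch of this kind of cleanup is shown below, assuming the venues are already in a DataFrame. The two Waikoloa rows and the Mauna Kea row come from the short list output in this notebook; the rows marked hypothetical, the `HOLE_PATTERN` regular expression, and the `clean_venues` helper are assumptions added for illustration, and a real cleanup would probably need fuzzier name matching.

```python
import pandas as pd

venues = pd.DataFrame(
    [
        ['Waikoloa Village Golf Club', 19.930252, -155.793443],
        ['Waikoloa Village Golf Club', 19.93049, -155.794758],    # duplicate entry
        ['waikoloa  village golf club ', 19.9303, -155.7940],      # hypothetical: inconsistent naming
        ['Mauna Kea Golf Course', 20.006168, -155.823015],
        ['Mauna Kea Golf Course - 10th Green', 20.0050, -155.8240],  # hypothetical: hole-level entry
    ],
    columns=['Golf Course', 'Latitude', 'Longitude'],
)

# Matches hole-level venues such as "10th Green", "3rd Tee" or "Hole 7"
HOLE_PATTERN = r'\b(?:\d+(?:st|nd|rd|th)\s+(?:green|tee|hole|fairway)|hole\s+\d+)\b'


def clean_venues(df):
    out = df.copy()
    # Normalize whitespace and capitalization so name variants compare equal
    out['Golf Course'] = (out['Golf Course'].str.strip()
                          .str.replace(r'\s+', ' ', regex=True)
                          .str.title())
    # Drop entries that describe a single hole rather than a whole course
    is_hole = out['Golf Course'].str.contains(HOLE_PATTERN, case=False, regex=True)
    out = out[~is_hole]
    # Keep the first entry for each course name
    return out.drop_duplicates(subset='Golf Course').reset_index(drop=True)


cleaned = clean_venues(venues)
print(cleaned)
```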

#### Extra data needs
For specific data beyond what is available in Foursquare, it may be scraped from sites such as these:
- [Hawaii Golf](http://www.hawaiigolf.com/courses/)
- [Top 100 Courses](https://www.top100golfcourses.com/golf-courses/north-america/usa/hawaii)
- [Hawaii State Golf](http://www.hawaiistategolf.org/Default.aspx%3Fp%3DDynamicModule%26pageid%3D309547%26ssid%3D198036%26vnf%3D1)

#### Data Examples
Here are some examples of the data we will provide: 
- Island Maps
- The Golf Course Short List
- Course Reviews

### Island Maps

Notes to project reviewer:
- Please note that this is just a data preview, so the map will only be for the Big Island.
- Please scroll down to the bottom so you can review the resulting visuals of the example data.

In [1]:
# Import all needed packages here
import numpy as np
import pandas as pd
import requests
import json
from pandas import json_normalize  # top-level location in pandas 1.0 and later
#import matplotlib.cm as cm
#import matplotlib.colors as colors
import geocoder
import folium
from constants import CLIENT_ID, CLIENT_SECRET, CATID, VERSION, LIMIT
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 20)
print('Libraries imported.')

Libraries imported.


#### Get location data

In [3]:
geo_data = [ #Island      Town             Radius in meters (1 mile is about 1610 m)
            ('Hawaii',   'Naalehu, HI',    161000),
            ('Lanai',    'Lanai City, HI',  16100),
            ('Maui',     'Pukalani, HI',    40250),
            ('Molokai',  'Kualapuu, HI',     4830),
            ('Oahu',     'Laie, HI',        64400),
            ('Kauai',    'Wailua, HI',      40250),
           ]

In [5]:
default_radius = 40234 # in meters = 25 miles
d_island = {}
for island, town, radius in geo_data:
    latlng = None
    # A failed lookup yields no coordinates, so retry before unpacking
    while not latlng:
        latlng = geocoder.arcgis(town).latlng
    lat, lng = latlng
    d_island[town] = {'latitude': lat, 'longitude': lng, 'radius': radius,
                      'venue_list': None, 'venues_df': None, 'map': None}
    print(town, lat, lng)

print('Done getting coordinates.')

Naalehu, HI 19.061420000000055 -155.58232999999998
Lanai City, HI 20.82794000000007 -156.91951999999998
Pukalani, HI 20.839210000000037 -156.34107999999998
Kualapuu, HI 21.151910000000044 -157.03659999999996
Laie, HI 21.64867000000004 -157.92323999999996
Wailua, HI 22.055890000000034 -159.37107999999998
Done getting coordinates.


In [13]:
# Just do the Big Island for the example
lat = 19.061420000000055
lng = -155.58232999999998
radius = 161000

In [14]:
def get_geo_data(lat, lng, radius):
    burl = 'https://api.foursquare.com/v2/venues/search?'
    buri = '&client_id={}&client_secret={}&v={}&categoryId={}&limit={}&radius={}&ll={},{}'
    url = burl + buri
    url = url.format(CLIENT_ID, CLIENT_SECRET, VERSION, CATID, LIMIT, radius, lat, lng)
    return requests.get(url).json()
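
As an alternative sketch, the query string can be assembled from a dictionary with the standard library's `urlencode`, which percent-encodes each value and avoids the leading `?&` of a hand-built template. The helper name `build_search_url` is hypothetical, and the credential, version and category values below are placeholders rather than real keys.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(lat, lng, radius, client_id, client_secret,
                     version, category_id, limit):
    """Return a venue-search URL with a properly encoded query string."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'categoryId': category_id,
        'limit': limit,
        'radius': radius,
        'll': '{},{}'.format(lat, lng),
    }
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params)


# Placeholder values for illustration only
url = build_search_url(19.06142, -155.58233, 161000,
                       'MY_ID', 'MY_SECRET', '20180605', 'CATEGORY_ID', 50)
print(url)

# Round-trip check: the encoded query decodes back to the original values
query = parse_qs(urlsplit(url).query)
print(query['ll'], query['radius'])
```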

In [18]:
json_data = get_geo_data(lat, lng, radius)

In [16]:
def create_venues_list(json_data):
    l_dict_data = json_data['response']['venues'][:]
    l_venues = []
    for dven in l_dict_data:
        ltmp = [dven['name'].title(), 
                float(dven['location']['lat']), 
                float(dven['location']['lng']), 
                round(dven['location']['distance']/1609.344, 1),
                dven['id']]
        l_venues.append(ltmp)
    l_venues.sort()
    return l_venues

In [None]:
l_venues = create_venues_list(json_data)
top_courses = [
 'Four Seasons Resort Hualalai At Historic Ka`Upulehu',
 'Hapuna Golf Course',
 'The Club At Hokulia',
 'Hualalai Golf Course',
 'Kona Country Club: The Ocean Course',
 'Mauna Kea Golf Course',
 'Mauna Lani South Course',
 'Nanea',
 'Waikoloa Village Golf Club']
top_venues = [ven for ven in l_venues if ven[0] in top_courses]
short_list = create_venues_df(top_venues)
#short_list

In [41]:
def create_venues_df(list_data):
    # Each row is [name, lat, lng, distance, venue_id]; the venue id is dropped.
    # Built in one step because DataFrame.append was removed in pandas 2.0.
    rows = [row[:4] for row in list_data]
    return pd.DataFrame(rows, columns=['Golf Course', 'Latitude', 'Longitude', 'Distance'])

In [46]:
def create_map(lat, lng, points):
    the_map = folium.Map(location=[lat, lng], zoom_start=9)
    # Add one marker per golf course; distinct names avoid shadowing the map center
    for course, c_lat, c_lng, dst, vid in points:
        label = folium.Popup(course, parse_html=True)
        folium.CircleMarker([c_lat, c_lng], radius=5, popup=label, color='blue',
            fill=True, fill_color='#3186cc', fill_opacity=0.7).add_to(the_map)

    return the_map

In [47]:
def get_venue_data(venue_id):
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'
    url = url.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
    return requests.get(url).json()

In [None]:
def get_venue_tips(venue_id):
    url = 'https://api.foursquare.com/v2/venues/{}/tips?client_id={}&client_secret={}&v={}&limit={}'
    url = url.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION, LIMIT)
    return requests.get(url).json()
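
The JSON returned by a tips request still has to be flattened into `[course, tip]` rows. The sketch below assumes the payload nests tips under `response -> tips -> items`, each with a `text` field; that shape is an assumption to verify against a live response. The helper name `extract_tip_texts` and the mock payload are hypothetical, with the tip text borrowed from the sample tips in this notebook.

```python
def extract_tip_texts(course_name, tips_json):
    """Return [course, tip text] rows from a tips response.

    Assumes the shape response -> tips -> items -> text; missing keys
    produce an empty list instead of raising.
    """
    items = tips_json.get('response', {}).get('tips', {}).get('items', [])
    return [[course_name, item['text']] for item in items if item.get('text')]


# Mock payload for illustration only (no network call is made)
mock_response = {'response': {'tips': {'count': 2, 'items': [
    {'text': '#14 is worth the entire day... Bring a camera!'},
    {'text': ''},  # empty tips are skipped
]}}}

rows = extract_tip_texts('Kona Country Club', mock_response)
print(rows)
```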

In [None]:
def get_overall_rating(result):
    # Fall back to a message when the venue response carries no rating
    try:
        return result['response']['venue']['rating']
    except (KeyError, TypeError):
        return 'This venue has not been rated yet.'

In [52]:
venue_tips = [
    ['Kona Country Club', '#14 is worth the entire day... Bring a camera!'],
    ['Kona Country Club', "It's fun to crack macadamia nuts w your cart at the 15th hole."],
    ['Kona Country Club', "The effective sea breeze on fairway 18 is more than you feel from the tee."]]

In [53]:
# Build the frame in one step; DataFrame.append was removed in pandas 2.0
tips_df = pd.DataFrame(venue_tips, columns=['Golf Course', 'Tip'])

In [None]:
l_venues = create_venues_list(json_data)
big_island_map = create_map(lat, lng, l_venues)

In [56]:
pd.set_option('display.max_colwidth', 200)

Okay, we have the data so let's look at the results...

### Island Maps

In [48]:
big_island_map

### Golf Course Short List

In [57]:
short_list

| | Golf Course | Latitude | Longitude | Distance |
|---|---|---|---|---|
| 0 | Four Seasons Resort Hualalai At Historic Ka`Upulehu | 19.827752 | -155.991819 | 59.4 |
| 1 | Hapuna Golf Course | 19.995342 | -155.820798 | 66.4 |
| 2 | Hualalai Golf Course | 19.826661 | -155.991906 | 59.3 |
| 3 | Kona Country Club: The Ocean Course | 19.559458 | -155.964965 | 42.6 |
| 4 | Mauna Kea Golf Course | 20.006168 | -155.823015 | 67.2 |
| 5 | Mauna Lani South Course | 19.938855 | -155.867346 | 63.5 |
| 6 | Nanea | 19.79518 | -155.998184 | 57.5 |
| 7 | The Club At Hokulia | 19.506256 | -155.947365 | 38.9 |
| 8 | Waikoloa Village Golf Club | 19.930252 | -155.793443 | 61.7 |
| 9 | Waikoloa Village Golf Club | 19.93049 | -155.794758 | 61.7 |


### Course Tips

In [58]:
tips_df

| | Golf Course | Tip |
|---|---|---|
| 0 | Kona Country Club | #14 is worth the entire day... Bring a camera! |
| 1 | Kona Country Club | It's fun to crack macadamia nuts w your cart at the 15th hole. |
| 2 | Kona Country Club | The effective sea breeze on fairway 18 is more than you feel from the tee. |
