# Hipster Finder - Density Based Clustering

### Capstone Project for IBM's Data Science Certification

Stephen Ewing
06/03/2019

## Introduction

While nobody wants to be called a hipster, recent history has shown that property values tend to explode in the places where hipsters flock.  From Williamsburg, Brooklyn to East Atlanta, hipster hotspots have caught the eye of real estate investors across the country.  The clustering method shown here could be of use to real estate investors, small-business owners, and various tastemakers.

On the other hand, hipsters are often unwashed beardos, and their presence is anathema to many.  If your neighborhood is becoming a hipster haven, you might want to move and start renting out your house for exorbitant amounts of money.

This project will show a method to identify such places.

## Data

The data for this project comes from the Foursquare API.  I query the API with a city and the search term 'hipster', then collect each venue's name and category along with its latitude and longitude.
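As a rough illustration of that collection step, the sketch below pulls the name, latitude, and longitude out of a payload shaped like the one the notebook reads later (`response`, then `groups`, then `items`, then `venue`). The `sample_response` dictionary and the `extract_venues` helper are made up for this example; a real response carries many more fields.

```python
def extract_venues(payload):
    """Return (name, lat, lng) tuples from an explore-style payload."""
    items = payload["response"]["groups"][0]["items"]
    return [
        (
            item["venue"]["name"],
            item["venue"]["location"]["lat"],
            item["venue"]["location"]["lng"],
        )
        for item in items
    ]


# Hypothetical, hand-written payload that mimics only the fields used above.
sample_response = {
    "response": {
        "groups": [
            {
                "items": [
                    {"venue": {"name": "Octane Coffee",
                               "location": {"lat": 33.779402, "lng": -84.410225}}},
                    {"venue": {"name": "The Earl",
                               "location": {"lat": 33.740963, "lng": -84.346027}}},
                ]
            }
        ]
    }
}

venues = extract_venues(sample_response)
for name, lat, lng in venues:
    print(name, lat, lng)
```

Keeping the parsing in a small function like this makes it possible to check the extraction logic without calling the API.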

In [6]:
import pandas as pd
import numpy as np
!conda install -c conda-forge folium=0.5.0 --yes 
import folium
from geopy.geocoders import Nominatim
import json
import requests
import matplotlib.cm as cm
import matplotlib.colors as colors


In [47]:
city = 'atlanta, ga'

geolocator = Nominatim(user_agent="explorer")
location = geolocator.geocode(city)
lat = location.latitude
lon = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(city, lat, lon))

The geographical coordinates of atlanta, ga are 33.7490987, -84.3901849.


In [48]:
# The code was removed by Watson Studio for sharing.
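The hidden cell presumably defined the `CLIENT_ID`, `CLIENT_SECRET`, and `VERSION` values used in the request that follows. One way to supply such values without putting them in a shared notebook is to read them from environment variables. This is only a sketch under that assumption: the variable names (`FOURSQUARE_CLIENT_ID` and so on) and the `load_credentials` helper are hypothetical, not what the original cell contained.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret, version); raise if any is missing."""
    # Hypothetical variable names chosen for this example.
    names = (
        "FOURSQUARE_CLIENT_ID",
        "FOURSQUARE_CLIENT_SECRET",
        "FOURSQUARE_VERSION",
    )
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


# Demonstration with placeholder values instead of real secrets.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "my-id",
    "FOURSQUARE_CLIENT_SECRET": "my-secret",
    "FOURSQUARE_VERSION": "YYYYMMDD",
}
demo_id, demo_secret, demo_version = load_credentials(demo_env)
```

In a real session, the call would be `CLIENT_ID, CLIENT_SECRET, VERSION = load_credentials()` after the variables have been exported in the shell.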

In [49]:
LIMIT = 1000
query = 'hipster'
url = 'https://api.foursquare.com/v2/venues/explore'
params = {
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
    'v': VERSION,
    'near': city,
    'query': query,
    'limit': LIMIT,
}

# make the GET request; requests URL-encodes the parameters (e.g. the space in the city)
results = requests.get(url, params=params).json()["response"]['groups'][0]['items']

# make the list of venues
venues_list = [(
    v['venue']['name'], 
    v['venue']['location']['lat'], 
    v['venue']['location']['lng'],
    v['venue']['categories'][0]['name']) for v in results]

nearby_venues = pd.DataFrame(
    venues_list,
    columns=['Venue', 'Latitude', 'Longitude', 'Venue Category'])
nearby_venues.head(10)

|   | Venue | Latitude | Longitude | Venue Category |
|---|-------|----------|-----------|----------------|
| 0 | Octane Coffee | 33.779402 | -84.410225 | Coffee Shop |
| 1 | The Earl | 33.740963 | -84.346027 | Bar |
| 2 | MJQ Concourse | 33.774218 | -84.363277 | Nightclub |
| 3 | Sun in My Belly | 33.764165 | -84.316531 | Breakfast Spot |
| 4 | Octane Coffee + Little Tart Bakeshop | 33.746074 | -84.372786 | Coffee Shop |
| 5 | The Highlander | 33.779144 | -84.36745 | Dive Bar |
| 6 | Jack's Pizza & Wings | 33.761403 | -84.365232 | Pizza Place |
| 7 | The Local | 33.773981 | -84.362285 | Bar |
| 8 | Grant Central Pizza | 33.740094 | -84.345674 | Pizza Place |
| 9 | 97 Estoria | 33.752141 | -84.363384 | Bar |
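The coordinates above are what the clustering step works from. To get a feel for the distances involved, here is a minimal standard-library sketch of the haversine great-circle distance between two of the listed venues. The `haversine_km` helper is not part of the notebook, and the 6371 km radius is the commonly used mean Earth radius.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Coordinates taken from rows 1 and 8 of the table above.
the_earl = (33.740963, -84.346027)
grant_central_pizza = (33.740094, -84.345674)

distance_km = haversine_km(*the_earl, *grant_central_pizza)
print(round(distance_km, 3), "km")
```

The notebook itself standardizes latitude and longitude before clustering, so its `eps` is expressed in standard-deviation units rather than kilometres. A helper like this is only useful for sanity-checking how far apart neighboring venues really are.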


In [50]:
map_hipsters = folium.Map(location=[lat, lon], zoom_start=13)


# add markers to map
# distinct loop names keep the city-level lat/lon intact for the next map
for venue_lat, venue_lon, Venue, Category in zip(nearby_venues['Latitude'], nearby_venues['Longitude'], nearby_venues['Venue'], nearby_venues['Venue Category']):
    label = 'Venue: {}\n Category: {}'.format(Venue, Category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [venue_lat, venue_lon],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.7,
        parse_html=False).add_to(map_hipsters)  

folium.TileLayer('cartodbdark_matter').add_to(map_hipsters)
    
map_hipsters

In [43]:
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# standardize the coordinates; eps below is therefore in standard-deviation units
Clus_dataSet = nearby_venues[['Latitude','Longitude']]
Clus_dataSet = np.nan_to_num(Clus_dataSet)
Clus_dataSet = StandardScaler().fit_transform(Clus_dataSet)

# Compute DBSCAN
db = DBSCAN(eps=0.15, min_samples=4).fit(Clus_dataSet)
core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
core_samples_mask[db.core_sample_indices_] = True
labels = db.labels_
nearby_venues["Clus_Db"]=labels

# A sample of clusters
nearby_venues[["Venue", "Clus_Db"]].head(10)

|   | Venue | Clus_Db |
|---|-------|---------|
| 0 | Off the Record | -1 |
| 1 | Strangeways | -1 |
| 2 | The Wild Detectives | 0 |
| 3 | The Ginger Man | -1 |
| 4 | Houndstooth Coffee | 1 |
| 5 | Lakewood Landing | -1 |
| 6 | Mudsmith | 1 |
| 7 | Company Cafe | 1 |
| 8 | Crooked Tree Coffeehouse | -1 |
| 9 | Kung Fu Saloon | -1 |
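DBSCAN gives the label -1 to points it treats as noise and numbers the clusters it finds from 0 upward. A small standard-library sketch of tallying such labels, using the ten values shown above; `summarize_labels` is a helper invented for this example.

```python
from collections import Counter


def summarize_labels(labels):
    """Return (noise_count, {cluster_label: size}) for DBSCAN-style labels."""
    counts = Counter(labels)
    noise = counts.pop(-1, 0)  # -1 marks points that belong to no cluster
    return noise, dict(sorted(counts.items()))


# The Clus_Db values from the ten sample rows above.
sample_labels = [-1, -1, 0, -1, 1, -1, 1, 1, -1, -1]

noise_count, cluster_sizes = summarize_labels(sample_labels)
print("noise points:", noise_count)
print("cluster sizes:", cluster_sizes)
```

Because these are only the first ten rows, the tallies are not full cluster sizes. Running the same helper on `db.labels_` would cover every venue.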


In [44]:
map_hipsters = folium.Map(location=[lat, lon], zoom_start=13)

hip_clusters = nearby_venues[nearby_venues['Clus_Db'] >= 0]

colors_array = cm.rainbow(np.linspace(0, 1, len(set(hip_clusters['Clus_Db']))))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to map
for venue_lat, venue_lon, Venue, Category, Cluster in zip(hip_clusters['Latitude'], hip_clusters['Longitude'], hip_clusters['Venue'], hip_clusters['Venue Category'], hip_clusters['Clus_Db']):
    label = 'Venue: {}\n Category: {}'.format(Venue, Category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [venue_lat, venue_lon],
        radius=5,
        popup=label,
        color=rainbow[Cluster],  # cluster labels start at 0, so index directly
        fill=True,
        fill_color=rainbow[Cluster],
        fill_opacity=0.7,
        parse_html=False).add_to(map_hipsters)  

folium.TileLayer('cartodbdark_matter').add_to(map_hipsters)
    
map_hipsters