# Leveraging Location Data of Anoka County for Optimal Viability of New Storefronts

### By: Jonathan Resch
#### Contact Information: jonathanresch@gmail.com 

#### Published: 12/13/2020



### Table of Contents

* [Executive summary/Description of Problem](#abstract)

* [Background/Introductory Section](#background)
     * [Description of Data](#data)

* [Methodology](#method)

* [Results](#results)

* [Discussion](#discuss)

* [Conclusion](#con)

* [Acknowledgements](#ack)

* [References](#reference)

* [Appendices](#appendix)
<a class="anchor" id="abstrct"></a>


<a class="anchor" id="abstract"></a>
### Executive Summary/Description of Problem/Introduction

&emsp;The Covid-19 pandemic has presented hardship on many people this year.  Truly this is an unprecedent time of sadness for many in the forms of health, loss of loved ones, and financial burdens.  There are a few people who have been able to adapt to the new pressures of Covid and found opportunity for new business ventures from everything from video chat software, contact-less scheduling, or traveling distances to nurse for hospitals in need.  These select people have seen a boom in their personal finances and should they want to invest back into their community want to ensure their new windfall is put in the best place possible.  Using location data from Foursquare, this report hopes to elucidate the best location in anoka county for a new store front. In addition, this report will generate suggestions on what may maximize consumer adoption in that area.

<a class="anchor" id="background"></a>
### Background/Introductory Section
&emsp; Minnesota borders Canada in the middle of the United States. Anoka county is north of the capital city of Minneapolis.  The area can be best described as rural to increasingly more suburban as you approach Minneapolis.  The county has a total population of approximately 356,921.(https://www.census.gov/quickfacts/anokacountyminnesota)

&emsp;Location is the predominant factor in deciding where the store front is placed as financial data is outside of the scope of this report.  Location data is additionaly used in this report to imply popularity and success of the corresponding business.


In [512]:
from IPython.display import Image
from IPython.core.display import HTML 
Image(url= "https://www.usnews.com/news/healthiest-communities/img/counties/27003.png", width=300, height=342)

<a class="anchor" id="data"></a>
#### Description of Data
&emsp; Anoka county, MN consists of 20 cities. Using the Foursquare API, venues were searched within a 2500m radius of the center of the latitude and longitude of the listed cities.  580 venues were found.  There were 8 cities with less then 10 venues listed with these parameter and 7 cities that generated 68% of the data used.  Data includes the city's name, venue's name, latitude and longitude of both city and venue, and a classification of the venue.

&emsp; While there are risks in taking into account these cities with few venues listed on Foursquare, these could also be untapped potential.  Thankfully there is data supporting popularity of businesses in other cities that can give ideas on what could be successful in the cities of these counties.



In [513]:
import pandas as pd   
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
%matplotlib inline
from IPython.display import display
from urllib.request import urlopen
import requests
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans
import folium 
from geopy.geocoders import Nominatim 
import json 
from pandas.io.json import json_normalize 
import matplotlib.cm as cm
import matplotlib.colors as colors

In [514]:
anokacounty = ['Andover',
               'Anoka',
               'Bethel',
               'Blaine',
               'Centerville',
               'Circle Pines',
               'Columbia Heights',
               'Columbus',
               'Coon Rapids',
               'East Bethel',
               'Fridley',
               'Ham Lake',
               'Hilltop',
               'Lexington',
               'Lino Lakes',
               'Nowthen',
               'Oak Grove',
               'Ramsey',
               'Saint Francis',
               'Spring Lake Park']

In [515]:
df = pd.DataFrame(anokacounty, columns=['City'])
df['latitude'] = 0
df['longitude'] = 0

In [516]:
for i in range(len(anokacounty)):
    anokacounty[i] = anokacounty[i] + ', MN'

In [517]:
for i  in range(len(anokacounty)):
    address = anokacounty[i]
    geolocator = Nominatim(user_agent="Bexplorer")
    location = geolocator.geocode(address)
    df['latitude'].loc[i] = location.latitude
    df['longitude'].loc[i] = location.longitude
print("Done gathering location data")

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value)


Done gathering location data


In [518]:
CLIENT_ID = '414YC1C2SV3HIWWBNGYOPD4NQLZWY4VPTUEVQUHYDIFKTTVI' 
CLIENT_SECRET = 'SGAOSWWB4AHWJHLKI02JQKTLJ3OUR4O25NLQ02WOIRXVTYIG' 
ACCESS_TOKEN = 'QGCU0QJ1S1XDO2NYY43YH1BCPO0MTO3NAS4CNXYNDSPUKHTG' 
VERSION = '20180604'
LIMIT = 100 

In [519]:
def getNearbyVenues(names, latitudes, longitudes, radius=2500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']   
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name'], 
            v['venue']['id']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', #
                  'Neighborhood Latitude', #
                  'Neighborhood Longitude', #
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category',
                  'Venue ID']
    
    return(nearby_venues)

In [520]:
anokacounty_venues = getNearbyVenues(names=df['City'],
                                   latitudes=df['latitude'],
                                   longitudes=df['longitude'])

Andover
Anoka
Bethel
Blaine
Centerville
Circle Pines
Columbia Heights
Columbus
Coon Rapids
East Bethel
Fridley
Ham Lake
Hilltop
Lexington
Lino Lakes
Nowthen
Oak Grove
Ramsey
Saint Francis
Spring Lake Park


In [521]:
anokacounty_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue ID
0,Andover,45.233298,-93.291341,Bunker Hills Regional Park,45.220310,-93.276911,Park,4b94307af964a520a96d34e3
1,Andover,45.233298,-93.291341,Andover YMCA,45.246526,-93.308480,Gym,4d8018d6360b224ba349d456
2,Andover,45.233298,-93.291341,The Bean,45.245793,-93.305646,Café,4efca9797ee59da372083e7f
3,Andover,45.233298,-93.291341,Pizza Ranch,45.220488,-93.310777,Pizza Place,52bf6674498ebf3d1a801592
4,Andover,45.233298,-93.291341,Dunkin',45.218475,-93.307057,Donut Shop,58bc67512bc5e25a80ce4ec8
...,...,...,...,...,...,...,...,...
575,Spring Lake Park,45.115427,-93.249287,Marino's,45.101948,-93.237400,Italian Restaurant,4be566112468c928acf2ff42
576,Spring Lake Park,45.115427,-93.249287,ABC Liquors,45.111671,-93.220697,Liquor Store,4e7bcae262e1b601280e23e2
577,Spring Lake Park,45.115427,-93.249287,Shoeaholic,45.110992,-93.278198,Shoe Store,543d074b498e006131f0c949
578,Spring Lake Park,45.115427,-93.249287,El Rinconcito Latino Resturant,45.111022,-93.278475,Burrito Place,4c746b282db5236abe68ba79


In [522]:
# create map of Anoka County Venues using latitude and longitude values
map_anokac2 = folium.Map(location=[45.160799, -93.234949], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(anokacounty_venues['Venue Latitude'], anokacounty_venues['Venue Longitude'], anokacounty_venues['Venue']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_anokac2)  
    
map_anokac2

In [523]:
# one hot encoding
anokac_onehot = pd.get_dummies(anokacounty_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
anokac_onehot['Neighborhood'] = anokacounty_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [anokac_onehot.columns[-1]] + list(anokac_onehot.columns[:-1])
anokac_onehot = anokac_onehot[fixed_columns]

In [524]:
anokac_grouped = anokac_onehot.groupby('Neighborhood').mean().reset_index()

In [525]:
num_top_venues = 10
for hood in anokac_grouped['Neighborhood']:
    temp = anokac_grouped[anokac_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})

In [526]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [527]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))
# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = anokac_grouped['Neighborhood']
for ind in np.arange(anokac_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(anokac_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Andover,Park,Coffee Shop,Pizza Place,Bank,Fast Food Restaurant,Donut Shop,Movie Theater,Convenience Store,Café,Restaurant
1,Anoka,Park,Other Repair Shop,Racetrack,Football Stadium,Disc Golf,Electronics Store,Donut Shop,Dog Run,Dive Bar,Discount Store
2,Bethel,Home Service,Dive Bar,Brewery,Bar,Construction & Landscaping,Other Repair Shop,Department Store,Dessert Shop,Diner,Disc Golf
3,Blaine,Sandwich Place,Fast Food Restaurant,American Restaurant,Department Store,Taco Place,Pet Store,Skating Rink,Coffee Shop,Cosmetics Shop,Big Box Store
4,Centerville,Park,Beach,Baseball Field,Gas Station,Hotel,Ice Cream Shop,Fast Food Restaurant,Mexican Restaurant,Pizza Place,Campground


In [528]:
# set number of clusters
kclusters = 4
anokac_grouped_clustering = anokac_grouped.drop('Neighborhood', 1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(anokac_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 0, 2, 3, 3, 3, 3, 0, 3, 2])

In [529]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
anokac_merged = anokacounty_venues
anokac_merged = anokac_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [530]:
# create map
map_clusters = folium.Map(location=[45.160799, -93.234949], zoom_start=10)
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(anokac_merged['Neighborhood Latitude'], anokac_merged['Neighborhood Longitude'], anokac_merged['Neighborhood'], anokac_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)      
map_clusters

In [531]:
neighborhoods_venues_sorted.sort_values('Cluster Labels')

Unnamed: 0,Cluster Labels,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,0,Anoka,Park,Other Repair Shop,Racetrack,Football Stadium,Disc Golf,Electronics Store,Donut Shop,Dog Run,Dive Bar,Discount Store
7,0,Columbus,Auto Garage,Video Store,Park,Lake,Nature Preserve,Diner,Donut Shop,Dog Run,Dive Bar,Discount Store
18,1,Saint Francis,Nightlife Spot,Yoga Studio,Disc Golf,Falafel Restaurant,Electronics Store,Donut Shop,Dog Run,Dive Bar,Discount Store,Diner
9,2,East Bethel,Home Service,Disc Golf,Construction & Landscaping,Golf Course,Falafel Restaurant,Electronics Store,Donut Shop,Dog Run,Dive Bar,Discount Store
2,2,Bethel,Home Service,Dive Bar,Brewery,Bar,Construction & Landscaping,Other Repair Shop,Department Store,Dessert Shop,Diner,Disc Golf
15,2,Nowthen,Construction & Landscaping,Home Service,Pizza Place,Liquor Store,Burger Joint,Baseball Field,Bar,Dog Run,Electronics Store,Donut Shop
17,3,Ramsey,Furniture / Home Store,Pizza Place,Bar,Fast Food Restaurant,Sandwich Place,Pharmacy,Park,Chinese Restaurant,Donut Shop,Automotive Shop
16,3,Oak Grove,Soccer Field,Print Shop,Lake,Furniture / Home Store,Farm,Discount Store,Falafel Restaurant,Electronics Store,Donut Shop,Dog Run
14,3,Lino Lakes,Beach,Park,Golf Course,Baseball Field,Moving Target,Campground,Lake,Discount Store,Electronics Store,Donut Shop
13,3,Lexington,Pizza Place,Sandwich Place,Grocery Store,Video Store,Bar,Discount Store,Construction & Landscaping,Café,Cajun / Creole Restaurant,Pharmacy


In [532]:
neighborhoods_venues_sorted['Cluster Labels'].value_counts()

3    14
2     3
0     2
1     1
Name: Cluster Labels, dtype: int64

# K-means Analysis of Anoka County
Using a K-means analysis,  I, frankly, was disappointed in the results. Defining the number of clusters greater then 5 resulted in the algorithm seperating out only one value in each cluster except the last cluster in which all the rest of the remaining cities would be placed.  

However, using 3 clusters, we can see that at least the towns of East Bethel, Bethely and Nowthen were grouped together indicating at least some relation there.  St. Francis, possibly rightly so, was placed in its own group. (St. Francis' top 3 venues being unique amongst all cities Nightlife Spot, Yoga Studio, Food).  This also has something to do with the sample size.  Only one venue was pulled from St. Francis.

In [533]:
anokacounty_venues["Venue Category"].value_counts().index.tolist()

['Pizza Place',
 'Park',
 'Sandwich Place',
 'American Restaurant',
 'Coffee Shop',
 'Fast Food Restaurant',
 'Pharmacy',
 'Bar',
 'Mexican Restaurant',
 'Grocery Store',
 'Video Store',
 'Gym',
 'Pet Store',
 'Chinese Restaurant',
 'Discount Store',
 'Furniture / Home Store',
 'Ice Cream Shop',
 'Big Box Store',
 'Liquor Store',
 'Baseball Field',
 'Salon / Barbershop',
 'Cosmetics Shop',
 'Automotive Shop',
 'Construction & Landscaping',
 'Bakery',
 'Hardware Store',
 'Convenience Store',
 'Lake',
 'Gym / Fitness Center',
 'ATM',
 'Trail',
 'Playground',
 'Café',
 'Sports Bar',
 'Bank',
 'Beach',
 'Noodle House',
 'Golf Course',
 'Brewery',
 'Thrift / Vintage Store',
 'Home Service',
 'Spa',
 'Burrito Place',
 'Supermarket',
 'Donut Shop',
 'Other Repair Shop',
 'Korean Restaurant',
 'Hotel',
 'Hobby Shop',
 'Flower Shop',
 'Japanese Restaurant',
 'Disc Golf',
 'Sporting Goods Shop',
 'Hookah Bar',
 'Asian Restaurant',
 'Garden Center',
 'Mobile Phone Shop',
 'Thai Restaurant',
 'Caj

In [534]:
anokacounty_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue ID
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Andover,22,22,22,22,22,22,22
Anoka,6,6,6,6,6,6,6
Bethel,6,6,6,6,6,6,6
Blaine,60,60,60,60,60,60,60
Centerville,18,18,18,18,18,18,18
Circle Pines,56,56,56,56,56,56,56
Columbia Heights,57,57,57,57,57,57,57
Columbus,5,5,5,5,5,5,5
Coon Rapids,44,44,44,44,44,44,44
East Bethel,4,4,4,4,4,4,4


In [535]:
Blaine = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Blaine']
CirclePine = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Circle Pines']
ColumHeights = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Columbia Heights']
Fridley = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Fridley']
Hilltop = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Hilltop']
SpringL = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Spring Lake Park']
Ramsey = anokacounty_venues.loc[anokacounty_venues["Neighborhood"]=='Ramsey']

In [536]:
Blaine = Blaine.append(CirclePine, ignore_index=True)
Blaine = Blaine.append(ColumHeights, ignore_index=True)
Blaine = Blaine.append(Fridley, ignore_index=True)
Blaine = Blaine.append(Hilltop, ignore_index=True)
Blaine = Blaine.append(SpringL, ignore_index=True)
Blaine = Blaine.append(Ramsey, ignore_index=True)

In [537]:
Blaine['Neighborhood'].value_counts()
Blaine.shape

(393, 8)

In [538]:
# one hot encoding
anokac_onehot = pd.get_dummies(Blaine[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
anokac_onehot['Neighborhood'] = Blaine['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [anokac_onehot.columns[-1]] + list(anokac_onehot.columns[:-1])
anokac_onehot = anokac_onehot[fixed_columns]

In [539]:
anokac_grouped = anokac_onehot.groupby('Neighborhood').mean().reset_index()

In [540]:
num_top_venues = 10
for hood in anokac_grouped['Neighborhood']:
    temp = anokac_grouped[anokac_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})

In [541]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))
# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = anokac_grouped['Neighborhood']

for ind in np.arange(anokac_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(anokac_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Blaine,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop
1,Circle Pines,Pizza Place,Bar,Video Store,American Restaurant,Grocery Store,Salon / Barbershop,Coffee Shop,Discount Store,Pharmacy,Sandwich Place
2,Columbia Heights,Park,Pet Store,Pharmacy,Pizza Place,Mexican Restaurant,Coffee Shop,American Restaurant,Bar,Gym,Sandwich Place
3,Fridley,Gym,Sandwich Place,Video Store,American Restaurant,Pizza Place,Coffee Shop,Chinese Restaurant,Mexican Restaurant,Pharmacy,Café
4,Hilltop,Mexican Restaurant,Pizza Place,Park,Pharmacy,Video Store,Coffee Shop,Sandwich Place,Liquor Store,American Restaurant,Grocery Store


In [542]:
# set number of clusters
kclusters = 4
anokac_grouped_clustering = anokac_grouped.drop('Neighborhood', 1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(anokac_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 0, 1, 1, 1, 3, 2])

In [543]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
anokac_merged = Blaine
anokac_merged = anokac_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [544]:
neighborhoods_venues_sorted.sort_values('Cluster Labels')

Unnamed: 0,Cluster Labels,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,0,Circle Pines,Pizza Place,Bar,Video Store,American Restaurant,Grocery Store,Salon / Barbershop,Coffee Shop,Discount Store,Pharmacy,Sandwich Place
2,1,Columbia Heights,Park,Pet Store,Pharmacy,Pizza Place,Mexican Restaurant,Coffee Shop,American Restaurant,Bar,Gym,Sandwich Place
3,1,Fridley,Gym,Sandwich Place,Video Store,American Restaurant,Pizza Place,Coffee Shop,Chinese Restaurant,Mexican Restaurant,Pharmacy,Café
4,1,Hilltop,Mexican Restaurant,Pizza Place,Park,Pharmacy,Video Store,Coffee Shop,Sandwich Place,Liquor Store,American Restaurant,Grocery Store
0,2,Blaine,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop
6,2,Spring Lake Park,Coffee Shop,Sports Bar,Gym,Sandwich Place,Liquor Store,Lingerie Store,Cosmetics Shop,Bookstore,Grocery Store,Fast Food Restaurant
5,3,Ramsey,Bar,Furniture / Home Store,Fast Food Restaurant,Pizza Place,Gym / Fitness Center,Donut Shop,Park,Sandwich Place,Grocery Store,Pharmacy


In [545]:
anokac_merged.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue ID,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Blaine,45.160799,-93.234949,Noodles & Company,45.15885,-93.234269,Noodle House,4bcdee50b6c49c7487379691,2,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop
1,Blaine,45.160799,-93.234949,Heat Yoga Studio,45.155261,-93.23419,Yoga Studio,4c9938779c663704d73c4efd,2,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop
2,Blaine,45.160799,-93.234949,Starbucks,45.165493,-93.2334,Coffee Shop,4bf6d0acb182c9b6b48b745a,2,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop
3,Blaine,45.160799,-93.234949,Hajime Sushi,45.169521,-93.231209,Sushi Restaurant,4f7e29f9e4b02cb6c6dc5cb9,2,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop
4,Blaine,45.160799,-93.234949,National Sports Center,45.15852,-93.226046,Athletics & Sports,487c5986f964a5201e511fe3,2,Sandwich Place,Fast Food Restaurant,American Restaurant,Wings Joint,Skating Rink,Pizza Place,Cosmetics Shop,Department Store,Taco Place,Coffee Shop


In [546]:
# create map
map_clusters2 = folium.Map(location=[45.160799, -93.234949], zoom_start=10)
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(anokac_merged['Neighborhood Latitude'], anokac_merged['Neighborhood Longitude'], anokac_merged['Neighborhood'], anokac_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters2

In [None]:
epsilon = 0.3
minimumSamples = 7
db = DBSCAN(eps=epsilon, min_samples=minimumSamples).fit(X)
labels = db.labels_
labels_

In [None]:
a

In [378]:
anokac_grouped_clustering

Unnamed: 0,ATM,Accessories Store,Airport Terminal,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Automotive Shop,...,Tex-Mex Restaurant,Thai Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Veterans' Organization,Video Game Store,Video Store,Wings Joint,Yoga Studio
0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.016667,0.0,0.0,...,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.033333,0.016667
4,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,0.017857,0.0,0.017857,0.053571,0.0,0.017857,0.017857,0.0,0.0,0.017857,...,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.053571,0.0,0.0
6,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0
7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0
8,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.022727,...,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0
9,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [None]:
duh = anokacounty_venues.loc[anokacounty_venues['Venue ID']=='4d04253ae350b60cb4988142']
duh

In [None]:
venue_latitude = anokacounty_venues.loc[8, 'Venue Latitude'] # neighborhood latitude value
venue_longitude = anokacounty_venues.loc[8, 'Venue Longitude'] # neighborhood longitude value

venue_name = anokacounty_venues.loc[8, 'Venue'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(venue_name, 
                                                               venue_latitude, 
                                                               venue_longitude))

In [94]:
def getRatings(names, latitudes, longitudes, radius=1000):
    
    venues_list1 = []
    venues_list2 = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/{}?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        #only 500 calls per day as this is premo supremo    
            name,
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)


        # make the GET request
        rating = requests.get(url).json()["response"]["venue"]['rating']
        print(rating)
        if rating == NULL:
            rating = 0
        NumberofRatings = requests.get(url).json()["response"]["venue"]['ratingSignals']
        if rating == NULL:
            rating = 0
        venues_list1.append(rating)
        venues_list2.append(NumberofRatings)

    v = {'Venue ID':[name],'Rating':[rating],'Number of Ratings':[NumberofRatings]}
    nearby_venues2 = pd.DataFrame(v)

  #  nearby_venues2 = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
   # nearby_venues2.columns = ['Neighborhood', #
    #              'Neighborhood Latitude', #
     #             'Neighborhood Longitude', #
      #            'Venue Rating', 
       #           'Venue Number of Ratings']
    return(nearby_venues2)

In [95]:
anokacountyvenuesRate = getRatings(names=anokacounty_venues['Venue ID'],
                                   latitudes=anokacounty_venues['Venue Latitude'],
                                   longitudes=anokacounty_venues['Venue Longitude'])

5cf7d1382a7ab6002c52bf37


KeyError: 'rating'

In [85]:
anokacountyvenuesRate

Unnamed: 0,Venue,Rating,Number of Ratings
0,4d04253ae350b60cb4988142,6.3,8


anokacountyvenuesRate = getRatings(names=anokacounty_venues['Venue ID'],
                                   latitudes=anokacounty_venues['Venue Latitude'],
                                   longitudes=anokacounty_venues['Venue Longitude']
                                  )'rating'
print("done")

In [None]:
#results = requests.get(url).json()
#results
#Latitude and longitude values of Bunker Hills Regional Park are 45.22031033461877, -93.2769113696225.
#unique ID 4b94307af964a520a96d34e3 called 'venue': {'id': '4b94307af964a520a96d34e3', in JSON
#8.4/10 called 'rating': 8.4, in JSON
#44 ratings called 'ratingSignals': 44, in JSON

In [69]:
test

Unnamed: 0,Venue ID,Venue Latitude,Venue Longitude
0,4d04253ae350b60cb4988142,45.4036,-93.2678


In [67]:
#Test dataframe in case the function gets funky
test = pd.DataFrame([['bob','greg','Hello']], columns=('Venue ID','Venue Latitude','Venue Longitude'))
test['Venue Latitude'][0] = anokacounty_venues['Venue Latitude'][8]
test['Venue Longitude'][0] = anokacounty_venues['Venue Longitude'][8]
test['Venue ID'][0] = anokacounty_venues['Venue ID'][8]
display(test)
anokacountyvenuesRate = getRatings(names=test['Venue ID'],
                                   latitudes=test['Venue Latitude'],
                                   longitudes=test['Venue Longitude']
                                  )
display(anokacountyvenuesRate.head())

#should display data for Firsthand Construction LLC
#includes 0 ratings and 0 number of ratings which is funky but thats what it is

Unnamed: 0,Venue ID,Venue Latitude,Venue Longitude
0,4d04253ae350b60cb4988142,45.4036,-93.2678


4d04253ae350b60cb4988142


Unnamed: 0,0
0,[6.3]


In [51]:
anokacountyvenuesRate.shape

(31, 5)

In [None]:
# type your answer here
radius = 500
ID = '4b94307af964a520a96d34e3'
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
#url = 'https://api.foursquare.com/v2/venues/4b94307af964a520a96d34e3?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
##url = 'https://api.foursquare.com/v2/venues/{}?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
#bunker hill unique ID above
#only 500 calls per day as this is premo supremo    
    ##ID,
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    venue_latitude, 
    venue_longitude, 
    radius, 
    LIMIT)
url

In [56]:
#bunker hill premo ID test
radius = 500
ID = '4d04253ae350b60cb4988142'
#url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
#url = 'https://api.foursquare.com/v2/venues/4b94307af964a520a96d34e3?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
url = 'https://api.foursquare.com/v2/venues/{}?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
#bunker hill unique ID above
#only 500 calls per day as this is premo supremo    
    ID,
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    venue_latitude, 
    venue_longitude, 
    radius, 
    LIMIT)

venues_list=[]
url
results = requests.get(url).json()["response"]["venue"]
        #NumberofRatings = requests.get(url).json()["response"]["venue"]['ratingSignals']
name = 'goony jones'
lat = 1
lng = -1
venues_list.append([(
   name, 
   lat, 
   lng, 
   v['rating'], 
   v['ratingSignals']) for v in results])


        

TypeError: 'module' object is not subscriptable

In [None]:
def getRatings(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/{}?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        #only 500 calls per day as this is premo supremo    
            name,
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)

            
        # make the GET request
        results = requests.get(url).json()["response"]["venue"]
        venues_rating=result['response']['venues']['rating']
        
        venues_list.append(
            name, 
            lat, 
            lng, 
            ['rating'], 
            ['ratingSignals'])
        



    nearby_venues2 = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues2.columns = ['Neighborhood', #
                  'Neighborhood Latitude', #
                  'Neighborhood Longitude', #
                  'Venue Rating', 
                  'Venue Number of Ratings']
    return(nearby_venues2)

In [None]:
all in response. all in venue

{'meta': {'code': 200, 'requestId': '5fd672773878a257d034b2f8'},
 'response': {'venue': {'id': '4d04253ae350b60cb4988142',
   'name': 'The Hydrant Bar',
   'contact': {},
   'location': {'lat': 45.403551, #loc
    'lng': -93.267837,
    'labeledLatLngs': [{'label': 'display', 'lat': 45.403551,'lng': -93.267837},
     {'label': 'entrance', 'lat': 45.403691, 'lng': -93.267804}],
    'distance': 0,
    'postalCode': '55005',
    'cc': 'US',
    'city': 'Bethel',
    'state': 'MN',
    'country': 'United States',
    'formattedAddress': ['Bethel, MN 55005', 'United States']}, #loc
   'canonicalUrl': 'https://foursquare.com/v/the-hydrant-bar/4d04253ae350b60cb4988142',
   'categories': [{'id': '4bf58dd8d48988d118941735', #1 [{
     'name': 'Dive Bar',
     'pluralName': 'Dive Bars',
     'shortName': 'Dive Bar',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/divebar_','suffix': '.png'},
     'primary': True}], #1 }]
   'verified': False,
   'stats': {'tipCount': 4},
   'price': {'tier': 1, 'message': 'Cheap', 'currency': '$'},
   'likes': {'count': 4, #2 {
    'groups': [{'type': 'others',
      'count': 4,
      'items': [{'isSanctioned': False,
        'firstName': 'Miranda',
        'lastName': 'L',
        'countryCode': 'US'},
       {'isSanctioned': False,
        'firstName': 'Angela',
        'lastName': 'J',
        'countryCode': 'US'},
       {'isSanctioned': False,
        'firstName': 'Morgan',
        'lastName': 'D',
        'countryCode': 'US'},
       {'isSanctioned': False,
        'firstName': 'Faith',
        'lastName': 'S',
        'countryCode': 'US'}]}],
    'summary': '4 Likes'},  #2 }
   'dislike': False,
   'ok': False,
   'rating': 6.3,
   'ratingColor': 'FFC800',
   'ratingSignals': 8,
   'allowMenuUrlEdit': True,
   'beenHere': {'count': 0,
    'unconfirmedCount': 0,
    'marked': False,
    'lastCheckinExpiredAt': 0},
   'specials': {'count': 0, 'items': []},
   'photos': {'count': 3,
    'groups': [{'type': 'venue',
      'name': 'Venue photos',
      'count': 3,
      'items': [{'id': '5154d251e4b0ca06ecb0a0f5',
        'createdAt': 1364513361,
        'source': {'name': 'Foursquare for iOS',
         'url': 'https://foursquare.com/download/#/iphone'},
        'prefix': 'https://fastly.4sqi.net/img/general/',
        'suffix': '/2377328_5Ql9G16VFGspi-xAUd6f6-FnYJIVw94i6qwacuzJQOU.jpg',
        'width': 720,
        'height': 960,
        'user': {'isSanctioned': False,
         'firstName': 'Faith',
         'lastName': 'S',
         'countryCode': 'US'},
        'visibility': 'public'}]}]},
   'reasons': {'count': 0, 'items': []},
   'hereNow': {'count': 0, 'summary': 'Nobody here', 'groups': []},
   'createdAt': 1292117306,
   'tips': {'count': 4,
    'groups': [{'type': 'others',
      'name': 'All tips',
      'count': 4,
      'items': [{'id': '4e6c2cbf483b1f5ebdf949e2',
        'createdAt': 1315712191,
        'text': 'The pizza is HORRIBLE!',
        'type': 'user',
        'canonicalUrl': 'https://foursquare.com/item/4e6c2cbf483b1f5ebdf949e2',
        'lang': 'en',
        'likes': {'count': 1,
         'groups': [{'type': 'others',
           'count': 1,
           'items': [{'isSanctioned': False,
             'firstName': 'Lynn',
             'lastName': 'G',
             'countryCode': 'US'}]}],
         'summary': '1 like'},
        'logView': True,
        'agreeCount': 1,
        'disagreeCount': 0,
        'todo': {'count': 0},
        'user': {'isSanctioned': False,
         'firstName': 'James',
         'lastName': 'K',
         'countryCode': 'US'}}]}]},
   'shortUrl': 'http://4sq.com/i7bH4S',
   'timeZone': 'America/Chicago',
   'listed': {'count': 0,
    'groups': [{'type': 'others',
      'name': 'Lists from other people',
      'count': 0,
      'items': []}]},
   'seasonalHours': [],
   'pageUpdates': {'count': 0, 'items': []},
   'inbox': {'count': 0, 'items': []},
   'attributes': {'groups': [{'type': 'price',
      'name': 'Price',
      'summary': '$',
      'count': 1,
      'items': [{'displayName': 'Price',
        'displayValue': '$',
        'priceTier': 1}]},
     {'type': 'touchTunes',
      'name': 'TouchTunes Jukebox',
      'summary': 'Yes',
      'count': 1,
      'items': [{'displayName': 'No', 'displayValue': 'Yes'}]}]},
   'bestPhoto': {'id': '5154d251e4b0ca06ecb0a0f5',
    'createdAt': 1364513361,
    'source': {'name': 'Foursquare for iOS',
     'url': 'https://foursquare.com/download/#/iphone'},
    'prefix': 'https://fastly.4sqi.net/img/general/',
    'suffix': '/2377328_5Ql9G16VFGspi-xAUd6f6-FnYJIVw94i6qwacuzJQOU.jpg',
    'width': 720,
    'height': 960,
    'visibility': 'public'}}}}
​
​
​
