# Introduction/Business Problem

### Do a university's local venues play a role in its students' success?
Universities in the US vary widely in size, audience, and student success rates. Some colleges are small and cater mainly to in-state students. Others are massive and draw a wide variety of students from around the world. What drives students to attend these large universities? Some may argue it is the student success rate, while others simply look for programs that match what they want to study and eventually pursue in the workforce. What is often overlooked, however, is how a student feels about living near the university, which depends in part on the venues available nearby. For example, a student considering an out-of-state college may prefer a university with a variety of local venues to visit and explore, while an in-state student (or one who already lives locally) may not mind either way. Could the variety of nearby venues help a university produce more successful students when the majority of its students come from out of state? We will explore this question further.

# Data


We retrieved a dataset from <a href="https://www.kaggle.com/sumithbhongale/american-university-data-ipeds-dataset">Kaggle.com</a> that gives the locations of US universities along with various statistics from 2013. In particular, we look at universities where a majority of students come from out of state and at how those universities perform on graduation statistics and pre-college test scores. We then compare this with the variety of nearby venues and ask which nearby venues correlate with high success in the classroom.

In [1]:
import pandas as pd
import numpy as np

df = pd.read_csv('UniversityData.csv')

df.head()

Unnamed: 0,ID number,Name,year,ZIP code,Highest degree offered,County name,Longitude location of institution,Latitude location of institution,Religious affiliation,Offers Less than one year certificate,...,Percent of freshmen receiving federal grant aid,Percent of freshmen receiving Pell grants,Percent of freshmen receiving other federal grant aid,Percent of freshmen receiving state/local grant aid,Percent of freshmen receiving institutional grant aid,Percent of freshmen receiving student loan aid,Percent of freshmen receiving federal student loans,Percent of freshmen receiving other loan aid,Endowment assets (year end) per FTE enrollment (GASB),Endowment assets (year end) per FTE enrollment (FASB)
0,100654,Alabama A & M University,2013,35762,Doctor's degree - research/scholarship,Madison County,-86.568502,34.783368,Not applicable,Implied no,...,81.0,81.0,7.0,1.0,32.0,89.0,89.0,1.0,,
1,100663,University of Alabama at Birmingham,2013,35294-0110,Doctor's degree - research/scholarship and pro...,Jefferson County,-86.80917,33.50223,Not applicable,Implied no,...,36.0,36.0,10.0,0.0,60.0,56.0,55.0,5.0,24136.0,
2,100690,Amridge University,2013,36117-3553,Doctor's degree - research/scholarship and pro...,Montgomery County,-86.17401,32.362609,Churches of Christ,Implied no,...,90.0,90.0,0.0,40.0,90.0,100.0,100.0,0.0,,302.0
3,100706,University of Alabama in Huntsville,2013,35899,Doctor's degree - research/scholarship and pro...,Madison County,-86.63842,34.722818,Not applicable,Yes,...,31.0,31.0,4.0,1.0,63.0,46.0,46.0,3.0,11502.0,
4,100724,Alabama State University,2013,36104-0271,Doctor's degree - research/scholarship and pro...,Montgomery County,-86.295677,32.364317,Not applicable,Implied no,...,76.0,76.0,13.0,11.0,34.0,81.0,81.0,0.0,13202.0,


In [2]:
df['ZipCode'] = df['ZIP code']
df['Longitude'] = df['Longitude location of institution']
df['Latitude'] = df['Latitude location of institution']
df['%1stTimeUgrad-outofstate'] = df['Percent of first-time undergraduates - out-of-state']
df['%1stTimeUgrad-foreign'] = df['Percent of first-time undergraduates - foreign countries']

univ = df[['Name', 'ZipCode', 'Longitude', 'Latitude', '%1stTimeUgrad-outofstate', '%1stTimeUgrad-foreign']]
univ.head()

Unnamed: 0,Name,ZipCode,Longitude,Latitude,%1stTimeUgrad-outofstate,%1stTimeUgrad-foreign
0,Alabama A & M University,35762,-86.568502,34.783368,,
1,University of Alabama at Birmingham,35294-0110,-86.80917,33.50223,13.0,1.0
2,Amridge University,36117-3553,-86.17401,32.362609,,
3,University of Alabama in Huntsville,35899,-86.63842,34.722818,14.0,4.0
4,Alabama State University,36104-0271,-86.295677,32.364317,37.0,4.0


In [3]:
univ2 = univ[univ['%1stTimeUgrad-outofstate'].notnull()]
univ2.head()

Unnamed: 0,Name,ZipCode,Longitude,Latitude,%1stTimeUgrad-outofstate,%1stTimeUgrad-foreign
1,University of Alabama at Birmingham,35294-0110,-86.80917,33.50223,13.0,1.0
3,University of Alabama in Huntsville,35899,-86.63842,34.722818,14.0,4.0
4,Alabama State University,36104-0271,-86.295677,32.364317,37.0,4.0
5,The University of Alabama,35487-0166,-87.545766,33.2144,57.0,3.0
7,Auburn University at Montgomery,36117-3596,-86.177351,32.369939,2.0,0.0


In [4]:
outofstate = univ2[univ2['%1stTimeUgrad-outofstate'] > 50].reset_index(drop=True)
outofstate.head()

Unnamed: 0,Name,ZipCode,Longitude,Latitude,%1stTimeUgrad-outofstate,%1stTimeUgrad-foreign
0,The University of Alabama,35487-0166,-87.545766,33.2144,57.0,3.0
1,Tuskegee University,36088-1920,-85.710315,32.431021,64.0,0.0
2,Embry-Riddle Aeronautical University-Prescott,86301-3720,-112.452285,34.615678,77.0,6.0
3,California Institute of Technology,91125,-118.12574,34.139275,56.0,8.0
4,Pomona College,91711-6319,-117.711944,34.098298,56.0,18.0


# Methodology

In [5]:
import json
import requests
from pandas import json_normalize # top-level import; pandas.io.json.json_normalize is deprecated since pandas 1.0

In [30]:
CLIENT_ID = 'UQKVYXW5VZZZ3VSDL0HNND2S4HO22CXWTOVNJTJ5KAJHWV1R' # your Foursquare ID
CLIENT_SECRET = 'NJQURIXPLV5ASFIEFSQMANWHFRYZUVEBJMYIMYH0IUFIEPZ1' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('My credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

My credentials:
CLIENT_ID: UQKVYXW5VZZZ3VSDL0HNND2S4HO22CXWTOVNJTJ5KAJHWV1R
CLIENT_SECRET: NJQURIXPLV5ASFIEFSQMANWHFRYZUVEBJMYIMYH0IUFIEPZ1


In [31]:
radius = 50
LIMIT = 200

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            outofstate['Latitude'][0], 
            outofstate['Longitude'][0], 
            radius, 
            LIMIT)
            
url

'https://api.foursquare.com/v2/venues/explore?&client_id=UQKVYXW5VZZZ3VSDL0HNND2S4HO22CXWTOVNJTJ5KAJHWV1R&client_secret=NJQURIXPLV5ASFIEFSQMANWHFRYZUVEBJMYIMYH0IUFIEPZ1&v=20180605&ll=33.2144,-87.545766&radius=50&limit=200'
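The request URL above is assembled with `str.format` and embeds the credentials directly in the notebook. As a minimal sketch of an alternative, the same query string can be built with the standard library's `urllib.parse.urlencode`, reading the credentials from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `build_explore_url` are assumptions made for illustration and are not part of the original notebook.

```python
import os
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lng, radius, limit, version='20180605'):
    """Build a venues/explore URL (hypothetical helper, for illustration only)."""
    params = {
        # Assumed environment variable names; placeholders are used if unset.
        'client_id': os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID'),
        'client_secret': os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET'),
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-escapes the values, so the comma in `ll` becomes %2C.
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


# Same coordinates, radius, and limit as the cell above.
example_url = build_explore_url(33.2144, -87.545766, radius=50, limit=200)
```

Keeping the secret out of the source also keeps it out of any printed output or exported copy of the notebook.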

In [32]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Return one row per Foursquare venue found within `radius` meters of each university."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request and keep only the list of venue items
        results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-university lists into a single table
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['University',
                             'University Latitude',
                             'University Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']

    return nearby_venues
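
`getNearbyVenues` reads `categories[0]` for every venue, which would raise an `IndexError` if a venue came back with an empty category list. The sketch below shows one way the response items could be flattened defensively. The helper `extract_venue_rows`, the `'Unknown'` fallback, and the sample payload are hypothetical; the payload only mimics the fields that the function above reads.

```python
def extract_venue_rows(name, lat, lng, items):
    """Flatten explore-style items into tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to a placeholder label instead of failing on categories[0].
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((
            name,
            lat,
            lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category,
        ))
    return rows


# Made-up items shaped like the fields used in getNearbyVenues.
sample_items = [
    {'venue': {'name': 'Example Coffee',
               'location': {'lat': 33.2146, 'lng': -87.5463},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorized Place',
               'location': {'lat': 33.2150, 'lng': -87.5450},
               'categories': []}},
]

rows = extract_venue_rows('Example University', 33.2144, -87.545766, sample_items)
```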

In [33]:
UniversityVenues = getNearbyVenues(names=outofstate['Name'],
                                  latitudes=outofstate['Latitude'],
                                  longitudes=outofstate['Longitude'])

The University of Alabama
Tuskegee University
Embry-Riddle Aeronautical University-Prescott
California Institute of Technology
Pomona College
Colorado College
University of Denver
University of Bridgeport
Connecticut College
Fairfield University
University of Hartford
University of New Haven
Quinnipiac University
Sacred Heart University
Trinity College
Wesleyan University
Yale University
Delaware State University
University of Delaware
Gallaudet University
George Washington University
Georgetown University
Howard University
Eckerd College
Embry-Riddle Aeronautical University-Daytona Beach
University of Miami
The University of Tampa
Clark Atlanta University
Emory University
University of Chicago
Trinity International University-Illinois
Wheaton College
DePauw University
Earlham College
University of Notre Dame
Saint Mary's College
Taylor University
Briar Cliff University
Coe College
Cornell College
Dordt College
Drake University
Graceland University-Lamoni
Grinnell College
Loras College

In [34]:
UniversityVenues.shape

(2501, 7)

In [35]:
UniversityVenues

Unnamed: 0,University,University Latitude,University Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The University of Alabama,33.21440,-87.545766,Fresh Food Company,33.215009,-87.545754,Restaurant
1,The University of Alabama,33.21440,-87.545766,Starbucks,33.214589,-87.546302,Coffee Shop
2,The University of Alabama,33.21440,-87.545766,Chick-fil-A,33.214466,-87.545656,Fast Food Restaurant
3,The University of Alabama,33.21440,-87.545766,Raising Cane's Chicken Fingers - Temporarily C...,33.216454,-87.546377,Fried Chicken Joint
4,The University of Alabama,33.21440,-87.545766,The Ferguson Center Food Court,33.214639,-87.545479,Food Court
...,...,...,...,...,...,...,...
2496,Ave Maria University,26.33711,-81.441082,The Bean of Ave Maria,26.337021,-81.436549,Café
2497,Ave Maria University,26.33711,-81.441082,Scoopz,26.336379,-81.437360,Ice Cream Shop
2498,Ave Maria University,26.33711,-81.441082,Gino's Trattoria per Tutti,26.336169,-81.437184,Italian Restaurant
2499,Ave Maria University,26.33711,-81.441082,Ave Maria University baseball Field,26.340568,-81.442806,Baseball Field


In [36]:
UniVenuesMelt = UniversityVenues.melt(id_vars='University', value_vars='Venue Category')
UniVenuesMelt.head()

Unnamed: 0,University,variable,value
0,The University of Alabama,Venue Category,Restaurant
1,The University of Alabama,Venue Category,Coffee Shop
2,The University of Alabama,Venue Category,Fast Food Restaurant
3,The University of Alabama,Venue Category,Fried Chicken Joint
4,The University of Alabama,Venue Category,Food Court


In [37]:
VenueCatCount = UniversityVenues.groupby('University').count()
VenueCatCount

Unnamed: 0_level_0,University Latitude,University Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
University,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Alderson Broaddus University,1,1,1,1,1,1
Andrews University,5,5,5,5,5,5
Antioch University-Seattle,100,100,100,100,100,100
Augustana College,4,4,4,4,4,4
Ave Maria University,7,7,7,7,7,7
...,...,...,...,...,...,...
Willamette University,12,12,12,12,12,12
Williams College,22,22,22,22,22,22
Xavier University,12,12,12,12,12,12
Yale University,60,60,60,60,60,60


In [38]:
print('There are {} unique categories.'.format(len(UniversityVenues['Venue Category'].unique())))

There are 299 unique categories.


In [39]:
# one hot encoding
uni_onehot = pd.get_dummies(UniversityVenues[['Venue Category']], prefix="", prefix_sep="")

# add university column back to dataframe
uni_onehot['University'] = UniversityVenues['University'] 

# move university column to the first column
fixed_columns = [uni_onehot.columns[-1]] + list(uni_onehot.columns[:-1])
uni_onehot = uni_onehot[fixed_columns]

uni_onehot.head()

Unnamed: 0,Yoga Studio,ATM,Accessories Store,Airport,American Restaurant,Arcade,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,...,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [40]:
uni_grouped = uni_onehot.groupby('University').mean().reset_index()
uni_grouped

Unnamed: 0,University,Yoga Studio,ATM,Accessories Store,Airport,American Restaurant,Arcade,Arepa Restaurant,Art Gallery,Art Museum,...,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store
0,Alderson Broaddus University,0.00,0.000000,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
1,Andrews University,0.00,0.000000,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
2,Antioch University-Seattle,0.01,0.000000,0.0,0.0,0.020000,0.0,0.0,0.010000,0.000000,...,0.0,0.000000,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0
3,Augustana College,0.00,0.000000,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
4,Ave Maria University,0.00,0.000000,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
150,Willamette University,0.00,0.083333,0.0,0.0,0.083333,0.0,0.0,0.000000,0.000000,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
151,Williams College,0.00,0.000000,0.0,0.0,0.000000,0.0,0.0,0.000000,0.045455,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
152,Xavier University,0.00,0.083333,0.0,0.0,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.000000,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
153,Yale University,0.00,0.000000,0.0,0.0,0.033333,0.0,0.0,0.033333,0.016667,...,0.0,0.016667,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0
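
Averaging the one-hot columns per university gives the share of that university's venues that fall in each category. A small sketch on made-up data illustrates this; the university names and categories below are illustrative only, and `pd.crosstab` with `normalize='index'` is shown as an equivalent route, not as the method used in this notebook.

```python
import pandas as pd

# Toy venue table: university "A" has four venues, "B" has two.
toy = pd.DataFrame({
    'University': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Park', 'Pub', 'Park', 'Park'],
})

# One-hot encode the category, then average within each university.
toy_onehot = pd.get_dummies(toy[['Venue Category']], prefix='', prefix_sep='', dtype=float)
toy_onehot.insert(0, 'University', toy['University'])
toy_grouped = toy_onehot.groupby('University').mean().reset_index()

# Same frequencies computed in one step with a normalized cross-tabulation.
toy_crosstab = pd.crosstab(toy['University'], toy['Venue Category'], normalize='index')
```

Because each row of frequencies sums to 1, a university with a single venue (such as one whose only venue is a bookstore) gets a frequency of 1.0 for that category.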


In [41]:
num_top_venues = 5

for uni in uni_grouped['University']:
    print("----"+uni+"----")
    # transpose the single university row so each venue category becomes a row
    temp = uni_grouped[uni_grouped['University'] == uni].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]  # drop the 'University' name row
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 3})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Alderson Broaddus University----
          venue  freq
0     Bookstore   1.0
1   Yoga Studio   0.0
2  Music School   0.0
3        Office   0.0
4  Noodle House   0.0


----Andrews University----
                       venue  freq
0      Performing Arts Venue   0.2
1          College Cafeteria   0.2
2  Latin American Restaurant   0.2
3                       Park   0.2
4               Soccer Field   0.2


----Antioch University-Seattle----
                venue  freq
0         Coffee Shop  0.10
1  Italian Restaurant  0.04
2         Pizza Place  0.04
3              Bakery  0.04
4    Sushi Restaurant  0.04


----Augustana College----
                venue  freq
0     College Library  0.25
1  College Rec Center  0.25
2                Food  0.25
3                Café  0.25
4         Yoga Studio  0.00


----Ave Maria University----
                  venue   freq
0                  Café  0.286
1        Baseball Field  0.143
2                   Pub  0.143
3  Gym / Fitness Center  0.143
4    
...
4   Ice Cream Shop  0.111


----Dordt College----
                venue  freq
0                Pool  0.25
1  College Auditorium  0.25
2      Student Center  0.25
3                Park  0.25
4        Music School  0.00


----Drake University----
               venue   freq
0              Plaza  0.091
1     Sandwich Place  0.091
2                Bar  0.091
3  College Bookstore  0.045
4     Clothing Store  0.045


----Earlham College----
            venue  freq
0          Museum   0.2
1  Sandwich Place   0.2
2      Public Art   0.2
3          Lounge   0.2
4   Grocery Store   0.2


----Eckerd College----
              venue  freq
0              Café  0.50
1  Basketball Court  0.25
2     Movie Theater  0.25
3       Yoga Studio  0.00
4       Music Venue  0.00


----Elon University----
                 venue   freq
0          Coffee Shop  0.235
1       Sandwich Place  0.118
2  American Restaurant  0.118
3    Convenience Store  0.059
4       Ice Cream Shop  0.059


----Embry-Riddle Aeronautica
...
4  Noodle House   0.0


----Princeton University----
                     venue   freq
0              Coffee Shop  0.067
1           Ice Cream Shop  0.050
2  New American Restaurant  0.033
3      Sporting Goods Shop  0.033
4           Clothing Store  0.033


----Providence College----
            venue   freq
0  Student Center  0.143
1     Pizza Place  0.143
2      Playground  0.143
3             Pub  0.143
4      Donut Shop  0.143


----Quinnipiac University----
                      venue   freq
0                     Trail  0.333
1   State / Provincial Park  0.167
2  College Baseball Diamond  0.167
3                  Bus Stop  0.167
4               Coffee Shop  0.167


----Rensselaer Polytechnic Institute----
                venue   freq
0  Mexican Restaurant  0.062
1   College Bookstore  0.062
2                Café  0.062
3           Bookstore  0.062
4      Student Center  0.062


----Roger Williams University----
               venue   freq
0               Pool  0.143
1            
...
               venue  freq
0  Convenience Store  0.10
1         Donut Shop  0.10
2         Hookah Bar  0.05
3         Bagel Shop  0.05
4           Pharmacy  0.05


----University of Vermont----
               venue   freq
0               Café  0.375
1  Electronics Store  0.125
2                Pub  0.125
3           Pharmacy  0.125
4         Art Museum  0.125


----University of Wisconsin-River Falls----
                venue   freq
0         Pizza Place  0.118
1                 Bar  0.118
2         Coffee Shop  0.118
3  Mexican Restaurant  0.059
4      Cosmetics Shop  0.059


----Virginia Intermont College----
          venue   freq
0          Park  0.667
1     Gift Shop  0.333
2   Yoga Studio  0.000
3   Music Store  0.000
4  Noodle House  0.000


----Wagner College----
                 venue   freq
0                  Gym  0.143
1                 Café  0.143
2                  Spa  0.143
3  American Restaurant  0.143
4               Market  0.143


----Washington University in St Loui

In [42]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[:num_top_venues]

In [43]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['University']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions past the 3rd fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
university_venues_sorted = pd.DataFrame(columns=columns)
university_venues_sorted['University'] = uni_grouped['University']

for ind in np.arange(uni_grouped.shape[0]):
    university_venues_sorted.iloc[ind, 1:] = return_most_common_venues(uni_grouped.iloc[ind, :], num_top_venues)

university_venues_sorted.head()

Unnamed: 0,University,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alderson Broaddus University,Bookstore,Women's Store,Dive Bar,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf,Discount Store
1,Andrews University,Park,Latin American Restaurant,College Cafeteria,Soccer Field,Performing Arts Venue,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner
2,Antioch University-Seattle,Coffee Shop,Pizza Place,Sushi Restaurant,Bakery,Italian Restaurant,Breakfast Spot,Bar,Café,Restaurant,New American Restaurant
3,Augustana College,College Library,Café,College Rec Center,Food,Women's Store,Discount Store,Deli / Bodega,Department Store,Dessert Shop,Diner
4,Ave Maria University,Café,Ice Cream Shop,Gym / Fitness Center,Italian Restaurant,Pub,Baseball Field,Women's Store,Cycle Studio,Dance Studio,Deli / Bodega
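
The column labels above come from a three-element suffix list with a `th` fallback, which is sufficient for ten columns but would produce labels such as "21th" for larger values of `num_top_venues`. A minimal sketch of a general ordinal helper follows; the function name `ordinal` and the list `example_columns` are assumptions for illustration.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    # 11, 12, and 13 (and 111, 112, ...) take 'th' despite ending in 1, 2, 3.
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


# Rebuild the same ten column labels used in the table above.
example_columns = ['University'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)
]
```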


In [44]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

uni_grouped_clustering = uni_grouped.drop(columns='University')  # positional axis argument was removed in pandas 2.0

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(uni_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 2, 2, 3, 3, 2, 2, 2, 2, 3], dtype=int32)
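
The value `kclusters = 5` is set by hand. One common way to sanity-check such a choice is to compare silhouette scores across a range of candidate values. The sketch below does this on synthetic data with three well-separated groups; it is not part of the original analysis, and running the same loop on `uni_grouped_clustering` may well suggest a different number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three tight, well-separated blobs in two dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The candidate with the highest score is the best-separated clustering.
best_k = max(scores, key=scores.get)
```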

In [47]:
# add clustering labels
university_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

uni_merged = outofstate[['Name', 'Longitude', 'Latitude']]

# join the cluster labels and top venues onto outofstate to keep latitude/longitude for each university
uni_merged = uni_merged.join(university_venues_sorted.set_index('University'), on='Name')

# universities with no venues returned have no cluster label after the join;
# fillna(0) gives them label 0, so they show up in the first cluster with all-zero venue columns
uni_merged = uni_merged.fillna(0)

uni_merged['Cluster Labels'] = uni_merged['Cluster Labels'].astype(int)

uni_merged.head() # check the last columns!

Unnamed: 0,Name,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,The University of Alabama,-87.545766,33.2144,2,Coffee Shop,Fast Food Restaurant,Sandwich Place,Plaza,Historic Site,Fried Chicken Joint,Food Court,Restaurant,Donut Shop,History Museum
1,Tuskegee University,-85.710315,32.431021,2,Hotel,Historic Site,Basketball Stadium,Museum,Discount Store,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf
2,Embry-Riddle Aeronautical University-Prescott,-112.452285,34.615678,2,Playground,Pool,Café,Coffee Shop,Baseball Field,Bookstore,Donut Shop,Dog Run,Doctor's Office,Dive Bar
3,California Institute of Technology,-118.12574,34.139275,2,Food Truck,Café,Hotel,Gym / Fitness Center,Hardware Store,Scenic Lookout,Beer Garden,Park,Restaurant,Theater
4,Pomona College,-117.711944,34.098298,3,College Cafeteria,American Restaurant,General College & University,College Theater,Arts & Crafts Store,Café,Women's Store,Discount Store,Deli / Bodega,Department Store


In [59]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if folium is not installed
import folium # map rendering library

# create map centered on the continental US
map_clusters = folium.Map(location=[37.0902, -95.7129], zoom_start=3)

# set color scheme for the clusters: one color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(uni_merged['Latitude'], uni_merged['Longitude'], uni_merged['Name'], uni_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

# Results

### Cluster 1

In [53]:
uni_merged.loc[uni_merged['Cluster Labels'] == 0, uni_merged.columns[[0] + list(range(4, uni_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Fairfield University,Coffee Shop,Bagel Shop,Sports Club,Deli / Bodega,Soccer Field,Bar,Discount Store,Dance Studio,Department Store,Dessert Shop
37,Briar Cliff University,Coffee Shop,Beach,Disc Golf,Women's Store,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar
48,Bethel College-North Newton,Coffee Shop,History Museum,Women's Store,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf
49,Southwestern College,Coffee Shop,Concert Hall,Women's Store,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf,Discount Store
57,Brandeis University,Coffee Shop,Bagel Shop,Donut Shop,Café,Convenience Store,Bus Stop,Women's Store,Department Store,Dessert Shop,Diner
61,Bard College at Simon's Rock,0,0,0,0,0,0,0,0,0,0
71,Concordia University-Nebraska,Coffee Shop,Scenic Lookout,Disc Golf,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner
82,Colgate University,Coffee Shop,College Cafeteria,Rugby Pitch,Lake,Disc Golf,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner
100,Wilberforce University,0,0,0,0,0,0,0,0,0,0
103,Oklahoma Wesleyan University,Coffee Shop,Women's Store,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf,Discount Store


### Cluster 2

In [54]:
uni_merged.loc[uni_merged['Cluster Labels'] == 1, uni_merged.columns[[0] + list(range(4, uni_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
39,Cornell College,Park,Women's Store,Disc Golf,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store
73,York College,Park,Women's Store,Disc Golf,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store
139,Virginia Intermont College,Park,Gift Shop,Women's Store,Diner,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Disc Golf


### Cluster 3

In [55]:
uni_merged.loc[uni_merged['Cluster Labels'] == 2, uni_merged.columns[[0] + list(range(4, uni_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,The University of Alabama,Coffee Shop,Fast Food Restaurant,Sandwich Place,Plaza,Historic Site,Fried Chicken Joint,Food Court,Restaurant,Donut Shop,History Museum
1,Tuskegee University,Hotel,Historic Site,Basketball Stadium,Museum,Discount Store,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf
2,Embry-Riddle Aeronautical University-Prescott,Playground,Pool,Café,Coffee Shop,Baseball Field,Bookstore,Donut Shop,Dog Run,Doctor's Office,Dive Bar
3,California Institute of Technology,Food Truck,Café,Hotel,Gym / Fitness Center,Hardware Store,Scenic Lookout,Beer Garden,Park,Restaurant,Theater
5,Colorado College,Italian Restaurant,Yoga Studio,Pool,Liquor Store,Deli / Bodega,Taco Place,Tattoo Parlor,Convenience Store,IT Services,Theater
...,...,...,...,...,...,...,...,...,...,...,...
151,Stanford University,Pharmacy,Fountain,Gym,Deli / Bodega,Rental Car Location,Club House,IT Services,Trail,Women's Store,Department Store
152,Antioch University-Seattle,Coffee Shop,Pizza Place,Sushi Restaurant,Bakery,Italian Restaurant,Breakfast Spot,Bar,Café,Restaurant,New American Restaurant
153,Beacon College,Bar,Sporting Goods Shop,Pool Hall,New American Restaurant,Wine Shop,Wine Bar,Hot Dog Joint,American Restaurant,Plaza,Park
154,Johnson & Wales University-Denver,Pizza Place,Italian Restaurant,Bistro,Sushi Restaurant,Intersection,Restaurant,Ice Cream Shop,Coffee Shop,Liquor Store,Sandwich Place


### Cluster 4

In [56]:
uni_merged.loc[uni_merged['Cluster Labels'] == 3, uni_merged.columns[[0] + list(range(4, uni_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Pomona College,College Cafeteria,American Restaurant,General College & University,College Theater,Arts & Crafts Store,Café,Women's Store,Discount Store,Deli / Bodega,Department Store
23,Eckerd College,Café,Basketball Court,Movie Theater,Women's Store,Discount Store,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf
31,Wheaton College,Massage Studio,Café,Soccer Field,Museum,Disc Golf,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner
47,Benedictine College,Coffee Shop,Café,Arcade,Football Stadium,College Theater,Women's Store,Deli / Bodega,Department Store,Dessert Shop,Diner
63,Tufts University,Café,Mediterranean Restaurant,College Arts Building,Athletics & Sports,Concert Hall,Baseball Field,Donut Shop,Dog Run,Dosa Place,Doctor's Office
89,Skidmore College,College Theater,Café,College Arts Building,Dive Bar,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf,Discount Store
126,Augustana College,College Library,Café,College Rec Center,Food,Women's Store,Discount Store,Deli / Bodega,Department Store,Dessert Shop,Diner
135,Goddard College,College Theater,Café,Discount Store,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Disc Golf,Women's Store
137,University of Vermont,Café,Pharmacy,Electronics Store,Pub,Art Museum,Tunnel,Women's Store,Diner,Cycle Studio,Dance Studio
142,University of Puget Sound,Pizza Place,Concert Hall,Café,Track,Cuban Restaurant,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop


### Cluster 5

In [57]:
uni_merged.loc[uni_merged['Cluster Labels'] == 4, uni_merged.columns[[0] + list(range(4, uni_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
107,Lewis & Clark College,Theater,Disc Golf,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Women's Store
114,Lincoln University of Pennsylvania,Theater,Disc Golf,Cupcake Shop,Cycle Studio,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Women's Store
