<h1 align='center'><font size='8'>Data Analysis Capstone Project</font></h1>
    <h2 align='center'><font size='6'>'A Place To Play'</font></h2>

# The Premise

Growing up in a southern Indiana town with a fairly prominent music school, I've seen many of the excellent results that come from well-crafted musical experiences working hand in hand with their oldest friend: social venues. While large-scale shows are easily identified by the public, many entry-level artists struggle to find suitable venues, connections within the music scene, and any consistency in their public relations. As a result, an unfortunate number of artists struggle and likely never move forward with what could be great art. Similarly, without cohesion between artists and the social venues where they might be most celebrated, many venues may struggle to find or pay local artists who meet the standards set by their business model.

Unfortunately, parsing data on so many different artists to lend the businesses a hand is outside the timeline and resources I can currently invest, so for the time being I will use Indiana's public records to find the best venues in the state. I will focus on smaller venues, both to avoid outlier data from major concert halls and to support local establishments, and then split that cluster into two categories: all-ages venues and age-restricted venues. Focusing on the cities with the highest population and/or population density should give the best chance of finding large music venues with consistent clientele.

# The Data

For our initial pass, we'll focus on gathering statistical information and seeing what can be collected for free on this topic. Our initial sources include:
1. __STATSIndiana__, a census data tracking page for finding target demographic information. (https://www.stats.indiana.edu/)
2. __Hoosiers By The Numbers__, a database for industry data pertaining to IN-based businesses. (http://www.hoosierdata.in.gov/)
3. __Foursquare API__, for usage of locational data and exploration of business districts. (https://foursquare.com/)

While more data may be necessary, let's start with a basic assumption: if a concert is playing and people are around to hear it, attendance will follow a rudimentary population curve, boosted by the likelihood that those people are already somewhere they seek entertainment (e.g. entertainment districts with restaurants and other amenities) and could be persuaded simply by being in the right place at the right time.

# Finding A Place To Play

First, we will download and import the libraries and data files we know we'll need right away.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!pip install geopy
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

#matplotlib & associated libraries
import matplotlib.cm as cm
import matplotlib.colors as colors

!pip install folium
import folium # map rendering library

!pip install beautifulsoup4
from bs4 import BeautifulSoup

print('Libraries imported.')


In [2]:
popdf = pd.read_csv('INpopdata.csv', skiprows=2)
popdf.head()

Unnamed: 0,Geography,statefips,countyfips,year,College Age (18 to 24)
0,"Adams County, IN",18,1,2018,3009
1,"Allen County, IN",18,3,2018,33762
2,"Bartholomew County, IN",18,5,2018,6689
3,"Benton County, IN",18,7,2018,646
4,"Blackford County, IN",18,9,2018,907


In [3]:
popdf.drop(['statefips', 'countyfips', 'year'], axis=1, inplace=True)
popdf.head()

Unnamed: 0,Geography,College Age (18 to 24)
0,"Adams County, IN",3009
1,"Allen County, IN",33762
2,"Bartholomew County, IN",6689
3,"Benton County, IN",646
4,"Blackford County, IN",907


In [4]:
popsort = popdf.sort_values(by='College Age (18 to 24)', ascending=False)
popsort.head()

Unnamed: 0,Geography,College Age (18 to 24)
48,"Marion County, IN",88123
78,"Tippecanoe County, IN",45634
44,"Lake County, IN",41955
52,"Monroe County, IN",38694
1,"Allen County, IN",33762


## The First Data Discovery
It looks like we have our first bit of data to base this project around! Using the target demographic of 18-to-24-year-olds from the government census data, we can see that __Marion, Tippecanoe, Lake, Monroe, and Allen County__ have the highest populations in that age range. These will likely be the best places to look for commercial data, so we can start drawing connections and finding some top-ranked venues.
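
As a small aside, the same ranking can be pulled out in one step with `nlargest`. This is only a sketch on a handful of rows copied from the outputs above, not the full county file, and the variable names `sample` and `top_counties` are my own.

```python
import pandas as pd

# Sample rows copied from the outputs above (not the full county list).
sample = pd.DataFrame(
    {
        "Geography": [
            "Adams County, IN",
            "Allen County, IN",
            "Lake County, IN",
            "Marion County, IN",
            "Monroe County, IN",
            "Tippecanoe County, IN",
        ],
        "College Age (18 to 24)": [3009, 33762, 41955, 88123, 38694, 45634],
    }
)

# nlargest sorts by the column and keeps the first n rows in one call.
top_counties = sample.nlargest(5, "College Age (18 to 24)")
print(top_counties["Geography"].tolist())
```

On the full `popdf` dataframe, the equivalent call would be `popdf.nlargest(5, 'College Age (18 to 24)')`.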

### A Little Bit More Time In The Mines
For this next section, we're going to look into creating several dataframes to cross-reference, courtesy of the organization system of Hoosiers By The Numbers. 

In [5]:
!pip install lxml
# fetch the Hoosiers By The Numbers business lookup page and parse it
url = 'http://www.hoosierdata.in.gov/buslookup/page2.aspx?scope=2&geo_area=105&name_text=&company_size=Z&datacode=72'
source = requests.get(url).text
soup = BeautifulSoup(source, 'html.parser')



In [9]:
#print(soup.prettify())

In [6]:
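table_mon_bars = soup.find('table') # first <table> element on the page (not used below)
tf_mon_bars = pd.read_html(source) # read_html returns a list of DataFrames, one per table found
monroe_bars_rough = tf_mon_bars[1] # the second table holds the business listings
#monroe_bars_rough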

In [7]:
monroe_bars_df = monroe_bars_rough[monroe_bars_rough[1] == 'Bars (722410)']
monroe_bars_df.head()

Unnamed: 0,0,1,2,3,4,5
43,Bluetip Details,Bars (722410),426 S College Ave,Bloomington,10,766000
46,Brothers Bar & Grill Details,Bars (722410),215 N Walnut St,Bloomington,10,766000
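
The scraped table comes back with integer column labels and a trailing "Details" on every name (left over from a link on the page). Here's a minimal sketch of how it could be tidied, using the two rows shown above. The column names are hypothetical, I've left out columns 4 and 5 because the source doesn't say what they measure, and reading the number in parentheses as an industry code (it looks like a NAICS code) is my assumption.

```python
import pandas as pd

# Two rows copied from the output above; columns 4 and 5 are left out
# because their meaning is not stated in the source.
rough = pd.DataFrame(
    [
        ["Bluetip Details", "Bars (722410)", "426 S College Ave", "Bloomington"],
        ["Brothers Bar & Grill Details", "Bars (722410)", "215 N Walnut St", "Bloomington"],
    ]
)

# Hypothetical column names, chosen for readability only.
bars = rough.rename(columns={0: "name", 1: "industry", 2: "address", 3: "city"})

# Drop the trailing " Details" link text from each name.
bars["name"] = bars["name"].str.replace(r"\s*Details$", "", regex=True)

# Split "Bars (722410)" into a text label and a numeric code.
parts = bars["industry"].str.extract(
    r"^(?P<industry_label>.*?)\s*\((?P<industry_code>\d+)\)$"
)
bars = pd.concat([bars.drop(columns="industry"), parts], axis=1)
print(bars)
```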


In [8]:
us_comp_df = pd.read_csv('us_companies.csv')
us_comp_df.head()
IN_comp_df = us_comp_df[us_comp_df['state'] == 'IN']
IN_comp_df.head()

Unnamed: 0,company_name_id,company_name,url,year_founded,city,state,country,zip_code,full_time_employees,company_type,company_category,revenue_source,business_model,social_impact,description,description_short,source_count,data_types,example_uses,data_impacts,financial_info,last_updated
18,american-red-ball-movers,American Red Ball Movers,http://www.redball.com,1919.0,Indianapolis,IN,us,46239.0,11-50,Private,Transportation,Not reported by company,"Business to Business, Business to Consumer",,American Red Ball Movers is a major long dista...,American Red Ball Movers is a long distance re...,,Demographics & Social,,[],,2014-09-18 16:57:53.964955
30,atlas-van-lines,Atlas Van Lines,http://www.atlasvanlines.com,1948.0,Evansville,IN,us,47712.0,201-500,Private,Transportation,Customers,"Business to Business, Business to Consumer",,Atlas Van Lines is a subsidiary of Atlas World...,"Atlas World Group, Inc. is a family of compani...",1-10,,,[],"Atlas Van Lines, the flagship of Atlas World G...",2014-11-20 12:47:44.601144
42,bekins,Bekins,http://www.bekins.com,1891.0,Indianapolis,IN,us,46250.0,51-200,Private,Transportation,Not reported by company,"Business to Business, Business to Government",,Bekins offers private and corporate domestic a...,Bekins offers private and corporate domestic a...,,,,[],,2014-10-20 11:22:49.477697
274,lilly-open-innovation-drug-discovery,Lilly Open Innovation Drug Discovery,https://openinnovation.lilly.com/dd/,1876.0,Indianapolis,IN,us,46285.0,,,Scientific Research,Not reported by company,Business to Business,,"By providing a platform for idea-sharing, Lill...","By providing a platform for idea-sharing, Lill...",,,,[],,2014-10-21 16:09:30.014911
323,north-american-van-lines,North American Van Lines,http://northamericanvanlines.com,1933.0,Fort Wayne,IN,us,46801.0,"501-1,000",Private,Transportation,Not reported by company,Business to Consumer,,North American Van lines provides relocation s...,"North American Van Lines is a licensed, bonded...",,Geospatial/Mapping,,[],,2014-10-30 16:25:03.761738


### Perhaps A Different Approach?

At this point, I'm getting rather disappointing returns from what seemed to be an excellently sourced set of dataframes. Unfortunately, that means we'll need another route to gauge the popularity of these venues. It wouldn't be unreasonable to work through the sites manually and build my own Excel document, but as this is a coding course, let's see whether Foursquare or a similar source can track down the most popular venues in the area. While that data will likely be more open-ended, my hope is to build some basic models and get a better idea of which attributes matter most for identifying venue popularity.

### Foursquare, and Then Some

In [9]:
# @hidden cell
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30

#### Now let's grab the location of Bloomington, IN

It's not the most populous spot, but as my hometown, it'll be a much more entertaining place to explore. 

In [10]:
address = 'Bloomington, IN'

geolocator = Nominatim(user_agent="bloomington_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

39.1670396 -86.5342881


Awesome, look at that nice little latitude and longitude translation for us to use with Foursquare. Now we'll plug it into Foursquare and see if we can search for some specific spots. Perusing their Venue Categories, it looks like __Music Venue__, __Nightlife Spot__, __Cafe__, and __Indie Theatre__ may be our prime suspects in the live music scene. While cafes and indie theatres may not seem like the biggest scenes in town, many smaller acts (particularly acoustic performers) have had great luck with semi-regular daytime gigs, and indie theatres are a wonderful place for all-ages projects to put in some work and have their musical talents recognized. 

In [11]:
search_query = 'Music Venue'
radius = 1000
print(search_query + ' .... OK!')

Music Venue .... OK!


In [12]:
# @hidden cell
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
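
The request URL above is assembled by hand with `str.format`, and the same line gets repeated for every query. Here's a minimal sketch of a small helper that builds it once and percent-encodes the values (note the space in 'Music Venue'). The function name `build_search_url` and the placeholder credentials are my own; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, lat, lng, version, query, radius, limit):
    """Return a venue-search URL with every parameter value encoded."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll value readable; spaces become "+".
    return f"{SEARCH_ENDPOINT}?{urlencode(params, safe=',')}"


# Placeholder credentials; substitute real ones before sending a request.
example_url = build_search_url(
    "PLACEHOLDER_ID", "PLACEHOLDER_SECRET",
    39.1670396, -86.5342881, "20180604", "Music Venue", 1000, 30,
)
print(example_url)
```

With a helper like this, each later search only needs a new `query` and `radius` rather than a fresh copy of the whole format string.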

In [13]:
results = requests.get(url).json()
#results

In [14]:
venues = results['response']['venues']

# transform venues into a dataframe
btown_dataframe = json_normalize(venues)
btown_dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.crossStreet,venuePage.id
0,4c6843b931ba2d7f8f48f772,Venue Fine Arts & Gifts,"[{'id': '4bf58dd8d48988d1e2931735', 'name': 'A...",v-1564877884,False,114 S Grant St,39.16602,-86.529901,"[{'label': 'display', 'lat': 39.16601980178358...",395,47408.0,US,Bloomington,IN,United States,"[114 S Grant St, Bloomington, IN 47408, United...",,
1,4c25395bf1272d7ffa7284c5,Rhino's All-Ages Music Club,"[{'id': '4bf58dd8d48988d1e9931735', 'name': 'R...",v-1564877884,False,331 S Walnut St,39.163432,-86.533378,"[{'label': 'display', 'lat': 39.16343155223261...",409,47401.0,US,Bloomington,IN,United States,"[331 S Walnut St (btwn 3rd & Smith), Bloomingt...",btwn 3rd & Smith,
2,4c5c65eb6ebe2d7fcff6cf2e,Vance Music,"[{'id': '4bf58dd8d48988d1fe941735', 'name': 'M...",v-1564877884,False,,39.16749,-86.534268,"[{'label': 'display', 'lat': 39.16749, 'lng': ...",50,47404.0,US,Bloomington,IN,United States,"[Bloomington, IN 47404, United States]",,
3,4b26da71f964a520078224e3,Landlocked Music,"[{'id': '4bf58dd8d48988d10d951735', 'name': 'R...",v-1564877884,False,202 N Walnut St,39.167765,-86.533483,"[{'label': 'display', 'lat': 39.16776501469022...",106,47404.0,US,Bloomington,IN,United States,"[202 N Walnut St (6th), Bloomington, IN 47404,...",6th,37126344.0
4,4bc4afe84cdfc9b60bff9821,Melody Music Shop,"[{'id': '4bf58dd8d48988d1fe941735', 'name': 'M...",v-1564877884,False,,39.16772,-86.537693,"[{'label': 'display', 'lat': 39.16772027887722...",303,,US,,Indiana,United States,"[Indiana, United States]",,


### Interesting...

Most people wouldn't notice it right away, but something interesting has come up in my first Foursquare search. I set the radius to 1,000 meters, intending to capture most of the downtown area. However, the __Music Venue__ query returned only one live music venue, and in a bizarre twist, it's a venue that was torn down this past spring and had closed long before that. Even other venues on the list unrelated to our current search have been closed for some time. Way to make this difficult on me, Foursquare. 

With these discoveries, we may need to narrow the scope of our research and redefine our hopes. Discovering all-ages venues with the resources at hand would be... tedious, to say the least. That said, I'm leaving the above proposal in place to show why I'm interested in my new proposal.

# A Place To Play 2.0
## The Hunt For An Inclusive Location

While it's no fun restarting a project because of a major hole in the data we'd hoped to obtain, that data does teach us some exciting things we hadn't thought of before. The new direction of this project is to find a site for a new, all-ages, inclusive venue in Bloomington, IN. Our initial research suggested that Bloomington distinctly lacks a venue that caters specifically to presenting art for all-ages audiences. Even the most age-inclusive clubs in town still cater mainly to the bar scene, and without a place to perform, the local music scene may well struggle to encourage young performers to start pursuing those musical dreams. 

With that in mind, the new (literal) direction of this project is to find a location within a reasonable distance of the rest of the downtown community, while avoiding crowding from bars and other nightlife spots. That separation matters both to limit competition and to ease concerns about whether the area is appropriate for a venue that welcomes minors. It would also be convenient if the location were reasonably central to schools and, where possible, away from other performing arts venues. Despite knowing that Foursquare's data is heavily out of date, we'll keep using it and correct any mistakes based on our knowledge of the local area. Enjoy!
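
One way the "crowding" idea could be measured is to count how many known places sit within some radius of a candidate site. The sketch below is only an illustration of that idea, not the method this project settles on: the function names and the 400-meter radius are my own assumptions, and the coordinates are the downtown point and three results from the earlier __Music Venue__ search.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def count_within(candidate, places, radius_m):
    """Count how many places lie within radius_m meters of the candidate."""
    return sum(
        1
        for lat, lng in places
        if haversine_m(candidate[0], candidate[1], lat, lng) <= radius_m
    )


downtown = (39.1670396, -86.5342881)  # geocoded Bloomington, IN point
places = [
    (39.16602, -86.529901),   # Venue Fine Arts & Gifts
    (39.163432, -86.533378),  # Rhino's All-Ages Music Club
    (39.16749, -86.534268),   # Vance Music
]

# 400 m is an arbitrary illustrative radius, not a value from the project.
print(count_within(downtown, places, 400))
```

The computed distances land within a meter or two of the `location.distance` values Foursquare reported for these rows, which is a handy sanity check on the formula.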

In [15]:
search_query = 'Nightlife Spot'
radius = 1000
print(search_query + ' .... OK!')

Nightlife Spot .... OK!


In [16]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

In [17]:
results = requests.get(url).json()
#results

In [18]:
venues = results['response']['venues']

# transform venues into a dataframe
btown_dataframe = json_normalize(venues)
btown_dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,location.city,location.state,location.country,location.formattedAddress,location.postalCode
0,58352d7b37da1d46fb6c659e,Nightlife,"[{'id': '4bf58dd8d48988d1d6941735', 'name': 'S...",v-1564877920,False,"Bloomington, IN",39.165325,-86.52638,"[{'label': 'display', 'lat': 39.165325, 'lng':...",708,US,Bloomington,IN,United States,"[Bloomington, IN, Bloomington, IN, United States]",
1,572e734b498eea6266484547,Guac Spot,"[{'id': '52e81612bcbc57f1066b7a24', 'name': 'T...",v-1564877920,False,,39.168978,-86.543881,"[{'label': 'display', 'lat': 39.168978, 'lng':...",855,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",
2,4c353187ed37a593b4407003,New Spot,[],v-1564877920,False,1105 S Madison St,39.15535,-86.53728,"[{'label': 'display', 'lat': 39.15535, 'lng': ...",1326,US,Bloomington,IN,United States,"[1105 S Madison St, Bloomington, IN 47403, Uni...",47403.0
3,5410b6a6498e36dd36001d76,Spotted Ox Hostel,"[{'id': '4d954b06a243a5684965b473', 'name': 'R...",v-1564877920,False,422 S Henderson St,39.162187,-86.527575,"[{'label': 'display', 'lat': 39.162187, 'lng':...",792,US,Bloomington,IN,United States,"[422 S Henderson St, Bloomington, IN 47401, Un...",47401.0
4,53e36007498eff39f4c21ec6,Spotted Ox Hostel,"[{'id': '4d954b06a243a5684965b473', 'name': 'R...",v-1564877920,False,422 S Henderson St,39.162122,-86.527519,"[{'label': 'display', 'lat': 39.162122, 'lng':...",800,US,Bloomington,IN,United States,"[422 S Henderson St, Bloomington, IN 47401, Un...",47401.0


Interesting... Why is there a place defined as Guac Spot on the Nightlife listings? Bloomington, you never cease to amaze. But really, let's try filtering down to just __Nightlife__ venues specifically to see if there's anything particularly useful in this section. 

In [19]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in btown_dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = btown_dataframe.loc[:, filtered_columns].copy() # copy so the assignment below works on its own dataframe

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError: # fall back to the 'venue.categories' column when 'categories' is absent
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,cc,city,state,country,formattedAddress,postalCode,id
0,Nightlife,Strip Club,"Bloomington, IN",39.165325,-86.52638,"[{'label': 'display', 'lat': 39.165325, 'lng':...",708,US,Bloomington,IN,United States,"[Bloomington, IN, Bloomington, IN, United States]",,58352d7b37da1d46fb6c659e
1,Guac Spot,Tree,,39.168978,-86.543881,"[{'label': 'display', 'lat': 39.168978, 'lng':...",855,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,572e734b498eea6266484547
2,New Spot,,1105 S Madison St,39.15535,-86.53728,"[{'label': 'display', 'lat': 39.15535, 'lng': ...",1326,US,Bloomington,IN,United States,"[1105 S Madison St, Bloomington, IN 47403, Uni...",47403.0,4c353187ed37a593b4407003
3,Spotted Ox Hostel,Residential Building (Apartment / Condo),422 S Henderson St,39.162187,-86.527575,"[{'label': 'display', 'lat': 39.162187, 'lng':...",792,US,Bloomington,IN,United States,"[422 S Henderson St, Bloomington, IN 47401, Un...",47401.0,5410b6a6498e36dd36001d76
4,Spotted Ox Hostel,Residential Building (Apartment / Condo),422 S Henderson St,39.162122,-86.527519,"[{'label': 'display', 'lat': 39.162122, 'lng':...",800,US,Bloomington,IN,United States,"[422 S Henderson St, Bloomington, IN 47401, Un...",47401.0,53e36007498eff39f4c21ec6


Okaaaaaaayyyyy then, looks like that __Nightlife__ category gives us just about nothing useful. Couldn't have been easy. On to the next attempt: let's try specifying __Bar__ as the nightlife location we're interested in. We'll also widen the search radius to about 2,000 meters. Realistically, this is probably excessive, but a nice wide map of the downtown area won't hurt for our immediate query.

In [20]:
search_query = 'Bar'
radius = 2000
print(search_query + ' .... OK!')

Bar .... OK!


In [21]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

In [22]:
results = requests.get(url).json()
#results

In [23]:
venues = results['response']['venues']

# transform venues into a dataframe
btown_dataframe = json_normalize(venues)
btown_dataframe.shape

(30, 19)

Oooooh, from a quick look, this may actually be some useful data. Can you believe it? Foursquare, I'm sorry for doubting you. At least our functions were already defined so we can just move on to a nice, clean dataframe quickly... hopefully.

In [24]:
filtered_columns = ['name', 'categories'] + [col for col in btown_dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = btown_dataframe.loc[:, filtered_columns].copy() # copy so the assignment below works on its own dataframe

# get_category_type was already defined in an earlier cell, so it is reused here

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,neighborhood,id
0,Brothers Bar & Grill,Bar,215 N Walnut St,at W 7th St,39.16825,-86.533801,"[{'label': 'display', 'lat': 39.16824984162702...",141,47404,US,Bloomington,IN,United States,"[215 N Walnut St (at W 7th St), Bloomington, I...",,4b074185f964a520ccfa22e3
1,Kilroy's Bar & Grill,Bar,502 E Kirkwood Ave,,39.166472,-86.528087,"[{'label': 'display', 'lat': 39.16647162279348...",538,47408,US,Bloomington,IN,United States,"[502 E Kirkwood Ave, Bloomington, IN 47408, Un...",,4aea47d8f964a520bbba21e3
2,Coaches Bar & Grill Bloomington,Bar,245 N College Ave,,39.168271,-86.535082,"[{'label': 'display', 'lat': 39.16827105902477...",153,47404,US,Bloomington,IN,United States,"[245 N College Ave, Bloomington, IN 47404, Uni...",,4b54ccc0f964a52007cd27e3
3,Kilroy's Bar & Grill: Sports Bar,Bar,319 N Walnut St,,39.169149,-86.533764,"[{'label': 'display', 'lat': 39.16914893762117...",239,47404,US,Bloomington,IN,United States,"[319 N Walnut St, Bloomington, IN 47404, Unite...",,4b1958c4f964a520a4db23e3
4,Scholars Inn Gourmet Cafe And Wine Bar,American Restaurant,717 N College Ave,at W 11th St.,39.17392,-86.53511,"[{'label': 'display', 'lat': 39.17392, 'lng': ...",769,47404,US,Bloomington,IN,United States,"[717 N College Ave (at W 11th St.), Bloomingto...",,4b002ae0f964a520183b22e3
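
For anyone curious what the category step is actually doing: each venue's `categories` value is a list of dicts, and we keep the `name` of the first one, or `None` when the list is empty. Here's a minimal sketch of the same idea applied to the column directly. The sample is trimmed from the earlier __Nightlife Spot__ results (only the `name` key of each category is kept), and `first_category_name` is my own helper name.

```python
import pandas as pd

# Trimmed sample: only the 'name' key of each category dict is kept.
sample = pd.DataFrame(
    {
        "name": ["Nightlife", "Guac Spot", "New Spot"],
        "categories": [[{"name": "Strip Club"}], [{"name": "Tree"}], []],
    }
)


def first_category_name(categories):
    """Return the name of the first category, or None for an empty list."""
    return categories[0]["name"] if categories else None


# Apply to the single column instead of row by row.
sample["categories"] = sample["categories"].apply(first_category_name)
print(sample)
```

Applying the helper to one column avoids passing whole rows around, though the row-wise version in the notebook gives the same result.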


In [25]:
dataframe_filtered['categories'].unique()

array(['Bar', 'American Restaurant', 'Music Venue', 'Coffee Shop',
       'Hotel', 'Chinese Restaurant', 'Rock Club', 'Cocktail Bar',
       'Hotel Bar', 'College Lab', 'Winery', 'Salon / Barbershop',
       "Dentist's Office", 'College Baseball Diamond',
       'Residential Building (Apartment / Condo)'], dtype=object)

Ok, now Foursquare, I don't know how these responses are related to my query for 'Bar', but I don't think I need anything other than the info for __Bar__, __Music Venue__, __Coffee Shop__, __Rock Club__, and __Cocktail Bar__. While I'm sure Strong Bad would appreciate a good show within proximity of a *Dentist's Office*, I'm going to treat the rest as unnecessary outliers in my current data. 
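
Since the goal is to keep five specific categories, the selection can also be written as a keep-list rather than a list of things to drop. This is just a sketch on the five rows shown in the output above, with `wanted_categories` and `kept` as my own names. One difference worth knowing: a keep-list also discards rows whose category is missing, while dropping unwanted categories one by one leaves those rows in.

```python
import pandas as pd

# Sample rows taken from the 'Bar' search output above.
sample = pd.DataFrame(
    {
        "name": [
            "Brothers Bar & Grill",
            "Kilroy's Bar & Grill",
            "Coaches Bar & Grill Bloomington",
            "Kilroy's Bar & Grill: Sports Bar",
            "Scholars Inn Gourmet Cafe And Wine Bar",
        ],
        "categories": ["Bar", "Bar", "Bar", "Bar", "American Restaurant"],
    }
)

wanted_categories = ["Bar", "Music Venue", "Coffee Shop", "Rock Club", "Cocktail Bar"]

# isin builds a boolean mask; only rows with a wanted category survive.
kept = sample[sample["categories"].isin(wanted_categories)]
print(kept["name"].tolist())
```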

In [26]:
# categories returned by the 'Bar' query that are not relevant to live music
unwanted_categories = [
    'American Restaurant', 'Hotel', 'Chinese Restaurant', 'College Lab',
    'Winery', 'Hotel Bar', 'Salon / Barbershop', "Dentist's Office",
    'College Baseball Diamond', 'Residential Building (Apartment / Condo)',
]
# keep only the rows whose category is not in the unwanted list
dataframe_filtered = dataframe_filtered[~dataframe_filtered['categories'].isin(unwanted_categories)]
dataframe_filtered

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,neighborhood,id
0,Brothers Bar & Grill,Bar,215 N Walnut St,at W 7th St,39.16825,-86.533801,"[{'label': 'display', 'lat': 39.16824984162702...",141,47404.0,US,Bloomington,IN,United States,"[215 N Walnut St (at W 7th St), Bloomington, I...",,4b074185f964a520ccfa22e3
1,Kilroy's Bar & Grill,Bar,502 E Kirkwood Ave,,39.166472,-86.528087,"[{'label': 'display', 'lat': 39.16647162279348...",538,47408.0,US,Bloomington,IN,United States,"[502 E Kirkwood Ave, Bloomington, IN 47408, Un...",,4aea47d8f964a520bbba21e3
2,Coaches Bar & Grill Bloomington,Bar,245 N College Ave,,39.168271,-86.535082,"[{'label': 'display', 'lat': 39.16827105902477...",153,47404.0,US,Bloomington,IN,United States,"[245 N College Ave, Bloomington, IN 47404, Uni...",,4b54ccc0f964a52007cd27e3
3,Kilroy's Bar & Grill: Sports Bar,Bar,319 N Walnut St,,39.169149,-86.533764,"[{'label': 'display', 'lat': 39.16914893762117...",239,47404.0,US,Bloomington,IN,United States,"[319 N Walnut St, Bloomington, IN 47404, Unite...",,4b1958c4f964a520a4db23e3
5,The Alley Bar,Bar,210 W Kirkwood Ave,at College Ave,39.166586,-86.535283,"[{'label': 'display', 'lat': 39.16658558836782...",99,47404.0,US,Bloomington,IN,United States,"[210 W Kirkwood Ave (at College Ave), Blooming...",,4aebd299f964a520e3c421e3
6,The Tap,Bar,101 N. College Ave.,,39.166654,-86.535163,"[{'label': 'display', 'lat': 39.16665419106344...",86,47404.0,US,Bloomington,IN,United States,"[101 N. College Ave., Bloomington, IN 47404, U...",,507c2800e4b0b5e9215c9444
7,Blockhouse Bar,Music Venue,,,39.165297,-86.534881,"[{'label': 'display', 'lat': 39.165297, 'lng':...",200,,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,5a0fe55ab6b04b6e8f6561d7
8,Hopscotch Coffee - Espresso Bar,Coffee Shop,212 N Madison St,,39.168145,-86.537105,"[{'label': 'display', 'lat': 39.168145, 'lng':...",272,47404.0,US,Bloomington,IN,United States,"[212 N Madison St, Bloomington, IN 47404, Unit...",,58dfa25114f8f401b3bdfd5d
11,Video Saloon,Bar,105 W 7th St,at Walnut St.,39.168412,-86.533956,"[{'label': 'display', 'lat': 39.1684120350497,...",155,47404.0,US,Bloomington,IN,United States,"[105 W 7th St (at Walnut St.), Bloomington, IN...",,4ae293c2f964a5201a8f21e3
12,Soma Coffeehouse & Juice Bar,Coffee Shop,1400 E 3rd St,S Jordan Avenue,39.163922,-86.516277,"[{'label': 'display', 'lat': 39.16392211424054...",1592,47401.0,US,Bloomington,IN,United States,"[1400 E 3rd St (S Jordan Avenue), Bloomington,...",Jordan Square,4b9a5aabf964a5205fae35e3


What a nice-looking dataframe. Let's go ahead and check for some Indie Theatres while we're at it, just to get an idea of who else may be relevant in the area. 'Cafe' didn't make it into the prior listing, but given the small scale of cafes' involvement in local music, we'll give them a pass, since 'Coffee Shop' made it into our 'Bar' listing somehow. 

In [27]:
search_query = 'Indie Theatre'
radius = 2000
print(search_query + ' .... OK!')

Indie Theatre .... OK!


In [28]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

In [29]:
results = requests.get(url).json()
#results

In [30]:
venues = results['response']['venues']

# transform venues into a dataframe
indie_dataframe = json_normalize(venues)
indie_dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,location.city,location.state,location.country,location.formattedAddress,location.address,location.postalCode,location.crossStreet
0,50cd2d5a6ebb09a375d0e1b6,Pixy Theatre,[],v-1564877988,False,39.166491,-86.533591,"[{'label': 'display', 'lat': 39.1664907, 'lng'...",85,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,,
1,5338ccf8498e21d2cbb0fab4,Brontez Theatre & Lounge,[],v-1564877988,False,39.167267,-86.532896,"[{'label': 'display', 'lat': 39.167267, 'lng':...",122,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,,
2,4f32334319836c91c7c003cd,Theatre Cafe,"[{'id': '4bf58dd8d48988d1e0931735', 'name': 'C...",v-1564877988,False,39.1665,-86.532799,"[{'label': 'display', 'lat': 39.1665, 'lng': -...",141,US,Bloomington,IN,United States,"[114 E Kirkwood Ave, Bloomington, IN 47408, Un...",114 E Kirkwood Ave,47408.0,
3,4b043d5df964a520745222e3,Buskirk-Chumley Theater,"[{'id': '4bf58dd8d48988d137941735', 'name': 'T...",v-1564877988,False,39.16631,-86.532804,"[{'label': 'display', 'lat': 39.16631035660300...",151,US,Bloomington,IN,United States,"[114 E Kirkwood Ave, Bloomington, IN 47408, Un...",114 E Kirkwood Ave,47408.0,
4,4bbf86d82a89ef3b2c13ef88,Lee Norvelle Theatre and Drama Center,"[{'id': '4bf58dd8d48988d1ac941735', 'name': 'C...",v-1564877988,False,39.168331,-86.516838,"[{'label': 'display', 'lat': 39.1683309886856,...",1512,US,Bloomington,IN,United States,"[275 N Jordan Ave (7th Street), Bloomington, I...",275 N Jordan Ave,47405.0,7th Street


In [31]:
indie_dataframe.shape

(10, 17)

In [32]:
filtered_columns = ['name', 'categories'] + [col for col in indie_dataframe.columns if col.startswith('location.')] + ['id']
indie_filtered = indie_dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
indie_filtered['categories'] = indie_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
indie_filtered.columns = [column.split('.')[-1] for column in indie_filtered.columns]

indie_filtered.head()

Unnamed: 0,name,categories,lat,lng,labeledLatLngs,distance,cc,city,state,country,formattedAddress,address,postalCode,crossStreet,id
0,Pixy Theatre,,39.166491,-86.533591,"[{'label': 'display', 'lat': 39.1664907, 'lng'...",85,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,,,50cd2d5a6ebb09a375d0e1b6
1,Brontez Theatre & Lounge,,39.167267,-86.532896,"[{'label': 'display', 'lat': 39.167267, 'lng':...",122,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,,,5338ccf8498e21d2cbb0fab4
2,Theatre Cafe,Coffee Shop,39.1665,-86.532799,"[{'label': 'display', 'lat': 39.1665, 'lng': -...",141,US,Bloomington,IN,United States,"[114 E Kirkwood Ave, Bloomington, IN 47408, Un...",114 E Kirkwood Ave,47408.0,,4f32334319836c91c7c003cd
3,Buskirk-Chumley Theater,Theater,39.16631,-86.532804,"[{'label': 'display', 'lat': 39.16631035660300...",151,US,Bloomington,IN,United States,"[114 E Kirkwood Ave, Bloomington, IN 47408, Un...",114 E Kirkwood Ave,47408.0,,4b043d5df964a520745222e3
4,Lee Norvelle Theatre and Drama Center,College Theater,39.168331,-86.516838,"[{'label': 'display', 'lat': 39.1683309886856,...",1512,US,Bloomington,IN,United States,"[275 N Jordan Ave (7th Street), Bloomington, I...",275 N Jordan Ave,47405.0,7th Street,4bbf86d82a89ef3b2c13ef88


In [33]:
indie_filtered['categories'].unique()

array([None, 'Coffee Shop', 'Theater', 'College Theater', 'Movie Theater'],
      dtype=object)

Hmmmm... Unfortunately, 'Movie Theater' and `None` need to be dropped from this list. Without a category for the `None` venues, we can't tell whether they belong in our data, and without knowing what a 'Movie Theater' screens, it may well point us to the wrong kind of artistic venue. Only two venues have no category, so we'll go ahead and drop those by name. 

In [35]:
IndexDrop2 = indie_filtered[indie_filtered['name'] == 'Pixy Theatre'].index
indie_filtered.drop(IndexDrop2, inplace= True)
IndexDrop2 = indie_filtered[indie_filtered['categories'] == 'Movie Theater'].index
indie_filtered.drop(IndexDrop2, inplace= True)
IndexDrop2 = indie_filtered[indie_filtered['name'] == 'Brontez Theatre & Lounge'].index
indie_filtered.drop(IndexDrop2, inplace= True)
indie_filtered

Unnamed: 0,name,categories,lat,lng,labeledLatLngs,distance,cc,city,state,country,formattedAddress,address,postalCode,crossStreet,id
2,Theatre Cafe,Coffee Shop,39.1665,-86.532799,"[{'label': 'display', 'lat': 39.1665, 'lng': -...",141,US,Bloomington,IN,United States,"[114 E Kirkwood Ave, Bloomington, IN 47408, Un...",114 E Kirkwood Ave,47408.0,,4f32334319836c91c7c003cd
3,Buskirk-Chumley Theater,Theater,39.16631,-86.532804,"[{'label': 'display', 'lat': 39.16631035660300...",151,US,Bloomington,IN,United States,"[114 E Kirkwood Ave, Bloomington, IN 47408, Un...",114 E Kirkwood Ave,47408.0,,4b043d5df964a520745222e3
4,Lee Norvelle Theatre and Drama Center,College Theater,39.168331,-86.516838,"[{'label': 'display', 'lat': 39.1683309886856,...",1512,US,Bloomington,IN,United States,"[275 N Jordan Ave (7th Street), Bloomington, I...",275 N Jordan Ave,47405.0,7th Street,4bbf86d82a89ef3b2c13ef88
5,Wells-Metz Theatre,College Theater,39.168625,-86.516606,"[{'label': 'display', 'lat': 39.16862513376156...",1536,US,Bloomington,IN,United States,"[Jordan Avenue (7th street), Bloomington, IN, ...",Jordan Avenue,,7th street,4bad3ecaf964a5207d3d3be3
6,Ruth N. Halls Theatre,College Theater,39.168324,-86.516354,"[{'label': 'display', 'lat': 39.16832397042673...",1554,US,Bloomington,IN,United States,"[Jordan Ave (7th Street), Bloomington, IN, Uni...",Jordan Ave,,7th Street,4bc8f2353740b713b4c85d65
7,Indiana University Opera & Ballet Theater,College Theater,39.166503,-86.517205,"[{'label': 'display', 'lat': 39.16650346049358...",1475,US,Bloomington,IN,United States,"[101 N Jordan Ave, Bloomington, IN 47406, Unit...",101 N Jordan Ave,47406.0,,4c756662ff1fb60c5d19f6a7
8,Theatre West: Studio Theatre,College Theater,39.168385,-86.517416,"[{'label': 'display', 'lat': 39.16838539608138...",1463,US,,Indiana,United States,"[Indiana, United States]",,,,4d376290f1b06ea8ab6c0c55
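
Dropping by name works for this result set, but it would miss any other uncategorized venue if the query returned different rows. A minimal sketch of a category-based filter follows; `sample` is a hypothetical stand-in for `indie_filtered`, and 'Example Cinema' is a placeholder row, not a venue from the results.

```python
import pandas as pd

# Hypothetical stand-in for indie_filtered; 'Example Cinema' is a placeholder row.
sample = pd.DataFrame({
    'name': ['Pixy Theatre', 'Theatre Cafe', 'Buskirk-Chumley Theater', 'Example Cinema'],
    'categories': [None, 'Coffee Shop', 'Theater', 'Movie Theater'],
})

# Keep rows that have a category and whose category is not 'Movie Theater'.
keep = sample['categories'].notna() & (sample['categories'] != 'Movie Theater')
cleaned = sample[keep].copy()

print(cleaned['name'].tolist())
```

The same mask applied to `indie_filtered` should keep the same seven rows as the name-based drops, since the two uncategorized venues are exactly the ones removed by name.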


In [36]:
print('Bar dataframe shape is: ',  dataframe_filtered.shape)
print('Indie dataframe shape is: ', indie_filtered.shape)

Bar dataframe shape is:  (17, 16)
Indie dataframe shape is:  (7, 15)


Our dataframes don't quite match up: the bar dataframe has one more column than the indie dataframe. Let's check the column labels to see where the difference is. 

In [37]:
print('Bar Dataframe Columns are: ', dataframe_filtered.columns)
print('Indie Dataframe Columns are: ', indie_filtered.columns)

Bar Dataframe Columns are:  Index(['name', 'categories', 'address', 'crossStreet', 'lat', 'lng',
       'labeledLatLngs', 'distance', 'postalCode', 'cc', 'city', 'state',
       'country', 'formattedAddress', 'neighborhood', 'id'],
      dtype='object')
Indie Dataframe Columns are:  Index(['name', 'categories', 'lat', 'lng', 'labeledLatLngs', 'distance', 'cc',
       'city', 'state', 'country', 'formattedAddress', 'address', 'postalCode',
       'crossStreet', 'id'],
      dtype='object')


Thankfully, this is a short list, and a quick visual comparison tells us that the bar dataframe includes a __'neighborhood'__ column that the indie venues dataframe lacks. That's easy enough to drop, and then we can concatenate our dataframes for easier visualization. 

In [38]:
dataframe_filtered.drop(columns= 'neighborhood', inplace= True)
dataframe_filtered.columns

Index(['name', 'categories', 'address', 'crossStreet', 'lat', 'lng',
       'labeledLatLngs', 'distance', 'postalCode', 'cc', 'city', 'state',
       'country', 'formattedAddress', 'id'],
      dtype='object')
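
The visual comparison works for a short list; for wider frames the same check can be done with set arithmetic. The sketch below uses the two column lists exactly as printed above, so it runs without the dataframes themselves.

```python
# Column labels copied from the printed output above.
bar_columns = ['name', 'categories', 'address', 'crossStreet', 'lat', 'lng',
               'labeledLatLngs', 'distance', 'postalCode', 'cc', 'city', 'state',
               'country', 'formattedAddress', 'neighborhood', 'id']
indie_columns = ['name', 'categories', 'lat', 'lng', 'labeledLatLngs', 'distance', 'cc',
                 'city', 'state', 'country', 'formattedAddress', 'address', 'postalCode',
                 'crossStreet', 'id']

# Labels present in one frame but not the other.
only_in_bar = sorted(set(bar_columns) - set(indie_columns))
only_in_indie = sorted(set(indie_columns) - set(bar_columns))

print(only_in_bar)    # ['neighborhood']
print(only_in_indie)  # []
```

With the live dataframes, `dataframe_filtered.columns.difference(indie_filtered.columns)` gives the same answer directly.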

In [39]:
print('Bar dataframe shape is: ',  dataframe_filtered.shape)
print('Indie dataframe shape is: ', indie_filtered.shape)

Bar dataframe shape is:  (17, 15)
Indie dataframe shape is:  (7, 15)


Ah, yes, the easily-manipulated, equally-labeled dataframes. Now, to combine, and visualize them!

In [40]:
frames = [dataframe_filtered, indie_filtered]
final_frame = pd.concat(frames, sort = True)
final_frame.shape

(24, 15)

Ah, what a wonderful sight to see. Finally, a moderately comprehensive guide to the venues available in Bloomington. Now, let's create a visual aid to see where all of those places are. 

### The Sites To See, and Hopefully Avoid

In [41]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred on Bloomington's coordinate center

# add a red circle marker to represent Bloomington's Coordinate Center
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Bloomington Base',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the Venues as blue circle markers
for lat, lng, label in zip(final_frame.lat, final_frame.lng, final_frame.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

### The Layout Of Venues

From what we can see on our map, the majority of venues are clustered in the downtown area, with a sparser cluster on the east/north-east side of town. Since most of these venues are bars and other age-restricted venues, this makes sense: Bloomington hosts Indiana University just east of Indiana Avenue. While this is great news for the 21+ crowd, we'd like to find an area friendly to all ages. That leaves an interesting range of business opportunities on the south-east, south, west, and northwest sides of town, where street traffic is densest and there is no overlap with the existing adult venue areas. For our next examination, let's find a central point between the major schools, including Indiana University, to try to find an area that encourages maximum foot traffic, both to serve people without driver's licenses and to lower the overall cost of attending an event. 

In [42]:
search_query = 'School'
radius = 2000
print(search_query + ' .... OK!')

School .... OK!


In [43]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
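
Building the URL with `str.format` works for a one-word query such as 'School', but a query containing spaces or other special characters would need escaping. A hedged sketch using the standard library's `urlencode` follows; every value in `params` is a placeholder (the real `CLIENT_ID`, `CLIENT_SECRET`, `VERSION`, `LIMIT` and coordinates are defined earlier in the notebook), and 'Music Venue' is just an example query.

```python
from urllib.parse import urlencode

base_url = 'https://api.foursquare.com/v2/venues/search'

# Placeholder values only; substitute the notebook's own variables.
params = {
    'client_id': 'YOUR_CLIENT_ID',
    'client_secret': 'YOUR_CLIENT_SECRET',
    'll': '39.1670,-86.5343',   # hypothetical "lat,lng" string
    'v': '20180604',            # hypothetical API version date
    'query': 'Music Venue',     # example query containing a space
    'radius': 2000,
    'limit': 30,
}

# urlencode escapes each value (the space becomes '+', the comma '%2C').
url = base_url + '?' + urlencode(params)
print(url)
```

Alternatively, `requests.get(base_url, params=params)` performs the same encoding internally, so the URL string never has to be assembled by hand.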

In [45]:
results = requests.get(url).json()
#results

In [47]:
venues = results['response']['venues']

# transform venues into a dataframe
btown_schools = json_normalize(venues)
btown_schools.shape

(30, 19)

In [55]:
filtered_columns = ['name', 'categories'] + [col for col in btown_schools.columns if col.startswith('location.')] + ['id']
schools_filtered = btown_schools.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
schools_filtered['categories'] = schools_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
schools_filtered.columns = [column.split('.')[-1] for column in schools_filtered.columns]

schools_filtered.head()

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,crossStreet,id
0,Fairview Elementary School,Elementary School,627 W 8th St,39.168686,-86.539214,"[{'label': 'display', 'lat': 39.16868613970279...",462,47404,US,Bloomington,IN,United States,"[627 W 8th St, Bloomington, IN 47404, United S...",,4b5f2828f964a520bca929e3
1,Kelley School of Business,College Academic Building,1275 E 10th St,39.172043,-86.518514,,1470,47405,US,Bloomington,IN,United States,"[1275 E 10th St (Indiana University), Blooming...",Indiana University,4aeaeea0f964a520c4bc21e3
2,Indiana University Maurer School of Law,Law School,211 S Indiana Ave,39.165389,-86.526621,"[{'label': 'display', 'lat': 39.16538880018181...",686,47405,US,Bloomington,IN,United States,"[211 S Indiana Ave (3rd St), Bloomington, IN 4...",3rd St,4b900532f964a520997033e3
3,IU School of Informatics and Computing,College Technology Building,901 E 10th St,39.172034,-86.522867,"[{'label': 'display', 'lat': 39.17203384401911...",1131,47408,US,Bloomington,IN,United States,"[901 E 10th St (Woodlawn Ave), Bloomington, IN...",Woodlawn Ave,4bd9e1123904a593e207449e
4,School Of Fine Arts At IU,College Arts Building,1201 E 7th St,39.168829,-86.518849,"[{'label': 'display', 'lat': 39.16882927034079...",1347,47405,US,Bloomington,IN,United States,"[1201 E 7th St (Jordan Ave.), Bloomington, IN ...",Jordan Ave.,4ae9f6c0f964a5200cb821e3


In [56]:
schools_filtered.shape

(30, 15)
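
The same three steps (select columns, flatten `categories`, shorten the labels) are repeated for every query in this notebook. One way to avoid redefining `get_category_type` each time is to wrap the steps in a single helper. This is a sketch only: `filter_venues` and `first_category_name` are hypothetical names, and `raw` is a made-up two-row frame shaped like the `json_normalize` output.

```python
import pandas as pd

def first_category_name(categories_list):
    """Return the name of the first category, or None when the list is empty."""
    if not categories_list:
        return None
    return categories_list[0]['name']

def filter_venues(raw):
    """Keep name, categories, location.* and id; flatten categories; shorten labels."""
    location_columns = [col for col in raw.columns if col.startswith('location.')]
    # .copy() so the assignments below modify a new frame, not a view of raw.
    filtered = raw.loc[:, ['name', 'categories'] + location_columns + ['id']].copy()
    filtered['categories'] = filtered['categories'].apply(first_category_name)
    filtered.columns = [column.split('.')[-1] for column in filtered.columns]
    return filtered

# Hypothetical two-row frame shaped like the json_normalize output.
raw = pd.DataFrame({
    'id': ['a1', 'b2'],
    'name': ['Theatre Cafe', 'Pixy Theatre'],
    'categories': [[{'id': 'x', 'name': 'Coffee Shop'}], []],
    'referralId': ['v-1', 'v-1'],
    'location.lat': [39.1665, 39.166491],
    'location.lng': [-86.532799, -86.533591],
})

result = filter_venues(raw)
print(result)
```

With a helper like this, each search would reduce to one call, for example `schools_filtered = filter_venues(btown_schools)`.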

Wow, quite the list of schools. Let's check the values of the categories at hand in case some are unnecessary. 

In [57]:
schools_filtered['categories'].unique()

array(['Elementary School', 'College Academic Building', 'Law School',
       'College Technology Building', 'College Arts Building',
       'College Library', 'General College & University', 'University',
       'Trade School', 'Bus Station', 'College Lab', 'Tennis Court',
       'Office', 'College Classroom', None, 'Nursery School',
       'Australian Restaurant'], dtype=object)

Hmmmm... I appreciate the addition of *Bus Station*, but *Australian Restaurant* and *Office* probably need to go. Let's go ahead and drop those values. 

In [58]:
IndexDrop3 = schools_filtered[schools_filtered['categories'] == 'Office'].index
schools_filtered.drop(IndexDrop3, inplace= True)
IndexDrop3 = schools_filtered[schools_filtered['categories'] == 'Australian Restaurant'].index
schools_filtered.drop(IndexDrop3, inplace= True)
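# NOTE: dropna on the column alone does not remove rows from schools_filtered;
# schools_filtered.dropna(subset=['categories'], inplace=True) would.
schools_filtered['categories'].dropna(inplace = True)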
schools_filtered['categories'].unique()

array(['Elementary School', 'College Academic Building', 'Law School',
       'College Technology Building', 'College Arts Building',
       'College Library', 'General College & University', 'University',
       'Trade School', 'Bus Station', 'College Lab', 'Tennis Court',
       'College Classroom', 'Nursery School'], dtype=object)
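
Calling `dropna` on the `categories` column alone does not remove the matching row from the dataframe itself. The combined frame later in the notebook has 61 rows, which leaves 28 for schools after the 24 venues and 9 residences, so the uncategorized school row is still there. A small sketch of dropping such rows from the whole frame follows; `sample` and its 'Unlabeled Venue' row are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for schools_filtered with one uncategorized row.
sample = pd.DataFrame({
    'name': ['Fairview Elementary School', 'Unlabeled Venue', 'Kelley School of Business'],
    'categories': ['Elementary School', None, 'College Academic Building'],
})

# subset= limits the missing-value check to 'categories' and drops whole rows.
cleaned = sample.dropna(subset=['categories'])

print(sample.shape)   # (3, 2)
print(cleaned.shape)  # (2, 2)
```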

Awesome, a nice, clean school dataframe for the immediate downtown Bloomington area. Let's go ahead and get another map together for easy visual comparison. 

In [59]:
schools_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map

# add a red circle marker to represent Bloomington's Coordinate Center
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Bloomington Base',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(schools_map)

# add the Schools as green circle markers
for lat, lng, label in zip(schools_filtered.lat, schools_filtered.lng, schools_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill = True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(schools_map)

# display map
schools_map

Interesting: although the majority of venues are clustered in the downtown area, Indiana University's buildings put most of the school markers in the eastern half of the city. Let's go ahead and create a visual representation of the school and venue areas together.

In [60]:
combined_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map

# add a red circle marker to represent Bloomington's Coordinate Center
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Bloomington Base',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(combined_map)

# add the Schools as green circle markers
for lat, lng, label in zip(schools_filtered.lat, schools_filtered.lng, schools_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill = True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(combined_map)

# add the Venues as blue circle markers
for lat, lng, label in zip(final_frame.lat, final_frame.lng, final_frame.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(combined_map)
    
# display map
combined_map


We are just booking through this now, aren't we? For a third point of reference, let's check for neighborhoods that are predominantly residential. Identifying these areas gives us another way to find places from which people can easily travel to our new venue!

In [61]:
search_query = 'Residence'
radius = 2000
print(search_query + ' .... OK!')

Residence .... OK!


In [62]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

In [66]:
results = requests.get(url).json()
#results

In [68]:
venues = results['response']['venues']

# transform venues into a dataframe
btown_residence = json_normalize(venues)
btown_residence.shape

(9, 17)

In [70]:
filtered_columns = ['name', 'categories'] + [col for col in btown_residence.columns if col.startswith('location.')] + ['id']
residence_filtered = btown_residence.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
residence_filtered['categories'] = residence_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
residence_filtered.columns = [column.split('.')[-1] for column in residence_filtered.columns]

residence_filtered.head()

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,cc,city,state,country,formattedAddress,postalCode,crossStreet,id
0,Read Residence Hall,College Residence Hall,122 S. Jordan Ave.,39.166423,-86.514476,"[{'label': 'display', 'lat': 39.16642287812105...",1711,US,Bloomington,IN,United States,"[122 S. Jordan Ave., Bloomington, IN, United S...",,,4b853a50f964a520bf5131e3
1,Spruce Residence Center,College Residence Hall,,39.165445,-86.511737,"[{'label': 'display', 'lat': 39.16544547057483...",1954,US,Bloomington,IN,United States,"[Bloomington, IN, United States]",,,521df87111d29e744e7e2abe
2,Willkie Residence Center,College Residence Hall,150 N Rose Ave,39.166337,-86.510646,"[{'label': 'display', 'lat': 39.16633662625786...",2041,US,Bloomington,IN,United States,"[150 N Rose Ave, Bloomington, IN 47406, United...",47406.0,,4b26c5b0f964a520bb8024e3
3,Ashton Residence Center,College Quad,1800 E 10th St,39.170513,-86.511497,"[{'label': 'display', 'lat': 39.17051272597274...",2004,US,Bloomington,IN,United States,"[1800 E 10th St, Bloomington, IN 47406, United...",47406.0,,4b1daa7df964a520b31324e3
4,Spruce Residence Hall,College Residence Hall,,39.165528,-86.511892,"[{'label': 'display', 'lat': 39.16552820361804...",1940,US,Bloomington,IN,United States,"[Bloomington, IN 47401, United States]",47401.0,,50d48746e4b023d652a5acf5


In [71]:
residence_filtered['categories'].unique()

array(['College Residence Hall', 'College Quad', 'Assisted Living',
       'Housing Development'], dtype=object)

In [73]:
residence_filtered.shape

(9, 15)

So, while the College housing results likely give us some good population numbers, they are pretty distinctly lacking in the under-18 category. Hopefully accounting for local educational institutions will give us the under-18 numbers to support the all-ages aspect of this venue. 

### The Final Map

At this point, we've probably exhausted most of the main sources of information available via Foursquare, so let's build a location map, combine our data into one grand master frame, and see where the average point of all of our categories lies. If that point turns out to be too close to other venues, we'll look at adjusting our numbers by excluding values within certain clustering ranges. 

In [74]:
final_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map

# add a red circle marker to represent Bloomington's Coordinate Center
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Bloomington Base',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(final_map)

# add the Schools as green circle markers
for lat, lng, label in zip(schools_filtered.lat, schools_filtered.lng, schools_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill = True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(final_map)

# add the Venues as blue circle markers
for lat, lng, label in zip(final_frame.lat, final_frame.lng, final_frame.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(final_map)
    
# add the Residences as Orange circle markers
for lat, lng, label in zip(residence_filtered.lat, residence_filtered.lng, residence_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='orange',
        popup=label,
        fill = True,
        fill_color='orange',
        fill_opacity=0.6
    ).add_to(final_map)
    
# display map
final_map

In [75]:
master_frames = [final_frame, schools_filtered, residence_filtered]
master_frame = pd.concat(master_frames, sort = True)
master_frame.shape

(61, 15)

In [76]:
print('The average Latitude measurement between all Master List Values is: ', master_frame['lat'].mean())
print('The average Longitude measurement between all Master List Values is: ', master_frame['lng'].mean())

The average Latitude measurement between all Master List Values is:  39.16859064933999
The average Longitude measurement between all Master List Values is:  -86.52516407062333


Awesome, let's make another map and see where our point came out! Hopefully this represents a separate location from the other venues in the area, while also being closer to residence and education areas within the local region. If not, it's back to the drawing board!

In [77]:
master_lat = master_frame['lat'].mean()
master_lng = master_frame['lng'].mean()

In [78]:
target_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map

# add a red circle marker to represent Bloomington's Coordinate Center
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Bloomington Base',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(target_map)

# add the Schools as green circle markers
for lat, lng, label in zip(schools_filtered.lat, schools_filtered.lng, schools_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill = True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(target_map)

# add the Venues as blue circle markers
for lat, lng, label in zip(final_frame.lat, final_frame.lng, final_frame.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(target_map)
    
# add the Residences as Orange circle markers
for lat, lng, label in zip(residence_filtered.lat, residence_filtered.lng, residence_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='orange',
        popup=label,
        fill = True,
        fill_color='orange',
        fill_opacity=0.6
    ).add_to(target_map)
    
# add a Black circle marker to represent Bloomington's New Venue Location!
folium.features.CircleMarker(
    [master_lat, master_lng],
    radius=10,
    color='black',
    popup='Bloomington All-Ages Venue',
    fill = True,
    fill_color = 'black',
    fill_opacity = 0.6
).add_to(target_map)
    
# display map
target_map

Wow, what luck, a reasonable-looking venue location! More data analysis would be appropriate to test the long-term viability, but as far as simple calculations go, this point seems fairly central to a large number of residence areas, with immediate access just outside of the downtown strip, and to the local University as well! 
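
As a first step toward that further analysis, the distance from the candidate point to existing venues can be checked numerically rather than by eye. The sketch below uses the haversine formula with a spherical-Earth radius of 6,371 km (an approximation); `haversine_m` is a hypothetical helper name, and the three venues are copied from the tables above.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres, assuming a spherical Earth."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

# Candidate point: the mean latitude and longitude printed above.
candidate = (39.16859064933999, -86.52516407062333)

# A few venue coordinates copied from the tables above.
venues = {
    'Theatre Cafe': (39.1665, -86.532799),
    'Buskirk-Chumley Theater': (39.16631, -86.532804),
    'Lee Norvelle Theatre and Drama Center': (39.168331, -86.516838),
}

distances = {name: haversine_m(*candidate, lat, lng) for name, (lat, lng) in venues.items()}
for name, metres in sorted(distances.items(), key=lambda item: item[1]):
    print(f'{name}: {metres:.0f} m')
```

Running the same loop over every row of `final_frame` would show whether the candidate point sits inside an existing venue cluster or comfortably outside it.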