<span style="text-align: center">
        
# Battle of the Neighborhoods (Week 1)
#### Part of the IBM Data Science Certification: Applied Data Science Capstone, Final Project
    
</span>

<div style="margin-left: 50px; margin-right: 50px; padding-top: 30px">
    
<span style="text-align: center">
        
## Abstract
        
</span>
    
In this work, we examine geographical locations in three neighboring areas of northern Virginia. Each area is divided into equally sized bounding boxes, and the types of venues (shops, restaurants, amenities) in each box are examined to find the best fit for a predetermined set of preferences. This exercises a content-based recommender algorithm to determine the top `N` (in this case `N = 3`) sections across the neighborhoods.
    
</div>
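
As a rough illustration of the content-based scoring idea, the sketch below weights each section's venue counts by a set of preference weights and keeps the `N = 3` highest-scoring sections. The box labels, counts, and weights are made-up placeholders, and `top_n_sections` is a hypothetical helper rather than this notebook's actual implementation.

```python
# Hypothetical venue counts per bounding box: {box label: {category: count}}.
venue_counts = {
    'box_a': {'Bakery': 3, 'Gym': 1},
    'box_b': {'Gym': 2, 'Metro Station': 1},
    'box_c': {'Bakery': 1, 'Metro Station': 1},
    'box_d': {'Bakery': 2, 'Gym': 1},
}

# Hypothetical preference weights, e.g. derived from a survey.
preferences = {'Bakery': 0.5, 'Gym': 0.2, 'Metro Station': 0.3}


def top_n_sections(counts, weights, n=3):
    """Rank sections by the weighted sum of their normalized venue counts."""
    # Largest count seen for each category, used to scale counts into [0, 1].
    max_count = {}
    for box in counts.values():
        for category, count in box.items():
            max_count[category] = max(max_count.get(category, 0), count)

    scores = {}
    for label, box in counts.items():
        scores[label] = sum(
            weights.get(category, 0.0) * count / max_count[category]
            for category, count in box.items()
            if max_count[category] > 0
        )
    # Highest score first; keep only the top n sections.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


for label, score in top_n_sections(venue_counts, preferences):
    print(f'{label}: {score:.3f}')
```

Scaling each category by its largest count is one possible design choice; it keeps a very common category from dominating the score simply because it has more venues.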

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>
 
1. <a href="#introduction">Introduction</a>
    
1. <a href="#data_description">Data Description</a>
    
1. <a href="#methodology">Methodology</a>
    
1. <a href="#results">Results</a>
    
1. <a href="#t4">Discussion</a>
    
1. <a href="#conclusion">Conclusion</a>
    
</font>
</div>

## Preamble: Imports, Constants, and Functions

In [4]:
# The code was removed by Watson Studio for sharing.

In [2]:
%%capture
# Get stuff installed
!pip install geocoder
!pip install foursquare
!pip install folium
!pip install wordcloud

import pandas as pd
import numpy as np
# import k-means from clustering stage
from sklearn.cluster import KMeans

# Geo-data
import geocoder
import foursquare
import folium # mapping

# Viz stuff
# Matplotlib and associated plotting modules
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
%matplotlib inline

# import package and its set of stopwords
from wordcloud import WordCloud, STOPWORDS
stopwords = set(STOPWORDS)


# Watson Studio stuff
from project_lib import Project

# Others
import re
import html
import math

In [8]:
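KM_PER_DEGREE = 111.32  # approximate kilometers per degree of arc at the equator
R_EARTH = 6378.1  # Earth's equatorial radius in kilometers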
CHANTILLY = 0
TYSONS = 1
ARLINGTON = 2
ALEXANDRIA = 3
PLACES = pd.DataFrame({
    'id': [CHANTILLY, TYSONS, ARLINGTON, ALEXANDRIA],
    'place': ['Chantilly, VA','Tysons Corner, VA', 'Arlington, VA', 'Alexandria, VA'], 
    'radius-m': [2780, 1660, 4190, 3170]
    })

# Geocode each place and append its latitude/longitude as new columns
coords = [geocoder.google(place, key=api_key).latlng for place in PLACES['place']]
PLACES = pd.concat([PLACES, pd.DataFrame(coords, columns=['lat', 'lng'])], axis=1)
CENTER = PLACES[['lat', 'lng']].mean()
BOUNDS = [list(PLACES[['lat','lng']].min().values), list(PLACES[['lat', 'lng']].max().values)]

FSQ_CLIENT = foursquare.Foursquare(client_id=fsq['id'], client_secret=fsq['sec'], version=fsq['ver'])
PROJECT = Project(None, proj['id'], proj['token'])

SURVEY = pd.read_csv('Survey.csv')

print(f"This notebook is part of the '{PROJECT.get_name()}' project")

FileNotFoundError: [Errno 2] File b'Survey.csv' does not exist: b'Survey.csv'
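
To illustrate how a circular area such as those in `PLACES` might be divided into equally sized bounding boxes, the sketch below lays a square grid over a center point using a kilometers-per-degree conversion. `grid_boxes` and the cell size are hypothetical, the center coordinates are illustrative values rather than geocoder output, and the flat-Earth approximation is an assumption that is only reasonable over a few kilometers.

```python
import math

KM_PER_DEGREE = 111.32  # approximate kilometers per degree of arc


def grid_boxes(lat, lng, radius_m, cell_m=1000):
    """Return (south, west, north, east) boxes tiling the square around a circle."""
    # Convert the cell size from meters to degrees; longitude degrees shrink with latitude.
    dlat = cell_m / 1000 / KM_PER_DEGREE
    dlng = cell_m / 1000 / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    # Number of cells needed on each side of the center to cover the radius.
    half = math.ceil(radius_m / cell_m)
    boxes = []
    for i in range(-half, half):
        for j in range(-half, half):
            south, west = lat + i * dlat, lng + j * dlng
            boxes.append((south, west, south + dlat, west + dlng))
    return boxes


# Illustrative center near Tysons Corner with the 1660 m radius from PLACES.
boxes = grid_boxes(38.9187, -77.2311, radius_m=1660, cell_m=1000)
print(len(boxes), 'boxes; first:', boxes[0])
```

Boxes whose corners all fall outside the circle could be discarded in a later step; this sketch keeps the full square for simplicity.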

<a id="introduction"></a>

## Introduction

ACME, Inc. is a growing company currently located in Chantilly, Virginia. Because of this growth, the company needs a larger office and has decided to move closer to Washington, D.C., in response to its employees' preference. The owners of ACME, Inc. have determined that *Tysons Corner*, *Arlington*, or *Alexandria* would all be viable locations for the new office. However, choosing a specific location has proven difficult.

To improve their employees' work experience, they issued an employee-wide survey to determine the types of venues, shops, restaurants, or other amenities that need to be near the new location (within approximately 500 meters, or about a 5-minute walk).
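
The "within approximately 500 meters" criterion can be checked with a great-circle distance. The sketch below uses the haversine formula with the same Earth radius as the `R_EARTH` constant; `haversine_m` and `is_walkable` are hypothetical helpers, and the two sample points are venue coordinates taken from the Foursquare output shown later in this notebook, used here only as stand-ins for a candidate site and a nearby venue.

```python
import math

R_EARTH = 6378.1  # Earth's radius in kilometers, matching the notebook's constant


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * R_EARTH * 1000 * math.asin(math.sqrt(a))


def is_walkable(site, venue, limit_m=500):
    """True if the venue lies within limit_m meters of the candidate site."""
    return haversine_m(*site, *venue) <= limit_m


rei = (38.9183498, -77.2288267)                  # REI, from the Foursquare response
metro = (38.92044684686478, -77.22177037739041)  # Tysons Corner Metro Station
print(round(haversine_m(*rei, *metro)), is_walkable(rei, metro))
```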

In [26]:
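# Draw each area as a circle: the current office (Chantilly) in red, candidates in blue.
# The map is named site_map to avoid shadowing the built-in map().
site_map = folium.Map(location=CENTER, zoom_control=False)
site_map.fit_bounds(BOUNDS)
for place_id, row in PLACES[['lat', 'lng', 'radius-m']].iterrows():
    folium.vector_layers.Circle(
        location=(row.loc['lat'], row.loc['lng']),
        radius=row.loc['radius-m'],
        fill=True,
        color='red' if place_id == CHANTILLY else 'blue',
    ).add_to(site_map)

# Mark each candidate site and its 500 m walking-distance circle
for _, row in SITES.iterrows():
    folium.map.Marker([row.loc['lat'], row.loc['lng']]).add_to(site_map)
    folium.vector_layers.Circle([row.loc['lat'], row.loc['lng']], radius=500, color='black').add_to(site_map)
site_map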


In [6]:
# Format the Tysons Corner coordinates as the "lat,lng" string passed to Foursquare
tysons = PLACES[['lat', 'lng']].iloc[TYSONS, :].values
tysons = f'{tysons[0]},{tysons[1]}'

In [7]:
tysons
t_venues = FSQ_CLIENT.venues.explore(params={'ll': tysons, 'radius': 1660, 'limit': 1000, 'time': 'any'})

In [8]:
t_venues


{'suggestedFilters': {'header': 'Tap to show:',
  'filters': [{'name': 'Open now', 'key': 'openNow'},
   {'name': '$-$$$$', 'key': 'price'}]},
 'headerLocation': 'Tysons Corner',
 'headerFullLocation': 'Tysons Corner',
 'headerLocationGranularity': 'city',
 'totalResults': 144,
 'suggestedBounds': {'ne': {'lat': 38.933662214940014,
   'lng': -77.21192618431388},
  'sw': {'lat': 38.90378218505998, 'lng': -77.25025881568612}},
 'groups': [{'type': 'Recommended Places',
   'name': 'recommended',
   'items': [{'reasons': {'count': 0,
      'items': [{'summary': 'This spot is popular',
        'type': 'general',
        'reasonName': 'globalInteractionReason'}]},
     'venue': {'id': '51891fea498ee05ee808e258',
      'name': 'REI',
      'location': {'address': '8209 Watson Street',
       'lat': 38.9183498,
       'lng': -77.2288267,
       'labeledLatLngs': [{'label': 'display',
         'lat': 38.9183498,
         'lng': -77.2288267}],
       'distance': 200,
       'postalCode': '22102'
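
The nested response above can be flattened into one record per venue. The sketch below assumes the `groups → items → venue` layout visible in the output; `flatten_explore` is a hypothetical helper, and the sample dictionary is a hand-trimmed stand-in for `t_venues`, not real API output.

```python
def flatten_explore(response):
    """Return one flat dict per venue from an explore-style response."""
    rows = []
    for group in response.get('groups', []):
        for item in group.get('items', []):
            venue = item['venue']
            categories = venue.get('categories') or []
            rows.append({
                'id': venue['id'],
                'name': venue['name'],
                'lat': venue['location']['lat'],
                'lng': venue['location']['lng'],
                # Venues without a category get None rather than raising an error.
                'category': categories[0]['name'] if categories else None,
            })
    return rows


# Hand-trimmed stand-in with the same layout as the response shown above.
# The first venue's categories are emptied here to exercise the missing-category branch.
sample = {
    'groups': [{
        'items': [
            {'venue': {'id': '51891fea498ee05ee808e258', 'name': 'REI',
                       'location': {'lat': 38.9183498, 'lng': -77.2288267},
                       'categories': []}},
            {'venue': {'id': '505367d2e4b058885119e5b4',
                       'name': 'Tysons Corner Metro Station',
                       'location': {'lat': 38.92044684686478, 'lng': -77.22177037739041},
                       'categories': [{'name': 'Metro Station'}]}},
        ],
    }],
}

for row in flatten_explore(sample):
    print(row)
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame` for further analysis.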

In [17]:
import json

# Credentials come from the hidden Watson Studio cell instead of being hard-coded here
project = Project(None, proj['id'], proj['token'])
project.save_data('test.json', data=json.dumps(t_venues))

{'file_name': 'test.json',
 'message': 'File saved to project storage.',
 'bucket_name': 'courseradatasciencecertification-donotdelete-pr-rlttrl1sp7ama1',
 'asset_id': 'ef73151f-5478-47a6-8fe9-6d2ca910a0f8'}

In [30]:
# Print every venue whose primary category is a Metro station
for v in t_venues['groups'][0]['items']:
    categories = v['venue']['categories']
    if categories and categories[0]['name'] == 'Metro Station':
        print(v['venue'])

{'id': '505367d2e4b058885119e5b4', 'name': 'Tysons Corner Metro Station', 'location': {'address': '1943 Chain Bridge Rd', 'crossStreet': 'Tysons Blvd', 'lat': 38.92044684686478, 'lng': -77.22177037739041, 'labeledLatLngs': [{'label': 'display', 'lat': 38.92044684686478, 'lng': -77.22177037739041}], 'distance': 829, 'postalCode': '22102', 'cc': 'US', 'city': 'Tysons Corner', 'state': 'VA', 'country': 'United States', 'formattedAddress': ['1943 Chain Bridge Rd (Tysons Blvd)', 'Tysons Corner, VA 22102', 'United States']}, 'categories': [{'id': '4bf58dd8d48988d1fd931735', 'name': 'Metro Station', 'pluralName': 'Metro Stations', 'shortName': 'Metro', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/subway_', 'suffix': '.png'}, 'primary': True}], 'photos': {'count': 0, 'groups': []}, 'venuePage': {'id': '37501183'}}



In [49]:
for cat in cats:
    print(cat)

Adult Boutique
Animal Shelter
Auto Dealership
Bakery
Bank
Boutique
Building
Business Service
Café
Car Wash
Chiropractor
Cocktail Bar
College Classroom
Conference Room
Coworking Space
Deli / Bodega
Dentist's Office
Doctor's Office
Electronics Store
Event Space
Eye Doctor
Financial or Legal Service
Flower Shop
Food
Food Truck
General Entertainment
Government Building
Gym
Gym / Fitness Center
Hookah Bar
Hotel
IT Services
Italian Restaurant
Korean Restaurant
Laundry Service
Liquor Store
Lounge
Mattress Store
Medical Center
Meeting Room
Men's Store
Office
Persian Restaurant
Pilates Studio
Residential Building (Apartment / Condo)
Shopping Mall
Spiritual Center
Sporting Goods Shop
Sushi Restaurant
Taxi
Tech Startup
Travel & Transport
Urgent Care Center



In [51]:
v_cats = [venue for venue in t_venues['venues'] if len(venue['categories']) > 0]

In [54]:
# Print venues whose primary category mentions transport
for venue in v_cats:
    if 'Transport' in venue['categories'][0]['name']:
        print(venue)

{'id': '4e91bb7061afcfde2a582a7d', 'name': 'Parking Garage @ Hilton Garden Inn', 'location': {'lat': 38.916551853142764, 'lng': -77.2327270085821, 'labeledLatLngs': [{'label': 'display', 'lat': 38.916551853142764, 'lng': -77.2327270085821}], 'distance': 280, 'cc': 'US', 'city': 'Vienna', 'state': 'VA', 'country': 'United States', 'formattedAddress': ['Vienna, VA', 'United States']}, 'categories': [{'id': '4d4b7105d754a06379d81259', 'name': 'Travel & Transport', 'pluralName': 'Travel & Transport', 'shortName': 'Travel', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/default_', 'suffix': '.png'}, 'primary': True}], 'referralId': 'v-1586536460', 'hasPerk': False}



In [55]:
# Print venues whose primary category mentions an office
for venue in v_cats:
    if 'Office' in venue['categories'][0]['name']:
        print(venue)

{'id': '5cf655eda22db7002cc97536', 'name': 'David Chow, MD, MPH, FACS', 'location': {'address': '8330 Boone Blvd Ste 160', 'lat': 38.9172564, 'lng': -77.2328814, 'labeledLatLngs': [{'label': 'display', 'lat': 38.9172564, 'lng': -77.2328814}], 'distance': 225, 'postalCode': '22182', 'cc': 'US', 'city': 'Vienna', 'state': 'VA', 'country': 'United States', 'formattedAddress': ['8330 Boone Blvd Ste 160', 'Vienna, VA 22182', 'United States']}, 'categories': [{'id': '4bf58dd8d48988d177941735', 'name': "Doctor's Office", 'pluralName': "Doctor's Offices", 'shortName': "Doctor's Office", 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/medical_doctorsoffice_', 'suffix': '.png'}, 'primary': True}], 'referralId': 'v-1586536460', 'hasPerk': False}
{'id': '4b41f10bf964a52083ca25e3', 'name': 'Management Concepts', 'location': {'address': '8230 Leesburg Pike', 'lat': 38.917718390552814, 'lng': -77.22928520467556, 'labeledLatLngs': [{'label': 'display', 'lat': 38.917718390552814, 'l


In [61]:
CENTER

lat    38.874451
lng   -77.203971
dtype: float64

In [63]:
PLACES[['lat', 'lng']].min()

lat    38.804836
lng   -77.431099
dtype: float64

In [67]:
print(type(PLACES[['lat','lng']].max().values))

<class 'numpy.ndarray'>
