## Exploring neighborhood - DATA (part 1 of Capstone Project)

### Data - sourcing and usage

To answer the main question of the project, i.e. to choose the best place to start a Robotics club, the following steps will be performed:

1. Geospatial data. We need to divide the city into districts at least, and ideally into neighborhoods. The only geospatial data I could find for Kyiv is a .geojson file that contains the polygon coordinates of each district's borders. We then need to pick neighborhood centers somehow: by deriving them from the border coordinates, by choosing random points within the polygons, or by manually choosing centers within districts.
https://github.com/denysboiko/kyivmap/blob/1ad68c16c2aa1f2bfe5a31fbbc261b722fbf6a0c/static/media/kyiv.34272c8c.geojson 




2. The main selection parameters are primary schools, rivals, other courses, and rent level.
All of these can be estimated from Foursquare data. As the example below shows, a request with search_query = 'primary school' for Kyiv returns only a couple of primary schools; the rest are various courses, language schools, coworking spaces, etc.
 1. Using the venue category, we can keep only schools and record, for each neighborhood, the number of schools within a 2-3 km radius.
 2. Using the category, we can likewise pick out rivals (technical education) and other courses for the same age group (non-technical education). These counts give us rivals and complements.
 3. To estimate indirectly whether rent is higher or lower, I use the following assumption: the more restaurants, cafés, and bars there are nearby, the higher the rent. So the additional parameter for each neighborhood is the number of such rent-indicating venues.


3. Once the data from the previous step has been collected and munged, an expert opinion is required to set the importance of each parameter. Then we will be able to identify the best options.
Such coefficients could be set based on a humble opinion. However, there is a more sophisticated and more reliable way to get them: we can take the locations of existing Robotics courses / technical clubs, examine their surroundings, and use the actual share of each venue group among all venues as the importance coefficients.
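
One of the options from step 1, deriving a center from the border coordinates, can be sketched as follows. This is only an illustration under assumptions: the feature below is a made-up square rather than real Kyiv data, the `name` property key is hypothetical (the actual .geojson file may use a different key), and the centroid is computed in plain longitude/latitude space, which is an approximation that seems acceptable at district scale.

```python
# Sketch: one candidate center per district from GeoJSON polygon borders.
# Assumptions: coordinates are (lon, lat) pairs as in GeoJSON; the "name"
# property key and the sample feature are hypothetical, not real Kyiv data.

def ring_centroid(ring):
    """Area-weighted centroid of a closed ring of (lon, lat) pairs (shoelace formula)."""
    area2 = 0.0  # twice the signed area of the ring
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate ring: zero area")
    return cx / (3 * area2), cy / (3 * area2)


def district_centers(geojson):
    """Map each feature name to the (lat, lon) centroid of its outer ring."""
    centers = {}
    for feature in geojson["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            outer = geom["coordinates"][0]      # first ring is the outer border
        else:
            outer = geom["coordinates"][0][0]   # MultiPolygon: first polygon only
        lon, lat = ring_centroid(outer)
        centers[feature["properties"]["name"]] = (lat, lon)
    return centers


sample = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "Demo district"},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[30.0, 50.0], [30.2, 50.0], [30.2, 50.1],
                             [30.0, 50.1], [30.0, 50.0]]],
        },
    }],
}

centers = district_centers(sample)
print(centers)
```

A centroid can fall outside a strongly concave district, so a manual check of the resulting points on a map is still advisable.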


Below is the aforementioned example of a "primary school" request for Kyiv (to be used for defining schools, rivals, and complements, i.e. non-technical courses), followed by a request for up to 200 city venues (to be used for defining the share of each venue type around existing Robotics clubs).

In [None]:
#importing libraries
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if geopy is not installed yet
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0; older versions: from pandas.io.json import json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

In [5]:
#Let us look at the city as a whole, so get coordinates of Kyiv
address = 'Kyiv, Ukraine'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Kyiv, Ukraine are 50.4500336, 30.5241361.


In [9]:
results = pd.DataFrame({'Name':['Kyiv'],'Latitude':[latitude],'Longitude':[longitude]})

In [10]:
results

Unnamed: 0,Name,Latitude,Longitude
0,Kyiv,50.450034,30.524136


In [11]:
#Credentials. You don't mean to use them, do you ?
CLIENT_ID = 'N2NVGFIVEEPWIDUNHMWCR0Q2HIQI3XYGO03OVZ3CD5JCK2D0' # your Foursquare ID
CLIENT_SECRET = 'HFQRYHNC5BWU3YTFDQLNN0KLDSD2YOLOZEMHY3SCCYTU4WZE' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

In [19]:
radius = 1000
LIMIT = 100
search_query = 'primary school'
# create the API request URL
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
#url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID,CLIENT_SECRET,VERSION,latitude,longitude,radius,LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=N2NVGFIVEEPWIDUNHMWCR0Q2HIQI3XYGO03OVZ3CD5JCK2D0&client_secret=HFQRYHNC5BWU3YTFDQLNN0KLDSD2YOLOZEMHY3SCCYTU4WZE&ll=50.4500336,30.5241361&v=20180605&query=primary school&radius=1000&limit=100'
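
The URL above contains an unencoded space in the query and the credentials in plain text. A possible alternative, shown here only as a sketch, builds the same search URL from a parameter dictionary and URL-encodes it with the standard library. The function name `build_search_url` and the environment variable names are assumptions made for this example; they are not part of the original notebook or of the Foursquare API.

```python
import os
from urllib.parse import urlencode

# Sketch: build the venues/search URL from a dict so that values are URL-encoded.
# The environment variable names below are hypothetical; placeholders are used
# when they are not set.

def build_search_url(lat, lng, query, radius=1000, limit=100, version="20180605"):
    params = {
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", "YOUR_CLIENT_ID"),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", "YOUR_CLIENT_SECRET"),
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)


demo_url = build_search_url(50.4500336, 30.5241361, "primary school")
print(demo_url)
```

Keeping the credentials outside the notebook also means they do not end up in saved cell outputs.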

In [20]:
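# stored under a separate name so that the `results` dataframe defined above is not overwritten
search_results = requests.get(url).json()

In [21]:
search_results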

{'meta': {'code': 200, 'requestId': '5d63ac78ad1789002c8dc86c'},
 'response': {'venues': [{'id': '4f5f8dffe4b01a2cc85e0ead',
    'name': 'New York Language School',
    'location': {'address': 'вул. Пушкінська, 12а',
     'lat': 50.44622426192626,
     'lng': 30.51844793192073,
     'labeledLatLngs': [{'label': 'display',
       'lat': 50.44622426192626,
       'lng': 30.51844793192073}],
     'distance': 585,
     'cc': 'UA',
     'city': 'Київ',
     'state': 'м. Київ',
     'country': 'Україна',
     'formattedAddress': ['вул. Пушкінська, 12а', 'Київ', 'Україна']},
    'categories': [{'id': '52e81612bcbc57f1066b7a48',
      'name': 'Language School',
      'pluralName': 'Language Schools',
      'shortName': 'Language School',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/education/default_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1566813304',
    'hasPerk': False},
   {'id': '57ee10ba498eeb97eaf896c2',
    'name': 'Progress Englis
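
The category filtering described in step 2 could look roughly like the sketch below. It relies only on fields visible in the response above (`categories`, `name`, `location`, `distance`). The function name `count_by_category` is hypothetical, the second and third sample venues are invented for illustration, and the category names should be checked against the actual Foursquare category list before use.

```python
# Sketch: count venues of selected primary categories, optionally within a distance.
# Assumes the response layout shown above: response -> venues -> categories / location.

def count_by_category(search_response, wanted_categories, max_distance_m=None):
    """Return {category name: count} for venues whose first category is wanted."""
    venues = search_response.get("response", {}).get("venues", [])
    counts = {}
    for venue in venues:
        categories = venue.get("categories") or []
        if not categories:
            continue  # some venues have no category at all
        name = categories[0]["name"]
        if name not in wanted_categories:
            continue
        distance = venue.get("location", {}).get("distance")
        if max_distance_m is not None and (distance is None or distance > max_distance_m):
            continue
        counts[name] = counts.get(name, 0) + 1
    return counts


sample_response = {
    "response": {"venues": [
        # first entry mirrors the real response above, trimmed to the fields used
        {"name": "New York Language School",
         "location": {"distance": 585},
         "categories": [{"name": "Language School"}]},
        # the entries below are invented for illustration
        {"name": "Demo School No. 1",
         "location": {"distance": 1200},
         "categories": [{"name": "Elementary School"}]},
        {"name": "Demo venue without category",
         "location": {"distance": 300},
         "categories": []},
    ]}
}

wanted = {"Language School", "Elementary School"}
all_counts = count_by_category(sample_response, wanted)
near_counts = count_by_category(sample_response, wanted, max_distance_m=1000)
print(all_counts, near_counts)
```

The same function could be applied to the parsed JSON of a real request, once per neighborhood center, with separate category sets for schools, rivals, and complements.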

In [12]:
# declare the same function we used earlier to collect venue data and store it (printing of each name is commented out)
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

In [13]:
# Request up to LIMIT (200) venues around each location; the API may return fewer
LIMIT = 200
city_venues = getNearbyVenues(names=results['Name'],
                                   latitudes=results['Latitude'],
                                   longitudes=results['Longitude']
                                  )

In [14]:
city_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kyiv,50.450034,30.524136,Cinnabon,50.450865,30.522829,Bakery
1,Kyiv,50.450034,30.524136,П‘яна Вишня,50.450969,30.521464,Cocktail Bar
2,Kyiv,50.450034,30.524136,Très Français,50.451462,30.523311,French Restaurant
3,Kyiv,50.450034,30.524136,Art Eclair,50.45198,30.523797,Dessert Shop
4,Kyiv,50.450034,30.524136,Майдан Незалежності,50.449939,30.524118,Plaza
5,Kyiv,50.450034,30.524136,UA made,50.450635,30.523261,Gift Shop
6,Kyiv,50.450034,30.524136,МАМА ГОЧІ,50.448205,30.52564,Caucasian Restaurant
7,Kyiv,50.450034,30.524136,Львівські Круасани,50.449121,30.52281,Bakery
8,Kyiv,50.450034,30.524136,Умка,50.452139,30.524015,Ice Cream Shop
9,Kyiv,50.450034,30.524136,Soroka Coffee Bar,50.448299,30.522869,Coffee Shop
