# Capstone Project - The Battle of Neighborhoods


# Week1/Part 1

### 1) A description of the problem and a discussion of the background
### 2) A description of the data and how it will be used to solve the problem

In [1]:
# Import all required libraries
import numpy as np # library to handle data in a vectorized manner
import time
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0 exposes this as pd.json_normalize)

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
!conda install -c conda-forge folium=0.5.0 --yes # comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')


In [2]:
## A description of the problem and a discussion of the background

### Introduction:
My use case: I work at XYZ Company and live in the Schaumburg, Chicago area. My home is within walking distance of Woodfield Mall and close to all the basic amenities. I enjoy many amenities and venues in the area, such as international and Indian restaurants, cafes, food shops and entertainment. I have been offered a great opportunity to work for the same company, XYZ, in New York City. I am excited, and I want to use this opportunity to try out what I have learned so far on Coursera to answer the relevant questions, chiefly: how can I find a place as convenient and enjoyable as the one I have in Schaumburg, Chicago? The idea is to apply what I learned during the course. To compare and evaluate the rental options in NY, I must set some criteria. The apartment in NY must therefore meet the following demands:


•	Apartment must have 1 bedroom <br>
•	Location is in an NY suburb, within 1.0 mile of a metro station <br>
•	Rent must not exceed $XXXXX per month <br>
•	Amenities in the neighborhood should be similar to those near the current residence <br>
•	Venues such as coffee shops, Indian restaurants, gyms and Asian food shops <br>
•	I have included a map of venues near current residence in Schaumburg, Chicago. <br>


### Problem Statement:
The challenge is to find a suitable apartment or studio for rent in or around New York City that meets the demands on location, price and venues. The data required to resolve this challenge is described in the following section.


In [3]:
## A description of the data and how it will be used to solve the problem

### Data Section: Description of Data
The following data is required to address the problem: <br>
•	List of boroughs and neighborhoods of NY with their GeoData (latitude and longitude) <br>
•	List of metro and bus stations in and around NY with their addresses <br>
•	List of apartments for rent in the surrounding area with their addresses and prices <br>
•	List of apartments for rent with additional information, such as price, address and area <br>
•	Venues for each neighborhood 


#### How the data will be used to solve the problem
The data will be used as follows: <br>
•	Use Foursquare and GeoPy data to map the top 10 venues for every neighborhood, clustered in groups <br>
•	Use Foursquare and GeoPy data to map the location of subway metro stations, both separately and on top of the clustered map above, so that the venues and amenities near each metro station can be identified, or each subway location explored separately <br>
•	Use Foursquare and GeoPy data to map the location of rental places <br>
•	Create a map that depicts the average rental price within a 1.0 mile radius of each subway station <br>
•	Convert the addresses of rental locations to GeoData (latitude and longitude) using GeoPy distance and Nominatim <br>
•	Search for data in open data sources if available, on real estate sites if open to reading, and in libraries or government agencies such as Metro New York MTA, etc. <br>


Processing this data will make it possible to answer the key questions behind the decision: <br>
•	What is the cost of apartment rent within a one-mile radius of each subway metro station? <br>
•	Which suburb area has the best rental pricing among those that meet the established criteria?<br>
•	What is the distance between the workplace and the tentative future home?<br>
•	What are the venues of the two best places to live? <br>
•	How are venues distributed among neighborhoods and around metro stations?<br>
•	Any other interesting statistical findings from the real estate and overall data.<br>
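
The first question above (rent within a one-mile radius of each station) could be approached roughly as follows. This is only a sketch: it swaps in a plain haversine formula for the GeoPy distance mentioned earlier so that it runs with the standard library alone, and the station, the listings, their coordinates and their rents are all hypothetical placeholders, not real data.

```python
import math
from statistics import mean

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def average_rent_near(station, rentals, radius_miles=1.0):
    """Mean rent of the rentals within radius_miles of a station, or None if there are none."""
    nearby = [
        r['rent'] for r in rentals
        if haversine_miles(station['lat'], station['lng'], r['lat'], r['lng']) <= radius_miles
    ]
    return mean(nearby) if nearby else None


# Hypothetical sample data, for illustration only
stations = [{'name': 'Station A', 'lat': 40.7500, 'lng': -73.9900}]
rentals = [
    {'address': 'Listing 1', 'lat': 40.7510, 'lng': -73.9910, 'rent': 2500},
    {'address': 'Listing 2', 'lat': 40.7550, 'lng': -73.9850, 'rent': 3100},
    {'address': 'Listing 3', 'lat': 40.8500, 'lng': -73.9000, 'rent': 1800},  # several miles away
]

for station in stations:
    print(station['name'], average_rent_near(station, rentals))
```

Returning `None` for a station with no nearby listings keeps "no data" distinct from a genuine average, which matters once the results are drawn on a map.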


In [4]:
# Chicago
address = 'Schaumburg, Chicago'

# Recent geopy releases require an explicit user_agent; the name below is an arbitrary placeholder
geolocator = Nominatim(user_agent='schaumburg_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the Chicago home are {}, {}.'.format(latitude, longitude))



The geographical coordinates of the Chicago home are 42.024937300000005, -88.06305248648468.
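
A geopy geocoder returns `None` when an address cannot be matched, in which case `location.latitude` above would fail with an `AttributeError`. The sketch below shows one way to guard against that. The helper `coordinates_for` and the stand-in `fake_geocode` are hypothetical names introduced here so that the example runs without network access; in the notebook, `geolocator.geocode` would be passed instead.

```python
from collections import namedtuple


def coordinates_for(address, geocode):
    """Return (latitude, longitude) for an address using a geopy-style geocode callable."""
    location = geocode(address)
    if location is None:  # no match found for this address
        raise ValueError('No geocoding result for {!r}'.format(address))
    return location.latitude, location.longitude


# Stand-in for geolocator.geocode (hypothetical, for illustration only)
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])


def fake_geocode(address):
    known = {'Schaumburg, Chicago': FakeLocation(42.024937300000005, -88.06305248648468)}
    return known.get(address)


print(coordinates_for('Schaumburg, Chicago', fake_geocode))
```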


In [5]:
neighborhood_latitude = 42.024937300000005
neighborhood_longitude = -88.06305248648468

In [6]:
# The code was removed by Watson Studio for sharing.
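
The hidden cell presumably defines the `CLIENT_ID`, `CLIENT_SECRET` and `VERSION` values used in the next cell. One way to supply them without putting secrets in a shared notebook is to read them from environment variables, sketched below. The variable names and the default version date are assumptions made for illustration, not part of the original notebook.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read Foursquare credentials from a mapping of environment variables (hypothetical names)."""
    try:
        client_id = env['FOURSQUARE_CLIENT_ID']
        client_secret = env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing))
    # The API version is a date in YYYYMMDD form; this default is only a placeholder
    version = env.get('FOURSQUARE_VERSION', '20180605')
    return client_id, client_secret, version


# Dummy values so the sketch runs as-is; real use would rely on os.environ
CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials({
    'FOURSQUARE_CLIENT_ID': 'dummy-id',
    'FOURSQUARE_CLIENT_SECRET': 'dummy-secret',
})
```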

In [7]:
LIMIT = 100 # maximum number of venues returned by the Foursquare API
radius = 500 # search radius in meters

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
##url # display URL
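
As an alternative to formatting the query string by hand, the same URL can be assembled from a dictionary of parameters, which avoids stray `&` characters and escapes values automatically. This is a sketch using only the standard library; `build_explore_url` is a hypothetical helper name and the credentials passed in the example are dummies.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Build a venues/explore URL from keyword parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,  # meters
        'limit': limit,
    }
    # safe=',' keeps the comma in the ll value readable instead of encoding it as %2C
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params, safe=',')


example_url = build_explore_url('ID', 'SECRET', '20180605', 42.0249373, -88.0630525)
print(example_url)
```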

In [8]:
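response = requests.get(url, timeout=30)
response.raise_for_status() # stop early on an HTTP error instead of parsing an error body
results = response.json()
#results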
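
Before indexing into `results`, it can help to confirm that the request succeeded and that at least one group of venues came back. The sketch below assumes the response carries a `meta` object with a numeric `code` next to the `response` object that the notebook reads later. `extract_items` is a hypothetical helper, and the sample payload is made up to mirror that shape.

```python
def extract_items(results):
    """Return the venue items from an explore-style response dict, or [] if there are none."""
    code = results.get('meta', {}).get('code')
    if code != 200:
        raise RuntimeError('Foursquare request failed with code {}'.format(code))
    groups = results.get('response', {}).get('groups', [])
    return groups[0].get('items', []) if groups else []


# Made-up payload shaped like the structure accessed later in the notebook
sample_results = {
    'meta': {'code': 200},
    'response': {'groups': [{'items': [{'venue': {'name': 'Example Cafe'}}]}]},
}
print(len(extract_items(sample_results)))
```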

In [9]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError: # the flattened explore response stores the list under 'venue.categories'
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
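
To illustrate what the function returns, the sketch below restates it and applies it to three hand-made rows: one keyed by `categories`, one keyed by `venue.categories`, and one with an empty category list. Plain dictionaries stand in for DataFrame rows here, and the sample category values are illustrative only.

```python
def get_category_type(row):
    """Return the name of the first category in a row, or None if the list is empty."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']


# Hand-made rows (illustrative values only)
rows = [
    {'categories': [{'name': 'Indian Restaurant'}]},
    {'venue.categories': [{'name': 'Grocery Store'}, {'name': 'Market'}]},
    {'venue.categories': []},
]
print([get_category_type(row) for row in rows])
```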

#### Venues around the current residence in Schaumburg, Chicago, as a reference for comparison with New York City or any new place

In [10]:
venues = results['response']['groups'][0]['items']
    
SGnearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
SGnearby_venues = SGnearby_venues.loc[:, filtered_columns]

# replace each row's category list with the name of its first category
SGnearby_venues['venue.categories'] = SGnearby_venues.apply(get_category_type, axis=1)

# clean columns
SGnearby_venues.columns = [col.split(".")[-1] for col in SGnearby_venues.columns]

SGnearby_venues.head(15)

|  | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Maxfield's Pancake House | Breakfast Spot | 42.029146 | -88.0618 |
| 1 | Schaumburg Prairie Center for the Arts | Performing Arts Venue | 42.025993 | -88.06702 |
| 2 | Bawarchi Biryani Point | Indian Restaurant | 42.027498 | -88.059736 |
| 3 | Volkening Heritage Farm | Farm | 42.023838 | -88.061172 |
| 4 | Patel Brothers | Grocery Store | 42.027654 | -88.059611 |
| 5 | Baskin-Robbins | Ice Cream Shop | 42.027741 | -88.059312 |
| 6 | The UPS Store | Shipping Store | 42.027485 | -88.059264 |
| 7 | Redbox | Video Store | 42.02848 | -88.061477 |
| 8 | Dunkin' | Donut Shop | 42.027776 | -88.059315 |
| 9 | Walgreens | Pharmacy | 42.028743 | -88.061441 |


### Map of Schaumburg, Chicago with venues near the residence, for reference

In [11]:
# create a map centered on the Chicago home using its latitude and longitude values
map_sg = folium.Map(location=[latitude, longitude], zoom_start=20)

# add markers to map
for lat, lng, label in zip(SGnearby_venues['lat'], SGnearby_venues['lng'], SGnearby_venues['name']):
    label = folium.Popup(label, parse_html=True)
    folium.RegularPolygonMarker(
        [lat, lng],
        number_of_sides=4,
        radius=10,
        popup=label,
        color='blue',
        fill_color='#0f0f0f',
        fill_opacity=0.7,
    ).add_to(map_sg)  
    
map_sg
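
A fixed `zoom_start` may not frame every venue. One option is to compute the bounding box of the venue coordinates and hand it to the map's `fit_bounds` method, so the view adapts to the data. The sketch below uses the ten venue coordinates listed in the table above. `bounding_box` is a hypothetical helper, and the `fit_bounds` call is left as a comment so that the block runs without folium.

```python
def bounding_box(points):
    """Return [[south, west], [north, east]] for a list of (lat, lng) pairs."""
    lats = [lat for lat, _ in points]
    lngs = [lng for _, lng in points]
    return [[min(lats), min(lngs)], [max(lats), max(lngs)]]


# Coordinates of the venues around the Schaumburg residence
venue_points = [
    (42.029146, -88.0618), (42.025993, -88.06702), (42.027498, -88.059736),
    (42.023838, -88.061172), (42.027654, -88.059611), (42.027741, -88.059312),
    (42.027485, -88.059264), (42.02848, -88.061477), (42.027776, -88.059315),
    (42.028743, -88.061441),
]

bounds = bounding_box(venue_points)
print(bounds)
# map_sg.fit_bounds(bounds)  # would zoom the folium map to show all markers
```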