# Battle of Neighborhoods Coursera Project

## Introduction and Data Sections

### Scenario

How can I find a convenient and enjoyable place to live in Manhattan that is similar to my current one in Singapore? To compare and evaluate the rental options in Manhattan, NY, I need a set of criteria. The apartment in Manhattan must meet the following demands:

* apartment must have 2 or 3 bedrooms
* desired location is in Manhattan, within a 1.0 mile (1.6 km) radius of a metro station
* price of rent should not exceed 7,000 dollars per month
* top amenities in the selected neighborhood should be similar to those of the current residence
* desirable to have venues such as coffee shops, Asian and Thai restaurants, wine stores, gyms and food shops
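
The demands above can be read as a simple filter over a list of rental listings. The following is a minimal sketch, assuming a hypothetical `Listing` record with `bedrooms`, `monthly_rent` and `miles_to_subway` fields. None of these names come from the actual rental dataset, and the sample listings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """Hypothetical rental listing; the field names are assumptions."""
    address: str
    bedrooms: int
    monthly_rent: float     # US dollars per month
    miles_to_subway: float  # distance to the nearest subway station


def meets_demands(listing, max_rent=7000, max_miles=1.0):
    """Return True if the listing satisfies the bedroom, price and distance demands."""
    return (
        listing.bedrooms in (2, 3)
        and listing.monthly_rent <= max_rent
        and listing.miles_to_subway <= max_miles
    )


# Made-up sample data, for illustration only
listings = [
    Listing("Sample A", 2, 6500, 0.3),
    Listing("Sample B", 1, 4000, 0.2),
    Listing("Sample C", 3, 7500, 0.5),
    Listing("Sample D", 3, 6900, 1.4),
]

candidates = [item for item in listings if meets_demands(item)]
print([item.address for item in candidates])
```

The venue-related demands are not covered by this filter; they depend on the neighborhood comparison described later.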


### Business Problem

The challenge is to find a suitable apartment for rent in Manhattan, NY that meets the demands on location, price and venues. The data required to resolve this challenge is described in the following section.

### Interested Audience

I believe this is a relevant challenge, with valid questions for anyone moving to another large city in the US, EU or Asia. The same methodology can be applied to other sets of demands. The case is also applicable to anyone interested in starting or locating a new business in any city. Lastly, it can serve as a practical exercise for developing data science skills.

### Description of the Data

The following data is required to address the problem:

* List of boroughs and neighborhoods of Manhattan with their geodata (latitude and longitude)
* List of subway metro stations in Manhattan with their address location
* List of apartments for rent in the Manhattan area with their addresses and prices
* Preferably, a list of apartments for rent with additional information, such as price, address, area, number of beds, etc.
* Venues for each Manhattan neighborhood (that can be clustered)
* Venues for subway metro stations, as needed

### How the data will be used to solve the problem

The data will be used as follows:

* Use Foursquare and geopy data to map the top 10 venues for all Manhattan neighborhoods, clustered in groups
* Use Foursquare and geopy data to map the locations of subway metro stations, both separately and on top of the clustered map above, in order to identify the venues and amenities near each metro station, or to explore each subway location separately
* Use Foursquare and geopy data to map the locations of rental places, linked in some form to the subway locations
* Create a map that depicts, for instance, the average rental price per square foot within a 1.0 mile radius of each subway station, or a similar metric. The popups will then show the relative price for each subway area at a glance.
* Addresses of rental locations will be converted to geodata using geopy (the Nominatim geocoder, with geopy.distance for distance calculations).
* Data will be sought from open data sources where available: real estate sites that permit reading, libraries, or government agencies such as the New York MTA.
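
The price-per-square-foot metric in the list above could be computed along the following lines. This is a sketch under stated assumptions: the dictionary keys (`lat`, `lng`, `rent`, `sqft`) and all coordinates and prices are hypothetical, and a hand-written haversine formula stands in for `geopy.distance` so that the example has no dependencies.

```python
import math
from statistics import mean

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def avg_price_per_sqft(station, rentals, radius_miles=1.0):
    """Mean rent per square foot of rentals within radius_miles of the station.

    Returns None when no rental falls inside the radius.
    """
    prices = [
        r["rent"] / r["sqft"]
        for r in rentals
        if haversine_miles(station["lat"], station["lng"], r["lat"], r["lng"]) <= radius_miles
    ]
    return mean(prices) if prices else None


# Made-up station and rentals, for illustration only
station = {"name": "Station X", "lat": 40.7500, "lng": -73.9900}
rentals = [
    {"lat": 40.7510, "lng": -73.9890, "rent": 6000, "sqft": 1000},  # close to the station
    {"lat": 40.7490, "lng": -73.9910, "rent": 4000, "sqft": 1000},  # close to the station
    {"lat": 40.8500, "lng": -73.9400, "rent": 3000, "sqft": 1000},  # several miles away
]

print(avg_price_per_sqft(station, rentals))
```

The resulting value per station is what a map popup would display.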

### Answer the key questions to make a decision

* What is the cost of rent (per square foot) within a one-mile radius of each subway metro station?
* Which area of Manhattan has the best rental pricing among those that meet the established criteria?
* What is the distance between the workplace (assumed: Park Ave and 53rd St) and the tentative future home?
* What are the venues of the two best places to live? How do the prices compare?
* How are venues distributed among Manhattan neighborhoods and around metro stations?
* Are there tradeoffs between size, price and location?
* Are there any other interesting statistical findings in the real estate data and the overall data?

### Reference venues around the residence in Singapore, for comparison with the place in Manhattan

In [7]:
import numpy as np 
import time
import pandas as pd 
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json
import requests 
from pandas import json_normalize  # moved out of pandas.io.json in pandas 1.0

from geopy.geocoders import Nominatim
from geopy.exc import GeocoderTimedOut
import folium

print('Libraries imported.')

Libraries imported.


In [8]:
address = 'Mccallum Street, Singapore'

# Recent geopy versions require an explicit user_agent; the name used here is arbitrary
geolocator = Nominatim(user_agent="sg_explorer")
location = geolocator.geocode(address, timeout=10)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the place in Singapore are {}, {}.'.format(latitude, longitude))

The geographical coordinates of the place in Singapore are 1.2792423, 103.8481312.
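
The imports above bring in `GeocoderTimedOut`, but the geocoding cell does not handle it. One possible way to retry after a timeout is sketched below. The helper name `geocode_with_retry` and its parameters are hypothetical; it takes the geocoding function and the exception type as arguments, so it does not depend on geopy itself.

```python
import time


def geocode_with_retry(geocode, address, retries=3, delay=1.0, timeout_error=TimeoutError):
    """Call geocode(address), retrying when timeout_error is raised.

    The last failure is re-raised once `retries` attempts have been made.
    """
    for attempt in range(1, retries + 1):
        try:
            return geocode(address)
        except timeout_error:
            if attempt == retries:
                raise
            time.sleep(delay)  # wait before the next attempt
```

With geopy, this could be called as `geocode_with_retry(lambda a: geolocator.geocode(a, timeout=10), address, timeout_error=GeocoderTimedOut)`.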


In [10]:
neighborhood_latitude=1.2792423
neighborhood_longitude=103.8481312

In [15]:
CLIENT_ID="LUETHARMZN0ATS5LKT1YNTB2C5Y2MS42IUKEYIJ5JGN1NNJU"
CLIENT_SECRET="4H4F054UDSZJ2EATK5DEMCMRPR3RNRCTXCZBCCCIELED3EZB"
VERSION = '20180604'
LIMIT = 100

In [16]:
radius = 500 # define radius

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=LUETHARMZN0ATS5LKT1YNTB2C5Y2MS42IUKEYIJ5JGN1NNJU&client_secret=4H4F054UDSZJ2EATK5DEMCMRPR3RNRCTXCZBCCCIELED3EZB&v=20180604&ll=1.2792423,103.8481312&radius=500&limit=100'
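
As an alternative to formatting the URL by hand with the credentials written into the notebook, the query string could be built with `urllib.parse.urlencode` and the credentials read from environment variables. This is only a sketch: the function name `build_explore_url` and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions, not part of the original notebook.

```python
import os
from urllib.parse import urlencode


def build_explore_url(lat, lng, radius=500, limit=100, version="20180604"):
    """Build the venues/explore URL, reading credentials from the environment."""
    params = {
        # Environment variable names are assumptions for this sketch
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", ""),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", ""),
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll parameter unescaped
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params, safe=",")


print(build_explore_url(1.2792423, 103.8481312))
```

This keeps the secret out of the saved notebook and its outputs, as long as the printed URL itself is not stored.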

In [22]:
results = requests.get(url).json()

In [21]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [23]:
venues = results['response']['groups'][0]['items']
    
SGnearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
SGnearby_venues = SGnearby_venues.loc[:, filtered_columns]

# filter the category for each row
SGnearby_venues['venue.categories'] = SGnearby_venues.apply(get_category_type, axis=1)

# clean columns
SGnearby_venues.columns = [col.split(".")[-1] for col in SGnearby_venues.columns]

SGnearby_venues.head(10)

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Napoleon Food & Wine Bar | Wine Bar | 1.279925 | 103.847333 |
| 1 | Pepper Bowl | Asian Restaurant | 1.279371 | 103.84671 |
| 2 | Native | Cocktail Bar | 1.280135 | 103.846844 |
| 3 | Park Bench Deli | Deli / Bodega | 1.279872 | 103.847287 |
| 4 | Freehouse | Beer Garden | 1.281254 | 103.848513 |
| 5 | Sofitel So Singapore | Hotel | 1.280124 | 103.849867 |
| 6 | Coffee Break | Coffee Shop | 1.279529 | 103.846695 |
| 7 | PS.Cafe | Café | 1.280468 | 103.846264 |
| 8 | Dumpling Darlings | Dumpling Restaurant | 1.280483 | 103.846942 |
| 9 | Nouri | Modern European Restaurant | 1.280267 | 103.84675 |
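
To compare this reference profile with Manhattan neighborhoods, the venue categories can be tallied. The sketch below uses the ten categories from the output above and groups them by their last word, which is a crude heuristic chosen only for illustration and not a method used in the notebook. On the DataFrame itself, `SGnearby_venues['categories'].value_counts()` would give the ungrouped counts.

```python
from collections import Counter

# Categories of the ten venues shown in the output above
categories = [
    "Wine Bar", "Asian Restaurant", "Cocktail Bar", "Deli / Bodega", "Beer Garden",
    "Hotel", "Coffee Shop", "Café", "Dumpling Restaurant", "Modern European Restaurant",
]

# Crude grouping (an assumption of this sketch): use the last word of each category
groups = Counter(name.split()[-1] for name in categories)

print(groups.most_common(2))
```

Among these ten venues, restaurants and bars are the most frequent groups, which matches the kind of venues listed as desirable in the scenario.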


### Map of Singapore with venues near the residence, for reference

In [24]:
# create map of the Singapore location using latitude and longitude values
map_sg = folium.Map(location=[latitude, longitude], zoom_start=20)

# add markers to map
for lat, lng, label in zip(SGnearby_venues['lat'], SGnearby_venues['lng'], SGnearby_venues['name']):
    label = folium.Popup(label, parse_html=True)
    folium.RegularPolygonMarker(
        [lat, lng],
        number_of_sides=4,
        radius=10,
        popup=label,
        color='blue',
        fill_color='#0f0f0f',
        fill_opacity=0.7,
    ).add_to(map_sg)  
    
map_sg