<h1 align=center><font size = 5>Where to Set Up Shop in Los Angeles?</font></h1>

## Introduction

In this lab, we convert addresses into latitude and longitude values and use the Foursquare API to explore neighborhoods in Los Angeles. We use the API's explore function to find the most common venue categories in each neighborhood, then use those categories as features to group the neighborhoods into clusters with the *k*-means clustering algorithm. We use the Folium library to visualize the neighborhoods in Los Angeles and their emerging clusters. Finally, we decide which neighborhood would be an ideal place to open an Italian restaurant and which would suit a mechanic shop.

Using location and feature data to choose a site helps a shop owner place the shop in the right market. An ill-placed shop can mean an inadequate customer base and weak revenue streams, which may drive the shop out of business.
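The clustering step described in this introduction can be sketched on toy data. The venue rows below are invented for illustration, and the choice of two clusters is an assumption rather than a value taken from this project; only the library calls (`pd.get_dummies`, `groupby`, `KMeans`) are real.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical venue rows: one row per venue, tagged with its neighborhood.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
    'Category': ['Italian Restaurant', 'Italian Restaurant', 'Cafe',
                 'Auto Workshop', 'Auto Workshop', 'Gas Station',
                 'Italian Restaurant', 'Cafe', 'Cafe',
                 'Auto Workshop', 'Gas Station', 'Gas Station'],
})

# One-hot encode the categories, then average per neighborhood so each row
# holds the share of venues that fall into each category.
onehot = pd.get_dummies(venues['Category']).astype(float)
onehot['Neighborhood'] = venues['Neighborhood']
profile = onehot.groupby('Neighborhood').mean()

# Cluster the neighborhoods on their category shares (k=2 is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profile.values)
clusters = pd.Series(kmeans.labels_, index=profile.index, name='Cluster')
print(clusters)
```

In this toy input the food-heavy neighborhoods (A, C) and the car-service neighborhoods (B, D) should land in separate clusters. The cluster numbers themselves are arbitrary labels.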

## Data

The data for this project comes from two sources. The neighborhood coordinates and delineations come from boundaries.latimes.com, which labels each neighborhood and provides its location. The venue and shop data come from Foursquare. With the neighborhood location data we can find the most common venues in each neighborhood and label each neighborhood accordingly.
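As a rough illustration of labeling a neighborhood by its most common venue category, the sketch below counts categories per neighborhood with pandas. The rows are hypothetical placeholders; real rows would be parsed from the Foursquare responses.

```python
import pandas as pd

# Hypothetical venue rows; real rows would come from the Foursquare responses.
venues = pd.DataFrame({
    'Neighborhood': ['Acton', 'Acton', 'Acton', 'Arleta', 'Arleta'],
    'Category': ['Cafe', 'Cafe', 'Park', 'Taco Place', 'Taco Place'],
})

# Count venues per (neighborhood, category) pair.
counts = (venues.groupby(['Neighborhood', 'Category'])
                .size()
                .reset_index(name='Count'))

# Keep the most frequent category in each neighborhood as its label.
top = (counts.sort_values(['Neighborhood', 'Count'], ascending=[True, False])
             .drop_duplicates('Neighborhood')
             .set_index('Neighborhood')['Category'])
print(top)
```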

The neighborhood data is a GeoJSON file that describes each neighborhood as a MultiPolygon. For example:


In [34]:
print(neighborhoods_data[1]['properties']['name'])
print(neighborhoods_data[1]['geometry'])

Adams-Normandie
{'type': 'MultiPolygon', 'coordinates': [[[[-118.3090080000001, 34.03741099912408], [-118.30040800000012, 34.03731199912409], [-118.291508, 34.03681199912407], [-118.2914080000001, 34.025511999124234], [-118.305408, 34.025711999124255], [-118.3090080000001, 34.025611999124216], [-118.3090080000001, 34.03741099912408]]]]}
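
The nesting in that output follows the GeoJSON layout: a MultiPolygon is a list of polygons, each polygon is a list of rings, and each ring is a list of `[longitude, latitude]` pairs. A minimal sketch that walks those levels for the Adams-Normandie geometry printed above (the helper name `iter_vertices` is made up for this example):

```python
# Geometry copied from the Adams-Normandie output above.
geometry = {
    'type': 'MultiPolygon',
    'coordinates': [[[
        [-118.3090080000001, 34.03741099912408],
        [-118.30040800000012, 34.03731199912409],
        [-118.291508, 34.03681199912407],
        [-118.2914080000001, 34.025511999124234],
        [-118.305408, 34.025711999124255],
        [-118.3090080000001, 34.025611999124216],
        [-118.3090080000001, 34.03741099912408],
    ]]],
}

def iter_vertices(geometry):
    """Yield (lon, lat) for every vertex of a GeoJSON MultiPolygon."""
    for polygon in geometry['coordinates']:   # level 1: polygons
        for ring in polygon:                  # level 2: rings of one polygon
            for lon, lat in ring:             # level 3: [lon, lat] pairs
                yield lon, lat

vertices = list(iter_vertices(geometry))
print(len(vertices))                 # number of vertices, closing point included
print(vertices[0] == vertices[-1])   # a closed ring repeats its first vertex
```

Note that longitude comes first in each pair, which is the reverse of the usual "latitude, longitude" order.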


Links for Neighborhood data:

districts: http://s3-us-west-2.amazonaws.com/boundaries.latimes.com/archive/1.0/boundary-set/la-city-council-districts-2012.geojson
    
neighborhoods: http://s3-us-west-2.amazonaws.com/boundaries.latimes.com/archive/1.0/boundary-set/la-county-neighborhoods-v6.geojson

more neighborhoods: http://s3-us-west-2.amazonaws.com/boundaries.latimes.com/archive/1.0/boundary-set/la-county-neighborhoods-current.geojson

boroughs: http://s3-us-west-2.amazonaws.com/boundaries.latimes.com/archive/1.0/boundary-set/la-county-regions-current.geojson

list of neighborhoods: https://en.wikipedia.org/wiki/List_of_districts_and_neighborhoods_of_Los_Angeles

In [35]:
############### Code for project begins below

In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON into a pandas dataframe (available as pd.json_normalize in pandas 1.0+)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')


In [3]:
!wget -q -O 'LA_data.json' http://s3-us-west-2.amazonaws.com/boundaries.latimes.com/archive/1.0/boundary-set/la-county-neighborhoods-current.geojson
print('Data downloaded!')

Data downloaded!


In [4]:
with open('LA_data.json') as json_data:
    LA_data = json.load(json_data)

In [11]:
neighborhoods_data = LA_data['features']
neighborhoods_data[1]

{'type': 'Feature',
 'properties': {'kind': 'L.A. County Neighborhood (Current)',
  'external_id': 'adams-normandie',
  'name': 'Adams-Normandie',
  'slug': 'adams-normandie-la-county-neighborhood-current',
  'set': '/1.0/boundary-set/la-county-neighborhoods-current/',
  'metadata': {'sqmi': 0.805350187789,
   'type': 'segment-of-a-city',
   'name': 'Adams-Normandie',
   'slug': 'adams-normandie'},
  'resource_uri': '/1.0/boundary/adams-normandie-la-county-neighborhood-current/'},
 'geometry': {'type': 'MultiPolygon',
  'coordinates': [[[[-118.3090080000001, 34.03741099912408],
     [-118.30040800000012, 34.03731199912409],
     [-118.291508, 34.03681199912407],
     [-118.2914080000001, 34.025511999124234],
     [-118.305408, 34.025711999124255],
     [-118.3090080000001, 34.025611999124216],
     [-118.3090080000001, 34.03741099912408]]]]}}

In [6]:
# define the dataframe columns
column_names = ['Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [47]:
neighborhoods_data[1]['geometry']['coordinates'][0][0][0]    

[-118.3090080000001, 34.03741099912408]

In [48]:
rows = []
for data in neighborhoods_data:
    neighborhood_name = data['properties']['name']

    # GeoJSON stores each vertex as [longitude, latitude]; use the first vertex
    # of the first polygon as a rough location for the neighborhood
    first_vertex = data['geometry']['coordinates'][0][0][0]
    rows.append({'Neighborhood': neighborhood_name,
                 'Latitude': first_vertex[1],
                 'Longitude': first_vertex[0]})

# DataFrame.append was removed in pandas 2.0, so build the dataframe in one step
neighborhoods = pd.DataFrame(rows, columns=column_names)
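
The loop above takes the first boundary vertex as the neighborhood's location, so the point sits on the edge of the neighborhood rather than inside it. One possible alternative, offered as a sketch and not as the method this notebook uses, is to average the vertices of the outer ring. This is a plain vertex mean, not an area-weighted centroid, and the helper name `vertex_mean` is hypothetical.

```python
# Outer ring of the Adams-Normandie boundary, copied from the GeoJSON above.
ring = [
    [-118.3090080000001, 34.03741099912408],
    [-118.30040800000012, 34.03731199912409],
    [-118.291508, 34.03681199912407],
    [-118.2914080000001, 34.025511999124234],
    [-118.305408, 34.025711999124255],
    [-118.3090080000001, 34.025611999124216],
    [-118.3090080000001, 34.03741099912408],
]

def vertex_mean(ring):
    """Average the vertices of a GeoJSON ring, returning (lat, lon)."""
    # A closed ring repeats its first vertex at the end; drop the repeat so
    # that corner is not counted twice.
    points = ring[:-1] if ring[0] == ring[-1] else ring
    lon = sum(p[0] for p in points) / len(points)
    lat = sum(p[1] for p in points) / len(points)
    return lat, lon

lat, lon = vertex_mean(ring)
print(f"{lat:.4f}, {lon:.4f}")
```

For irregular or multi-part boundaries a vertex mean can still fall outside the shape, so it is only a rough stand-in for a proper centroid.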

In [49]:
neighborhoods

,Neighborhood,Latitude,Longitude
0,Acton,34.53898972076929,-118.2026174792054
1,Adams-Normandie,34.03741099912408,-118.3090080000001
2,Agoura Hills,34.16820299912263,-118.7619250000001
3,Agua Dulce,34.55830403375057,-118.25467739592212
4,Alhambra,34.10503999912332,-118.12174700000016
5,Alondra Park,33.897572005620816,-118.32651297216451
6,Altadena,34.21550800599232,-118.15135397181479
7,Angeles Crest,34.48076701266776,-118.09679656118362
8,Arcadia,34.177181999122524,-118.017052
9,Arleta,34.22410299912182,-118.4220150000001
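
A quick range check can catch swapped or malformed coordinates in a dataframe like this one. The sketch below uses a rough bounding box for Los Angeles County; the bounds are an assumption chosen for illustration, not values from the source data, and the second sample row is deliberately wrong.

```python
import pandas as pd

# Two sample rows; the second has latitude and longitude swapped on purpose.
sample = pd.DataFrame({
    'Neighborhood': ['Adams-Normandie', 'Swapped example'],
    'Latitude': [34.03741099912408, -118.3090080000001],
    'Longitude': [-118.3090080000001, 34.03741099912408],
})

# Rough bounding box for Los Angeles County (assumed, not from the source data).
LAT_RANGE = (32.5, 35.0)
LON_RANGE = (-119.5, -117.5)

# A row passes only if both coordinates fall inside the box.
ok = (sample['Latitude'].between(*LAT_RANGE)
      & sample['Longitude'].between(*LON_RANGE))
bad_rows = sample.loc[~ok, 'Neighborhood'].tolist()
print(bad_rows)
```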
