# Los Angeles (South Bay) Urgent Care Location Analysis

<img src="https://drive.google.com/uc?export=view&id=1O3SzuGYbGyk_rQaTG3Dp3yR3yx8SNaVH" alt="Alt text" title="Title text" />

## Introduction - Business Problem

<br>
I am planning to open an Urgent Care somewhere in the South Bay of Los Angeles, however I am unsure of which neighborhood to select. I want to make sure I select a neigbhorhood that has an adequate population and is not saturated with other Urgent Care offices. While this analysis is relevant to myself, it could be similarly applied to anyone looking for an ideal location for their business by simply altering the Foursquare Categories they are retrieving.
<br>

## Data Used

The data I will begin with is a CSV table of Los Angeles neighborhoods from the IRS. The table contains the Neighborhood name, Zip Code, Population and District. See below for an example:

<img src="https://drive.google.com/uc?export=view&id=1kJxShMVRjH3IORmAA-j_e7hVbDmGE82D" alt="Alt text" title="Title text" />
<br>

I will also be using a second table of latitude and longitude values by zip code. See an example below:

<img src="https://drive.google.com/uc?export=view&id=15BhY2n_7h_fD4EKlLp4zL2rnq5PS9S4z" alt="Alt text" title="Title text" />

<br>

## Methodology

<br>

I will take the following steps in my analysis:
1. Import CSVs into pandas dataframes
2. Format the dataframs and merge along Zip Code
3. Create and execute a function to send location data to Foursquare API and return only Urgent Care Centers
4. Create a new dataframe that combines Neighborhood, Population, and count of Urgent Care Centers by 10,000 population (divide the count of centers by the neighborhood population and multiply by 10,000)
5. Compare and Analyze
<br>

## Results

<br>

The results of the data analysis are displayed below:

<img src="https://drive.google.com/uc?export=view&id=1xTsAl1wpos4_9pkF774fEIoCK4ryQzVZ" alt="Alt text" title="Title text" />

<br>

## Discussion

<br>

After conducting this analysis it is clear that certain neighbhorhoods have significatnly more Urgent Cares per capita than others. For example Rancho Palos Verdes and Inglewood have roughly .17 Urgent Care locations per 10,000 residents, while Marina Del Rey and Redondo Beach have over 3.5 locations per 10,000 residents. While this information alone should not be used to determine the optimal location for the business, it does shed light on the level of competition that will be present in the area. I believe this analysis is a good first step to identify a potential "short-list" of neighborhoods that should be further analyzed (i.e. accessibility, cost of rent, affluency, etc) prior to finalizing the selection of the neighbhorhood.

Additionally, given that the average number of centers in the whole area is 1.7/10,000 it may be prudent to conduct further analysis to determine why the high and low areas have more and less locations currently.

<br>

## Conclusion

<br>

In conclusion this has been a successful analysis that has delivered actionable results that can be used in my location selection for the Urgent Care center. 

All calculations can be found below.

#### ALL DATA ANALYSIS CALCULATIONS 

In [1]:
!pip install geopy
!pip install folium
import pandas as pd
import numpy as np
import requests
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
from geopy.geocoders import Nominatim
import folium # map rendering library
print('Libraries imported.')

Collecting folium
[?25l  Downloading https://files.pythonhosted.org/packages/72/ff/004bfe344150a064e558cb2aedeaa02ecbf75e60e148a55a9198f0c41765/folium-0.10.0-py2.py3-none-any.whl (91kB)
[K     |████████████████████████████████| 92kB 16.1MB/s eta 0:00:01
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/63/36/1c93318e9653f4e414a2e0c3b98fc898b4970e939afeedeee6075dd3b703/branca-0.3.1-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.3.1 folium-0.10.0
Libraries imported.


## IMPORT DATA CSV

In [2]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,Neighborhood,Zip,Population,District
0,Inglewood,90009,58104,4
1,HERMOSA BEACH,90254,125328,4
2,MANHATTAN BEACH,90266,221231,4
3,PALOS VERDES PENINSULA,90274,287764,4
4,RANCHO PALOS VERDES,90275,142191,4


# IMPORT LAT/LONG CSV FILE

In [3]:


body = client_f32f5c4920084615b15c4d597e57d07c.get_object(Bucket='applieddatasciencecapstone-donotdelete-pr-3zecmyooy6vd6w',Key='latlong-1.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

lat_long = pd.read_csv(body)
lat_long.head()


Unnamed: 0,Zipcode,Lat,Long
0,96162,39.3,-120.31
1,96161,39.33,-120.24
2,96160,39.32,-120.18
3,96158,38.92,-119.96
4,96157,38.94,-119.97


# FORMAT AND MERGE

In [4]:

la_data.rename(columns={'Zip': 'Zipcode'}, inplace=True)
la_data.set_index('Zipcode')
lat_long.set_index('Zipcode')
comboLA = pd.merge(la_data, lat_long, on='Zipcode')

# Create neighboarhood and population datafram for later use
comboLAgrp = comboLA.drop(['Zipcode', 'Lat','Long','District'], axis=1)
comboLAgrp = comboLAgrp.groupby('Neighborhood').sum()

# SET FOURSQUARE PARAMETERS

In [5]:

CLIENT_ID = 'AKCX2GTFV10033CD3WV2BEJMD1OCM2DOP0JWBMPGTNW54G0B' # your Foursquare ID
CLIENT_SECRET = 'UGZADPTBYFFWVG5R31GBEQBFKZTUGC5UMVGSYPXTI2C0GEGK' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

In [6]:
CATID = '56aa371be4b08b9a8d573526'
LIMIT = 50

# DEFINE FUNCTION TO PROCESS GET REQUESTS

In [7]:


def getNearbyVenues(names, latitudes, longitudes, radius=5000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            CATID)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

# LOOP AND RETURN VALUES


In [8]:

la_venues = getNearbyVenues(names=comboLA['Neighborhood'],
                                   latitudes=comboLA['Lat'],
                                   longitudes=comboLA['Long']
                                  )

Inglewood
HERMOSA BEACH
MANHATTAN BEACH
PALOS VERDES PENINSULA
RANCHO PALOS VERDES
REDONDO BEACH
REDONDO BEACH
TORRANCE
TORRANCE
MARINA DEL REY
PLAYA DEL REY
INGLEWOOD
INGLEWOOD
INGLEWOOD
SANTA MONICA
TORRANCE
HARBOR CITY
WILMINGTON
CARSON
CARSON
HAWTHORNE
LAWNDALE


# CONSOLIDATE NUMBER OF LOCATIONS AND NEIGHBORHOOD


In [9]:
consol_venue = la_venues['Neighborhood'].value_counts()

# CREATE NEW DATAFRAME THAT COMBINES Neighborhood, Count, and Population 


In [10]:

LAfinal = pd.DataFrame({'Neighborhood':consol_venue.index, 'Urgent Care Count':consol_venue.values, 'Population':comboLAgrp.Population}).set_index('Neighborhood')

# Add value for urgent cares per 10,000 residents

In [11]:

LAfinal['UC per 10,000'] = LAfinal['Urgent Care Count']/LAfinal['Population']*10000

# sort values so that the neighborhood with the fewest urgent care locations per 10,000 residents are towards the top

In [12]:
LAfinal.sort_values(by=['UC per 10,000'])

Unnamed: 0_level_0,Urgent Care Count,Population,"UC per 10,000"
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
RANCHO PALOS VERDES,3,175631,0.170813
Inglewood,5,287764,0.173753
CARSON,4,172386,0.232037
WILMINGTON,4,142191,0.281312
HAWTHORNE,3,104474,0.287153
LAWNDALE,7,221231,0.316411
PALOS VERDES PENINSULA,4,93632,0.427204
HARBOR CITY,6,109454,0.548175
INGLEWOOD,11,125328,0.877697
MANHATTAN BEACH,3,33307,0.900712
