# Week 5 Battle of the Neighborhoods - Long Beach CA Edition

## Introduction - Business Problem 

Long Beach, CA is currently seeing a lot of investment, in the form of new high-rise buildings and newly opened commercial space. For an investor, this makes it a good time to open a gym / dance studio that serves the community as a place for people to gather and build camaraderie with their neighbors.

As Long Beach is a fairly large city, the goal of this project is to find a location within the city for the business. Ideally, we want a neighborhood that is currently underserved by these types of venues, so that people living nearby can come to our business instead of driving farther away. A location that is far from venues of the same type should also make it easier to compete in the market than one where two similar businesses sit in close proximity to one another.

## Data

We will pull a list of all the neighborhoods (including population) in Long Beach from the Long Beach city website. This will be our starting population. We will then merge the list of neighborhoods with location data from Foursquare in order to find the latitude and longitude of each neighborhood, as well as all the venues around each neighborhood. We want to find not only the gym / dance studio venues within each neighborhood, but also the other venue categories there. This will help us decide in which neighborhood to open the new business. Our working assumption is that a neighborhood with more venues has higher foot traffic, and therefore more potential customers, than one with fewer venues.

## Methodology - Analysis

In [2]:
!pip install bs4
!pip install -U scikit-learn
print('Base install complete')

  from cryptography.utils import int_from_bytes
Requirement already up-to-date: scikit-learn in /opt/conda/envs/Python-3.7-OpenCE/lib/python3.7/site-packages (0.24.2)
Base install complete


In [3]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python-3.7-OpenCE

  added / updated specs:
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2021.5.30          |   py37h89c1867_0         141 KB  conda-forge
    geographiclib-1.52         |     pyhd8ed1ab_0          35 KB  conda-forge
    geopy-2.2.0                |     pyhd8ed1ab_0          67 KB  conda-forge
    python_abi-3.7             |          2_cp37m           4 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         247 KB

The following NEW packages will be INSTALLED:

  geographiclib      conda-forge/noarch::geographiclib-1.52-pyhd8ed1ab_0
  geopy              conda-forge/noarch::geopy-2.2.0-pyhd8ed1ab_0
  python_abi         con

First, let's find the latitude and longitude of Long Beach, CA, and create a map of the city.

In [4]:
#Finding the latitude and longitude of Long Beach CA
address = 'Long Beach, CA'

geolocator = Nominatim(user_agent="lb_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Long Beach CA are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Long Beach CA are 33.7690164, -118.191604.


In [13]:
# create map of Long Beach using latitude and longitude values
map_lb = folium.Map(location=[latitude, longitude], zoom_start=12)
folium.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup=address,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_lb) 
map_lb

As we can see, the marker for Long Beach sits at the very bottom of the map, but the city actually extends well to the north and east of that point. Let's create a list of the neighborhoods within Long Beach, with coordinates taken from Google Maps, and plot those on the map.

In [7]:
#Creating a list of all neighborhoods located in long beach
list = [['College Square', 33.8775995, -118.2079018],
        ['Freeway Circle', 33.8768558, -118.2027089],
        ['Hamilton', 33.8811402, -118.1877853],
        ['Longwood', 33.8671904, -118.2073224],
        ['Coolidge Triangle', 33.8685174, -118.2050372],
        ['Jordan', 33.8684773, -118.1936082],
        ['DeForest', 33.8634772, -118.1961],
        ['Grant', 33.868504, -118.1817342],
        ['Ramona Park', 33.8695198, -118.1579376],
        ['Cherry Manor', 33.8712171, -118.1708872],
        ['Davenport Park', 33.8573062, -118.1621111],
        ['Harte', 33.8593329, -118.1769383],
        ['Lindbergh', 33.8555106, -118.1854892],
        ['Addams', 33.8491399, -118.1951451],
        ['Sutter', 33.849968, -118.2069147],
        ['Carmelitos', 33.8496079, -118.1831717],
        ['Jackson', 33.8518089, -118.1746424],
        ['Bixby Knolls', 33.8395653, -118.1888944],
        ['Los Cerritos', 33.8313076, -118.2059385],
        ['California Heights', 33.8228504, -118.1828433],
        ['Lakewood Village', 33.8397248, -118.1382463],
        ['Old Lakewood City', 33.8287285, -118.1339371],
        ['Carson Park', 33.8246544, -118.1083898],
        ['South of Conant', 33.817825, -118.1286768],
        ['Rancho Estates', 33.8146588, -118.0994011],
        ['El Dorado Park', 33.8159123, -118.0855862],
        ['Plaza', 33.8102836, -118.1165647],
        ['El Dorado South', 33.7915596, -118.1010587],
        ['Stratford Square', 33.8034749, -118.1338763],
        ['Los Altos', 33.7960652, -118.1227255],
        ['Artcraft Manor', 33.7997134, -118.1435893],
        ['Aubry at Alamitos Bridge', 33.7945535, -118.1541153],
        ['Traffic Circle Area', 33.7894794, -118.1470662],
        ['Park Estates', 33.7825936, -118.1321293],
        ['College Estates', 33.7780417, -118.1022978],
        ['Bixby Hill', 33.7779121, -118.1097061],
        ['Arlington', 33.8199268, -118.2220424],
        ['Upper Westide', 33.8150948, -118.2231474],
        ['Lower Westside', 33.79088, -118.22624],
        ['Bixby Village', 33.7716293, -118.1180961],
        ['University Park Estates', 33.7711299, -118.1100172],
        ['Marina', 33.7560727, -118.1211129],
        ['Naples', 33.7551016, -118.1258904],
        ['Peninsula', 33.7484023, -118.1284118],
        ['Alamitos Heights', 33.7747105, -118.1308043],
        ['Belmont Shore', 33.7573656, -118.1447544],
        ['Belmont Park', 33.7637761, -118.1327623],
        ['Belmont Heights', 33.767901, -118.1469902],
        ['Bluff Park', 33.7627415, -118.1630444],
        ['Bluff Heights', 33.7681131, -118.1612689],
        ['Zaferia', 33.784444, -118.1611562],
        ['Rose Park', 33.7771898, -118.162229],
        ['Carroll Park', 33.7699014, -118.1648736],
        ['Rose Park South', 33.7735422, -118.1644179],
        ['Alamitos Beach', 33.7667061, -118.1784886],
        ['North Alamitos Beach', 33.7735511, -118.1766807],
        ['Hellman Street', 33.7771664, -118.1761069],
        ['Lincoln', 33.7808238, -118.1764126],
        ['MacArthur Park', 33.7844399, -118.1728506],
        ['Whittier', 33.7880824, -118.1764126],
        ['East Village', 33.7708177, -118.1891799],
        ['Waterfront', 33.7634818, -118.198235],
        ['West Gateway', 33.7694176, -118.2013143],
        ['Willmore', 33.7770261, -118.2034927],
        ['North Pine', 33.7753973, -118.1937503],
        ['Washington', 33.7862533, -118.1992436],
        ['Poly High', 33.7926069, -118.1874052],
        ['South Wrigley', 33.7971499, -118.2037925],
        ['Sunrise', 33.8024661, -118.184486],
        ['Wrigley Heights', 33.8222043, -118.2029879],
        ['Memorial Heights', 33.8173239, -118.1961108]]

#creating a dataframe out of the list of neighborhoods in Long Beach
df_lb = pd.DataFrame(list, columns = ['Neighborhood', 'Latitude', 'Longitude'])
print('Shape of neighborhood dataframe is: ',df_lb.shape)
df_lb.head()

Shape of neighborhood dataframe is:  (71, 3)


|   | Neighborhood | Latitude | Longitude |
|---|---|---|---|
| 0 | College Square | 33.8776 | -118.207902 |
| 1 | Freeway Circle | 33.876856 | -118.202709 |
| 2 | Hamilton | 33.88114 | -118.187785 |
| 3 | Longwood | 33.86719 | -118.207322 |
| 4 | Coolidge Triangle | 33.868517 | -118.205037 |


In [9]:
# create map of Long Beach, with all neighborhoods using latitude and longitude values
map_lb = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, neighborhood in zip(df_lb['Latitude'], df_lb['Longitude'], df_lb['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='purple',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_lb) 

map_lb

As the plot of all the Long Beach neighborhoods shows, we have a large city to explore. Next, we will locate venues around each neighborhood and look for an area where we can open a dance studio / gym.

In [15]:
# The code was removed by Watson Studio for sharing.

Foursquare credentials have been saved


In [42]:
# Function to pull all venues near each neighborhood.
# CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT are not defined in the visible cells;
# they are assumed to come from the hidden credentials cell above.
def getNearbyVenues(names, latitudes, longitudes, radius=800):
    """Return a dataframe of Foursquare venues within `radius` meters of each neighborhood."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request and keep the list of recommended items
        results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=['Neighborhood',
                 'Neighborhood Latitude',
                 'Neighborhood Longitude',
                 'Venue',
                 'Venue Latitude',
                 'Venue Longitude',
                 'Venue Category'])

    return nearby_venues

Now that we have created our function to search for venues around each neighborhood, we will feed it all the neighborhoods in our dataframe and locate all venues within roughly a half-mile radius (the default of 800 meters) of each neighborhood's coordinates.

In [43]:
# passing each distinct neighborhood to function so we can pull all venues close to each neighborhood
lb_venues = getNearbyVenues(names=df_lb['Neighborhood'],
                                latitudes=df_lb['Latitude'],
                                longitudes=df_lb['Longitude']
                           )
print('Neighborhood search is complete!')

College Square
Freeway Circle
Hamilton
Longwood
Coolidge Triangle
Jordan
DeForest
Grant
Ramona Park
Cherry Manor
Davenport Park
Harte
Lindbergh
Addams
Sutter
Carmelitos
Jackson
Bixby Knolls
Los Cerritos
California Heights
Lakewood Village
Old Lakewood City
Carson Park
South of Conant
Rancho Estates
El Dorado Park
Plaza
El Dorado South
Stratford Square
Los Altos
Artcraft Manor
Aubry at Alamitos Bridge
Traffic Circle Area
Park Estates
College Estates
Bixby Hill
Arlington
Upper Westide
Lower Westside
Bixby Village
University Park Estates
Marina
Naples
Peninsula
Alamitos Heights
Belmont Shore
Belmont Park
Belmont Heights
Bluff Park
Bluff Heights
Zaferia
Rose Park
Carroll Park
Rose Park South
Alamitos Beach
North Alamitos Beach
Hellman Street
Lincoln
MacArthur Park
Whittier
East Village
Waterfront
West Gateway
Willmore
North Pine
Washington
Poly High
South Wrigley
Sunrise
Wrigley Heights
Memorial Heights
Neighborhood search is complete!


Let's see how many total venues we pulled, and place them on a map.

In [44]:
print('Shape of the Long Beach venues dataframe is: ',lb_venues.shape)
lb_venues.head()

Shape of the Long Beach venues dataframe is:  (2066, 7)


|   | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | College Square | 33.8776 | -118.207902 | 7-Eleven | 33.881571 | -118.204423 | Convenience Store |
| 1 | College Square | 33.8776 | -118.207902 | MLB Urban Youth Academy | 33.875924 | -118.213345 | Baseball Stadium |
| 2 | College Square | 33.8776 | -118.207902 | Starbucks | 33.879499 | -118.214887 | Coffee Shop |
| 3 | College Square | 33.8776 | -118.207902 | ampm | 33.875575 | -118.215197 | Convenience Store |
| 4 | College Square | 33.8776 | -118.207902 | Love Laundry Long Beach | 33.875168 | -118.203001 | Laundromat |
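As a sanity check on the search radius, the sketch below converts half a mile to meters and computes the great-circle distance between the College Square coordinates and the first venue row shown above. It is a minimal illustration, not part of the original notebook: the helper name `haversine_m` and the spherical-Earth radius constant are assumptions introduced here.

```python
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius in meters (spherical approximation, an assumption)
METERS_PER_MILE = 1609.344


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Half a mile in meters; the notebook's radius=800 is a close approximation.
half_mile_m = 0.5 * METERS_PER_MILE

# College Square coordinates (from the neighborhood list) and the 7-Eleven row above.
distance = haversine_m(33.8775995, -118.2079018, 33.881571, -118.204423)

print('Half a mile is {:.3f} m'.format(half_mile_m))
print('Venue is {:.0f} m away, inside the 800 m radius: {}'.format(distance, distance < 800))
```

The same helper could be applied to any venue row to confirm how far it lies from the neighborhood point it was returned for.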


In [45]:
# Checking counts per neighborhood
lb_venues.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Addams | 18 | 18 | 18 | 18 | 18 | 18 |
| Alamitos Beach | 71 | 71 | 71 | 71 | 71 | 71 |
| Alamitos Heights | 30 | 30 | 30 | 30 | 30 | 30 |
| Arlington | 5 | 5 | 5 | 5 | 5 | 5 |
| Artcraft Manor | 28 | 28 | 28 | 28 | 28 | 28 |
| Aubry at Alamitos Bridge | 23 | 23 | 23 | 23 | 23 | 23 |
| Belmont Heights | 40 | 40 | 40 | 40 | 40 | 40 |
| Belmont Park | 92 | 92 | 92 | 92 | 92 | 92 |
| Belmont Shore | 37 | 37 | 37 | 37 | 37 | 37 |
| Bixby Hill | 28 | 28 | 28 | 28 | 28 | 28 |


In [53]:
# create map of Long Beach, with all venues pulled using latitude and longitude values
map_lb = folium.Map(location=[latitude, longitude], zoom_start=13)

# add markers to map
for lat, lng, neighborhood, venuecat in zip(lb_venues['Venue Latitude'], lb_venues['Venue Longitude'], lb_venues['Neighborhood'], lb_venues['Venue Category']):
    label = '{}, {}'.format(venuecat, neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        popup=label,
        color='purple',
        fill=True,
        fill_color='purple',
        fill_opacity=0.7,
        parse_html=False).add_to(map_lb) 

map_lb

Now that we have all the venues, let's check the types of venues we have.

In [47]:
print('There are {} unique categories.'.format(len(lb_venues['Venue Category'].unique())))
lbv = lb_venues['Venue Category'].unique()
print('They are: ',sorted(lbv))

There are 251 unique categories.
They are:  ['ATM', 'Accessories Store', 'Airport', 'Airport Service', 'American Restaurant', 'Antique Shop', 'Aquarium', 'Arcade', 'Argentinian Restaurant', 'Art Gallery', 'Art Museum', 'Arts & Crafts Store', 'Arts & Entertainment', 'Asian Restaurant', 'Athletics & Sports', 'Auto Dealership', 'Automotive Shop', 'BBQ Joint', 'Bagel Shop', 'Bakery', 'Bank', 'Bar', 'Baseball Field', 'Baseball Stadium', 'Basketball Court', 'Beach', 'Bed & Breakfast', 'Beer Garden', 'Beer Store', 'Big Box Store', 'Bike Shop', 'Bike Trail', 'Bistro', 'Board Shop', 'Boat Launch', 'Boat or Ferry', 'Bookstore', 'Boutique', 'Bowling Alley', 'Breakfast Spot', 'Brewery', 'Bubble Tea Shop', 'Buffet', 'Building', 'Burger Joint', 'Burrito Place', 'Bus Station', 'Business Service', 'Cafeteria', 'Café', 'Cajun / Creole Restaurant', 'Cambodian Restaurant', 'Candy Store', 'Cheese Shop', 'Chinese Restaurant', 'Churrascaria', 'Clothing Store', 'Cocktail Bar', 'Coffee Shop', 'College Classr

Our competitors would be other gyms, fitness centers, and dance studios. Let's create a new dataframe consisting only of these categories.

In [51]:
#limiting long beach venues to only dance studios and gyms
cat = ['Dance Studio','Gym','Gym / Fitness Center','Gym Pool','Yoga Studio']
lb_dg = lb_venues[lb_venues['Venue Category'].isin(cat)]
print(lb_dg.shape)
lb_dg.head()

(46, 7)


|   | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 78 | Ramona Park | 33.86952 | -118.157938 | Metroflex | 33.862731 | -118.155221 | Gym |
| 187 | Carmelitos | 33.849608 | -118.183172 | Fairfield Family YMCA | 33.845578 | -118.185733 | Gym |
| 212 | Bixby Knolls | 33.839565 | -118.188894 | Crunch - Long Beach | 33.835824 | -118.188637 | Gym / Fitness Center |
| 248 | Bixby Knolls | 33.839565 | -118.188894 | Fairfield Family YMCA | 33.845578 | -118.185733 | Gym |
| 298 | California Heights | 33.82285 | -118.182843 | Long Beach Ballet | 33.818823 | -118.177886 | Dance Studio |
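Note that Fairfield Family YMCA appears twice above, once under Carmelitos and once under Bixby Knolls, because the 800 m search circles of neighboring points overlap. The 46 rows therefore count venue-neighborhood pairs rather than distinct competitors. The sketch below shows one way to collapse such rows to distinct venues, using only the five rows displayed above; the helper name `distinct_venues` is hypothetical and not part of the original notebook.

```python
def distinct_venues(rows):
    """Return one record per physical venue, keyed by venue name and coordinates."""
    seen = {}
    for row in rows:
        key = (row['Venue'], row['Venue Latitude'], row['Venue Longitude'])
        seen.setdefault(key, row)  # keep the first row seen for each venue
    return list(seen.values())


# The five competitor rows displayed above (a subset of the 46 rows, for illustration only).
rows = [
    {'Neighborhood': 'Ramona Park', 'Venue': 'Metroflex',
     'Venue Latitude': 33.862731, 'Venue Longitude': -118.155221},
    {'Neighborhood': 'Carmelitos', 'Venue': 'Fairfield Family YMCA',
     'Venue Latitude': 33.845578, 'Venue Longitude': -118.185733},
    {'Neighborhood': 'Bixby Knolls', 'Venue': 'Crunch - Long Beach',
     'Venue Latitude': 33.835824, 'Venue Longitude': -118.188637},
    {'Neighborhood': 'Bixby Knolls', 'Venue': 'Fairfield Family YMCA',
     'Venue Latitude': 33.845578, 'Venue Longitude': -118.185733},
    {'Neighborhood': 'California Heights', 'Venue': 'Long Beach Ballet',
     'Venue Latitude': 33.818823, 'Venue Longitude': -118.177886},
]

unique_rows = distinct_venues(rows)
print(len(rows), 'rows ->', len(unique_rows), 'distinct venues')
```

On the full dataframe, the pandas equivalent would be `lb_dg.drop_duplicates(subset=['Venue', 'Venue Latitude', 'Venue Longitude'])`.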


In [54]:
# create map of Long Beach with three layers: all venues (red), gyms / dance studios (purple), neighborhoods (green)
map_lb = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, venue, venuecat in zip(lb_venues['Venue Latitude'], lb_venues['Venue Longitude'], lb_venues['Venue'], lb_venues['Venue Category']):
    label = '{}, {}'.format(venue, venuecat)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        popup=label,
        color='red',
        fill=True,
        fill_color='red',
        fill_opacity=0.7,
        parse_html=False).add_to(map_lb) 


for lat, lng, venue, venuecat in zip(lb_dg['Venue Latitude'], lb_dg['Venue Longitude'], lb_dg['Venue'], lb_dg['Venue Category']):
    label = '{}, {}'.format(venue, venuecat)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='purple',
        fill=True,
        fill_color='purple',
        fill_opacity=0.9,
        parse_html=False).add_to(map_lb) 

for lat, lng, neighborhood in zip(df_lb['Latitude'], df_lb['Longitude'], df_lb['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=7,
        popup=label,
        color='green',
        fill=True,
        fill_color='green',
        fill_opacity=1.0,
        parse_html=False).add_to(map_lb)     
    
    
map_lb

The map above shows that we do have gyms / fitness centers in Long Beach, but most of them are located in the southern and eastern parts of the city. We have no such venues on the west side of Long Beach, and only one dance studio in the north. Zooming in, we can also carve out another area with no gyms in the Poly High / Washington / Whittier area.
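The gaps identified visually on the map can also be expressed as a count of competitor rows per neighborhood, where a zero marks a neighborhood whose search returned no gym or dance studio. The sketch below is illustrative only: it uses the five competitor rows displayed by `lb_dg.head()` and a handful of neighborhood names rather than the full data, and the helper name `competitor_counts` is hypothetical.

```python
from collections import Counter


def competitor_counts(neighborhoods, competitor_rows):
    """Count competitor rows per neighborhood, keeping neighborhoods with zero hits."""
    counts = Counter(row['Neighborhood'] for row in competitor_rows)
    return {name: counts.get(name, 0) for name in neighborhoods}


# Illustrative subset: only the five competitor rows shown by lb_dg.head().
sample_competitors = [
    {'Neighborhood': 'Ramona Park', 'Venue': 'Metroflex'},
    {'Neighborhood': 'Carmelitos', 'Venue': 'Fairfield Family YMCA'},
    {'Neighborhood': 'Bixby Knolls', 'Venue': 'Crunch - Long Beach'},
    {'Neighborhood': 'Bixby Knolls', 'Venue': 'Fairfield Family YMCA'},
    {'Neighborhood': 'California Heights', 'Venue': 'Long Beach Ballet'},
]

# A few neighborhood names from df_lb, including the gap area named in the text.
sample_neighborhoods = ['Ramona Park', 'Carmelitos', 'Bixby Knolls', 'California Heights',
                        'Poly High', 'Washington', 'Whittier']

counts = competitor_counts(sample_neighborhoods, sample_competitors)
underserved = [name for name, n in counts.items() if n == 0]

print(counts)
print('No competitor rows in this sample:', underserved)
```

With the full data, passing `df_lb['Neighborhood']` and `lb_dg.to_dict('records')` would give the complete picture; the zero-count list here reflects only the sample rows.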

## Results and Discussion

Looking at the final map, we see multiple gyms spread throughout the city, along with a couple of dedicated dance studios, which will be our main competition. Most of the gyms / dance studios are located in the southern and eastern parts of the city. That leaves two large gaps: one on the west side (Lower Westside to Poly High) and one on the north side (north of Lindbergh). Aside from avoiding multiple competitors within a short distance, we also want a neighborhood with many other venues, since those venues can bring in new customers through walk-by traffic and word of mouth. Comparing the two zones, we notice that there are more venues on the north side than on the west side.
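The zones above were judged by eye on the map. One way to make "north of Lindbergh" explicit is to select the neighborhoods whose latitude is greater than Lindbergh's, as in the sketch below. It uses a few rows copied from the neighborhood list in this notebook; the helper name `north_of` is hypothetical, and defining the zone purely by latitude is an assumption made here for illustration.

```python
# (name, latitude, longitude) rows copied from the neighborhood list in this notebook.
neighborhoods = [
    ('College Square', 33.8775995, -118.2079018),
    ('Hamilton', 33.8811402, -118.1877853),
    ('Harte', 33.8593329, -118.1769383),
    ('Lindbergh', 33.8555106, -118.1854892),
    ('Addams', 33.8491399, -118.1951451),
    ('Lower Westside', 33.79088, -118.22624),
    ('Poly High', 33.7926069, -118.1874052),
]


def north_of(rows, reference_name):
    """Names of neighborhoods whose latitude is strictly greater than the reference's."""
    ref_lat = next(lat for name, lat, _ in rows if name == reference_name)
    return [name for name, lat, _ in rows if lat > ref_lat]


north_zone = north_of(neighborhoods, 'Lindbergh')
print(north_zone)
```

With the full dataframes, `lb_venues[lb_venues['Neighborhood'].isin(north_zone)].shape[0]` would give the number of venue rows in the zone, which could then be compared against the same count for the west side.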

## Conclusion

The objective of this project was to determine a possible location within Long Beach, CA for a new dance studio / gym. We pulled the neighborhoods within Long Beach from Google Maps, then used the Foursquare API to locate all the venues within a half-mile radius of each neighborhood. After pulling all the venues, we filtered the results to show only our competitors. We then plotted the neighborhoods, venues, and competitors on a map to visualize the results. The map shows multiple gym / dance studio locations within the city. Taking a closer look, we found two pockets that seem to be underserved in these venue categories: the west side and the north side of the city. Both look promising, as they currently have no gym or dance studio. Looking deeper, we notice that the north side has more venues than the west side. Having more venues within a section of the city could mean more foot traffic, which in turn could mean more customers passing by our business.

With this in mind, the north side of the city appears to be the stronger choice for a new dance studio / gym. The next steps would be to search that section of the city for a location that is available for lease, and to do some additional research into the number of people in the community who would be interested in a new dance studio. There are a couple of schools in the area, and a site near them could be a good location for the business, as most dance studio customers are in the teen / early 20s age group.