# Battle of the Neighborhoods 

## Table of Contents
* [Introduction](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Results](#results)
* [Discussion](#discussion)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a id="introduction"></a>

A client wants to open a comic book store but is torn between two cities: El Segundo, California and Hermosa Beach, California. Because the business would be a specialty store, it must also sit some distance from its competition. The question is: "Which city would be the better location for the client's store, given where its competitors are?"

## Data <a id='data'></a>

This notebook draws on the following:
  * the libraries needed to load and manipulate the data
  * folium maps of each city, built from its location data
  * Foursquare venue information, which will be superimposed on the maps
  * general information about each city from Wikipedia
    * "https://en.wikipedia.org/wiki/List_of_municipalities_in_California"

### Libraries

In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas 1.0+)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

from geopy.geocoders import Nominatim #getting the location coordinates

from bs4 import BeautifulSoup # loading the tables from wikipedia

print('Libraries imported.')

Libraries imported.


In [3]:
# geocoding El Segundo to get its latitude and longitude
address = 'El Segundo, California'

geolocator = Nominatim(user_agent="CA_explorer")
location = geolocator.geocode(address)
ES_lat = location.latitude
ES_lng = location.longitude
print('The geographical coordinates of El Segundo are {}, {}.'.format(ES_lat, ES_lng))

The geographical coordinates of El Segundo are 33.917028, -118.4156337.


In [4]:
# geocoding Hermosa Beach to get its latitude and longitude
address = 'Hermosa Beach, California'

geolocator = Nominatim(user_agent="CA_explorer")
location = geolocator.geocode(address)
HB_lat = location.latitude
HB_lng = location.longitude
print('The geographical coordinates of Hermosa Beach are {}, {}.'.format(HB_lat, HB_lng))

The geographical coordinates of Hermosa Beach are 33.86428, -118.39591.


### Using the Foursquare credentials to collect the venue information

In [5]:
CLIENT_ID = 'TAIL0VYFMOI2HD41UOZXJRO05VTNQZCPNAIKPL4EQGZ4DRI2' # your Foursquare ID
CLIENT_SECRET = 'DK0RICVRLNEYYVMQXKEGQGDAESI1ZD5KUFSMSZZEFJFVV3YY' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: TAIL0VYFMOI2HD41UOZXJRO05VTNQZCPNAIKPL4EQGZ4DRI2
CLIENT_SECRET: DK0RICVRLNEYYVMQXKEGQGDAESI1ZD5KUFSMSZZEFJFVV3YY
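
Hard-coding and printing API keys makes them easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the two values from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_foursquare_credentials` and `mask` are hypothetical names chosen for illustration; they are not part of the Foursquare API.

```python
import os

def load_foursquare_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names used here are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(
            "Missing environment variable: {}".format(missing.args[0])
        ) from None

def mask(secret, visible=4):
    """Show only the first few characters so a printed key is not fully exposed."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

With this in place, a cell could print `mask(CLIENT_ID)` to confirm the key was loaded without revealing it in the saved output.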


In [6]:
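# downloading the Wikipedia list of California municipalities
req = requests.get("https://en.wikipedia.org/wiki/List_of_municipalities_in_California")
soup = BeautifulSoup(req.content, 'lxml')
table = soup.find_all('table')[1]  # the second table on the page is the list of cities
df_data = pd.read_html(str(table))  # read_html returns a list of dataframes
df = pd.DataFrame(df_data[0])
df.head()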

| | Name | Type | County | Population (2010)[1][8][9] | Land area[1] (sq mi) | Land area[1] (km2) | Incorporated[7] |
|---|---|---|---|---|---|---|---|
| 0 | Adelanto | City | San Bernardino | 31765 | 56.01 | 145.1 | December 22, 1970 |
| 1 | Agoura Hills | City | Los Angeles | 20330 | 7.79 | 20.2 | December 8, 1982 |
| 2 | Alameda | City | Alameda | 73812 | 10.61 | 27.5 | April 19, 1854 |
| 3 | Albany | City | Alameda | 18539 | 1.79 | 4.6 | September 22, 1908 |
| 4 | Alhambra | City | Los Angeles | 83089 | 7.63 | 19.8 | July 11, 1903 |


## Methodology <a id='methodology'></a>

To address the client's problem, I will first create a map of each city using the folium library. Once the maps are complete, I will use Foursquare location data to find the venues inside each city and compile that information into two pandas dataframes, one per city. The dataframes let me filter out the venue types that are not relevant to the question, so that I can see whether another comic book store already exists in either area. Once I have sifted through the venues and their categories, I can make a recommendation that answers the client's question.

## Maps

I will be creating the maps for the target cities to get a better sense of their location and layout.

In [7]:
#Creating a map of El Segundo from the latitude and longitude 

ES_map= folium.Map(location=[33.917028, -118.4156337], zoom_start=13)
ES_map

In [8]:
# creating a folium map of Hermosa Beach based on latitude and longitude data

HB_map=folium.Map(location=[33.86428, -118.39591], zoom_start=13)
HB_map

Now that the target cities have been mapped, it will be easier to see and understand what types of venues each city has and where they are located, as well as what surrounds each city.

In [9]:
# building the Foursquare request for venues within 500 meters of the El Segundo center point
LIMIT=100
radius=500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    ES_lat, 
    ES_lng, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=TAIL0VYFMOI2HD41UOZXJRO05VTNQZCPNAIKPL4EQGZ4DRI2&client_secret=DK0RICVRLNEYYVMQXKEGQGDAESI1ZD5KUFSMSZZEFJFVV3YY&v=20180605&ll=33.917028,-118.4156337&radius=500&limit=100'
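
The same request URL is assembled by hand again for Hermosa Beach later on. A minimal sketch of a reusable builder follows, using the standard library's `urllib.parse.urlencode`. The function name `build_explore_url` and the placeholder credentials are hypothetical; the endpoint and query parameters are the ones used in the cell above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Build the venues/explore URL for a point, a radius in meters, and a result limit."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll parameter unescaped, matching the hand-built URL
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")

# placeholder credentials, for illustration only
example_url = build_explore_url("MY_ID", "MY_SECRET", "20180605", 33.917028, -118.4156337)
print(example_url)
```

Calling the helper once per city, with each city's latitude and longitude, would avoid repeating the format string.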

In [10]:
# fetching the venue results
ES_results = requests.get(url).json()
ES_results

{'meta': {'code': 200, 'requestId': '6039c8124e785233ee5e8f77'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'},
    {'name': '$-$$$$', 'key': 'price'}]},
  'headerLocation': 'El Segundo',
  'headerFullLocation': 'El Segundo',
  'headerLocationGranularity': 'city',
  'totalResults': 56,
  'suggestedBounds': {'ne': {'lat': 33.921528004500004,
    'lng': -118.41022112968353},
   'sw': {'lat': 33.9125279955, 'lng': -118.42104627031647}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4e30d618d22d33105980a212',
       'name': 'El Segundo Brewing Company',
       'location': {'address': '140 Main St',
        'lat': 33.917753039416205,
        'lng': -118.41559258096558,
        'labeledLatLngs': [{'label': 

In [11]:
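# function that extracts the category of the venue
def get_category_type(row):
    # the column is 'categories' or 'venue.categories' depending on how the data was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue with no categories gets None; otherwise use its first listed category
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']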

In [12]:
venues = ES_results['response']['groups'][0]['items']
    
ESnearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
ESnearby_venues =ESnearby_venues.loc[:, filtered_columns]

# filter the category for each row
ESnearby_venues['venue.categories'] = ESnearby_venues.apply(get_category_type, axis=1)

# clean columns
ESnearby_venues.columns = [col.split(".")[-1] for col in ESnearby_venues.columns]

ESnearby_venues.head()


| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | El Segundo Brewing Company | Brewery | 33.917753 | -118.415593 |
| 1 | Brewport Tap House | Gastropub | 33.918093 | -118.415759 |
| 2 | Jame Enoteca | Italian Restaurant | 33.918329 | -118.416105 |
| 3 | Sausal | Mexican Restaurant | 33.918085 | -118.416048 |
| 4 | El Tarasco Mexican Food | Mexican Restaurant | 33.918286 | -118.415845 |


In [13]:
# counting the unique venue categories
print('There are {} unique categories.'.format(len(ESnearby_venues['categories'].unique())))

There are 42 unique categories.


In [14]:
# Looking at the categories to see if any of the venues in El Segundo match the category of the client's business
ESnearby_venues["categories"]


0                       Brewery
1                     Gastropub
2            Italian Restaurant
3            Mexican Restaurant
4            Mexican Restaurant
5           American Restaurant
6              Cuban Restaurant
7                   IT Services
8                    Taco Place
9                    Sports Bar
10          American Restaurant
11                  Coffee Shop
12           Italian Restaurant
13                  Pizza Place
14                  Coffee Shop
15         Fast Food Restaurant
16          American Restaurant
17                 Dance Studio
18                  Coffee Shop
19               Ice Cream Shop
20                          Bar
21     Mediterranean Restaurant
22               Sandwich Place
23               Discount Store
24             Greek Restaurant
25                    BBQ Joint
26                 Cocktail Bar
27                Movie Theater
28                   Restaurant
29          Japanese Restaurant
30                        Diner
31      
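
Scanning the list by eye works for a few dozen venues, but the check can also be done programmatically. The following is a minimal sketch, assuming each item carries a `venue` entry with a `categories` list of named categories, which is the layout `get_category_type` relies on. The helper name `venues_matching` and the sample items are made up for illustration and are not real query results.

```python
def venues_matching(items, keyword):
    """Return names of venues with any category whose name contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    matches = []
    for item in items:
        venue = item.get("venue", {})
        names = [c.get("name", "") for c in venue.get("categories", [])]
        if any(keyword in n.lower() for n in names):
            matches.append(venue.get("name"))
    return matches

# made-up sample items in the same shape as the Foursquare response
sample_items = [
    {"venue": {"name": "Sample Brewery", "categories": [{"name": "Brewery"}]}},
    {"venue": {"name": "Sample Comics", "categories": [{"name": "Comic Shop"}]}},
    {"venue": {"name": "No Category Venue", "categories": []}},
]

print(venues_matching(sample_items, "comic"))
```

An empty list for a city would mean that no venue in the returned results carries a matching category. The exact category label Foursquare uses for comic book stores is not shown in this notebook, so the keyword here is a guess.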

In [15]:
ESV_map=folium.Map(location=[33.917028, -118.4156337], zoom_start=17)
for i,row in ESnearby_venues.iterrows():
    folium.Marker(location=[row['lat'],row['lng']], popup=row["name"]).add_to(ESV_map)
ESV_map

The map above shows the Foursquare venues for El Segundo superimposed on the folium map.
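
Because the client wants some distance between the store and its competition, it can help to measure how far a venue lies from a reference point. Below is a minimal sketch using the haversine formula with a mean Earth radius of 6,371 km, which is a spherical approximation. The function name `haversine_m` is hypothetical; the coordinates are the El Segundo center point and the first venue returned above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (spherical approximation)

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

center = (33.917028, -118.4156337)                   # El Segundo center point
brewery = (33.917753039416205, -118.41559258096558)  # El Segundo Brewing Company
distance = haversine_m(*center, *brewery)
print(round(distance), "meters from the center point")
```

The same function could be applied to any competitor's coordinates and a candidate storefront to check whether they are far enough apart for the client.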

In [16]:
# building the Foursquare request for venues within 500 meters of the Hermosa Beach center point
LIMIT=100
radius=500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    HB_lat, 
    HB_lng, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=TAIL0VYFMOI2HD41UOZXJRO05VTNQZCPNAIKPL4EQGZ4DRI2&client_secret=DK0RICVRLNEYYVMQXKEGQGDAESI1ZD5KUFSMSZZEFJFVV3YY&v=20180605&ll=33.86428,-118.39591&radius=500&limit=100'

In [17]:
HB_results = requests.get(url).json()
HB_results

{'meta': {'code': 200, 'requestId': '6039c813a6f02c11e2e3c422'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': '$-$$$$', 'key': 'price'},
    {'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Hermosa Beach',
  'headerFullLocation': 'Hermosa Beach',
  'headerLocationGranularity': 'city',
  'totalResults': 70,
  'suggestedBounds': {'ne': {'lat': 33.8687800045, 'lng': -118.39050077587581},
   'sw': {'lat': 33.8597799955, 'lng': -118.4013192241242}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b1ee637f964a520432124e3',
       'name': 'Creme de la Crepe',
       'location': {'address': '424 Pier Ave',
        'lat': 33.8641795028075,
        'lng': -118.39710441028895,
        'labeledLatLngs': [{'label': 'display',
   

In [18]:
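# function that extracts the category of the venue (same helper as in the El Segundo section)
def get_category_type(row):
    # the column is 'categories' or 'venue.categories' depending on how the data was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue with no categories gets None; otherwise use its first listed category
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']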

In [19]:
venues = HB_results['response']['groups'][0]['items']
    
HB_nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
HB_nearby_venues =HB_nearby_venues.loc[:, filtered_columns]

# filter the category for each row
HB_nearby_venues['venue.categories'] = HB_nearby_venues.apply(get_category_type, axis=1)

# clean columns
HB_nearby_venues.columns = [col.split(".")[-1] for col in HB_nearby_venues.columns]

HB_nearby_venues.head()


| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Creme de la Crepe | French Restaurant | 33.86418 | -118.397104 |
| 1 | Hermosa Beach Fish Shop | Seafood Restaurant | 33.865028 | -118.394182 |
| 2 | Fritto Misto Italian Cafe | Italian Restaurant | 33.863774 | -118.398149 |
| 3 | The Source Cafe | Vegetarian / Vegan Restaurant | 33.864499 | -118.396651 |
| 4 | The Strand | Trail | 33.86703 | -118.394904 |


In [20]:
HB_nearby_venues.shape

(70, 4)

In [21]:
#looking at the Hermosa Beach venues categories
HB_nearby_venues['categories']

0                 French Restaurant
1                Seafood Restaurant
2                Italian Restaurant
3     Vegetarian / Vegan Restaurant
4                             Trail
5                       Yoga Studio
6                      Burger Joint
7                         Wine Shop
8                Mexican Restaurant
9             Australian Restaurant
10              American Restaurant
11              Peruvian Restaurant
12                   Massage Studio
13               Mexican Restaurant
14                      Men's Store
15                             Café
16          New American Restaurant
17                     Burger Joint
18                 Sushi Restaurant
19                   Ice Cream Shop
20                        Juice Bar
21                      Pizza Place
22              Government Building
23                     Antique Shop
24                      Coffee Shop
25                         Pharmacy
26               Italian Restaurant
27                      Coff

In [22]:
# creating a map of all the local venues in the Hermosa Beach area
HBV_map=folium.Map(location=[33.86428, -118.39591], zoom_start=16)
for i,row in HB_nearby_venues.iterrows():
    folium.Marker(location=[row['lat'],row['lng']], popup=row["name"]).add_to(HBV_map)
HBV_map

In [23]:
#pulling the information about Hermosa Beach and El Segundo out of the dataframe
df.iloc[[126,175]]

| | Name | Type | County | Population (2010)[1][8][9] | Land area[1] (sq mi) | Land area[1] (km2) | Incorporated[7] |
|---|---|---|---|---|---|---|---|
| 126 | El Segundo | City | Los Angeles | 16654 | 5.46 | 14.1 | January 18, 1917 |
| 175 | Hermosa Beach | City | Los Angeles | 19506 | 1.43 | 3.7 | January 14, 1907 |


## Results<a id='results'></a>

Pulling the Foursquare venue information for both Hermosa Beach and El Segundo shows the types of businesses within 500 meters of each city's center coordinates. The results from the above work are as follows.
* Foursquare returned 56 venues for El Segundo and 70 for Hermosa Beach, so both search areas hold a comparable number of businesses, with Hermosa Beach having somewhat more.
* Each city is very diverse in terms of venue categories; El Segundo alone has 42 unique categories.
* No comic book store appears among the venues returned for either city, which is good because it means both locations meet one of the client's requirements, at least within the area searched.
* The population data shows that Hermosa Beach has the higher population by nearly 3,000 people (19,506 versus 16,654 in 2010).

## Discussion<a id='discussion'></a>

While the two cities are similar in terms of population and venue categories, there are some differences that set them clearly apart. According to the city information pulled from Wikipedia, El Segundo covers roughly four times the land area of Hermosa Beach (5.46 versus 1.43 square miles). Normally, a larger beach city would be expected to contain more people and more businesses to serve them. However, El Segundo has two anomalies that make it larger without providing those benefits. The first folium map reveals that a large portion of the area in and around the city is taken up by the Chevron refinery and Los Angeles International Airport. While these places undoubtedly bring something to El Segundo, they also introduce unknowns that may affect nearby businesses, such as noise pollution and increased air pollution, to name only two. Hermosa Beach, while smaller, has not only a higher population but also more businesses within the city center. Having a diverse mix of businesses in a single area can be conducive not only to growth but to continued success.
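
The size and population comparison above can be made concrete with a little arithmetic on the Wikipedia figures shown earlier (2010 population and land area in square miles). This is an illustrative sketch; population density is a derived figure added here, not one the analysis above computed.

```python
# figures from the Wikipedia table shown earlier (2010 population, land area in sq mi)
cities = {
    "El Segundo": {"population": 16654, "sq_mi": 5.46},
    "Hermosa Beach": {"population": 19506, "sq_mi": 1.43},
}

area_ratio = cities["El Segundo"]["sq_mi"] / cities["Hermosa Beach"]["sq_mi"]
population_gap = cities["Hermosa Beach"]["population"] - cities["El Segundo"]["population"]
density = {name: c["population"] / c["sq_mi"] for name, c in cities.items()}

print("El Segundo is {:.1f}x the land area of Hermosa Beach".format(area_ratio))
print("Hermosa Beach has {} more residents".format(population_gap))
for name, value in density.items():
    print("{}: about {:.0f} people per sq mi".format(name, value))
```

Because El Segundo's land area includes the industrial land discussed above, its density figure may understate how concentrated its residential and commercial areas are.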

## Conclusion <a id='conclusion'></a>

This project set out to answer the client's question of which city would be the better place to open a comic book store. Based on the Foursquare venue information along with the population and land-area data from Wikipedia, I recommend that the client start their business in Hermosa Beach, California. I came to this conclusion because Hermosa Beach not only has a higher population but also has a larger business center within the city. A larger business center would normally be a concern because of the increased number of businesses in the area. However, because the client wants to open a specialty store and there is no other store like it in the area, the increased foot traffic could help the business thrive. The client's store could work symbiotically with the other types of businesses in the area.