***IBM Capstone Week 4 - Introduction/Business Problem***

An investor in the Miami, Florida area, aware of the region's population growth and economic development, is exploring the idea of opening a health food store with an adjacent fitness center.  The investor wants to know which neighborhoods are the most densely populated and therefore might offer the most potential for success.  For example, Coconut Grove appears to have a good mix of residential and commercial land uses, and its population continues to grow.

Foursquare data will be used to locate existing competitors throughout the city's neighborhoods and to evaluate their ratings.

Additionally, a list of Miami neighborhoods and their approximate GPS coordinates will be taken from Wikipedia:  https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Miami.

***IBM Capstone Week 4 - Data Collection and Cleaning***

In [94]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests

*First, evaluate and organize the Miami neighborhood data.*

In [95]:
MiamiListURL="https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Miami"
def getHTMLContent(MiamiListURL):
    html=urlopen(MiamiListURL)
    soup=BeautifulSoup(html, 'html.parser')
    return soup

In [96]:
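from io import StringIO

# Download the page and take the first table on it.
req=requests.get(MiamiListURL)
soup2=BeautifulSoup(req.content, 'lxml')
table2=soup2.find_all('table')[0]
# pd.read_html returns a list of DataFrames.  Wrap the HTML in StringIO, since newer pandas versions deprecate passing a literal HTML string.
df=pd.read_html(StringIO(str(table2)))
neighborhoodDF=df[0]
neighborhoodDF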

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates
0,Allapattah,,54289,4401,,25.815-80.224
1,Arts & Entertainment District,,11033,7948,,25.799-80.190
2,Brickell,Brickellite,31759,14541,West Brickell,25.758-80.193
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,25.813-80.192
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712-80.257
5,Coral Way,,35062,4496,"Coral Gate, Golden Pines, Shenandoah, Historic...",25.750-80.283
6,Design District,,3573,3623,,25.813-80.193
7,Downtown,Downtowner,"71,000 (13,635 CBD only)",10613,"Brickell, Central Business District (CBD), Dow...",25.774-80.193
8,Edgewater,,15005,6675,,25.802-80.190
9,Flagami,,50834,5665,"Alameda, Grapeland Heights, and Fairlawn",25.762-80.316


In [97]:
# Drop "Demonym", as it is irrelevant to this analysis.
MIAneighb=neighborhoodDF.drop('Demonym', axis=1)
MIAneighb.head()

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Coordinates
0,Allapattah,54289,4401,,25.815-80.224
1,Arts & Entertainment District,11033,7948,,25.799-80.190
2,Brickell,31759,14541,West Brickell,25.758-80.193
3,Buena Vista,9058,3540,Buena Vista East Historic District and Design ...,25.813-80.192
4,Coconut Grove,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712-80.257


In [98]:
# Split "Coordinates" into two separate columns: "Latitude" and "Longitude".  Make sure both are floats and that Longitude is negative.

# new data frame with split value columns 
coord = MIAneighb["Coordinates"].str.split("-", n = 0, expand = True).astype(float)
  
# making separate Latitude column from new data frame 
MIAneighb["Latitude"]= coord[0] 
  
# making separate Longitude column from new data frame 
MIAneighb["Longitude"]= -1*coord[1] 
  
# Dropping old Coordinates columns 
MIAneighb.drop(columns =["Coordinates"], inplace = True) 
  
# Display revised dataframe
MIAneighb

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
0,Allapattah,54289,4401,,25.815,-80.224
1,Arts & Entertainment District,11033,7948,,25.799,-80.19
2,Brickell,31759,14541,West Brickell,25.758,-80.193
3,Buena Vista,9058,3540,Buena Vista East Historic District and Design ...,25.813,-80.192
4,Coconut Grove,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712,-80.257
5,Coral Way,35062,4496,"Coral Gate, Golden Pines, Shenandoah, Historic...",25.75,-80.283
6,Design District,3573,3623,,25.813,-80.193
7,Downtown,"71,000 (13,635 CBD only)",10613,"Brickell, Central Business District (CBD), Dow...",25.774,-80.193
8,Edgewater,15005,6675,,25.802,-80.19
9,Flagami,50834,5665,"Alameda, Grapeland Heights, and Fairlawn",25.762,-80.316
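
The split above relies on every value containing exactly one hyphen between the latitude and the longitude. As a hedged alternative, the sketch below uses `Series.str.extract` with a regular expression to pull out both numbers and restore the longitude's sign. The `sample` frame and the pattern are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Illustrative sample in the same "lat-lon" format as the scraped table.
sample = pd.DataFrame({
    "Neighborhood": ["Allapattah", "Brickell", "Health District"],
    "Coordinates": ["25.815-80.224", "25.758-80.193", None],
})

# Capture two decimal numbers separated by a hyphen; rows that do not match become NaN.
pattern = r"^\s*(\d+\.\d+)\s*-\s*(\d+\.\d+)\s*$"
parts = sample["Coordinates"].str.extract(pattern).astype(float)

sample["Latitude"] = parts[0]
# The hyphen was the minus sign of a western longitude, so negate the second number.
sample["Longitude"] = -parts[1]
sample = sample.drop(columns=["Coordinates"])
print(sample)
```

Rows with a missing or malformed coordinate string come out as `NaN` in both new columns, which makes them easy to find and fill by hand afterwards.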


In [99]:
# Health District does not have Lat/Long coordinates, and Index 25 (sum totals) should be removed to avoid confusion.

# Health District (aka Civic Center).  Lat/Long (from Google Maps): 25.790, -80.215
# Fill with floats (not strings) so that both columns stay numeric.
MIAneighb["Latitude"] = MIAneighb["Latitude"].fillna(25.790)
MIAneighb["Longitude"] = MIAneighb["Longitude"].fillna(-80.215)

# Drop Row 25
MIAneighb=MIAneighb.drop(index=25)
MIAneighb

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
0,Allapattah,54289,4401,,25.815,-80.224
1,Arts & Entertainment District,11033,7948,,25.799,-80.19
2,Brickell,31759,14541,West Brickell,25.758,-80.193
3,Buena Vista,9058,3540,Buena Vista East Historic District and Design ...,25.813,-80.192
4,Coconut Grove,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712,-80.257
5,Coral Way,35062,4496,"Coral Gate, Golden Pines, Shenandoah, Historic...",25.75,-80.283
6,Design District,3573,3623,,25.813,-80.193
7,Downtown,"71,000 (13,635 CBD only)",10613,"Brickell, Central Business District (CBD), Dow...",25.774,-80.193
8,Edgewater,15005,6675,,25.802,-80.19
9,Flagami,50834,5665,"Alameda, Grapeland Heights, and Fairlawn",25.762,-80.316
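
A blanket `fillna` writes the same value into every missing cell of a column, not only the Health District row. A minimal sketch of a more targeted approach, assuming a small hand-typed sample frame (the totals row label is illustrative), assigns the coordinates only where the neighborhood name matches:

```python
import numpy as np
import pandas as pd

# Illustrative sample: one complete row, one row missing coordinates, one totals row.
sample = pd.DataFrame({
    "Neighborhood": ["Allapattah", "Health District", "Totals row (illustrative)"],
    "Latitude": [25.815, np.nan, np.nan],
    "Longitude": [-80.224, np.nan, np.nan],
})

# Select only the Health District row and assign float coordinates to it.
is_health = sample["Neighborhood"] == "Health District"
sample.loc[is_health, ["Latitude", "Longitude"]] = [25.790, -80.215]

# Drop any row that still lacks coordinates (here, the totals row).
sample = sample.dropna(subset=["Latitude", "Longitude"]).reset_index(drop=True)
print(sample)
```

Selecting by name rather than by position also keeps the step valid if the row order of the source table changes.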


In [100]:
# Also, Midtown and the Venetian Islands need population data.  

# Sources for the Venetian Islands conflicted: population and land area estimates were very inconsistent.  For the purposes of this exercise,
# we are going to drop Venetian Islands from this data set.
MIAneighb=MIAneighb.drop(index=21)
MIAneighb=MIAneighb.reset_index(drop=True)
MIAneighb

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
0,Allapattah,54289,4401,,25.815,-80.224
1,Arts & Entertainment District,11033,7948,,25.799,-80.19
2,Brickell,31759,14541,West Brickell,25.758,-80.193
3,Buena Vista,9058,3540,Buena Vista East Historic District and Design ...,25.813,-80.192
4,Coconut Grove,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712,-80.257
5,Coral Way,35062,4496,"Coral Gate, Golden Pines, Shenandoah, Historic...",25.75,-80.283
6,Design District,3573,3623,,25.813,-80.193
7,Downtown,"71,000 (13,635 CBD only)",10613,"Brickell, Central Business District (CBD), Dow...",25.774,-80.193
8,Edgewater,15005,6675,,25.802,-80.19
9,Flagami,50834,5665,"Alameda, Grapeland Heights, and Fairlawn",25.762,-80.316


In [101]:
# Midtown was a new development in 2010, so we will use what data we can gather from 
# https://www.point2homes.com/US/Neighborhood/FL/Midtown-Edgewater-Demographics.html and 
# https://www.cpexecutive.com/post/midtown-opportunities-l-l-c-acquires-22-acres-of-land-in-midtown-miami/.  
# The population is approximately 3,162, and the land area is 56 acres (0.23 km²).

MIAneighb.at[16,'Population2010']= 3162
MIAneighb.at[16,'Population/Km²']= 3162/.23
MIAneighb[16:17]

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
16,Midtown,3162,13747.8,Edgewater and Wynwood,25.807,-80.193
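
The acre-to-square-kilometre conversion behind the Midtown figure can be checked with a few lines of arithmetic. The sketch below is illustrative: the helper name `density_per_km2` is an assumption, and the conversion factor is the standard international acre.

```python
# One international acre expressed in square kilometres.
ACRES_TO_KM2 = 0.0040468564224

def density_per_km2(population, acres):
    """Return people per square kilometre for an area given in acres."""
    return population / (acres * ACRES_TO_KM2)

# Midtown: 56 acres, population of about 3,162.
midtown_area_km2 = 56 * ACRES_TO_KM2
midtown_density = density_per_km2(3162, 56)

print(round(midtown_area_km2, 2))  # 0.23
print(round(midtown_density, 1))
```

Because the notebook rounds the area to 0.23 km² before dividing, its density of 13,747.8 is slightly lower than the unrounded figure this function returns.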


In [102]:
# Downtown population needs to be cleaned up.
MIAneighb.at[7,'Population2010']= 71000
MIAneighb[7:8]

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
7,Downtown,71000,10613,"Brickell, Central Business District (CBD), Dow...",25.774,-80.193
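
Overwriting the Downtown cell by hand works for a single row. As a hedged sketch of a more general cleanup, the helper below (`parse_population` is a hypothetical name, not part of the notebook) keeps the first number in a cell and discards thousands separators and any trailing note:

```python
import re

def parse_population(value):
    """Return the first number in a cell such as '71,000 (13,635 CBD only)' as an int."""
    # Match a digit followed by any run of digits and thousands separators.
    match = re.search(r"\d[\d,]*", str(value))
    if match is None:
        return None
    return int(match.group().replace(",", ""))

print(parse_population("71,000 (13,635 CBD only)"))  # 71000
print(parse_population("54289"))                     # 54289
```

Applied to the whole column, for example with `MIAneighb["Population2010"].map(parse_population)`, this would leave already-clean values unchanged while normalizing cells like Downtown's.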


In [103]:
# Virginia Key is approximately 863 acres (3.49 km²).  Let's insert its population density.
MIAneighb.at[21,'Population/Km²']= 14/3.49
MIAneighb[21:22]

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
21,Virginia Key,14,4.01146,,25.736,-80.155


In [104]:
# Let's check the entire table
MIAneighb

Unnamed: 0,Neighborhood,Population2010,Population/Km²,Sub-neighborhoods,Latitude,Longitude
0,Allapattah,54289,4401.0,,25.815,-80.224
1,Arts & Entertainment District,11033,7948.0,,25.799,-80.19
2,Brickell,31759,14541.0,West Brickell,25.758,-80.193
3,Buena Vista,9058,3540.0,Buena Vista East Historic District and Design ...,25.813,-80.192
4,Coconut Grove,20076,3091.0,"Center Grove, Northeast Coconut Grove, Southwe...",25.712,-80.257
5,Coral Way,35062,4496.0,"Coral Gate, Golden Pines, Shenandoah, Historic...",25.75,-80.283
6,Design District,3573,3623.0,,25.813,-80.193
7,Downtown,71000,10613.0,"Brickell, Central Business District (CBD), Dow...",25.774,-80.193
8,Edgewater,15005,6675.0,,25.802,-80.19
9,Flagami,50834,5665.0,"Alameda, Grapeland Heights, and Fairlawn",25.762,-80.316


Going forward, this data set will be used in combination with Foursquare to evaluate population density and nearby competition in each neighborhood.
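
As a first look at population density, the cleaned table can simply be sorted by the density column. The sketch below uses four rows copied from the output above as an illustrative sample, not the full data set:

```python
import pandas as pd

# Four rows taken from the cleaned table above.
sample = pd.DataFrame({
    "Neighborhood": ["Allapattah", "Brickell", "Coconut Grove", "Downtown"],
    "Population/Km²": [4401.0, 14541.0, 3091.0, 10613.0],
})

# Rank neighborhoods from most to least densely populated.
ranked = sample.sort_values("Population/Km²", ascending=False).reset_index(drop=True)
print(ranked)
```

In this sample, Brickell and Downtown rank well above Coconut Grove on density alone, so density should be weighed together with the competition data before drawing conclusions.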