# Georgia and Florida Reopening Amid COVID-19 Concerns

## Applied Data Science Capstone by IBM/Coursera

## Introduction/Business Problem
Cities such as Jacksonville are reopening beaches, while Georgia plans to reopen several types of businesses starting April 24th. Is this a good time to reopen given the public health risk from COVID-19, or will staying closed cause too much economic damage? To answer these questions, I will look at county-level population health information from https://www.countyhealthrankings.org/ and business information (using the Foursquare API) for Jacksonville, Florida, and compare it with Savannah, Georgia, since both are large cities in neighboring states. The governors of Georgia and Florida should take these factors into consideration before reopening their states amid COVID-19.

## Data
<b>I. Foursquare Location API</b> <br>
I will explore business information, such as the number and types of businesses, to gauge the impact that keeping the cities closed will have. I also want to see whether there are nearby clinics, hospitals, and/or pharmacies, to learn about the capacity for handling a rise in cases when deciding whether to reopen.

<b>II. County Health Rankings</b> <br>
I will use the County Health Rankings to get detailed information about the population, using measures such as:
- Quality of Life
- Health Behaviors
- Clinical Care
- Social & Economic Factors

Code follows:

In [None]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if it is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans



print('Libraries imported.')

In [2]:
!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if it is already installed
import folium # map rendering library

In [2]:
CLIENT_ID = 'KTOGVYYQUEWMCLZYN3EHY3W1FJ2EUDJ1FIXZKSXQCWBXNQJA' # your Foursquare ID
CLIENT_SECRET = '01RRALCQSNM3UZKYRWZYBNMJL5O5TBFCCBRVKYI4XUA0MSWW' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: KTOGVYYQUEWMCLZYN3EHY3W1FJ2EUDJ1FIXZKSXQCWBXNQJA
CLIENT_SECRET:01RRALCQSNM3UZKYRWZYBNMJL5O5TBFCCBRVKYI4XUA0MSWW


In [8]:
address = 'Savannah,Georgia'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address,latitude, longitude))

The geographical coordinates of Savannah,Georgia are 32.0809263, -81.0911768.


In [7]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)
neighborhoods

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


In [9]:
jk_df = {'City': ['Jacksonville','Cocoa Beach','Daytona Beach','Savannah'],
        'State': ['Florida','Florida','Florida','Georgia'],
        'Latitude':['30.3321838','28.3200','29.2108','32.0809263'],
        'Longitude':['-81.655651','-80.6076','-81.0228','-81.0911768']
        }

jk_df = pd.DataFrame(jk_df, columns = ['City', 'State','Latitude','Longitude'])

print (jk_df)

            City    State    Latitude    Longitude
0   Jacksonville  Florida  30.3321838   -81.655651
1    Cocoa Beach  Florida     28.3200     -80.6076
2  Daytona Beach  Florida     29.2108     -81.0228
3       Savannah  Georgia  32.0809263  -81.0911768
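
The coordinates in the table above are stored as strings. That works for building request URLs, but numeric operations and plotting generally expect floats. The following is a minimal sketch of one way to convert them, assuming the same column names; the variable name `cities` is hypothetical and not part of the notebook.

```python
import pandas as pd

# Same four cities as above, with coordinates still stored as strings
cities = pd.DataFrame({
    'City': ['Jacksonville', 'Cocoa Beach', 'Daytona Beach', 'Savannah'],
    'State': ['Florida', 'Florida', 'Florida', 'Georgia'],
    'Latitude': ['30.3321838', '28.3200', '29.2108', '32.0809263'],
    'Longitude': ['-81.655651', '-80.6076', '-81.0228', '-81.0911768'],
})

# Convert both coordinate columns to floats;
# errors='raise' makes a malformed value fail loudly instead of passing silently
for column in ['Latitude', 'Longitude']:
    cities[column] = pd.to_numeric(cities[column], errors='raise')

print(cities.dtypes)
```

After the conversion, the columns can be passed directly to numeric code, while string formatting into a URL still works as before.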


In [10]:
city_latitude = jk_df.loc[0, 'Latitude'] # neighborhood latitude value
city_longitude = jk_df.loc[0, 'Longitude'] # neighborhood longitude value

city_name = jk_df.loc[0, 'City'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of Jacksonville are 30.3321838, -81.655651.


In [14]:
city_latitude = jk_df.loc[1, 'Latitude'] # neighborhood latitude value
city_longitude = jk_df.loc[1, 'Longitude'] # neighborhood longitude value

city_name = jk_df.loc[1, 'City'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of Cocoa Beach are 28.3200, -80.6076.


In [17]:
city_latitude = jk_df.loc[2, 'Latitude'] # neighborhood latitude value
city_longitude = jk_df.loc[2, 'Longitude'] # neighborhood longitude value

city_name = jk_df.loc[2, 'City'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of Daytona Beach are 29.2108, -81.0228.


In [20]:
city_latitude = jk_df.loc[3, 'Latitude'] # neighborhood latitude value
city_longitude = jk_df.loc[3, 'Longitude'] # neighborhood longitude value

city_name = jk_df.loc[3, 'City'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of Savannah are 32.0809263, -81.0911768.
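
The four cells above repeat the same lookup with a different row index each time. As a sketch of a more compact alternative, the same messages can be produced in one loop over the rows. The names `cities` and `messages` are hypothetical; the data is the same as in `jk_df`.

```python
import pandas as pd

# Same cities and string coordinates as jk_df above
cities = pd.DataFrame({
    'City': ['Jacksonville', 'Cocoa Beach', 'Daytona Beach', 'Savannah'],
    'Latitude': ['30.3321838', '28.3200', '29.2108', '32.0809263'],
    'Longitude': ['-81.655651', '-80.6076', '-81.0228', '-81.0911768'],
})

# Build one message per city instead of repeating the cell for each row index
messages = []
for row in cities.itertuples(index=False):
    messages.append('Latitude and longitude values of {} are {}, {}.'.format(
        row.City, row.Latitude, row.Longitude))

for message in messages:
    print(message)
```

Note that the notebook's later cells rely on `city_latitude` and `city_longitude` holding the last city looked up (Savannah), so a loop like this would need to set those explicitly if it replaced the cells.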


In [22]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 500 # define radius

# create URL
url2 = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    city_latitude, 
    city_longitude, 
    radius, 
    LIMIT)
url2 # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=KTOGVYYQUEWMCLZYN3EHY3W1FJ2EUDJ1FIXZKSXQCWBXNQJA&client_secret=01RRALCQSNM3UZKYRWZYBNMJL5O5TBFCCBRVKYI4XUA0MSWW&v=20180604&ll=32.0809263,-81.0911768&radius=500&limit=100'
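
Because the URL above embeds the client ID and secret, displaying it also displays the credentials. One possible alternative, sketched below, reads the credentials from environment variables and builds the query string with the standard library. The function name `build_explore_url` and the environment variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions for illustration; the endpoint and parameter names are the ones used in the cell above.

```python
import os
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, lat, lng,
                      version='20180604', radius=500, limit=100):
    """Build a venues/explore URL with the same parameters as the cell above."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),  # latitude,longitude pair
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-encodes values, so the comma in 'll' becomes %2C
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

# Hypothetical environment variable names; placeholders are used when unset
client_id = os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID')
client_secret = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET')

savannah_url = build_explore_url(client_id, client_secret,
                                 '32.0809263', '-81.0911768')
```

With this approach the notebook can be shared without the secret appearing in either the code or the cell output, as long as the URL itself is not displayed.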

In [23]:
results = requests.get(url2).json()
results

{'meta': {'code': 200, 'requestId': '5ea24e220cc1fd001bc783f4'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': '$-$$$$', 'key': 'price'},
    {'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Historic District-North',
  'headerFullLocation': 'Historic District-North, Savannah',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 55,
  'suggestedBounds': {'ne': {'lat': 32.085426304500004,
    'lng': -81.08587571542196},
   'sw': {'lat': 32.0764262955, 'lng': -81.09647788457804}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4d63d214dd51ba7a7b30ba02',
       'name': 'Savannah Bee Company',
       'location': {'address': '1 W River St',
        'lat': 32.08167968720722,
        'lng': -81.09118353022185,
  

In [5]:
# create map of Miami using latitude and longitude values
map_miami = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_miami)  
    
#map_miami

NameError: name 'folium' is not defined

In [6]:
#Create a handle, page, to hold the contents of the website
URL = 'https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Miami'
page = requests.get(URL)

In [7]:
import lxml.html as lh
#Store the contents of the website under doc
doc = lh.fromstring(page.content)

In [8]:
#Parse data that are stored between <tr>..</tr> of HTML
tr_elements = doc.xpath('//tr')

In [9]:
[len(T) for T in tr_elements[:12]]

[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]

In [10]:
tr_elements = doc.xpath('//tr')
#Create empty list
col=[]
i=0   
    #For each row, store each first element (header) and an empty list
for t in tr_elements[0]:
    i+=1
    name=t.text_content()

    print (i,name)
    col.append((name,[]))

1 Neighborhood
2 Demonym
3 Population2010
4 Population/Km²
5 Sub-neighborhoods
6 Coordinates



In [11]:
#Since our first row is the header, data is stored on the second row onwards
for j in range(1,len(tr_elements)):
    #T is our j'th row
    T=tr_elements[j]
    
    #If row is not of size 6, the //tr data is not from our table 
    if len(T)!=6:
        break
    
    #i is the index of our column
    i=0
    
    #Iterate through each element of the row
    for t in T.iterchildren():
        data=t.text_content() 
        #Leave the first column (the neighborhood name) as text
        if i>0:
        #Convert any numerical value to integers
            try:
                data=int(data)
            except ValueError:
                pass
        #Append the data to the empty list of the i'th column
        col[i][1].append(data)
        #Increment i for the next column
        i+=1

In [12]:
[len(C) for (title,C) in col]

[26, 26, 26, 26, 26, 26]

In [13]:
Dict={title:column for (title,column) in col}
df=pd.DataFrame(Dict)
df.head()

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates
0,Allapattah,,54289,4401,,"25.815,-80.224\n"
1,Arts & Entertainment District,,11033,7948,,"25.799,-80.190\n"
2,Brickell,Brickellite,31759,14541,West Brickell,"25.758,-80.193\n"
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,"25.813,-80.192\n"
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...","25.712,-80.257\n"


In [14]:
df2= df.replace(r'\n',  '', regex=True)
df2.head()

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates
0,Allapattah,,54289,4401,,"25.815,-80.224"
1,Arts & Entertainment District,,11033,7948,,"25.799,-80.190"
2,Brickell,Brickellite,31759,14541,West Brickell,"25.758,-80.193"
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,"25.813,-80.192"
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...","25.712,-80.257"


In [15]:
df2.columns = df.columns.str.replace("\n", "")
list(df2)

['Neighborhood',
 'Demonym',
 'Population2010',
 'Population/Km²',
 'Sub-neighborhoods',
 'Coordinates']

In [16]:
# new data frame with split value columns 
new = df2["Coordinates"].str.split(",", n = 1, expand = True) 
new.head()

Unnamed: 0,0,1
0,25.815,-80.224
1,25.799,-80.19
2,25.758,-80.193
3,25.813,-80.192
4,25.712,-80.257


In [17]:
# make a separate Latitude column from the first half of the split
df2["Latitude"]= new[0] 
  
# make a separate Longitude column from the second half of the split
df2["Longitude"]= new[1] 

df2.head(20)

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates,Latitude,Longitude
0,Allapattah,,54289,4401,,"25.815,-80.224",25.815,-80.224
1,Arts & Entertainment District,,11033,7948,,"25.799,-80.190",25.799,-80.19
2,Brickell,Brickellite,31759,14541,West Brickell,"25.758,-80.193",25.758,-80.193
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,"25.813,-80.192",25.813,-80.192
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...","25.712,-80.257",25.712,-80.257
5,Coral Way,,35062,4496,"Coral Gate, Golden Pines, Shenandoah, Historic...","25.750,-80.283",25.75,-80.283
6,Design District,,3573,3623,,"25.813,-80.193",25.813,-80.193
7,Downtown,Downtowner,"71,000 (13,635 CBD only)",10613,"Brickell, Central Business District (CBD), Dow...","25.774,-80.193",25.774,-80.193
8,Edgewater,,15005,6675,,"25.802,-80.190",25.802,-80.19
9,Flagami,,50834,5665,"Alameda, Grapeland Heights, and Fairlawn","25.762,-80.316",25.762,-80.316
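
In the table above, the Downtown population appears as the text `71,000 (13,635 CBD only)`, because `int()` cannot parse it and the scraping loop leaves such values unchanged. The following is a minimal sketch of one way to recover the leading number; `parse_leading_int` is a hypothetical helper, not part of the notebook, and taking the first number is an assumption about which figure is wanted.

```python
import re

def parse_leading_int(text):
    """Return the first integer found in text, ignoring thousands separators.

    Returns None when the text contains no digits.
    """
    match = re.search(r'\d[\d,]*', str(text))
    if match is None:
        return None
    return int(match.group(0).replace(',', ''))

print(parse_leading_int('71,000 (13,635 CBD only)'))
print(parse_leading_int(54289))
```

A helper like this could be applied to the `Population2010` column with `df2['Population2010'].apply(parse_leading_int)` to obtain a uniformly numeric column.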


In [18]:
LH_data = df2[df2['Neighborhood'] == 'Little Havana'].reset_index(drop=True)
LH_data.head()

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates,Latitude,Longitude
0,Little Havana,,76163,8423,Riverside and South River Drive Historic District,"25.773,-80.215",25.773,-80.215


In [19]:
address = 'Little Havana, FL'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Little Havana are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Little Havana are 25.7681503, -80.2334686.


In [20]:
LH_data.loc[0, 'Neighborhood']

'Little Havana'

In [21]:
neighborhood_latitude = LH_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = LH_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = LH_data.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Little Havana are 25.773, -80.215.


In [23]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 500 # define radius

# create URL
url2 = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url2 # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=KTOGVYYQUEWMCLZYN3EHY3W1FJ2EUDJ1FIXZKSXQCWBXNQJA&client_secret=01RRALCQSNM3UZKYRWZYBNMJL5O5TBFCCBRVKYI4XUA0MSWW&v=20180604&ll=25.773,-80.215&radius=500&limit=100'

In [24]:
results = requests.get(url2).json()
results

{'meta': {'code': 200, 'requestId': '5ea11d3dc546f3001c01032a'},
 'response': {'headerLocation': 'Little Havana',
  'headerFullLocation': 'Little Havana, Miami',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 7,
  'suggestedBounds': {'ne': {'lat': 25.777500004500006,
    'lng': -80.2100122332913},
   'sw': {'lat': 25.768499995499994, 'lng': -80.21998776670871}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4c017871f7ab0f47e42e16b6',
       'name': 'Pinolandia',
       'location': {'address': '119 NW 12th Ave',
        'lat': 25.774786961412257,
        'lng': -80.21464073279162,
        'labeledLatLngs': [{'label': 'display',
          'lat': 25.774786961412257,
          'lng': -80.21464073279162}],
        'distance': 202,
        'postalCode': '33

In [25]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [26]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Pinolandia,Latin American Restaurant,25.774787,-80.214641
1,Sedano's,Grocery Store,25.774067,-80.215799
2,Walgreens,Pharmacy,25.77294,-80.214986
3,El Palacio De Los Jugos,Latin American Restaurant,25.77381,-80.213158
4,Viva México,Mexican Restaurant,25.768581,-80.214614


In [27]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

7 venues were returned by Foursquare.


In [28]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
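
The function above indexes `v['venue']['categories'][0]` directly, which would raise an `IndexError` for a venue that has an empty category list (the case that `get_category_type` guards against earlier). A small sketch of a safer row builder follows; `venue_category` and `venue_row` are hypothetical helpers, and the sample item is a trimmed-down illustration that mirrors the keys seen in the API response, with the category list emptied on purpose.

```python
def venue_category(venue):
    """Return the first category name of a venue, or None if it has none."""
    categories = venue.get('categories') or []
    return categories[0]['name'] if categories else None

def venue_row(name, lat, lng, item):
    """Build one output row for a single item of the API response."""
    venue = item['venue']
    return (name, lat, lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            venue_category(venue))

# Illustrative item with the same keys as the response, but no categories
sample_item = {'venue': {'name': 'Pinolandia',
                         'location': {'lat': 25.774786961412257,
                                      'lng': -80.21464073279162},
                         'categories': []}}

sample_row = venue_row('Little Havana', '25.773', '-80.215', sample_item)
print(sample_row)
```

Inside `getNearbyVenues`, the list comprehension could call `venue_row(name, lat, lng, v)` for each `v` in `results`, leaving the rest of the function unchanged.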

In [29]:
# fetch the venues near Little Havana

LH_venues = getNearbyVenues(names=LH_data['Neighborhood'],
                                   latitudes=LH_data['Latitude'],
                                   longitudes=LH_data['Longitude']
                                  )

Little Havana


In [30]:
print(LH_venues.shape)
LH_venues.head()

(7, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Little Havana,25.773,-80.215,Pinolandia,25.774787,-80.214641,Latin American Restaurant
1,Little Havana,25.773,-80.215,Sedano's,25.774067,-80.215799,Grocery Store
2,Little Havana,25.773,-80.215,Walgreens,25.77294,-80.214986,Pharmacy
3,Little Havana,25.773,-80.215,El Palacio De Los Jugos,25.77381,-80.213158,Latin American Restaurant
4,Little Havana,25.773,-80.215,Viva México,25.768581,-80.214614,Mexican Restaurant
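
To connect this output back to the questions in the Data section (how many businesses of each type are nearby, and whether any are health-related), the sketch below counts venue categories and flags health-related ones. It uses only the five rows shown above. The set `HEALTH_CATEGORIES` is an assumption: `Pharmacy` appears in the output, while the other labels are guesses at how such venues might be categorized and would need checking against real responses.

```python
from collections import Counter

# (venue, category) pairs copied from the five rows shown above
venues = [
    ('Pinolandia', 'Latin American Restaurant'),
    ("Sedano's", 'Grocery Store'),
    ('Walgreens', 'Pharmacy'),
    ('El Palacio De Los Jugos', 'Latin American Restaurant'),
    ('Viva México', 'Mexican Restaurant'),
]

# Assumed labels for health-related venues; only 'Pharmacy' is confirmed above
HEALTH_CATEGORIES = {'Pharmacy', 'Hospital', 'Medical Center', "Doctor's Office"}

# Number of venues per category
category_counts = Counter(category for _, category in venues)

# Names of venues whose category is in the health-related set
health_venues = [name for name, category in venues
                 if category in HEALTH_CATEGORIES]

print(category_counts.most_common())
print(health_venues)
```

The same counts could be obtained on the full DataFrame with `LH_venues['Venue Category'].value_counts()`.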


In [34]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.
