# The Battle of the Neighborhoods: Sunshine State


## Introduction/Business Problem 
Florida, aka the Sunshine State, is the third most populous state in the United States, with a population over 21 million (U.S. Census Bureau). 

Orlando and Jacksonville are two of the most popular cities in Florida. Orlando is located in Central Florida and is largely known for its high volume of tourism. It attracts tourists through attractions such as Walt Disney World and Universal Studios, and it is also a popular location for conferences and conventions. Jacksonville is approximately 140 miles north of Orlando and is the most populous municipality in Florida. Jacksonville also attracts tourists with its beautiful beaches along the Atlantic Coast, as well as golf tourism.

This study is being conducted in order to assist a group of investors in determining which Florida city, Orlando or Jacksonville, would be best suited for their business. 

## Data
In order to recommend one of the two cities to the investors, we must analyze data on both Orlando and Jacksonville. We will consider the following factors in our decision:
1. Population size 
2. Crime statistics
3. Other venues in the area
   

**Data Sources:** 

1. *<a href="https://www.fdle.state.fl.us/FSAC/Data-Statistics/UCR-Offense-Data.aspx" target="_blank">Florida Department of Law Enforcement</a>*
        The data is provided by the FDLE's Uniform Crime Report (UCR) system. This system provides "standardized reports on crime statistics based on data gathered from across the state. Reports that provide both summary and detail information are issued semi-annually and annually." We will be using the Total Index Crime for Florida by Jurisdiction, 2019 data to evaluate the population and level of crime present in each city. 
    
2. *geopy package (Nominatim geocoder)*
        The geopy package finds the latitude and longitude of a location. This study uses its Nominatim geocoder, which is backed by OpenStreetMap data, to identify the geographic coordinates of the two cities.

3. *<a href="https://foursquare.com/" target="_blank">Foursquare</a>*
        The Foursquare API provides location-based data on venues and users. This tool will be used to analyze current venues that may be considered competition.
    

## Methodology

To determine the optimal city for opening a business, we will analyze different characteristics of both cities being considered.

### Exploratory Data Analysis

We will begin by looking at the population and safety of both of the cities. To do so we must first import the necessary libraries.

In [1]:
# Importing necessary libraries 
!pip install geopy
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Libraries imported.


We begin by looking at the Total Index Crime for Florida by Jurisdiction (2019) data provided by the Florida Department of Law Enforcement.

In [4]:
#Downloading data into dataframes using Pandas library
orl_jax=pd.read_excel('https://github.com/jec16j/Coursera_Capstone/blob/master/FL_Index_Crime_by_Jurisdiction_2019.xlsx?raw=true')
orl_jax.head()

| | County | Agency | Year | Population | Total Index Crimes | Crime Rate per 100,000 Population | Murder | Murder Clearances | Rape^ | Rape Clearances | Robbery | Robbery Clearances | Aggravated Assault^^ | Aggravated Assault Clearances | Burglary | Burglary Clearances | Larceny | Larceny Clearances | Motor Vehicle Theft | Motor Vehicle Theft Clearances | Total Clearances | Clearance Rate per 100 Offenses |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 01-ALACHUA | Alachua County Sheriff's Office | 2019.0 | 117496.0 | 2555 | 2174.5 | 4 | 6 | 111 | 40 | 99 | 57 | 524 | 269 | 414 | 59 | 1260 | 146 | 143 | 37 | 614 | 24.0 |
| 1 | 01-ALACHUA | Gainesville Police Department | 2019.0 | 133068.0 | 5640 | 4238.4 | 2 | 1 | 153 | 39 | 185 | 74 | 588 | 275 | 501 | 78 | 3803 | 893 | 408 | 89 | 1449 | 25.7 |
| 2 | 01-ALACHUA | High Springs Police Department | 2019.0 | 6444.0 | 180 | 2793.3 | 0 | 0 | 3 | 2 | 1 | 1 | 24 | 11 | 34 | 6 | 110 | 34 | 8 | 6 | 60 | 33.3 |
| 3 | 01-ALACHUA | University of Florida Police Department | 2019.0 | 0.0 | 318 | | 0 | 0 | 10 | 0 | 1 | 1 | 5 | 3 | 22 | 6 | 258 | 50 | 22 | 7 | 67 | 21.1 |
| 4 | 01-ALACHUA | Alachua Police Department | 2019.0 | 10298.0 | 271 | 2631.6 | 0 | 0 | 4 | 2 | 7 | 6 | 30 | 13 | 32 | 3 | 191 | 32 | 7 | 1 | 57 | 21.0 |


We see that this data is reported at the county/agency level. For each agency, it gives the 2019 population of the jurisdiction, the Total Index Crimes, the Crime Rate per 100,000 Population, and a breakdown by offense type with the corresponding clearances: Murder, Rape, Robbery, Aggravated Assault, Burglary, Larceny, and Motor Vehicle Theft. It also reports Total Clearances and the Clearance Rate per 100 Offenses.

For this analysis we will focus on the population and the crime rate per 100,000 residents for Orlando and Jacksonville, using the records reported by the Orlando Police Department and the Jacksonville Sheriff's Office.

In [10]:
#Selecting the data we need
#Rows 188 and 455 hold the Jacksonville Sheriff's Office and Orlando Police Department records
rows = [188, 455]
columns = ['Agency', 'Population', 'Total Index Crimes', 'Crime Rate per 100,000 Population']
orl_jax.loc[rows, columns]

| | Agency | Population | Total Index Crimes | Crime Rate per 100,000 Population |
|---|---|---|---|---|
| 188 | Jacksonville Sheriff's Office | 926315.0 | 35974 | 3883.6 |
| 455 | Orlando Police Department | 291800.0 | 16257 | 5571.3 |
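
Hard-coded row numbers break if the spreadsheet is re-sorted or updated, so selecting the two records by agency name may be more robust. The following is a minimal sketch of that idea, not the method used in this notebook. The `crime` dataframe is a small stand-in built from values shown in the outputs above, and the `str.strip()` call is an assumption that guards against stray whitespace in the agency names.

```python
import pandas as pd

# Stand-in for the FDLE table, built only from values shown in the outputs above
crime = pd.DataFrame({
    "Agency": [
        "Gainesville Police Department",
        "Jacksonville Sheriff's Office",
        "Orlando Police Department",
    ],
    "Population": [133068.0, 926315.0, 291800.0],
    "Total Index Crimes": [5640, 35974, 16257],
    "Crime Rate per 100,000 Population": [4238.4, 3883.6, 5571.3],
})

# Select the two agencies of interest by name instead of by row position
agencies = ["Jacksonville Sheriff's Office", "Orlando Police Department"]
selected = crime[crime["Agency"].str.strip().isin(agencies)]
print(selected)
```

Applied to the full dataset, the same filter would be written against `orl_jax` instead of the stand-in dataframe.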


The data gives us a picture of the safety of each of these Florida cities. Jacksonville reports more than twice as many total index crimes as Orlando (35,974 versus 16,257), but its population is more than three times as large (926,315 versus 291,800). The crime rate per 100,000 residents adjusts for this difference and makes the two cities directly comparable: Jacksonville's rate (3,883.6) is substantially lower than Orlando's (5,571.3), so by this measure Jacksonville is the safer city. Jacksonville also has the larger population, which benefits new business owners who need a large pool of potential customers. Both points will be important for the investors to know before they make their decision.
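
As a quick check of the comparison, the published rates can be reproduced from the counts and populations in the table. This is a small sketch; `crime_rate_per_100k` is a hypothetical helper name, and the formula (crimes divided by population, times 100,000) is assumed from the column name and confirmed only by the fact that it matches the reported values.

```python
def crime_rate_per_100k(total_index_crimes, population):
    """Return the number of index crimes per 100,000 residents."""
    return total_index_crimes / population * 100_000

jax_rate = crime_rate_per_100k(35974, 926315)
orl_rate = crime_rate_per_100k(16257, 291800)

print(round(jax_rate, 1))  # 3883.6
print(round(orl_rate, 1))  # 5571.3

# Ratios behind the "more than double" and "more than triple" statements
crime_ratio = 35974 / 16257
population_ratio = 926315 / 291800
print(round(crime_ratio, 2), round(population_ratio, 2))  # 2.21 3.17
```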

We will now visualize our data by creating a map using Folium to better see where in Florida both of these cities are located.

In [11]:
#Using Nominatim from geopy.geocoders to obtain the longitudes and latitudes of both cities 
geolocator = Nominatim(user_agent="cruzerj10@gmail.com")
orl = geolocator.geocode("Orlando FL")
jax = geolocator.geocode("Jacksonville FL")
print("Orlando, FL:", (orl.latitude, orl.longitude))
print("Jacksonville, FL:",(jax.latitude, jax.longitude))

Orlando, FL: (28.5479786, -81.41278418563017)
Jacksonville, FL: (30.3321838, -81.655651)


In [12]:
#Creating a dataframe in pandas to store our location data cleanly 
a={'City': ["Orlando","Jacksonville"], "Latitude":[(orl.latitude),(jax.latitude)],"Longitude":[(orl.longitude),(jax.longitude)]}
geodf=pd.DataFrame(data=a)
geodf

| | City | Latitude | Longitude |
|---|---|---|---|
| 0 | Orlando | 28.547979 | -81.412784 |
| 1 | Jacksonville | 30.332184 | -81.655651 |


In [13]:
#Obtaining the latitude and longitude of Florida for our map
location = geolocator.geocode("Florida USA")
latitude = location.latitude
longitude = location.longitude

In [14]:
#Generating a map using Folium
map_florida = folium.Map(location=[latitude,longitude], zoom_start=6)

# add markers to map
for lat, lng, neighborhood in zip(geodf['Latitude'], geodf['Longitude'], geodf['City']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_florida)  
    
map_florida

Now that we can better visualize both of these cities, we will use the *Foursquare API* to get a better picture of the venues currently in each city. We will focus on the number of venues in both cities, as they will likely be competition for our investors' business.

In [15]:
#Defining Foursquare Credentials and Version
CLIENT_ID = 'GTOMCPHDW2AUB3XTEMC2SD0JAXTIQKMSCEOBFAUWUYS3NE5Q' 
CLIENT_SECRET = '4D214225TQNKRP33E0RN3P2L0WLIIXX1DWZPYIXT0SBD23BP'
VERSION = '20180604'
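
API credentials are usually kept out of a shared notebook. One common option is to read them from environment variables, sketched below. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical names chosen for this example, not part of the Foursquare API.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare credentials from a mapping of environment variables.

    FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are hypothetical
    variable names chosen for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {
    "FOURSQUARE_CLIENT_ID": "my-id",
    "FOURSQUARE_CLIENT_SECRET": "my-secret",
}
demo_client_id, demo_client_secret = load_foursquare_credentials(demo_env)
```

In the notebook, calling `load_foursquare_credentials()` with no argument would read from the real environment.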

In [20]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    LIMIT = 100 # Maximum number of venues returned by Foursquare API
    venuesoj=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # Defining the corresponding URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        
        # Sending GET Request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # Assigning relevant part of JSON to venues
        venuesoj.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearbyvenues = pd.DataFrame([item for city_venues in venuesoj for item in city_venues])
    nearbyvenues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearbyvenues
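
The function above indexes directly into the nested JSON (`response` → `groups` → `items` → `venue`), which raises an error if a venue has no category or a response has no groups. The sketch below shows one way the flattening step could be made more tolerant. `extract_venues` is a hypothetical helper, and `mock_response` is a hand-written payload that only imitates the shape the function expects; it is not real API output.

```python
def extract_venues(city, response_json):
    """Flatten an explore-style response into (city, name, lat, lng, category) tuples."""
    groups = response_json.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Fall back to None when a venue carries no category
        category = categories[0]["name"] if categories else None
        rows.append((
            city,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows

# Hypothetical payload in the nested shape that getNearbyVenues indexes into
mock_response = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Caribbean Sunshine Bakery",
                   "location": {"lat": 28.550942, "lng": -81.412459},
                   "categories": [{"name": "Caribbean Restaurant"}]}},
        {"venue": {"name": "Uncategorized Venue",  # made-up entry without a category
                   "location": {"lat": 28.55, "lng": -81.41},
                   "categories": []}},
    ]}]}
}

rows = extract_venues("Orlando", mock_response)
empty = extract_venues("Orlando", {"response": {}})
print(rows)
```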

In [21]:
#Getting venues for both cities in our dataset
venues = getNearbyVenues(names=geodf['City'],
                                latitudes=geodf['Latitude'],
                                longitudes=geodf['Longitude'])

Orlando
Jacksonville


In [22]:
#Viewing data
venues.head()

| | City | City Latitude | City Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Orlando | 28.547979 | -81.412784 | Caribbean Sunshine Bakery | 28.550942 | -81.412459 | Caribbean Restaurant |
| 1 | Orlando | 28.547979 | -81.412784 | Orange County Sheriff's Office | 28.551151 | -81.411204 | Office |
| 2 | Orlando | 28.547979 | -81.412784 | BP | 28.546179 | -81.413117 | Gas Station |
| 3 | Orlando | 28.547979 | -81.412784 | 7-Eleven | 28.552263 | -81.414344 | Convenience Store |
| 4 | Orlando | 28.547979 | -81.412784 | Shell | 28.5491 | -81.413107 | Gas Station |


In [23]:
#Grouping data by city
venuecount=venues.groupby('City').count()
venuecount["Venue"].sort_values(ascending=False) #Ordering cities from most to fewest venues

City
Jacksonville    21
Orlando          9
Name: Venue, dtype: int64

Using the Foursquare API, we see that more venues were returned for Jacksonville, FL (21) than for Orlando, FL (9). Each query is capped at 100 venues and covers only a 500-meter radius around the city's geocoded center point, so these counts describe the immediate surroundings of that point rather than the entire city. The information is still noteworthy to investors.

## Results
In this analysis we focused on three specific factors to help advise investors on which Sunshine State city, Orlando, FL or Jacksonville, FL, they should open their business in.
1. Population size: Jacksonville, FL has a population of 926,315, more than triple Orlando's population of 291,800.
2. Crime statistics: Orlando, FL has a crime rate of 5,571.3 per 100,000 population, compared to 3,883.6 per 100,000 population in Jacksonville, FL.
3. Other venues in the area: Using the Foursquare API, we saw that Jacksonville, FL had a larger number of venues (21) than Orlando, FL (9).

## Discussion
Based on the results observed during this analysis, I would advise the investors to open their business in Jacksonville, FL. Jacksonville's larger population gives the investors a wider range of potential customers, which will be key in ensuring the success of the business. Jacksonville also has a substantially lower crime rate per 100,000 population, which is very important to consider when deciding to invest money in a city. The larger number of venues in Jacksonville might have been a deterrent, since those venues are potential competition. However, Jacksonville's population is about three times that of Orlando, so Orlando still has the larger number of venues per person.
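
The per-person comparison can be made explicit by normalizing the venue counts by population, in the same way the crime figures are normalized. This is a small illustrative calculation using the counts and populations reported in this study; the `cities` dictionary and the per-100,000 scaling are choices made for this sketch.

```python
# Venue counts from the Foursquare query and populations from the FDLE data
cities = {
    "Jacksonville": {"population": 926315, "venues": 21},
    "Orlando": {"population": 291800, "venues": 9},
}

venues_per_100k = {
    city: data["venues"] / data["population"] * 100_000
    for city, data in cities.items()
}

for city, rate in venues_per_100k.items():
    print(f"{city}: {rate:.2f} venues per 100,000 residents")
# Jacksonville: 2.27 venues per 100,000 residents
# Orlando: 3.08 venues per 100,000 residents
```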

## Conclusion 
In this study, I analyzed two cities on factors that matter when approaching a business problem. I used various data analysis and visualization tools, such as pandas, a geocoder, Folium, and the Foursquare API. While the scope of my analysis of both cities has been limited, I believe that, based on the analysis conducted, Jacksonville, FL is the right choice to invest in.