# Applied Data Science Capstone

## Introduction

In this scenario, I am looking for the ideal Atlanta neighborhood for a late-night french fry boutique. Even in Atlanta, there are few late-night food options, so there is potential demand. This hypothetical boutique would sell items priced at roughly $5 to $15 and, to keep the business well defined, would focus on snacks and comfort food rather than entrees.

To guide the inquiry, we start with a few hypotheses and test them as well as the available data allows.

### Positive Influencers
* Proximity to areas with younger residents, high schools, and colleges
* Affluent area
* Local bars and late-night attractions
* Lots of pedestrians
    * Area should be known as generally safe
    * Pedestrians new to the store may be more willing to stop in and grab a snack  
    
### Negative Influencers  

* Fast food chains
    * We may lose potential new customers if they compare our fries 'with the works' to a chain's $0.99 box of fries

## Data

We will use several datasets to evaluate the hypotheses above and answer our ultimate question: which neighborhood suits the shop best.

* [GeoJSON](https://github.com/blackmad/neighborhoods) to map the neighborhood borders in Folium
* Various statistics about the neighborhoods
    * I have already scraped the data from [City-Data](http://www.city-data.com/nbmaps/neigh-Atlanta-Georgia.html) and saved it to a CSV file, `atl_neighborhoods.csv`
* Data gathered from Foursquare
    * Discover information about potential competitors to confirm, deny, or modify our hypotheses
    * Determine companion shops via clustering to generate viable neighborhoods
* [Crime data](http://www.atlantapd.org/i-want-to/crime-data-downloads) to evaluate safety

In [1]:
import folium
import geopy
import pandas as pd
import numpy as np
from folium.plugins import FastMarkerCluster

In [2]:
# The first CSV column ('Unnamed: 0') is a saved index; read it as the index, then replace it with a fresh 0..n-1 range
atl_df = pd.read_csv('atl_neighborhoods.csv', index_col=0).reset_index(drop=True)
# 2019 crime reports
crime_df = pd.read_csv('COBRA-2019.csv')

In [3]:
atl_df.head()

| | Neighborhood | Area | Population | Median Income | Median Rent | Num Males | Num Females | Median Age Male | Median Age Female |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 26th Street/Ardmore | 0.07 | 341 | 66548.0 | 1113.0 | 170 | 170 | 31.5 | 29.3 |
| 1 | Aberdeen Forest | 0.322 | 705 | 82962.0 | 1259.0 | 274 | 430 | 36.3 | 37.4 |
| 2 | Adair Park | 0.375 | 1645 | 24874.0 | 626.0 | 760 | 885 | 34.2 | 39.7 |
| 3 | Adamsville | 0.995 | 3734 | 24372.0 | 593.0 | 1407 | 2326 | 20.6 | 33.2 |
| 4 | Amberidge | 0.177 | 313 | 137237.0 | 2581.0 | 147 | 166 | 52.1 | 49.0 |


In [5]:
atl_df.dtypes

Neighborhood          object
Area                 float64
Population             int64
Median Income        float64
Median Rent          float64
Num Males              int64
Num Females            int64
Median Age Male      float64
Median Age Female    float64
dtype: object
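
The "affluent" and "younger" hypotheses can be screened directly from these columns. The sketch below is only an illustration: it rebuilds the five rows shown by `atl_df.head()`, and the `Density` and `Median Age` columns, the `MAX_AGE` cutoff, and the "at or above the median income" rule are assumptions introduced here, not part of the original analysis. The unit of `Area` is not stated in the source data, so density is only meaningful relative to other neighborhoods.

```python
import pandas as pd

# The five rows displayed by atl_df.head() above (subset of columns).
sample = pd.DataFrame({
    'Neighborhood': ['26th Street/Ardmore', 'Aberdeen Forest', 'Adair Park',
                     'Adamsville', 'Amberidge'],
    'Area': [0.07, 0.322, 0.375, 0.995, 0.177],
    'Population': [341, 705, 1645, 3734, 313],
    'Median Income': [66548.0, 82962.0, 24874.0, 24372.0, 137237.0],
    'Median Age Male': [31.5, 36.3, 34.2, 20.6, 52.1],
    'Median Age Female': [29.3, 37.4, 39.7, 33.2, 49.0],
})

# Residents per unit of area (hypothetical helper column).
sample['Density'] = sample['Population'] / sample['Area']

# Unweighted average of the two median ages as a rough "youth" proxy.
sample['Median Age'] = sample[['Median Age Male', 'Median Age Female']].mean(axis=1)

# Hypothetical thresholds: income at or above the sample median, median age under 35.
MAX_AGE = 35
affluent = sample['Median Income'] >= sample['Median Income'].median()
young = sample['Median Age'] < MAX_AGE

candidates = sample[affluent & young]
print(candidates[['Neighborhood', 'Median Income', 'Median Age', 'Density']])
```

On the full dataset the same filter would run on `atl_df` itself; the thresholds would need tuning once the distribution of incomes and ages across all neighborhoods is known.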

In [4]:
crime_df.head()

| | Report Number | Report Date | Occur Date | Occur Time | Possible Date | Possible Time | Beat | Apartment Office Prefix | Apartment Number | Location | Shift Occurrence | Location Type | UCR Literal | UCR # | IBR Code | Neighborhood | NPU | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 190010138 | 2019-01-01 | 2019-01-01 | 20 | 2019-01-01 | 25 | 511.0 | | | 50 UPPER ALABAMA ST SW | Morning Watch | 13.0 | LARCENY-NON VEHICLE | 620 | 2302 | Downtown | M | 33.75194 | -84.38964 |
| 1 | 190010299 | 2019-01-01 | 2019-01-01 | 120 | 2019-01-01 | 130 | 511.0 | | | 20 BROAD ST | Morning Watch | | LARCENY-NON VEHICLE | 620 | 2302 | Downtown | M | 33.75312 | -84.39208 |
| 2 | 190011858 | 2019-01-01 | 2019-01-01 | 1740 | 2019-01-01 | 1750 | 411.0 | | A15 | 3000 CONTINENTAL COLONY PKWY SW | Evening Watch | 26.0 | LARCENY-NON VEHICLE | 620 | 2302 | Greenbriar | R | 33.68077 | -84.4937 |
| 3 | 190010845 | 2019-01-01 | 2019-01-01 | 415 | 2019-01-01 | 420 | 607.0 | | | 1362 BOULEVARD SE | Morning Watch | 23.0 | LARCENY-NON VEHICLE | 630 | 2303 | Benteen Park | W | 33.71744 | -84.36818 |
| 4 | 190011541 | 2019-01-01 | 2019-01-01 | 1400 | 2019-01-01 | 1430 | 210.0 | | | 3393 PEACHTREE RD NE @LENOX MALL | Evening Watch | 8.0 | LARCENY-NON VEHICLE | 630 | 2303 | Lenox | B | 33.84676 | -84.36212 |


In [15]:
crime_df.dtypes

Report Number                int64
Report Date                 object
Occur Date                  object
Occur Time                  object
Possible Date               object
Possible Time                int64
Beat                       float64
Apartment Office Prefix     object
Apartment Number            object
Location                    object
Shift Occurrence            object
Location Type               object
UCR Literal                 object
UCR #                        int64
IBR Code                    object
Neighborhood                object
NPU                         object
Latitude                   float64
Longitude                  float64
dtype: object
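
Because `Occur Time` is stored as `object`, it has to be parsed before crimes can be filtered to late-night hours. The sketch below assumes the values are HHMM-encoded (so `20` means 00:20 and `1740` means 17:40), which is inferred from the sample rows and their shift labels rather than from any documentation. The helper names `occur_hour` and `is_late_night` and the 22:00 to 04:00 window are hypothetical choices made for illustration.

```python
def occur_hour(value):
    """Return the hour (0-23) of an HHMM value such as 20, '0120', or 1740.

    Returns None when the value cannot be read as a valid time.
    """
    try:
        hhmm = int(str(value).strip())
    except ValueError:
        return None
    hour, minute = divmod(hhmm, 100)
    if 0 <= hour <= 23 and 0 <= minute <= 59:
        return hour
    return None


def is_late_night(value, start=22, end=4):
    """True when the time falls in [start, end), wrapping past midnight if start > end."""
    hour = occur_hour(value)
    if hour is None:
        return False
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


# 'Occur Time' values from the five crime_df.head() rows above.
sample_times = [20, 120, 1740, 415, 1400]
print([is_late_night(t) for t in sample_times])
# [True, True, False, False, False]
```

Applied to the real data, this could look like `crime_df['Occur Time'].map(is_late_night)`, which yields a boolean mask for selecting late-night incidents; unparseable entries are treated as not late-night.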

In [14]:
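# Base map centered on Atlanta
atl_map = folium.Map(location=[33.7176502, -84.3601671], zoom_start=12)

# Overlay the neighborhood boundaries from the GeoJSON file
neighborhoods = folium.GeoJson('atlanta.json', name='Neighborhood Boundaries')
atl_map.add_child(neighborhoods)

# Cluster the crime locations, skipping rows without coordinates
crime_layer = folium.FeatureGroup(name='Crimes')
crime_coords = crime_df[['Latitude', 'Longitude']].dropna().values.tolist()
mc = FastMarkerCluster(crime_coords)
crime_layer.add_child(mc)
atl_map.add_child(crime_layer)

# Add a layer toggle and write the map to an HTML file
atl_map.add_child(folium.LayerControl())
atl_map.save('atl_crime_map.html')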
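
The map gives a visual impression of where crimes cluster; to compare neighborhoods numerically, one option is to count incidents per neighborhood and scale by population. The following is a minimal sketch under stated assumptions: the `Neighborhood` values come from the five `crime_df.head()` rows, the populations are made-up placeholders (not values from `atl_neighborhoods.csv`), and the function name `crimes_per_1000` is hypothetical. It also assumes neighborhood names are spelled identically in both datasets, which would need to be checked on the real data.

```python
import pandas as pd

# 'Neighborhood' column from the five crime_df.head() rows shown earlier.
crime_sample = pd.DataFrame({
    'Neighborhood': ['Downtown', 'Downtown', 'Greenbriar', 'Benteen Park', 'Lenox'],
})

# Placeholder populations for illustration only (NOT taken from the dataset).
pop_sample = pd.DataFrame({
    'Neighborhood': ['Downtown', 'Greenbriar', 'Benteen Park', 'Lenox'],
    'Population': [2000, 1000, 500, 4000],
})


def crimes_per_1000(crimes, neighborhoods):
    """Return one row per neighborhood with its incident count and rate per 1,000 residents."""
    counts = (crimes.groupby('Neighborhood').size()
                    .rename('Crime Count').reset_index())
    # Left join keeps neighborhoods with no recorded incidents (count becomes 0).
    merged = neighborhoods.merge(counts, on='Neighborhood', how='left')
    merged['Crime Count'] = merged['Crime Count'].fillna(0).astype(int)
    merged['Crimes per 1000'] = 1000 * merged['Crime Count'] / merged['Population']
    return merged


rates = crimes_per_1000(crime_sample, pop_sample)
print(rates.sort_values('Crimes per 1000'))
```

With the real data, the call would be `crimes_per_1000(crime_df, atl_df)`. A per-capita rate is only one possible safety measure; it can overstate risk in commercial areas with few residents but many visitors.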