# Data sources to answer: which neighborhood is the most 'Minnesota Nice'?

## The hypothetical data science firm MCG has researched the 'Minnesota Nice' business problem and determined that a variety of data needs to be gathered. In particular, geolocation data from the Foursquare API will be critical.

### A more detailed description of data requirements follows:

* Neighborhood names along with census data from the American Community Survey will be pulled from the Minnesota open data website:
 * https://www.mncompass.org/profiles/neighborhoods/minneapolis-saint-paul#!community-areas 
* Neighborhood names will be associated with central latitude/longitude coordinates using the methods described in the StackOverflow post:
 * https://stackoverflow.com/questions/44616592/search-google-geocoding-api-by-neighborhood
   * This approach searches the Google API for a combined "Neighborhood, City" string and then pulls the lat-long coordinates from the first result.
   
* Foursquare data will be obtained in the same way as in the Toronto neighborhood analysis. We plan to look at restaurants, parks, schools, and spiritual centers.
 * https://developer.foursquare.com/docs/resources/categories 
 
* Walk scores for the neighborhoods will be obtained from the 'Walk Score' API:
 * https://www.walkscore.com/professional/api.php  
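
To make the Walk Score step more concrete, here is a minimal sketch of how a request URL for one neighborhood might be assembled. The endpoint and parameter names (`format`, `address`, `lat`, `lon`, `wsapikey`) are assumptions based on the public Walk Score API documentation linked above and should be checked against it; `build_walkscore_url` is a hypothetical helper, and the key is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names for a Walk Score lookup;
# verify them against the API documentation before use.
WALKSCORE_ENDPOINT = "https://api.walkscore.com/score"

def build_walkscore_url(address, lat, lon, api_key):
    """Return the request URL for one neighborhood's Walk Score lookup."""
    params = {
        "format": "json",
        "address": address,
        "lat": lat,
        "lon": lon,
        "wsapikey": api_key,  # placeholder; a real key is required for live requests
    }
    return WALKSCORE_ENDPOINT + "?" + urlencode(params)

# Example search string and coordinates for a single neighborhood.
url = build_walkscore_url("Beltrami, Minneapolis", 44.994943, -93.2416, "YOUR_KEY")
```

Building the URL separately from sending it keeps the key out of the function body and makes the request easy to inspect before any API quota is spent.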

## First we import a couple of useful packages

In [2]:
import pandas as pd
import numpy as np

## Now we import a couple of .csv files that were pulled from the mncompass.org website. We'll combine them and keep just the neighborhood names.

In [3]:
# The code was removed by Watson Studio for sharing.
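
The loading cell was stripped when the notebook was shared, so the original code is not available. As a rough sketch only, one way to combine two per-city files into a single table of neighborhood names might look like the following. The in-memory CSV text, the sample rows, and the name `example_hoods` are all stand-ins invented for illustration; only the `geography` and `City` column names come from this notebook.

```python
import io
import pandas as pd

# Stand-ins for the two downloaded .csv files (made-up rows, assumed layout).
minneapolis_csv = io.StringIO("geography\nBeltrami\nDowntown East\n")
saint_paul_csv = io.StringIO("geography\nComo\nHighland\n")

frames = []
for source, city in [(minneapolis_csv, "Minneapolis"), (saint_paul_csv, "Saint Paul")]:
    # Keep only the neighborhood name column and tag each row with its city.
    df = pd.read_csv(source, usecols=["geography"])
    df["City"] = city
    frames.append(df)

# Stack the per-city tables into one frame with a fresh 0..n-1 index.
example_hoods = pd.concat(frames, ignore_index=True)
```

With real files, the `io.StringIO` objects would be replaced by the paths of the downloaded .csv files.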

### Just a quick sanity check on the import

In [4]:
TwinCityHoods.head()

Unnamed: 0,geography,City
0,Mid-City Industrial,Minneapolis
1,University of Minnesota,Minneapolis
2,Northeast Park,Minneapolis
3,Beltrami,Minneapolis
4,Downtown East,Minneapolis


In [5]:
TwinCityHoods.shape

(113, 2)

### We know from the website listing that there are 102 neighborhoods, so let's drop any duplicate rows and any rows with missing values.

In [6]:
TwinCityHoods = TwinCityHoods.drop_duplicates().dropna()
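
To illustrate what this cleanup does, here is a small sketch on toy data (the rows and the names `toy`, `cleaned`, `rows_removed`, and `still_duplicated` are invented for the example). It drops one repeated row and one row with a missing name, then checks that no duplicates remain.

```python
import numpy as np
import pandas as pd

# Toy table: one exact duplicate row and one row with a missing neighborhood name.
toy = pd.DataFrame({
    "geography": ["Beltrami", "Beltrami", "Downtown East", np.nan],
    "City": ["Minneapolis", "Minneapolis", "Minneapolis", "Minneapolis"],
})

# Same chain as above: remove repeated rows, then rows containing NaN.
cleaned = toy.drop_duplicates().dropna()

rows_removed = len(toy) - len(cleaned)
still_duplicated = cleaned.duplicated(subset=["geography", "City"]).any()
```

On the real table, comparing `TwinCityHoods.shape[0]` with the expected 102 after this step would confirm whether the cleanup removed exactly the extra rows.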

### Since we need lat-longs, we'll build a list of the neighborhoods to search for with the Google API.

In [7]:
TwinCityHoods['neighborhood'] = TwinCityHoods.geography + ", " + TwinCityHoods.City

In [8]:
TwinCityHoods.head()

Unnamed: 0,geography,City,neighborhood
0,Mid-City Industrial,Minneapolis,"Mid-City Industrial, Minneapolis"
1,University of Minnesota,Minneapolis,"University of Minnesota, Minneapolis"
2,Northeast Park,Minneapolis,"Northeast Park, Minneapolis"
3,Beltrami,Minneapolis,"Beltrami, Minneapolis"
4,Downtown East,Minneapolis,"Downtown East, Minneapolis"


In [84]:
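#NeighborhoodList = list(TwinCityHoods.neighborhood)
# Take an explicit copy so that adding columns later does not trigger
# pandas' SettingWithCopyWarning on a slice of TwinCityHoods.
NeighborhoodList = TwinCityHoods[['neighborhood']].copy()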


In [104]:
import os
import googlemaps

# Read the API key from the environment instead of hardcoding it in the notebook.
# GOOGLE_MAPS_API_KEY is an assumed variable name; set it before running this cell.
gmaps = googlemaps.Client(key=os.environ['GOOGLE_MAPS_API_KEY'])

def geocode_address_lat(loc):
    # Latitude of the first geocoding match for the search string
    geocode_result = gmaps.geocode(loc)
    lat = geocode_result[0]["geometry"]["location"]["lat"]
    return lat

def geocode_address_lon(loc):
    # Longitude of the first geocoding match for the search string
    geocode_result = gmaps.geocode(loc)
    lon = geocode_result[0]["geometry"]["location"]["lng"]
    return lon


In [105]:
NeighborhoodList['latitude'] = NeighborhoodList['neighborhood'].apply(geocode_address_lat)
NeighborhoodList['longitude'] = NeighborhoodList['neighborhood'].apply(geocode_address_lon)


In [106]:
NeighborhoodList.head()

Unnamed: 0,neighborhood,latitude,longitude
0,"Mid-City Industrial, Minneapolis",44.998862,-93.217771
1,"University of Minnesota, Minneapolis",44.97399,-93.227728
2,"Northeast Park, Minneapolis",45.00312,-93.241263
3,"Beltrami, Minneapolis",44.994943,-93.2416
4,"Downtown East, Minneapolis",44.975911,-93.254587
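
The lookup above geocodes each neighborhood twice, once for latitude and once for longitude, and it raises an `IndexError` if a search returns no match. As a possible refinement, the sketch below pulls both coordinates out of a single response and falls back to NaN when the result list is empty. `extract_lat_lng` is a hypothetical helper, and `fake_result` is a hand-written stand-in containing only the fields the functions above index into, not a real API response.

```python
import math

def extract_lat_lng(geocode_result):
    """Return (lat, lng) from a geocoding result list, or (nan, nan) if it is empty."""
    if not geocode_result:
        return (math.nan, math.nan)
    location = geocode_result[0]["geometry"]["location"]
    return (location["lat"], location["lng"])

# Stand-in response with the same nesting used by the lookup functions.
fake_result = [{"geometry": {"location": {"lat": 44.994943, "lng": -93.2416}}}]

lat, lng = extract_lat_lng(fake_result)
missing = extract_lat_lng([])  # no match found for the search string
```

With a live `googlemaps` client, passing the output of its `geocode` call for each search string to this helper would fetch both coordinates in one request per neighborhood, roughly halving the number of API calls.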
