# Peer-graded Assignment: Capstone Project - The Battle of Neighborhoods (Week 1)

# 1) Introduction/Business Problem

This study supports a small group of investors planning to open a new grocery store in Toronto. They are interested in building in an area that has been neglected or underserved and has become a food desert for members of that community. The USDA defines food deserts as parts of the country devoid of fresh fruit, vegetables, and other healthful whole foods, usually found in impoverished areas. This is largely due to a lack of grocery stores, farmers’ markets, and healthy food providers. The analysis will assist in choosing the right location by providing data about the income and population of each neighborhood, along with the number of grocery stores present in these areas. While the endeavor is designed to bring a resource to a forgotten neighborhood, the project must be sustainable for years to come.

# 2) Data
The information needed by the investing group will come from the following sources:

* __[City of Toronto Neighborhood Profiles](https://www.toronto.ca/city-government/data-research-maps/neighbourhoods-communities/neighbourhood-profiles/)__ for providing an overview of the neighborhoods in Toronto
* __[City of Toronto Open Data Catalogue](https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#8c732154-5012-9afe-d0cd-ba3ffc813d5a)__ : __[Downloadable City of Toronto Census CSV File](https://www.toronto.ca/ext/open_data/catalog/data_set_files/2016_neighbourhood_profiles.csv)__ 
* __[Neighborhoods of Toronto shape file](https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#a45bd45a-ede8-730e-1abc-93105b2c439f)__ for mapping
* __[Foursquare API](https://developer.foursquare.com/)__ to collect information on areas lacking adequate access to healthy food sources
* __[American Nutrition Association](http://americannutritionassociation.org/newsletter/usda-defines-food-deserts)__ for defining food deserts

The Census of Population is held across Canada every five years and collects data about age and sex, families and households, language, immigration and internal migration, ethnocultural diversity, Aboriginal peoples, housing, education, income, and labour.  City of Toronto Neighborhood Profiles use this Census data to provide a portrait of the demographic, social and economic characteristics of the people and households in each City of Toronto neighborhood. The profiles present selected highlights from the data, but these accompanying data files provide the full data set assembled for each neighborhood.

To assess the neighborhoods and provide guidance to the investors, we will use the 2016 Census data for Toronto, the shapefile to define our neighborhoods, and the Foursquare API to identify areas lacking adequate access to healthy food sources.
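As a rough illustration of how the Foursquare step could look, the sketch below only builds a request URL for grocery venues near a coordinate pair; it does not call the API. It assumes the legacy v2 `venues/explore` endpoint and its `client_id`, `client_secret`, `v`, `ll`, `query`, `radius`, and `limit` parameters. The helper name `build_explore_url`, the default radius and limit, the version date, and the placeholder credentials are all hypothetical and not taken from this notebook.

```python
from urllib.parse import urlencode

# Assumption: the legacy Foursquare v2 "explore" endpoint. Newer API versions use other URLs.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      query="grocery", radius=1000, limit=50, version="20180605"):
    """Return a request URL for venues matching `query` near (lat, lng).

    Hypothetical helper: the defaults are illustrative, not tuned values.
    """
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,                      # API version date, YYYYMMDD
        "ll": "{},{}".format(lat, lng),    # latitude,longitude
        "query": query,                    # search term for the venue type
        "radius": radius,                  # search radius in metres
        "limit": limit,                    # maximum number of venues returned
    }
    return EXPLORE_URL + "?" + urlencode(params)


# Placeholder credentials and an approximate downtown Toronto coordinate.
url = build_explore_url(43.6532, -79.3832, "YOUR_ID", "YOUR_SECRET")
print(url)
```

A neighborhood that returns few or no grocery venues within the chosen radius would be a candidate for closer inspection, subject to how complete the Foursquare listings are for that area.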

### __Import and install the necessary libraries and tools__

In [1]:
import numpy as np 
import pandas as pd 

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

print('Libraries imported and loaded.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/DSX-Python35

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    openssl-1.0.2r             |       h14c3975_0         3.1 MB  conda-forge
    ca-certificates-2019.3.9   |       hecc5488_0         146 KB  conda-forge
    certifi-2018.8.24          |        py35_1001         139 KB  conda-forge
    geographiclib-1.49         |             py_0          32 KB  conda-forge
    geopy-1.19.0               |             py_0          53 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         3.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.49-py_0         conda-forge
    geopy:           1.19.0-py_0       conda-forge

The following packages will be UPDATED:

   

### __Pull in the data file and create a dataframe__

In [7]:
# Toronto Open Data Catalogue - Neighbourhood Profiles 2016 (CSV)

path = 'https://www.toronto.ca/ext/open_data/catalog/data_set_files/2016_neighbourhood_profiles.csv'
df = pd.read_csv(path,encoding='latin1')
df.head()

Unnamed: 0,Category,Topic,Data Source,Characteristic,City of Toronto,Agincourt North,Agincourt South-Malvern West,Alderwood,Annex,Banbury-Don Mills,...,Willowdale West,Willowridge-Martingrove-Richview,Woburn,Woodbine Corridor,Woodbine-Lumsden,Wychwood,Yonge-Eglinton,Yonge-St.Clair,York University Heights,Yorkdale-Glen Park
0,Neighbourhood Information,Neighbourhood Information,City of Toronto,Neighbourhood Number,,129,128,20,95,42,...,37,7,137,64,60,94,100,97,27,31
1,Neighbourhood Information,Neighbourhood Information,City of Toronto,TSNS2020 Designation,,No Designation,No Designation,No Designation,No Designation,No Designation,...,No Designation,No Designation,NIA,No Designation,No Designation,No Designation,No Designation,No Designation,NIA,Emerging Neighbourhood
2,Population,Population and dwellings,Census Profile 98-316-X2016001,"Population, 2016",2731571,29113,23757,12054,30526,27695,...,16936,22156,53485,12541,7865,14349,11817,12528,27593,14804
3,Population,Population and dwellings,Census Profile 98-316-X2016001,"Population, 2011",2615060,30279,21988,11904,29177,26918,...,15004,21343,53350,11703,7826,13986,10578,11652,27713,14687
4,Population,Population and dwellings,Census Profile 98-316-X2016001,Population Change 2011-2016,4.50%,-3.90%,8.00%,1.30%,4.60%,2.90%,...,12.90%,3.80%,0.30%,7.20%,0.50%,2.60%,11.70%,7.50%,-0.40%,0.80%
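The preview shows that each row is one census characteristic and each column is one neighbourhood. Values can therefore be looked up by the text in the `Characteristic` column instead of by row position. The following is a minimal sketch on a small stand-in frame built from the values shown above; the helper name `lookup` is hypothetical, and the check for exactly one match is there on the assumption that labels in the full file may repeat or carry extra whitespace.

```python
import pandas as pd

# Tiny stand-in for the census frame, using values from the preview above.
sample = pd.DataFrame({
    "Characteristic": ["Neighbourhood Number", "Population, 2016", "Population, 2011"],
    "Agincourt North": ["129", "29113", "30279"],
    "Alderwood": ["20", "12054", "11904"],
})


def lookup(frame, characteristic, neighbourhood):
    """Return the value for one characteristic and one neighbourhood, matched by label."""
    matches = frame.loc[frame["Characteristic"] == characteristic, neighbourhood]
    if len(matches) != 1:
        # Fail loudly rather than silently picking one of several rows.
        raise ValueError(
            "expected exactly one row for {!r}, found {}".format(characteristic, len(matches))
        )
    return matches.iloc[0]


print(lookup(sample, "Population, 2016", "Alderwood"))
```

Looking rows up by label makes it easier to notice when a characteristic is missing or duplicated, which hard-coded row numbers would hide.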


### __After reviewing the data, create a list of neighborhoods in Toronto__

In [11]:
neighborhoods = list(df.columns.values)
neighborhoods = neighborhoods[5:]
print(neighborhoods)

['Agincourt North', 'Agincourt South-Malvern West', 'Alderwood', 'Annex', 'Banbury-Don Mills', 'Bathurst Manor', 'Bay Street Corridor', 'Bayview Village', 'Bayview Woods-Steeles', 'Bedford Park-Nortown', 'Beechborough-Greenbrook', 'Bendale', 'Birchcliffe-Cliffside', 'Black Creek', 'Blake-Jones', 'Briar Hill-Belgravia', 'Bridle Path-Sunnybrook-York Mills', 'Broadview North', 'Brookhaven-Amesbury', 'Cabbagetown-South St. James Town', 'Caledonia-Fairbank', 'Casa Loma', 'Centennial Scarborough', 'Church-Yonge Corridor', 'Clairlea-Birchmount', 'Clanton Park', 'Cliffcrest', 'Corso Italia-Davenport', 'Danforth', 'Danforth East York', 'Don Valley Village', 'Dorset Park', 'Dovercourt-Wallace Emerson-Junction', 'Downsview-Roding-CFB', 'Dufferin Grove', 'East End-Danforth', 'Edenbridge-Humber Valley', 'Eglinton East', 'Elms-Old Rexdale', 'Englemount-Lawrence', 'Eringate-Centennial-West Deane', 'Etobicoke West Mall', 'Flemingdon Park', 'Forest Hill North', 'Forest Hill South', 'Glenfield-Jane Heig
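Slicing from position 5 works as long as the descriptive columns stay in the same order. As an alternative sketch, the neighbourhood columns can be selected by excluding the descriptive columns by name. The list `META_COLUMNS` and the helper `neighbourhood_columns` are hypothetical names; the column names themselves are taken from the preview of the census file.

```python
import pandas as pd

# Columns that describe a row rather than a neighbourhood, per the preview of the census file.
META_COLUMNS = ["Category", "Topic", "Data Source", "Characteristic", "City of Toronto"]


def neighbourhood_columns(frame):
    """Return every column name that is not one of the descriptive columns."""
    return [col for col in frame.columns if col not in META_COLUMNS]


# Empty stand-in frame with the same leading columns and three neighbourhoods.
sample = pd.DataFrame(columns=META_COLUMNS + ["Agincourt North", "Alderwood", "Annex"])
print(neighbourhood_columns(sample))
```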

### __Create a dataframe indexed by the neighborhoods of Toronto and populate it with the necessary data__

In [12]:
# Use the lowercase `neighborhoods` list created in the previous cell
toronto_hoods = pd.DataFrame(index=neighborhoods, columns=["population", "population_change", "population_density", "household_size", "after_tax_income"])

# population = Population 2016 per Census Profile 98-316-X2016001
# population_change = Population Change 2011-2016
# population_density = Population density per square kilometre
# household_size = Average household size
# after_tax_income =   After-tax income: Average amount ($)

# The numbers below (2, 4, 11, 74, 2354) are row positions in df
for index in toronto_hoods.index:
    toronto_hoods.at[index, 'population'] = df[index][2]
    toronto_hoods.at[index, 'population_change'] = df[index][4]
    toronto_hoods.at[index, 'population_density'] = df[index][11]
    toronto_hoods.at[index, 'household_size'] = df[index][74]
    toronto_hoods.at[index, 'after_tax_income'] = df[index][2354]
toronto_hoods.head()

Unnamed: 0,population,population_change,population_density,household_size,after_tax_income
Agincourt North,29113,-3.90%,11305,3.16,26955
Agincourt South-Malvern West,23757,8.00%,9965,2.88,27928
Alderwood,12054,1.30%,5220,2.6,39159
Annex,30526,4.60%,15040,1.8,80138
Banbury-Don Mills,27695,2.90%,10810,2.23,51874
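Before these columns can be compared or ranked, the values need to be numeric. The sketch below assumes the values were copied over as strings, possibly with thousands separators or a trailing percent sign, and converts them on a small stand-in frame built from the rows shown above. The helper name `to_numeric_columns` is hypothetical.

```python
import pandas as pd

# Stand-in for toronto_hoods, with values from the output above stored as strings.
sample = pd.DataFrame(
    {
        "population": ["29113", "23757", "12054"],
        "population_change": ["-3.90%", "8.00%", "1.30%"],
        "household_size": ["3.16", "2.88", "2.6"],
    },
    index=["Agincourt North", "Agincourt South-Malvern West", "Alderwood"],
)


def to_numeric_columns(frame):
    """Return a copy with every column converted to a numeric dtype."""
    cleaned = frame.copy()
    for col in cleaned.columns:
        text = (
            cleaned[col]
            .astype(str)
            .str.replace(",", "", regex=False)  # drop thousands separators, if any
            .str.rstrip("%")                    # drop a trailing percent sign
        )
        # Values that still cannot be parsed become NaN instead of raising.
        cleaned[col] = pd.to_numeric(text, errors="coerce")
    return cleaned


numeric = to_numeric_columns(sample)
print(numeric.dtypes)
```

Percentages stay in percent units after conversion (for example, `-3.9` rather than `-0.039`), so any later calculation should treat them accordingly.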
