# Coursera Data Science Program: Capstone Project (Week 1)
**Author:** *Clinton Johnson*

## Introduction
This section describes the business problem and the stakeholders who would be interested in this project.

* Business Problem
* Stakeholders

### Business Problem
This section gives background on racial inequity and states the question this project seeks to answer.

#### Background
Racial inequities plague communities around the US, affecting the health and well-being of people of color and other racialized groups and ultimately limiting the capacity and progress of the cities, states, and regions where they persist. 

##### What is Racial Inequity?
A racial inequity exists when there are significant disparities in:
* availability of community resources (like access to quality schools), 
* exposure to harmful community conditions (like exposure to toxins in the air), and
* education, health, or economic results (like disparities in educational attainment)

Racial inequities can be identified when race, ethnicity, religion, language, country of origin, or other typically racialized characteristics can be used to predict the availability of community resources, exposure to harmful community conditions, or people's outcomes.

#### Business Question
This research effort seeks to answer the following question:
> Does the proportion of people of any particular race in a Philadelphia neighborhood impact that neighborhood's access to financial services?

### Stakeholders
The following stakeholder groups would be interested in this project:
* Philadelphia Government Leaders and Teams
    * **Commerce Departments** working to assess and increase prosperity for communities in Philadelphia
    * **Health & Human Services Agencies** working to increase economic resources in Philadelphia, recognizing that the social determinants of health suggest a link between health and financial well-being
* **Community-based Organizations (CBOs)** working to increase prosperity in communities in Philadelphia
* **Financial Services Institutions** working to expand into new business areas
* **Philadelphia's Constituents** living, working, pursuing entrepreneurial ventures, playing, visiting, or traveling through Philadelphia

## Data
This section describes the data that will be used to answer the business question and where each dataset comes from.

* Population Demographics by Zipcode by Race
* Financial Services Venues

### Population Demographics by Zipcode by Race

**Source:**
* Organization: **[US Census](https://data.census.gov/)**
* Description: **RACE OF HOUSEHOLDER**
    * Survey/Program: Decennial Census 
    * Universe (*aka scope*): Occupied housing units
    * TableID: H6
    * Product: 2010 (*survey year*): DEC Summary File 1

**Attribute Descriptions:**

|Source Name|Source Description|Notes|
|---|---|---|
|GEO ID|ID|Unique ID for the specific geographic area|
|NAME|Geographic Area Name|Name of the geographic area|
|H006001|Total|Total number of occupied housing units (one householder per unit) in the geographic area|
|H006002|Total!!Householder who is White alone|Total number of householders who identify as White and no other race according to the [US Census Definitions](https://www.census.gov/topics/population/race/about.html)|
|H006003|Total!!Householder who is Black or African American alone|Total number of householders who identify as Black or African American and no other race according to the US Census Definitions|
|H006004|Total!!Householder who is American Indian and Alaska Native alone|Total number of householders who identify as American Indian or Alaska Native and no other race according to the US Census Definitions|
|H006005|Total!!Householder who is Asian alone|Total number of householders who identify as Asian and no other race according to the US Census Definitions|
|H006006|Total!!Householder who is Native Hawaiian and Other Pacific Islander alone|Total number of householders who identify as Native Hawaiian or Other Pacific Islander and no other race according to the US Census Definitions|
|H006007|Total!!Householder who is Some Other Race alone|Total number of householders who do not identify with any of the race categories defined by the US Census Definitions|
|H006008|Total!!Householder who is Two or More Races|Total number of householders who identify with two or more race categories as defined by the US Census Definitions|
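
Because ZIP codes differ in size, the raw H6 counts are easier to compare once each race category is expressed as a share of the total. The following is a minimal sketch, assuming the table has been exported as CSV with the column codes listed above. The helper name `race_shares`, the short labels, and the two sample rows are hypothetical placeholders for illustration; the rows are not Census figures.

```python
import csv
import io

# Column codes from the H6 table above, mapped to short labels (labels are our own).
RACE_COLUMNS = {
    "H006002": "white",
    "H006003": "black",
    "H006004": "american_indian_alaska_native",
    "H006005": "asian",
    "H006006": "native_hawaiian_pacific_islander",
    "H006007": "some_other_race",
    "H006008": "two_or_more_races",
}


def race_shares(row):
    """Return each race category as a fraction of the H006001 total for one row."""
    total = int(row["H006001"])
    if total == 0:
        # Avoid dividing by zero for areas with no occupied housing units.
        return {label: 0.0 for label in RACE_COLUMNS.values()}
    return {label: int(row[code]) / total for code, label in RACE_COLUMNS.items()}


# Placeholder rows for illustration only; NOT real Census values.
SAMPLE_CSV = """NAME,H006001,H006002,H006003,H006004,H006005,H006006,H006007,H006008
ZCTA5 00001,1000,500,300,10,100,0,40,50
ZCTA5 00002,0,0,0,0,0,0,0,0
"""

shares_by_area = {
    row["NAME"]: race_shares(row)
    for row in csv.DictReader(io.StringIO(SAMPLE_CSV))
}
print(shares_by_area["ZCTA5 00001"])
```

Working with shares rather than counts keeps a large ZIP code from dominating comparisons simply because it has more households.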


### Financial Services Venues

**Source:**
* Organization: **[Foursquare API](https://foursquare.com/developers)**
* Description: **Venue Search**
    * Function: Returns a list of venues near the current location, optionally matching a search term. 
    * Base URI: https://api.foursquare.com/v2/venues/explore
    * Version: 20180604

**Parameters Used:**

|Source Name|Source Description|Notes|
|---|---|---|
|near|Required unless `ll` is provided. A string naming a place in the world. If the `near` string is not geocodable, returns a `failed_geocode` error. Otherwise, searches within the bounds of the geocode and adds a geocode object to the response.|Intend to search by zipcode|
|v|Specifies the version of the API to use.| |
|query|A search term to be applied against venue names.|Intend to specify criteria to return financial services venues|
|limit|Number of results to return, up to 50.| |
|radius|Limit results to venues within this many meters of the specified location. Defaults to a city-wide area. Only valid for requests that use `categoryId` or `query`. The maximum supported radius is currently 100,000 meters.| |
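
The parameters above can be combined into a request URL along the following lines. This is a sketch under stated assumptions, not the project's actual request code: the function name `build_venue_url`, the search term `"bank"`, and the `client_id` / `client_secret` credential parameters are assumptions that do not appear in the tables above, and the credentials shown are placeholders. The base URI and version come from the Source section. No network call is made here.

```python
from urllib.parse import urlencode

BASE_URI = "https://api.foursquare.com/v2/venues/explore"  # from the Source section
API_VERSION = "20180604"                                    # from the Source section


def build_venue_url(zipcode, query, client_id, client_secret, limit=50, radius=None):
    """Build a request URL for one zipcode and one search term."""
    params = {
        # Assumption: credentials are passed as query parameters.
        "client_id": client_id,
        "client_secret": client_secret,
        "v": API_VERSION,
        "near": zipcode,          # search by zipcode, as noted in the table
        "query": query,           # search term for financial services venues
        "limit": min(limit, 50),  # the table documents a maximum of 50
    }
    if radius is not None:
        # The table documents a maximum radius of 100,000 meters.
        params["radius"] = min(radius, 100_000)
    return f"{BASE_URI}?{urlencode(params)}"


url = build_venue_url("19104", "bank", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", radius=1500)
print(url)
```

Clamping `limit` and `radius` in the helper keeps every request inside the documented bounds, whichever zipcode or search term is used.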

**Response Descriptions** (used in this project):

|Source Name|Source Description|Notes|
|---|---|---|
|venue -> name|The best known name for this venue.| |
|venue -> categories|An array, possibly empty, of categories that have been applied to this venue. One of the categories will have a primary field indicating that it is the primary category for the venue. For the complete category tree, see categories.| |
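
Since the categories array may be empty and only one entry carries the primary flag, pulling out a single category per venue takes a little care. A minimal sketch follows, assuming each venue is a dictionary with `name` and `categories` keys and each category has a `name` and an optional `primary` field. The helper name `primary_category` and the sample venues are hypothetical; the samples are hand-written and are not actual API output.

```python
def primary_category(venue):
    """Return the name of the venue's primary category, or None if it has none."""
    categories = venue.get("categories", [])
    for category in categories:
        if category.get("primary"):
            return category.get("name")
    # Fall back to the first category when none is flagged as primary.
    return categories[0].get("name") if categories else None


# Hand-written venue objects for illustration; not actual API output.
sample_venues = [
    {"name": "Example Bank", "categories": [{"name": "Bank", "primary": True}]},
    {"name": "Example ATM", "categories": []},
]

rows = [(venue["name"], primary_category(venue)) for venue in sample_venues]
print(rows)
```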

# Methodology
This section describes the exploratory data analysis, any inferential statistical testing, and the machine learning approaches planned for this project.

## Exploratory Data Analysis

* Explore a variety of search terms to retrieve relevant financial services
* Look for patterns in the categories of financial services venues as they relate to race
* Look for patterns that indicate racial segregation between zipcodes
* Look for racial disparities in the number of financial services venues available
* Look for correlations between data attributes, particularly between:
    * race of householder and type and quantity of venues
    * each race category for householders (could indicate segregation)
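
The correlation checks listed above could be computed with a Pearson correlation coefficient, one value per pair of attributes. The sketch below is one possible approach rather than the project's chosen method. The function name `pearson_r` and the variable names are hypothetical, and the numbers are arbitrary placeholders picked to be perfectly linear so the result is easy to check; they are not findings.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Arbitrary placeholder values, one entry per zipcode; NOT real data.
placeholder_share = [0.1, 0.2, 0.3, 0.4]  # e.g. share of householders in one category
placeholder_venue_count = [2, 4, 6, 8]    # e.g. number of venues returned

r = pearson_r(placeholder_share, placeholder_venue_count)
print(round(r, 6))
```

The same function applies to both bullets: pass a race share and a venue count for the first, or two race shares for the second.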

## Machine Learning Approaches

**Predictive Approaches**
   * Use regression to determine whether the type or quantity of financial services can be predicted from the race of householders (e.g., does the percentage or number of Asian householders impact the number of financial institutions?)
   * Use regression to determine whether the percentage of one race category can be predicted from another (e.g., does the percentage or number of Black householders impact the percentage or number of White householders, Asian householders, or householders of other races?)
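
As an illustration of the regression idea, the sketch below fits a one-predictor ordinary least squares line and reports how much of the variation it explains. This is an assumption about how the regression might be set up, not a description of the final model. The function names `fit_line` and `r_squared` are hypothetical, and the values are arbitrary placeholders chosen to lie on an exact line; they are not results.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all y values are identical")
    return 1 - ss_res / ss_tot


# Arbitrary placeholder values, one entry per zipcode; NOT real data.
placeholder_pct = [0, 1, 2, 3]      # predictor, e.g. percentage of householders
placeholder_venues = [1, 3, 5, 7]   # response, e.g. number of venues

slope, intercept = fit_line(placeholder_pct, placeholder_venues)
print(slope, intercept, r_squared(placeholder_pct, placeholder_venues, slope, intercept))
```

A fitted slope describes an association in the data; on its own it does not establish that one variable causes the other.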