# Group Lab 3 – Comparative Urban Change in US States

Authors: Andrew Baker, Jamie Marken, James Lyou, Alberto Melendez, Emmanual Robi

## Part 1

Executive Summary:

The main objective of Lab 3 was to calculate and visualize urban change in WA State from 2008 to 2018. Each census block group was classified as "urban" or "non-urban" based on its population density. Following the US definition, a block group with at least 1,000 people per square mile was classified as urban. By this measure, roughly 71% of census block groups in WA State were urban in 2018. Next, the total land area of urban block groups was divided by the total land area of WA State, which showed that about 2% of the state's land was urbanized in 2018. The 2008 classification of each block group was then compared with its 2018 classification to identify changes. This comparison indicates that WA State is becoming more urban: 65 block groups became urban from 2008 to 2018, while only 2 became de-urbanized. Finally, two interactive maps were created. The first shows which census block groups urbanized, de-urbanized, or experienced no change in classification between 2008 and 2018. The second shows which census block groups were classified as urban or non-urban in 2018.
 
Part 4 compares urbanization trends in Oregon and WA State, which are in the same region and have similar land area, although WA State has about 3 million more people. The methodology for Oregon was the same as for WA State, except that counties were used instead of census block groups. Multnomah County was the only Oregon county classified as urban in both 2012 and 2018. No county changed classification between 2012 and 2018, so there was no change in urban trends in Oregon over that period. In contrast, 65 block groups in WA State became urban from 2008 to 2018. Another difference is that only one county (Multnomah) in Oregon is classified as urban, while roughly 71% of census block groups in WA State are classified as urban. Because the two analyses use different geographic units and time spans, these figures are not directly comparable.


Definitions of urban relative to the United States:

The United States splits "urban area" into two categories. 
Urbanized areas: areas of 50,000 or more people.
Urban clusters: areas of at least 2,500 people and fewer than 50,000 people.

For census block groups, the United States uses a population density of at least 1,000 people per square mile as the threshold for urban.


Definitions of urban relative to Japan and New Zealand:

In Japan, urbanized areas are defined as contiguous areas of densely populated districts. Districts are used as the base units, with a density requirement of 4,000 people per square km, or about 10,400 people per square mile.
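
To compare the Japanese and US thresholds directly, the densities need to be in the same unit. The sketch below is only an illustration of that arithmetic; the function name `per_km2_to_per_sq_mile` is hypothetical and not part of the lab code.

```python
# 1 mile = 1.609344 km, so 1 square mile = 1.609344**2 square km
KM2_PER_SQ_MILE = 1.609344 ** 2


def per_km2_to_per_sq_mile(density_per_km2):
    """Convert a density in people per square km to people per square mile."""
    return density_per_km2 * KM2_PER_SQ_MILE


# Japan's threshold expressed in people per square mile
japan_threshold = per_km2_to_per_sq_mile(4000)

# The US block group threshold expressed in people per square km
us_threshold_km2 = 1000 / KM2_PER_SQ_MILE

print(round(japan_threshold))      # 10360
print(round(us_threshold_km2, 1))  # 386.1
```

In the same unit, the Japanese density requirement is roughly ten times the US one.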

New Zealand has three classes of urban areas.
Main urban areas: the 17 urban areas that have a population of 30,000 or more.
Secondary urban areas: the 14 urban areas that have a population of 10,000 or more but less than 30,000.
Minor urban areas: urban areas that have a population of at least 1,000 but less than 10,000.

<a href="https://en.wikipedia.org/wiki/Urban_area">Link to Urban Area Reference</a>

Issues with US Census Data:

The data used below are Washington State census block group data from Washington State’s Office of Financial Management. One potential issue with these data is that the US Census definitions of urban, non-urban, and suburban can be unclear.

For example, the definition of suburban is non-specific. A suburb is commonly defined as a mixed-use or residential area that is either part of a city/urban area or a separate residential area within commuting distance of one. Definitions like this can be messy because they do not give a suburban area a numerical density requirement.

<a href="https://en.wikipedia.org/wiki/Suburb">Link to Suburban Definition Reference</a>

There are also issues with using US Census data for demographics. Demographic analysis can feasibly produce estimates only at the national level, not at lower geographic levels. It also records only broad racial categories (Black or non-Black). In addition, there is uncertainty in estimating total international migration to the United States, particularly emigration, temporary migration, and unauthorized migration.

<a href="https://www.census.gov/history/pdf/2010-background-crs.pdf">Link to Census Data Demographic Issues Reference</a>

## Part 2

In [2]:
#Import pandas and geopandas
import pandas as pd
import geopandas

In [3]:
#Read the SAEP data
fp = "./saep_bg10/saep_bg10.shp"
data = geopandas.read_file(fp)

In [4]:
#Read the FIPS dbf data
fp2 = "./WashingtonFIPS.dbf"
dbf = geopandas.read_file(fp2)

In [None]:
#Divides the SAEP data by county and writes each county's block groups to a GeoJSON file
for fips_code, county_name in zip(dbf.FIPSCounty, dbf.CountyName):
    subset = data[data['COUNTYFP10'] == fips_code]
    subset.to_file(county_name + '.json', driver='GeoJSON')

In [None]:
#Creates a new dataframe with the total 2017 population of each county
#and prints the 10 most populous counties
county_pop = pd.DataFrame()
county_pop['FIPSCounty'] = dbf.FIPSCounty
county_pop['CountyName'] = dbf.CountyName

#Sums POP2017 over the block groups whose county code matches each FIPS code
county_pop['Population'] = [
    data.loc[data['COUNTYFP10'] == fips_code, 'POP2017'].sum()
    for fips_code in county_pop.FIPSCounty
]

county_pop = county_pop.sort_values('Population', ascending=False)
print(county_pop.head(n=10))
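
The per-county totals can also be computed without an explicit loop. The following is a minimal sketch on made-up numbers, assuming the same column names as the cells above (`COUNTYFP10`, `POP2017`, `FIPSCounty`, `CountyName`); the toy tables `block_groups` and `fips` are hypothetical stand-ins for `data` and `dbf`.

```python
import pandas as pd

# Hypothetical stand-in for the block group table (populations are made up)
block_groups = pd.DataFrame({
    "COUNTYFP10": ["033", "033", "053", "063"],
    "POP2017": [1200, 800, 1500, 900],
})

# Hypothetical stand-in for the FIPS lookup table
fips = pd.DataFrame({
    "FIPSCounty": ["033", "053", "063"],
    "CountyName": ["King", "Pierce", "Spokane"],
})

# Sum population per county code, then rename columns to match the lookup table
totals = (
    block_groups.groupby("COUNTYFP10", as_index=False)["POP2017"].sum()
    .rename(columns={"COUNTYFP10": "FIPSCounty", "POP2017": "Population"})
)

# Attach county names and sort from most to least populous
county_totals = (
    fips.merge(totals, on="FIPSCounty", how="left")
    .sort_values("Population", ascending=False)
)
print(county_totals.head(10))
```

`groupby` does the filtering and summing in one pass, which avoids keeping a manual counter in step with the row order.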

## Part 3

In [None]:
#imports necessary packages
import geopandas as gpd
import numpy as np

In [None]:
fp = "./saep_bg10/saep_bg10.shp"
data = gpd.read_file(fp)

#### Part 3.1

In [None]:
#calculates density to determine if a group is urban or non-urban (1000 people in 1 sq mile is urban)
data["urban2018"] = np.where((data['POP2018'] / data['ALANDMI'])>= 1000, 'urban', 'non-urban')
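
One edge case in the density calculation is a block group with zero land area (for example, one that is entirely water). Dividing a non-zero population by zero gives infinity, which would pass the `>= 1000` test. Whether the SAEP data contains such rows is an assumption here; the sketch below uses made-up values to show one way to guard against it.

```python
import numpy as np
import pandas as pd

# Hypothetical block groups: the last two have no land area
toy = pd.DataFrame({
    "POP2018": [3000, 0, 50],
    "ALANDMI": [1.5, 0.0, 0.0],
})

# Replace zero land area with NaN so the division yields NaN instead of inf
land_area = toy["ALANDMI"].replace(0, np.nan)
toy["density2018"] = toy["POP2018"] / land_area

# NaN >= 1000 evaluates to False, so zero-area rows fall into 'non-urban'
toy["urban2018"] = np.where(toy["density2018"] >= 1000, "urban", "non-urban")
print(toy)
```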

#### Part 3.2

In [None]:
urban_counter = 0
for row in data['urban2018']:
    if row == 'urban':
        urban_counter += 1

In [None]:
#calculates percentage of block groups that are urban in Washington
percent_State_Urban = (float(urban_counter) / len(data)) * 100
print(percent_State_Urban)
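
The count and the percentage can also be taken directly from the label column. This is a small sketch on a made-up series, not the lab data; the name `labels` is hypothetical.

```python
import pandas as pd

# Hypothetical urban/non-urban labels for four block groups
labels = pd.Series(["urban", "non-urban", "urban", "urban"])

# Comparing to 'urban' gives booleans; summing them counts the True values
urban_count = int((labels == "urban").sum())

# Dividing by the number of rows avoids hard-coding the total
percent_urban = 100 * urban_count / len(labels)
print(urban_count, percent_urban)  # 3 75.0
```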

#### Part 3.3

In [None]:
urban_land = data

urban_land = urban_land.drop(urban_land[urban_land["urban2018"] != "urban"].index)

In [None]:
land_sum = data['ALANDMI'].sum()
urban_land_sum = urban_land['ALANDMI'].sum()

In [None]:
percent_urbanized = (urban_land_sum / land_sum) * 100
print(percent_urbanized)
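
The share of land that is urban is a ratio of two sums over the same area column, and a boolean mask can select the urban rows without building a second dataframe. A minimal sketch with made-up areas, assuming the column names used above:

```python
import pandas as pd

# Hypothetical block groups with land area in square miles
toy = pd.DataFrame({
    "ALANDMI": [1.0, 9.0, 10.0],
    "urban2018": ["urban", "non-urban", "non-urban"],
})

# True for rows classified as urban
is_urban = toy["urban2018"] == "urban"

# Urban land area divided by total land area, as a percentage
percent_urban_land = 100 * toy.loc[is_urban, "ALANDMI"].sum() / toy["ALANDMI"].sum()
print(percent_urban_land)  # 5.0
```

In this toy case one of three block groups is urban but it covers only 5% of the land, which mirrors how a large share of block groups can be urban while only a small share of land is.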

#### Part 3.4

In [None]:
#calculates density of population
data["urban2008"] = np.where((data["POP2008"]/data["ALANDMI"]) >= 1000, 'urban', 'non-urban')

In [None]:
#initializes a column and then compares columns to determine if there is a change
data['ClassChange'] = ''
data["ClassChange"] = np.where((data['urban2008'] == data['urban2018']), 'no change in category', data['ClassChange'])

In [None]:
#labels block groups that switched category between 2008 and 2018; the keyword variables hold the two class names
keyword1 = 'urban'
keyword2 = 'non-urban'

data["ClassChange"] = np.where((data['urban2008'] == keyword2) & (data['urban2018'] == keyword1), 'urbanized', data['ClassChange'])
data["ClassChange"] = np.where((data['urban2008'] == keyword1) & (data['urban2018'] == keyword2), 'de-urbanized', data['ClassChange'])
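
The three-way labeling can also be written as a single step with `np.select`, which takes a list of conditions, a matching list of labels, and a default for rows that meet none of the conditions. This is a sketch on made-up rows, using the same column and label names as the cells above.

```python
import numpy as np
import pandas as pd

# Hypothetical classifications covering all four possible combinations
toy = pd.DataFrame({
    "urban2008": ["urban", "non-urban", "urban", "non-urban"],
    "urban2018": ["urban", "urban", "non-urban", "non-urban"],
})

conditions = [
    (toy["urban2008"] == "non-urban") & (toy["urban2018"] == "urban"),
    (toy["urban2008"] == "urban") & (toy["urban2018"] == "non-urban"),
]
choices = ["urbanized", "de-urbanized"]

# Rows matching neither condition kept the same class in both years
toy["ClassChange"] = np.select(conditions, choices, default="no change in category")
print(toy["ClassChange"].value_counts())
```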

In [None]:
data

#### Part 3.5

In [None]:
print(data['ClassChange'].value_counts())

#### Part 3.6

In [None]:
%matplotlib notebook

#removes non-populated blocks (mostly ones that are water)
mapdata = data.drop(data[data['POP2018'] == 0].index)
mapdata.plot(column = "ClassChange",figsize= (10,8), legend = True)

In [None]:
%matplotlib notebook
mapdata.plot(column = "urban2018",figsize= (10,8), legend = True)

## Part 4

In [None]:
#imports necessary packages
import geopandas as gpd
import numpy as np
import csv
import pandas as pd

In [None]:
#imports data
shp = "./Part4/cb_2017_41_bg_500k.shp"
pop = pd.read_csv("./Part4/Population.csv")
data2 = gpd.read_file(shp)


In [None]:
#creates COUNTYFP AND AREA Column
pop['COUNTYFP'] = ["001", "003", "005", "007", "009", "011", "013", "015", "017", "019", "021", "023", "025", "027", "029", "031", "033", "035", "037", "039", "041", "043", "045", "047", "049", "051", "053", "055", "057", "059", "061", "063", "065", "067", "069", "071"]
pop['Area'] = 0

In [None]:
#sums the land area (ALAND, in square meters) of the Oregon block groups in each county
pop['Area'] = [
    data2.loc[data2['COUNTYFP'] == fips_code, 'ALAND'].sum()
    for fips_code in pop.COUNTYFP
]

In [None]:
#converts to INT
pop['Population2012'] = pop['Population2012'].astype(int)
pop['Population2018'] = pop['Population2018'].astype(int)

### Part 4.1

In [None]:
#converts land area from square meters to square miles (1 sq meter = 3.861e-7 sq miles)
pop["Area"] = pop["Area"] * 3.861e-7

In [None]:
pop["urban2018"] = np.where((pop["Population2018"] / (pop["Area"])) >= 1000, 'urban', 'non-urban')
pop

### Part 4.2

In [None]:
urban_count = 0
for row in pop['urban2018']:
    if row == 'urban':
        urban_count += 1

In [None]:
#calculates percentage of counties that are urban in Oregon
percent_State_Urban2 = (float(urban_count) / len(pop)) * 100
print(percent_State_Urban2)

### Part 4.3

In [None]:
urban_land2 = pop

urban_land2 = urban_land2.drop(urban_land2[urban_land2["urban2018"] != "urban"].index)

In [None]:
land_sum2 = pop['Area'].sum()

urban_land_sum2 = urban_land2['Area'].sum()

In [None]:
percent_urbanized2 = (urban_land_sum2 / land_sum2) * 100

print(percent_urbanized2)

### Part 4.4

In [None]:
#calculates density of population
pop["urban2012"] = np.where((pop["Population2012"]/pop["Area"]) >= 1000, 'urban', 'non-urban')

In [None]:
#initializes a column and then compares columns to determine if there is a change
pop['ClassChange'] = ''
pop["ClassChange"] = np.where((pop['urban2012'] == pop['urban2018']), 'no change in category', pop['ClassChange'])

In [None]:
#labels counties that switched category between 2012 and 2018; the keyword variables hold the two class names
keyword1 = 'urban'
keyword2 = 'non-urban'

pop["ClassChange"] = np.where((pop['urban2012'] == keyword2) & (pop['urban2018'] == keyword1), 'urbanized', pop['ClassChange'])
pop["ClassChange"] = np.where((pop['urban2012'] == keyword1) & (pop['urban2018'] == keyword2), 'de-urbanized', pop['ClassChange'])

### Part 4.6

In [None]:
#Merges the county-level results into the shapefile data on the county code
data2 = data2.merge(pop, on='COUNTYFP')
data2
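
Because the shapefile has one row per block group and the population table has one row per county, this is a many-to-one merge. The sketch below shows how pandas can check that assumption and flag unmatched rows; the toy tables and the `GEOID` values are made up for illustration.

```python
import pandas as pd

# Hypothetical block group rows (several can share one county code)
shapes = pd.DataFrame({
    "COUNTYFP": ["001", "001", "003"],
    "GEOID": ["a", "b", "c"],
})

# Hypothetical county-level results (one row per county code)
county_stats = pd.DataFrame({
    "COUNTYFP": ["001", "003", "005"],
    "urban2018": ["non-urban", "non-urban", "non-urban"],
})

merged = shapes.merge(
    county_stats,
    on="COUNTYFP",
    how="left",               # keep every block group row
    validate="many_to_one",   # raise if a county code repeats in county_stats
    indicator=True,           # add a _merge column showing where each row matched
)
print(merged)
```

Rows where `_merge` is `left_only` would be block groups with no county record, which is a quick way to catch mismatched or mistyped county codes.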

In [None]:
#Map 1 of change in category
%matplotlib notebook

#removes non-populated blocks (mostly ones that are water)
mapdata = data2.drop(data2[data2['Population2018'] == 0].index)
mapdata.plot(column = "ClassChange",figsize= (10,8), legend = True)

In [None]:
#Map 2 of urban and non-urban
%matplotlib notebook
mapdata.plot(column = "urban2018",figsize= (10,8), legend = True)