In [None]:
import geopandas as gpd

# <center>__Group Lab 3: Comparative Urban Change in US States__</center>

__Authors:__ Kevin Ho, Carmelita Deleon, Jin Chang, Billy Wang, Alisha Husain <br>
__Date:__ February 20, 2019

## PART 1

### __What makes a state "urban" or "not urban"?__
An urban area is the region that surrounds a city. The term usually refers to a human settlement with high population density and the infrastructure of a built environment [(SOURCE: Wikipedia)](https://en.wikipedia.org/wiki/Urban_area). Urban areas include towns and cities, places where opportunities for education, transportation, business, and social interaction, along with an overall better standard of living, are prevalent [(SOURCE)](http://www.differencebetween.net/miscellaneous/difference-between-urban-and-rural/). They consist of densely developed territory and encompass residential, commercial, and other non-residential urban land uses.

While "urban" areas contain a large amount of human activity, "not urban" areas (commonly referred to as rural areas) have less infrastructure. Rural areas lie outside towns and cities and have low population density and small settlements (villages).

### _Examples of urban areas and rural areas_
- [Los Angeles, CA](https://en.wikipedia.org/wiki/Greater_Los_Angeles): With a population of nearly 18 million, Los Angeles is filled with opportunities and has a high population density. 
- [Leavenworth, WA](https://www.onlyinyourstate.com/washington/delightful-small-towns-rural-wa/): This rural town had a population of 1,979 in 2014. [(SOURCE: City-data)](http://www.city-data.com/city/Leavenworth-Washington.html)

### Overview of the data

### Potential issues with using US Census data (demographic)

## PART 2

In [None]:
# Read the raw bytes of the FIPS table; the context manager closes the file afterwards.
with open("./data/WashingtonFIPS.dbf", "rb") as datafile:
    data = datafile.read()

print(data)

In [None]:
shapefile = gpd.read_file("./data/saep_bg10/saep_bg10.shp")

print(shapefile.head(5))

## PART 3

### __What Makes An Area Urban and Rural__
Our team compared each block group's estimated 2018 total population against a threshold of 2,500. This value determines whether an area is considered urban or rural: according to the [United States Census Bureau](https://www2.census.gov/geo/pdfs/reference/GARM/Ch12GARM.pdf), an area is classified as urban if it has at least 2,500 inhabitants and as rural if it has fewer than 2,500.

In [None]:
# 1
# Categorize each block group as urban or rural from its 2018 population estimate.
is_urban = shapefile['POP2018'] >= 2500
shapefile['block_category'] = is_urban.map({True: "urban", False: "rural"})

print(shapefile['block_category'].head())

### What percentage of the population of the state is urbanized in the most recent year? 

Here, "urbanized" means a population of at least 50,000 inhabitants.
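
The question asks for a share of the population, which differs from a share of block groups or counties. A minimal sketch of that weighting follows, written in plain Python so it runs without the data files. The helper name `urban_population_share` and the county totals are hypothetical and are not part of the lab data.

```python
def urban_population_share(populations, threshold):
    """Return the percentage of the total population that lives in
    units (block groups, counties, ...) at or above the threshold."""
    populations = list(populations)
    total = sum(populations)
    if total == 0:
        return 0.0
    urban = sum(p for p in populations if p >= threshold)
    return 100.0 * urban / total


# Hypothetical county totals, for illustration only.
county_pops = [120000, 30000, 50000]
share = urban_population_share(county_pops, threshold=50000)
print(str(round(share, 2)) + "%")
```

Because the helper accepts any iterable of numbers, it could presumably be applied to a column such as `urbanPop.POP2018`, provided the column has no missing values.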

In [None]:
# 2 Per block group
# Share of block groups classified as urban (a count of block groups, not of residents)
urbanPopCt = sum(shapefile.block_category == "urban")
size = float(len(shapefile))

print(str(round(urbanPopCt / size * 100, 2)) + "%")

In [None]:
# 2 per county 
# Condense the population values to represent the total population for each county
urbanPop = shapefile.loc[:,["COUNTYFP10","POP2018"]]
urbanPop = urbanPop.groupby("COUNTYFP10").sum()

# Count the counties whose total population meets the urbanized threshold
urbanPopCt = sum(urbanPop.POP2018 >= 50000)
size = float(len(urbanPop))

print(str(round(urbanPopCt / size * 100, 2)) + "%")

### What percentage of the land area of the state is urbanized in the most recent year?

To determine what percentage of the land area was urbanized in 2018, we calculate population density in residents per square mile. We then categorize each block group as urban or rural by comparing its density against 1,000 people per square mile, a value defined by the [United States Census Bureau](https://www2.census.gov/geo/pdfs/reference/GARM/Ch12GARM.pdf).
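
One way to turn the density rule into a share of land area is to weight each block group by its area. The sketch below assumes land area in square miles and uses made-up numbers. The helper name `urban_land_share` is hypothetical.

```python
def urban_land_share(populations, areas_sq_mi, density_threshold=1000):
    """Return the percentage of total land area that belongs to units
    whose population density meets the threshold."""
    urban_area = 0.0
    total_area = 0.0
    for pop, area in zip(populations, areas_sq_mi):
        if area <= 0:
            # Skip units without land area to avoid dividing by zero.
            continue
        total_area += area
        if pop / area >= density_threshold:
            urban_area += area
    if total_area == 0:
        return 0.0
    return 100.0 * urban_area / total_area


# Hypothetical block groups: densities are 2000, 50, and 750 per square mile.
pops = [4000, 500, 3000]
areas = [2.0, 10.0, 4.0]
land_share = urban_land_share(pops, areas)
print(str(round(land_share, 2)) + "%")
```

With the lab data, the inputs would presumably be the `POP2018` and `ALANDMI` columns. The result is weighted by area, so a few large rural block groups can pull the percentage down sharply.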

In [None]:
# 3
# Exclude block groups with a land area of 0 (to avoid dividing by zero) or a population of 0
landArea = shapefile[(shapefile.ALANDMI != 0) & (shapefile.POP2018 != 0)]

# Count the block groups whose population density meets the urban threshold
urbanLandCt = sum((landArea.POP2018 / landArea.ALANDMI) >= 1000)
size = float(len(shapefile))

print((urbanLandCt / size * 100))

### How many block groups are urbanized and how many are deurbanized over the previous decade?
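
The category change can be sketched on a small example by comparing each block group's earlier and later population against the same 2,500 threshold. The helper name `count_transitions` and the sample values are hypothetical. The labels match the ones used in this notebook.

```python
from collections import Counter


def count_transitions(pop_before, pop_after, threshold=2500):
    """Count units that crossed the threshold upward (urbanized),
    downward (deurbanized), or stayed in the same category."""
    counts = Counter()
    for before, after in zip(pop_before, pop_after):
        was_urban = before >= threshold
        is_urban = after >= threshold
        if is_urban and not was_urban:
            counts["urbanized"] += 1
        elif was_urban and not is_urban:
            counts["deurbanized"] += 1
        else:
            counts["no change in category"] += 1
    return counts


# Hypothetical populations for four block groups, ten years apart.
transitions = count_transitions([2000, 3000, 2600, 100],
                                [2600, 2400, 2700, 200])
print(transitions["urbanized"], transitions["deurbanized"])
```

With the lab data, the two inputs would presumably be the `POP2008` and `POP2018` columns.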

In [None]:
# 4 
shapefile['class'] = None

# Iterate through the shapefile and determine whether each block group shows a categorical change 
# over the previous decade and categorize them as urbanized, deurbanized, or no change in category.
for index, row in shapefile.iterrows():
    pop08 = row['POP2008']
    category = row['block_category']
    # Determine whether the block group changed category
    if pop08 >= 2500 and category == "rural":
        shapefile.loc[index, 'class'] = "deurbanized"
    elif pop08 < 2500 and category == "urban":
        shapefile.loc[index, 'class'] = "urbanized"
    else:
        shapefile.loc[index, 'class'] = "no change in category"

In [None]:
# 5
urbanizedCt = sum(shapefile['class'] == "urbanized")
deurbanizedCt = sum(shapefile['class'] == "deurbanized")

print(str(urbanizedCt) + " block groups were urbanized and " + str(deurbanizedCt) + " block groups were deurbanized")