## U.S. Census Breakdown

In this activity, you will be provided with a large dataset from the 2019 U.S. Census. Your task is to clean up this dataset and create a new CSV file that is easier to comprehend.

### Instructions

* Create a Python application that reads in the data from the 2019 U.S. Census.

* Then, store the contents of `Place`, `Population`, `Per Capita Income`, and `Poverty Count` in Python lists.

* Then, zip these lists together so that each row of data becomes a single tuple.

* Finally, write the contents of your extracted data into a CSV. Make sure to include the titles of these columns in your CSV.
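The steps above can be sketched on a tiny in-memory sample before touching the real file. The two rows below are invented for illustration (they are not census values) and `io.StringIO` stands in for the input and output files; only the column positions follow the layout given in the hints.

```python
import csv
import io

# Invented sample rows (NOT real census values), laid out like the input file:
# Place, Population, Median Age, Household Income, Per Capita Income,
# Employed Civilians, Unemployed Civilians, People in the Military, Poverty Count
sample = io.StringIO(
    '"Alpha County, Example State",1000,40.0,50000,25000,500,20,5,100\n'
    '"Beta County, Example State",2000,35.0,60000,30000,900,50,10,300\n'
)

place, pop, pcinc, povcount = [], [], [], []
for row in csv.reader(sample):
    place.append(row[0])
    pop.append(row[1])
    pcinc.append(row[4])
    povcount.append(row[8])

# zip() pairs the i-th item of every list, yielding one tuple per row
rows = list(zip(place, pop, pcinc, povcount))

# write the header and the rows to an in-memory "file"
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(("Place", "Population", "Per Capita Income", "Poverty Count"))
writer.writerows(rows)
result = out.getvalue()
print(result)
```

Note that `zip` returns a single-use iterator of tuples rather than one tuple; wrapping it in `list(...)` is only needed if the rows will be read more than once.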

In [8]:
# load dependencies
import csv
import os

# define file path to census data
census_filepath = os.path.join("..", "Resources", "census_starter.csv")

# instantiate empty lists for desired data
place = []
pop = []
pcinc = []
povcount = []

# open file for reading
with open(census_filepath, newline="", encoding="utf8") as census_file:
    
    # create reader object
    filereader = csv.reader(census_file, delimiter=',')
    
    # iterate over all rows in the file and append to each list
    for row in filereader:
        place.append(row[0])
        pop.append(row[1])
        pcinc.append(row[4])
        povcount.append(row[8])
        
# store the contents of all lists in a zip object for writing later
zip_obj = zip(place, pop, pcinc, povcount)

# sanity check: print the first extracted row
print(place[0], pop[0], pcinc[0], povcount[0])

Autauga County, Alabama 55380 29819 8340


In [9]:
# instantiate the header tuple
header = ("Place", "Population", "Per Capita Income", "Poverty Count")

# define a path to the output file
output_filepath = os.path.join("..", "clean_census.csv")

# open file for writing
with open(output_filepath, "w", newline="", encoding="utf8") as outfile:
    
    # create writer object
    writer = csv.writer(outfile)
    
    # write header
    writer.writerow(header)
    
    # write extracted data
    writer.writerows(zip_obj)

#### Bonus

* Find the poverty rate (percentage of population living in poverty). Include this in your final output, converting the rate to a string and including a "%" at the end of the string.

* Parse the string associated with `Place`, separating it into `County` and `State`, so we can store both in separate columns.
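Both bonus steps are small string and arithmetic operations, so they can be tried in isolation first. The helper names `split_place` and `poverty_rate` below are hypothetical, and returning `"N/A"` for a zero population is an assumption about how to label such rows, not something the activity specifies.

```python
def split_place(place):
    """Split 'County, State' at the last ', ' into (county, state)."""
    county, _, state = place.rpartition(", ")
    return county, state


def poverty_rate(poverty_count, population):
    """Return the poverty rate as a percentage string with two decimals."""
    population = float(population)
    if population == 0:
        return "N/A"  # assumption: label for places with no population
    # the ".2%" format multiplies by 100, rounds, and appends "%"
    return f"{float(poverty_count) / population:.2%}"


county, state = split_place("Autauga County, Alabama")
rate = poverty_rate("8340", "55380")
print(county, state, rate)
```

Formatting with `.2%` does the scaling and rounding in one step, which avoids stray digits that can appear when a rounded float is multiplied by 100 afterwards.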

In [29]:
# load dependencies
import csv
import os

# define file path to census data
census_filepath = os.path.join("..", "Resources", "census_starter.csv")

# instantiate empty lists for desired data
county = []
state = []
pop = []
pcinc = []
povcount = []
povpercent = []

# open file for reading
with open(census_filepath, newline="", encoding="utf8") as census_file:
    
    # create reader object
    filereader = csv.reader(census_file, delimiter=',')
    
    # iterate over all rows in the file and append to each list
    for row in filereader:
        
        place = row[0].split(", ")
        county.append(place[0])
        state.append(place[1])
        
        pop.append(row[1])
        pcinc.append(row[4])
        povcount.append(row[8])
        
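        # poverty rate as a percentage string; scale to percent before rounding
        # so floating-point artifacts do not leak into the output
        povpercent.append(str(round(float(row[8]) / float(row[1]) * 100, 2)) + "%")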
        
# store the contents of all lists in a zip object for writing later
zip_obj = zip(county, state, pop, pcinc, povcount, povpercent)

# sanity check: print the first extracted row
print(county[0], state[0], pop[0], pcinc[0], povcount[0], povpercent[0])

Autauga County Alabama 55380 29819 8340 15.06%


In [30]:
# instantiate the header tuple
header = ("County", "State", "Population", "Per Capita Income", "Poverty Count", "Poverty Percentage")

# define a path to the output file
output_filepath = os.path.join("..", "clean_census_addl.csv")

# open file for writing
with open(output_filepath, "w", newline="", encoding="utf8") as outfile:
    
    # create writer object
    writer = csv.writer(outfile)
    
    # write header
    writer.writerow(header)
    
    # write extracted data
    writer.writerows(zip_obj)

#### Hints

* Windows users may get a `UnicodeDecodeError`. To avoid this, pass in `encoding="utf8"` as an additional parameter when reading in the file.

* As with many datasets, the file does not include a header row. Use the following list as a guide to the column order (indexes start at 0): "Place,Population,Median Age,Household Income,Per Capita Income,Employed Civilians,Unemployed Civilians,People in the Military,Poverty Count"
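Both hints can be combined in one short sketch: name the column positions once instead of scattering bare indexes, and open the file with an explicit encoding. The `COLUMNS` and `COL` names are hypothetical, the temporary file stands in for the real data, and the numeric values in the sample row are invented.

```python
import csv
import os
import tempfile

# Column order taken from the hint above
COLUMNS = [
    "Place", "Population", "Median Age", "Household Income",
    "Per Capita Income", "Employed Civilians", "Unemployed Civilians",
    "People in the Military", "Poverty Count",
]
# Map each column name to its index, e.g. COL["Poverty Count"] is 8
COL = {name: index for index, name in enumerate(COLUMNS)}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")

    # Write one sample row with a non-ASCII place name (values are invented)
    with open(path, "w", newline="", encoding="utf8") as f:
        csv.writer(f).writerow(
            ["Doña Ana County, New Mexico", 1000, 40.0, 50000, 25000, 500, 20, 5, 100]
        )

    # Read it back with the same explicit encoding
    with open(path, newline="", encoding="utf8") as f:
        rows = list(csv.reader(f))

first = rows[0]
print(first[COL["Place"]], first[COL["Per Capita Income"]], first[COL["Poverty Count"]])
```

Passing `newline=""` when opening files for the `csv` module follows the module's documented recommendation and prevents blank lines between rows in output written on Windows.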

#### References

Data Source: [U.S. Census API - ACS 5-Year Estimates 2019](https://www.census.gov/data/developers/data-sets/census-microdata-api.ACS_5-Year_PUMS.html)