# Process US Census Population Density Data

This script joins US Census population data onto the processed commuting data, adding the 2019 population and the population density (people per unit area) for each county in the dataset.

Data comes from the US Census Bureau's Population Estimates API (year 2019), available [here](https://api.census.gov/data/2019/pep/population?get=DENSITY,POP&for=county:*).

In [1]:
# change this variable to the table of processed data from 010
DATA_URL = "commuting_flows_processed_70days.csv"

import requests
import pandas as pd

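# fetch density and population for every county (the first row of the response holds the column names)
response = requests.get("https://api.census.gov/data/2019/pep/population?get=DENSITY,POP&for=county:*", timeout=30)
response.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
get_data = response.json()
census_df = pd.DataFrame(get_data[1:], columns=get_data[0])

# load the processed commuting data
commuting_df = pd.read_csv(DATA_URL, index_col=0)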
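
The Census API returns JSON as a list of lists, which is why the cell above uses the first element as the column names and the rest as rows. The sketch below illustrates that shape with a hand-written sample. The values are placeholders rather than real census figures, and the column order (`DENSITY`, `POP`, `state`, `county`) is an assumption based on the query string.

```python
import pandas as pd

# Hypothetical sample shaped like the API response: a header row, then one row per county.
# The numbers are placeholders, not real census figures.
sample_response = [
    ["DENSITY", "POP", "state", "county"],
    ["10.5", "1000", "01", "001"],
    ["20.25", "2000", "01", "003"],
]

sample_df = pd.DataFrame(sample_response[1:], columns=sample_response[0])

# Every field arrives as a string, so convert the measures before doing arithmetic.
sample_df["DENSITY"] = pd.to_numeric(sample_df["DENSITY"])
sample_df["POP"] = pd.to_numeric(sample_df["POP"])

# Rebuild the 5-character county FIPS code from its state and county parts.
sample_df["fips"] = sample_df["state"] + sample_df["county"]
```

Keeping `state` and `county` as strings preserves their leading zeros, which the FIPS lookup relies on.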

After initialization, iterate over the existing rows and add the population density data. Rows whose FIPS code has no match in the census table are reported and left out:

In [2]:
# store updated rows in this list
new_df_rows = []
for index, row in commuting_df.iterrows():
    try:
        # zero-pad the FIPS code to 5 digits (2-digit state + 3-digit county);
        # int() guards against iterrows() having upcast the value to a float
        string_fips = f"{int(row.fips):05d}"

        # .iloc[0] raises IndexError if the FIPS isn't found
        data_row = census_df.loc[(census_df.state == string_fips[0:2]) & (census_df.county == string_fips[2:])].iloc[0]

        # add the data to the existing row
        row["density"] = data_row["DENSITY"]
        row["pop_2019"] = data_row["POP"]

        new_df_rows.append(row)
    except (IndexError, ValueError):
        # no usable FIPS or no census entry for it: report and skip the row
        print(f"no census data for FIPS {row.fips!r}; row skipped")
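
The row-by-row lookup above can also be expressed as a single join. The following is a minimal sketch of that alternative, not the notebook's method: `commuting_example`, `census_example`, the `flow` column, and the `fips_key` helper column are all hypothetical stand-ins with made-up values.

```python
import pandas as pd

# Placeholder stand-ins for commuting_df and census_df (values are made up).
commuting_example = pd.DataFrame({"fips": [1001, 1003, 99999], "flow": [5, 7, 9]})
census_example = pd.DataFrame(
    [["10.5", "1000", "01", "001"], ["20.25", "2000", "01", "003"]],
    columns=["DENSITY", "POP", "state", "county"],
)

# Build a matching 5-character key on both sides.
commuting_example["fips_key"] = (
    commuting_example["fips"].astype(int).astype(str).str.zfill(5)
)
census_keyed = census_example.assign(
    fips_key=census_example["state"] + census_example["county"]
).rename(columns={"DENSITY": "density", "POP": "pop_2019"})

# An inner join keeps only counties present in both tables,
# which mirrors the loop skipping rows whose FIPS is not found.
merged = commuting_example.merge(
    census_keyed[["fips_key", "density", "pop_2019"]], on="fips_key", how="inner"
).drop(columns="fips_key")
```

One difference to keep in mind: `merge` returns a fresh integer index, whereas the loop keeps the original row labels from `commuting_df`.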

Finally, collect the updated rows into a dataframe and save it as a CSV file:

In [3]:
new_data_df = pd.DataFrame(new_df_rows)
new_data_df.to_csv(DATA_URL.replace("commuting_flows", "population_density"))
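
As a quick check of the save step, the sketch below derives the output file name the same way and round-trips a tiny placeholder frame through CSV text. It uses an in-memory buffer instead of a real file, and the frame's values and index label are made up for illustration.

```python
import io

import pandas as pd

DATA_URL = "commuting_flows_processed_70days.csv"

# Same substitution as the cell above: only the file-name prefix changes.
output_name = DATA_URL.replace("commuting_flows", "population_density")

# Placeholder row; density and population are strings, as they come from the API.
example = pd.DataFrame(
    {"fips": [1001], "density": ["10.5"], "pop_2019": ["1000"]}, index=[7]
)

# Write to an in-memory buffer, then read it back the way cell 1 reads its input.
buffer = io.StringIO()
example.to_csv(buffer)
buffer.seek(0)
reloaded = pd.read_csv(buffer, index_col=0)
```

Because CSV stores plain text, the string values for `density` and `pop_2019` are parsed back as numbers when the file is reloaded, and `index_col=0` restores the original row labels.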