# Tech Access in the USA 

## Purpose

This Jupyter Notebook reads in Excel files that have been manually downloaded from the U.S. Census website: https://www.census.gov/programs-surveys/household-pulse-survey/data.html 

These files contain data from the Household Pulse Survey, "designed to quickly and efficiently deploy data collected on how people’s lives have been impacted by the COVID-19 pandemic."

Specifically, this notebook reads files from several weeks of data collection on technology access in families with children and extracts the sheets for Washington State and the Seattle metro area. We retain only the rows that break down technology access by household income group and by food sufficiency. For each breakdown, the data from different weeks are appended to a single dataframe, and each dataframe is exported as its own CSV file.

## Set up environment

In [1]:
# import modules 
import pandas as pd # data manipulation
from datetime import datetime # handling dates

## Import data

### Income

In [3]:
# Initialize list of file names 
file_names = ["educ3_050520.xlsx", "educ3_051220.xlsx", "educ3_051920.xlsx", "educ3_052620.xlsx"]

locations = ["WA", "Seattle_Metro_Area"]


# Collect one block of rows per file and location
income_blocks = []

# Loop through files to extract data
for file_name in file_names:
    # Loop through locations to select sheet
    for location in locations:
        # Read sheet from Excel file
        file = pd.read_excel(file_name, sheet_name=location, skiprows=4, na_values='-')
        # Extract income rows (copy so the inserts below leave the sheet untouched)
        income = file.iloc[56:65, 2:14].copy()
        # Add column to specify location
        income.insert(0, "Location", location)
        # Add column to specify date (MMDDYY taken from the file name)
        income.insert(1, "Date", file_name[6:12], allow_duplicates=True)
        # Add column to specify income group
        income.insert(2, "Group", file.iloc[56:65, 0], allow_duplicates=True)
        # Add column to specify total surveyed
        income.insert(3, "Total", file.iloc[56:65, 1], allow_duplicates=True)
        # Keep the block for concatenation
        income_blocks.append(income)

# Combine all blocks (DataFrame.append was removed in pandas 2.0)
income_df = pd.concat(income_blocks)

# Show dataframe
income_df

Unnamed: 0,Location,Date,Group,Total,Device always available for educational purposes,Device usually available for educational purposes,Device sometimes available for educational purposes,Device rarely available for educational purposes,Device never available for educational purposes,Did not report,Internet always available for educational purposes,Internet usually available for educational purposes,Internet sometimes available for educational purposes,Internet rarely available for educational purposes,Internet never available for educational purposes,Did not report.1
56,WA,050520,"Less than $25,000",126801.0,82138.0,23821.0,15521.0,2661.0,2661.0,,72749.0,43280.0,5450.0,2661.0,2661.0,
57,WA,050520,"$25,000 - $34,999",142422.0,91223.0,34196.0,9156.0,7026.0,820.0,,91048.0,30024.0,10628.0,9902.0,820.0,
58,WA,050520,"$35,000 - $49,999",137649.0,104024.0,14927.0,2928.0,5103.0,10666.0,,107465.0,19518.0,,,10666.0,
59,WA,050520,"$50,000 - $74,999",250272.0,152880.0,62590.0,18203.0,16598.0,,,180698.0,42946.0,26627.0,,,
60,WA,050520,"$75,000 - $99,999",125245.0,93336.0,20611.0,4813.0,6485.0,,,79949.0,39671.0,5625.0,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
60,Seattle_Metro_Area,052620,"$75,000 - $99,999",79481.0,69395.0,10086.0,,,,,72598.0,5523.0,1257.0,,102.0,
61,Seattle_Metro_Area,052620,"$100,000 - $149,999",126916.0,118355.0,7226.0,457.0,877.0,,,111580.0,15336.0,,,,
62,Seattle_Metro_Area,052620,"$150,000 - $199,999",63847.0,54419.0,7437.0,1646.0,,,344.0,57682.0,5322.0,842.0,,,
63,Seattle_Metro_Area,052620,"$200,000 and above",140252.0,108925.0,29637.0,1008.0,682.0,,,115003.0,24561.0,688.0,,,
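
The slice `56:65` assumes the income rows sit at the same positions in every sheet and every week. As a minimal sketch of an alternative, rows can be selected by matching the labels in the first column. The helper name `rows_by_label`, the toy sheet, and its column names are all assumptions made for illustration, not part of the Census files.

```python
import pandas as pd

# Toy stand-in for one sheet (hypothetical; real sheets have many more rows and columns)
sheet = pd.DataFrame({
    "Select characteristics": [
        "Total",
        "Household income",
        "Less than $25,000",
        "$25,000 - $34,999",
        "Did not report",
    ],
    "Total": [100.0, None, 40.0, 60.0, None],
})

def rows_by_label(df, labels):
    """Return the rows whose first-column label is in `labels` (hypothetical helper)."""
    first_col = df.iloc[:, 0].astype(str).str.strip()
    return df[first_col.isin(labels)]

income_labels = ["Less than $25,000", "$25,000 - $34,999"]
income_rows = rows_by_label(sheet, income_labels)
```

Label matching has its own limits: a label such as "Did not report" can appear under more than one heading, so it would still need positional context to disambiguate.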


### Food security

In [16]:
# Collect one block of rows per file and location
food_blocks = []

# Loop through files to extract data
for file_name in file_names:
    # Loop through locations to select sheet
    for location in locations:
        # Read sheet from Excel file
        file = pd.read_excel(file_name, sheet_name=location, skiprows=4, na_values='-')
        # Extract food sufficiency rows (copy so the inserts below leave the sheet untouched)
        food = file.iloc[50:55, 2:14].copy()
        # Add column to specify location
        food.insert(0, "Location", location)
        # Add column to specify date (MMDDYY taken from the file name)
        food.insert(1, "Date", file_name[6:12], allow_duplicates=True)
        # Add column to specify food sufficiency group
        food.insert(2, "Group", file.iloc[50:55, 0], allow_duplicates=True)
        # Add column to specify total surveyed
        food.insert(3, "Total", file.iloc[50:55, 1], allow_duplicates=True)
        # Keep the block for concatenation
        food_blocks.append(food)

# Combine all blocks (DataFrame.append was removed in pandas 2.0)
food_df = pd.concat(food_blocks)

# Show dataframe
food_df

Unnamed: 0,Location,Date,Group,Total,Device always available for educational purposes,Device usually available for educational purposes,Device sometimes available for educational purposes,Device rarely available for educational purposes,Device never available for educational purposes,Did not report,Internet always available for educational purposes,Internet usually available for educational purposes,Internet sometimes available for educational purposes,Internet rarely available for educational purposes,Internet never available for educational purposes,Did not report.1
50,WA,050520,Enough of the types of food wanted,1044897.0,816638.0,168218.0,29909.0,10885.0,9657.0,9590.0,841228.0,148759.0,31229.0,2661.0,9657.0,11364.0
51,WA,050520,"Enough food, but not always the types wanted",248673.0,136434.0,57694.0,38066.0,5865.0,10615.0,,143298.0,82212.0,8851.0,3698.0,10615.0,
52,WA,050520,Sometimes not enough to eat,92317.0,44759.0,19709.0,57.0,25079.0,2713.0,,46648.0,17878.0,16598.0,6205.0,4990.0,
53,WA,050520,Often not enough to eat,2462.0,2462.0,,,,,,241.0,2221.0,,,,
54,WA,050520,Did not report,,,,,,,,,,,,,
50,Seattle_Metro_Area,050520,Enough of the types of food wanted,590820.0,482647.0,82515.0,17460.0,6782.0,1416.0,,489087.0,89364.0,10953.0,,1416.0,
51,Seattle_Metro_Area,050520,"Enough food, but not always the types wanted",99431.0,63980.0,22930.0,11699.0,822.0,,,72007.0,16085.0,7641.0,3698.0,,
52,Seattle_Metro_Area,050520,Sometimes not enough to eat,33153.0,14354.0,7547.0,57.0,8482.0,2713.0,,16243.0,5716.0,,6205.0,4990.0,
53,Seattle_Metro_Area,050520,Often not enough to eat,2221.0,2221.0,,,,,,,2221.0,,,,
54,Seattle_Metro_Area,050520,Did not report,,,,,,,,,,,,,
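
The income and food security cells differ only in the row range they slice. One way to avoid the duplication, sketched here under assumptions, is a small helper that takes the row range as a parameter. The function name `extract_block`, its parameters, and the toy sheet are hypothetical and exist only to make the example self-contained.

```python
import pandas as pd

def extract_block(sheet, rows, location, date, cols=slice(2, 14)):
    """Slice one block of rows from a sheet and label it (hypothetical helper)."""
    block = sheet.iloc[rows, cols].copy()
    block.insert(0, "Location", location)
    block.insert(1, "Date", date)
    # Group labels and totals come from the first two columns of the same rows
    block.insert(2, "Group", sheet.iloc[rows, 0])
    block.insert(3, "Total", sheet.iloc[rows, 1])
    return block

# Toy sheet standing in for one Excel sheet (assumed layout: label, total, measures)
toy = pd.DataFrame({
    "Characteristic": ["a", "b", "c", "d"],
    "Total": [10.0, 20.0, 30.0, 40.0],
    "Device always": [1.0, 2.0, 3.0, 4.0],
})

food_like = extract_block(toy, slice(1, 3), "WA", "050520")
```

With such a helper, the two loops would reduce to calls with `slice(56, 65)` for income and `slice(50, 55)` for food sufficiency, matching the ranges used in the cells above.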


In [17]:
# Convert date strings (MMDDYY, taken from the file names) to datetime
income_df["Date"] = pd.to_datetime(income_df["Date"], format = '%m%d%y')
food_df["Date"] = pd.to_datetime(food_df["Date"], format = '%m%d%y')
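
As a quick check of the date handling, the sketch below follows one file name from the list above through the same slice and format string. The variable names are illustrative only. Note that the leading zero in the six-character token matters: `%m%d%y` reads the string as month, day, and two-digit year.

```python
import pandas as pd

file_name = "educ3_050520.xlsx"

# Characters 6 through 11 hold the MMDDYY token
date_token = file_name[6:12]

# Parse the token the same way the notebook does
parsed = pd.to_datetime(date_token, format="%m%d%y")
print(date_token, parsed.date())
```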

In [18]:
# Save to CSV 
income_df.to_csv('educ3_income_multitime.csv', index=False)
food_df.to_csv('educ3_food_multitime.csv', index=False)
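
When the CSV files are read back, the `Date` column arrives as text unless it is parsed explicitly. The sketch below round-trips a single row through an in-memory buffer (an assumption made so the example runs without the exported files) and derives a share column. The column name `Share always` is hypothetical, and using `Total` as the denominator is an assumption about how the table should be read.

```python
import io
import pandas as pd

# One row copied from the income output above (WA, week of 050520)
df = pd.DataFrame({
    "Location": ["WA"],
    "Date": pd.to_datetime(["050520"], format="%m%d%y"),
    "Group": ["Less than $25,000"],
    "Total": [126801.0],
    "Device always available for educational purposes": [82138.0],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates restores the datetime dtype that plain read_csv would lose
restored = pd.read_csv(buffer, parse_dates=["Date"])

# Share of the group total reporting a device is always available
restored["Share always"] = (
    restored["Device always available for educational purposes"] / restored["Total"]
)
```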