# Project 3

American states that have experienced the highest income growth in the past decade are those where technology-related occupations have grown the most as a share of the economy.

Datasets Used:
1. American Community Survey (ACS) Data: Selected Economic Characteristics (2012 and 2022)
   - URL: [https://data.census.gov/table/ACSDP5Y2022.DP03?t=Industry&g=010XX00US$0400000&y=2022]
   
Website link: [add here]

In [1]:
import pandas as pd
import plotly.express as px
import plotly.io as pio

pio.renderers.default = "vscode+jupyterlab+notebook_connected"

### Step 1

Reading in the datasets and previewing them to understand the format and structure.

In [2]:
acs_2012 = pd.read_csv('ACSDP5Y2012.DP03-Data.csv')
acs_2022 = pd.read_csv('ACSDP5Y2022.DP03-Data.csv')

In [3]:
#Displaying a preview of each dataset
acs_2012.head()

Unnamed: 0,GEO_ID,NAME,DP03_0001E,DP03_0001M,DP03_0001PE,DP03_0001PM,DP03_0002E,DP03_0002M,DP03_0002PE,DP03_0002PM,...,DP03_0135PM,DP03_0136E,DP03_0136M,DP03_0136PE,DP03_0136PM,DP03_0137E,DP03_0137M,DP03_0137PE,DP03_0137PM,Unnamed: 550
0,Geography,Geographic Area Name,Estimate!!EMPLOYMENT STATUS!!Population 16 yea...,Margin of Error!!EMPLOYMENT STATUS!!Population...,Percent!!EMPLOYMENT STATUS!!Population 16 year...,Percent Margin of Error!!EMPLOYMENT STATUS!!Po...,Estimate!!EMPLOYMENT STATUS!!In labor force,Margin of Error!!EMPLOYMENT STATUS!!In labor f...,Percent!!EMPLOYMENT STATUS!!In labor force,Percent Margin of Error!!EMPLOYMENT STATUS!!In...,...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WH...,Margin of Error!!PERCENTAGE OF FAMILIES AND PE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WH...,Margin of Error!!PERCENTAGE OF FAMILIES AND PE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,
1,0400000US01,Alabama,3779457,1500,3779457,(X),2265008,5900,59.9,0.2,...,0.2,(X),(X),15.6,0.3,(X),(X),30.5,0.4,
2,0400000US02,Alaska,545497,570,545497,(X),389890,1910,71.5,0.3,...,0.5,(X),(X),7.4,0.4,(X),(X),18.1,0.7,
3,0400000US04,Arizona,4967615,1895,4967615,(X),3049419,8211,61.4,0.2,...,0.2,(X),(X),15.1,0.3,(X),(X),26.0,0.4,
4,0400000US05,Arkansas,2285328,1147,2285328,(X),1377211,4863,60.3,0.2,...,0.3,(X),(X),16.1,0.3,(X),(X),31.6,0.5,


In [4]:
acs_2022.head()

Unnamed: 0,GEO_ID,NAME,DP03_0001E,DP03_0001M,DP03_0002E,DP03_0002M,DP03_0003E,DP03_0003M,DP03_0004E,DP03_0004M,...,DP03_0133PE,DP03_0133PM,DP03_0134PE,DP03_0134PM,DP03_0135PE,DP03_0135PM,DP03_0136PE,DP03_0136PM,DP03_0137PE,DP03_0137PM
0,Geography,Geographic Area Name,Estimate!!EMPLOYMENT STATUS!!Population 16 yea...,Margin of Error!!EMPLOYMENT STATUS!!Population...,Estimate!!EMPLOYMENT STATUS!!Population 16 yea...,Margin of Error!!EMPLOYMENT STATUS!!Population...,Estimate!!EMPLOYMENT STATUS!!Population 16 yea...,Margin of Error!!EMPLOYMENT STATUS!!Population...,Estimate!!EMPLOYMENT STATUS!!Population 16 yea...,Margin of Error!!EMPLOYMENT STATUS!!Population...,...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...
1,0400000US01,Alabama,4046614,2035,2345086,8674,2329696,8624,2209666,9114,...,13.9,0.2,14.8,0.2,11,0.3,12.5,0.3,29.8,0.4
2,0400000US02,Alaska,573998,602,383078,1898,362197,2166,339162,2259,...,9.7,0.4,10,0.5,7.8,0.7,8,0.5,19.5,0.8
3,0400000US04,Arizona,5764417,1806,3490030,8002,3467247,8044,3281189,9302,...,11.7,0.1,12.4,0.2,9.3,0.2,10.6,0.2,22.7,0.3
4,0400000US05,Arkansas,2402462,1327,1397075,6750,1391084,6730,1319483,6785,...,14.4,0.2,15.5,0.2,10.7,0.3,13.1,0.3,29.3,0.5


### Step 2
Concatenating the datasets to have a combined dataframe which has observations from both 2012 and 2022.

Before doing so, we check whether the two datasets have the same set of columns.

In [5]:
columns_2012 = set(acs_2012.columns)
columns_2022 = set(acs_2022.columns)

if columns_2012 == columns_2022:
    print("The columns in both datasets are identical.")
else:
    print("The columns are not identical")

The columns are not identical
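
A quick way to see where two column sets differ is a pair of set differences. The sketch below uses two small stand-in frames (`df_2012` and `df_2022` are hypothetical, with only a handful of columns) rather than the real ACS files.

```python
import pandas as pd

# Hypothetical stand-ins for acs_2012 and acs_2022 with only a few columns each.
df_2012 = pd.DataFrame(columns=["GEO_ID", "NAME", "DP03_0001E", "DP03_0001PE", "Unnamed: 550"])
df_2022 = pd.DataFrame(columns=["GEO_ID", "NAME", "DP03_0001E", "DP03_0003E"])

# Columns present in one frame but not the other, plus the shared ones.
only_2012 = sorted(set(df_2012.columns) - set(df_2022.columns))
only_2022 = sorted(set(df_2022.columns) - set(df_2012.columns))
shared = sorted(set(df_2012.columns) & set(df_2022.columns))

print("Only in 2012:", only_2012)
print("Only in 2022:", only_2022)
print("Shared:", shared)
```

Running the same three lines on the real frames would list exactly which columns `pd.concat` has to fill with NaN for one of the two years.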


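The check shows that the column sets are not identical (for instance, the trailing `Unnamed: 550` column appears only in the 2012 preview). We can still concatenate the two datasets, i.e. stack one on top of the other, because `pd.concat` aligns on column names and fills any column missing from one dataset with NaN.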

In [6]:
acs_2012["Year"] = 2012
acs_2022["Year"] = 2022

# Drop the descriptive label row (row 0) from the 2022 file; the copy kept in
# the 2012 file later becomes the header of the combined frame.
acs_2022 = acs_2022[1:]

acs_combined = pd.concat([acs_2012,acs_2022], ignore_index=True)
acs_combined

Unnamed: 0,GEO_ID,NAME,DP03_0001E,DP03_0001M,DP03_0001PE,DP03_0001PM,DP03_0002E,DP03_0002M,DP03_0002PE,DP03_0002PM,...,DP03_0136E,DP03_0136M,DP03_0136PE,DP03_0136PM,DP03_0137E,DP03_0137M,DP03_0137PE,DP03_0137PM,Unnamed: 550,Year
0,Geography,Geographic Area Name,Estimate!!EMPLOYMENT STATUS!!Population 16 yea...,Margin of Error!!EMPLOYMENT STATUS!!Population...,Percent!!EMPLOYMENT STATUS!!Population 16 year...,Percent Margin of Error!!EMPLOYMENT STATUS!!Po...,Estimate!!EMPLOYMENT STATUS!!In labor force,Margin of Error!!EMPLOYMENT STATUS!!In labor f...,Percent!!EMPLOYMENT STATUS!!In labor force,Percent Margin of Error!!EMPLOYMENT STATUS!!In...,...,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WH...,Margin of Error!!PERCENTAGE OF FAMILIES AND PE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WH...,Margin of Error!!PERCENTAGE OF FAMILIES AND PE...,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHO...,Percent Margin of Error!!PERCENTAGE OF FAMILIE...,,2012
1,0400000US01,Alabama,3779457,1500,3779457,(X),2265008,5900,59.9,0.2,...,(X),(X),15.6,0.3,(X),(X),30.5,0.4,,2012
2,0400000US02,Alaska,545497,570,545497,(X),389890,1910,71.5,0.3,...,(X),(X),7.4,0.4,(X),(X),18.1,0.7,,2012
3,0400000US04,Arizona,4967615,1895,4967615,(X),3049419,8211,61.4,0.2,...,(X),(X),15.1,0.3,(X),(X),26.0,0.4,,2012
4,0400000US05,Arkansas,2285328,1147,2285328,(X),1377211,4863,60.3,0.2,...,(X),(X),16.1,0.3,(X),(X),31.6,0.5,,2012
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100,0400000US53,Washington,6209213,1750,6209213,(X),4010799,7197,64.6,0.1,...,(X),(X),6.8,0.2,(X),(X),20.7,0.3,,2022
101,0400000US54,West Virginia,1476838,855,1476838,(X),786112,4619,53.2,0.3,...,(X),(X),13,0.5,(X),(X),31.1,0.7,,2022
102,0400000US55,Wisconsin,4764779,1536,4764779,(X),3129606,6425,65.7,0.1,...,(X),(X),7.3,0.1,(X),(X),21.8,0.3,,2022
103,0400000US56,Wyoming,460637,640,460637,(X),302838,1838,65.7,0.4,...,(X),(X),7.4,0.5,(X),(X),22.7,1,,2022


We now have two rows per geography (50 states, the District of Columbia and Puerto Rico), i.e. 104 rows of data in total, excluding the label row.

Let's first make the column names more descriptive by promoting the label row (row 0) to the column header.

In [7]:
# Promote the descriptive label row (row 0) to the column header.
acs_combined.columns = acs_combined.iloc[0]
# Drop the label row; .copy() gives an independent frame, so the rename below
# does not raise a SettingWithCopyWarning.
acs_combined = acs_combined[1:].copy()
acs_combined.columns = acs_combined.columns.astype(str)
# The label row holds 2012 in the Year column, so restore that column's name.
acs_combined = acs_combined.rename(columns={"2012": "Year"})
acs_combined

Unnamed: 0,Geography,Geographic Area Name,Estimate!!EMPLOYMENT STATUS!!Population 16 years and over,Margin of Error!!EMPLOYMENT STATUS!!Population 16 years and over,Percent!!EMPLOYMENT STATUS!!Population 16 years and over,Percent Margin of Error!!EMPLOYMENT STATUS!!Population 16 years and over,Estimate!!EMPLOYMENT STATUS!!In labor force,Margin of Error!!EMPLOYMENT STATUS!!In labor force,Percent!!EMPLOYMENT STATUS!!In labor force,Percent Margin of Error!!EMPLOYMENT STATUS!!In labor force,...,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!People in families,Margin of Error!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!People in families,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!People in families,Percent Margin of Error!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!People in families,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Unrelated individuals 15 years and over,Margin of Error!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Unrelated individuals 15 years and over,Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Unrelated individuals 15 years and over,Percent Margin of Error!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Unrelated individuals 15 years and over,nan,Year
1,0400000US01,Alabama,3779457,1500,3779457,(X),2265008,5900,59.9,0.2,...,(X),(X),15.6,0.3,(X),(X),30.5,0.4,,2012
2,0400000US02,Alaska,545497,570,545497,(X),389890,1910,71.5,0.3,...,(X),(X),7.4,0.4,(X),(X),18.1,0.7,,2012
3,0400000US04,Arizona,4967615,1895,4967615,(X),3049419,8211,61.4,0.2,...,(X),(X),15.1,0.3,(X),(X),26.0,0.4,,2012
4,0400000US05,Arkansas,2285328,1147,2285328,(X),1377211,4863,60.3,0.2,...,(X),(X),16.1,0.3,(X),(X),31.6,0.5,,2012
5,0400000US06,California,29163075,4124,29163075,(X),18821426,20899,64.5,0.1,...,(X),(X),13.0,0.1,(X),(X),25.9,0.2,,2012
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100,0400000US53,Washington,6209213,1750,6209213,(X),4010799,7197,64.6,0.1,...,(X),(X),6.8,0.2,(X),(X),20.7,0.3,,2022
101,0400000US54,West Virginia,1476838,855,1476838,(X),786112,4619,53.2,0.3,...,(X),(X),13,0.5,(X),(X),31.1,0.7,,2022
102,0400000US55,Wisconsin,4764779,1536,4764779,(X),3129606,6425,65.7,0.1,...,(X),(X),7.3,0.1,(X),(X),21.8,0.3,,2022
103,0400000US56,Wyoming,460637,640,460637,(X),302838,1838,65.7,0.4,...,(X),(X),7.4,0.5,(X),(X),22.7,1,,2022
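
An alternative to promoting the label row after concatenation is to separate codes from labels when reading each file. The sketch below does this on a small in-memory CSV shaped like the ACS export (the `raw` text is a stand-in, not the real file): the label row is read once into a code-to-label lookup, and the data is read again with that row skipped so numeric columns are parsed as numbers.

```python
import io
import pandas as pd

# Stand-in for an ACS export: line 0 holds column codes, line 1 holds labels.
raw = io.StringIO(
    "GEO_ID,NAME,DP03_0001E\n"
    "Geography,Geographic Area Name,Estimate!!EMPLOYMENT STATUS!!Population 16 years and over\n"
    "0400000US01,Alabama,3779457\n"
    "0400000US02,Alaska,545497\n"
)

# Read only the label row to build a code -> descriptive label lookup.
labels = pd.read_csv(raw, nrows=1).iloc[0].to_dict()

# Rewind and read the data, skipping the label row (file line index 1).
raw.seek(0)
data = pd.read_csv(raw, skiprows=[1])

# Optionally swap the codes for the descriptive labels.
data_labelled = data.rename(columns=labels)
print(data_labelled.dtypes)
```

With this approach the two years could be concatenated on the short column codes and relabelled afterwards.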


### Step 3: Filtering and Reshaping the Dataset

There are too many columns, which makes the data hard to read and work with. Let's narrow them down by dropping the margin-of-error and percent columns and keeping only the estimate columns.

In [8]:
acs_combined.columns = acs_combined.columns.astype(str)
estimate_columns = [col for col in acs_combined.columns if col.startswith("Estimate")]
required_columns = ["Geography", "Geographic Area Name", "Year"] + estimate_columns
acs_estimates = acs_combined.loc[:,required_columns]
acs_estimates

Unnamed: 0,Geography,Geographic Area Name,Year,Estimate!!EMPLOYMENT STATUS!!Population 16 years and over,Estimate!!EMPLOYMENT STATUS!!In labor force,Estimate!!EMPLOYMENT STATUS!!In labor force.1,Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force,Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force.1,Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force!!Employed,Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force!!Employed.1,...,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!All people,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Under 18 years,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Under 18 years!!Related children under 18 years,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Under 18 years!!Related children under 18 years!!Related children under 5 years,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Under 18 years!!Related children under 18 years!!Related children 5 to 17 years,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!18 years and over,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!18 years and over!!18 to 64 years,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!18 years and over!!65 years and over,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!People in families,Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!Unrelated individuals 15 years and over
1,0400000US01,Alabama,2012,3779457,2265008,1074255,2248665,1071803,2017887,960434,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
2,0400000US02,Alaska,2012,545497,389890,173809,372484,171704,341115,159818,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
3,0400000US04,Arizona,2012,4967615,3049419,1409513,3029669,1406746,2733537,1278068,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
4,0400000US05,Arkansas,2012,2285328,1377211,651709,1370423,650944,1253069,597858,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
5,0400000US06,California,2012,29163075,18821426,8591238,18673806,8574181,16614362,7647135,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100,0400000US53,Washington,2022,6209213,4010799,1823080,3949085,1813408,3752076,1724257,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
101,0400000US54,West Virginia,2022,1476838,786112,367548,783715,367265,736212,347455,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
102,0400000US55,Wisconsin,2022,4764779,3129606,1481660,3125976,1481070,3020890,1435889,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
103,0400000US56,Wyoming,2022,460637,302838,135842,299351,135128,287895,130880,...,(X),(X),(X),(X),(X),(X),(X),(X),(X),(X)
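
Many cells in the preview above hold `(X)`, the Census Bureau's marker for an estimate that is not applicable or not available, and every value was read as text because of the label row. A minimal sketch of one way to clean this, on a toy frame with hypothetical column names `estimate_a` and `estimate_b`, is to coerce to numeric and let `(X)` become NaN:

```python
import pandas as pd

# Toy frame mimicking the ACS export: numbers stored as text, "(X)" for N/A.
raw = pd.DataFrame({
    "Geographic Area Name": ["Alabama", "Alaska"],
    "estimate_a": ["3779457", "545497"],
    "estimate_b": ["(X)", "(X)"],
})

value_cols = ["estimate_a", "estimate_b"]
clean = raw.copy()
# errors="coerce" turns anything non-numeric, such as "(X)", into NaN.
clean[value_cols] = clean[value_cols].apply(pd.to_numeric, errors="coerce")

# Columns that hold no usable estimate at all can then be dropped.
clean = clean.dropna(axis=1, how="all")
print(clean)
```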


In [9]:
pd.set_option('display.max_columns', None)
acs_estimates.columns.tolist()

#pd.reset_option('display.max_columns')

['Geography',
 'Geographic Area Name',
 'Year',
 'Estimate!!EMPLOYMENT STATUS!!Population 16 years and over',
 'Estimate!!EMPLOYMENT STATUS!!In labor force',
 'Estimate!!EMPLOYMENT STATUS!!In labor force',
 'Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force',
 'Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force',
 'Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force!!Employed',
 'Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force!!Employed',
 'Estimate!!EMPLOYMENT STATUS!!In labor force!!Civilian labor force!!Unemployed',
 'Estimate!!EMPLOYMENT STATUS!!In labor force!!Armed Forces',
 'Estimate!!EMPLOYMENT STATUS!!Not in labor force',
 'Estimate!!EMPLOYMENT STATUS!!Civilian labor force',
 'Estimate!!EMPLOYMENT STATUS!!Percent Unemployed',
 'Estimate!!EMPLOYMENT STATUS!!Females 16 years and over',
 'Estimate!!EMPLOYMENT STATUS!!In labor force',
 'Estimate!!EMPLOYMENT STATUS!!In labor force',
 'Estimate!!EMPLOYMENT STATUS!!In lab

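Next, let's filter further to retain only the column of interest: mean household income. The column label comes from the 2012 file, so it reads "IN 2012 INFLATION-ADJUSTED DOLLARS" for both years; the 2022 values are presumably expressed in 2022 dollars, which would make the growth computed below nominal rather than real.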

In [10]:
income_data = acs_estimates[["Geography","Geographic Area Name","Year","Estimate!!INCOME AND BENEFITS (IN 2012 INFLATION-ADJUSTED DOLLARS)!!Mean household income (dollars)"]]
income_data

Unnamed: 0,Geography,Geographic Area Name,Year,Estimate!!INCOME AND BENEFITS (IN 2012 INFLATION-ADJUSTED DOLLARS)!!Mean household income (dollars)
1,0400000US01,Alabama,2012,59273
2,0400000US02,Alaska,2012,86208
3,0400000US04,Arizona,2012,67444
4,0400000US05,Arkansas,2012,55158
5,0400000US06,California,2012,85265
...,...,...,...,...
100,0400000US53,Washington,2022,122880
101,0400000US54,West Virginia,2022,75575
102,0400000US55,Wisconsin,2022,94995
103,0400000US56,Wyoming,2022,94901


Reshaping the data so we have the 2012 and 2022 incomes as columns in order to calculate the % change in income over the decade.

In [11]:
column_name = "Estimate!!INCOME AND BENEFITS (IN 2012 INFLATION-ADJUSTED DOLLARS)!!Mean household income (dollars)"

# Work on an explicit copy so the conversion does not raise a SettingWithCopyWarning.
income_data = income_data.copy()
income_data[column_name] = pd.to_numeric(income_data[column_name])

# One row per geography, one column per year.
income_data = income_data.pivot(index=["Geography","Geographic Area Name"],
    columns="Year",
    values=column_name)

# reset_index() is for display only; income_data keeps its MultiIndex.
income_data.reset_index()


Year,Geography,Geographic Area Name,2012,2022
0,0400000US01,Alabama,59273,82992
1,0400000US02,Alaska,86208,110602
2,0400000US04,Arizona,67444,98569
3,0400000US05,Arkansas,55158,79592
4,0400000US06,California,85265,130718
5,0400000US08,Colorado,77900,117508
6,0400000US09,Connecticut,97051,130601
7,0400000US10,Delaware,77453,104600
8,0400000US11,District of Columbia,99511,150292
9,0400000US12,Florida,66599,96992


### Step 4: Data Analysis

Calculating the % change in mean household income over the past decade to identify the states with the strongest and weakest growth.

In [12]:
income_data["%Change in Income"] = ((income_data[2022]-income_data[2012])/income_data[2012])*100
income_data.sort_values(by="%Change in Income",ascending=False)

Unnamed: 0_level_0,Year,2012,2022,%Change in Income
Geography,Geographic Area Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0400000US53,Washington,77232,122880,59.105034
0400000US41,Oregon,66161,103330,56.179622
0400000US16,Idaho,59974,92780,54.70037
0400000US06,California,85265,130718,53.307922
0400000US49,Utah,73002,111416,52.620476
0400000US30,Montana,59569,90142,51.323675
0400000US11,District of Columbia,99511,150292,51.030539
0400000US08,Colorado,77900,117508,50.844673
0400000US25,Massachusetts,89965,134568,49.578169
0400000US13,Georgia,67659,99345,46.831907
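
As a check on the arithmetic, the percentage change is (income in 2022 − income in 2012) / income in 2012 × 100. The short sketch below recomputes it for the top three rows, using the 2012 and 2022 values copied from the table above; the frame name `income` is only for illustration.

```python
import pandas as pd

# Mean household income values copied from the table above.
income = pd.DataFrame(
    {2012: [77232, 66161, 59974], 2022: [122880, 103330, 92780]},
    index=["Washington", "Oregon", "Idaho"],
)

# Percentage change relative to the 2012 value.
growth = (income[2022] - income[2012]) / income[2012] * 100
print(growth.round(2))
```

For Washington this is 45,648 / 77,232 × 100, which matches the 59.1% reported in the table.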


The three states with the highest income growth, roughly 55-59%, are Washington, Oregon and Idaho.

The states and territories with the weakest income growth are New Mexico, Alaska and Puerto Rico.

Let's investigate the underlying causes of this heterogeneous growth.
The sectoral composition of the economy in the top and bottom states may indicate which sectors drive productivity, and therefore income growth, for state residents.

In [13]:
# Row label 100 is Washington's 2022 record in acs_estimates.
washington_sectoral = acs_estimates.loc[100]
industries = ['Estimate!!INDUSTRY!!Agriculture, forestry, fishing and hunting, and mining',
 'Estimate!!INDUSTRY!!Construction',
 'Estimate!!INDUSTRY!!Manufacturing',
 'Estimate!!INDUSTRY!!Wholesale trade',
 'Estimate!!INDUSTRY!!Retail trade',
 'Estimate!!INDUSTRY!!Transportation and warehousing, and utilities',
 'Estimate!!INDUSTRY!!Information',
 'Estimate!!INDUSTRY!!Finance and insurance, and real estate and rental and leasing',
 'Estimate!!INDUSTRY!!Professional, scientific, and management, and administrative and waste management services',
 'Estimate!!INDUSTRY!!Educational services, and health care and social assistance',
 'Estimate!!INDUSTRY!!Arts, entertainment, and recreation, and accommodation and food services',
 'Estimate!!INDUSTRY!!Other services, except public administration',
 'Estimate!!INDUSTRY!!Public administration',]

# Keep only the industry employment counts, as numbers rather than text.
washington_sectoral = pd.to_numeric(washington_sectoral.loc[industries])
# Tidy into two named columns: Industry and Employment by Industry.
washington_sectoral = (washington_sectoral.rename_axis("Industry")
    .rename("Employment by Industry").reset_index())
washington_sectoral

Unnamed: 0,Industry,Employment by Industry
0,"Estimate!!INDUSTRY!!Agriculture, forestry, fis...",92950
1,Estimate!!INDUSTRY!!Construction,269999
2,Estimate!!INDUSTRY!!Manufacturing,346299
3,Estimate!!INDUSTRY!!Wholesale trade,92595
4,Estimate!!INDUSTRY!!Retail trade,438819
5,Estimate!!INDUSTRY!!Transportation and warehou...,214143
6,Estimate!!INDUSTRY!!Information,88696
7,"Estimate!!INDUSTRY!!Finance and insurance, and...",199104
8,"Estimate!!INDUSTRY!!Professional, scientific, ...",534819
9,"Estimate!!INDUSTRY!!Educational services, and ...",800574
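
Selecting rows by a hard-coded label such as 100 or 104 breaks silently if the row order ever changes. A more defensive sketch, assuming a frame with `Geographic Area Name` and `Year` columns like `acs_estimates`, looks the row up by name and year instead. The helper `sector_row` and the small frame `estimates` are hypothetical; the figures are copied from the outputs in this notebook, and the Puerto Rico row is assumed to be its 2022 record.

```python
import pandas as pd

def sector_row(df, state, year, columns):
    """Return one geography-year's values for `columns` as a tidy two-column frame."""
    mask = (df["Geographic Area Name"] == state) & (df["Year"] == year)
    match = df.loc[mask, columns]
    if len(match) != 1:
        raise ValueError(f"expected exactly one row for {state} {year}, found {len(match)}")
    values = pd.to_numeric(match.iloc[0]).rename("Employment by Industry")
    return values.rename_axis("Industry").reset_index()

# Small stand-in for acs_estimates, with counts stored as text as in the export.
cols = ["Estimate!!INDUSTRY!!Construction",
        "Estimate!!INDUSTRY!!Manufacturing",
        "Estimate!!INDUSTRY!!Information"]
estimates = pd.DataFrame(
    [["Washington", 2022, "269999", "346299", "88696"],
     ["Puerto Rico", 2022, "65125", "97168", "17111"]],
    columns=["Geographic Area Name", "Year"] + cols,
)

wa = sector_row(estimates, "Washington", 2022, cols)
print(wa)
```

The helper raises an error when the lookup matches zero or several rows, which makes a wrong selection visible instead of charting the wrong state.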


In [17]:
fig1 = px.pie(washington_sectoral,names="Industry",values="Employment by Industry")
fig1.show()

In [15]:
# Row label 104 is Puerto Rico's 2022 record in acs_estimates.
p_rico_sectoral = acs_estimates.loc[104]

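# Keep only the industry employment counts, as numbers rather than text.
p_rico_sectoral = pd.to_numeric(p_rico_sectoral.loc[industries])
# Tidy into two named columns: Industry and Employment by Industry.
p_rico_sectoral = (p_rico_sectoral.rename_axis("Industry")
    .rename("Employment by Industry").reset_index())
p_rico_sectoral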

Unnamed: 0,Industry,Employment by Industry
0,"Estimate!!INDUSTRY!!Agriculture, forestry, fis...",14546
1,Estimate!!INDUSTRY!!Construction,65125
2,Estimate!!INDUSTRY!!Manufacturing,97168
3,Estimate!!INDUSTRY!!Wholesale trade,30551
4,Estimate!!INDUSTRY!!Retail trade,145326
5,Estimate!!INDUSTRY!!Transportation and warehou...,41105
6,Estimate!!INDUSTRY!!Information,17111
7,"Estimate!!INDUSTRY!!Finance and insurance, and...",58970
8,"Estimate!!INDUSTRY!!Professional, scientific, ...",115411
9,"Estimate!!INDUSTRY!!Educational services, and ...",238783


In [16]:
fig2 = px.pie(p_rico_sectoral,names="Industry",values="Employment by Industry")
fig2.show()
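
The two pie charts compare sectoral composition in a single year for two geographies. One possible way to test the opening hypothesis across all states, sketched here with entirely hypothetical numbers for three made-up states, is to compute the change in a sector's employment share between 2012 and 2022 and correlate it with income growth. The column names (`info_2012`, `total_2012`, and so on) are assumptions for illustration, not ACS fields.

```python
import pandas as pd

# Hypothetical figures for three made-up states; these are NOT ACS values.
df = pd.DataFrame({
    "state": ["A", "B", "C"],
    "info_2012": [100, 50, 80],       # employment in the sector of interest
    "total_2012": [1000, 1000, 1000], # total employment
    "info_2022": [150, 55, 80],
    "total_2022": [1000, 1000, 1000],
    "income_growth_pct": [55.0, 40.0, 35.0],
})

# Change in the sector's employment share, in percentage points.
share_2012 = df["info_2012"] / df["total_2012"]
share_2022 = df["info_2022"] / df["total_2022"]
df["share_change_pp"] = (share_2022 - share_2012) * 100

# Pearson correlation between share change and income growth.
corr = df["share_change_pp"].corr(df["income_growth_pct"])
print(df[["state", "share_change_pp", "income_growth_pct"]])
print("Correlation:", round(corr, 3))
```

A positive correlation on the real data would be consistent with the hypothesis, although it would not by itself establish causation.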