## Economic Profile By County

- This notebook explores the Economic Profile By County dataset from the Bureau of Economic Analysis (BEA, https://bea.gov).
- It contains over 30 categories of income data by US region, state, and county.
- Sub-tables can be extracted and combined for comparisons across areas or categories.

This notebook shows how to get and plot a specific category by county. We'll look at unemployment compensation paid to residents of a single county over the last 50 years.

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
# Load the data, reading every column as a string (values are converted to integers later)
df = pd.read_csv('../input/us-economic-profile-by-county/profile_by_county_1969_2019.csv', dtype='string')
df.shape

- This set has almost 100K rows of data in 59 columns.
- The first 8 columns describe the row.
- The rest of the columns are the values for each year.
- There are multiple rows per *GeoName* (an area: the US, a state, or a county within a state), so data can be extracted on a national, state, or county basis.
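
The layout described above can be checked by splitting the column names into descriptive columns and year columns. This is a minimal sketch on a toy frame with made-up values, not the real file: only `GeoName` and `Description` are column names taken from this notebook, and the rule "a year column is one whose name is all digits" is an assumption about how the file is laid out.

```python
import pandas as pd

# Toy stand-in for the real file (made-up values): two descriptive columns
# and three year columns. The real dataset has 8 descriptive columns.
toy = pd.DataFrame({
    'GeoName': ['United States', 'Alabama', 'Baldwin, AL'],
    'Description': ['Category A', 'Category A', 'Category A'],
    '1969': ['100', '10', '1'],
    '1970': ['110', '11', '2'],
    '1971': ['120', '12', '3'],
}, dtype='string')

# Assumption: year columns are exactly those whose names are all digits
year_cols = [c for c in toy.columns if c.isdigit()]
info_cols = [c for c in toy.columns if c not in year_cols]

print(info_cols)
print(year_cols)
```

Running the same two list comprehensions against `df.columns` should show whether the real file follows the 8 + years split.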

In [None]:
df.head()

### Unique data categories

In [None]:
# Count the unique categories
category_count = str(len(df['Description'].unique()))
print("There are " + category_count + " categories, therefore " + category_count + " rows for each 'area' (not counting N/A).")

In [None]:
# List all the categories
df['Description'].unique()
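
As a cross-check on the "one row per category per area" claim, the categories can be counted with `nunique()` and compared with the number of rows in each area. The sketch below uses a toy frame with made-up category names; on the real `df`, areas with missing categories would show a smaller count.

```python
import pandas as pd

# Toy frame (made-up values): two areas, each with the same two categories
toy = pd.DataFrame({
    'GeoName': ['Alabama', 'Alabama', 'Baldwin, AL', 'Baldwin, AL'],
    'Description': ['Category A', 'Category B', 'Category A', 'Category B'],
}, dtype='string')

# nunique() counts distinct values directly (missing values are skipped by default)
category_count = toy['Description'].nunique()

# Rows per area: if every area carries every category,
# each count equals category_count
rows_per_area = toy.groupby('GeoName').size()

print(category_count)
print(rows_per_area)
```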

- We can choose any of these categories for an 'area'. Let's get the data for Baldwin County, Alabama.

In [None]:
df1 = df[df['GeoName'] == 'Baldwin, AL']
df1.shape

- We get 31 rows, one per category, for this county. Let's look at some of them.

In [None]:
df1.head(10)

- Let's grab the "Unemployment insurance compensation" row by using its row position in the original dataframe.

In [None]:
# Get a single row from the original DF by integer position
# (the double brackets keep the result a one-row DataFrame)
row = df.iloc[[97]]
row.head()
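
A hard-coded position only works for one county and one version of the file. A hedged alternative is to select the row by its labels. The sketch below uses a toy frame with made-up values; the `str.strip()` call is a precaution based on the assumption that category labels in the real file may carry leading or trailing spaces, and `row_by_label` is a name introduced here for illustration.

```python
import pandas as pd

# Toy frame (made-up values) with the two columns used for filtering
toy = pd.DataFrame({
    'GeoName': ['Alabama', 'Baldwin, AL', 'Baldwin, AL'],
    'Description': ['Unemployment insurance compensation',
                    ' Unemployment insurance compensation',
                    'Category B'],
    '1969': ['10', '1', '5'],
}, dtype='string')

target = 'Unemployment insurance compensation'

# Match on area and category; strip() guards against padded labels
# (an assumption about the real file, harmless if labels are clean)
mask = (toy['GeoName'] == 'Baldwin, AL') & (toy['Description'].str.strip() == target)
row_by_label = toy[mask]

print(row_by_label)
```

The same mask applied to `df` would return a one-row DataFrame that can be used in place of `row`, and changing the `GeoName` value selects a different area.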

- Now let's pivot the year columns into rows, or un-encode them, as I like to think of it.

In [None]:
# Pivot the year columns into rows: one (year, value) pair per row
value_col = 'Unemployment insurance compensation'
records = []

# Loop through the years and get the value from each year column
# (range's end is exclusive, so 2020 is needed to include 2019)
for year in range(1969, 2020):
    records.append({'year': year, value_col: row[str(year)].iloc[0]})

# Build the DataFrame once, then convert the string values to integers
new_df = pd.DataFrame(records).astype('int64')

new_df.head(10)
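
The same reshaping can also be done without a loop using `DataFrame.melt`. This is a sketch on a toy one-row frame with made-up values, shaped like `row`; it assumes year columns are the ones with all-digit names. `pd.to_numeric(..., errors='coerce')` is used so that any non-numeric placeholder becomes a missing value instead of raising an error.

```python
import pandas as pd

# Toy single-row frame (made-up values) shaped like `row`
toy_row = pd.DataFrame({
    'GeoName': ['Baldwin, AL'],
    'Description': ['Unemployment insurance compensation'],
    '1969': ['1'],
    '1970': ['2'],
    '1971': ['3'],
}, dtype='string')

value_col = 'Unemployment insurance compensation'

# Assumption: year columns are exactly those whose names are all digits
year_cols = [c for c in toy_row.columns if c.isdigit()]

# melt turns each year column into its own (year, value) row
long_df = toy_row.melt(value_vars=year_cols, var_name='year', value_name=value_col)

# Convert both columns from strings to numbers
long_df['year'] = long_df['year'].astype('int64')
long_df[value_col] = pd.to_numeric(long_df[value_col], errors='coerce')

print(long_df)
```

The resulting frame has the same two columns as `new_df`, so it can be passed to the plotting code unchanged.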

Now we have the unemployment compensation for a specific county for the last 50 years.

In [None]:
# Plot the unemployment insurance compensation in Baldwin County, Alabama, for the last 50 years
plt.figure(figsize = (12,5))
sns.lineplot(data=new_df, x='year', y='Unemployment insurance compensation');

- This dataset has many other categories to explore, and rows from different categories or areas could be combined for comparisons or predictions.