### US Employment by Industry
- This notebook explores the BEA (https://bea.gov) US Employment by Industry dataset.
- It represents 34 industry types, based on the NAICS classification.

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
# Load the data
df = pd.read_csv('../input/us-employment-by-industry-20012019/CAEMP25N__ALL_AREAS_2001_2019.csv')
df.shape

- This set has 105,538 rows of data in 27 columns.
- The first 8 columns describe the row (as with most BEA csv files).
- The GeoName column specifies whether the row references the entire US, or a specific state.
- The Description column specifies the type of industry (Construction, Farm Employment, Manufacturing etc).
- The rest of the columns are the 'number of jobs' values for each year.
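
The column layout described above can be checked programmatically. This is a minimal sketch on a toy frame with made-up values, assuming (as in the file loaded above) that the year columns are the only ones whose names consist entirely of digits. The names `toy`, `year_cols` and `meta_cols` are illustrative only.

```python
import pandas as pd

# Toy frame with hypothetical values; the real file has 8 descriptive columns
toy = pd.DataFrame({
    'GeoName': ['Florida', 'Florida'],
    'Description': ['Construction', 'Educational services'],
    '2001': [100, 50],
    '2002': [110, 55],
})

# Assumption: year columns are exactly those whose names are all digits
year_cols = [c for c in toy.columns if c.isdigit()]

# Everything else describes the row
meta_cols = [c for c in toy.columns if c not in year_cols]

print(meta_cols)
print(year_cols)
```

Running the same two list comprehensions on `df.columns` should separate the descriptive columns from the yearly job counts.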

In [None]:
df.head(20)

### How many unique industry types are there?

In [None]:
# Count the unique industries
category_count = len(df['Description'].unique())
print(f"There are {category_count} categories, therefore {category_count} rows for each 'area' (not counting N/A).")

In [None]:
# List all the industries
df['Description'].unique()
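
The claim that each area has one row per category can be verified by counting rows per `GeoName`. The sketch below uses a toy frame with invented values. Note that `nunique()` skips missing values, whereas `len(...unique())` counts NaN as one more entry, which may matter if the file contains rows with no description.

```python
import pandas as pd

# Toy frame with hypothetical values, including one row with no description
toy = pd.DataFrame({
    'GeoName': ['Florida', 'Florida', 'Georgia', 'Georgia', None],
    'Description': ['Construction', 'Farm employment',
                    'Construction', 'Farm employment', None],
})

# Number of distinct industries; NaN is not counted by nunique()
n_categories = toy['Description'].nunique()

# Rows per area, ignoring rows without a description
rows_per_area = toy.dropna(subset=['Description']).groupby('GeoName').size()

# True when every area has exactly one row per industry
all_match = bool((rows_per_area == n_categories).all())
print(n_categories, all_match)
```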

In [None]:
# Filter to the rows where the area is Florida
df1 = df[df['GeoName'] == 'Florida']
df1.shape

This gives us 33 rows, one for each industry. Let's look at all of them.

In [None]:
df1.head(33)

We'll grab the "Educational services" industry row by using its index in the original dataframe.

In [None]:
# Get a single row from the original DF
row = df.iloc[[11704]]
row.head()
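
A hard-coded row position breaks if the file is re-sorted or updated. As an alternative, the row could be selected by area and industry name. This is a sketch on a toy frame with invented values; `select_row` is a hypothetical helper, and stripping whitespace from `Description` is an assumption in case the labels are padded.

```python
import pandas as pd

# Toy frame with hypothetical values; labels are padded on purpose
toy = pd.DataFrame({
    'GeoName': ['Florida', 'Florida', 'Georgia'],
    'Description': [' Construction', '  Educational services',
                    '  Educational services'],
    '2001': [100, 50, 40],
})

def select_row(frame, area, industry):
    """Return the rows matching one area and one industry (hypothetical helper)."""
    mask = (frame['GeoName'] == area) & (frame['Description'].str.strip() == industry)
    return frame[mask]

picked = select_row(toy, 'Florida', 'Educational services')
print(picked)
```

With the real data, `select_row(df, 'Florida', 'Educational services')` would replace the positional lookup, provided the labels match after stripping.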

- Now let's pivot the year columns into rows (or un-encode them, as I like to think of it).
- This step is optional, and it could also be done with a pivot table.
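
Another way to turn the year columns into rows without a loop is `DataFrame.melt`. The sketch below runs on a toy single-row frame with invented values. The output column name `jobs` is an illustrative choice, and `errors='coerce'` is an assumption to turn any non-numeric entries into NaN.

```python
import pandas as pd

# Toy single-row frame with hypothetical values stored as strings
row_toy = pd.DataFrame({
    'GeoName': ['Florida'],
    'Description': ['Educational services'],
    '2001': ['50'],
    '2002': ['55'],
    '2003': ['61'],
})

# Year columns are the ones whose names are all digits
year_cols = [c for c in row_toy.columns if c.isdigit()]

# Wide to long: one output row per year
long_df = row_toy.melt(
    id_vars=['GeoName', 'Description'],
    value_vars=year_cols,
    var_name='year',
    value_name='jobs',
)

# Convert to numeric types so the values plot on continuous axes
long_df['year'] = long_df['year'].astype(int)
long_df['jobs'] = pd.to_numeric(long_df['jobs'], errors='coerce')
print(long_df)
```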


In [None]:
# Collect one record per year, then build the DF of years and job counts
records = []

# Loop through the years 2001-2019 (range excludes its end, so stop at 2020)
for year in range(2001, 2020):
    # Each year is a column named by its string; take the single row's value
    value = float(row[str(year)].iloc[0])
    records.append({'year': year, 'Educational services': value})

new_df = pd.DataFrame(records)
new_df.head(10)

- This gives us a nice dataframe with the year and job count for the Educational Services industry in Florida.

In [None]:
# Plot the job counts for the Educational Services industry in Florida for 2001-2019
plt.figure(figsize = (12,5))
plt.title("Educational Services job count in Florida 2001-2019")
sns.lineplot(data=new_df, x='year', y='Educational services');