# Exploratory Data Analysis

## Overview

In this notebook, I analyze U.S. education data from the file 'states_all_extended.csv', which contains state-level aggregate information.

## Data description

**Identification**
 - PRIMARY_KEY: A combination of the year and state name.
 - YEAR
 - STATE <br>
 
**Enrollment:**<br>
- *A breakdown of students enrolled in schools by school year. <br>*
 - GRADES_PK: Number of students in Pre-Kindergarten education.
 - GRADES_4: Number of students in fourth grade.
 - GRADES_8: Number of students in eighth grade.
 - GRADES_12: Number of students in twelfth grade.
 - GRADES_1_8: Number of students in the first through eighth grades.
 - GRADES_9_12: Number of students in the ninth through twelfth grades.
 - GRADES_KG_12: Number of students in Kindergarten through twelfth grade.
 - GRADES_ALL: The count of all students in the state. Comparable to ENROLL in the financial data (which is the U.S. Census Bureau's estimate for students in the state).
<br>
- *A breakdown of students enrolled in schools by race and gender. <br>*
The represented races include AM (American Indian or Alaska Native), AS (Asian), HI (Hispanic/Latino), BL (Black or African American), WH (White), HP (Hawaiian Native/Pacific Islander), and TR (Two or More Races). The represented genders include M (Male) and F (Female). For example:
 - GRADES_ALL_AS: Number of students whose ethnicity was classified as "Asian".
 - GRADES_ALL_ASM: Number of male students whose ethnicity was classified as "Asian".
 - GRADES_ALL_ASF: Number of female students whose ethnicity was classified as "Asian".<br>
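
The naming pattern in the examples above can be sketched in code. This is only an illustration: it assumes every race and gender code follows the same `GRADES_ALL_<race><gender>` pattern as the three examples, and the helper `demographic_columns` is hypothetical. The generated names should be checked against `data.columns` before use.

```python
from itertools import product

# Race and gender codes as listed in the description above.
RACES = ["AM", "AS", "HI", "BL", "WH", "HP", "TR"]
GENDERS = ["M", "F"]


def demographic_columns(prefix="GRADES_ALL"):
    """Build race-level and race-by-gender column names (hypothetical helper)."""
    by_race = [f"{prefix}_{race}" for race in RACES]
    by_race_gender = [
        f"{prefix}_{race}{gender}" for race, gender in product(RACES, GENDERS)
    ]
    return by_race, by_race_gender


by_race, by_race_gender = demographic_columns()
print(by_race)
print(by_race_gender)
```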

**Financials**<br>
- *A breakdown of states by revenue and expenditure.*<br>

 - ENROLL: The U.S. Census Bureau's count for students in the state. Should be comparable to GRADES_ALL (which is the NCES's estimate for students in the state).
 - TOTAL_REVENUE: The total amount of revenue for the state.<br>
      - FEDERAL_REVENUE
      - STATE_REVENUE
      - LOCAL_REVENUE<br>
 - TOTAL_EXPENDITURE: The total expenditure for the state.
     - INSTRUCTION_EXPENDITURE.
     - SUPPORT_SERVICES_EXPENDITURE.
     - CAPITAL_OUTLAY_EXPENDITURE
     - OTHER_EXPENDITURE <br>
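
Since ENROLL and GRADES_ALL are described as comparable, one way to check this is to compute the relative gap between the two columns. The sketch below uses a small invented table in place of the real file, and the 2% tolerance is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the real data; all numbers are invented.
toy = pd.DataFrame({
    "STATE": ["A", "B", "C"],
    "ENROLL": [1000.0, 2000.0, np.nan],
    "GRADES_ALL": [1010.0, 1900.0, 500.0],
})

# Gap between the two counts, as a percentage of ENROLL.
toy["enroll_gap_pct"] = (toy["GRADES_ALL"] - toy["ENROLL"]) / toy["ENROLL"] * 100

# Flag rows where the two sources disagree by more than the tolerance.
# Rows with a missing count produce NaN and are not flagged.
TOLERANCE_PCT = 2.0  # arbitrary threshold, for illustration only
toy["large_gap"] = toy["enroll_gap_pct"].abs() > TOLERANCE_PCT
print(toy)
```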
     
**Academic Achievement**<br>
- *A breakdown of student performance as assessed by the corresponding exams (math and reading, grades 4 and 8).*
    - AVG_MATH_4_SCORE: The state's average score for fourth graders taking the NAEP math exam.
    - AVG_MATH_8_SCORE: The state's average score for eighth graders taking the NAEP math exam.
    - AVG_READING_4_SCORE: The state's average score for fourth graders taking the NAEP reading exam.
    - AVG_READING_8_SCORE: The state's average score for eighth graders taking the NAEP reading exam.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
%matplotlib inline
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)

In [None]:
data = pd.read_csv('../input/states_all_extended.csv')

In [None]:
data.head()

In [None]:
data.shape

In [None]:
data.describe()

## Missing values

In [None]:
vars_with_na = [var for var in data.columns if data[var].isnull().sum()>1]
print(len(vars_with_na))

The dataset has 190 variables with more than one missing value.

In [None]:
dict_missing = { var: np.round(data[var].isnull().mean()*100, 3) for var in vars_with_na}

In [None]:
# sort the (column, percent missing) pairs, highest percentage first
sorted_dict = sorted(dict_missing.items(), key=lambda kv: kv[1], reverse=True)

In [None]:
sorted_dict

In [None]:
# create a dataframe of missing values from the sorted (column, percent) pairs
missings_df = pd.DataFrame(sorted_dict, columns=['columns', 'Percent missing'])
missings_df.head()

In [None]:
missings_df.shape
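
The missing-value table built in the cells above can also be produced as a single sorted Series. The sketch below is one possible alternative, shown on an invented toy frame rather than the real file. Note that it keeps every column with at least one missing value, which is a slightly different filter from the `> 1` count used earlier.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; values are invented.
toy = pd.DataFrame({
    "YEAR": [1992, 1993, 1994, 1995],
    "ENROLL": [np.nan, np.nan, 100.0, 120.0],
    "AVG_MATH_4_SCORE": [np.nan, 220.0, np.nan, np.nan],
})

# Percentage of missing values per column.
missing_pct = toy.isnull().mean().mul(100).round(3)

# Keep only columns with missing values, highest percentage first.
missing_pct = missing_pct[missing_pct > 0].sort_values(ascending=False)
print(missing_pct)
```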

In [None]:
d = missings_df.iloc[:50]
plt.figure(figsize = [20, 10]);
g = sns.barplot(x="columns", y="Percent missing", data=d)
g.set_xticklabels(g.get_xticklabels(), rotation=90);

## Revenues - Federal, State, Local

In [None]:
revenues = data[['YEAR','STATE','TOTAL_REVENUE', 'FEDERAL_REVENUE', 'STATE_REVENUE', 'LOCAL_REVENUE']]

In [None]:
revenues_millions = revenues[['TOTAL_REVENUE', 'FEDERAL_REVENUE', 'STATE_REVENUE', 'LOCAL_REVENUE']]/1000000

In [None]:
revenues_millions.head()

In [None]:
# Create a figure and axes
fig, ax = plt.subplots(2, 2, figsize=(20, 10))

# plot the total revenue
ax[0, 0].hist(revenues_millions.TOTAL_REVENUE.dropna(), bins=50)
ax[0, 0].set_title('Total revenue in Millions')
ax[0, 0].set_xlabel('Revenue in Millions')
ax[0, 0].set_ylabel('Count')

# plot the federal revenue
ax[0, 1].hist(revenues_millions.FEDERAL_REVENUE.dropna(), bins=50)
ax[0, 1].set_title('Federal revenue in Millions')
ax[0, 1].set_xlabel('Revenue in Millions')
ax[0, 1].set_ylabel('Count')

# plot the state revenue
ax[1, 0].hist(revenues_millions.STATE_REVENUE.dropna(), bins=50)
ax[1, 0].set_title('State revenue in Millions')
ax[1, 0].set_xlabel('Revenue in Millions')
ax[1, 0].set_ylabel('Count')

# plot the local revenue
ax[1, 1].hist(revenues_millions.LOCAL_REVENUE.dropna(), bins=50)
ax[1, 1].set_title('Local revenue in Millions')
ax[1, 1].set_xlabel('Revenue in Millions')
ax[1, 1].set_ylabel('Count')


In [None]:
base_color = sns.color_palette()[2]
plt.figure(figsize = [10, 10])
plt.title('Revenue in Millions')
dfm = revenues_millions.melt(var_name='columns')
sns.violinplot(data = dfm, y='columns', x='value', color=base_color, inner = 'quartile')

In [None]:
base_color = sns.color_palette()[3]
plt.figure(figsize = [10, 10])
plt.title('Revenue in Millions')
dfm = revenues_millions.melt(var_name='columns')
sns.boxplot(data = dfm, y='columns', x='value', color=base_color)

The figures above show that schools receive most of their revenue from state and local sources.
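
To put a number on this observation, each revenue source can be expressed as a share of total revenue. The following is a minimal sketch on invented figures; the column names match the dataset, but the values do not come from it.

```python
import pandas as pd

# Toy rows standing in for the real data; all numbers are invented.
toy = pd.DataFrame({
    "TOTAL_REVENUE": [100.0, 200.0],
    "FEDERAL_REVENUE": [10.0, 30.0],
    "STATE_REVENUE": [50.0, 90.0],
    "LOCAL_REVENUE": [40.0, 80.0],
})

sources = ["FEDERAL_REVENUE", "STATE_REVENUE", "LOCAL_REVENUE"]

# Share of each source in total revenue, row by row, in percent.
shares = toy[sources].div(toy["TOTAL_REVENUE"], axis=0).mul(100)

# Average share per source across all rows, largest first.
mean_shares = shares.mean().sort_values(ascending=False)
print(mean_shares)
```

Applied to the real `revenues` frame, the same two steps would show how much of the average budget comes from each level of government.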

## Revenue over the years

In [None]:
dfm = pd.melt(revenues, id_vars =['YEAR'], value_vars =['TOTAL_REVENUE', 'FEDERAL_REVENUE', 'STATE_REVENUE', 'LOCAL_REVENUE'])
dfm.columns = ['Year', 'Revenue_type', 'Dollar_Amount']
dfm.head()
dfm.Dollar_Amount = dfm.Dollar_Amount/1000000
dfm.head()


In [None]:
plt.figure(figsize = [20, 10])
plt.title('Revenue in millions')
sns.lineplot(x='Year', y='Dollar_Amount', hue='Revenue_type' , data=dfm, ci=None)

### Total Revenue

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average total revenue in Millions')
(revenues.groupby('YEAR')['TOTAL_REVENUE'].mean()/1000000).plot.bar()

In [None]:
total_rev = pd.concat([revenues['TOTAL_REVENUE']/1000000, revenues['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="TOTAL_REVENUE", data=total_rev)
plt.ylabel('Total revenue in millions')
plt.title('Annual total revenue')

### Federal revenue

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average federal revenue in Millions')
(revenues.groupby('YEAR')['FEDERAL_REVENUE'].mean()/1000000).plot.bar()

In [None]:
total_rev = pd.concat([revenues['FEDERAL_REVENUE']/1000000, revenues['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="FEDERAL_REVENUE", data=total_rev)
plt.ylabel('Federal revenue in millions')
plt.title('Annual Federal revenue')

### State revenue

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average state revenue in Millions')
(revenues.groupby('YEAR')['STATE_REVENUE'].mean()/1000000).plot.bar()

In [None]:
total_rev = pd.concat([revenues['STATE_REVENUE']/1000000, revenues['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="STATE_REVENUE", data=total_rev)
plt.ylabel('State revenue in millions')
plt.title('Annual State revenue')

### Local Revenue

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average local revenue in Millions')
(revenues.groupby('YEAR')['LOCAL_REVENUE'].mean()/1000000).plot.bar()


In [None]:
total_rev = pd.concat([revenues['LOCAL_REVENUE']/1000000, revenues['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="LOCAL_REVENUE", data=total_rev)
plt.ylabel('Local revenue in millions')
plt.title('Annual Local revenue')

The average total revenue increases over the years. Schools appear to have received more money from the federal government in 2010 and 2011.
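
A year-over-year percentage change makes jumps like this easier to spot than bar heights alone. The sketch below uses invented numbers to show the calculation; it does not reproduce the actual 2010 and 2011 values.

```python
import pandas as pd

# Toy rows standing in for the real data; all numbers are invented.
toy = pd.DataFrame({
    "YEAR": [2009, 2009, 2010, 2010, 2011, 2011],
    "FEDERAL_REVENUE": [100.0, 200.0, 150.0, 300.0, 140.0, 310.0],
})

# Average federal revenue across states for each year.
yearly_mean = toy.groupby("YEAR")["FEDERAL_REVENUE"].mean()

# Percentage change relative to the previous year (first year is NaN).
yoy_pct = yearly_mean.pct_change().mul(100)
print(pd.DataFrame({"mean": yearly_mean, "yoy_pct": yoy_pct}))
```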

## Revenue by states

### Total Revenue

In [None]:
rev_data = revenues.groupby('STATE')['TOTAL_REVENUE'].mean()/1000
rev_data = rev_data.reset_index()
rev_data = rev_data.sort_values('TOTAL_REVENUE', ascending=False)
rev_data.plot.barh(x='STATE', y='TOTAL_REVENUE', figsize=(10, 25))
plt.xlabel('Average Total revenue in thousands')

In [None]:
total_rev = pd.concat([revenues['TOTAL_REVENUE']/1000000, revenues['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x="TOTAL_REVENUE", data=total_rev)
plt.xlabel('Total revenue in millions')
plt.title('Annual total revenue')

### Federal Revenue

In [None]:
rev_data = revenues.groupby('STATE')['FEDERAL_REVENUE'].mean()/1000
rev_data = rev_data.reset_index()
rev_data = rev_data.sort_values('FEDERAL_REVENUE', ascending=False)
rev_data.plot.barh(x='STATE', y='FEDERAL_REVENUE', figsize=(10, 25))
plt.xlabel('Average federal revenue in thousands')

In [None]:
total_rev = pd.concat([revenues['FEDERAL_REVENUE']/1000000, revenues['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x="FEDERAL_REVENUE", data=total_rev)
plt.xlabel('Federal revenue in millions')
plt.title('Annual Federal revenue')

### State revenue

In [None]:
rev_data = revenues.groupby('STATE')['STATE_REVENUE'].mean()/1000
rev_data = rev_data.reset_index()
rev_data = rev_data.sort_values('STATE_REVENUE', ascending=False)
rev_data.plot.barh(x='STATE', y='STATE_REVENUE', figsize=(10, 25))
plt.xlabel('Average state revenue in thousands')

In [None]:
total_rev = pd.concat([revenues['STATE_REVENUE']/1000000, revenues['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x="STATE_REVENUE", data=total_rev)
plt.xlabel('State revenue in millions')
plt.title('Annual State revenue')

### Local Revenue

In [None]:
rev_data = revenues.groupby('STATE')['LOCAL_REVENUE'].mean()/1000
rev_data = rev_data.reset_index()
rev_data = rev_data.sort_values('LOCAL_REVENUE', ascending=False)
rev_data.plot.barh(x='STATE', y='LOCAL_REVENUE', figsize=(10, 25))
plt.xlabel('Average local revenue in thousands')

In [None]:
total_rev = pd.concat([revenues['LOCAL_REVENUE']/1000000, revenues['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x="LOCAL_REVENUE", data=total_rev)
plt.xlabel('Local revenue in millions')
plt.title('Annual Local revenue')

## Expenditures

In [None]:
expenditures = data[['YEAR','STATE','TOTAL_EXPENDITURE','INSTRUCTION_EXPENDITURE','SUPPORT_SERVICES_EXPENDITURE','OTHER_EXPENDITURE', 'CAPITAL_OUTLAY_EXPENDITURE']]

In [None]:
expenditures.head()

In [None]:
melt_expenditures = pd.melt(expenditures, id_vars =['YEAR'], value_vars =['TOTAL_EXPENDITURE','INSTRUCTION_EXPENDITURE','SUPPORT_SERVICES_EXPENDITURE','OTHER_EXPENDITURE', 'CAPITAL_OUTLAY_EXPENDITURE']) 

In [None]:
melt_expenditures.columns = ['Year', 'Expenditure', 'Dollar_Amount']

In [None]:
melt_expenditures.Dollar_Amount = melt_expenditures.Dollar_Amount/1000000
melt_expenditures.head()

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Expenditures in millions')
sns.lineplot(x='Year', y='Dollar_Amount', hue='Expenditure' , data=melt_expenditures, ci=None)
plt.show()

### Total Expenditures

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average total expenditure in Millions')
(expenditures.groupby('YEAR')['TOTAL_EXPENDITURE'].mean()/1000000).plot.bar()
plt.show()

In [None]:
total_exp = pd.concat([expenditures['TOTAL_EXPENDITURE']/1000000, expenditures['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="TOTAL_EXPENDITURE", data=total_exp)
plt.ylabel('Total expenditures in millions')
plt.title('Average total expenditures')
plt.show()

### Instruction Expenditures

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average instruction expenditure in Millions')
(expenditures.groupby('YEAR')['INSTRUCTION_EXPENDITURE'].mean()/1000000).plot.bar()
plt.show()

In [None]:
inst_exp = pd.concat([expenditures['INSTRUCTION_EXPENDITURE']/1000000, expenditures['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="INSTRUCTION_EXPENDITURE", data=inst_exp)
plt.ylabel('Instruction expenditures in millions')
plt.title('Average instruction expenditures')
plt.show()

### Support Services Expenditures

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average support services expenditure in Millions')
(expenditures.groupby('YEAR')['SUPPORT_SERVICES_EXPENDITURE'].mean()/1000000).plot.bar()
plt.show()

In [None]:
ss_exp = pd.concat([expenditures['SUPPORT_SERVICES_EXPENDITURE']/1000000, expenditures['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="SUPPORT_SERVICES_EXPENDITURE", data=ss_exp)
plt.ylabel('Support services expenditures in millions')
plt.title('Average support services expenditures')
plt.show()

### Capital Outlay Expenditures

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average capital outlay expenditures in Millions')
(expenditures.groupby('YEAR')['CAPITAL_OUTLAY_EXPENDITURE'].mean()/1000000).plot.bar()
plt.show()

In [None]:
co_exp = pd.concat([expenditures['CAPITAL_OUTLAY_EXPENDITURE']/1000000, expenditures['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="CAPITAL_OUTLAY_EXPENDITURE", data=co_exp)
plt.ylabel('Capital outlay expenditures in millions')
plt.title('Capital Outlay Expenditures')
plt.show()

### Other Expenditures

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average other expenditures in Millions')
(expenditures.groupby('YEAR')['OTHER_EXPENDITURE'].mean()/1000000).plot.bar()
plt.show()

In [None]:
o_exp = pd.concat([expenditures['OTHER_EXPENDITURE']/1000000, expenditures['YEAR']], axis=1)
f, ax = plt.subplots(figsize=(20, 10))
fig = sns.boxplot(x='YEAR', y="OTHER_EXPENDITURE", data=o_exp)
plt.ylabel('Other expenditures in millions')
plt.title('Other Expenditures')
plt.show()

## Expenditures by state

### Total expenditure

In [None]:
exp_data = expenditures.groupby('STATE')['TOTAL_EXPENDITURE'].mean()/1000
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('TOTAL_EXPENDITURE', ascending=False)
exp_data.plot.barh(x='STATE', y='TOTAL_EXPENDITURE', figsize=(10, 25))
plt.xlabel('Average Total expenditure in thousands')
plt.show()

In [None]:
total_exp = pd.concat([expenditures['TOTAL_EXPENDITURE']/1000000, expenditures['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x="TOTAL_EXPENDITURE", data=total_exp)
plt.xlabel('Total expenditure in millions')
plt.title('Annual total expenditure')
plt.show()

### Instruction expenditure

In [None]:
exp_data = expenditures.groupby('STATE')['INSTRUCTION_EXPENDITURE'].mean()/1000
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('INSTRUCTION_EXPENDITURE', ascending=False)
exp_data.plot.barh(x='STATE', y='INSTRUCTION_EXPENDITURE', figsize=(10, 25))
plt.xlabel('Average instruction expenditure in thousands')
plt.show()

In [None]:
total_exp = pd.concat([expenditures['INSTRUCTION_EXPENDITURE']/1000000, expenditures['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x='INSTRUCTION_EXPENDITURE', data=total_exp)
plt.xlabel('Instruction expenditure in millions')
plt.title('Annual instruction expenditure')
plt.show()

### Support services expenditure

In [None]:
exp_data = expenditures.groupby('STATE')['SUPPORT_SERVICES_EXPENDITURE'].mean()/1000
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('SUPPORT_SERVICES_EXPENDITURE', ascending=False)
exp_data.plot.barh(x='STATE', y='SUPPORT_SERVICES_EXPENDITURE', figsize=(10, 25))
plt.xlabel('Average support services expenditure in thousands')
plt.show()

In [None]:
total_exp = pd.concat([expenditures['SUPPORT_SERVICES_EXPENDITURE']/1000000, expenditures['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x='SUPPORT_SERVICES_EXPENDITURE', data=total_exp)
plt.xlabel('Support services expenditure in millions')
plt.title('Annual support services expenditure')
plt.show()

### Capital outlay expenditure

In [None]:
exp_data = expenditures.groupby('STATE')['CAPITAL_OUTLAY_EXPENDITURE'].mean()/1000
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('CAPITAL_OUTLAY_EXPENDITURE', ascending=False)
exp_data.plot.barh(x='STATE', y='CAPITAL_OUTLAY_EXPENDITURE', figsize=(10, 25))
plt.xlabel('Average capital outlay expenditure in thousands')
plt.show()

In [None]:
total_exp = pd.concat([expenditures['CAPITAL_OUTLAY_EXPENDITURE']/1000000, expenditures['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x='CAPITAL_OUTLAY_EXPENDITURE', data=total_exp)
plt.xlabel('Capital outlay expenditure in millions')
plt.title('Annual capital outlay expenditure')
plt.show()

### Other expenditure

In [None]:
exp_data = expenditures.groupby('STATE')['OTHER_EXPENDITURE'].mean()/1000
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('OTHER_EXPENDITURE', ascending=False)
exp_data.plot.barh(x='STATE', y='OTHER_EXPENDITURE', figsize=(10, 25))
plt.xlabel('Average other expenditure in thousands')
plt.show()

In [None]:
total_exp = pd.concat([expenditures['OTHER_EXPENDITURE']/1000000, expenditures['STATE']], axis=1)
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x='OTHER_EXPENDITURE', data=total_exp)
plt.xlabel('Other expenditure in millions')
plt.title('Annual other expenditure')
plt.show()

## Expenditure vs Revenue

In [None]:
# copy the slice so that adding a column does not trigger SettingWithCopyWarning
rev_exp = data[['YEAR', 'STATE', 'TOTAL_EXPENDITURE', 'TOTAL_REVENUE']].copy()
rev_exp['not_spent'] = rev_exp.TOTAL_REVENUE - rev_exp.TOTAL_EXPENDITURE
rev_exp.head()

In [None]:
def plot_line(df, title, x, y, h=None, figsize=(20, 10)):
    """Draw a line plot of y against x, optionally split by the column h."""
    plt.figure(figsize=figsize)
    plt.title(title)
    sns.lineplot(x=x, y=y, hue=h, data=df, ci=None)
    
plot_line(rev_exp, 'Average dollar amount not spent', x='YEAR', y='not_spent')

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Average not spent dollar amount')
(rev_exp.groupby('YEAR')['not_spent'].mean()).plot.bar()

In [None]:
plt.figure(figsize = [20, 10])
plt.title('Median not spent dollar amount')
(rev_exp.groupby('YEAR')['not_spent'].median()).plot.bar()

In [None]:
f, ax = plt.subplots(figsize=(25, 15))
fig = sns.boxplot(x='YEAR', y="not_spent", data=rev_exp)
plt.ylabel('Not spent dollar amount')

### Which state overspends the most?

In [None]:
exp_data = rev_exp.groupby('STATE')['not_spent'].mean()
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('not_spent', ascending=False)
exp_data.plot.barh(x='STATE', y='not_spent', figsize=(10, 25))
plt.xlabel('Average dollar amount not spent')
plt.show()

In [None]:
exp_data = rev_exp.groupby('STATE')['not_spent'].median()
exp_data = exp_data.reset_index()
exp_data = exp_data.sort_values('not_spent', ascending=False)
exp_data.plot.barh(x='STATE', y='not_spent', figsize=(10, 25))
plt.xlabel('Median dollar amount not spent')
plt.show()

In [None]:
f, ax = plt.subplots(figsize=(10, 25))
fig = sns.boxplot(y='STATE', x='not_spent', data=rev_exp)
plt.xlabel('Dollar amount not spent')
plt.title('Annual dollar amount not spent')
plt.show()
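
The plots above answer the question visually. The state with the lowest average `not_spent` can also be read off directly, and dividing by revenue gives a relative measure that does not favor large states. This is a sketch on invented numbers; `not_spent_pct` is a name introduced here for illustration.

```python
import pandas as pd

# Toy rows standing in for the real data; all numbers are invented.
toy = pd.DataFrame({
    "STATE": ["A", "A", "B", "B"],
    "TOTAL_REVENUE": [100.0, 110.0, 200.0, 210.0],
    "TOTAL_EXPENDITURE": [105.0, 120.0, 190.0, 205.0],
})

# Negative values mean the state spent more than it received.
toy["not_spent"] = toy["TOTAL_REVENUE"] - toy["TOTAL_EXPENDITURE"]
toy["not_spent_pct"] = toy["not_spent"] / toy["TOTAL_REVENUE"] * 100

# Average per state, most negative (largest overspend) first.
mean_not_spent = toy.groupby("STATE")["not_spent"].mean().sort_values()
mean_not_spent_pct = toy.groupby("STATE")["not_spent_pct"].mean().sort_values()

biggest_overspender = mean_not_spent.idxmin()
print(biggest_overspender)
print(mean_not_spent_pct)
```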

## Enrollment demographics by state

In [None]:
melt_demographic = pd.melt(data, id_vars =['YEAR', 'STATE'], value_vars =['GRADES_ALL_AM','GRADES_ALL_AS','GRADES_ALL_HI', 'GRADES_ALL_BL', 'GRADES_ALL_WH','GRADES_ALL_HP', 'GRADES_ALL_TR' ])
melt_demographic.columns = ['Year','STATE', 'Demographic', 'Enrollments']
plot_line(melt_demographic, 'Enrollment demographic', x='Year', y='Enrollments', h='Demographic')

In [None]:
f, ax = plt.subplots(figsize=(40, 20))
sns.barplot(x="Year", y="Enrollments", hue="Demographic", data=melt_demographic, ci=None)

In [None]:
# NOTE: this assumes the first 51 unique STATE values are the 50 states plus D.C.
main_land_states = data.STATE.unique()[:51]

In [None]:
for state in main_land_states:
    d = melt_demographic[melt_demographic.STATE == state]
    f, ax = plt.subplots(figsize=(40, 20))
    sns.barplot(x="Year", y="Enrollments", hue="Demographic", data=d, ci=None)
    plt.title('Enrollments demographic in ' + state)

This is a work in progress; I will update this notebook with more insights.