# LiveLab: Creating a Simple Dash App 


## Dataset Description:

You are working with a college sports dataset from [Kaggle](https://www.kaggle.com/datasets/umerhaddii/us-collegiate-sports-dataset).

### Step 1:  Load & Inspect Data 

In [7]:
# Import the pandas, Plotly, and Dash packages
import dash
from dash import html, dcc
from dash.dependencies import Input, Output
import pandas as pd
import plotly.express as px

In [8]:
# Read in the data 

df = pd.read_csv('/Users/Marcy_Student/Downloads/sports.csv')
df


DtypeWarning: Columns (8) have mixed types. Specify dtype option on import or set low_memory=False.
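
The warning means pandas inferred more than one type within a single column while reading the file. One possible way to address it, sketched here on a small in-memory CSV rather than the real file, is to pass an explicit `dtype` for the affected column (setting `low_memory=False` is the other option the message names). The column name `classification_other` and the toy values are assumptions for illustration; on the real data, `df.columns[8]` would show which column the warning refers to.

```python
import io
import pandas as pd

# Toy stand-in for sports.csv; the values are made up for illustration.
csv_text = (
    "institution_name,classification_other\n"
    "College A,12\n"
    "College B,Independent\n"
    "College C,\n"
)

# Reading the column as text gives it one consistent dtype.
toy = pd.read_csv(io.StringIO(csv_text), dtype={"classification_other": "string"})
print(toy.dtypes)
```

The same `dtype=` argument could be added to the `pd.read_csv` call above once the affected column is known.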



Unnamed: 0,year,unitid,institution_name,city_txt,state_cd,zip_text,classification_code,classification_name,classification_other,ef_male_count,...,partic_coed_women,sum_partic_men,sum_partic_women,rev_men,rev_women,total_rev_menwomen,exp_men,exp_women,total_exp_menwomen,sports
0,2015,100654,Alabama A & M University,Normal,AL,35762.0,2,NCAA Division I-FCS,,1923,...,,31,0,345592.0,,345592.0,397818.0,,397818.0,Baseball
1,2015,100654,Alabama A & M University,Normal,AL,35762.0,2,NCAA Division I-FCS,,1923,...,,19,16,1211095.0,748833.0,1959928.0,817868.0,742460.0,1560328.0,Basketball
2,2015,100654,Alabama A & M University,Normal,AL,35762.0,2,NCAA Division I-FCS,,1923,...,,61,46,183333.0,315574.0,498907.0,246949.0,251184.0,498133.0,All Track Combined
3,2015,100654,Alabama A & M University,Normal,AL,35762.0,2,NCAA Division I-FCS,,1923,...,,99,0,2808949.0,,2808949.0,3059353.0,,3059353.0,Football
4,2015,100654,Alabama A & M University,Normal,AL,35762.0,2,NCAA Division I-FCS,,1923,...,,9,0,78270.0,,78270.0,83913.0,,83913.0,Golf
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
132322,2019,800001,Simon Fraser University,,,,4,NCAA Division II with football,,5681,...,,30,21,652950.0,357251.0,1010201.0,652950.0,357251.0,1010201.0,Soccer
132323,2019,800001,Simon Fraser University,,,,4,NCAA Division II with football,,5681,...,,0,19,,381402.0,381402.0,,381402.0,381402.0,Softball
132324,2019,800001,Simon Fraser University,,,,4,NCAA Division II with football,,5681,...,,17,17,281726.0,237917.0,519643.0,281726.0,237917.0,519643.0,Swimming
132325,2019,800001,Simon Fraser University,,,,4,NCAA Division II with football,,5681,...,,0,19,,402135.0,402135.0,,402135.0,402135.0,Volleyball


### Step 2:  Clean the Data and Keep Only the Revenue Columns

In [9]:
# Keep only the revenue columns, then drop rows where either value is missing

df_clean = df[['rev_men', 'rev_women']].dropna()
df_clean


Unnamed: 0,rev_men,rev_women
1,1211095.0,748833.0
2,183333.0,315574.0
7,78274.0,131145.0
11,4189826.0,1966556.0
13,407728.0,346987.0
...,...,...
132319,403871.0,378615.0
132321,221193.0,224243.0
132322,652950.0,357251.0
132324,281726.0,237917.0
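
Because `dropna()` is applied to both columns together, a row survives only if it has revenue for men and for women. Sports offered to one gender only (Football and Softball in the output of Step 1, for example) are therefore excluded. The sketch below uses a made-up three-row frame, not the real dataset, to show how this choice can change an average compared with letting `mean()` skip missing values column by column.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "sports": ["Baseball", "Basketball", "Softball"],
    "rev_men": [300.0, 1200.0, np.nan],
    "rev_women": [np.nan, 700.0, 400.0],
})

# Joint drop: only Basketball has both values.
both = toy[["rev_men", "rev_women"]].dropna()
joint_avg_men = both["rev_men"].mean()

# Per-column mean: NaN is skipped, so Baseball still counts for men.
per_column_avg_men = toy["rev_men"].mean()

print(len(both), joint_avg_men, per_column_avg_men)
```

Neither approach is wrong; the joint drop compares men and women on the same set of sports, while the per-column mean uses every available value.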


### Step 3:  Compute Averages and Put in a New DataFrame 

In [10]:
# Compute averages
avg_men = df_clean['rev_men'].mean()
avg_women = df_clean['rev_women'].mean()

# Put in a new DataFrame for plotting
df_plot = pd.DataFrame({
    "Gender": ["Men", "Women"],
    "Average Revenue": [avg_men, avg_women]
})
df_plot

Unnamed: 0,Gender,Average Revenue
0,Men,405013.608406
1,Women,269807.298711
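
As an alternative sketch, the same two-row plotting frame could be built in one chain from `DataFrame.mean()`, which returns one average per column. The frame `toy_clean` below is a made-up stand-in for `df_clean`, so the numbers differ from the output above.

```python
import pandas as pd

# Toy stand-in for df_clean; values are for illustration only.
toy_clean = pd.DataFrame({
    "rev_men": [100.0, 300.0],
    "rev_women": [50.0, 150.0],
})

toy_plot = (
    toy_clean.mean()                                        # one mean per column
    .rename(index={"rev_men": "Men", "rev_women": "Women"})  # friendlier labels
    .rename_axis("Gender")                                  # name the index
    .reset_index(name="Average Revenue")                    # back to a DataFrame
)
print(toy_plot)
```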


### Step 4:  Plot the Data

In [11]:
# Plot a bar chart of average revenue by gender
fig = px.bar(df_plot, x="Gender", y="Average Revenue", title="Average Sports Revenue by Gender")
fig.show()

### Step 5:  Build the Dash App

In [None]:
# Create the app and set the browser tab title
app = dash.Dash(__name__)
app.title = 'Average Sports Revenue by Gender'

# Layout: a centered heading above the bar chart from Step 4
app.layout = html.Div([
    html.H1("Average Sports Revenue by Gender", style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])

# Start the development server on port 8050
if __name__ == "__main__":
    app.run(debug=True, port=8050)

### Step 6:  View in Browser

While the cell above is running, open http://localhost:8050/ in a browser to confirm that the app loads.
