## 📘 Data Challenge 11 – Dash App with a Pie or Scatter Plot

Assignment Type: Partner/Group 
Estimated Time: 45–60 minutes

---
### 🎯 Targeted KSBs (Knowledge, Skills, and Behaviors)
S6 – Creates dynamic visualizations using Python (Plotly)

S7 – Designs clear dashboards using Dash

K10 – Chooses appropriate chart types for data and audience

B6 – Applies structured thinking to turn analysis into a dashboard app

---

### 📊 Scenario:
You’re helping an athletic director create a quick dashboard for just **five sports**. She wants to better understand how men’s and women’s revenue compare across a few programs. Your job is to make a simple Dash app with a pie chart or scatter plot showing the difference.

---
### ✅ Your Task (Step-by-Step)

- Load the sports.csv file using pandas.

- Drop missing values in the sports, rev_men, and rev_women columns.

- Pick 5 unique sports (your choice!) and filter the DataFrame to only include those.

    - Example: "Basketball", "Tennis", "Soccer", "Volleyball", "Golf"

- Create a new column called "Total_Revenue" by adding men’s and women’s revenue.

- Create either a pie chart or a scatter plot:

In [39]:
# Import packages 

import dash
from dash import html, dcc
import pandas as pd
import plotly.express as px
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

In [40]:
# Load the data; low_memory=False avoids the mixed-dtype warning on import
df = pd.read_csv('/Users/gabriel/Desktop/marcy/DA2025_Lectures2/Mod2/data/sports.csv', low_memory=False)

# Keep only the columns we need and drop rows with missing values
df = df[["sports", "rev_men", "rev_women"]].dropna()



In [41]:
# Pick 5 sports (names must match the values in the "sports" column exactly)
top5 = ['Basketball', 'All Track Combined', 'Tennis', 'Golf', 'Soccer']

# Copy the filtered rows so later changes do not modify the original DataFrame
df_5 = df[df["sports"].isin(top5)].copy()
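
A sport name that does not match the `sports` column exactly is silently dropped by `isin`, so it can help to check your picks first. The sketch below uses a small made-up DataFrame as a stand-in for `sports.csv`; the names `wanted` and `missing` are illustrative only.

```python
import pandas as pd

# Stand-in data (assumption): the real names come from sports.csv
df_demo = pd.DataFrame({
    "sports": ["Basketball", "Tennis", "Golf", "Soccer",
               "All Track Combined", "Basketball"],
})

# Hypothetical picks; "Track" is deliberately not an exact match
wanted = ["Basketball", "Track", "Tennis"]

# Compare the picks against the names that actually appear in the data
available = set(df_demo["sports"].unique())
missing = [name for name in wanted if name not in available]

print("Not found in the data:", missing)
```

If `missing` is not empty, look through `df["sports"].unique()` for the exact spelling before filtering.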

In [42]:
# Create a new column called Total_Revenue by adding men's and women's revenue row by row
# (calling .sum() on each column would assign the same grand total to every row)
df_5["Total_Revenue"] = df_5['rev_men'] + df_5['rev_women']
print(df_5.head())

                sports    rev_men  rev_women  Total_Revenue
1           Basketball  1211095.0   748833.0      1959928.0
2   All Track Combined   183333.0   315574.0       498907.0
7               Tennis    78274.0   131145.0       209419.0
11          Basketball  4189826.0  1966556.0      6156382.0
13                Golf   407728.0   346987.0       754715.0
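
Each sport appears once per school, so a chart of revenue by sport needs one total per sport. A pie chart adds up values that share a label, but computing the totals yourself makes them easy to inspect. This is a minimal sketch that uses only the five rows printed above as stand-in data; `rows` and `totals` are illustrative names.

```python
import pandas as pd

# Stand-in data: the five rows printed above, not the full dataset
rows = pd.DataFrame({
    "sports": ["Basketball", "All Track Combined", "Tennis", "Basketball", "Golf"],
    "rev_men": [1211095.0, 183333.0, 78274.0, 4189826.0, 407728.0],
    "rev_women": [748833.0, 315574.0, 131145.0, 1966556.0, 346987.0],
})

# Row-wise total: one value per row
rows["Total_Revenue"] = rows["rev_men"] + rows["rev_women"]

# One total per sport, largest first
totals = rows.groupby("sports")["Total_Revenue"].sum().sort_values(ascending=False)
print(totals)
```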


In [43]:
# Make your pie chart or scatter plot using Plotly

fig = px.pie(df_5, values='Total_Revenue', names='sports',
             title='Total Revenue by Sports')

# Show the plot
fig.show()


In [None]:
# Make the app -- DO NOT RUN THIS CELL YET. It may give you a "port already in use" error if you do.

app = dash.Dash(__name__)
app.title = "Sports Dashboard"

app.layout = html.Div([
    html.H1("Revenue Analysis for 5 Sports", style={'textAlign': 'center'}),
    dcc.Graph(figure=fig)
])

if __name__ == '__main__':
    app.run(debug=True)


### Copy and paste the code in this notebook into a file called `app.py` and run that file (`python app.py`). Then open http://localhost:8050/ in your browser to see the dashboard.
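
If you hit the "port already in use" error, another process (often an earlier run of the app) is still listening on port 8050. The sketch below is one way to check for a free port using only the standard library; `port_is_free` and `pick_port` are hypothetical helper names, and the candidate ports are arbitrary.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, meaning the port is taken
        return s.connect_ex((host, port)) != 0


def pick_port(candidates=(8050, 8051, 8052)):
    """Return the first candidate port that is free."""
    for port in candidates:
        if port_is_free(port):
            return port
    raise RuntimeError("None of the candidate ports are free")


port = pick_port()
print(f"Use this port: {port}")
# In app.py you could then call: app.run(debug=True, port=port)
```

If the app runs on a different port, use that port in the browser address instead of 8050.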