# Graphs

We want to graph the population trends of a single state over time.

To do this, let's start by reading in our data and filtering it to a single state.

In [34]:
import pandas as pd
from IPython.display import display

df = pd.read_csv("state_data.csv")

ca_mask = df["State"] == "California"
df_ca = df[ca_mask]
df_ca

Unnamed: 0,State,Year,Total Population,Median Household Income,State Abbrev
72,California,2005,35278768,53629,CA
73,California,2006,36457549,56645,CA
74,California,2007,36553215,59948,CA
75,California,2008,36756666,61021,CA
76,California,2009,36961664,58931,CA
77,California,2010,37349363,57708,CA
78,California,2011,37691912,57287,CA
79,California,2012,38041430,58328,CA
80,California,2013,38332521,60190,CA
81,California,2014,38802500,61933,CA
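
The filtering step above relies on a boolean mask. As a minimal sketch of what happens, the snippet below uses a tiny hand-built dataframe (`df_demo` is a hypothetical stand-in for `state_data.csv`, with rows copied from the outputs in this notebook) so it can run without the CSV file.

```python
import pandas as pd

# Hypothetical miniature of state_data.csv; values copied from this notebook's outputs.
df_demo = pd.DataFrame(
    {
        "State": ["Alabama", "California", "California"],
        "Year": [2005, 2005, 2006],
        "Total Population": [4442558, 35278768, 36457549],
    }
)

# Comparing a column to a value produces one True/False per row.
ca_mask_demo = df_demo["State"] == "California"
print(ca_mask_demo.tolist())  # [False, True, True]

# Indexing the dataframe with the mask keeps only the rows marked True.
df_ca_demo = df_demo[ca_mask_demo]
print(df_ca_demo["Year"].tolist())  # [2005, 2006]
```

Note that the filtered rows keep their original index labels, which is why the California rows in the output above start at 72 rather than 0.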


# Plotly

In this course we'll be using the Plotly library to make interactive graphs. 

Plotly's "Plotly Express" module makes it easy to create graphs. Most of its functions have descriptive names (like `line` for making line graphs) and take:

  * A dataframe as the first argument
  * `x`: The column to use for the x-axis
  * `y`: The column to use for the y-axis
  * `title`: The text to use for the title

Here's an example that creates a line graph of the population of California over time. Note that when you hover over the line, you can see the exact values (year, population) of the point.

In [35]:
import plotly.express as px

px.line(
    df_ca, x="Year", y="Total Population", title="Population of California over Time"
)

# Exercise: Plotly Graphs in Jupyter Notebooks

1. Use `px.bar` to create a graph of California's population over time. What are the pros and cons of each visualization? (Hint: `px.bar` takes the same parameters as `px.line`.)
2. The line graph would be clearer if it also showed a marker for each data point. Copy the `px.line` code above into Copilot and ask it to modify the code so that it also draws a point for each data point.

In [36]:
# Your code here.

# Plotly Graphs in Streamlit

To output a Plotly graph in a Streamlit app, you must call the function `st.plotly_chart`.

A common pattern is:

```py
fig = px.line(
    df_ca, x="Year", y="Total Population", title="Population of California over Time"
)
st.plotly_chart(fig)
```

That is, it's common to store the result of `px.line` in a variable called `fig` and then pass `fig` to `st.plotly_chart()`.

# f-Strings

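For the next exercise, it will help to know about "f-strings". An f-string is a string literal prefixed with `f`. Any expression inside curly braces is replaced by its value.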

In [37]:
state = "New York"
title = f"Graph for {state}"
title

'Graph for New York'
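
F-strings can hold more than one value, and a format specifier after a colon controls how a number is printed. The following is a small sketch; the variable names are illustrative, and the population figure is the 2014 California value from the table earlier in this notebook.

```python
state = "California"
year = 2014
population = 38802500  # 2014 value from the California table

# Several values in one string; ":," adds thousands separators.
title = f"{state} population in {year}: {population:,}"
print(title)  # California population in 2014: 38,802,500

# Expressions also work inside the braces; ":.1f" keeps one decimal place.
summary = f"Roughly {population / 1_000_000:.1f} million people"
print(summary)  # Roughly 38.8 million people
```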

# Exercise: Plotly Graphs in Streamlit

Update `graph_app.py` to create a line graph of the population of the selected state.

# Final Exam Part 1: Inputs & Graphs

This exercise can be thought of as a "final exam" for inputs and graphs, as it requires you to create a new input and connect it to an existing graph. 

Create a UI widget that lets the user select which demographic they want to graph. Use the value from that widget, as well as the state selection widget, to decide which graph to make.

Here are the steps:
  1. Add a new selectbox to `graph_app.py`. Populate it with the values `Total Population` and `Median Household Income`.
  2. Store the value returned from that selectbox in a variable called `demographic`.
  3. Use the value of `demographic` as the value for `y` in your line graph.

# Choropleth Maps

A choropleth map shows regions (like states), and expresses values for those regions (like population) using color. 

Use `px.choropleth` to create a choropleth. Instead of specifying `x` and `y`, you specify:
  * `locations`: The column that identifies the location of the observation (ex. "NY").
  * `color`: The column you want to map to color (ex. "Total Population").

The `title` parameter works the same way, but two other parameters are unique to maps:
  * `scope='usa'` zooms the map in on the US.
  * `locationmode='USA-states'` clarifies that locations are recorded as two-letter US state abbreviations. In this mode the function does not understand full state names, so the location column must be `State Abbrev`.
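
Our data already has a `State Abbrev` column, but a dataset that only had full state names would need one added first. As a minimal sketch, assuming a lookup dictionary (`STATE_TO_ABBREV` is hypothetical and lists only three states here), pandas' `map` can build the column:

```python
import pandas as pd

# Hypothetical lookup table; a real one would cover every state.
STATE_TO_ABBREV = {"Alabama": "AL", "California": "CA", "Wyoming": "WY"}

# Hypothetical dataframe that has full state names but no abbreviations.
df_demo = pd.DataFrame({"State": ["Alabama", "California", "Wyoming"]})

# Look up each full name in the dictionary to build the abbreviation column.
df_demo["State Abbrev"] = df_demo["State"].map(STATE_TO_ABBREV)
print(df_demo["State Abbrev"].tolist())  # ['AL', 'CA', 'WY']
```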

Below is code to create a choropleth map of the population of US states for 2023.

In [39]:
df

Unnamed: 0,State,Year,Total Population,Median Household Income,State Abbrev
0,Alabama,2005,4442558,36879,AL
1,Alabama,2006,4599030,38783,AL
2,Alabama,2007,4627851,40554,AL
3,Alabama,2008,4661900,42666,AL
4,Alabama,2009,4708708,40489,AL
...,...,...,...,...,...
931,Wyoming,2018,577737,61584,WY
932,Wyoming,2019,578759,65003,WY
933,Wyoming,2021,578803,65204,WY
934,Wyoming,2022,581381,70042,WY


In [55]:
mask = df["Year"] == 2023
df_2023 = df[mask]

px.choropleth(
    df_2023,
    locations="State Abbrev",  # Column for region
    locationmode="USA-states",
    color="Total Population",  # Column for color
    scope="usa",
    title="2023 Total Population",
)
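
The Wyoming rows in the output above jump from 2019 to 2021, so not every year is necessarily present in the data. A mask for a missing year raises no error; it simply selects no rows, which leaves nothing to draw on the map. The sketch below shows one way to check which years are available before filtering. It assumes a small hand-built dataframe (`df_demo` is hypothetical, with rows copied from the output above).

```python
import pandas as pd

# Hypothetical miniature of df; values copied from the output above.
df_demo = pd.DataFrame(
    {
        "State Abbrev": ["AL", "AL", "WY", "WY"],
        "Year": [2005, 2006, 2021, 2022],
        "Total Population": [4442558, 4599030, 578803, 581381],
    }
)

# List the distinct years that actually appear in the data.
available_years = sorted(df_demo["Year"].unique().tolist())
print(available_years)  # [2005, 2006, 2021, 2022]

# Filtering on a year that is not in the data gives an empty dataframe.
df_2020_demo = df_demo[df_demo["Year"] == 2020]
print(df_2020_demo.empty)  # True

# Filtering on a year that is present keeps the matching rows.
df_2022_demo = df_demo[df_demo["Year"] == 2022]
print(len(df_2022_demo))  # 1
```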

# Final Exam Part 2: Choropleth Map

1. Copy the above code, verbatim, to the app. Verify that it works.
2. Connect the map to the `demographic` selectbox. Do the results surprise you?
3. Create a new selectbox that lets the user select which `Year` of data to map.
4. Connect the year selectbox to the map. 

The result should be an app that lets users select which year and demographic statistic to map.