# Graphs

We want to graph the population trends of a single state over time.

To do this, let's start by reading in our data and filtering it to a single state.

In [6]:
import pandas as pd
from IPython.display import display

df = pd.read_csv("state_data.csv")

ca_mask = df["State"] == "California"
df_ca = df[ca_mask]
df_ca

Unnamed: 0,State,Year,Total Population,Median Household Income
72,California,2005,35278768,53629
73,California,2006,36457549,56645
74,California,2007,36553215,59948
75,California,2008,36756666,61021
76,California,2009,36961664,58931
77,California,2010,37349363,57708
78,California,2011,37691912,57287
79,California,2012,38041430,58328
80,California,2013,38332521,60190
81,California,2014,38802500,61933


# Plotly

In this course we'll be using the Plotly library to make interactive graphs. 

Plotly's "Plotly Express" module makes it easy to create graphs. Most of its functions have descriptive names (like `line` for making line graphs) and take:

  * A dataframe as the first argument
  * `x`: The column to use for the x-axis
  * `y`: The column to use for the y-axis
  * `title`: The text to use for the title

Here's an example that creates a line graph of the population of California over time. Note that when you hover over the line, you can see the exact values (year, population) of the point.

In [27]:
import plotly.express as px

px.line(
    df_ca, x="Year", y="Total Population", title="Population of California over Time"
)

# Exercise: Plotly Graphs in Jupyter Notebooks

1. Use `px.bar` to create a bar graph of California's population over time. What are the pros and cons of the bar graph compared with the line graph?
2. The line graph would be clearer if it also had a marker at each data point. Copy the `px.line` code above into Copilot and ask it to modify the code so that it also shows a point for each data point.

In [26]:
# Your code here.

# Plotly Graphs in Streamlit

To display a Plotly graph in a Streamlit app, you must call the function `st.plotly_chart`.

A common pattern is:

```py
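import plotly.express as px
import streamlit as st

# Assumes df_ca has been created as shown earlier.
fig = px.line(
    df_ca, x="Year", y="Total Population", title="Population of California over Time"
)
st.plotly_chart(fig)  # render the figure in the app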
```

That is, it's common to store the result of `px.line` in a variable called `fig` and then pass `fig` to `st.plotly_chart()`.

# f-Strings

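For the next exercise, it will help to know about "f-strings". An f-string is a string written with an `f` before the opening quote. Anything inside curly braces is evaluated, and the result is inserted into the string.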

In [29]:
state = "New York"
title = f"Graph for {state}"
title

'Graph for New York'
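
An f-string can hold more than one value, and a value can be followed by a format specifier. As a small sketch, the example below builds a title from three variables and uses `:,` to add thousands separators. The variable names are illustrative; the numbers are taken from the 2014 row of the California table above.

```python
state = "California"
year = 2014
population = 38802500

# {population:,} inserts the number with commas as thousands separators.
title = f"Population of {state} in {year}: {population:,}"
print(title)  # Population of California in 2014: 38,802,500
```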

# Exercise: Plotly Graphs in Streamlit

Update `graph_app.py` to create a line graph of the population of the selected state.

# Final Exam: Inputs & Graphs

This exercise can be thought of as a "final exam" for inputs and graphs, as it requires you to create a new input and connect it to an existing graph. 

Create a UI widget that lets the user select which demographic they want to graph. Use the value from that widget, as well as the state selection widget, to decide which graph to make.

Here are the steps:
  1. Add a new select box to `graph_app.py`. Populate it with the values `Total Population` and `Median Household Income`.
  2. Store the value returned from that select box in a variable called `demographic`.
  3. Use the value of `demographic` as the value for `y` in your line graph.

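# Choropleth Maps

A choropleth map shades each region according to a data value. Plotly's `USA-states` location mode identifies states by their two-letter abbreviations, so we first add an abbreviation column to our data.
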
In [22]:
state_abbrev = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa": "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "Puerto Rico": "PR",
}

df["State Abbrev"] = df["State"].map(state_abbrev)
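
One thing to watch for: `map` returns a missing value (`NaN`) for any name that isn't a key in the dictionary. The sketch below shows one way to list such names. Here `abbrev_sample` and `names` are small made-up stand-ins for `state_abbrev` and `df`.

```python
import pandas as pd

# Hypothetical two-entry lookup standing in for the full state_abbrev dict.
abbrev_sample = {"Alabama": "AL", "Alaska": "AK"}

# Hypothetical stand-in for df, with one name that is not in the lookup.
names = pd.DataFrame({"State": ["Alabama", "Alaska", "District of Columbia"]})
names["State Abbrev"] = names["State"].map(abbrev_sample)

# Collect every state name that did not get an abbreviation.
unmapped = names.loc[names["State Abbrev"].isna(), "State"].unique().tolist()
print(unmapped)  # ['District of Columbia']
```

Run against the real `df`, this kind of check would flag District of Columbia, which has no entry in `state_abbrev` and so shows a blank `State Abbrev` in the table below.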

In [23]:
df_2021 = df[df["Year"] == 2021]
df_2021

Unnamed: 0,State,Year,Total Population,Median Household Income,State Abbrev
15,Alabama,2021,5039877,53913,AL
33,Alaska,2021,732673,77845,AK
51,Arizona,2021,7276316,69056,AZ
69,Arkansas,2021,3025891,52528,AR
87,California,2021,39237836,84907,CA
105,Colorado,2021,5812069,82254,CO
123,Connecticut,2021,3605597,83771,CT
141,Delaware,2021,1003384,71091,DE
159,District of Columbia,2021,670050,90088,
177,Florida,2021,21781128,63062,FL


In [24]:
px.choropleth(
    df_2021,
    locations="State Abbrev",
    locationmode="USA-states",
    color="Median Household Income",
    scope="usa",
    #    color_continuous_scale='Viridis',
    #    labels={'Total Population': 'Population'},
    title="Median Household Income by U.S. State",
)