# Data Journalism Lesson 18: Slope charts

Comparing one time period to another.

In [None]:
import warnings
from IPython.core.interactiveshell import InteractiveShell

# Keep hold of the real method
_orig_should_run = InteractiveShell.should_run_async

# Wrap it so that any DeprecationWarning it emits is silenced
def should_run_async(self, code, *args, **kwargs):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DeprecationWarning)
        return _orig_should_run(self, code, *args, **kwargs)

# Apply the monkey-patch
InteractiveShell.should_run_async = should_run_async

In [None]:
import micropip
await micropip.install('plotly')
await micropip.install("nbformat>=4.2.0")

In [None]:
from IPython.display import display, HTML
import pandas as pd

# --- Simple Grading/Checking Functions ---
def display_feedback(correct, message_correct, message_incorrect):
    if correct:
        display(HTML(f'<div style="background-color: #dff0d8; padding: 10px; border-radius: 5px;"><strong>Correct!</strong> {message_correct}</div>'))
    else:
        display(HTML(f'<div style="background-color: #f2dede; padding: 10px; border-radius: 5px;"><strong>Not quite!</strong> {message_incorrect}</div>'))

def check_df_exists(df, df_name, expected_rows=None, expected_cols=None):
    if not isinstance(df, pd.DataFrame) or df.empty:
        # Pass the text as the "incorrect" message so it is actually displayed
        display_feedback(False, '', f'{df_name} DataFrame is not loaded correctly or is empty. Please check the loading process.')
        return False
    msg = f'{df_name} DataFrame loaded successfully.'
    correct = True
    if expected_rows is not None and len(df) != expected_rows:
        msg += f' Expected {expected_rows} rows, got {len(df)}.'
        correct = False
    if expected_cols is not None and df.shape[1] != expected_cols:
        msg += f' Expected {expected_cols} columns, got {df.shape[1]}.'
        correct = False
    
    if correct:
        display_feedback(True, msg, '')
    else:
        display_feedback(False, '', msg)
    return correct

def check_plot_params(params, expected_params, plot_name):
    correct = True
    messages = []
    for p_name, p_val_expected in expected_params.items():
        p_val_actual = params.get(p_name)
        if isinstance(p_val_actual, pd.DataFrame) and isinstance(p_val_expected, pd.DataFrame):
            if p_val_actual.equals(p_val_expected):
                messages.append(f'Correct DataFrame for {p_name} in {plot_name}.')
            else:
                correct = False
                messages.append(f'Incorrect DataFrame for {p_name} in {plot_name}. Content differs.')
        elif p_val_actual == p_val_expected:
            messages.append(f'Correct {p_name} for {plot_name}.')
        else:
            correct = False
            messages.append(f'Incorrect {p_name} for {plot_name}. Expected \'{p_val_expected}\', got \'{p_val_actual}\'.')
    final_message_correct = f'Plot parameters for {plot_name} are correct!'
    final_message_incorrect = ' '.join(messages)
    display_feedback(correct, final_message_correct, final_message_incorrect)

In [None]:
# --- State Setup and Data Loading ---
state_abbr = 'MN'
state_full_name = 'Minnesota'

peak_data_url = "../_static/peak-unemployment/peak.csv"
peak_df_initial = pd.read_csv(peak_data_url)

state_peak_df_initial = peak_df_initial[peak_df_initial['Location'] == state_full_name].copy()
staterows_expected = len(state_peak_df_initial)


In [None]:
from myst_nb import glue

glue("state_full_name", state_full_name, display=False)
glue("staterows_expected", staterows_expected, display=False)

## The Goal

In this lesson, you'll learn how to create slope charts, a powerful tool for visualizing changes between two time periods. By the end of this tutorial, you'll understand when to use slope charts, how to prepare your data, and how to construct these charts using Plotly Express. You'll practice filtering data, manipulating chart elements, and adding informative labels to create clear and impactful visualizations. These skills will enable you to effectively communicate trends and comparisons in your data journalism projects, especially when dealing with before-and-after scenarios or year-over-year changes.

## Why Visualize Data?

Darla Cameron, now the chief product officer at the Texas Tribune, has worked in visual journalism for more than 15 years, with stops in Florida, Washington, D.C., and now Texas. She's been at ground zero for an awakening in journalism that started in the 1980s, got turbocharged by the internet and now seems like it was always so: journalism can be presented graphically. And for some stories, it's vastly better to do it with graphics.

"Visual journalism is important because humans communicate visually. That's how we learn to read. That's how we learn to read our parents' faces when we're babies," Cameron said. "We are really good at understanding visual cues and visual context. So as visual journalists, we have this superpower because we can leverage that, this thing that's so innate inside people's brains, and we can tell them a story with that. It's amazing. It's so cool."

Learning what makes a better narrative versus a better graphic is a critical skill, one that will make you more valuable in any enterprise that tries to tell stories with data.

Or, as Cameron puts it, being able to communicate visually gives you a superpower.

"How do you know that somebody's going to read a long narrative?" she said. "You don't. So that's why we need to be thinking about visuals in terms of how we communicate information. It's like a superpower that you can tap into the part of people's brains that can understand information better when they learn it visually. We're visual thinkers. It's so innate."

## The Basics

A line chart, as we learned, shows change over time, with the date on the x-axis and each line representing a state or county or some other entity. But sometimes, you have just two time periods -- a line chart of two years is ... not a line chart in the traditional sense of showing a continuous trend. 

But with some fiddling with lines and points, you can create a new chart type that does show change over two time periods quite well: A slope chart. 

Think of a slope chart like the meme you see online: How it started vs. how it's going. 

For purposes of this exercise, we're going to look at unemployment again, but this time, we're just looking at April 2020 and April 2024. April 2020 is the month when unemployment was highest for almost every state. How does that compare four years later? 

Building the chart is like playing connect the dots in elementary school: each state gets two points, and we join them with a line.

The difference between slope charts and most other chart types in Plotly Express is that there is no dedicated `px.slope()` function. We're going to make one using `px.line()` and then potentially customize it by adding markers and text annotations. This is another example of where layering concepts (achieved through multiple traces or figure updates) will come in handy.

We'll need `pandas` for data manipulation and `plotly.express` for plotting.

In [None]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go # For more detailed customization if needed

And we'll grab the data: a single national file with each state's unemployment rate at two points in time, April 2020 and April 2024.

In [None]:
peak_df = pd.read_csv("../_static/peak-unemployment/peak.csv")

We should take a quick look at our data so we know what we're working with. We'll use `head()` for that.

In [None]:
display(peak_df.head())

### Exercise 1: Our first slope chart

Our data is in pretty good shape already, so we can go right to making a chart. 

A slope chart is really made up of lines connecting points for two time periods. With `plotly.express`, we can use `px.line()` and tell it to draw lines for each `Location`. We'll also add markers at each data point.
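- `x`: `year`. Assuming the column is read in as numbers, Plotly places the two years on a continuous axis rather than treating them as categories. That still works for a slope chart, and we'll tidy up the tick labels in Exercise 4.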
- `y`: `Unemployment_Rate`.
- `color`: `Location` to get a separate line/slope for each state.
- `markers`: `True` to show points at the start and end of each slope.

We'll also turn every line grey for now. One way is to modify the figure's traces after creation with `update_traces()`, which is what the exercise cell does. Another is to map `Location` to `line_group` instead of `color` and pass a one-color `color_discrete_sequence`.

In [None]:
fig_ex1 = px.line(
    data_frame=____,
    x=____,
    y=____,
    color=____, # Use 'Location' to get a line per state
    markers=____, # Set to True
    title="Unemployment Rate Change (2020 vs 2024)"
)

# Set all lines to grey initially
fig_ex1.update_traces(line=dict(color='grey'), marker=dict(color='grey'))

fig_ex1.show()

Well, we got something. But who changed the most? What should we focus on? We don't have anything to guide us here.  
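
One way to get a rough answer to "who changed the most?" from the data itself is to put the two years side by side and subtract. The sketch below does that on a made-up table (`toy_df` and its values are invented for illustration). It assumes the same three column names as our data and that `year` holds the numbers 2020 and 2024.

```python
import pandas as pd

# Made-up numbers for illustration only; not real unemployment rates
toy_df = pd.DataFrame({
    "Location": ["Place A", "Place A", "Place B", "Place B"],
    "year": [2020, 2024, 2020, 2024],
    "Unemployment_Rate": [10.0, 4.0, 8.0, 5.0],
})

# Reshape to one row per Location and one column per year
wide = toy_df.pivot(index="Location", columns="year", values="Unemployment_Rate")

# Percentage-point change from the first year to the second
wide["change"] = wide[2024] - wide[2020]

# Most negative change (the biggest drop) comes first
ranked = wide.sort_values("change")
print(ranked)
```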

Might as well be selfish and focus on {glue:text}`state_full_name`. Unless you work for a major national news organization, you're going to be focusing on where you live -- on the audience you serve. People care about where they are, so give them what they want.

### Exercise 2: Filtering

To let us focus, let’s build a dataframe called `state_peak_df` that keeps only the rows where `Location` matches your state's name.
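
This kind of filter works in two steps: a comparison that yields `True` or `False` for every row, then an indexing operation that keeps only the `True` rows. Here is a small sketch of the pattern on made-up data (`toy_df`, `Place A` and `Place B` are invented for illustration).

```python
import pandas as pd

# Made-up numbers for illustration only; not real unemployment rates
toy_df = pd.DataFrame({
    "Location": ["Place A", "Place A", "Place B", "Place B"],
    "year": [2020, 2024, 2020, 2024],
    "Unemployment_Rate": [10.0, 4.0, 8.0, 5.0],
})

# Step 1: the comparison produces one True/False value per row
is_place_b = toy_df["Location"] == "Place B"

# Step 2: indexing with that mask keeps only the rows marked True
place_b_df = toy_df[is_place_b]
print(place_b_df)
```

The exercise does the same thing in a single line by writing the comparison directly inside the square brackets.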

In [None]:
state_peak_df = peak_df[peak_df[_____] == _____]

display(state_peak_df)
check_df_exists(state_peak_df, f'{state_full_name} peak_df', expected_rows=staterows_expected)

Now, with a state to focus on, we can modify our previous chart to highlight this specific state. We'll redraw the chart with all states in grey, and then find the trace corresponding to our `state_full_name` and change its color to red.

### Exercise 3: Give readers a focus

1. Recreate the slope chart from Exercise 1.
2. Iterate through `fig_ex3.data` (the traces). If a trace's `name` (which corresponds to `Location`) matches `state_full_name`, update its line and marker color to 'red'. Otherwise, set them to light grey.

In [None]:
fig_ex3 = px.line(
    data_frame=peak_df,
    x='year',
    y='Unemployment_Rate',
    color='Location', 
    markers=True,
    title=f"Unemployment Rate Change: {state_full_name} vs. Others"
)

# Highlight the specific state
for trace in fig_ex3.data:
    if trace.name == ____: # The name of your state_full_name
        trace.line.color = ____ # e.g., 'red'
        trace.marker.color = ____ # e.g., 'red'
        trace.line.width = 4 # Optional: make highlighted line thicker (Plotly's default width is 2)
    else:
        trace.line.color = 'lightgrey'
        trace.marker.color = 'lightgrey'
        trace.opacity = 0.7 # Optional: make other lines slightly transparent

fig_ex3.show()

Now we can start with our questions: What story does this tell? Is it very clear? 

### Exercise 4: Getting creative with spacing and breaks

Now that we have something to work with, it's time to start improving it. It would be good to label the state we're highlighting, ideally at each end of its line so people can follow it. By default, Plotly leaves little room beside the first and last points and may add ticks for the years in between, so we need to clean up the axis and make some space.

We can adjust the x-axis using `fig.update_xaxes()`. 
- `tickvals`: Sets specific tick locations (e.g., `[2020, 2024]`).
- `range`: Sets the visible range of the axis (e.g., `[2019.5, 2024.5]`) to add padding.

In [None]:
# We'll apply this to fig_ex3 (our highlighted chart)
fig_ex4 = go.Figure(fig_ex3) # Create a copy to modify

fig_ex4.update_xaxes(
    tickvals=[____, ____],
    range=[____, ____]
)

fig_ex4.show()

Aha! Better. Now only the two years are labeled, and we have some space to add our state name to help people.

### Exercise 5: Adding labels

To add text labels next to the points of our highlighted state, we can use `fig.add_annotation()`. We'll iterate through `state_peak_df` (which has two rows for our highlighted state, one for each year) and add an annotation for each point.

- `x`, `y`: Coordinates for the annotation (from the DataFrame).
- `text`: The label text (the state's `Location`).
- `showarrow=False`: To have just text, no arrow.
- `xanchor`, `xshift`: To control horizontal position and nudge the text slightly left or right of the point.


In [None]:
# We'll apply this to fig_ex4 (our chart with custom x-axis)
fig_ex5 = go.Figure(fig_ex4) # Create a copy to modify

for index, row in state_peak_df.iterrows():
    fig_ex5.add_annotation(
        x=row['year'], 
        y=row['Unemployment_Rate'], 
        text=row[____],
        showarrow=False,
        xanchor='left' if row['year'] == state_peak_df['year'].max() else 'right',
        xshift=7 if row['year'] == state_peak_df['year'].max() else -7, # Nudge amount in pixels
        font=dict(color='red', size=10) # Match highlight color
    )

fig_ex5.update_layout(showlegend=False) # Hide legend if labels make it redundant
fig_ex5.show()

And there you have it. A clean and clear slope chart showing how the peak of the pandemic compares to four years later for {glue:text}`state_full_name` relative to other states.

## The Recap

Throughout this lesson, you've built slope charts to visualize change between two time periods using Plotly Express and Plotly graph objects. You've learned how to prepare your data, construct the basic chart with lines and markers, highlight a specific series, customize the x-axis, and add text annotations for clarity. Slope charts are particularly effective for showing how different entities (like states in our example) change relative to each other between two distinct points in time. Reach for them when you need to communicate before-and-after comparisons or year-over-year changes clearly.

## Terms to Know

- **Slope chart**: A type of chart that shows changes in values for different categories between two time points, connecting the points with lines (slopes).
- **`plotly.express.line()` (or `px.line()`)**: Used as the base for creating slope charts by drawing lines between the two time points for each category.
- **`markers=True`**: An argument in `px.line()` to display markers at each data point (start and end of the slope).
- **`color` argument (in `px.line`)**: Used to differentiate categories (e.g., states) with different colored lines and markers.
- **`fig.update_traces()`**: A Plotly method to modify properties (like color, width, opacity) of traces (lines/markers) in a figure.
- **`fig.update_xaxes()`**: A Plotly method to customize the x-axis, including `tickvals` (specific tick locations) and `range` (axis limits).
- **`fig.add_annotation()`**: A Plotly method to add text labels or other annotations to a figure at specific x/y coordinates.
- **`xanchor`, `xshift` (in annotations)**: Properties to control the horizontal alignment and pixel offset of text annotations.