# Storytelling with Data! in Altair

by Maisa de Oliveira Fraiz

## Introduction

This project aims to replicate the examples from Cole Nussbaumer Knaflic's book, "Storytelling with Data - Let's Practice!", using the Python library `Altair`. The primary objective is to document the reasoning behind the modifications the author proposes, while also highlighting the challenges of moving from the book's Excel-based approach to a programming environment.

`Altair` was selected for its declarative syntax, built-in interactivity, grammar-of-graphics foundation, and compatibility with `Streamlit` and other web tools, all within the familiar Python environment. Anticipated challenges include Altair's smaller documentation base and development community compared to more established libraries such as `Matplotlib`, `Seaborn`, or `Plotly`. Furthermore, tasks that appear straightforward in Excel may take several iterations to translate into code.


## Imports

In [15]:
import pandas as pd
import numpy as np
import altair as alt

## Chapter 6 - tell a story

*Data in a spreadsheet or facts on a slide aren’t things that naturally stick with
us—they are easily forgotten. Stories, on the other hand, are memorable.*

### Exercise 6 - differentiate between live & standalone stories

The data for this exercise can be found here: https://www.storytellingwithdata.com/letspractice/downloads

In [54]:
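# Read spreadsheet columns C to H; the header sits on the fifth row and the 5 trailing rows are skipped
table = pd.read_excel(r"..\..\Data\6.6 EXERCISE.xlsx", usecols = [2, 3, 4, 5, 6, 7], header = 4, skipfooter = 5)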

table

| | Unnamed: 2 | Unnamed: 3 | Internal | External | Overall | Goal |
|---|---|---|---|---|---|---|
| 0 | Jan | 2019-01-01 | 47.6 | 44.8 | 45.05 | 60 |
| 1 | Feb | 2019-02-01 | 37.9 | 48.5 | 47.25 | 60 |
| 2 | Mar | 2019-03-01 | 17.6 | 49.5 | 46.15 | 60 |
| 3 | Apr | 2019-04-01 | 18.6 | 55.2 | 50.35 | 60 |
| 4 | May | 2019-05-01 | 40.6 | 56.5 | 55.55 | 60 |
| 5 | Jun | 2019-06-01 | 28.8 | 60.7 | 53.85 | 60 |
| 6 | Jul | 2019-07-01 | 27.1 | 44.2 | 42.85 | 60 |
| 7 | Aug | 2019-08-01 | 36.9 | 29.0 | 31.15 | 60 |
| 8 | Sep | 2019-09-01 | 37.1 | 61.2 | 59.15 | 60 |
| 9 | Oct | 2019-10-01 | 25.9 | 44.9 | 41.55 | 60 |


In [55]:
table.rename(columns = {'Unnamed: 2': '2019', 'Unnamed: 3': 'Date'}, inplace = True)
table

| | 2019 | Date | Internal | External | Overall | Goal |
|---|---|---|---|---|---|---|
| 0 | Jan | 2019-01-01 | 47.6 | 44.8 | 45.05 | 60 |
| 1 | Feb | 2019-02-01 | 37.9 | 48.5 | 47.25 | 60 |
| 2 | Mar | 2019-03-01 | 17.6 | 49.5 | 46.15 | 60 |
| 3 | Apr | 2019-04-01 | 18.6 | 55.2 | 50.35 | 60 |
| 4 | May | 2019-05-01 | 40.6 | 56.5 | 55.55 | 60 |
| 5 | Jun | 2019-06-01 | 28.8 | 60.7 | 53.85 | 60 |
| 6 | Jul | 2019-07-01 | 27.1 | 44.2 | 42.85 | 60 |
| 7 | Aug | 2019-08-01 | 36.9 | 29.0 | 31.15 | 60 |
| 8 | Sep | 2019-09-01 | 37.1 | 61.2 | 59.15 | 60 |
| 9 | Oct | 2019-10-01 | 25.9 | 44.9 | 41.55 | 60 |


In [56]:
# Keep only the month label and the two series that will be plotted
table.drop(columns = ['Date', 'Goal', 'Overall'], inplace = True)

In [57]:
melted_table = pd.melt(table, id_vars = ['2019'], var_name = 'Metric', value_name = 'Value')
melted_table

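| | 2019 | Metric | Value |
|---|---|---|---|
| 0 | Jan | Internal | 47.6 |
| 1 | Feb | Internal | 37.9 |
| 2 | Mar | Internal | 17.6 |
| 3 | Apr | Internal | 18.6 |
| 4 | May | Internal | 40.6 |
| 5 | Jun | Internal | 28.8 |
| 6 | Jul | Internal | 27.1 |
| 7 | Aug | Internal | 36.9 |
| 8 | Sep | Internal | 37.1 |
| 9 | Oct | Internal | 25.9 |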
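
To make the reshaping step concrete, here is a minimal sketch of what `pd.melt` does, using only the first three rows of the table above. The names `wide` and `long_table` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# First three rows of the exercise table, after dropping Date, Goal and Overall.
wide = pd.DataFrame({
    "2019": ["Jan", "Feb", "Mar"],
    "Internal": [47.6, 37.9, 17.6],
    "External": [44.8, 48.5, 49.5],
})

# Wide -> long: one row per (month, metric) pair. The month column is kept
# as the identifier; the former column names land in "Metric".
long_table = pd.melt(wide, id_vars=["2019"], var_name="Metric", value_name="Value")

print(long_table)
```

The long format matters here because Altair maps a single column (`Metric`) to the color channel, so each series has to be a set of rows rather than a separate column.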


In [99]:
# Left-aligned, normal-weight title shared by the charts in this exercise
title_chart = alt.Title("Time to fill",
                        fontSize = 18,
                        fontWeight = 'normal',
                        anchor = 'start',
                        offset = 10)

line = alt.Chart(melted_table, title = title_chart).mark_line().encode(
    x = alt.X('2019',
               sort = None,
               axis = alt.Axis(labelAngle = 0, 
                               titleAnchor = 'start',
                               labelColor = "#888888", 
                               titleColor = '#888888', 
                               titleFontWeight = 'normal', 
                               ticks = False), 
               ),
    y = alt.Y('Value', 
              axis = alt.Axis(grid = False, 
                              titleAnchor = 'end',
                              labelColor = "#888888", 
                              titleColor = '#888888', 
                              titleFontWeight = 'normal'), 
              title = "TIME TO FILL (DAYS)"),
    # Draw both series in black and hide the legend
    color = alt.Color("Metric", scale = alt.Scale(range = ['black', 'black']), legend = None),
    ).properties(width = 500)

# Dashed horizontal rule at the 60-day goal, spanning Jan to Dec
goal = alt.Chart().mark_rule(strokeDash = [4,4]).encode(
    x = alt.datum('Jan'),
    x2 = alt.datum('Dec'),
    y = alt.datum(60)
)

final = line + goal

final.configure_view(stroke = None)

![Alt text](\Images\6_6a.png)

In [104]:

# Fully transparent line: keeps the title and axes on screen while hiding the data
empty = alt.Chart(melted_table, title = title_chart).mark_line(opacity = 0).encode(
    x = alt.X('2019',
               sort = None,
               axis = alt.Axis(labelAngle = 0, 
                               titleAnchor = 'start',
                               labelColor = "#888888", 
                               titleColor = '#888888', 
                               titleFontWeight = 'normal', 
                               ticks = False), 
               ),
    y = alt.Y('Value', 
              axis = alt.Axis(grid = False, 
                              titleAnchor = 'end',
                              labelColor = "#888888", 
                              titleColor = '#888888', 
                              titleFontWeight = 'normal'), 
              title = "TIME TO FILL (DAYS)")
            ).properties(width = 500)
empty.configure_view(stroke = None)

![Alt text](\Images\6_6b.png)

In [106]:
(empty + goal).configure_view(stroke = None)

![Alt text](\Images\6_6c.png)

![Alt text](\Images\6_6d.png)

![Alt text](\Images\6_6e.png)

![Alt text](\Images\6_6f.png)

![Alt text](\Images\6_6g.png)

![Alt text](\Images\6_6h.png)

![Alt text](\Images\6_6i.png)

![Alt text](\Images\6_6j.png)

![Alt text](\Images\6_6k.png)

![Alt text](\Images\6_6l.png)

![Alt text](\Images\6_6m.png)

![Alt text](\Images\6_6n.png)