# Storytelling with Data! in Altair

by Maisa de Oliveira Fraiz

## Introduction

This project aims to replicate the examples from Cole Nussbaumer Knaflic's book, "Storytelling with Data: Let's Practice!", using the Python library `Altair`. Our primary objective is to document the reasoning behind the modifications the author proposes, while also highlighting the challenges of moving from the book's Excel-based approach to a programmatic one.

`Altair` was selected for this project because of its declarative syntax, built-in interactivity, grammar-of-graphics foundation, and compatibility with `Streamlit` and other web tools, all within the user-friendly Python environment. Anticipated challenges include Altair's smaller documentation base and development community compared with more established libraries such as `Matplotlib`, `Seaborn`, or `Plotly`. In addition, tasks that appear straightforward in Excel may take several iterations to translate effectively into code.


## Imports

In [1]:
import pandas as pd
import numpy as np
import altair as alt

## Chapter 4 - Focus Attention

*Where do you want your audience to look?*

### Exercise 2 - focus on...

The data for this exercise can be found here: https://www.storytellingwithdata.com/letspractice/downloads

In [2]:
table = pd.read_excel(r"..\..\Data\4.2 EXERCISE.xlsx")
table

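| | EXERCISE 4.2 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 |
|---|---|---|---|---|---|---|---|---|
| 0 | | | | | | | | |
| 1 | | | | | | | | |
| 2 | | DATA TO GRAPH | | | | | | |
| 3 | | | | | | | | |
| 4 | | | $ Vol % change | spacing for dot plot | | | | |
| 5 | | Fran's Recipe | -0.14 | 0 | | | | |
| 6 | | Wholesome Goodness | -0.13 | 1 | | | | |
| 7 | | Lifestyle | -0.1 | 2 | | | | |
| 8 | | Coat protection | -0.09 | 3 | | | | |
| 9 | | Diet Lifestyle | -0.08 | 4 | | | | |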


In [3]:
del table

In [4]:
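# Re-read only the data block: sheet columns B to D, with the header on the sixth row
# (header is 0-indexed) and the trailing rows skipped.
table = pd.read_excel(r"..\..\Data\4.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 5, skipfooter = 30)

# Copy the two columns of interest under clearer names, then drop the originals.
table['Brands'] = table['Unnamed: 1']
table['Change'] = table['$ Vol % change']

table.drop(columns = ['Unnamed: 1', '$ Vol % change'], inplace = True)
table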

| | spacing for dot plot | Brands | Change |
|---|---|---|---|
| 0 | 0 | Fran's Recipe | -0.14 |
| 1 | 1 | Wholesome Goodness | -0.13 |
| 2 | 2 | Lifestyle | -0.1 |
| 3 | 3 | Coat protection | -0.09 |
| 4 | 4 | Diet Lifestyle | -0.08 |
| 5 | 5 | Feline Basics | -0.05 |
| 6 | 6 | Lifestyle Plus | -0.04 |
| 7 | 7 | Feline Freedom | -0.02 |
| 8 | 8 | Feline Gold | 0.01 |
| 9 | 9 | Feline Platinum | 0.01 |


In [5]:
alt.Chart(table).mark_bar().encode(
    x = "Change",
    y = "Brands")

In [6]:
alt.Chart(table).mark_bar().encode(
    x = "Change",
    y = alt.Y("Brands", sort = None))

In [7]:
chart = alt.Chart(
    table, title = alt.Title(
       "Cat food brands: YoY sales change",
       subtitle = "% CHANGE IN VOLUME ($)",
       color = "black",
       subtitleColor = "gray",
       offset = 10,
       anchor = "start",
       fontSize = 19, 
       subtitleFontSize = 11,
       fontWeight = "normal"
       )
    ).mark_bar(color = "#8b8b8b", size = 15).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None)
    )

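# alt.condition cannot be applied to a positional channel such as x, so a single
# label layer like the one below does not work. The labels are split into two layers instead.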

#label = alt.Chart(table).mark_text(align = 'left').encode(
#    x = alt.condition("1 > 0", alt.value(155), alt.value(300)),
#    y = alt.Y('Brands', sort = None),
#    text = alt.Text('Brands')
#)



label1 = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', color = "#8b8b8b", fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )


label2 = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', color = "#8b8b8b", fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )

final = chart + label1 + label2

final.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2a.png)
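
The label positions `alt.value(207)` and `alt.value(192)` used above are pixel offsets, not data values. As a rough sketch of where they come from, assuming a plain linear x scale with no padding over the 400-pixel width set in `properties`, the zero line of the `[-0.20, 0.20]` domain falls at pixel 200, so the two label layers sit a few pixels on either side of it. The helper name `to_pixels` is hypothetical and is not part of Altair.

```python
def to_pixels(value, domain, width):
    """Map a data value to a pixel offset on a linear scale (hypothetical helper)."""
    low, high = domain
    return (value - low) / (high - low) * width


# Pixel position of the zero line for the domain and width used in this exercise.
zero_px = to_pixels(0, (-0.20, 0.20), 400)

# Labels of negative bars start just right of the zero line (left-aligned);
# labels of positive bars end just left of it (right-aligned).
negative_label_gap = 207 - zero_px
positive_label_gap = 192 - zero_px

print(zero_px, negative_label_gap, positive_label_gap)
```

If the chart width or the x domain changes, the two hard-coded pixel values would need to be recomputed in the same way.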

In [13]:
conditions = [
    f'datum.Brands == "{brand}"' for brand in table['Brands'] if 'Lifestyle' in brand
]

# Join the per-brand tests with the logical OR operator of Vega expressions.
condition = f"({' || '.join(conditions)})"


chart_bw = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('black'), alt.value('#8b8b8b'))
    )

chart = alt.Chart(
    table, title = alt.Title(
       "Cat food brands:",
       subtitle = "YEAR-OVER-YEAR % CHANGE IN VOLUME ($)",
       color = "black",
       subtitleColor = "gray",
       anchor = "start",
       fontSize = 19, 
       subtitleFontSize = 11,
       fontWeight = "normal"
       )
    ).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('black'), alt.value('#8b8b8b'))
    )

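# alt.condition cannot be applied to a positional channel such as x, so a single
# label layer like the one below does not work. The labels are kept in two layers instead.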

#label = alt.Chart(table).mark_text(align = 'left').encode(
#    x = alt.condition("1 > 0", alt.value(155), alt.value(300)),
#    y = alt.Y('Brands', sort = None),
#    text = alt.Text('Brands')
#)

label1_bw = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('black'), alt.value('#8b8b8b'))
    )

title_bw = alt.Chart(
    {"values": [{"text":  [
        'Lifestyle line brands decline'
         ]}]}
).mark_text(size = 19, align = "left", dx = 172, dy = -250, fontWeight = 700, color = 'black').encode(
    text = "text:N"
)

final = chart + label1_bw + label2 + title_bw

final.properties(width = 400).configure_view(stroke = None)

In [14]:

title_bw = alt.Chart(
    {"values": [{"text":  ["Cat food brands:"]}]}
    ).mark_text(
        size = 16, align = "left", dx = -200, dy = -270, fontWeight = 'normal', color = 'black'
        ).encode(
            text = "text:N"
            )

title_bw_bold = alt.Chart(
    {"values": [{"text":  [
        'Lifestyle line brands decline'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = 'black').encode(
    text = "text:N"
)

subtitle_bw = alt.Chart(
    {"values": [{"text":  [
        "YEAR-OVER-YEAR % CHANGE IN VOLUME ($)"
         ]}]}
).mark_text(size = 11, align = "left", dx = -200, dy = -250, fontWeight = 'normal', color = 'gray').encode(
    text = "text:N"
)



final = chart_bw + label1_bw + label2 + title_bw + title_bw_bold + subtitle_bw

final.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2b.png)
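
The highlight predicate in the cells above is a Vega expression string assembled from brand names. The sketch below shows one way this string could be built more defensively, quoting each name with `json.dumps` so that names containing quotes do not break the expression. The helper name `any_of` and the short `brands` list (a subset of the table above) are illustrative assumptions.

```python
import json


def any_of(field, values):
    """Build a Vega expression that is true when datum[field] equals any of the values.

    Hypothetical helper, not part of Altair.
    """
    # json.dumps wraps each value in double quotes and escapes embedded quotes.
    tests = [f"datum.{field} == {json.dumps(value)}" for value in values]
    return "(" + " || ".join(tests) + ")"


brands = ["Fran's Recipe", "Lifestyle", "Diet Lifestyle", "Lifestyle Plus"]
lifestyle_brands = [brand for brand in brands if "Lifestyle" in brand]

expr = any_of("Brands", lifestyle_brands)
print(expr)
```

The resulting string can be passed as the first argument of `alt.condition`, in the same way as the `condition` variable in the cells above.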

In [23]:
conditions = [
    f'datum.Brands == "{brand}"' for brand in table['Brands'] if 'Feline' in brand
]

# Join the per-brand tests with the logical OR operator of Vega expressions.
condition = f"({' || '.join(conditions)})"


chart_purple = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('#713a97'), alt.value('#8b8b8b'))
    )

label1_purple = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('#713a97'), alt.value('#8b8b8b'))
    )

label2_purple = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('#713a97'), alt.value('#8b8b8b'))
    )

title_purple = alt.Chart(
    {"values": [{"text":  [
        'most in Feline line increased'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#713a97').encode(
    text = "text:N"
)




final = chart_purple + label1_purple + label2_purple + title_bw + title_purple + subtitle_bw

final.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2c.png)

In [44]:
condition = "datum.Change < 0"

chart_orange = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('#ec7c30'), alt.value('#8b8b8b'))
    )

label1_orange = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.value('#ec7c30')
    )


title_orange = alt.Chart(
    {"values": [{"text":  [
        '8 brands decreased in sales'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#ec7c30').encode(
    text = "text:N"
)




final = chart_orange + label1_orange + label2 + title_bw + title_orange + subtitle_bw

final.properties(width = 400).configure_view(stroke = None)

# Note: the axis title was left as in the previous charts.

![Alt text](\Images\4_2d.png)

In [58]:
decreased_most = table.nsmallest(2, 'Change')

brands_decreased = decreased_most['Brands'].tolist()

conditions = [f'datum.Brands == "{brand}"' for brand in brands_decreased]

# Join the per-brand tests with the logical OR operator of Vega expressions.
condition = f"({' || '.join(conditions)})"

chart_oranges = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition, alt.value('#ec7c30'), alt.value('#efb284'))
)

label1_oranges = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('#ec7c30'), 
                          alt.value('#efb284'))
    )


title_oranges = alt.Chart(
    {"values": [{"text":  [
        '2 brands decreased the most'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#ec7c30').encode(
    text = "text:N"
)




final = chart_oranges + label1_oranges + label2 + title_bw + title_oranges + subtitle_bw

final.properties(width = 400).configure_view(stroke = None)

# Note: the axis title was left as in the previous charts.

![Alt text](\Images\4_2e.png)

In [27]:
condition = "datum.Change > 0"

chart_blue = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", labelColor = "#888888", titleColor = '#888888'),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('#4772b8'), alt.value('#8b8b8b'))
    )

label2_blue = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('#4772b8'), alt.value('#8b8b8b'))
    )

title_blue = alt.Chart(
    {"values": [{"text":  [
        '11 brands flat to increasing'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#4772b8').encode(
    text = "text:N"
)




final = chart_blue + label1 + label2_blue + title_bw + title_blue + subtitle_bw

final.properties(width = 400).configure_view(stroke = None)

# Note: the axis title was left as in the previous charts.

![Alt text](\Images\4_2f.png)

![Alt text](\Images\4_2g.png)

![Alt text](\Images\4_2h.png)
