# Storytelling with Data! in Altair

by Maisa de Oliveira Fraiz

## Introduction

This project replicates the examples from Cole Nussbaumer's book, "Storytelling with Data: Let's Practice!", using `Altair` in Python. The primary objective is to document the reasoning behind the modifications the author proposes, while also highlighting the challenges that arise when moving from the book's Excel-based approach to a programmatic one.

`Altair` was selected for this project for its declarative syntax, interactivity, grammar-of-graphics foundation, and compatibility with `Streamlit` and other web tools, all within the user-friendly Python environment. Anticipated challenges include Altair's smaller documentation base and development community relative to more established libraries like `Matplotlib`, `Seaborn`, or `Plotly`. Furthermore, tasks that appear straightforward in Excel may require multiple iterations to translate effectively into code.


## Imports

In [2]:
import pandas as pd
import numpy as np
import altair as alt

## Chapter 4 - Focus Attention

*"Where do you want your audience to look?"* - Cole Nussbaumer

### Exercise 2 - focus on...

The data for this exercise can be found here: https://www.storytellingwithdata.com/letspractice/downloads

In [5]:
# Loading table
table = pd.read_excel(r"..\..\Data\4.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 5, skipfooter = 30)

# Fixing names: give the two columns we use short, readable labels
table = table.rename(columns = {'Unnamed: 1': 'Brands', '$ Vol % change': 'Change'})
table


Without `sort = None`, Altair arranges the brands in alphabetical order; passing `sort = None` keeps the order in which they appear in the table.

In [6]:
alt.Chart(table).mark_bar().encode(
    x = "Change",
    y = "Brands")

In [7]:
# Unsorted chart: sort = None keeps the row order of the table

alt.Chart(table).mark_bar().encode(
    x = "Change",
    y = alt.Y("Brands", sort = None))

In [99]:
chart = alt.Chart(
    table, title = alt.Title(
       "Cat food brands: YoY sales change",
       subtitle = "% CHANGE IN VOLUME ($)",
       color = "black",
       subtitleColor = "gray",
       offset = 10,
       anchor = "start",
       fontSize = 19, 
       subtitleFontSize = 11,
       fontWeight = "normal"
       )
    ).mark_bar(color = "#8b8b8b", size = 15).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None)
    )


label1 = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', color = "#8b8b8b", fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )


label2 = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', color = "#8b8b8b", fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )

gray = chart + label1 + label2

gray.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2a.png)
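
The label positions `alt.value(207)` and `alt.value(192)` in the cell above are pixel offsets from the left edge of the plot, not data values. A minimal sketch of where they plausibly come from, assuming a linear x scale over the domain `[-0.20, 0.20]` and the chart width of 400 used above; the helper `linear_scale` and the gap sizes of 7 and 8 pixels are assumptions introduced here for illustration:

```python
def linear_scale(value, domain, width):
    """Map a data value to a pixel offset, as a linear x scale would."""
    lo, hi = domain
    return (value - lo) / (hi - lo) * width


# Values taken from the chart above: domain [-0.20, 0.20], width 400.
zero_px = linear_scale(0, (-0.20, 0.20), 400)

# Assumed gaps that reproduce the hard-coded 207 and 192.
left_aligned_x = zero_px + 7   # labels of negative bars start right of the zero line
right_aligned_x = zero_px - 8  # labels of positive bars end left of the zero line

print(zero_px, left_aligned_x, right_aligned_x)
```

If the chart width or the scale domain changes, the zero line moves and both hard-coded label positions have to be recomputed.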

### Highlight the Lifestyle brands

In [100]:
conditions = [
    f'datum.Brands == "{brand}"' for brand in table['Brands'] if 'Lifestyle' in brand
]

condition = f"({' || '.join(conditions)})"


chart_bw = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('black'), alt.value('#c6c6c6'))
    )

chart = alt.Chart(
    table, title = alt.Title(
       "Cat food brands:",
       subtitle = "YEAR-OVER-YEAR % CHANGE IN VOLUME ($)",
       color = "black",
       subtitleColor = "gray",
       anchor = "start",
       fontSize = 19, 
       subtitleFontSize = 11,
       fontWeight = "normal"
       )
    ).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition("(datum.Brands == 'Lifestyle' || datum.Brands == 'Lifestyle Plus' || datum.Brands == 'Diet Lifestyle')",
                          alt.value('black'), alt.value('#c6c6c6'))
    )

label1_bw = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('black'), alt.value('#c6c6c6'))
    )

label2_bw = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', color = "#c6c6c6", fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )

title_bw = alt.Chart(
    {"values": [{"text":  [
        'Lifestyle line brands decline'
         ]}]}
).mark_text(size = 19, align = "left", dx = 172, dy = -250, fontWeight = 700, color = 'black').encode(
    text = "text:N"
)

(chart + label1_bw + label2_bw + title_bw).properties(width = 400).configure_view(stroke = None)

# Using "Cat food brands:" as the chart title with "Lifestyle line brands decline" as an add-on text mark did not work, so the next cell builds the whole title from text marks

In [128]:

title_bw = alt.Chart(
    {"values": [{"text":  ["Cat food brands:"]}]}
    ).mark_text(
        size = 16, align = "left", dx = -200, dy = -270, fontWeight = 'normal', color = 'black'
        ).encode(
            text = "text:N"
            )

title_bw_bold = alt.Chart(
    {"values": [{"text":  [
        'Lifestyle line brands decline'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = 'black').encode(
    text = "text:N"
)

subtitle_bw = alt.Chart(
    {"values": [{"text":  [
        "YEAR-OVER-YEAR % CHANGE IN VOLUME ($)"
         ]}]}
).mark_text(size = 11, align = "left", dx = -200, dy = -250, fontWeight = 'normal', color = 'gray').encode(
    text = "text:N"
)



lifestyle = chart_bw + label1_bw + label2_bw + title_bw + title_bw_bold + subtitle_bw

lifestyle.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2b.png)
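
The highlight works by turning a list of brand names into a single expression string that `alt.condition` evaluates for each row. A minimal sketch of that string-building step in plain Python, so it can be inspected without drawing a chart; the helper name `brand_predicate` is hypothetical, and this sketch joins the tests with `||`, the logical OR operator:

```python
def brand_predicate(brands, keyword):
    """Build an expression string matching every brand whose name contains keyword."""
    tests = [f'datum.Brands == "{brand}"' for brand in brands if keyword in brand]
    return "(" + " || ".join(tests) + ")"


# A few brand names that appear in this notebook.
brands = ["Lifestyle", "Feline Focus", "Diet Lifestyle", "Lifestyle Plus"]

condition_example = brand_predicate(brands, "Lifestyle")
print(condition_example)
```

Wrapping each name in double quotes keeps names containing an apostrophe, such as "Fran's Recipe", valid inside the expression. When the names are known up front, `alt.FieldOneOfPredicate`, used later in this notebook, avoids string building altogether.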

The Feline line has a purple logo, so its brands are highlighted in purple.

In [146]:
conditions = [
    f'datum.Brands == "{brand}"' for brand in table['Brands'] if 'Feline' in brand
]

condition_purple = f"({' || '.join(conditions)})"


chart_purple = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
     x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition_purple,
                          alt.value('#713a97'), alt.value('#c6c6c6'))
    )

label1_purple = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition_purple,
                          alt.value('#713a97'), alt.value('#c6c6c6'))
    )

label2_purple = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition_purple,
                          alt.value('#713a97'), alt.value('#c6c6c6'))
    )

title_purple = alt.Chart(
    {"values": [{"text":  [
        'most in Feline line increased'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#713a97').encode(
    text = "text:N"
)




feline = chart_purple + label1_purple + label2_purple + title_bw + title_purple + subtitle_bw

feline.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2c.png)

Errata! 

I tend to avoid red and green for bad and good connotation, respectively, because of the inaccessibility for those who are colorblind (red/green colorblindness is the most prevalent, affecting nearly 10% of the population). I’ll often use orange for negative and blue for positive, as I feel you still get the desired connotation.
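
A minimal sketch of that orange/blue convention in plain Python, using the hex values that the charts in this notebook use. The names `ORANGE`, `BLUE`, `GRAY` and `change_color` are hypothetical and exist only for this illustration; the charts themselves express the same rule with `alt.condition`.

```python
# Hex values copied from the charts in this notebook.
ORANGE = "#ec7c30"  # negative change
BLUE = "#4772b8"    # positive change
GRAY = "#c6c6c6"    # de-emphasized bars


def change_color(change, highlight):
    """Return a bar color; highlight is either 'negative' or 'positive'."""
    if highlight == "negative" and change < 0:
        return ORANGE
    if highlight == "positive" and change > 0:
        return BLUE
    return GRAY


print(change_color(-0.10, "negative"))  # a decreasing brand, decreases highlighted
print(change_color(0.10, "negative"))   # an increasing brand stays gray
```

Keeping everything that is not highlighted in gray is what lets a single hue carry the message.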

In [106]:
condition = "datum.Change < 0"

chart_orange = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", 
                        titleFontWeight = 'normal', format = "%"),
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('#ec7c30'), alt.value('#c6c6c6'))
    )

label1_orange = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.value('#ec7c30')
    )


title_orange = alt.Chart(
    {"values": [{"text":  [
        '8 brands decreased in sales'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#ec7c30').encode(
    text = "text:N"
)

decreased_orange = alt.Chart(
    {"values": [{"text":  [
        'DECREASED'
         ]}]}
).mark_text(size = 11, align = "left", dx = -80, dy = -220, fontWeight = 700, color = '#ec7c30').encode(
    text = "text:N"
)

increased_gray = alt.Chart(
    {"values": [{"text":  [
            '|    INCREASED'
         ]}]}
).mark_text(size = 11, align = "left", dx = -0, dy = -220, fontWeight = 700, color = '#8b8b8b').encode(
    text = "text:N"
)

decreased = chart_orange + label1_orange + label2_bw + title_bw + title_orange + subtitle_bw + decreased_orange + increased_gray

decreased.properties(width = 400).configure_view(stroke = None)

# The axis title was not reproduced here; the DECREASED / INCREASED labels are drawn as text marks instead

![Alt text](\Images\4_2d.png)

In [170]:
decreased_most = table.nsmallest(2, 'Change')
brands_decreased = decreased_most['Brands'].tolist()
conditions = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition = f"({' || '.join(conditions)})"

positive_brands = table.loc[table['Change'] > 0, 'Brands'].unique()
positive_brands_list = positive_brands.tolist()

chart_oranges = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition, alt.value('#ec7c30'), alt.value('#efb284'))
)

chart_oranges2 = alt.Chart(table).mark_bar( 
        size = 15, color = '#c6c6c6', opacity = 1
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", 
                        titleFontWeight = 'normal', format = "%"),
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf = positive_brands_list)
    )

label1_oranges = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('#ec7c30'), 
                          alt.value('#efb284'))
    )


title_oranges = alt.Chart(
    {"values": [{"text":  [
        '2 brands decreased the most'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#ec7c30').encode(
    text = "text:N"
)




decreased2 = chart_oranges + chart_oranges2 + label1_oranges + label2_bw + title_bw + title_oranges + subtitle_bw + decreased_orange + increased_gray

decreased2.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2e.png)

In [181]:
condition = "datum.Change > 0"

chart_blue = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition,
                          alt.value('#4772b8'), alt.value('#c6c6c6'))
    )

label2_blue = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition,
                          alt.value('#4772b8'), alt.value('#c6c6c6'))
    )

title_blue = alt.Chart(
    {"values": [{"text":  [
        '11 brands flat to increasing'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = '#4772b8').encode(
    text = "text:N"
)


decreased_gray = alt.Chart(
    {"values": [{"text":  [
        'DECREASED    |'
         ]}]}
).mark_text(size = 11, align = "left", dx = -80, dy = -220, fontWeight = 700, color = '#8b8b8b').encode(
    text = "text:N"
)

increased_blue = alt.Chart(
    {"values": [{"text":  [
            'INCREASED'
         ]}]}
).mark_text(size = 11, align = "left", dx = 20, dy = -220, fontWeight = 700, color = '#4772b8').encode(
    text = "text:N"
)

increased = chart_blue + label1 + label2_blue + title_bw + title_blue + subtitle_bw + decreased_gray + increased_blue

increased.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2f.png)
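
The x encoding (scale domain, top-oriented axis, percent format) is repeated almost verbatim in every chart above. One possible way to reduce the repetition is to keep the shared settings in one place. This is only a sketch: `AXIS_STYLE`, `X_DOMAIN` and `x_settings` are hypothetical names, not part of Altair, and some cells above deliberately omit `titleColor`.

```python
# Shared axis styling copied from the cells above.
AXIS_STYLE = {
    "grid": False,
    "orient": "top",
    "labelColor": "#888888",
    "titleColor": "#888888",
    "titleFontWeight": "normal",
    "format": "%",
}
X_DOMAIN = [-0.20, 0.20]


def x_settings(title=None):
    """Return the shared x-encoding settings as plain Python data."""
    return {"domain": list(X_DOMAIN), "axis": dict(AXIS_STYLE), "title": title}


settings = x_settings("DECREASED | INCREASED")

# With Altair imported as alt, the settings could be applied as:
# x = alt.X("Change",
#           scale = alt.Scale(domain = settings["domain"]),
#           axis = alt.Axis(**settings["axis"]),
#           title = settings["title"])
```

Each call returns fresh copies, so one chart can tweak its axis settings without affecting the others.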

In [148]:

chart_bw2 = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.value('black')
    ).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf=['Lifestyle', 'Lifestyle Plus', 'Diet Lifestyle'])
    )


label1_bw_2 = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.value('black')).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf=['Lifestyle', 'Lifestyle Plus', 'Diet Lifestyle'])
    )

title_bw_bold_2 = alt.Chart(
    {"values": [{"text":  [
        'mixed results in sales year-over-year'
         ]}]}
).mark_text(size = 16, align = "left", dx = -78, dy = -270, fontWeight = 700, color = 'black').encode(
    text = "text:N"
)

not_lifestyle = table[~table['Brands'].isin(['Lifestyle', 'Lifestyle Plus', 'Diet Lifestyle'])]
not_lifestyle = not_lifestyle['Brands'].tolist()

label1_purple_2 = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition_purple,
                          alt.value('#713a97'), alt.value('#c6c6c6'))
    ).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf= not_lifestyle)
)


mixed = chart_purple + chart_bw2 + label1_bw + label1_purple_2 + label2_purple + title_bw + title_bw_bold_2 + subtitle_bw

mixed.properties(width = 400).configure_view(stroke = None)

![Alt text](\Images\4_2g.png)

In [242]:
decreased_most = table.nsmallest(2, 'Change')
increased_most = table.nlargest(2, 'Change')

brands_decreased = decreased_most['Brands'].tolist()
brands_increased = increased_most['Brands'].tolist()

conditions_decreased = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition_decreased = f"({' || '.join(conditions_decreased)})"

conditions_increased = [f'datum.Brands == "{brand}"' for brand in brands_increased]
condition_increased = f"({' || '.join(conditions_increased)})"

chart_gray = alt.Chart(
    table
    ).mark_bar(color = "#c6c6c6", size = 15).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None)
    )

label1_gray = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', color = "#c6c6c6", fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )


label2_gray = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', color = "#c6c6c6", fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands')
    )


chart_oranges_mix = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition_decreased, alt.value('#ec7c30'), alt.value('#efb284'))
).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf = ["Fran's Recipe", 'Wholesome Goodness',
                                                     'Lifestyle', 'Coat protection', 'Diet Lifestyle'])
    )

label_oranges = alt.Chart(table.loc[table['Change'] < 0]).mark_text(align = 'left', fontWeight = 700).encode(
    x = alt.value(207),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition_decreased,
                          alt.value('#ec7c30'), 
                          alt.value('#efb284'))
    ).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf = ["Fran's Recipe", 'Wholesome Goodness',
                                                     'Lifestyle', 'Coat protection', 'Diet Lifestyle'])
    )


chart_blue_mix = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = alt.Axis(grid = False, orient = "top", 
                        labelColor = "#888888", titleColor = '#888888', 
                        titleFontWeight = 'normal', format = "%"),
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = alt.condition(condition_increased,
                          alt.value('#4772b8'), alt.value('#91a9d5'))
    ).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf = ['Feline Focus', 'Feline Grain Free', 'Feline Silver',
                                                    'Nutri Balance', 'Farm Fresh Basics'])
    )

label_blue = alt.Chart(table.loc[table['Change'] > 0]).mark_text(align = 'right', fontWeight = 700).encode(
    x = alt.value(192),
    y = alt.Y('Brands', sort = None),
    text = alt.Text('Brands'),
    color = alt.condition(condition_increased,
                          alt.value('#4772b8'), 
                          alt.value('#91a9d5'))
    ).transform_filter(
    alt.FieldOneOfPredicate(field='Brands', oneOf = ['Feline Focus', 'Feline Grain Free', 'Feline Silver',
                                                    'Nutri Balance', 'Farm Fresh Basics'])
    )


separation =  alt.Chart(
    {"values": [{"text":  [
            '|'
         ]}]}
).mark_text(size = 11, align = "left", dx = 3, dy = -220, fontWeight = 700, color = '#c6c6c6').encode(
    text = "text:N"
)

In [243]:
mixed2 = (chart_gray + chart_oranges_mix + 
          chart_blue_mix + label1_gray + 
          label2_gray + label_oranges + label_blue + 
          title_bw + title_bw_bold_2 + subtitle_bw + 
          decreased_orange + increased_blue + separation)

mixed2.properties(width = 400).configure_view(stroke = None)
