# Statistical Charts and Interactive Graph

In this project, I'm working on how to show complex data that includes lots of details and changes over time. I'm using Altair, a tool for making charts with Python, to both copy some charts I've been given and try making some new ones myself to show data in clear and interesting ways.

I need to make four different charts for this project. The first three are about practicing making charts that look just like examples I've been given, which helps me learn how to pay attention to small details and get everything just right. The fourth chart is more open-ended, where I get to come up with my own way of showing the data. This part is especially cool because I get to be creative and think about the best way to explain the data to other people.

For this project, I'm using data and articles from FiveThirtyEight, which is a website that does a lot of data journalism. This gives me some real-world examples to work with and learn from.

In [12]:
# load up the resources we need
import urllib.request
import os.path
from os import path
import pandas as pd
import altair as alt
import numpy as np
from sklearn import manifold
from sklearn.metrics import euclidean_distances
from sklearn.decomposition import PCA
import ipywidgets as widgets
from IPython.display import display
from PIL import Image

## Bob Ross

Today's assignment will have you working with artwork created by [Bob Ross](https://en.wikipedia.org/wiki/Bob_Ross). Bob was a very famous painter who had a televised painting show from 1983 to 1994. Over 13 seasons and approximately 400 paintings, Bob would walk the audience through a painting project. Often these were landscape images. Bob was famous for telling his audience to paint "happy trees" and sayings like, "We don't make mistakes, just happy little accidents." His soothing voice and bushy hair are well known to many generations of viewers.

If you've never seen an episode, I might suggest starting with [this one](https://www.youtube.com/watch?v=Fw6odlNp7_8). 

![bob ross](assets/bobrosspaints.png)

Bob Ross left a long legacy of art which makes for an interesting dataset to analyze. It's both temporally rich and has a lot of variables we can code. We'll be starting with the dataset created by 538 for their article on a [Statistical Analysis of Bob Ross](https://fivethirtyeight.com/features/a-statistical-analysis-of-the-work-of-bob-ross/). The authors of the article coded each painting to indicate what features the image contained (e.g., one tree, more than one tree, what kinds of clouds, etc.). 

In addition, we've downloaded a second dataset that contains the actual images. We know what kind of paint colors Bob used in each episode, and we have used that to create a dataset for you containing the color distributions. For example, we approximate how much '<font color='#614f4b'>burnt umber</font>' he used by measuring the distance (in color space) from each pixel in the image to the color. We then add the 'similarity' of each pixel to the burnt umber RGB value into the respective column. This is imperfect, of course (paints don't mix this way), but it'll be close enough for our analysis. Note that the sum of those rows will not add to 1 and the total value for any column can be more than 1. The only thing we can guarantee is that the metric is consistent across colors and between paintings.
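The exact procedure used to build the color columns is not given here, so the following is only a minimal sketch of the idea described above. It assumes Euclidean distance in RGB space, a similarity of 1 minus the normalized distance, and an average over pixels; the helper names `hex_to_rgb` and `paint_similarity` are hypothetical.

```python
from math import dist

def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of ints in 0..255."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

# Largest possible distance in RGB space (black to white), used to normalize.
MAX_DIST = dist((0, 0, 0), (255, 255, 255))

def paint_similarity(pixels, paint_hex):
    """Average similarity of the pixels to one paint (assumed metric).

    A pixel identical to the paint scores 1; a pixel at the opposite
    corner of the RGB cube scores 0.
    """
    target = hex_to_rgb(paint_hex)
    sims = [1 - dist(p, target) / MAX_DIST for p in pixels]
    return sum(sims) / len(sims)

# A made-up two-pixel "image": one burnt umber pixel and one white pixel.
pixels = [(97, 79, 75), (255, 255, 255)]
score = paint_similarity(pixels, "#614f4b")
print(round(score, 3))
```

Because each paint is scored independently against every pixel, nothing forces the scores for one painting to sum to 1, which matches the caveat above.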

In [13]:
# the paints Bob used
rosspaints = ['alizarin crimson','bright red','burnt umber','cadmium yellow','dark sienna', 
              'indian yellow','indian red','liquid black','liquid clear','black gesso',
              'midnight black','phthalo blue','phthalo green','prussian blue','sap green',
              'titanium white','van dyke brown','yellow ochre']

# hex values for the paints above
rosspainthex = ['#94261f','#c06341','#614f4b','#f8ed57','#5c2f08','#e6ba25','#cd5c5c',
                '#000000','#ffffff','#000000','#36373c','#2a64ad','#215c2c','#325fa3',
                '#364e00','#f9f7eb','#2d1a0c','#b28426']

# boolean features about what an image includes
imgfeatures = ['Apple frame', 'Aurora borealis', 'Barn', 'Beach', 'Boat', 
               'Bridge', 'Building', 'Bushes', 'Cabin', 'Cactus', 
               'Circle frame', 'Cirrus clouds', 'Cliff', 'Clouds', 
               'Coniferous tree', 'Cumulus clouds', 'Decidious tree', 
               'Diane andre', 'Dock', 'Double oval frame', 'Farm', 
               'Fence', 'Fire', 'Florida frame', 'Flowers', 'Fog', 
               'Framed', 'Grass', 'Guest', 'Half circle frame', 
               'Half oval frame', 'Hills', 'Lake', 'Lakes', 'Lighthouse', 
               'Mill', 'Moon', 'At least one mountain', 'At least two mountains', 
               'Nighttime', 'Ocean', 'Oval frame', 'Palm trees', 'Path', 
               'Person', 'Portrait', 'Rectangle 3d frame', 'Rectangular frame', 
               'River or stream', 'Rocks', 'Seashell frame', 'Snow', 
               'Snow-covered mountain', 'Split frame', 'Steve ross', 
               'Man-made structure', 'Sun', 'Tomb frame', 'At least one tree', 
               'At least two trees', 'Triple frame', 'Waterfall', 'Waves', 
               'Windmill', 'Window frame', 'Winter setting', 'Wood framed']

# load the data frame
bobross = pd.read_csv("assets/bobross.csv")

# enable correct rendering (unnecessary in later versions of Altair)
alt.renderers.enable('default')

# uses intermediate json files to speed things up
alt.data_transformers.enable('json')

DataTransformerRegistry.enable('json')

## The bobross dataframe
The ```bobross``` dataframe has already been preprocessed. It has one row for every painting created by Bob (paintings by guest artists have been removed). Each row has an episode identifier (EPISODE, which contains the season and episode number), the image title (TITLE), and the release date (RELEASE_DATE, plus a separate column for the year). There are also a number of boolean columns for the features coded by 538: a '1' means the feature is present, a '0' means it is not. Those columns are listed in the ```imgfeatures``` variable. The columns that contain the amount of each color in the paintings are listed in ```rosspaints```. An analogous list, ```rosspainthex```, holds the approximate hex values for the paints.

In [14]:
bobross.sample(5)

Unnamed: 0,EPISODE,TITLE,RELEASE_DATE,Apple frame,Aurora borealis,Barn,Beach,Boat,Bridge,Building,...,phthalo blue,phthalo green,prussian blue,sap green,titanium white,van dyke brown,yellow ochre,img_url,week_number,year
130,S11E12,"""ROADSIDE BARN""",3/18/87,0,0,1,0,0,0,0,...,0.0,0.0,0.435614,0.475242,0.237322,0.506903,0.448354,https://raw.githubusercontent.com/jwilber/Bob_...,12,1987
184,S16E02,"""NESTLED CABIN""",8/24/88,0,0,0,0,0,0,0,...,0.38705,0.0,0.0,0.557809,0.140414,0.601304,0.498373,https://raw.githubusercontent.com/jwilber/Bob_...,34,1988
331,S27E13,"""GOLDEN GLOW OF MORNING""",5/20/93,0,0,0,0,0,0,0,...,0.319609,0.0,0.0,0.410829,0.275873,0.0,0.513479,https://raw.githubusercontent.com/jwilber/Bob_...,20,1993
197,S17E04,"""STORMY SEAS""",1/25/89,0,0,0,1,0,0,0,...,0.407256,0.0,0.0,0.0,0.244775,0.472211,0.475445,https://raw.githubusercontent.com/jwilber/Bob_...,4,1989
161,S14E04,"""SNOWY SOLITUDE""",1/20/88,0,0,0,0,0,0,0,...,0.392239,0.0,0.0,0.0,0.646181,0.134282,0.226976,https://raw.githubusercontent.com/jwilber/Bob_...,3,1988


In [15]:
print(imgfeatures)

['Apple frame', 'Aurora borealis', 'Barn', 'Beach', 'Boat', 'Bridge', 'Building', 'Bushes', 'Cabin', 'Cactus', 'Circle frame', 'Cirrus clouds', 'Cliff', 'Clouds', 'Coniferous tree', 'Cumulus clouds', 'Decidious tree', 'Diane andre', 'Dock', 'Double oval frame', 'Farm', 'Fence', 'Fire', 'Florida frame', 'Flowers', 'Fog', 'Framed', 'Grass', 'Guest', 'Half circle frame', 'Half oval frame', 'Hills', 'Lake', 'Lakes', 'Lighthouse', 'Mill', 'Moon', 'At least one mountain', 'At least two mountains', 'Nighttime', 'Ocean', 'Oval frame', 'Palm trees', 'Path', 'Person', 'Portrait', 'Rectangle 3d frame', 'Rectangular frame', 'River or stream', 'Rocks', 'Seashell frame', 'Snow', 'Snow-covered mountain', 'Split frame', 'Steve ross', 'Man-made structure', 'Sun', 'Tomb frame', 'At least one tree', 'At least two trees', 'Triple frame', 'Waterfall', 'Waves', 'Windmill', 'Window frame', 'Winter setting', 'Wood framed']


In [16]:
print("paint names",rosspaints)
print("")
print("hex values", rosspainthex)

paint names ['alizarin crimson', 'bright red', 'burnt umber', 'cadmium yellow', 'dark sienna', 'indian yellow', 'indian red', 'liquid black', 'liquid clear', 'black gesso', 'midnight black', 'phthalo blue', 'phthalo green', 'prussian blue', 'sap green', 'titanium white', 'van dyke brown', 'yellow ochre']

hex values ['#94261f', '#c06341', '#614f4b', '#f8ed57', '#5c2f08', '#e6ba25', '#cd5c5c', '#000000', '#ffffff', '#000000', '#36373c', '#2a64ad', '#215c2c', '#325fa3', '#364e00', '#f9f7eb', '#2d1a0c', '#b28426']


## 1. Recreate Bar Chart

Recreate the [first chart from the Bob Ross article](assets/bob_ross_538.png) (source: [Statistical Analysis of Bob Ross](https://fivethirtyeight.com/features/a-statistical-analysis-of-the-work-of-bob-ross/)). This one simply shows a bar chart for the percent of images that have certain features. 

!["Bob Ross feature distribution"](assets/bob_ross_altair.png)
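The percentages in this chart come down to a column mean, because the mean of a 0/1 column is the fraction of rows where the feature is present. The sketch below shows this on a made-up four-painting frame (`toy` is hypothetical, not the real data) and shows why the reshaped frame ends up with columns named `index` and `value`.

```python
import pandas as pd

# Hypothetical toy frame: four paintings, two 0/1 feature columns.
toy = pd.DataFrame({"Barn": [1, 0, 0, 0], "Clouds": [1, 1, 0, 0]})

# The mean of a 0/1 column is the share of paintings with that feature.
# reset_index() moves the feature names into a column called "index".
share = toy.mean().to_frame(name="value").reset_index()

# Sort so the most common feature comes first.
share = share.sort_values(by="value", ascending=False)
print(share)
```

The resulting long-form frame has one row per feature, which is the shape a bar chart encoding such as `y='index:N'`, `x='value:Q'` expects.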


In [17]:
def makeBobRossBar(br, ifeatures):
    # input: br -- a dataframe in the shape of the bobross frame defined above
    # input: ifeatures -- a list of the features we want to test (see imgfeatures above)
    # return: implement this function to return an altair chart as defined above
    #         e.g., return alt.Chart(...)
      
    # Create a new df with one row per feature in ifeatures and the share of paintings containing it as the value
    df = br[ifeatures].mean().to_frame(name='value').reset_index()
    
    # Sort the df in descending order
    df = df.sort_values(by='value', ascending=False)
    
    # Create a horizontal bar chart using Altair
    chart = alt.Chart(df).mark_bar(color='#4DAFDB', size=12).encode(
        x=alt.X('value:Q', axis=None),
        y=alt.Y('index:N', sort='-x', title=None),
        tooltip=['index', 'value']
    )

    # Add text labels at the end of the bars
    text = chart.mark_text(
        align='left',
        baseline='middle',
        dx=3  
    ).encode(
        text=alt.Text('value:Q', format='.0%')
    )
    
    
    # Combine the bar chart and text labels
    combined_chart = alt.layer(chart, text)
    
    # Configure the combined chart
    final_chart = combined_chart.properties(
        width=350,
        height=1000,
        title={
            "text": 'The Paintings of Bob Ross',
            "subtitle": ['Percentage containing each element'],
            "color": "black",
            "subtitleColor": "black"
        }
    ).configure_view(
        strokeWidth=0  
    ).configure_axis(
        labelFontSize=12,
        titleFontSize=14
    ).configure_title(
        fontSize=20,
        anchor='start'
    )
    
    # Return the chart so the notebook renders it and callers can reuse it
    return final_chart

In [18]:
# run this code to validate
alt.themes.enable('fivethirtyeight')
makeBobRossBar(bobross, imgfeatures)

## 2. Create Small Multiples

The 538 article ([Statistical Analysis of Bob Ross](https://fivethirtyeight.com/features/a-statistical-analysis-of-the-work-of-bob-ross/)) has a long analysis of conditional probabilities. Essentially, we want to know the probability of one feature given another (e.g., what is the probability of Snow given Trees?). The article calculates this over the entire history of the show, but we would like to visualize these probabilities over time. Have they been constant, or have they evolved? We will only be doing this for a few variables (otherwise, we'll have a matrix of over 3000 small charts). The example below is for 'At least one tree', 'At least two trees', 'Clouds', 'Grass', 'At least one mountain', and 'Lake'. Each small multiple will be a line chart of the conditional probability over time. The matrix "cell" indicates which pair of variables is being considered (e.g., the probability of at least two trees given at least one tree is in the 2nd row, first column of our example).

Generate small multiples plots. For example:

!["Small multiples"](assets/matrix_small.png)

The full image is [available here](assets/matrix_full.png).
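As a small worked example of the quantity plotted in each cell, the sketch below computes P(A | B) per year as (number of paintings with both A and B) / (number of paintings with B). The toy frame and the helper name `cond_prob_by_year` are hypothetical; it is only meant to illustrate the arithmetic on data small enough to check by hand.

```python
import pandas as pd

# Hypothetical toy data: four paintings across two years, two 0/1 features.
toy = pd.DataFrame({
    "year":  [1990, 1990, 1990, 1991],
    "Snow":  [1, 0, 1, 0],
    "Trees": [1, 1, 0, 1],
})

def cond_prob_by_year(frame, a, b):
    """Return P(a | b) for each year as a Series indexed by year.

    Years in which b never occurs yield NaN (0 / 0) rather than 0.
    """
    # The product of two 0/1 columns is 1 only when both features are present.
    both = (frame[a] * frame[b]).groupby(frame["year"]).sum()
    given = frame[b].groupby(frame["year"]).sum()
    return both / given

probs = cond_prob_by_year(toy, "Snow", "Trees")
print(probs)
```

In 1990, two paintings have trees and one of those also has snow, so the probability of Snow given Trees is 0.5. Leaving undefined years as NaN is a design choice: a line chart then shows a gap instead of a misleading dip to zero.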

In [19]:
def condprobability(frame,column1,column2,year):
    # we suggest you implement this function to make your life easier. 
    # input: frame -- the input dataframe in the style of the bobross dataframe above
    # input: column1 -- the first column to test (e.g, the A in probability of A given B)
    # input: column2 -- the second column to test (e.g., the B in the probability of A given B)
    # input: year -- the year for which to calculate the probability
    # return: a conditional probability value

    # Select specified year
    year_frame = frame[frame['year']==year]
    
    # Calculate the total occurrences of column2 in the specified year
    total_column2 = year_frame[column2].sum()
    
    # Calculate the total occurrences of both column1 and column2 in the specified year
    total_column1_column2 = year_frame[(year_frame[column1]==1) & (year_frame[column2]==1)][column1].sum()
    
    # Conditional Probability
    if total_column2 == 0:
        return 0
    else:
        return total_column1_column2 / total_column2

In [20]:
condprobability(bobross, 'Clouds', 'Lake', 1992)

0.1

In [21]:
def makeBobRossCondProb(br, totest):
    # implement this function to return an altair chart
    # 
    # input: br the dataframe (e.g., the bobross frame as defined above)
    # input: totest is a variable that holds an array of properties we want compared (see example below)
    
    
    # Create the long form dataframe
    long_lst = []
    
    for col1 in totest:
        for col2 in totest:
            for year in br['year'].unique():
                # Calculate conditional probability of col1 given col2
                con_prob = condprobability(br, col1, col2, year)
                
                # Store the variables in long_lst
                long_lst.append({'key1': col1, 'key2': col2, 'year': year, 'prob': con_prob})
    
    long_df = pd.DataFrame(long_lst)
    
    # Create the charts of first column
    chart1 = alt.Chart(long_df).mark_line(color='#4DAFDB', interpolate='monotone').encode(
        x= alt.X('year:O', axis=alt.Axis(title=None)),
        y=alt.Y('prob:Q', axis=alt.Axis(title=None), scale=alt.Scale(domain=[0, 1])),
        tooltip=['year', 'prob', 'key1', 'key2']
    ).properties(
        width=180,
        height=180
    ).facet(
        row=alt.Row('key1:N', title='', 
                    header=alt.Header(
                        labelExpr="'Probability of ' + datum.value",
                        titleFontSize=30,
                        labelOrient='left', 
                        labelAlign='center', 
                        labelPadding=10
                    )
                   ),
        column=alt.Column('key2:N', title=None, header=alt.Header(labelOrient='top', labelAlign='center', labelPadding=10))
    ).resolve_scale(
        x='independent',
        y='independent'
    ).transform_filter(
        alt.FieldOneOfPredicate(field='key2', oneOf=[totest[0]])  # Only select the first feature for the first column
    )

    
    # Create the rest of the charts
    chart2 = alt.Chart(long_df).mark_line(color='#4DAFDB', interpolate='monotone').encode(
        x= alt.X('year:O', axis=alt.Axis(title=None)),
        y=alt.Y('prob:Q', axis=alt.Axis(title=None, ticks=False), scale=alt.Scale(domain=[0, 1])),
        tooltip=['year', 'prob', 'key1', 'key2']
    ).properties(
        width=180,
        height=180
    ).facet(
        row=alt.Row('key1:N', title=None, header=alt.Header(labelExpr="' '", labelPadding=-9)),
        column=alt.Column('key2:N', title='Given...', header=alt.Header(labelOrient='top', labelAlign='center', labelPadding=10))
    ).resolve_scale(
        x='independent',
        y='independent'
    ).transform_filter(
        alt.FieldOneOfPredicate(field='key2', oneOf=totest[1:])  # Select the remaining features for the other columns
    )

    final_chart = alt.hconcat(chart1, chart2, spacing=20)
    
    

    # Return the chart so the notebook renders it and callers can reuse it
    return final_chart

In [22]:
# Produce the small multiples grid for the example in the description.
makeBobRossCondProb(bobross, ['At least one tree','At least two trees','Clouds','Grass','At least one mountain','Lake'])

## Multidimensional Visualization

Practice using dimensionality reduction to visualize multidimensional data. Here, we explore how similar images are to each other in 'feature' space, that is, based on the image features they contain. Are images that have beaches close to those with waves?

We are going to create a 2D MDS (multidimensional scaling) plot using the scikit-learn package.
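To make "close in feature space" concrete, here is a minimal sketch of the distance between paintings when each one is a vector of 0/1 features. It assumes Euclidean distance, which is, to my knowledge, what scikit-learn's MDS uses by default; the paintings and the three-feature vectors are made up for illustration.

```python
from math import dist

# Hypothetical feature order: [Beach, Waves, Cabin]
paintings = {
    "seascape A":  [1, 1, 0],
    "seascape B":  [1, 1, 0],
    "cabin scene": [0, 0, 1],
}

# Identical feature vectors are at distance 0.
d_same = dist(paintings["seascape A"], paintings["seascape B"])

# Vectors that differ in all three features are sqrt(3) apart.
d_diff = dist(paintings["seascape A"], paintings["cabin scene"])

print(d_same, d_diff)
```

MDS then looks for 2D coordinates whose pairwise distances approximate these high-dimensional distances, so paintings with the same features should land on or near the same point.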

In [23]:
def augmentWithMDS(br=bobross, ifeatures=imgfeatures):
    # input: br -- the bobross shaped dataframe
    # input: ifeatures -- the features we want to use to calculate the MDS layout
    # output: a modified bobross dataframe that has new columns for the x/y coordinates
    
    # create the seed
    seed = np.random.RandomState(seed=3)

    # generate the MDS configuration, we want 2 components, etc. You can tweak this if you want to see how
    # the settings change the layout
    mds = manifold.MDS(n_components=2, max_iter=3000, eps=1e-9, random_state=seed, n_jobs=1)

    # fit the data. At the end, 'pos' will hold the x,y coordinates
    pos = mds.fit(br[ifeatures]).embedding_

    # we'll now load those values into the bobross data frame, giving us a new x column and y column
    br['x'] = pos[:, 0]
    br['y'] = pos[:, 1]
    return br

bobross = augmentWithMDS()

In [24]:
def genMDSPlot(br,key):
    # input: br -- a bobross dataframe (augmented with the x/y columns as described above)
    # input: key -- a string indicating which images should be visually highlighted (i.e., images containing the feature
    #        should be made salient). For example: 'Barn'
    # return: an altair chart (e.g., return alt.Chart(...))
    
    # Create a highlight column based on the presence of the key feature
    br['highlight'] = br[key].astype(bool)
    br['Feature'] = br['highlight'].apply(lambda x: f'Has a(n) {key}' if x else f'Doesn\'t have a(n) {key}')

    # Base chart for common properties
    base = alt.Chart(br).encode(
        x='x:Q',
        y='y:Q',
        tooltip=['TITLE', 'RELEASE_DATE', 'Feature']
    )

    # Layer 1: Highlighted images with a rectangle mark
    highlights = base.mark_rect(width=60, height=60, strokeWidth=2).encode(
        color=alt.condition(alt.datum.highlight, alt.value('orange'), alt.value('#4DAFDB'))
    )
    
    # Layer 2: Images using the mark_image
    images = base.mark_image(width=50, height=50).encode(
        url='img_url:N'
    )

    # Layer 3: Invisible mark for legend
    legend = base.mark_point().encode(
        color=alt.Color('Feature', legend=alt.Legend(title=key), 
                        scale=alt.Scale(domain=[f'Has a(n) {key}', f'Doesn\'t have a(n) {key}'], range=['orange', '#4DAFDB']))
    ).transform_filter(
        alt.datum.highlight == True  # Keep only highlighted rows; the explicit scale domain still lists both legend entries
    )
    
    # Combine layers
    final_chart = ((highlights + images)+legend).properties(
        width=1852, 
        height=2156
    ).configure_view(
        strokeWidth=0
    ).configure_axis(
        grid=False,
        domain=False
    ).encode(
        x=alt.X(axis=None),    
        y=alt.Y(axis=None)
    )

    return final_chart

    

In [25]:
genMDSPlot(bobross,'Oval frame')

Next, we create an interactive widget that lets you select the feature to highlight; the plot changes whenever you select a new item from the list. We would ordinarily do this directly in Altair, but because we don't have control over the way you created your visualization, it's easiest to use the widgets built into Jupyter to create the dropdown menu at the top left.

In [26]:
output = widgets.Output()

def clicked(b):
    output.clear_output()
    with output:
        # when the selection is changed, we pull the value and call the altair plot generator
        highlight = filterdrop.value
        if (highlight == ""):
            print("please enter a query")
        else:
            genMDSPlot(bobross,highlight).display()


featurecount = bobross[imgfeatures].sum()

filterdrop = widgets.Dropdown(
    options=list(featurecount[featurecount > 2].keys()),
    description='Highlight:',
    disabled=False,
)

filterdrop.observe(clicked, names=['value'])

display(filterdrop,output)

with output:
    genMDSPlot(bobross,'Barn').display()


Dropdown(description='Highlight:', options=('Barn', 'Beach', 'Bridge', 'Bushes', 'Cabin', 'Cactus', 'Cirrus cl…

Output()

## Creative Small Multiples Plot

In this task, I'll be diving into the colorful world of Bob Ross's paintings, focusing on a specific season's artwork. My task is to craft small multiples, with each one showcasing the palette of colors used in individual paintings. The goal is to explore creative ways to display the distribution of colors and their quantities, aiming to uncover any patterns in Bob Ross's color choices in a way that is not only informative but also visually engaging. 

In [27]:
# this is optional, you can use this to produce a single multiple
# you may not find this helpful for your solution
def colorSmallMultiple(season, episodenumber, br=bobross, rp=rosspaints, rph=rosspainthex):
    # input: season -- a season number (integer), assumed to exist in the dataset
    # input: episodenumber -- an episode number (integer), assumed to exist in the dataset
    # input: br -- a dataset structured as the bobross data above (default is "bobross")
    # input: rp -- the names of paints (default rosspaints as defined above)
    # input: rph -- the hex values of the paints (default rosspainthex as defined above)
    # return: a single multiple visualization for the season/episode
    
    # YOUR CODE HERE
    raise NotImplementedError()

# test 
# colorSmallMultiple(12,10)  # season 12, episode 10
# colorSmallMultiple(5,1)    # season 5, episode 1

In [28]:
def colorSmallMultiples(season, br=bobross, rp=rosspaints, rph=rosspainthex):
    # input: season -- a season number (integer), assumed to exist in the dataset. This is the 
    #               integer representing the season of the show we are interested in. Limit your images
    #               to that season in the small multiples display.
    # input: br -- a dataset structured as the bobross data above (default is "bobross")
    # input: rp -- the names of paints (default rosspaints as defined above)
    # input: rph -- the hex values of the paints (default rosspainthex as defined above)
    # return: an Altair chart providing small multiples for that season
    
    # Filter data for the specified season
    season_df = br[br['season'] == season]

    # Check if the filtered data is empty
    if season_df.empty:
        raise ValueError(f"No data found for season {season}")

    # Create a list of color hex values for encoding
    color_to_hex = dict(zip(rp, rph))

    # Total number of episodes in the season
    total_episodes = len(season_df)

    # Initialize an empty list to store all rows of charts
    rows_of_charts = []

    # Iterate through the season data in steps of 4
    for start_index in range(0, total_episodes, 4):
        # Current row of charts
        current_row = []

        # Iterate over episodes in the current row
        for i in range(start_index, min(start_index + 4, total_episodes)):
            episode = season_df.iloc[i]

            # Create DataFrame for episode's color
            episode_colors = pd.DataFrame({
                'Color': rp,
                'Proportion': [episode[color] for color in rp],
                'ColorHex': [color_to_hex[color] for color in rp]
            })

            # Create the chart for this episode
            chart = alt.Chart(episode_colors).mark_bar().encode(
                x=alt.X('Proportion:Q', title=None),
                y=alt.Y('Color:N', sort=rp, title=None,
                        axis=alt.Axis(title=None, labels=(i % 4 == 0), ticks=(i % 4 == 0))),
                color=alt.Color('ColorHex:N', legend=None),
                tooltip=['Color', 'ColorHex', 'Proportion']
            ).properties(
                width=180,
                height=150,
                title=alt.TitleParams(text=episode['TITLE'], fontSize=15)
            )

            # Add the chart to the current row
            current_row.append(chart)

        # Add the current row of charts to the list of rows
        rows_of_charts.append(alt.hconcat(*current_row, spacing=15))

    # Combine all rows
    final_chart = alt.vconcat(*rows_of_charts, spacing=15).resolve_scale(
        x='independent', 
        y='independent'
    )

    return final_chart

In [29]:
# run this to test your code for season 1
colorSmallMultiples(1)

In [30]:
# run this to test your code for season 2
colorSmallMultiples(2)

### Why I Chose This Design

In this problem, I am asked to visualize the proportion of each color used in Bob Ross's paintings for a specified season. Abstractly, the goal is to visualize quantitative data (the proportion of each color) and categorical data (the colors themselves). I encoded the quantitative data as length, using a bar for each color. The encoding of the colors is straightforward: I used each paint's own color.

Pros: Length is an effective encoding for quantitative data because people can easily judge differences in length and quickly notice which colors are used most in each of Bob's paintings. Using the actual paint color also makes the chart intuitive. I kept the color order consistent across the small multiples to make the plots easier to compare. For example, in season 1, the dominant color is titanium white, and Bob also used Prussian blue frequently. We can also observe that bright red and alizarin crimson are frequently used together, which suggests a pattern. Overall, this approach lets the audience see common patterns across the season while retaining the details of each painting's color usage.

Cons: The bar charts separate the colors, which does not reflect real-world color mixing, so we may lose important information about how colors interact. The bar chart approach is also barely space-efficient enough for 18 colors; with more colors, each plot could become too cluttered for viewers. Moreover, each painting is shown in a separate plot, which makes it difficult to compare the usage of a given color precisely across paintings.

Notes: I wasn't able to figure out why the colors don't appear correctly in my plots, even though the hex values are correctly matched to the paints. I show the hex value in the tooltip for reference.