<a href="https://colab.research.google.com/github/odu-cs625-datavis/public-fall24-mcw/blob/main/Stacked_Grouped_Bars_Seaborn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stacked and Grouped Bar Charts with Seaborn Objects

In this notebook, we'll demonstrate creating stacked bar charts and grouped bar charts with Seaborn Objects. We'll be re-creating the stacked and grouped charts from the [Vega-Lite Chart Types notebook](https://observablehq.com/@observablehq/vega-lite-chart-types).

In [None]:
import pandas as pd
import seaborn as sns
import seaborn.objects as so
import matplotlib.pyplot as plt

The Vega-Lite Chart Types notebook uses Portland weather data: three years of daily observations of minimum and maximum temperatures, an overall condition (fog, rain, snow, or sun), and the amount of precipitation, sourced from [NOAA’s Global Historical Climatology Network](https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily) (*this is an updated link; the Vega-Lite notebook's link returns a 404*).

I've saved the Portland weather data from the Vega-Lite notebook as [`pwm-weather.csv`](https://github.com/odu-cs625-datavis/public-fall24-mcw/blob/main/pwm-weather.csv). We can load this into our notebook using the pandas `read_csv()` function.



In [None]:
weather = pd.read_csv('https://raw.githubusercontent.com/odu-cs625-datavis/public-fall24-mcw/main/pwm-weather.csv')

In [None]:
weather.head()

## Stacked bar chart

We’ll attempt a stacked bar chart grouped by month, counting the days within each month that fall under each condition. With three years' worth of data, each month should have about 90 days total. Before we can create the charts, we need to do some data processing.
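As a quick check on the "about 90 days" figure, the sketch below sums the days in each calendar month over three consecutive years using only the standard library. The years are placeholders chosen for illustration (an assumption, not necessarily the years covered by the dataset); only February's total changes if a leap year is included.

```python
import calendar

# Placeholder years for illustration only; not necessarily the dataset's years.
years = [2021, 2022, 2023]

# Total days per calendar month, summed across the three years.
days_per_month = {
    calendar.month_abbr[m]: sum(calendar.monthrange(y, m)[1] for y in years)
    for m in range(1, 13)
}

for name, total in days_per_month.items():
    print(name, total)
```

Every month lands between 84 and 93 days, which is the range of bar heights to expect in the charts that follow.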

First, we need to convert the `date` attribute from a string to a datetime, using the pandas [`to_datetime()`](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html) function.

In [None]:
weather['date'] = pd.to_datetime(weather['date'])
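To see what the conversion changes, here is a minimal sketch on a made-up two-row column (the dates are illustrative, not taken from `pwm-weather.csv`). Before the conversion the values are plain strings; afterwards the column has a datetime dtype and exposes the `.dt` accessor.

```python
import pandas as pd

# Made-up dates for illustration; not rows from the weather file.
sample = pd.DataFrame({'date': ['2021-01-15', '2021-02-03']})
before = sample['date'].dtype   # dtype of the raw string column

sample['date'] = pd.to_datetime(sample['date'])
after = sample['date'].dtype    # a datetime64 dtype after conversion

print(before, '->', after)
print(sample['date'].dt.month.tolist())  # month numbers via the .dt accessor
```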

Then, we create a new attribute `month` that holds just the abbreviated month name (for example, `Jan`) extracted from `date`.

**Tip:** If you'll be using Python and time series a lot, you'll want to get familiar with [`strftime()`](https://docs.python.org/3/library/datetime.html#datetime.datetime.strftime) and its various [formatting codes](https://strftime.org/).

In [None]:
weather['month'] = weather['date'].dt.strftime('%b')

Let's peek at the dataset again to see the added attribute.

In [None]:
weather.head()
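For reference, here is a small sketch of a few of the `strftime()` formatting codes mentioned in the tip above, applied to a single arbitrary date (not one taken from the dataset). The month names shown by `%b` and `%B` depend on the active locale; the comments assume the default English names.

```python
from datetime import datetime

# An arbitrary example date, not taken from the weather dataset.
d = datetime(2024, 3, 9)

month_abbrev = d.strftime('%b')    # abbreviated month name (locale-dependent)
month_full = d.strftime('%B')      # full month name (locale-dependent)
month_number = d.strftime('%m')    # zero-padded month number
iso_date = d.strftime('%Y-%m-%d')  # year-month-day

print(month_abbrev, month_full, month_number, iso_date)
```

The `%b` code is the one used above to build the `month` attribute.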

Next, let's create a simple bar chart that counts the number of rows (days) in each month. We map `month` to the x-axis and use the [`so.Count()`](https://seaborn.pydata.org/generated/seaborn.objects.Count.html) stat to compute the y-axis value as the number of rows in each month.

Note: The plotting code in this notebook is wrapped in parentheses `()` so that we can break the method chain over multiple lines.

In [None]:
(
    so.Plot(weather, x="month")
    .add(so.Bar(), so.Count())
)
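The bar heights can be cross-checked outside the plot. The sketch below uses a tiny made-up frame (not the real data) to show the per-month row counts that `so.Count()` is expected to correspond to in this chart, computed with the pandas `value_counts()` method.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({'month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb']})

# Number of rows per month; sort=False avoids reordering by frequency.
rows_per_month = toy['month'].value_counts(sort=False)
print(rows_per_month)
```

Running `weather['month'].value_counts(sort=False)` on the real data is a reasonable way to confirm that each month has roughly 90 rows.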

Now, we want to color the bars based on `condition`. To do that, we add `color="condition"` to the `so.Plot()` call.

In [None]:
(
    so.Plot(weather, x="month", color="condition")
    .add(so.Bar(), so.Count())
)

This isn't quite right. Every condition's bar starts from 0, so the bars overlap each other. To fix this, we add the [`so.Stack()`](https://seaborn.pydata.org/generated/seaborn.objects.Stack.html) move, which offsets each bar so that it starts where the previous one ends.

In [None]:
(
    so.Plot(weather, x="month", color="condition")
    .add(so.Bar(), so.Count(), so.Stack())
)

Now we can see that Portland seems sunny about half the time year-round, with snow from November through April.
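To make explicit what stacking changes relative to the overlapping version, the sketch below computes the bottom and top of each bar segment by hand with pandas, using a tiny made-up frame (not the real data). This is only an illustration of the idea, not seaborn's internal implementation, and the segment order in the actual plot may differ from the alphabetical column order used here.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    'month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb'],
    'condition': ['sun', 'rain', 'sun', 'snow', 'sun'],
})

# Days per (month, condition): one row per month, one column per condition.
counts = pd.crosstab(toy['month'], toy['condition'])

# Stacking: each segment's top is the running total across conditions,
# and its bottom is the top of the segment before it.
tops = counts.cumsum(axis=1)
bottoms = tops - counts

print(counts)
print(bottoms)
print(tops)
```

Without stacking, every segment would have a bottom of 0, which is why the bars overlapped in the earlier chart. With stacking, the top of the last segment equals the month's total number of days.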

## Grouped bar chart

To create a grouped bar chart for this data, we just need to replace `so.Stack()` with [`so.Dodge()`](https://seaborn.pydata.org/generated/seaborn.objects.Dodge.html).

In [None]:
(
    so.Plot(weather, x="month", color="condition")
    .add(so.Bar(), so.Count(), so.Dodge())
)

This isn't exactly like the grouped chart in the Vega-Lite notebook, but it's a standard grouped bar chart.