# Tutorial 7: Advanced Multi-View Composition

---

## Introduction

When visualizing multiple data fields, we might be tempted to use as many visual encoding channels as possible: `x`, `y`, `color`, `size`, `shape`, and so on. However, as the number of encoding channels increases, a chart can rapidly become cluttered and difficult to read. 

An alternative to "over-loading" a single chart is to **compose multiple charts** in a way that facilitates rapid comparisons. You've already learned the basics of faceting, concatenation, layering, and repeating charts in previous tutorials. 

**In this tutorial, we advance beyond the basics** to explore:
- How composition operators work together as a **view composition algebra**
- How to control **resolution** of scales, axes, and legends across views
- Advanced patterns for building sophisticated multi-view dashboards

---

## Learning Goals

By the end of this tutorial, you will be able to:

- **Distinguish** between the four composition operators (facet, concatenate, layer, repeat) and their use cases
- **Apply** the facet operator with layout control using the `columns` parameter
- **Create** nested concatenation layouts for complex view arrangements
- **Understand** and control how scales, axes, and legends resolve across multiple views
- **Combine** multiple composition operators using the view composition algebra
- **Build** sophisticated dashboards that integrate faceting, concatenation, layering, and repetition

---

## Composition Hierarchy: Conceptual Overview

Before we dive into examples, let's establish a mental model for the four composition operators in Altair.

### The Four Composition Operators

**FACET** - Small Multiples
- **What:** Subdivide data into groups, create separate plot for each group
- **When:** Comparing distributions/patterns across categories
- **Layout:** Organized grid (rows/columns)
- **Example:** "Show temperature distribution for each weather type"

**CONCATENATE** - Flexible Positioning
- **What:** Place independent charts side-by-side or stacked
- **When:** Comparing different variables or showing different views
- **Layout:** User-controlled horizontal/vertical arrangement
- **Example:** "Show temperature AND precipitation AND wind in separate panels"

**REPEAT** - Template Application
- **What:** Apply same template specification to multiple fields
- **When:** Creating scatter plot matrices or comparing same analysis across variables
- **Layout:** Automatic grid based on field arrays
- **Example:** "Show scatter plots for all pairs of measurements"

**LAYER** - Overlay
- **What:** Superimpose marks in the same coordinate space
- **When:** Showing reference lines, annotations, or multiple data series together
- **Layout:** Z-axis stacking (overlaid)
- **Example:** "Show bars with average line overlay"

### Decision Tree: Which Operator to Use?

```
┌─ Same data, same scales, overlapping? ──────► LAYER
│
├─ Same data, subdivided into groups? ──────────► FACET
│
├─ Multiple variables, same template? ──────────► REPEAT
│
└─ Different charts, flexible layout? ──────────► CONCATENATE
```

---



## Dataset: Weather Data

We'll use weather statistics for Seattle and New York to explore multi-view composition.

In [4]:
import pandas as pd
import altair as alt

weather = 'https://cdn.jsdelivr.net/npm/vega-datasets@1/data/weather.csv'
df = pd.read_csv(weather)

# View 10 randomly sampled rows
df.sample(10)

| | location | date | precipitation | temp_max | temp_min | wind | weather |
|---|---|---|---|---|---|---|---|
| 2530 | New York | 2014-12-05 | 14.0 | 10.0 | 1.7 | 4.5 | fog |
| 2657 | New York | 2015-04-11 | 0.0 | 16.1 | 7.2 | 7.5 | sun |
| 1125 | Seattle | 2015-01-30 | 0.0 | 8.3 | 1.1 | 0.8 | fog |
| 2551 | New York | 2014-12-26 | 0.0 | 10.6 | 3.3 | 4.7 | sun |
| 1177 | Seattle | 2015-03-23 | 8.1 | 11.1 | 5.6 | 2.8 | fog |
| 284 | Seattle | 2012-10-11 | 0.0 | 13.9 | 7.2 | 1.3 | drizzle |
| 2807 | New York | 2015-09-08 | 0.0 | 32.8 | 22.8 | 5.7 | sun |
| 1591 | New York | 2012-05-10 | 5.6 | 19.4 | 10.6 | 6.2 | rain |
| 1532 | New York | 2012-03-12 | 0.0 | 16.7 | 4.4 | 4.0 | sun |
| 1014 | Seattle | 2014-10-11 | 7.4 | 18.3 | 11.7 | 3.5 | rain |


**Dataset Fields:**
- `date` - Date of observation
- `location` - City (Seattle or New York)
- `temp_max` - Maximum temperature (°C)
- `temp_min` - Minimum temperature (°C)
- `precipitation` - Precipitation (mm)
- `wind` - Wind speed
- `weather` - Weather type (drizzle, fog, rain, snow, sun)

---


Let's create a DataFrame that contains only the Seattle observations.


In [5]:
seattle_data = df[df["location"] == "Seattle"]


## Quick Review: The Four Operators

You've seen these operators before. Here's a quick reminder of the basics before we explore advanced techniques.

### Review 1: Faceting with Encoding Channels

**Exploratory Question:** *How does maximum temperature vary across weather conditions in Seattle?*

In [6]:
colors = alt.Scale(
    domain=['drizzle', 'fog', 'rain', 'snow', 'sun'],
    range=['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52']
)

alt.Chart(seattle_data).mark_bar().encode(
    alt.X('temp_max:Q').bin(True).title('Temperature (°C)'),
    alt.Y('count():Q'),
    alt.Color('weather:N').scale(colors),
    alt.Column('weather:N')  # ← Faceting happens here
).properties(
    width=110,
    height=110
)

**Key Reminder:** Using `column` (or `row`) encoding channels creates small multiples.

---

### Review 2: Concatenation with | and &

**Exploratory Question:** *How do temperature, precipitation, and wind patterns compare across months?*

In [7]:
base = alt.Chart(weather).mark_line().encode(
    alt.X('month(date):T').title(None),
    color='location:N'
).properties(
    width=220,
    height=160
)

temp = base.encode(alt.Y('average(temp_max):Q'))
precip = base.encode(alt.Y('average(precipitation):Q'))
wind = base.encode(alt.Y('average(wind):Q'))

temp | precip | wind  # ← Horizontal concatenation


Altair provides *concatenation* operators to combine arbitrary charts into a composed chart. The `hconcat` operator (shorthand `|`) performs horizontal concatenation, while the `vconcat` operator (shorthand `&`) performs vertical concatenation.

**Key Reminder:** Use `|` for horizontal concatenation, `&` for vertical.


---

### Review 3: Repeat for Efficiency


In [8]:
alt.Chart(df).mark_line().encode(
    alt.X('month(date):T').title(None),
    alt.Y(alt.repeat('column'), aggregate='average', type='quantitative'),
    color='location:N'
).properties(
    width=220,
    height=160
).repeat(
    column=['temp_max', 'precipitation', 'wind']  # ← Template applied to each field
)

**Key Reminder:** `repeat` applies the same specification to multiple fields.

---

### Review 4: Layering with +

**Exploratory Question:** *How do Seattle's monthly temperatures compare to the overall average?*

In [9]:
bars = alt.Chart(seattle_data).mark_bar().encode(
    alt.X('month(date):O'),
    alt.Y('average(temp_max):Q')
)

rule = alt.Chart(seattle_data).mark_rule(stroke='orange', strokeWidth=4).encode(
    alt.Y('average(temp_max):Q')
)

bars + rule  # ← Layering: overlaying in same coordinate space

**Key Reminder:** Use `+` or `alt.layer()` to overlay marks with shared scales.

---

## Advanced Technique 1: Facet Customization
### Controlling Layout with the Columns Parameter
The `column` and `row` encoding channels always place facets in a single row or a single column. When you have many categories, you may want to wrap the facets into a grid with several rows instead. To do this, use the `facet` encoding channel (`alt.Facet`) and set its `columns` parameter to the number of columns you want.



In [11]:

controlled_facetting = alt.Chart(seattle_data).mark_bar().encode(
    alt.X('temp_max:Q').bin(True).title('Temperature (°C)'),
    alt.Y('count():Q'),
    alt.Color('weather:N').scale(colors),
    alt.Facet('weather:N', columns=3) # ← here we set the number of columns and use Facet at the encoding level. 
).properties(
    width=150,
    height=150
)
controlled_facetting

**Try it yourself:** Change `columns=3` to `columns=2` or `columns=4` and see how the layout changes.

---

### Using Both Row and Column

You can facet along both dimensions simultaneously:

In [12]:
alt.Chart(weather).mark_point().encode(
    alt.X('temp_max:Q'),
    alt.Y('precipitation:Q'),
    alt.Row('location:N'),
    alt.Column('weather:N')
).properties(
    width=120,
    height=120
)


This creates a 2D grid: locations as rows, weather types as columns.
Note that you can create a similar grid by using `facet` at the top level. 

## Advanced Technique 2: Nested Concatenation

Concatenation operators can be **nested** to create sophisticated layouts. Understanding operator precedence and parentheses is crucial.

### Example: Mixed Horizontal and Vertical Layout

In [13]:
base = alt.Chart(weather).mark_line().encode(
    alt.X('month(date):T').title(None),
    color='location:N'
).properties(width=200, height=150)

temp = base.encode(alt.Y('average(temp_max):Q').title('Temperature'))
precip = base.encode(alt.Y('average(precipitation):Q').title('Precipitation'))
wind = base.encode(alt.Y('average(wind):Q').title('Wind'))

# Put temp and precip side-by-side, then stack with wind
(temp | precip) & wind

**Important:** Parentheses matter! Try removing them: `temp | precip & wind` produces a different layout.

**Why?** Python's operator precedence means `&` (vertical concat) binds tighter than `|` (horizontal concat).

---

### Alternative: Explicit Methods

You can also use `alt.hconcat()` and `alt.vconcat()` for more explicit control:

In [14]:
alt.vconcat(
    alt.hconcat(temp, precip),
    wind
)

This is functionally equivalent but sometimes clearer for complex layouts.

---


## Advanced Technique 3: Multi-Layer Compositions

You've layered two charts before. Let's explore more sophisticated multi-layer patterns.

### Layering Within Repeat

Combine layer and repeat to create reference lines across multiple panels:

In [17]:
alt.layer(
    # Layer 1: Monthly bars
    alt.Chart().mark_bar().encode(
        alt.X('month(date):O', title='Month'),
        alt.Y(alt.repeat('column'), aggregate='average', type='quantitative')
    ),
    # Layer 2: Overall average line
    alt.Chart().mark_rule(stroke='firebrick', strokeWidth=2).encode(
        alt.Y(alt.repeat('column'), aggregate='average', type='quantitative')
    )
).properties(
    width=200,
    height=150
).repeat(
    data=seattle_data,   # Yes, data can be attached here, at the repeat level
    column=['temp_max', 'precipitation', 'wind']
)

**Result:** Each repeated panel gets both the bars AND the reference line.
Altair layers the two marks on top of each other, so each mini-chart shows the monthly variation plus the overall mean. The *power* move is then the use of `repeat()`:
`repeat()` tells Altair to replicate the layered chart for each field listed in `column=[...]`.
You can think of this as a for loop in visualization form:
```python
for col in ['temp_max', 'precipitation', 'wind']:
    make_chart_for(col)
```
but done declaratively. 


---

### Three-Layer Example: Band + Lines + Points

Create a sophisticated time series with trend line, data points, and uncertainty band:

In [18]:
# Base chart with shared properties
base = alt.Chart(seattle_data).encode(
    alt.X('month(date):T').title('Month')
)

# Layer 1: Confidence band (using min/max as proxy)
band = base.mark_area(opacity=0.2).encode(
    alt.Y('min(temp_max):Q'),
    alt.Y2('max(temp_max):Q')
)

# Layer 2: Line
line = base.mark_line().encode(
    alt.Y('average(temp_max):Q').title('Temperature (°C)')
)

# Layer 3: Points
points = base.mark_point(
    filled=True, 
    size=50, 
    opacity=0.7
).encode(
    alt.Y('average(temp_max):Q')
)



band + line + points

**Notice the layering order:** Band → Line → Points ensures proper visual hierarchy.

A couple of notes:
1. By defining `base` once, all layers share the same data and x-axis encoding, which keeps them perfectly aligned.
2. Layer 1 is new: it draws a shaded area between the minimum and maximum temperature values for each month. This is not a true statistical confidence interval, but it serves as a visual proxy for temperature variability. The transparency (`opacity=0.2`) keeps it subtle, providing context without hiding the other marks. Think of this as "uncertainty context".
3. Layer 2 draws a line showing the average maximum temperature per month. Think of this as the "main trend/signal".
4. Layer 3 adds points on top of the line at each monthly average. Think of this as "data emphasis".



---

## Resolution Systems

When composing multiple views, Altair needs to decide: should scales, axes, and legends be **shared** or **independent** across views?

### Understanding Resolution

**Default Behavior:**
- **Facet:** Shared scales and axes (for easy comparison)
- **Concatenate & Repeat:** Independent `x` and `y` scales by default (each view can have its own range), while non-positional scales such as `color` are shared
- **Layer:** Shared scales (required for overlaying)

Sometimes you want to override these defaults using **resolution methods**.

---

### Scale Resolution

**Problem:** When concatenating charts with very different value ranges, sharing scales can compress one chart.

In [19]:
# Two charts with very different y-ranges
temp_chart = alt.Chart(weather).mark_line().encode(
    alt.X('month(date):T'),
    alt.Y('average(temp_max):Q').title('Temperature (°C)')  # Range: ~0-30
).properties(width=200, height=150)

precip_chart = alt.Chart(weather).mark_line().encode(
    alt.X('month(date):T'),
    alt.Y('average(precipitation):Q').title('Precipitation (mm)')  # Range: ~0-5
).properties(width=200, height=150)

# Default: independent scales (good!)
temp_chart | precip_chart


In [20]:

# Force shared scales (usually bad for different units!)
(temp_chart | precip_chart).resolve_scale(y='shared')

**When to use `resolve_scale()`:**
- `'shared'` - When comparing magnitudes across views (same units)
- `'independent'` - When views show different dimensions (different units)

---

### Axis Resolution

Control whether axes appear in each facet or are shared:

In [21]:
alt.Chart(weather).mark_point().encode(
    alt.X('temp_max:Q'),
    alt.Y('precipitation:Q')
).properties(
    width=150,
    height=150
).facet(
    column='weather:N'
).resolve_axis(
    y='independent'  # Each facet gets its own y-axis
)

**Default for facets:** Shared axes (only leftmost/bottom axes show).  
**Use `'independent'`:** When you want axis labels on every subplot.

---

### Legend Resolution

**Problem:** In complex dashboards, legends can appear in unexpected places.

In [22]:
chart_with_color = alt.Chart(weather).mark_point().encode(
    alt.X('temp_max:Q'),
    alt.Y('precipitation:Q'),
    color='location:N'
).properties(width=150, height=150)

chart_no_color = alt.Chart(weather).mark_bar().encode(
    alt.X('weather:N'),
    alt.Y('count()'),
  ).properties(width=150, height=150)

# Default: legend resolves to entire composition
chart_with_color | chart_no_color



Notice the placement of the location legend: it sits to the right of the weather bar chart, even though it belongs to the scatter plot, which is counterintuitive. To keep the legend next to the chart it describes, we can resolve the color legend as independent.

In [23]:
# Force legend to stay with colored chart
(chart_with_color | chart_no_color).resolve_legend(
    color='independent'
)

**When to use `resolve_legend(color='independent')`:**
- Multi-view dashboards where only some views use color
- When you want legends directly adjacent to their relevant chart

---

### Resolution Quick Reference

| Method | Options | Use Case |
|--------|---------|----------|
| `.resolve_scale(x/y='...')` | `'shared'`, `'independent'` | Control value range alignment |
| `.resolve_axis(x/y='...')` | `'shared'`, `'independent'` | Control axis label placement |
| `.resolve_legend(color='...')` | `'shared'`, `'independent'` | Control legend positioning |

---

## View Composition Algebra

The real power emerges when you **combine** multiple operators. Together, they form a **composition algebra** that can express a wide range of multi-view layouts.

### Building a Dashboard: Step-by-Step

Let's build a comprehensive Seattle weather dashboard that uses all four operators.

We are going to do something you haven't seen before: we will create chart objects that don't include any data until the last step. This means each visualization will render empty until the data is attached at the end.

#### Step 1: Create a SPLOM (using Repeat)

In [24]:
splom = alt.Chart().mark_point(
    filled=True, 
    size=15, 
    opacity=0.5
).encode(
    alt.X(alt.repeat('column'), type='quantitative'),
    alt.Y(alt.repeat('row'), type='quantitative')
).properties(
    width=115,
    height=115
).repeat(
    row=['temp_max', 'precipitation', 'wind'],
    column=['wind', 'precipitation', 'temp_max']
)
splom

---

#### Step 2: Create Monthly Histograms with Reference Lines (using Layer + Repeat)

In [25]:
monthly_hists = alt.layer(
    # Layer 1: Bars
    alt.Chart().mark_bar().encode(
        alt.X('month(date):O', title='Month'),
        alt.Y(alt.repeat('row'), aggregate='average', type='quantitative')
    ),
    # Layer 2: Reference line
    alt.Chart().mark_rule(stroke='firebrick').encode(
        alt.Y(alt.repeat('row'), aggregate='average', type='quantitative')
    )
).properties(
    width=160,
    height=115
).repeat(
    row=['temp_max', 'precipitation', 'wind']
)
monthly_hists

Wait, is something wrong with my code?
Nope, your code is correct. We have not attached any data to the histograms yet, so right now both layers are empty.
Fear not, we will fix that in Step 4.


---

#### Step 3: Create Temperature Distribution by Weather (using Facet)

In [26]:
colors = alt.Scale(
    domain=['drizzle', 'fog', 'rain', 'snow', 'sun'],
    range=['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52']
)

temp_by_weather = alt.Chart().mark_bar().encode(
    alt.X('temp_max:Q', bin=True, title='Temperature (°C)'),
    alt.Y('count():Q'),
    alt.Color('weather:N', scale=colors),
    alt.Facet('weather:N')
).properties(
    width=115,
    height=100
)

temp_by_weather   # don't worry, it will soon be attached to data

---

#### Step 4: Compose Everything (using Concatenate)

In [27]:
dashboard = alt.vconcat(
    alt.hconcat(splom, monthly_hists),  # Top row: SPLOM + histograms
    temp_by_weather,                     # Bottom row: faceted temp
    data=seattle_data,                   # Finally, attach the data to all charts
    title='Seattle Weather Dashboard'
).resolve_legend(
    color='independent'  # Keep color legend with temp chart
)
dashboard


### Composition Model

The structure of our dashboard:

```
vconcat (vertical concatenation)
├─ hconcat (horizontal concatenation)
│  ├─ repeat(row=[...], column=[...])  ← SPLOM
│  │  └─ scatter plot base
│  └─ repeat(row=[...])                 ← Monthly histograms
│     └─ layer
│        ├─ bar chart
│        └─ rule (reference line)
└─ facet(column='weather')              ← Temperature distributions
   └─ histogram base
```

**This is the composition algebra in action!** Four operators working together to create a sophisticated multi-view dashboard.

---

### Design Principles for Multi-View Compositions

When building complex dashboards, keep these principles in mind:

1. **Alignment Matters**
   - Use consistent `width` and `height` properties
   - Align related views for visual comparison

2. **Resolution Strategy**
   - Default resolution is usually good
   - Use `resolve_legend(color='independent')` in mixed dashboards
   - Only use `resolve_scale()` when you have a specific reason

3. **White Space is Good**
   - Don't overcrowd - leave breathing room
   - Use facet `spacing` parameter if needed
   - Consider breaking very dense dashboards into multiple views

4. **Each View Should Have Purpose**
   - Don't add views just because you can
   - Each chart should answer a specific question
   - Remove redundant visualizations

5. **Consistent Encoding Choices**
   - Use same color schemes across related views
   - Keep similar data on similar scales
   - Maintain consistent mark types for same data types

---

## Summary

### What We Learned

**The Four Operators:**
- **Facet:** Small multiples for categorical comparisons
- **Concatenate:** Flexible layouts for independent views
- **Repeat:** Efficient templates for multiple fields
- **Layer:** Overlays for reference lines and annotations

**Advanced Techniques:**
- Facet operator with `columns` parameter for layout control
- Nested concatenation with proper parentheses
- Multi-layer compositions within repeated panels
- Deferring data attachment until the final composition step

**Resolution Systems:**
- Scale resolution controls value range alignment
- Axis resolution controls label placement
- Legend resolution controls legend positioning
- Usually defaults are good; override when needed

**Composition Algebra:**
- Operators can be arbitrarily combined
- Create sophisticated dashboards by nesting operators
- Follow design principles for effective multi-view displays

---

## Next Steps

- **Practice** combining operators with your own datasets
- **Experiment** with different layouts and arrangements
- **Consider** which views best serve your analytical questions
- **Next tutorial:** Interactive selections and coordinated views

**Remember:** Multi-view composition is an art and a science. Start simple, add complexity purposefully, and always ask: "Does this view help answer my question?"

