# Understanding Grouping and Coloring Options

This guide explains the key differences between hvPlot's three main approaches to handling categorical data: [groupby](option-groupby), [by](option-by), and [color/c](option-color). Understanding these differences is important for creating effective visualizations and choosing the right approach for your specific use case.

## Overview of the Three Approaches

hvPlot provides three primary ways to handle categorical data in your plots:

1. **`groupby`**: Creates interactive widgets with [HoloMap](https://holoviews.org/reference_manual/holoviews.html#holoviews.HoloMap) / [DynamicMap](https://holoviews.org/reference_manual/holoviews.html#holoviews.DynamicMap) containers
2. **`by`**: Creates multiple plot elements in [NdOverlay](https://holoviews.org/reference_manual/holoviews.html#holoviews.NdOverlay) or [NdLayout](https://holoviews.org/reference_manual/holoviews.html#holoviews.NdLayout) containers
3. **`color`/`c`**: Creates vectorized coloring within a single plot element

Each approach produces different outputs, offers different interaction capabilities, and has different performance characteristics.

In [None]:
import hvplot.pandas

penguins = hvplot.sampledata.penguins("pandas").dropna()
penguins.head(3)

## `groupby`: Interactive Widgets

The `groupby` parameter creates **interactive widgets** that allow users to filter and explore different subsets of your data dynamically.

### What it creates:
- **HoloViews containers**: [HoloMap](https://holoviews.org/reference_manual/holoviews.html#holoviews.HoloMap) or [DynamicMap](https://holoviews.org/reference_manual/holoviews.html#holoviews.DynamicMap) object
- **Interactive widgets**: Automatically generated based on data type
- **Single view at a time**: Only one category visible per interaction

### When to use:
- When you want to explore different subsets of data interactively
- When you have many categories that would clutter a single plot
- When building dashboards or interactive reports
- When you need to reduce visual complexity

In [None]:
plot_groupby = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    groupby='species',
    title="Groupby: Interactive Widget",
    width=400,
    dynamic=False,  # precompute every group into a HoloMap so the widget works without a live Python kernel
)
plot_groupby

### Indexing and Access

When you use `groupby`, hvPlot creates a **HoloViews container object** (HoloMap or DynamicMap) that acts like a dictionary. This container stores separate plot elements for each category value, which is why you can "index into" specific categories.

**Why indexing works**: The container maps category values (like `'Adelie'`, `'Chinstrap'`, `'Gentoo'`) to individual plot elements. Each category becomes a "key" that you can use to access its corresponding plot.

**What happens when you index**: You extract a single plot element from the container, giving you just the data for that specific category without the interactive widget.

**Use cases**: 
- Extract specific categories for further analysis
- Combine individual categories in custom layouts
- Access the underlying plot element for advanced customization

In [None]:
# Access specific category: plot['category_value']
adelie_only = plot_groupby['Adelie']  # Shows only Adelie penguins
print(f"Type of groupby plot: {type(plot_groupby)}")
print(f"Type of specific species: {type(adelie_only)}")
adelie_only

## `by`: Multiple Plot Elements

The `by` parameter creates **multiple plot elements** shown simultaneously, either overlaid or in separate subplots.

### What it creates:
- [NdOverlay](https://holoviews.org/reference_manual/holoviews.html#holoviews.NdOverlay): Multiple elements overlaid (default)
- [NdLayout](https://holoviews.org/reference_manual/holoviews.html#holoviews.NdLayout): Separate subplots when `subplots=True`
- **All categories visible**: Simultaneously displayed

### When to use:
- When you want to compare categories side-by-side
- When you have a manageable number of categories (typically < 10)
- When color differentiation is sufficient for your analysis
- When you need all data visible at once

In [None]:
overlay_plot = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    by='species',
    title="By: Overlaid Elements"
)
overlay_plot

In [None]:
plot_by_subplots = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    by='species',
    width=300,
    subplots=True,
)
plot_by_subplots.cols(2)

### Indexing and Access

Similar to `groupby`, using `by` creates a **HoloViews container object** (NdOverlay or NdLayout) that you can index into. However, the behavior is different because all elements are displayed simultaneously.

**Why indexing works**: The NdOverlay/NdLayout container stores individual plot elements for each category, just like a dictionary mapping category values to plot elements.

**What happens when you index**: You extract a single plot element from the overlay/layout. This gives you just that category's data as a standalone plot element, separate from the others.

**Key difference from groupby**: While `groupby` shows one category at a time with widgets, `by` shows all categories together. You can still extract individual elements by key, just as you can with `groupby` objects.

In [None]:
# Access specific category: plot['category_value']
adelie_element = overlay_plot['Adelie']  # Returns just the element for Adelie penguins
print(f"Type of 'by' plot: {type(overlay_plot)}")
print(f"Type of specific species element: {type(adelie_element)}")
adelie_element

## `color`/`c`: Vectorized Coloring

The `color` parameter creates **vectorized coloring** within a single plot element, where each data point is colored based on the category value.

### What it creates:
- **Single plot element**: One unified [Scatter](https://holoviews.org/reference_manual/holoviews.html#holoviews.Scatter), [Curve](https://holoviews.org/reference_manual/holoviews.html#holoviews.Curve), etc.
- **Vectorized coloring**: Each point colored by category
- **Cannot be indexed**: Single element, not separable by category

### When to use:
- When you want the best performance for large datasets
- When you color by a numeric column and need a smooth, continuous color mapping
- When you don't need to isolate specific categories

In [None]:
plot_color = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    color='species',
    title="Color: Vectorized Coloring"
)
plot_color

### Indexing and Access

Unlike `groupby` and `by`, using `color` creates a **single plot element** rather than a container object. This means indexing by category is not possible.

**Why indexing doesn't work**: There's no container structure - just one plot element where the color information is stored as data within the element itself. The categories exist as color mappings, not as separate, accessible plot elements.

**What this means**: All your data points are part of one unified plot object. The categorical information is encoded in the visual properties (colors) rather than the data structure.

**Implications**:
- You cannot separate categories by indexing the plot object
- Better performance since there's only one plot element to render
- All data is always visible
- Ideal when you want aesthetic grouping without interactive separation

In [None]:
print(f"Type of 'color' plot: {type(plot_color)}")
try:
    adelie_color = plot_color['Adelie']  # Indexing a single element by category is expected to fail.
except Exception as err:
    print(f"Error ({type(err).__name__}): you cannot index a single element by category!")

## Understanding Container Objects vs Single Elements

The key concept underlying these differences is **how HoloViews organizes your data**:

### Container Objects (indexable)
- **`groupby`** → HoloMap/DynamicMap: `{category: plot_element}`
- **`by`** → NdOverlay/NdLayout: `{category: plot_element}`

These containers act like dictionaries where each category value maps to a separate plot element. This structure enables indexing with `plot['category_name']`.

### Single Elements (not indexable)
- **`color`** → Single plot element (Scatter, Curve, etc.)

Here, all data points belong to one plot element. Categories are encoded as visual properties (colors) rather than separate data structures.

This fundamental difference affects not just indexing, but also performance, interactivity, and how you can manipulate the resulting plots.

:::{seealso}
[Introduction to HoloViews Elements](https://holoviews.org/getting_started/Introduction.html)
:::

## Visual Output Comparison

Let's compare all three approaches side by side to see the differences:

In [None]:
import panel as pn

width = 300

groupby_plot = penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    groupby='species', title="groupby='species'",
    frame_width=width, widget_location='bottom_right',
    shared_axes=False,
)

by_plot = penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species', title="by='species'",
    frame_width=width,
)

color_plot = penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    color='species', title="color='species'",
    frame_width=width,
)

pn.Column(pn.Row(groupby_plot, by_plot), color_plot)

### Key Visual Differences:

- **`groupby='species'`** shows only one species at a time with a widget
- **`by='species'`** shows all species overlaid with different colors and a legend
- **`color='species'`** looks similar to `by` but is a single plot element

When using `subplots=True` with `by`, you get separate panels:

In [None]:
penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species', subplots=True,
    width=300, height=250,
)

## Advanced: Combining Approaches

You can combine these approaches for more complex visualizations:

In [None]:
penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    groupby='island', color='species',
    width=500, height=400, dynamic=False,
    title="Groupby island, colored by species"
)


## Summary

| Aspect            | `groupby`          | `by`               | `color`/`c`            |
| ----------------- | ------------------ | ------------------ | ---------------------- |
| **HoloViews object** | [HoloMap](https://holoviews.org/reference_manual/holoviews.html#holoviews.HoloMap) / [DynamicMap](https://holoviews.org/reference_manual/holoviews.html#holoviews.DynamicMap) | [NdOverlay](https://holoviews.org/reference_manual/holoviews.html#holoviews.NdOverlay) / [NdLayout](https://holoviews.org/reference_manual/holoviews.html#holoviews.NdLayout) | Single Element         |
| **Interactivity** | Widget-based       | Static overlay     | Static single plot     |
| **Indexing**      | `plot['category']` | `plot['category']` | Not available          |
| **Performance**   | Variable           | Medium             | Best                   |
| **Visual**        | One at a time      | All simultaneous   | All simultaneous       |
| **Use case**      | Exploration        | Comparison         | Performance/Aesthetics |

### Choose the approach that best matches your needs:

- **Use `groupby`** for interactive exploration of many categories
- **Use `by`** for direct comparison of a manageable number of categories
- **Use `color`** for performance with large datasets or aesthetic coloring

:::{admonition} Further Reading
:class: seealso
See the [HoloViews Reference Manual](https://holoviews.org/reference_manual/index.html) for more information on the various objects that hvPlot creates.
:::