# Understanding Grouping and Coloring Options

This guide explains the key differences between hvPlot's three main approaches to handling categorical data: `groupby`, `by`, and `color`/`c`. Knowing what each one returns, and how you can interact with the result, helps you pick the right approach for a given plot.

## Overview of the Three Approaches

hvPlot provides three primary ways to handle categorical data in your plots:

1. **`groupby`**: Creates interactive widgets with [HoloMap](inv:holoviews#holoviews.HoloMap) / [DynamicMap](inv:holoviews#holoviews.DynamicMap) containers
2. **`by`**: Creates multiple plot elements in [NdOverlay](inv:holoviews#holoviews.NdOverlay) or [NdLayout](inv:holoviews#holoviews.NdLayout) containers
3. **`color`/`c`**: Creates vectorized coloring within a single plot element

Each approach produces different outputs, offers different interaction capabilities, and has different performance characteristics.

In [None]:
import hvplot.pandas

penguins = hvplot.sampledata.penguins("pandas").dropna()
penguins.head(3)

## `groupby`: Interactive Widgets

The `groupby` parameter creates **interactive widgets** that allow users to filter and explore different subsets of your data dynamically.

### What it creates:
- **HoloViews containers**: A [DynamicMap](inv:holoviews#holoviews.DynamicMap) by default, or a [HoloMap](inv:holoviews#holoviews.HoloMap) when `dynamic=False`
- **Interactive widgets**: Automatically generated based on data type
- **Single view at a time**: Only one category is visible at any moment

### When to use:
- When you want to explore different subsets of data interactively
- When you have many categories that would clutter a single plot
- When building dashboards or interactive reports
- When you need to reduce visual complexity

In [None]:
plot_groupby = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    groupby='species',
    title="Groupby: Interactive Widget",
    width=400,
)
plot_groupby

### Indexing and access:

In [None]:
# Access specific category: plot['category_value']
adelie_only = plot_groupby['Adelie']  # Shows only Adelie penguins
print(f"Type of groupby plot: {type(plot_groupby)}")
print(f"Type of specific species: {type(adelie_only)}")
adelie_only

## `by`: Multiple Plot Elements

The `by` parameter creates **multiple plot elements** shown simultaneously, either overlaid or in separate subplots.

### What it creates:
- [NdOverlay](inv:holoviews#holoviews.NdOverlay): Multiple elements overlaid (default)
- [NdLayout](inv:holoviews#holoviews.NdLayout): Separate subplots when `subplots=True`
- **All categories visible**: Simultaneously displayed

### When to use:
- When you want to compare categories side-by-side
- When you have a manageable number of categories (typically < 10)
- When color differentiation is sufficient for your analysis
- When you need all data visible at once

In [None]:
overlay_plot = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    by='species',
    title="By: Overlaid Elements"
)
overlay_plot

In [None]:
plot_by_subplots = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    by='species',
    width=300,
    subplots=True,
)
plot_by_subplots.cols(2)

### Indexing and access:

In [None]:
# Access specific category: plot['category_value']
adelie_element = overlay_plot['Adelie']  # Returns just the element for Adelie penguins
print(f"Type of 'by' plot: {type(overlay_plot)}")
print(f"Type of specific species element: {type(adelie_element)}")
adelie_element

## `color`/`c`: Vectorized Coloring

The `color` parameter creates **vectorized coloring** within a single plot element, where each data point is colored based on the category value.

### What it creates:
- **Single plot element**: One unified [Scatter](inv:holoviews#holoviews.Scatter), [Curve](inv:holoviews#holoviews.Curve), etc.
- **Vectorized coloring**: Each point colored by category
- **Cannot be indexed by category**: Single element, not separable by category

### When to use:
- When you want the best performance for large datasets
- When you want to color by a numeric column using a smooth, continuous colormap
- When you don't need to isolate specific categories

In [None]:
plot_color = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    color='species',
    title="Color: Vectorized Coloring"
)
plot_color

In [None]:
# Alternative syntax using 'c'
plot_c = penguins.hvplot.scatter(
    x='bill_length_mm',
    y='bill_depth_mm',
    c='species',
    title="Using 'c' parameter (same result)"
)
plot_c

### Indexing and access:

In [None]:
# A single element cannot be indexed by category, so this won't work:
print(f"Type of 'color' plot: {type(plot_color)}")
try:
    adelie_color = plot_color['Adelie']  # This will raise an error!
except Exception as exc:
    # Report the actual exception type instead of hiding it
    print(f"Error ({type(exc).__name__}): you cannot index a single element by category!")

## Visual Output Comparison

Let's compare all three approaches side by side to see the differences:

In [None]:
import panel as pn

width = 300

groupby_plot = penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    groupby='species', title="groupby='species'",
    frame_width=width, widget_location='bottom_right',
)

by_plot = penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species', title="by='species'",
    frame_width=width,
)

color_plot = penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    color='species', title="color='species'",
    frame_width=width,
)

pn.Column(pn.Row(by_plot, color_plot), groupby_plot)

### Key Visual Differences:

- **`groupby='species'`** shows only one species at a time with a widget
- **`by='species'`** shows all species overlaid with different colors and a legend
- **`color='species'`** looks similar to `by` but is a single plot element

When using `subplots=True` with `by`, you get separate panels:

In [None]:
penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species', subplots=True,
    width=300, height=250,
)

## Advanced: Combining Approaches

You can combine these approaches for more complex visualizations:

In [None]:
penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    groupby='island', color='species',
    width=500, height=400,
    title="Groupby island, colored by species"
)


## Summary

| Aspect            | `groupby`          | `by`               | `color`/`c`            |
| ----------------- | ------------------ | ------------------ | ---------------------- |
| **HoloViews object** | [HoloMap](inv:holoviews#holoviews.HoloMap) / [DynamicMap](inv:holoviews#holoviews.DynamicMap) | [NdOverlay](inv:holoviews#holoviews.NdOverlay) / [NdLayout](inv:holoviews#holoviews.NdLayout) | Single Element         |
| **Interactivity** | Widget-based       | Static overlay     | Static single plot     |
| **Indexing**      | `plot['category']` | `plot['category']` | Not available          |
| **Performance**   | Variable (one group rendered at a time) | Medium (one element per category) | Best (single element) |
| **Visual**        | One at a time      | All simultaneous   | All simultaneous       |
| **Use case**      | Exploration        | Comparison         | Performance/Aesthetics |

### Choose the approach that best matches your needs:

- **Use `groupby`** for interactive exploration of many categories
- **Use `by`** for direct comparison of a manageable number of categories
- **Use `color`** for performance with large datasets or aesthetic coloring

:::{admonition} Further Reading
:class: seealso
See the [HoloViews Reference Manual](https://holoviews.org/reference_manual/index.html) for more information on the various objects created by an hvPlot plot.
:::