# Understanding hvPlot's Statistical Plot Types

hvPlot provides several statistical plotting functions that go beyond basic charts. Each plot type reveals different aspects of your data and has specific strengths and limitations. This guide explains when and why to use each type.

## Multivariate Data Visualization

When working with datasets containing multiple variables, understanding relationships between all dimensions becomes challenging. hvPlot offers three complementary approaches:

### Scatter Matrix

**What it shows:** All pairwise relationships between numeric variables

**Strengths:**
- Shows the direction, shape, and approximate strength of each pairwise relationship
- Interactive linking allows exploration across all variable pairs
- Familiar scatter plot format is easy to interpret

**Best for:** Identifying correlations, outliers, and clustering patterns between variable pairs

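**Limitations:** The number of panels grows quadratically with the number of variables, so the grid becomes cluttered quickly; it shows only two variables per panel, never a pattern across all dimensions simultaneously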
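
A scatter matrix is a visual tool, so it can help to pair it with the number each off-diagonal panel hints at. The sketch below is independent of hvPlot and uses only the standard library: it computes the Pearson correlation for every unordered pair of columns, which also makes the panel count (k·(k−1)/2 unique pairs for k variables) concrete. The column names and values are hypothetical toy data.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Map each unordered pair of column names to its Pearson r."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical toy data: three numeric variables, five observations.
data = {
    "height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "weight": [2.1, 3.9, 6.2, 8.0, 9.8],
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],
}

correlations = pairwise_correlations(data)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Three variables give three pairs; ten variables would already give 45, which is where the clutter mentioned above comes from.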

### Parallel Coordinates

**What it shows:** Patterns and relationships across all variables simultaneously

**Strengths:**
- Reveals patterns across all dimensions at once
- Excellent for identifying distinct groups or classes
- Shows which variables contribute most to group differences

**Best for:** Comparing groups across multiple dimensions, identifying which variables distinguish different classes

**Limitations:** Can be difficult to read with many observations; requires some practice to interpret effectively
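
To make the idea concrete, the sketch below turns each observation into one polyline that visits every variable in turn, which is the shape of data a parallel coordinates plot draws. It is a conceptual illustration, not hvPlot's implementation. The min-max rescaling step is an assumption of this sketch: it is a common preprocessing choice when the variables would otherwise share one value axis despite very different ranges. The column names and values are hypothetical.

```python
def min_max_scale(values):
    """Rescale a sequence to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_polylines(columns):
    """Turn column-oriented data into one polyline per observation.

    Each polyline is a list of (variable_name, scaled_value) points,
    visited in the order the variables appear in `columns`.
    """
    scaled = {name: min_max_scale(vals) for name, vals in columns.items()}
    n_rows = len(next(iter(scaled.values())))
    return [
        [(name, scaled[name][i]) for name in scaled]
        for i in range(n_rows)
    ]


# Hypothetical data with very different ranges per variable.
data = {
    "bill_length_mm": [39.0, 46.0, 50.0],
    "body_mass_g": [3500.0, 5000.0, 4000.0],
}

polylines = to_polylines(data)
for line in polylines:
    print(line)
```

Without rescaling, the body mass values would dominate a shared axis and the bill length differences would be nearly invisible.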

### Andrews Curves

**What it shows:** Aggregate differences between classes, with each observation drawn as a smooth curve built from a Fourier series of its variable values

**Strengths:**
- Smooth curves make group differences visually apparent
- Good for showing overall class separation
- Less cluttered than parallel coordinates with many observations

**Best for:** Visualizing overall differences between classes when you care more about separation than specific variable contributions

**Limitations:** Provides less quantitative insight into which specific features drive differences; mathematical transformation makes individual variable contributions less interpretable
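
The transformation itself is short. The sketch below evaluates the standard Andrews curve definition, f(t) = x₁/√2 + x₂·sin(t) + x₃·cos(t) + x₄·sin(2t) + x₅·cos(2t) + …, for one observation. It uses only the standard library and is meant to show the formula, not hvPlot's internals; the function names are this sketch's own.

```python
from math import cos, pi, sin, sqrt


def andrews_curve(row, t):
    """Evaluate the Andrews curve of one observation at angle t.

    f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ...
    """
    total = row[0] / sqrt(2)
    for i, x in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2            # 1, 1, 2, 2, 3, 3, ...
        wave = sin if i % 2 == 1 else cos  # alternate sin, cos
        total += x * wave(harmonic * t)
    return total


def sample_curve(row, samples=5):
    """Sample the curve at evenly spaced angles from -pi to pi."""
    step = 2 * pi / (samples - 1)
    return [andrews_curve(row, -pi + k * step) for k in range(samples)]


# Hypothetical observation with four variables.
observation = [5.1, 3.5, 1.4, 0.2]
print([round(y, 3) for y in sample_curve(observation)])
```

Every variable contributes to the curve at almost every value of t, which is why the plot separates classes well but makes it hard to read off the effect of any single variable.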

## Time Series Analysis

### Lag Plots

**What it shows:** Relationship between each value and the value a fixed number of steps (the lag) earlier in the series

**Strengths:**
- Reveals autocorrelation patterns in time series
- Identifies volatility and stability in temporal data
- Helps detect seasonal or cyclical patterns

**Best for:** Understanding temporal dependencies, comparing volatility between different time series, detecting autocorrelation

**Key insight:** Tight clustering around the diagonal indicates strong positive autocorrelation, meaning stable, predictable behavior; a diffuse cloud of points indicates high volatility or weak temporal correlation
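
The points of a lag plot are simply pairs of values separated by the lag, and the tightness of the cloud corresponds to the correlation between the two halves of each pair. The standard-library sketch below builds those pairs and computes that correlation for two hypothetical series; it illustrates the idea rather than reproducing hvPlot's code.

```python
from math import sqrt


def lag_pairs(series, lag=1):
    """Pair each value y(t) with the value y(t + lag) that follows it."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return list(zip(series[:-lag], series[lag:]))


def lag_correlation(series, lag=1):
    """Pearson correlation between y(t) and y(t + lag)."""
    xs, ys = zip(*lag_pairs(series, lag))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical series: a steady trend and a strictly alternating signal.
trend = [1, 2, 3, 4, 5, 6, 7, 8]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]

print(lag_pairs(trend)[:3])
print(round(lag_correlation(trend), 3))
print(round(lag_correlation(alternating), 3))
```

The trend's pairs all sit on a line parallel to the diagonal, while the alternating signal's pairs sit on the opposite diagonal: still perfectly predictable, but negatively correlated at lag 1. Trying a larger `lag` on a periodic series is one way to look for the cyclical patterns mentioned above.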

## Distribution Analysis

Understanding the distribution of your data is fundamental to statistical analysis. hvPlot provides several plot types that reveal different aspects of data distributions:

### Histograms

**What it shows:** Frequency distribution of values in a single variable

**Strengths:**
- Clear visualization of data distribution shape
- Easy to identify skewness, modality, and outliers
- Familiar and intuitive for most users
- Customizable bin sizes for different levels of detail

**Best for:** Understanding the overall shape and spread of a single variable, identifying distribution patterns

**Limitations:** Can be sensitive to bin size choices; doesn't show relationships between variables
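
The sensitivity to bin size is easy to demonstrate without any plotting. The sketch below counts the same hypothetical bimodal sample into 2, 5, and 10 equal-width bins using a small helper written for this example (it is not an hvPlot function).

```python
def histogram(values, bins, low, high):
    """Count values in `bins` equal-width bins spanning [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that a value equal to `high` lands in the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical bimodal sample: one cluster near 2, another near 8.
sample = [1.1, 1.5, 1.9, 2.4, 2.9, 7.1, 7.6, 8.2, 8.5, 8.9]

coarse = histogram(sample, bins=2, low=0, high=10)
medium = histogram(sample, bins=5, low=0, high=10)
fine = histogram(sample, bins=10, low=0, high=10)

print(coarse)  # [5, 5]
print(medium)  # [3, 2, 0, 2, 3]
print(fine)    # [0, 3, 2, 0, 0, 0, 0, 2, 3, 0]
```

With two bins the sample looks flat; with five or more, the empty region between the clusters becomes visible. Trying several bin counts before drawing conclusions is a reasonable habit.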

### Box Plots

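**What it shows:** Five-number summary (minimum, Q1, median, Q3, maximum); when outliers are drawn as separate points, the whiskers end at the most extreme non-outlying values instead of the minimum and maximum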

**Strengths:**
- Compact summary of distribution characteristics
- Excellent for comparing distributions across groups
- Clearly identifies outliers and quartile ranges
- The median and quartiles are robust to extreme values

**Best for:** Comparing distributions between groups, identifying outliers, understanding data spread and central tendency

**Limitations:** Hides detailed distribution shape; can miss bimodal or complex distributions
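
The numbers behind a box plot can be computed directly. The sketch below uses the standard library to derive the quartiles, the whisker ends, and the outliers for a hypothetical sample. Two choices here are assumptions of this sketch: the quartile interpolation (`method="inclusive"`) and the 1.5 × IQR whisker rule, which is the common Tukey convention. Plotting libraries differ on both, so the values may not match exactly what hvPlot draws.

```python
from statistics import quantiles


def box_summary(values, whisker=1.5):
    """Quartiles, whisker ends, and outliers under the 1.5 * IQR rule."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "whisker_low": inside[0],    # most extreme value inside the fences
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
summary = box_summary(sample)
print(summary)
```

The extreme value 50 is reported as an outlier and leaves the quartiles untouched, which is the robustness noted above. The same summary would be produced by many differently shaped samples, which is why a box plot can hide bimodality.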

### Violin Plots

**What it shows:** Combination of box plot information with kernel density estimation

**Strengths:**
- Shows both summary statistics and distribution shape
- Reveals multimodal distributions that box plots miss
- Good for comparing complex distributions across groups
- More informative than box plots for understanding distribution shape

**Best for:** Comparing detailed distribution shapes across groups, when you need both summary statistics and distribution density

**Limitations:** Can be more complex to interpret; kernel density estimation may smooth over important details
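
The smoothing risk comes from the kernel bandwidth. The sketch below implements a plain Gaussian kernel density estimate with the standard library and counts the peaks of the resulting curve for a narrow and a wide bandwidth. The function names, the `bandwidth` parameter, and the sample are this sketch's own; it does not show how hvPlot computes or configures its density estimate.

```python
from math import exp, pi, sqrt


def gaussian_kde(sample, bandwidth):
    """Return a density function built from one Gaussian kernel per point."""
    norm = 1.0 / (len(sample) * bandwidth * sqrt(2 * pi))

    def density(x):
        return norm * sum(
            exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def count_peaks(density, low, high, steps=200):
    """Count local maxima of `density` on an evenly spaced grid."""
    grid = [low + (high - low) * k / steps for k in range(steps + 1)]
    ys = [density(x) for x in grid]
    return sum(1 for a, b, c in zip(ys, ys[1:], ys[2:]) if a < b > c)


# Hypothetical bimodal sample: one cluster near 2, another near 8.
sample = [1.1, 1.5, 1.9, 2.4, 2.9, 7.1, 7.6, 8.2, 8.5, 8.9]

narrow = count_peaks(gaussian_kde(sample, bandwidth=0.8), 0, 10)
wide = count_peaks(gaussian_kde(sample, bandwidth=4.0), 0, 10)

print(narrow)  # 2
print(wide)    # 1
```

The same data appears bimodal or unimodal depending on the bandwidth alone, so a violin that looks smooth and single-peaked is not proof that the underlying distribution is.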

## Interactive Advantages

When rendered with Bokeh, hvPlot's default plotting backend, these statistical plots gain interactive features:

- **Linked brushing:** Selections in one part of the plot highlight corresponding points elsewhere
- **Linked zooming/panning:** Coordinated exploration across multiple plot panels
- **Hover tooltips:** Detailed information about individual data points

These features let you select, zoom into, and inspect the data directly in the plot, which a static image cannot offer during data exploration.

## Choosing the Right Plot Type

| Goal | Recommended Plot | Why |
|------|------------------|-----|
| Find correlations between variable pairs | Scatter Matrix | Shows quantitative relationships clearly |
| Compare groups across many variables | Parallel Coordinates | Reveals which variables distinguish groups |
| Show overall class separation | Andrews Curves | Emphasizes aggregate differences |
| Analyze temporal dependencies | Lag Plot | Designed specifically for time series patterns |
| Understand single variable distribution | Histogram | Clear frequency distribution visualization |
| Compare distributions across groups | Box Plot or Violin Plot | Box plots for simple comparisons, violin plots for detailed shapes |
| Identify outliers | Box Plot | Draws points beyond the whiskers as individual outlier markers |
| Detect multimodal distributions | Violin Plot or Histogram | Violin plots show density curves, histograms show frequency peaks |
| Quick distribution summary | Box Plot | Compact five-number summary |
| Detailed distribution analysis | Violin Plot | Combines summary statistics with full distribution shape |
| Detect outliers in multivariate data | Scatter Matrix + Parallel Coordinates | Combine pairwise and multi-dimensional views |

## Next Steps

- Learn how to create:
    - [multivariate statistical plots](../how_to/multivariate_statistical_plots.ipynb)
    - [time series lag plots](../how_to/time_series_lag_plots.ipynb)
- See the [reference documentation](../ref/api/index.md) for complete parameter lists
- Explore more visualization options at [holoviews.org](https://holoviews.org)