# Count Plot Examples: Comprehensive Guide

This notebook demonstrates the various capabilities of the `count_plot` function, including:
- **Data Types**: Normalized vs Absolute counts
- **Layout Types**: Standard vs Grouped layouts  
- **Figure Compositions**: Standalone vs Combined subplots
- **Customization Options**: Colors, spacing, labels, statistical comparisons

---


## ✅ **Basic Functionality**:
- Normalized vs Absolute count plots
- Different data selection options

## ✅ **Advanced Grouping**:
- Standard layout (group_size=1) 
- Paired grouping (group_size=2)
- Custom grouping (group_size=3+)
- Fine-tuning spacing parameters

## ✅ **Figure Composition**:
- Standalone plots with automatic saving
- Combined subplots for comparisons
- Custom subplot arrangements

## ✅ **Statistical Analysis**:
- Automatic statistical comparisons
- Group-aware statistical testing
- Significance marker interpretation

## ✅ **Customization**:
- Custom styling and colors
- Flexible output formats
- Layout parameter optimization

## 🎯 **Use Cases**:
- **Drug screening**: Compare treatment effects
- **Dose-response**: Group by concentration ranges  
- **Multi-condition**: Compare different experimental setups
- **Quality control**: Absolute vs normalized count validation
- **Publication**: High-quality figure generation

The `count_plot` function visualizes cell count data, annotates the plots with statistical comparisons, and saves publication-ready figures.


In [None]:

## Setup and Data Loading

import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path

from omero_screen_plots import count_plot, PlotType
from omero_screen_plots.colors import COLOR

# Setup output directory
path = Path("../images")
path.mkdir(parents=True, exist_ok=True)

# Load sample data
df = pd.read_csv("data/sample_plate_data.csv")

# Define conditions for examples
conditions = ['control', 'cond01', 'cond02', 'cond03']
print("Available conditions:", conditions)
print("Available cell lines:", df['cell_line'].unique())
print("Data shape:", df.shape)
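
Before plotting, it can help to confirm that the table contains the columns the examples rely on. The sketch below is a hypothetical helper, not part of `omero_screen_plots`; the column names are the ones used in this notebook (`condition`, `cell_line`, `plate_id`) and may differ in your data.

```python
from collections.abc import Iterable

# Column names taken from the examples in this notebook; adjust to your data.
REQUIRED_COLUMNS = ("condition", "cell_line", "plate_id")


def missing_columns(
    columns: Iterable[str],
    required: Iterable[str] = REQUIRED_COLUMNS,
) -> list[str]:
    """Return the required column names that are absent, in the order given."""
    present = set(columns)
    return [name for name in required if name not in present]


# Example with a plain list of column names.
example_columns = ["plate_id", "well", "condition"]
print(missing_columns(example_columns))  # ['cell_line']
```

With a loaded DataFrame, the same check would be `missing_columns(df.columns)`; an empty list means all expected columns are present.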

In [None]:
# 1. Basic Count Plots: Normalized vs Absolute

## 1.1 Normalized Count Plot (Default)
# Shows counts relative to control condition

fig, ax = count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    plot_type=PlotType.NORMALISED,  # Default
    title="Normalized Count Plot - MCF10A",
    fig_size=(8, 6),
    save=True,
    path=path,
)
plt.show()

In [None]:
## 1.2 Absolute Count Plot
# Shows raw counts without normalization

fig, ax = count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    plot_type=PlotType.ABSOLUTE,
    title="Absolute Count Plot - MCF10A",
    fig_size=(8, 6),
    save=True,
    path=path,
)
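
The difference between the two plot types can be illustrated with a few lines of plain Python. This is only a sketch of the idea "counts relative to the control condition"; the exact calculation inside `count_plot` (for example, whether it normalizes per plate) is an assumption here, and the counts are made-up illustrative numbers.

```python
def normalise_counts(counts: dict[str, float], control: str) -> dict[str, float]:
    """Divide every condition's count by the control condition's count."""
    reference = counts[control]
    if reference == 0:
        raise ValueError(f"Control {control!r} has a count of zero")
    return {condition: value / reference for condition, value in counts.items()}


# Illustrative absolute counts for a single replicate (not real data).
absolute = {"control": 2000, "cond01": 1500, "cond02": 500}
relative = normalise_counts(absolute, control="control")
print(relative)  # {'control': 1.0, 'cond01': 0.75, 'cond02': 0.25}
```

Under this reading, the control always sits at 1.0 in a normalized plot, while an absolute plot keeps the raw counts on the y-axis.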


In [None]:
# 2. Grouped Layout Examples

## 2.1 Standard Layout (group_size = 1)
# Traditional layout with all conditions in a single row

fig, ax = count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Standard Layout - All Conditions",
    fig_size=(10, 6),
    group_size=1,  # Standard layout
    save=True,
    path=path,
)


In [None]:
## 2.2 Grouped Layout (group_size = 2)
# Groups conditions in pairs with gaps between groups
# Statistical comparisons are done within each group

fig, ax = count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Grouped Layout - Pairs (group_size=2)",
    fig_size=(10, 6),
    group_size=2,  # Group in pairs
    within_group_spacing=0.3,
    between_group_gap=0.8,
    save=True,
    path=path,
)
plt.show()

In [None]:
## 2.3 Custom Grouped Layout (group_size = 3)
# Groups conditions in triplets - useful for dose-response studies

# Let's create a 6-condition example for better demonstration
extended_conditions = ['control', 'cond01', 'cond02', 'cond03', 'cond04', 'cond05']

# Filter to only use conditions that exist in the data
available_conditions = [c for c in extended_conditions if c in df['condition'].unique()]
print(f"Using conditions: {available_conditions}")

if len(available_conditions) >= 3:
    fig, ax = count_plot(
        df=df,
        norm_control="control",
        conditions=available_conditions[:6],  # Use up to 6 conditions
        condition_col="condition",
        selector_col="cell_line",
        selector_val="MCF10A",
        title="Grouped Layout - Triplets (group_size=3)",
        fig_size=(7, 5),
        group_size=3,  # Group in triplets
        within_group_spacing=0.2,
        between_group_gap=1.0,
        save=True,
        path=path,
    )
    plt.show()
else:
    print("Not enough conditions available for triplet grouping")

In [None]:
# 3. Combined Subplot Examples

## 3.1 Multiple Cell Lines in Subplots
# Compare the same conditions across different cell lines (here: the values of the 'clone' column)

# Get all clones present in the data
clones = df['clone'].unique()
print(f"Comparing clones: {clones}")

fig, axes = plt.subplots(1, len(clones), figsize=(2, 2))
if len(clones) == 1:
    axes = [axes]  # Ensure axes is always a list

for i, clone in enumerate(clones):
    count_plot(
        df=df,
        norm_control="control",
        conditions=conditions,
        condition_col="condition",
        selector_col="clone",
        selector_val=clone,
        title=f"Counts - {clone}",
        axes=axes[i],
        save=False,  # Don't save individual plots
    )
    axes[i].set_title(f"Counts - {clone}", fontsize=6, weight="bold", x=0, y=1.1, ha="left")

plt.tight_layout()
plt.savefig(path / "combined_cell_lines_comparison.pdf", dpi=300, bbox_inches='tight')
plt.show()

In [None]:
## 3.2 Normalized vs Absolute Side-by-Side
# Compare normalized and absolute counts for the same data

fig, axes = plt.subplots(1, 2, figsize=(3, 2))

# Normalized counts
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    plot_type=PlotType.NORMALISED,
    title="Normalized Counts",
    axes=axes[0],
    save=False,
)
axes[0].set_title("Normalized Counts", fontsize=7, weight="bold", x=0, y=1.1)

# Absolute counts
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    plot_type=PlotType.ABSOLUTE,
    title="Absolute Counts",
    axes=axes[1],
    save=False,
)
axes[1].set_title("Absolute Counts", fontsize=7, weight="bold", x=0, y=1.1)

plt.tight_layout()
plt.savefig(path / "normalized_vs_absolute_comparison.pdf", dpi=300, bbox_inches='tight')
plt.show()

In [None]:
## 3.3 Layout Comparison: Standard vs Grouped
# Compare different grouping strategies for the same data

fig, axes = plt.subplots(2, 1, figsize=(2, 3))

# Standard layout (group_size = 1)
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Standard Layout (group_size=1)",
    axes=axes[0],
    group_size=1,
    save=False,
    x_label=False,
)
axes[0].set_title("Standard Layout (group_size=1)", fontsize=7, weight="bold", x=0, y=1.1)

# Grouped layout (group_size = 2)
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Grouped Layout (group_size=2)",
    axes=axes[1],
    group_size=2,
    within_group_spacing=0.3,
    between_group_gap=0.8,
    save=False,
)
axes[1].set_title("Grouped Layout (group_size=2)", fontsize=7, weight="bold", x=0, y=1.1)

plt.tight_layout()
plt.savefig(path / "layout_comparison_standard_vs_grouped.pdf", dpi=300, bbox_inches='tight')
plt.show()

In [None]:
# 4. Customization Examples

## 4.1 Custom Styling and Labels

fig, ax = count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Custom Styled Count Plot",
    # Custom styling options
    fig_size=(6, 4),
    size_units="cm",
    # Layout options
    x_label=True,  # Show x-axis labels
    # File output options
    save=True,
    path=path,
    file_format="png",  # Save as PNG instead of PDF
    dpi=300,
    tight_layout=True,
)
plt.show()
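
Matplotlib's `figsize` argument is always given in inches, whereas `count_plot` accepts `fig_size` together with `size_units="cm"`. When building a subplot grid with `plt.subplots`, a small conversion helper keeps both in the same unit. The helper below is a hypothetical convenience function, not part of `omero_screen_plots`.

```python
CM_PER_INCH = 2.54


def cm_to_inches(size_cm: tuple[float, float]) -> tuple[float, float]:
    """Convert a (width, height) figure size from centimetres to inches."""
    width_cm, height_cm = size_cm
    return (width_cm / CM_PER_INCH, height_cm / CM_PER_INCH)


width_in, height_in = cm_to_inches((6, 4))
print(f"{width_in:.2f} x {height_in:.2f} inches")  # 2.36 x 1.57 inches
```

The result can be passed directly to Matplotlib, for example `plt.subplots(1, 2, figsize=cm_to_inches((12, 4)))`.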

In [None]:
## 4.2 Fine-tuning Grouped Layout Parameters
# Demonstrate the effect of different spacing parameters

fig, axes = plt.subplots(3, 1, figsize=(2, 5))

# Tight grouping
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Tight Grouping (small spacing)",
    axes=axes[0],
    group_size=2,
    within_group_spacing=0.1,  # Very close within groups
    between_group_gap=0.4,     # Small gap between groups
    save=False,
)
axes[0].set_title("Tight Grouping (small spacing)", fontsize=7, weight="bold", x=0, y=1.1)

# Medium grouping
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Medium Grouping (default spacing)",
    axes=axes[1],
    group_size=2,
    within_group_spacing=0.2,  # Default spacing
    between_group_gap=0.5,     # Default gap
    save=False,
)
axes[1].set_title("Medium Grouping (default spacing)", fontsize=7, weight="bold", x=0, y=1.1)

# Wide grouping
count_plot(
    df=df,
    norm_control="control",
    conditions=conditions,
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Wide Grouping (large spacing)",
    axes=axes[2],
    group_size=2,
    within_group_spacing=0.4,  # Wide spacing within groups
    between_group_gap=1.2,     # Large gap between groups
    save=False,
)
axes[2].set_title("Wide Grouping (large spacing)", fontsize=7, weight="bold", x=0, y=1.1)

plt.tight_layout()
plt.savefig(path / "spacing_parameter_comparison.pdf", dpi=300, bbox_inches='tight')
plt.show()
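
To build intuition for `within_group_spacing` and `between_group_gap`, the sketch below computes x positions under one simple, assumed model: neighbouring bars in the same group are `within_group_spacing` apart, and the first bar of each new group sits `between_group_gap` after the previous bar. This is a hypothetical illustration only; the positions `count_plot` actually uses may be computed differently.

```python
def bar_positions(
    n_conditions: int,
    group_size: int,
    within_group_spacing: float,
    between_group_gap: float,
) -> list[float]:
    """Return one x position per condition under the assumed spacing model."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    positions = []
    x = 0.0
    for index in range(n_conditions):
        if index > 0:
            # A new group starts whenever the index is a multiple of group_size.
            starts_new_group = index % group_size == 0
            x += between_group_gap if starts_new_group else within_group_spacing
        positions.append(round(x, 6))  # rounding hides float accumulation noise
    return positions


# Four conditions in pairs, using the values from the group_size=2 example.
print(bar_positions(4, 2, 0.3, 0.8))  # [0.0, 0.3, 1.1, 1.4]
```

In this model, increasing `between_group_gap` pushes the pairs apart without changing the distance between the two bars inside a pair.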

# 5. Statistical Comparisons Explained

The `count_plot` function automatically performs statistical comparisons and displays significance markers:

## Statistical Methods by Layout:

### Standard Layout (group_size = 1):
- **Comparison**: All conditions compared to the **first condition** (global control)
- **Example**: With conditions `['control', 'cond01', 'cond02', 'cond03']`
  - `cond01` vs `control`
  - `cond02` vs `control` 
  - `cond03` vs `control`

### Grouped Layout (group_size > 1):
- **Comparison**: Within each group, conditions compared to the **first condition of that group**
- **Example**: With `group_size=2` and conditions `['control', 'cond01', 'cond02', 'cond03']`
  - Group 1: `['control', 'cond01']` → `cond01` vs `control`
  - Group 2: `['cond02', 'cond03']` → `cond03` vs `cond02`
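
The pairing rules described above can be written out as a short function. This is a minimal sketch of the described behaviour, not the library's internal code; the function name `comparison_pairs` is hypothetical.

```python
def comparison_pairs(
    conditions: list[str], group_size: int = 1
) -> list[tuple[str, str]]:
    """Return (reference, condition) pairs following the rules described above."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if not conditions:
        return []
    if group_size == 1:
        # Standard layout: everything is compared to the first condition.
        reference = conditions[0]
        return [(reference, condition) for condition in conditions[1:]]
    pairs = []
    # Grouped layout: compare within each group to that group's first condition.
    for start in range(0, len(conditions), group_size):
        group = conditions[start:start + group_size]
        pairs.extend((group[0], condition) for condition in group[1:])
    return pairs


conditions = ["control", "cond01", "cond02", "cond03"]
print(comparison_pairs(conditions, group_size=1))
# [('control', 'cond01'), ('control', 'cond02'), ('control', 'cond03')]
print(comparison_pairs(conditions, group_size=2))
# [('control', 'cond01'), ('cond02', 'cond03')]
```

Note that in the grouped case the order of `conditions` determines which condition acts as the reference for each group.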

## Significance Markers:
- **ns**: Not significant (p > 0.05)
- **\***: Significant (0.01 < p ≤ 0.05)
- **\*\***: Highly significant (0.001 < p ≤ 0.01)
- **\*\*\***: Very highly significant (p ≤ 0.001)
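
The thresholds listed above translate into a simple lookup, checked from the strictest cutoff to the loosest. The function name `significance_marker` is hypothetical and only illustrates how to read the markers; it is not taken from the library.

```python
def significance_marker(p_value: float) -> str:
    """Map a p-value to the marker used in the list above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value must lie in [0, 1], got {p_value!r}")
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return "ns"


for p in (0.2, 0.05, 0.004, 0.0005):
    print(p, significance_marker(p))
# 0.2 ns
# 0.05 *
# 0.004 **
# 0.0005 ***
```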

## Requirements:
- Statistical analysis requires **≥3 biological replicates** (plate_id values)
- T-tests are performed on the median values for each replicate.
- Repeat points (gray dots) show individual replicate median values.
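
Since the statistics depend on the number of replicates, it is worth checking how many distinct `plate_id` values each condition has before interpreting the markers. The helper below is a hypothetical sketch in plain Python; the example rows are illustrative, not real data.

```python
from collections import defaultdict
from collections.abc import Iterable

MIN_REPLICATES = 3  # threshold stated in the requirements above


def conditions_lacking_replicates(
    pairs: Iterable[tuple[str, object]],
    minimum: int = MIN_REPLICATES,
) -> dict[str, int]:
    """Return {condition: replicate count} for conditions below the minimum."""
    plates = defaultdict(set)
    for condition, plate_id in pairs:
        plates[condition].add(plate_id)
    return {
        condition: len(ids)
        for condition, ids in plates.items()
        if len(ids) < minimum
    }


# Illustrative (condition, plate_id) rows; cond01 only appears on two plates.
rows = [
    ("control", 1), ("control", 2), ("control", 3),
    ("cond01", 1), ("cond01", 2), ("cond01", 2),
]
print(conditions_lacking_replicates(rows))  # {'cond01': 2}
```

With a DataFrame, the rows can be supplied as `zip(df["condition"], df["plate_id"])`; an empty result means every condition has enough replicates.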

In [None]:
# 6. Parameter Reference Guide

## Complete parameter list for count_plot function:

"""
count_plot(
    df: pd.DataFrame,                    # Input data
    norm_control: str,                   # Control condition for normalization
    conditions: list[str],               # List of conditions to plot
    
    # Data selection
    condition_col: str = "condition",    # Column name for conditions
    selector_col: Optional[str] = "cell_line",  # Column for filtering
    selector_val: Optional[str] = None,  # Value to filter by
    
    # Plot type and styling
    plot_type: PlotType = PlotType.NORMALISED,  # NORMALISED or ABSOLUTE
    title: Optional[str] = None,         # Plot title
    colors = COLOR,                      # Color scheme
    
    # Figure properties
    fig_size: tuple[float, float] = (7, 7),  # Figure size
    size_units: str = "cm",              # "cm" or "inches"
    axes: Optional[Axes] = None,         # Existing axes to plot on
    
    # Layout and grouping
    group_size: int = 1,                 # Number of conditions per group
    within_group_spacing: float = 0.2,   # Spacing within groups
    between_group_gap: float = 0.5,      # Gap between groups
    x_label: bool = True,                # Show x-axis labels
    
    # Output options
    save: bool = False,                  # Save figure to file
    path: Optional[Path] = None,         # Output directory
    file_format: str = "pdf",            # "pdf", "png", "svg", etc.
    dpi: int = 300,                      # Resolution for raster formats
    tight_layout: bool = False,          # Use tight layout
)
"""

print("See the string literal in this code cell for the parameter reference.")