In [None]:
import pandas as pd
from pathlib import Path

# Import the plotting functions
from omero_screen_plots import feature_plot, feature_plot_simple
path = Path("../images")
path.mkdir(parents=True, exist_ok=True)

df = pd.read_csv("../data/sample_plate_data.csv")

conditions = [
    "palb:0.0 c604:0",
    "palb:0.0 c604:1",
    "palb:0.75 c604:0",
    "palb:0.75 c604:1",
]

## Standard feature plot function

The `feature_plot` function plots a quantitative feature across experimental conditions. It layers a distribution plot, individual data points, and plate-wise medians, and can annotate the result with statistical comparisons.

### Key Features:

1. **Multi-layer Data Visualization**:
   - **Boxen plots**: Show the distribution in detail using letter-value plots (more quantiles than standard boxplots)
   - **Swarm plots**: Overlay individual data points colored by replicate/plate ID
   - **Median markers**: Display plate-wise medians with a different marker shape for each replicate

2. **Statistical Analysis** (when `group_size=1`):
   - Automatically performs pairwise t-tests comparing each condition to the first (control) condition
   - Displays significance markers: *** (p<0.001), ** (p<0.01), * (p<0.05), ns (p≥0.05)
   - Tests are performed on plate-wise medians rather than individual cells, so each replicate plate counts as one observation and the test is more stringent

3. **Flexible Grouping** (NEW):
   - `group_size=1` (default): Standard layout with evenly spaced conditions
   - `group_size>1`: Groups conditions visually on x-axis for easier comparison
   - Custom spacing within and between groups via `within_group_spacing` and `between_group_gap`

4. **Data Filtering and Selection**:
   - Filter by specific cell lines or other categorical variables
   - Automatically samples data points for visualization to prevent overplotting
   - Scales data if needed (e.g., for normalized features)

5. **Figure Integration**:
   - With `ax=None`, a standalone figure is created
   - When an `ax` object is provided, the plot is drawn on it and can be integrated with other subplots
   - `x_label` (`True` or `False`) turns the x-label on or off as required


### Example Usage:

```python
# Standard plot with statistical comparisons
feature_plot(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=["control", "treatment1", "treatment2", "treatment3"],
    selector_val="MCF10A",  # Filter for specific cell line
    ymax=(3000, 12000),     # Set y-axis limits
    group_size=1            # No grouping, enables statistics
)

# Grouped plot for visual comparison
feature_plot(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=["ctrl_A", "treat_A", "ctrl_B", "treat_B"],
    group_size=2,           # Group conditions in pairs
    within_group_spacing=0.1,
    between_group_gap=0.5
)
```
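The plate-wise statistics listed under Key Features can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the library's implementation: the `plate_id` column name, the toy values, and the `significance_marker` helper are hypothetical, and the t-test itself is left out because its exact variant is not specified here.

```python
import pandas as pd

# Toy single-cell table; "plate_id" is an assumed name for the replicate column.
cells = pd.DataFrame(
    {
        "condition": ["ctrl"] * 6 + ["drug"] * 6,
        "plate_id": [1, 1, 2, 2, 3, 3] * 2,
        "intensity": [10, 12, 11, 13, 9, 11, 20, 24, 21, 23, 19, 25],
    }
)

# Collapse the cells to one median per (condition, plate). A test on these
# values sees three observations per condition instead of six cells.
plate_medians = (
    cells.groupby(["condition", "plate_id"])["intensity"].median().reset_index()
)


def significance_marker(p_value: float) -> str:
    """Map a p-value to the star notation used for the significance markers."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"
```

Testing on `plate_medians` rather than on `cells` keeps the sample size equal to the number of replicate plates, which is why the resulting p-values are more conservative.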

### When to Use:
- Comparing quantitative features (e.g., protein expression, cell size) across conditions
- When you need both detailed distribution info and statistical significance
- For publication-quality figures with multiple data layers
- When you want to group related conditions visually

### Example 1: No grouping, original plot features and statistical analysis

In [None]:
feature_plot(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=conditions,
    x_label=True,
    ymax=(3000, 12000),
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="standard featureplot",
    fig_size=(5, 5),
    size_units="cm",
    scale=False,
    group_size=1,
    save=True,
    path=path,
    tight_layout=False,
    file_format="png",
    dpi=100,
)

### Example 2: Grouped Feature Plot
This example demonstrates the grouping functionality. With `group_size=2`, the four conditions are visually grouped into two pairs, making it easier to compare related conditions. Note that when grouping is used (`group_size > 1`), statistical significance markers are not displayed.
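
One way the spacing parameters could translate into x-axis positions is sketched below. The helper `grouped_positions` is hypothetical and treats both parameters as center-to-center offsets; the library's actual layout logic may differ.

```python
def grouped_positions(
    n_conditions: int,
    group_size: int,
    within_group_spacing: float,
    between_group_gap: float,
) -> list[float]:
    """Return one x position per condition (hypothetical layout rule)."""
    positions = []
    x = 0.0
    for i in range(n_conditions):
        if i > 0:
            # The first member of each new group is offset by the larger gap;
            # all other members are offset by the within-group spacing.
            starts_new_group = i % group_size == 0
            x += between_group_gap if starts_new_group else within_group_spacing
        positions.append(x)
    return positions


# Four conditions in pairs, using the same values as the grouped examples.
positions = grouped_positions(4, 2, within_group_spacing=0.1, between_group_gap=0.5)
```

Under this rule, the two members of a pair sit 0.1 apart and the pairs are separated by 0.5.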

In [None]:
feature_plot(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=conditions,
    ax=None,
    x_label=True,
    ymax=(1000, 12000),
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="standard featureplot grouped",
    fig_size=(5, 5),
    size_units="cm",
    scale=False,
    group_size=2,
    within_group_spacing=0.1,
    between_group_gap=0.5,
    save=True,
    path=path,
    tight_layout=False,
    file_format="png",
    dpi=100,
)

### Example 3: Simple Feature Plot (Box Plot Version)
The `feature_plot_simple` function provides a cleaner, simpler visualization:
- Standard box plots (instead of boxen plots)
- Plate-wise median points overlaid
- Statistical significance markers
- Less visual complexity, suitable for presentations or when you don't need to show individual data points

In [None]:
feature_plot_simple(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=conditions,
    ymax=(3000, 10000),
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    title="Simple Feature Boxplot",
    scale=False,
    fig_size=(5, 5),
    size_units="cm",
    violin=False,
    save=True,
    path=Path("../images"),
)

In [None]:
feature_plot_simple(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=conditions,
    ymax=(2000,10000),
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    violin=True,
)

### Example 4: Simple Feature Plot (Violin Plot Version)
With `violin=True`, the simple feature plot shows:
- Violin plots displaying the full distribution shape
- Wider sections indicate higher data density
- Plate-wise median points overlaid
- Statistical significance markers
- Useful for visualizing bimodal or skewed distributions
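
The point about bimodal distributions can be illustrated with a small standard-library sketch. The values are made up for the example: two tight clusters with nothing in between.

```python
import statistics

# Toy bimodal sample (hypothetical values): a low and a high cluster.
low_cluster = [9, 10, 10, 11] * 5
high_cluster = [99, 100, 100, 101] * 5
values = low_cluster + high_cluster

# Summary statistics that a box plot would draw.
median = statistics.median(values)
q1, _, q3 = statistics.quantiles(values, n=4)

# Observations within 20 units of the median: a box plot draws its center
# line here even if no data point falls in this region.
near_median = [v for v in values if abs(v - median) < 20]
```

Here the median falls in the empty region between the clusters and `near_median` is empty. A box spanning `q1` to `q3` would cover that gap as if it held data, while a violin plot would show two separate bulges.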

In [None]:
feature_plot_simple(
    df=df,
    feature="intensity_mean_p21_nucleus",
    conditions=conditions,
    ymax=(2000,10000),
    condition_col="condition",
    selector_col="cell_line",
    selector_val="MCF10A",
    violin=True,
    group_size=2,
    within_group_spacing=0.1,
    between_group_gap=0.5,
)

## Summary

Choose the right plot type based on your needs:
- **`feature_plot`**: For comprehensive analysis with multiple data layers
- **`feature_plot` with grouping**: For visual comparison of related conditions
- **`feature_plot_simple` (box)**: For clean, simple comparisons
- **`feature_plot_simple` (violin)**: For visualizing distribution shapes

All plots support the same core parameters for filtering, scaling, and customization.