::::
:::{thebe-button}
:::
::::

# Find centers

Find bubble centers using two different approaches.


In [None]:
from boilercv_docs.nbs import init

paths = init()

from geopandas import GeoDataFrame, points_from_xy
from matplotlib.pyplot import subplots
from myst_nb import glue
from pandas import DataFrame, IndexSlice, NamedAgg
from seaborn import scatterplot
from shapely import LinearRing

from boilercv.images import scale_bool
from boilercv_docs.nbs import HIDE, nowarn, style_df
from boilercv_pipeline.experiments.e230920_subcool import GBC, bounded_ax
from boilercv_pipeline.experiments.e240215_plotting import cool, warm
from boilercv_pipeline.sets import get_contours_df, get_dataset

COMPARE_WITH_TRACKPY = True
"""Whether to get centers using the Trackpy approach."""
TIME = "2023-09-20T17:14:18"
"""Trial."""
STEP = 10
"""Frame step size."""
STOP = 3
"""Last frame to analyze."""
FRAMES: list[int | None] | None = [None, *[STEP * frame for frame in (STOP, 1)]]
"""Frames.

A list that will become a slice. Not a tuple because `ploomber_engine` can't inject
tuples. Here we automatically scale the frame to stop at by the step size.
"""

GUESS_DIAMETER = 21
"""Guess diameter for the Trackpy approach. (px)"""

HIDE
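
The way `FRAMES` expands into a slice can be sketched in isolation. This is a minimal illustration that reuses the `STEP` and `STOP` values above and assumes a plain `range` as a stand-in for the frame axis of a video; the notebook itself slices a dataset rather than a `range`.

```python
STEP = 10
STOP = 3

# Same construction as above: [start, stop, step], with the stop scaled by the step.
FRAMES = [None, *[STEP * frame for frame in (STOP, 1)]]

# Unpack the list into a slice, as is done when loading the dataset.
frames = slice(*FRAMES)

# Hypothetical stand-in for the frame axis of a 100-frame video.
selected = list(range(100)[frames])
print(FRAMES, selected)
```

Positional slicing excludes the stop value, so three frames are selected here. Label-based selection, such as pandas `.loc`, includes the stop label, so the exact frames selected depend on how the slice is applied.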

## Data

Load a video of filled contours along with the contour loci, then plot a composite of all frames to analyze.


In [None]:
if COMPARE_WITH_TRACKPY:
    with nowarn(capture=True):
        from trackpy import batch, quiet

    quiet()

PATH_TIME = TIME.replace(":", "-")
"""Timestamp suitable for paths.

Also used in notebook parametrization.
"""
frames = slice(*FRAMES) if isinstance(FRAMES, list) else slice(None)  # type: ignore  # pyright: 1.1.336
filled_contours = scale_bool(
    get_dataset(PATH_TIME, stage="filled", frame=frames)["video"]
)
contours_df = get_contours_df(PATH_TIME)
composite_video = filled_contours.max("frame").values
figure, ax = subplots()
with bounded_ax(composite_video, ax) as ax:
    ax.imshow(~composite_video, alpha=0.4)

glue("glue-find-centers-composite", figure, display=False)

:::{glue:figure} glue-find-centers-composite
:name: fig-find-centers-composite
:alt: Plot with pixel axes showing a composite gray image of the filled contours in each frame.
:width: 40%

Plot with pixel axes showing a composite gray image of the filled contours in each frame.
:::
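
Taking the maximum over the frame dimension of a binary video amounts to a pixel-wise logical OR across frames. The following is a minimal sketch of that idea, with nested lists standing in for the video; the tiny frames are made up for illustration.

```python
# Three hypothetical 2x3 binary frames; 1 marks a pixel inside a filled contour.
video = [
    [[1, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 1]],
]

# Pixel-wise maximum across frames: a pixel is set if it is set in any frame.
composite = [
    [max(pixels) for pixels in zip(*rows)]
    for rows in zip(*video)
]
print(composite)
```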

## Find centers from filled contours using Trackpy

Use Trackpy to find bubble centers given the filled contours.


In [None]:
if COMPARE_WITH_TRACKPY:
    trackpy_centers = (
        batch(
            frames=filled_contours.values, diameter=GUESS_DIAMETER, characterize=False
        )
        .drop(columns="mass")
        .assign(
            frame=lambda df: df.frame.replace(
                dict(enumerate(filled_contours.frame.values))
            )
        )
        .sort_values(["frame", "y", "x"], ignore_index=True)
    )
else:
    trackpy_centers = DataFrame()

trackpy_centers
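
The `replace` step above restores the original frame numbers. Presumably this is needed because the frames are passed as a bare array, so the `frame` column initially holds positions (0, 1, 2, ...) rather than frame numbers. A minimal sketch of the mapping, using made-up values:

```python
# Hypothetical frame numbers kept after slicing with a step of 10.
frame_numbers = [0, 10, 20]

# Positions as they would appear when frames are passed as a bare array.
positions = [0, 0, 1, 2, 2]

# Map each position back to its original frame number.
position_to_frame = dict(enumerate(frame_numbers))
restored = [position_to_frame[position] for position in positions]
print(restored)
```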

## Find centers from contour centroids

The prior approach discards contour data, operating on filled contours instead. Alternatively, use Shapely to find centers directly from the contour data.


### Prepare to find objects

Prepare a dataframe with columns in a certain order, assign contour data to it, and demote the hierarchical indices to plain columns. Count the number of points in each contour in each frame, keeping only the contours that have enough points to describe a linear ring. Construct a GeoPandas geometry column and operate on it with Shapely to construct linear rings, returning only their centroids. Also report the number of points in the loci of each contour per frame.

:::{admonition} `groupby` considerations  
:class: dropdown note  
`groupby` operations behave differently depending on the index, so resetting the index before grouping, and unpacking `GBC` to set sensible defaults for `groupby`'s keyword arguments, makes it behave less surprisingly. `GBC` enables `observed` and `sort`, and disables `as_index`, `dropna`, and `group_keys`.  
:::
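
To build intuition for the centroid step, the sketch below computes the centroid of a closed ring in pure Python. It assumes that the centroid of a linear geometry is the length-weighted mean of its segment midpoints. `ring_centroid` is a hypothetical helper for illustration, not Shapely's implementation.

```python
from math import hypot


def ring_centroid(points):
    """Centroid of a closed ring as the length-weighted mean of segment midpoints."""
    total = cx = cy = 0.0
    # Pair each point with the next, wrapping around to close the ring.
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        length = hypot(x1 - x0, y1 - y0)
        total += length
        cx += length * (x0 + x1) / 2
        cy += length * (y0 + y1) / 2
    return cx / total, cy / total


# A 4x4 square with an extra point on its bottom edge.
square = [(0, 0), (2, 0), (4, 0), (4, 4), (0, 4)]
centroid = ring_centroid(square)
vertex_mean = (
    sum(x for x, _ in square) / len(square),
    sum(y for _, y in square) / len(square),
)
print(centroid, vertex_mean)
```

The extra point pulls the plain vertex mean toward the bottom edge, while the length-weighted centroid stays at the center of the square. Under this definition, uneven point spacing along a contour should not bias the centroid.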


In [None]:
contours = (
    DataFrame(columns=["xpx", "ypx"])
    .assign(**contours_df.loc[IndexSlice[frames, :], :])
    .rename(axis="columns", mapper=dict(xpx="x", ypx="y"))
    .reset_index()
    .assign(
        count=lambda df: df.groupby(["frame", "contour"], **GBC).x.transform("count")
    )
    .query("count > 3")
    .assign(geometry=lambda df: points_from_xy(df.x, df.y))
    .groupby(["frame", "contour"], **GBC)
    .agg(
        count=NamedAgg(column="count", aggfunc="first"),
        centroid=NamedAgg(
            column="geometry", aggfunc=lambda points: LinearRing(points).centroid
        ),
    )
)
contours

Split the centroid point objects into separate named columns that conform to the Trackpy convention. Report the centroids in each frame.


In [None]:
centers = (
    GeoDataFrame(contours)
    .assign(x=lambda df: df.centroid.x, y=lambda df: df.centroid.y)
    .loc[:, ["y", "x", "frame"]]
    .sort_values(["frame", "y", "x"], ignore_index=True)
)
centers

## Compare approaches

Compare Trackpy centers with contour centroids. Here the guess diameter for Trackpy object finding and contour perimeter filtering are matched to produce the same number of objects from each algorithm. Trackpy features more intelligent filtering, but takes much longer. Trackpy's approach of finding local maxima in grayscale images is applied even to binarized images, so it exhaustively searches the binary image for high points, which adds to execution time.

The absolute differences in `x` and `y` between the approaches are relatively small for this subset, suggesting the contour centroid approach is reasonable.
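
The comparison relies on both tables listing the same objects in the same order, which is why both are sorted by frame, `y`, and `x`. The following is a minimal sketch of the row-wise absolute difference, using made-up coordinates rather than results from this dataset.

```python
# Hypothetical (frame, y, x) centers from each approach; not real results.
centroid_centers = [(0, 12.4, 30.1), (0, 40.0, 18.7), (10, 13.0, 29.5)]
trackpy_like_centers = [(0, 12.0, 30.5), (10, 13.5, 29.0), (0, 40.5, 18.0)]

# Sort both the same way so rows pair up object by object.
pairs = zip(sorted(centroid_centers), sorted(trackpy_like_centers))

# Absolute per-axis differences in pixels for each matched pair.
diffs = [
    (frame, abs(y - other_y), abs(x - other_x))
    for (frame, y, x), (_, other_y, other_x) in pairs
]
for frame, dy, dx in diffs:
    print(frame, round(dy, 2), round(dx, 2))
```

Pairing by sort order assumes each approach finds the same objects. A mismatch in counts or ordering would pair unrelated objects and inflate the differences.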


In [None]:
if COMPARE_WITH_TRACKPY:
    diffs = (centers - trackpy_centers).abs()

    with style_df(
        DataFrame().assign(dx=diffs.x, dy=diffs.y, frame=trackpy_centers.frame)
    ) as styler:
        styler.background_gradient().hide(axis="index")

A warm color palette is used to plot Trackpy centers, and a cool color palette is used to plot contour centroids.


In [None]:
if COMPARE_WITH_TRACKPY:
    scatterplot(
        ax=ax,
        data=trackpy_centers,
        x="x",
        y="y",
        hue="frame",
        alpha=0.6,
        palette=warm,
        legend=False,
    )
scatterplot(
    ax=ax,
    data=centers,
    x="x",
    y="y",
    hue="frame",
    alpha=0.6,
    palette=cool,
    legend=False,
)

fig = ax.get_figure()
fig