::::
:::{thebe-button}
:::
::::

# Find objects

Find bubble size using two different approaches.


In [None]:
from boilercv_docs.nbs import init

paths = init()

from devtools import pprint
from geopandas import GeoDataFrame, points_from_xy
from numpy import pi, sqrt
from pandas import DataFrame, IndexSlice, NamedAgg
from seaborn import scatterplot
from shapely import LinearRing, Polygon

from boilercv.images import scale_bool
from boilercv_docs.nbs import HIDE, nowarn, set_display_options
from boilercv_docs.settings import Notebooks, default
from boilercv_pipeline.experiments.e230920_subcool import GBC, bounded_ax
from boilercv_pipeline.sets import get_contours_df, get_dataset

default.__init__()  # noqa: PLC2801
p = default.notebooks

COMPARE_WITH_TRACKPY = True
"""Whether to get objects using the Trackpy approach."""
STEP = 10
"""Frame step size."""
STOP = 3
"""Last frame to analyze."""
GUESS_DIAMETER = 21
"""Guess diameter for the Trackpy approach. (px)"""
TRACKPY_COLS = ["y", "x", "frame", "size"]
"""Columns to compare with the Trackpy approach."""
COLS = [*TRACKPY_COLS, "area", "diameter_px", "radius_of_gyration_px"]
"""Data to store."""

HIDE

In [None]:
p = Notebooks.model_validate(p)
set_display_options(p.font_scale)
pprint(p)

## Data

Load a video of filled contours along with the contour loci, then plot a composite of all frames to analyze.


In [None]:
if COMPARE_WITH_TRACKPY:
    with nowarn(capture=True):
        from trackpy import batch, quiet

    quiet()

PATH_TIME = p.time.replace(":", "-")
"""Timestamp suitable for paths.

Also used in notebook parametrization.
"""

filled_contours = scale_bool(
    get_dataset(PATH_TIME, frame=p.frames, stage="filled")["video"]
)
contours_df = get_contours_df(PATH_TIME)
composite_video = filled_contours.max("frame").values
with bounded_ax(composite_video) as ax:
    ax.imshow(~composite_video, alpha=0.4)

HIDE

## Find size from filled contours using Trackpy

Use Trackpy to find bubble size given the filled contours.


In [None]:
if COMPARE_WITH_TRACKPY:
    tp_objects = (
        batch(
            frames=filled_contours.values, diameter=GUESS_DIAMETER, characterize=True
        ).assign(
            frame=lambda df: df.frame.replace(
                dict(enumerate(filled_contours.frame.values))
            )
        )
    ).loc[:, TRACKPY_COLS]
else:
    tp_objects = DataFrame()

tp_objects
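The `frame` reassignment in the cell above can be sketched in plain Python. This is a minimal illustration, assuming that `batch` labels each frame by its position in the array (0, 1, 2, ...) rather than by the original frame number. The frame numbers and detections below are hypothetical.

```python
# Hypothetical frame numbers that survive subsampling (not taken from the dataset).
frame_numbers = [0, 10, 20]

# Map positional index -> actual frame number, as dict(enumerate(...)) does above.
positional_to_actual = dict(enumerate(frame_numbers))

# Hypothetical positional frame labels, one per detected object.
positional = [0, 0, 1, 2, 2]

# Relabel each detection with its actual frame number.
actual = [positional_to_actual[i] for i in positional]
print(actual)
```

Relabeling this way keeps the Trackpy results aligned with the `frame` values used by the contour data, so the two sets of objects can be compared frame by frame.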

## Find size from contours

The prior approach discards the contour data and operates on filled contours instead. As an alternative, use Shapely to find size directly from the contour data.

### Prepare to find objects

Prepare a dataframe with columns in a fixed order, assign contour data to it, and demote the hierarchical indices to plain columns. Count the number of points in each contour of each frame, keeping only contours with enough points to describe a linear ring. Construct a GeoPandas geometry column and operate on it with Shapely to construct linear rings, returning centroids and the representative polygonal area. Also report the number of points in the loci of each contour per frame.


In [None]:
contours = (
    DataFrame(columns=["xpx", "ypx"])
    .assign(**contours_df.loc[IndexSlice[p.frames, :], :])
    .rename(axis="columns", mapper=dict(xpx="x", ypx="y"))
    .reset_index()
    .assign(
        count=lambda df: df.groupby(["frame", "contour"], **GBC).x.transform("count")
    )
    .query("count > 3")
    .assign(geometry=lambda df: points_from_xy(df.x, df.y))
    .groupby(["frame", "contour"], **GBC)
    .agg(
        count=NamedAgg(column="count", aggfunc="first"),
        centroid=NamedAgg(
            column="geometry", aggfunc=lambda df: LinearRing(df).centroid
        ),
        area=NamedAgg(column="geometry", aggfunc=lambda df: Polygon(df).area),
    )
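    .assign(
        # Diameter of the circle with the same area as the contour polygon
        diameter_px=lambda df: sqrt(4 * df["area"] / pi),
        # A quarter of the equivalent diameter, i.e. half the equivalent radius
        radius_of_gyration_px=lambda df: df["diameter_px"] / 4,
        # Reuse the Trackpy column name so the two approaches can be compared
        size=lambda df: df["radius_of_gyration_px"],
    )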
)
contours
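The size calculation above can be checked by hand on a simple shape. The sketch below is not the notebook's method: it swaps Shapely's `Polygon(...).area` for the shoelace formula in plain Python, then applies the same equivalent-diameter and `diameter / 4` steps. The square contour is hypothetical. The `diameter / 4` factor is consistent with the radius of gyration of a uniform disk about a diameter (half its radius), though whether that is the intended reasoning is an assumption.

```python
from math import pi, sqrt


def shoelace_area(points):
    """Return the unsigned area of a closed polygon given as (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # Wrap around to close the ring
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def equivalent_sizes(area):
    """Return (diameter, diameter / 4) for the circle with the given area."""
    diameter = sqrt(4 * area / pi)
    return diameter, diameter / 4


# Hypothetical contour: a 4 px by 4 px square.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
area = shoelace_area(square)
diameter_px, radius_of_gyration_px = equivalent_sizes(area)
print(area, diameter_px, radius_of_gyration_px)
```

The equivalent diameter of the square is slightly larger than its side length, since a circle of the same area must extend past the square's edges.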

Split the centroid point objects into separate named columns that conform to the Trackpy convention. Report the centroids in each frame.


In [None]:
objects = (
    GeoDataFrame(contours)
    .assign(x=lambda df: df.centroid.x, y=lambda df: df.centroid.y)
    .loc[:, COLS]
    .sort_values(["frame", "y", "x"], ignore_index=True)
)
objects

## Compare approaches

Compare Trackpy objects with contour objects. Here the guess diameter for Trackpy object finding and the contour perimeter filter (the minimum point count per contour) are matched so that each algorithm produces the same number of objects. Trackpy features more intelligent filtering, but takes much longer. Its approach to finding local maxima in grayscale images is applied even to binarized images, so it exhaustively searches the binary image for high points, which adds to execution time.

The percent difference between the approaches is relatively low for this subset, suggesting the contour centroid approach is reasonable.
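The notebook does not show how the percent difference is computed. One common definition, given here only as an assumption, is the absolute difference divided by the mean of the two values. The function name and the sample sizes below are hypothetical.

```python
def percent_difference(a, b):
    """Return the symmetric percent difference between two positive values."""
    return abs(a - b) / ((a + b) / 2) * 100


# Hypothetical sizes (px) for one object as reported by each approach.
trackpy_size = 5.0
contour_size = 5.5
print(percent_difference(trackpy_size, contour_size))
```

This form is symmetric, so neither approach is treated as the reference value.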

A warm color palette is used to plot Trackpy objects, and a cool color palette is used to plot contour centroids.


In [None]:
if COMPARE_WITH_TRACKPY:
    scatterplot(
        ax=ax, data=tp_objects, x="x", y="y", alpha=0.6, color="red", legend=False
    )
scatterplot(ax=ax, data=objects, x="x", y="y", alpha=0.6, color="blue", legend=False)

fig = ax.get_figure()
fig