Inaccurate centroid estimation in edge cases with get_centroid_fourier #251

@PierreRaybaut

Context

The get_centroid_fourier function in sigima.tools.image has been the default method for computing image centroids because of its robustness to noise. However, several recent tests have shown that it can produce severely inaccurate results in some common edge cases, particularly when the object of interest lies partially outside the image boundaries or the image is vertically truncated.

For instance, when testing with the image attached below:

Data[dtype=float32, shape=(512, 2048)]
  Barycentre[Fourier]   → (x=1003.23, y=28.37) ❌
  Barycentre[Projected Profile Barycenter] → (x=1022.91, y=255.93) ✅
  Ground truth (visual + prior knowledge) → (x≈1023, y≈255)

In contrast, in a very noisy image with a small, off-center spot, Fourier is the only method returning a correct result:

Data[dtype=uint16, shape=(2000, 2000)]
  Barycentre[Fourier]   → (x=1199.80, y=699.42) ✅
  Others (OpenCV, Moments, etc.) → (x≈1076, y≈883) ❌
  Ground truth → (x=1200, y=700)
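One plausible mechanism for the failure of the moment-based methods here (an interpretation, not something confirmed in this issue) is that a non-zero background pulls any intensity-weighted centroid toward the image center. The reported estimate (x≈1076, y≈883) lies roughly 60% of the way from the ground truth (1200, 700) to the image center (999.5, 999.5) on both axes, which is consistent with that reading. The sketch below reproduces the effect on a synthetic image; the image size, spot and background level are illustrative assumptions, and coordinates are returned as (row, col), i.e. (y, x).

```python
import numpy as np


def moment_centroid(image):
    """Intensity-weighted centroid (row, col) from first-order image moments."""
    rows, cols = np.indices(image.shape)
    total = image.sum()
    return (rows * image).sum() / total, (cols * image).sum() / total


# Synthetic data (assumed values): small off-center spot in a 200x200 image.
shape = (200, 200)
true_center = (70, 120)
rr, cc = np.indices(shape)
spot = 100.0 * np.exp(
    -((rr - true_center[0]) ** 2 + (cc - true_center[1]) ** 2) / (2 * 3.0**2)
)

# Without background, the moments recover the spot center.
clean = moment_centroid(spot)

# A small uniform background carries more total mass than the spot itself,
# so the centroid is dragged toward the image center (99.5, 99.5).
biased = moment_centroid(spot + 0.2)

print("no background  :", clean)
print("with background:", biased)
```

The biased estimate is the mass-weighted average of the spot center and the image center, so the pull grows with the background level and the image area.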

Root cause

The Fourier-based method assumes:

  • symmetry,
  • centered content,
  • and periodicity at the borders.

In the presence of asymmetry or partial occlusion (for example, an object cut by the image border), its estimate becomes unstable, especially along the truncated axis.
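To illustrate these assumptions, here is a minimal sketch of a first-harmonic (phase-based) centroid estimator. It is not sigima's get_centroid_fourier, whose implementation is not shown in this issue; the function names and the synthetic images are assumptions made for the example. The estimator treats each projected profile as periodic and reads the position from the phase of its first Fourier coefficient.

```python
import numpy as np


def phase_centroid_1d(profile):
    """Position of a 1D profile from the phase of its first Fourier harmonic.

    The profile is implicitly treated as periodic over its own length.
    """
    n = profile.size
    angles = 2.0 * np.pi * np.arange(n) / n
    phase = np.arctan2(
        (profile * np.sin(angles)).sum(), (profile * np.cos(angles)).sum()
    )
    return (phase % (2.0 * np.pi)) * n / (2.0 * np.pi)


def phase_centroid(image):
    """(row, col) estimate from the row and column projections of the image."""
    return phase_centroid_1d(image.sum(axis=1)), phase_centroid_1d(image.sum(axis=0))


def gaussian_spot(shape, center, sigma):
    """Synthetic Gaussian spot; the center may sit close to the border."""
    rr, cc = np.indices(shape)
    return np.exp(-((rr - center[0]) ** 2 + (cc - center[1]) ** 2) / (2 * sigma**2))


# Symmetric spot fully inside the image: the phase gives the true center.
inside = phase_centroid(gaussian_spot((128, 128), (40, 70), 5.0))

# Wide spot cut by the top border (true center at row 5): the row estimate
# shifts away from the true center, while the untruncated column axis is fine.
cut = phase_centroid(gaussian_spot((128, 512), (5, 256), 20.0))

print("inside:", inside)
print("cut   :", cut)
```

In this formulation a constant background cancels exactly, because a sinusoid sums to zero over a full period. That may help explain the noise robustness, but the estimate is only correct when the visible profile is symmetric about the true center.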

Proposed solution

We introduce a new default method: get_centroid_auto, based on a cross-validation strategy:

  1. Estimate centroid using three methods:

    • Fourier (get_centroid_fourier)
    • Projected profile median (get_projected_profile_centroid(method="median"))
    • scikit-image centroid (skimage.measure.centroid)
  2. Compare how close the Fourier and skimage estimates are to the projected profile median (a robust, smoothing-based 1D reference).

  3. Select the estimate closest to the median:

    • If Fourier is closer → keep it,
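    • Otherwise → fall back to skimage.

The selection step, with (row_f, col_f), (row_s, col_s) and (row_m, col_m) denoting the Fourier, skimage and projected profile median estimates:

from math import hypot

# Euclidean distance of each candidate to the median reference
dist_f = hypot(row_f - row_m, col_f - col_m)
dist_s = hypot(row_s - row_m, col_s - col_m)

# Keep Fourier only when it is strictly closer; ties go to skimage
return (row_f, col_f) if dist_f < dist_s else (row_s, col_s)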
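The cross-validation strategy can be sketched end to end as follows. This is a sketch under stated assumptions, not the proposed implementation: projected_profile_median is a hypothetical stand-in for get_projected_profile_centroid(method="median") (it omits any smoothing and simply takes the half-mass index of each baseline-subtracted projection), select_centroid is a hypothetical helper name, and the candidate estimates in the usage lines are made-up values chosen to exercise both branches.

```python
from math import hypot

import numpy as np


def profile_median(profile):
    """Index at which the cumulative sum of a 1D profile reaches half its total."""
    profile = np.asarray(profile, dtype=float)
    profile = profile - profile.min()  # crude baseline removal (assumption)
    csum = np.cumsum(profile)
    return float(np.searchsorted(csum, 0.5 * csum[-1]))


def projected_profile_median(image):
    """Hypothetical stand-in for the projected profile median: (row, col)."""
    return profile_median(image.sum(axis=1)), profile_median(image.sum(axis=0))


def select_centroid(fourier, moments, reference):
    """Return whichever candidate is closer to the reference estimate.

    Fourier is kept only when strictly closer, matching the snippet above.
    """
    dist_f = hypot(fourier[0] - reference[0], fourier[1] - reference[1])
    dist_s = hypot(moments[0] - reference[0], moments[1] - reference[1])
    return fourier if dist_f < dist_s else moments


# Synthetic image: Gaussian spot centered at (row=30, col=50).
rr, cc = np.indices((64, 96))
image = np.exp(-((rr - 30) ** 2 + (cc - 50) ** 2) / (2 * 4.0**2))
reference = projected_profile_median(image)

# Illustrative candidate estimates (not measured values).
kept = select_centroid((29.8, 50.1), (35.0, 45.0), reference)  # Fourier agrees
fallback = select_centroid((3.0, 50.0), (30.5, 49.5), reference)  # Fourier is off

print(reference, kept, fallback)
```

Passing the three estimates in explicitly keeps the selection logic independent of how each estimator is implemented, which also makes the fallback behavior easy to unit-test.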

This provides a robust default that:

  • Keeps Fourier when it's really meaningful,
  • Falls back silently when it's off,
  • Handles noise, asymmetry and truncation better.

Impact

  • Better accuracy in real-world use cases,
  • Improved reliability in GUI (DataLab) and automated analyses,
  • Maintains performance and noise-resistance in challenging cases.

Attachment

Test image used in the edge case scenario (512×2048, truncated disk):
📎 centroid_test.npy

Labels: bug