Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Look at replacing feret and centroid calculations in GrainStats #798

Closed
ns-rse opened this issue Feb 7, 2024 · 2 comments · Fixed by #823
Closed

Look at replacing feret and centroid calculations in GrainStats #798

ns-rse opened this issue Feb 7, 2024 · 2 comments · Fixed by #823
Labels
enhancement New feature or request GrainStats Issues pertaining to the GrainStats class

Comments

@ns-rse
Copy link
Collaborator

ns-rse commented Feb 7, 2024

Whilst working on #748 I needed to extract the Feret calculations so that I could obtain the co-ordinates for the minimum and maximum feret distances. Work in-progress on ns-rse/748-grain-profiles.

Initial implementation in #755 includes a bunch of tests as I had some issues getting what I thought were the correct answers out.

As this will be its own, tested, sub-module we can replace the functionality of GrainStats.get_min_max_feret() which currently only has a single 2x2 square tested and a regression test in place on the final values of cropped minicircle.spm. This can also reduce some duplication of code as convex hull is calculated twice, once for all points and then again for the upper and lower hulls.

I noticed also that there are calculations to determine the centroid of grains/molecules and we could look at replacing these with the Scikit Image's skimage.measure.regionprops() which returns the centroid (amongst many other things).

I started making the change to use topostats.measure.feret.min_max_feret() (currently stashed) and found that in the regression test whilst the values calculated for the max_feret were identical to those returned by GrainStats.get_min_max_feret() the min_feret differed.

Using some of the shapes in the tests developed for topostats.measure.feret to test the Gratinstats.get_max_min_ferets() revealed there may be some issues with the max_feret. These use incredibly simple tiny_{circle,square,quadrilateral} shapes as well as simple elipses and the tests fail for both min_feret and max_feret (see original test on a simple square at tests/test_grainstats.py::test_get_min_max_ferets()).

Code for these tests isn't included in #755 but follows (NB the axis, min_feret_coord_target and max_feret_coord_target parameters aren't used in the tests as GrainStats.get_max_min_ferets() only has one way of sorting convex hull edge points, but it was easier to just copy and paste them from the tests for topostats.measure.feret.min_max_feret()).

import numpy as np
import numpy.typing as npt
import pytest
from skimage import draw


tiny_circle = np.zeros((3, 3), dtype=np.uint8)
rr, cc = draw.circle_perimeter(1, 1, 1)
tiny_circle[rr, cc] = 1

small_circle = np.zeros((5, 5), dtype=np.uint8)
rr, cc = draw.circle_perimeter(2, 2, 2)
small_circle[rr, cc] = 1

tiny_quadrilateral = np.asarray(
    [
        [0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

tiny_square = np.asarray([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]], dtype=np.uint8)

tiny_triangle = np.asarray([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]], dtype=np.uint8)

tiny_rectangle = np.asarray([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]], dtype=np.uint8)

tiny_ellipse = np.asarray(
    [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

holo_circle = np.zeros((7, 7), dtype=np.uint8)
rr, cc = draw.circle_perimeter(3, 3, 2)
holo_circle[rr, cc] = 1

holo_ellipse_vertical = np.zeros((11, 9), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(5, 4, 4, 3)
holo_ellipse_vertical[rr, cc] = 1

holo_ellipse_horizontal = np.zeros((9, 11), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(4, 5, 3, 4)
holo_ellipse_horizontal[rr, cc] = 1

holo_ellipse_angled = np.zeros((8, 10), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(4, 5, 1, 3, orientation=np.deg2rad(30))
holo_ellipse_angled[rr, cc] = 1

curved_line = np.zeros((10, 10), dtype=np.uint8)
rr, cc = draw.bezier_curve(1, 5, 5, -2, 8, 8, 2)
curved_line[rr, cc] = 1

@pytest.mark.parametrize(
    (
        "shape",
        "axis",
        "min_feret_distance_target",
        "min_feret_coord_target",
        "max_feret_distance_target",
        "max_feret_coord_target",
    ),
    [
        pytest.param(
            tiny_circle, 0, 1.4142135623730951, ([0, 1], [1, 0]), 2, ([1, 2], [1, 0]), id="tiny circle sorted on axis 0"
        ),
        pytest.param(
            tiny_square,
            0,
            1.0,
            ([1, 1], [2, 1]),
            1.4142135623730951,
            ([1, 2], [2, 1]),
            id="tiny square sorted on axis 0",
        ),
        pytest.param(
            tiny_quadrilateral,
            0,
            3.0,
            ([2, 4], [2, 1]),
            4.0,
            ([1, 2], [5, 2]),
            id="tiny quadrilateral sorted on axis 0",
        ),
        pytest.param(
            tiny_triangle,
            0,
            1.0,
            ([1, 1], [2, 1]),
            1.4142135623730951,
            ([1, 2], [2, 1]),
            id="tiny triangle sorted on axis 0",
        ),
        pytest.param(
            tiny_rectangle,
            0,
            1.0,
            ([1, 2], [1, 1]),
            2.23606797749979,
            ([1, 2], [3, 1]),
            id="tiny rectangle sorted on axis 0",
        ),
        pytest.param(tiny_ellipse, 0, 2.0, ([2, 3], [2, 1]), 4.0, ([1, 2], [5, 2]), id="tiny ellipse sorted on axis 0"),
        pytest.param(
            small_circle,
            1,
            4.0,
            ([0, 1], [4, 1]),
            4.47213595499958,
            ([1, 4], [3, 0]),
            id="small circle sorted on axis 0",
        ),
        pytest.param(
            holo_circle, 0, 4.0, ([1, 2], [5, 2]), 4.47213595499958, ([4, 5], [2, 1]), id="holo circle sorted on axis 0"
        ),
        pytest.param(
            holo_ellipse_horizontal,
            0,
            6.0,
            ([1, 3], [7, 3]),
            8.246211251235321,
            ([5, 9], [3, 1]),
            id="holo ellipse horizontal on axis 0",
        ),
        pytest.param(
            holo_ellipse_vertical,
            0,
            6.0,
            ([3, 7], [3, 1]),
            8.246211251235321,
            ([1, 5], [9, 3]),
            id="holo ellipse vertical on axis 0",
        ),
        pytest.param(
            holo_ellipse_angled,
            0,
            3.1622776601683795,
            ([1, 4], [2, 1]),
            7.615773105863909,
            ([5, 8], [2, 1]),
            id="holo ellipse angled on axis 0",
        ),
        pytest.param(
            curved_line,
            1,
            5.656854249492381,
            ([1, 5], [5, 1]),
            8.06225774829855,
            ([4, 1], [8, 8]),
            id="curved line sorted on axis 1",
        ),
    ],
)
def test_min_max_feret_grainstats(
    shape: npt.NDArray,
    axis: int,
    min_feret_distance_target: float,
    min_feret_coord_target: list,
    max_feret_distance_target: float,
    max_feret_coord_target: list,
) -> None:
    """Test calculation of min/max feret."""
    edge_points = np.argwhere(shape == 1)
    min_feret_distance, max_feret_distance = GrainStats.get_max_min_ferets(edge_points)
    # NB Comment out the next line to view failures/differences in max_feret
    assert min_feret_distance == min_feret_distance_target
    assert max_feret_distance == max_feret_distance_target
@ns-rse ns-rse added bug Something isn't working enhancement New feature or request GrainStats Issues pertaining to the GrainStats class labels Feb 7, 2024
@ns-rse
Copy link
Collaborator Author

ns-rse commented Feb 8, 2024

Done a bit more work investigating this by adding additional test images to tests/test_grainstats.py::test_get_min_max_ferets()).

If we use a tiny_circle and tiny_triangle we have some very basic shapes we can work with and easily think about what the results should be.

tiny_circle = np.zeros((3, 3), dtype=np.uint8)
rr, cc = draw.circle_perimeter(1, 1, 1)
tiny_circle[rr, cc] = 1
tiny_circle

array([[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]], dtype=uint8)

np.argwhere(tiny_circle == 1)

array([[0, 1],
       [1, 0],
       [1, 2],
       [2, 1]])

tiny_triangle = np.asarray([[1, 1], [1, 0]], dtype=np.uint8)
tiny_triangle

np.argwhere(tiny_triangle == 1)

array([[1, 1],
       [1, 0]], dtype=uint8)

array([[0, 0],
       [0, 1],
       [1, 0]])

Just as with the square we can use basic trigonometry to work out what the minimum and maximum ferets of these should be.

Triangle

Minimum Feret : $\sqrt(1^2 + 1^2)) / 2$ 0.7071067811865476
Maximum Feret : $\sqrt(1^2 + 1^2)$ 1.4142135623730951

Circle

Minimum Feret : $\sqrt(1^2 + 1^2)$ 1.4142135623730951
Maximum Feret : 2

Testing

@pytest.mark.parametrize(
    ("edge_points", "min_expected", "max_expected"),
    [
        pytest.param([[0, 0], [0, 1], [1, 0], [1, 1]], 1.0, 1.4142135623730951, id="square"),
        pytest.param([[0, 0], [0, 1], [1, 0]], 0.7071067811865476, 1.4142135623730951, id="triangle"),
        pytest.param([[0, 1], [1, 0], [1, 2], [2, 1]], 1.4142135623730951, 2.0, id="circle"),
    ],
)
def test_get_min_max_ferets(edge_points, min_expected, max_expected) -> None:
    """Tests the GrainStats.get_min_max_ferets method."""
    min_feret, max_feret = GrainStats.get_max_min_ferets(edge_points)
    np.testing.assert_almost_equal(min_feret, min_expected)
    np.testing.assert_almost_equal(max_feret, max_expected)
cwd: /home/neil/work/git/hub/AFM-SPM/TopoStats/
cmd: pytest --color=yes 'tests/test_grainstats.py::test_get_min_max_ferets'

======================================================== test session starts ========================================================
platform linux -- Python 3.11.6, pytest-7.4.2, pluggy-1.3.0
Matplotlib: 3.8.0
Freetype: 2.6.1
rootdir: /home/neil/work/git/hub/AFM-SPM/TopoStats
configfile: pyproject.toml
plugins: mock-3.11.1, mpl-0.16.1, github-actions-annotate-failures-0.2.0, xdist-3.3.1, napari-plugin-engine-0.2.0, regtest-1.5.1, cov-4.1.0, hypothesis-6.87.1, lazy-fixture-0.6.3, anyio-4.0.0, typeguard-4.1.5
collected 3 items                                                                                                                   

tests/test_grainstats.py ...                                                                                                  [100%]

---------- coverage: platform linux, python 3.11.6-final-0 -----------
Name                                Stmts   Miss  Cover
-------------------------------------------------------
topostats/__main__.py                   2      2     0%
topostats/entry_point.py              105    105     0%
topostats/filters.py                  154    133    14%
topostats/grains.py                   105     81    23%
topostats/grainstats.py               331    235    29%
topostats/io.py                       412    348    16%
topostats/logs/logs.py                 25      0   100%
topostats/plotting.py                 156    127    19%
topostats/plottingfuncs.py            139    139     0%
topostats/processing.py               213    213     0%
topostats/run_topostats.py            109    109     0%
topostats/scars.py                    105     94    10%
topostats/statistics.py                26     26     0%
topostats/theme.py                     40      5    88%
topostats/thresholds.py                31     18    42%
topostats/tracing/dnacurvature.py      24     24     0%
topostats/tracing/dnatracing.py       418    365    13%
topostats/tracing/skeletonize.py       23     12    48%
topostats/tracing/tracingfuncs.py     613    550    10%
topostats/utils.py                     88     68    23%
topostats/validation.py                15     15     0%
-------------------------------------------------------
TOTAL                                3134   2669    15%


========================================================= 3 passed in 0.25s =========================================================

Ok this is good, the basic shapes work, however in the case of the tiny_triangle there is no point on the convex hull that is mid-way along the hypotenuse so I'm surprised that the rotating caliper algorithm gets this correct. Since its based on pairs on the upper and lower convex hull the Rotating Caliper based minimum feret should be 1.

I've looked at how min_feret is calculated in GrainStats.get_min_max_feret() and it is calculated using the current pair of points on the rotating caliper and either the next point in the lower_hull or the previous point in the upper_hull.

small_feret = GrainStats.get_triangle_height(
                    np.array(lower_hull[lower_index + 1, :]),
                    np.array(lower_hull[lower_index, :]),
                    np.array(upper_hull[upper_index, :]),
                )
if min_feret is None or small_feret < min_feret:
                    min_feret = small_feret

...and...

small_feret = GrainStats.get_triangle_height(
                    np.array(upper_hull[upper_index - 1, :]),
                    np.array(upper_hull[upper_index, :]),
                    np.array(lower_hull[lower_index, :]),
                )
if min_feret is None or small_feret < min_feret:
                    min_feret = small_feret

A useful post Minimum Feret Diameter » Steve on Image Processing with MATLAB - MATLAB & Simulink explains why this is the correct way to calculate the minimum feret (the visual diagrams were really helpful).

Thus #755 needs revising before it can be merged to use this method.

Apologies for doubting your solution @SylviaWhittle .

@MaxGamill-Sheffield
Copy link
Collaborator

Note: Necessary refactor due to bounding boxes not being returned and needed for traces

@ns-rse ns-rse removed the bug Something isn't working label Feb 20, 2024
ns-rse added a commit that referenced this issue Apr 15, 2024
Closes #798

Replaces the functionality for calculating minimum and maximum feret distances within the `GrainStats`  class with the
new `topostats.measure.feret.min_max_feret()`.

+ Reorder dictionary returned by `min_max_feret()`.
+ Dictionary is included in the `stats` dictionary built by `GrainStats` and saved as CSV.
+ Updates `tests/test_grainstats_minicircle.py::test_grainstats_regression` to include the co-ordinates.
+ Updates `tests/test_grainstats.py::test_process_scan*` to include co-ordinates.
+ Removes `GrainStats.get_max_min_feret()` and associated methods that are called as well as their tests.
+ Co-ordinates for min/max feret are included by default in HDF5 output now that they are part of the returned
  dataframe.

Will have to rethink/change how results are returned by `GraintStats` class if height profiles are to be
extracted. Probably most sensible to have profiling as a separate step conducted after GrainStats.
ns-rse added a commit that referenced this issue Apr 15, 2024
Closes #798

Replaces the functionality for calculating minimum and maximum feret distances within the `GrainStats`  class with the
new `topostats.measure.feret.min_max_feret()`.

+ Reorder dictionary returned by `min_max_feret()`.
+ Dictionary is included in the `stats` dictionary built by `GrainStats` and saved as CSV.
+ Updates `tests/test_grainstats_minicircle.py::test_grainstats_regression` to include the co-ordinates.
+ Updates `tests/test_grainstats.py::test_process_scan*` to include co-ordinates.
+ Removes `GrainStats.get_max_min_feret()` and associated methods that are called as well as their tests.
+ Co-ordinates for min/max feret are included by default in HDF5 output now that they are part of the returned
  dataframe. These are rounded to 13 decimal places to address precision errors encountered on Windows machines running
  the test suite.

Will have to rethink/change how results are returned by `GraintStats` class if height profiles are to be
extracted. Probably most sensible to have profiling as a separate step conducted after GrainStats.
ns-rse added a commit that referenced this issue Apr 15, 2024
Closes #798

Replaces the functionality for calculating minimum and maximum feret distances within the `GrainStats`  class with the
new `topostats.measure.feret.min_max_feret()`.

+ Reorder dictionary returned by `min_max_feret()`.
+ Dictionary is included in the `stats` dictionary built by `GrainStats` and saved as CSV.
+ Updates `tests/test_grainstats_minicircle.py::test_grainstats_regression` to include the co-ordinates.
+ Updates `tests/test_grainstats.py::test_process_scan*` to include co-ordinates.
+ Removes `GrainStats.get_max_min_feret()` and associated methods that are called as well as their tests.
+ Co-ordinates for min/max feret are included by default in HDF5 output now that they are part of the returned
  dataframe. These are rounded to 13 decimal places to address precision errors encountered on Windows machines running
  the test suite.

Will have to rethink/change how results are returned by `GraintStats` class if height profiles are to be
extracted. Probably most sensible to have profiling as a separate step conducted after GrainStats.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request GrainStats Issues pertaining to the GrainStats class
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants