PIMO #1726

Open: wants to merge 37 commits into main

Conversation

@jpcbertoldo (Contributor) commented Feb 9, 2024

πŸ“ Description

Replaces #1557, which in turn replaced the PRs listed in https://gist.github.com/jpcbertoldo/12553b7eaa97cfbf3e55bfd7d1cafe88 .

Implements the refactors from https://github.com/jpcbertoldo/anomalib/blob/metrics/refactors/src/anomalib/utils/metrics/perimg/.refactors .

arXiv: https://arxiv.org/abs/2401.01984
Medium post: https://medium.com/p/c653ac30e802
GSoC deliverable: https://gist.github.com/jpcbertoldo/12553b7eaa97cfbf3e55bfd7d1cafe88

Closes #1728

✨ Changes

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • πŸ”¨ Refactor (non-breaking change which refactors the code base)
  • πŸš€ New feature (non-breaking change which adds functionality)
  • πŸ’₯ Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • πŸ“š Documentation update
  • πŸ”’ Security update

βœ… Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • πŸ“‹ I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
  • πŸ“š I have made the necessary updates to the documentation (if applicable).
  • πŸ§ͺ I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from __future__ import annotations
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> Can we avoid this? Last I checked __future__.annotations does not behave well with jsonargparse

I thought it was necessary for annotations like range: tuple[float, float], is it not?

jpcbertoldo (Contributor Author):

So, I think I re-figured out why I used this.
In the other file, the class AUPIMOResult has annotations that use the class itself (e.g. from_pimoresult(...) -> AUPIMOResult:) and the linter didn't like it, but adding __future__.annotations solves it.
Now, I don't know if this is a good justification. What do you prefer to do about this?

Collaborator:
Our minimum python version is 3.10. I think we can safely omit __future__.annotations
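For reference, a minimal sketch of how this can be written on Python >= 3.10 without the __future__ import (the field names here are illustrative): builtin generics such as tuple[float, float] and X | Y unions already evaluate at runtime, and the self-reference can be a quoted forward reference.

from dataclasses import dataclass


@dataclass
class AUPIMOResult:
    fpr_bounds: tuple[float, float]
    aucs: list[float] | None = None

    @classmethod
    def from_pimoresult(cls, pimoresult: object) -> "AUPIMOResult":
        # The quoted return annotation avoids the NameError that a bare `AUPIMOResult`
        # would raise at definition time, with no need for `from __future__ import annotations`.
        ...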

@@ -0,0 +1,119 @@
"""Binary classification matrix curve (NUMBA implementation of low level functions).
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> What's the convention behind naming files starting with _?

It's similar to the _ attributes inside a module.

A function _func in module.py is supposed to be used within the module scope, right?

Similarly, _validate is private to per_image/, so it can be used across other modules inside the per_image subpackage but not outside of it.

raise ValueError(msg)


def file_path(file_path: str | Path, must_exist: bool, extension: str | None, pathlib_ok: bool) -> None:
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> Maybe we can move this method in data.utils.

I am trying to avoid putting things outside of per_image for the sake of synchronizing the standalone repo more easily.
Could we eventually put this in a list of "refactors for later"?

Collaborator:

Yeah, I understand that it is already a large PR, but it might be a good idea to sync with src/anomalib/data/utils/path.py. We already have utils that perform similar checks.

raise ValueError(msg)


def file_paths(file_paths: list[str | Path], must_exist: bool, extension: str | None, pathlib_ok: bool) -> None:
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> same here and maybe we can rename it to validate_path_are_files. A bit verbose but improves readability imo.

The idea of the _validate module is to have short names because the "validate" part is common to all the functions; they are always called as _validate.file_paths() so the "validate" shows up at the call site.

Sounds logical?
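For illustration, a minimal sketch of this call style as it would appear inside the subpackage (the wrapper function and the argument values are only illustrative; the validator signature follows the diff above):

# e.g. inside a module of anomalib/metrics/per_image/
from pathlib import Path

from . import _validate  # private to the per_image subpackage


def load_score_files(paths: list[str | Path]) -> None:
    # the module prefix keeps the "validate" intent visible despite the short function name
    _validate.file_paths(paths, must_exist=True, extension=".pt", pathlib_ok=True)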

Collaborator:

Yeah, I get what you mean, but I feel a lot of the methods defined here might be useful across the repo as well, like same_shape and is_tensor. It might be too much work and too many changed files, so we can address it in another PR, but it would be nice to then add a TODO at the top of the file and create an issue for it so that we don't forget to revisit it later.



@dataclass
class BinclfAlgorithm:
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> What's the advantage of using this over enums?

I just wanted to avoid enums for the sake of simplicity, yet I wanted to group these constants together for consistency.
Should I switch to enum or is that OK?

Collaborator:

At the risk of sounding dogmatic, I am in favour of switching to enums. The mental model in my head is that dataclasses are for storing data and enums are for switching, and from what I see, this is used for switching. Also, it is easy to validate enums by just wrapping the passed value with the enum type.

Take line 196 for example:

algorithm: str = BinclfAlgorithm.NUMBA

I understand the objective behind this, but I would rather go with

class BinclfAlgorithm(Enum):
    NUMBA = "numba"
    ...

def binclf_curves(  # illustrative name for the function being discussed
    algorithm: BinclfAlgorithm | str = BinclfAlgorithm.NUMBA,
) -> ndarray:
    algorithm = BinclfAlgorithm(algorithm)  # converts and validates in one step

This has the same desired effect as validate.
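To make the validation behaviour concrete, a self-contained sketch (member names and values are assumed from the snippets above; "cython" is just an example of an invalid value):

from enum import Enum


class BinclfAlgorithm(Enum):
    NUMBA = "numba"
    PYTHON = "python"


# Wrapping a value with the enum type converts and validates it in one step.
print(BinclfAlgorithm("numba"))                 # BinclfAlgorithm.NUMBA
print(BinclfAlgorithm(BinclfAlgorithm.PYTHON))  # members pass through unchanged

try:
    BinclfAlgorithm("cython")
except ValueError as err:
    print(err)  # 'cython' is not a valid BinclfAlgorithm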

"The lower bound of the shared FPR integration range is not exactly achieved. "
f"Expected {fpr_lower_bound} but got {fpr_lower_bound_defacto}, which is not within {rtol=}."
)
warnings.warn(msg, RuntimeWarning, stacklevel=1)
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> What's the advantage of using both logger.warning and warnings.warn? What happens if we want to pipe the warnings to a file and keep the console empty? A file handler can be passed to the logging module in such a scenario.

TBH, I didn't know what the right policy was here 😳
I put both because I was sure you'd find it and tell me what to do :P hehe

So, should I use the logger?

On a side note: warnings.warn allows one to turn this into an exception with warnings.filterwarnings("error"); maybe that could be interesting?

For context: this warning is due to having too few points to integrate the AUC curve, which may result in imprecise AUC values.

Collaborator:
Let's stick to logger.warning
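For reference, a minimal sketch of the agreed pattern with a module-level logger; the placeholder values stand in for the quantities computed in the code shown above, and the message text is the one from the diff:

import logging

logger = logging.getLogger(__name__)

# placeholders for the values computed before the warning
fpr_lower_bound, fpr_lower_bound_defacto, rtol = 1e-5, 1.2e-5, 1e-8

msg = (
    "The lower bound of the shared FPR integration range is not exactly achieved. "
    f"Expected {fpr_lower_bound} but got {fpr_lower_bound_defacto}, which is not within {rtol=}."
)
logger.warning(msg)  # instead of warnings.warn(msg, RuntimeWarning, stacklevel=1)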

normalization_factor = aupimo_normalizing_factor(fpr_bounds)
aucs = (aucs / normalization_factor).clip(0, 1)

return threshs, shared_fpr, per_image_tprs, image_classes, aucs, num_points_integral
jpcbertoldo (Contributor Author), replying to @ashwinvaidya17:

> Do you think we should capture these in a dataclass?

There is a dataclass for the torch interface but not for the numpy interface.
I couldn't figure out a nice, maintainable way to make the two versions.
Perhaps we can discuss this in a call? I'll explain better and maybe you'll know a good solution.

Collaborator:
Let's leave it as is for now. We are planning on refactoring both the inferencers.

@jpcbertoldo (Contributor Author):

Another unresolved issue from the previous PR [about "ATTENTION..." in docstrings]:

> @ashwinvaidya17: Same here. We need to consider how these docstrings will be rendered in sphinx.

How can I check that?

@samet-akcay (Contributor) commented Feb 9, 2024:

> how can i check that?

The documentation is built here based on your changes.

Comment on lines +6 to +13
# Original Code
# Copyright (c) 2024 @jpcbertoldo
# https://github.com/jpcbertoldo/aupimo
# SPDX-License-Identifier: MIT
#
# Modified
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
Contributor:

This is a good idea, but we might need to double-check the GSoC guidelines in regards to code ownership.

@jpcbertoldo (Contributor Author):

> The documentation is built here based on your changes

There is some [metadata] stuff showing up, I don't know why.

Apparently sphinx doesn't like dataclasses?
https://anomalib--1726.org.readthedocs.build/en/1726/markdown/guides/reference/metrics/index.html#id5
This field is in the class AUPIMOResult, but it's showing as if it were a function or whatever at the root (?), while AUPIMOResult doesn't show at all.

@samet-akcay (Contributor):

Tree structure is also messed up a bit. It might be an idea to split each metric into a separate section.

@jpcbertoldo (Contributor Author):

https://anomalib--1726.org.readthedocs.build/en/1726/markdown/guides/reference/metrics/index.html

It's not quite working as expected:

  1. I expected per_image to show as a submenu in metrics; how could I do that?

  2. It seems not to like dataclasses; there are attributes of PIMOResult and AUPIMOResult showing up as if they were functions (?) and the classes themselves don't show.

@samet-akcay added this to the v1.1.0 milestone Feb 29, 2024
@samet-akcay added the Feature label and removed the Dependencies and Tests labels Mar 25, 2024
@samet-akcay modified the milestones: v1.1.0, v1.2.0 May 14, 2024
@ashwinvaidya17 (Collaborator) left a review:

A lot has changed since this PR was submitted, but I've finally gotten around to reviewing it. This is a huge PR with a lot of effort behind it. However, I have some concerns. I've gone over it once, but I think I'll need a few more passes for a more thorough review. Meanwhile, we can start the discussions on the current open points.

# Original Code
# Copyright (c) 2024 @jpcbertoldo
# https://github.com/jpcbertoldo/aupimo
# SPDX-License-Identifier: MIT
Collaborator:
Are we allowed to release MIT licensed work under Apache license?

@@ -16,3 +16,4 @@ torch>=2,<2.2.0 # rkde export fails even with ONNX 17 (latest) with torch 2.2.0.
torchmetrics==0.10.3
rich-argparse
open-clip-torch>=2.23.0
numba>=0.58.1
Collaborator:

We now only use pyproject.toml, but this comment might be relevant there as well. Maybe we need to make this requirement part of the optional dependencies. There is an open issue requesting a smaller core size.


def test_compare_models_pairwise_wilcoxon(scores_per_model: dict, alternative: str, higher_is_better: bool) -> None:
"""Test `compare_models_pairwise_wilcoxon`."""
from anomalib.metrics.per_image import AUPIMOResult, compare_models_pairwise_wilcoxon
Collaborator:
What's the rationale behind the import statement here?

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from __future__ import annotations
Collaborator:

Let's remove __future__ across the code. It is not relevant anymore.

Numpy version docstring
=======================

{docstring}
Collaborator:
Is this intentional?

class BinclfAlgorithm:
"""Algorithm to use."""

PYTHON: ClassVar[str] = "python"
Collaborator:
Should we call it python or numpy?



@numba.jit(nopython=True)
def binclf_one_curve_numba(scores: ndarray, gts: ndarray, threshs: ndarray) -> ndarray:
Collaborator:

Maybe I missed something, but the only difference between this and the non-numba method is the decorator. This way we might be able to reduce the file count. Maybe you can use a generic decorator that applies numba's jit when it is available:

import numpy as np

from anomalib.utils.exceptions import try_import


def numba_accelerate(func):
    """Apply numba's JIT compilation only when numba is importable."""
    if try_import("numba"):
        from numba import jit

        return jit(nopython=True)(func)
    # numba is not installed: return the plain-Python function unchanged
    return func


@numba_accelerate
def add(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    return a + b

This is something quick I tried. Feel free to polish it.

"The lower bound of the shared FPR integration range is not exactly achieved. "
f"Expected {fpr_lower_bound} but got {fpr_lower_bound_defacto}, which is not within {rtol=}."
)
warnings.warn(msg, RuntimeWarning, stacklevel=1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's stick to logger.warning

normalization_factor = aupimo_normalizing_factor(fpr_bounds)
aucs = (aucs / normalization_factor).clip(0, 1)

return threshs, shared_fpr, per_image_tprs, image_classes, aucs, num_points_integral
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave it as is for now. We are planning on refactoring both the inferencers.

Successfully merging this pull request may close these issues: Add PIMO metric to anomalib (#1728).

3 participants