Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegionProps.__setattr__ has non-negligible overhead #6601

Open
anntzer opened this issue Oct 31, 2022 · 4 comments
Open

RegionProps.__setattr__ has non-negligible overhead #6601

anntzer opened this issue Oct 31, 2022 · 4 comments

Comments

@anntzer
Copy link
Contributor

anntzer commented Oct 31, 2022

Description:

RegionProps currently relies on __setattr__ to provide backcompat access of old property names (e.g. mapping BoundingBox to bbox); but because this also affects all attribute initializations in __init__, this ends up having a non-negligible overhead which could be avoided.

Way to reproduce:

First check how much time it takes to label a random (1024, 1024) binary image.

$ python -mtimeit -s 'import numpy as np; from skimage.measure import label, regionprops; np.random.seed(0); im = np.random.randint(0, 2, (1024, 1024))' -- 'label(im)'

-> 3673 labels (label(im).max()); 30ms/iteration

Then check how much time it takes just to instantiate the corresponding regionprops.

$ python -mtimeit -s 'import numpy as np; from skimage.measure import label, regionprops; np.random.seed(0); im = np.random.randint(0, 2, (1024, 1024)); l = label(im)' -- 'regionprops(l)'

-> 31ms/iteration i.e. even more than the labeling itself

One easy way to skip __setitem__ is to directly update the object's __dict__ directly, e.g. replace, in RegionProperties.__init__,

        self.label = label
        
        self._slice = slice
        self.slice = slice
        self._label_image = label_image
        self._intensity_image = intensity_image
        
        self._cache_active = cache_active
        self._cache = {}
        self._ndim = label_image.ndim
        self._multichannel = multichannel
        self._spatial_axes = tuple(range(self._ndim))
        
        self._extra_properties = {}

by

        vars(self).update(
            label=label,
            _slice=slice,
            slice=slice,
            _label_image=label_image,
            _intensity_image=intensity_image,
            _cache_active=cache_active,
            _cache={},
            _ndim=label_image.ndim,
            _multichannel=multichannel,
            _spatial_axes=tuple(range(label_image.ndim)),
            _extra_properties={},
        )

which speeds up the above timeit call to 18ms/iteration -- almost twice as fast.

Admittedly that's a bit ugly; probably the "proper" way to do this is to completely get rid of the custom __setattr__ (and of backcompat alias handling in __getattr__) and to instead define a bunch of pass-through properties, e.g.

@property
def BoundingBox(self): return self.bbox
@BoundingBox.setter
def BoundingBox(self, value): self.bbox = value

(which can probably be done in a loop in the class definition body in you prefer, e.g. something along the lines of

for _alias, _name in PROPS.items():
    locals()[_alias] = property(
        lambda self, *, _name=_name: getattr(self, _name),
        lambda self, value, *, _name=name: setattr(self, _name, value))

)
OTOH, I see that #6000 explicitly chose not to provide properties for the old aliases for the purpose of better hiding them.

I don't really care whether to fix this with the vars(self).update(...) approach or by introducing these properties.

Traceback or output:

No response

Version information:

3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
Linux-5.19.14-300.fc37.x86_64-x86_64-with-glibc2.36
scikit-image version: 0.19.3
numpy version: 1.23.3
@stefanv
Copy link
Member

stefanv commented Nov 3, 2022

@anntzer How about we use your last suggestion (looks most elegant to me?), and then just have a workaround for the old names, with acomment to deprecate / remove that?

Are you able to help us implement this?

@anntzer
Copy link
Contributor Author

anntzer commented Nov 4, 2022

I can propose the following patch, but it's not a complete PR as you also need to fix the tests which currently explicitly test that the backcompat aliases are not real properties (something that this patch changes):

diff --git a/skimage/measure/_regionprops.py b/skimage/measure/_regionprops.py
index c7a5855c4..b19f3e090 100644
--- a/skimage/measure/_regionprops.py
+++ b/skimage/measure/_regionprops.py
@@ -282,6 +282,21 @@ def _inertia_eigvals_to_axes_lengths_3D(inertia_tensor_eigvals):
     return axis_lengths
 
 
+def _make_backcompat_prop(alias):
+    name = PROPS[alias]
+    if alias.lower() == alias:
+        def fget(self):
+            return getattr(self, name)
+    else:
+        def fget(self):
+            raise AttributeError(f"'{type(self)}' object has no attribute '{alias}'")
+
+    def fset(self, value):
+        setattr(self, name, value)
+
+    return property(fget, fset)
+
+
 class RegionProperties:
     """Please refer to `skimage.measure.regionprops` for more information
     on the available region properties.
@@ -363,27 +378,12 @@ class RegionProperties:
                     f'Custom regionprop function\'s number of arguments must '
                     f'be 1 or 2, but {attr} takes {n_args} arguments.'
                 )
-        elif attr in PROPS and attr.lower() == attr:
-            if (
-                self._intensity_image is None
-                and PROPS[attr] in _require_intensity_image
-            ):
-                raise AttributeError(
-                    f"Attribute '{attr}' unavailable when `intensity_image` "
-                    f"has not been specified."
-            )
-            # retrieve deprecated property (excluding old CamelCase ones)
-            return getattr(self, PROPS[attr])
         else:
             raise AttributeError(
                 f"'{type(self)}' object has no attribute '{attr}'"
             )
 
-    def __setattr__(self, name, value):
-        if name in PROPS:
-            super().__setattr__(PROPS[name], value)
-        else:
-            super().__setattr__(name, value)
+    locals().update({alias: _make_backcompat_prop(alias) for alias in PROPS})
 
     @property
     @_cached
@@ -1355,9 +1355,10 @@ def _parse_docs():
 def _install_properties_docs():
     prop_doc = _parse_docs()
 
-    for p in [member for member in dir(RegionProperties)
-              if not member.startswith('_')]:
-        getattr(RegionProperties, p).__doc__ = prop_doc[p]
+    for member in dir(RegionProperties):
+        if member.startswith('_') or member in PROPS:
+            continue
+        getattr(RegionProperties, member).__doc__ = prop_doc[member]
 
 
 if __debug__:

I don't have the time to make a full PR either right now, but you are more than welcome to just take over the patch.

@github-actions
Copy link

github-actions bot commented May 3, 2023

Hey, there hasn't been any activity on this issue for more than 180 days. For now, we have marked it as "dormant" until there is some new activity. You are welcome to reach out to people by mentioning them here or on our forum if you need more feedback! If you think that this issue is no longer relevant, you may close it by yourself; otherwise, we may do it at some point (either way, it will be done manually). In any case, thank you for your contributions so far!

@github-actions github-actions bot added the 😴 Dormant no recent activity label May 3, 2023
@lagru
Copy link
Member

lagru commented May 13, 2023

Thanks for the suggestion! I think there is an even simpler way. If I move the definition of __setattr__ and override the default one at the end of __init__ I get the same performance improvement compared to patch. However, we don't need to update the tests. And it seems a lot simpler to me...

def __init__(self, *args, **kwargs):
    # initialize...

    def __setattr__(self, name, value):
        """Translate older property names to current ones."""
        object.__setattr__(PROPS.get(name, name), value)

    # Override the default __setattr__ after initialization to speed it up
    self.__setattr__ = __setattr__

@github-actions github-actions bot removed the 😴 Dormant no recent activity label May 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants