Enum member lookup is 20x slower than normal class attribute lookup #67674
Running the attached test script:
$ time python test.py enum real 0m6.546s real 0m0.384s
I encountered this with a script that yielded a sequence of objects (potentially a few hundred thousand of them) and categorized them with instances of an Enum subclass. The consumer of that iteration processes each object with a switch-case-like comparison of the category, checking it sequentially against each instance of the Enum. This seems like a fairly common use case.
From cProfile it looks like EnumMeta.__getattr__ and _is_dunder are the main bottlenecks: [...] |
Craig Holmquist wrote:
So for every object you compare against every Enum member? Is there a reason you don't just use the lookup capability?
class Category(Enum):
    tiny = 1
    medium = 2
    large = 3
cat = Category(obj.category) # assumes obj.category is 1, 2, or 3 |
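For readers unfamiliar with the by-value lookup mentioned above, here is a minimal sketch. The helper name `categorize` is hypothetical and only serves the illustration; `Category` mirrors the class from the comment.

```python
from enum import Enum


class Category(Enum):
    tiny = 1
    medium = 2
    large = 3


def categorize(raw_value):
    """Map a raw value (1, 2, or 3) to its Category member.

    Hypothetical helper: calling the Enum class with a value returns the
    matching member, and raises ValueError if no member has that value.
    """
    return Category(raw_value)


if __name__ == "__main__":
    print(categorize(2))        # the member whose value is 2
    try:
        categorize(99)
    except ValueError as exc:
        print("no such category:", exc)
```

This only helps when the object carries the raw value rather than a member, which is the point clarified in the next comment.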
I may not have been clear before. What I mean is, code like this:
for obj in get_objects():
    if obj.category == Cat.cat1:
        #do something
    elif obj.category == Cat.cat2:
        #do something else
    elif obj.category == Cat.cat3 or obj.category == Cat.cat4:
        #...
obj.category is already an instance of Cat, in other words. The consumer is using it to determine what to do with each obj. |
It seems like performance is drastically improved by doing this:
class Category(Enum):
    tiny = 1
    medium = 2
    large = 3
tiny = Category.tiny
medium = Category.medium
In other words, resolving Category.tiny and Category.medium is what's slow. The comparison itself is very fast. |
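A rough way to check this observation on your own interpreter is sketched below. It is not the attached test.py (which is not reproduced in this thread); the function names and the sample data are assumptions made for the illustration, and the measured ratio will depend on the Python version.

```python
import timeit
from enum import Enum


class Category(Enum):
    tiny = 1
    medium = 2
    large = 3


def count_medium_via_attribute(values):
    """Resolve Category.<member> on the class for every comparison."""
    hits = 0
    for value in values:
        if value == Category.tiny:
            pass
        elif value == Category.medium:
            hits += 1
    return hits


def count_medium_via_alias(values):
    """Resolve each member once, then compare against local names."""
    tiny = Category.tiny
    medium = Category.medium
    hits = 0
    for value in values:
        if value == tiny:
            pass
        elif value == medium:
            hits += 1
    return hits


if __name__ == "__main__":
    # Hypothetical workload: 100,000 already-categorized objects.
    sample = [Category.medium, Category.large] * 50_000
    for func in (count_medium_via_attribute, count_medium_via_alias):
        seconds = timeit.timeit(lambda: func(sample), number=10)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Both functions return the same count; only the cost of resolving the member names differs.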
Yup, you have it figured out. It's the lookup that is the slowdown. When performance is an issue one of the standard tricks is to create a local name, like you did with "tiny = Category.tiny". For the curious (taken from the docstring for Enum.__getattr__):
It is possible to store all the enum members /except/ for 'name' and 'value' in the class' __dict__, but I'm not sure it's worth the extra complication. |
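To illustrate why a `__getattr__`-based lookup costs more than a name stored in the class `__dict__`, here is a toy sketch. It is not the enum module's code: `FallbackMeta`, `ViaGetattr`, and `ViaDict` are hypothetical names, and the members are plain integers. The only point is that a metaclass `__getattr__` runs after the normal lookup has already failed, so every access pays for a failed lookup plus a Python-level call.

```python
class FallbackMeta(type):
    """Toy metaclass that resolves missing class attributes by name."""

    calls = 0  # counts how often the slow path is taken

    def __getattr__(cls, name):
        # Only reached when normal lookup on the class (and its MRO) fails.
        FallbackMeta.calls += 1
        try:
            return cls._member_map_[name]
        except KeyError:
            raise AttributeError(name) from None


class ViaGetattr(metaclass=FallbackMeta):
    # Members live only in the map, so every access goes through __getattr__.
    _member_map_ = {"tiny": 1, "medium": 2}


class ViaDict(metaclass=FallbackMeta):
    # Members are also ordinary class attributes, so lookup succeeds directly.
    _member_map_ = {"tiny": 1, "medium": 2}
    tiny = 1
    medium = 2


if __name__ == "__main__":
    ViaGetattr.tiny
    print("slow-path calls after ViaGetattr.tiny:", FallbackMeta.calls)
    ViaDict.tiny
    print("slow-path calls after ViaDict.tiny:", FallbackMeta.calls)
```

Storing the member in the class `__dict__` means the fallback is never consulted for it, which is the idea discussed in the comment above.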
Out of curiosity I tried it: it took two new lines, one modified line, and one comment. :) |
Oh, and the slowdown dropped from 20 to 3 (for non-DynamicClassAttributes -- which is probably more than 99% of them). |
I don't understand the patch, but 3x slower instead of 20x slower is a huge optimization :-) Do you plan to change Python 3.5 *and* Python 3.4? |
This isn't a change to the API or any visible user behavior (besides performance), so I don't see a reason to not add it to 3.4. |
Larry, I have a very small patch (~4 lines) that does change user behavior or the API, but does have a significant performance boost. I'm still learning what is/is not okay to add to maintenance releases, so wanted to run this by you. |
My inclination is 3.5 only. Barry, do you want to argue for this going into 3.4? |
Argh, sorry -- that was supposed to be *does not* change user behavior nor the API, it's *just* a performance increase. Does that change your inclination? |
Oh, I read the code. But it's a performance hack, and the rules say we only accept security fixes and bug fixes at this stage of the release, and they're the rules for good reasons. |
Poor performance could fall under the category of bug fixes, so for an in-maintenance mode release, a fix that does not in any way change user visible behavior could be acceptable. It would probably be fine for 3.4 but I'm just +0 on it. Larry's call. |
This is not a regression (there were no enums before 3.4), the slowdown is not critical (only a constant factor, not increased computational complexity), there is a workaround, and code that just uses constants that were converted to IntEnum is not affected. I'm -0 on it. |
In getting everything fixed up and tested I realized there was one slight user-facing change: with this patch it is now possible to say:
SomeEnum.SomeMember = SomeMember
In other words, it is possible to set a value on the class as long as it is the same value that already existed. 3.5 sounds good. |
New changeset 2545bfe0d273 by Ethan Furman in branch 'default': |
Slight reordering of code removed the one user visible change. |
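The behavior that was preserved can be checked with a short sketch: assigning to an existing member name on an Enum class raises AttributeError, even when the assigned value is the member itself. The helper name `reassign_is_rejected` is hypothetical.

```python
from enum import Enum


class Category(Enum):
    tiny = 1
    medium = 2


def reassign_is_rejected():
    """Return True if rebinding an existing member name is refused."""
    try:
        # Same member, same name: still not allowed.
        Category.tiny = Category.tiny
    except AttributeError:
        return True
    return False


if __name__ == "__main__":
    print("reassignment rejected:", reassign_is_rejected())
```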