Cythonize some hotspots #55
…on and singleton. No optimizations yet. All tests pass.
…naive version so far. Hopefully we can get better still.
This was a pretty big win, getting us 9-13% faster on the externalization benchmarks compared with the last commit.
…place) so it can be a tuple. This is again slightly faster.
…ther 1-2% on extern.
…eading underscores, rename our private module back to private again.
This nets us a 4% improvement or so in externalization, maybe a tiny slowdown in internalization (but I expect that to be fixed when I type that module). We could do better, but the class attribute _ext_excluded_out_ and the like are slowing us down, and we have some code that doesn't keep their usual types (mostly in housing). A metaclass may be able to fix that, because they all seem to be defined at the class level.
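A minimal sketch of the metaclass idea mentioned above, assuming the goal is simply to coerce class-level attributes such as `_ext_excluded_out_` to one predictable type when the class is created. The metaclass name, the choice of `frozenset`, and the example classes are hypothetical and written in Python 3 syntax; this is not the code in this PR.

```python
class NormalizeExtAttrs(type):
    """Hypothetical metaclass: coerce selected class attributes to frozenset."""

    # Only _ext_excluded_out_ comes from the discussion above; other
    # attributes "and the like" would be listed here as well.
    _SET_ATTRS = ('_ext_excluded_out_',)

    def __new__(mcs, name, bases, namespace):
        for attr in mcs._SET_ATTRS:
            if attr in namespace:
                value = namespace[attr]
                if isinstance(value, str):
                    # Treat a bare string as a single name, not as characters.
                    value = (value,)
                namespace[attr] = frozenset(value)
        return super().__new__(mcs, name, bases, namespace)


class Base(metaclass=NormalizeExtAttrs):
    _ext_excluded_out_ = ()


class Example(Base):
    # A list instead of the "usual" type; the metaclass normalizes it.
    _ext_excluded_out_ = ['internal_id', 'cache']
```

Because the coercion happens once at class-creation time, typed code that reads the attribute later could assume a single type without checking on every call.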
Define them as an instance, make an accessor function, type the result of the accessor in the pxd. This avoids needing to get the type in the python file, which is what causes the leading underscore problems. This avoids the need for re-enumerating all the constants while still being direct access (yay!). Speeds things up by another percent or two in datastructures.py.
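The pattern described in this commit message might look roughly like the following pure-Python sketch. The class name, constant names, and helper function are hypothetical; the commented `.pxd` fragment is an assumption about how the accessor's result could be typed, not the declarations used in this PR.

```python
class _Constants(object):
    """Holds the constants as instance attributes (names are hypothetical)."""

    def __init__(self):
        self.CLASS_KEY = 'Class'
        self.MIME_TYPE_KEY = 'MimeType'


# A single shared instance, created once at import time.
_CONSTANTS = _Constants()


def get_constants():
    """Accessor whose return type a companion .pxd could declare."""
    return _CONSTANTS


def class_name_of(external_dict):
    # With a typed accessor, Cython could turn this attribute lookup into
    # direct field access; in pure Python it is an ordinary getattr.
    constants = get_constants()
    return external_dict.get(constants.CLASS_KEY)


# Sketch of a companion .pxd (Cython, not executed here), same assumptions:
#     cdef class _Constants:
#         cdef readonly str CLASS_KEY
#         cdef readonly str MIME_TYPE_KEY
#     cpdef _Constants get_constants()
```

The Python file never has to name the constants' type itself, which is the part the commit message credits with avoiding the leading-underscore problems.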
…ment on the benchmarks, it appears.
…ning bottlenecks

Two changes of note:

- The notify_modified alias was removed. I couldn't find any uses of it on GitHub.
- eventFactory was removed from notifyModified. I couldn't find any uses of it on GitHub.

These are both easy to add back if needed.
Looking specifically at CHANGES: I do see some usage of nti.dataserver. I haven't traced back far enough to tell if those contain args that would be unexpected, but it's something to look into.
I later developed a pattern to deal with that, so it's easy enough to restore. (I came across that one very early and intended to go back, search the repos, and deal with it, but I didn't get there. Thanks.)
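For illustration only, one common way to restore a removed alias is a thin wrapper that emits a `DeprecationWarning` and delegates to the real function. This is not necessarily the pattern referred to above; the stand-in `notifyModified` body and its signature are assumptions made so the sketch runs on its own.

```python
import warnings


def notifyModified(obj, external_value):
    """Stand-in for the real function; signature and body are assumed."""
    return (obj, external_value)


def notify_modified(*args, **kwargs):
    """Restored alias that warns callers, then delegates."""
    warnings.warn(
        "notify_modified is deprecated; use notifyModified instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return notifyModified(*args, **kwargs)
```

Existing callers keep working, while test runs with warnings enabled show where the old name is still in use.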
…nary with a warning.
LGTM
I've pushed this to PyPI as 1.0a2 so we can start finding out what I unintentionally broke 😄
I know there were some other associated PRs that went along with this. Are the benchmarks in the original post still accurate?
Running the current benchmarks now and comparing against 1.0a1, the results are either no different than originally posted, or notably faster (in the last case you need to look at absolute time; for whatever reason the pure-python version was suddenly faster):
Compared to the benchmarks posted originally here, quite a bit of progress was made (again, that last one is within margin of error; I need more substantial benchmarks of a larger object tree):
Of course, PyPy is still dramatically faster. We're talking order of magnitude or more:
Fixes #52
Some of the changes also make pure python mode faster as well.
I tried to stay compatible, and did a number of searches of our github repos to see if I was breaking anything. See CHANGES.rst for notes on the few things I know I broke (but that didn't appear to be used). I'm encouraged by the fact that there are no substantial test changes and we still have full coverage.
I need to do some refactoring to bring module sizes under control, but this PR is big enough as it is. Refactoring of these modules is next. Then I'll look at other modules to see if they might need some cython love (suggestions for known hotspots welcome).
Now the numbers. These are simple benchmarks, but for more complex types, I expect the speedups to scale well (types with lots of Singleton decorators could see a nice benefit). Comparing this to the pure-python version on CPython 2.7: