ZODB.utils.p64 and u64 could be faster and provide better error reports #216
Comments
I did it for NEO, but without debugging information: Nexedi/neoppod@ef38744 For u64, having For p64, setting |
Thanks for the pointers. I'll do some benchmarking of the various options. I'll be happy if we at least don't lose any speed |
Here are the results of the benchmarking script where I measured
The first line is our current implementation. That's the number to beat. The second line is the absolute fastest we can get, directly calling the methods of Struct to pack and unpack. It's also unrealistic because it doesn't index into the results of unpacking like we need to do. It serves as a floor. The third line (still using the cached methods of Struct) makes unpacking realistic by using a wrapper function to return the correct index, while still just using the pack method directly. That's the fastest we could realistically be. The fourth line adds a wrapper function for packing that sets It looks like function call overhead is approximately 100ns. Once you have that, either setting |
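For readers who want to reproduce the comparison, here is a minimal sketch of the variants described above, timed with `timeit`. It is not the benchmarking script used in this thread; the function names and the `bench` helper are made up for illustration, and it assumes Python 3.5+ (for the `globals` argument of `timeit`).

```python
import struct
import timeit

# Pre-allocated Struct and its cached bound methods.
_Q = struct.Struct(">Q")
_pack = _Q.pack
_unpack = _Q.unpack


def p64_module(v):
    # Baseline: module-level struct.pack with the format string on every call.
    return struct.pack(">Q", v)


def u64_module(data):
    # Baseline: module-level struct.unpack, indexing into the 1-tuple.
    return struct.unpack(">Q", data)[0]


def p64_cached(v):
    # Wrapper around the cached pack method (costs one extra function call).
    return _pack(v)


def u64_cached(data):
    # Wrapper around the cached unpack method, returning the integer itself.
    return _unpack(data)[0]


def bench(func, arg, number=200000, repeat=3):
    # Hypothetical helper: best per-call time in nanoseconds.
    best = min(timeit.repeat("f(x)", globals={"f": func, "x": arg},
                             number=number, repeat=repeat))
    return best / number * 1e9


if __name__ == "__main__":
    packed = _pack(12345)
    for name, func, arg in [
        ("struct.pack wrapper", p64_module, 12345),
        ("Struct.pack direct", _pack, 12345),
        ("Struct.pack wrapper", p64_cached, 12345),
        ("struct.unpack wrapper", u64_module, packed),
        ("Struct.unpack wrapper", u64_cached, packed),
    ]:
        print("%-24s %8.1f ns" % (name, bench(func, arg)))
```

Absolute numbers will vary by machine and interpreter, so only the relative ordering is worth comparing.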
("Inlining" the pack/unpack methods into the function through the old keyword argument trick, instead of accessing them as globals, did not improve the speed. Indeed, it slowed it down slightly.) |
Oh. I was about to comment about this and I'm surprised you get slower results. I should test again on my side. |
I was surprised too, but the effect is consistent across multiple separate test runs. Python 2.7:
Note the higher standard deviation of the kwarg approach, which, on the low end, would make it competitive with the non-kwarg approach. In Python 3.7, I understand accessing module globals is faster now, so there's little effect (although the standard deviation is smaller):
|
In NEO, I instead used a nonlocal variable. But I redid perf tests with Python 2 and I see no difference between that and a global. So fine, let's consider the most readable code. |
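One reading of "a nonlocal variable" is a closure: the pack/unpack methods live in an enclosing function's scope rather than in module globals. The sketch below assumes that reading and is not NEO's actual code; `_make_codecs` is a hypothetical name.

```python
import struct


def _make_codecs():
    # pack and unpack are captured in closure cells, so p64/u64 read them
    # as free variables instead of module globals.
    s = struct.Struct(">Q")
    pack, unpack = s.pack, s.unpack

    def p64(v):
        return pack(v)

    def u64(data):
        return unpack(data)[0]

    return p64, u64


p64, u64 = _make_codecs()
```

Since no rebinding happens inside the inner functions, this works on Python 2 as well, where the `nonlocal` keyword does not exist.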
If I update the benchmarks to themselves have a local reference to the functions they're calling, instead of calling them through a global, as in this gist, I can see a tiny improvement in the kwarg version on 2.7 compared to not. But (1) that's not how they're called in real life, and (2) the improvement is within the stddev. (FWIW PyPy 6 2.7 clocks in at around 1.5 ns for each of the benchmarks, roughly 100 times faster than CPython.) |
They currently use `struct.[un]pack(">Q", x)`, which parses the format string each time. Using a pre-allocated Struct object can be quite a bit faster.

Additionally, when there is an error in packing, the resulting `struct.error` just says something like `error: integer out of range for 'Q' format code`, without saying what the faulting value is. It would be nice if that information was available. We've run into scenarios where RelStorage over a MySQL connection can start throwing that exception for no apparent reason from `storage.new_oid()`, and knowing the value could help debugging. I think just `__traceback_info__` would be enough, and probably cheaper than a try/except that raises a different exception.
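As a rough illustration of the proposal (not the code that was eventually merged), the two ideas can be combined like this. It assumes a traceback formatter that understands the `__traceback_info__` convention, such as the one in zope.exceptions; the plain CPython traceback printer ignores that local.

```python
import struct

# Pre-allocated Struct: the format is compiled once, at import time.
_Q = struct.Struct(">Q")
_pack = _Q.pack
_unpack = _Q.unpack


def p64(v):
    """Pack an integer into an 8-byte big-endian string."""
    # Left in the frame's locals so that a formatter aware of
    # __traceback_info__ can show the faulting value on error.
    __traceback_info__ = v
    return _pack(v)


def u64(data):
    """Unpack an 8-byte big-endian string into an integer."""
    __traceback_info__ = data
    return _unpack(data)[0]
```

On the success path the only added work is one local assignment per call. When packing fails, the value is still reachable from the traceback's innermost frame, even without a special formatter.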