-
-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Fraction pickling #88320
Comments
The current version of the Fraction.__reduce__() method uses str(), which produces bigger dumps, esp. for large components. C.f.:
>>> import random, pickle
>>> from fractions import Fraction as F
>>> random.seed(1); a = F(*random.random().as_integer_ratio())
>>> for proto in range(pickle.HIGHEST_PROTOCOL + 1):
... print(len(pickle.dumps(a, proto)))
...
71
70
71
71
77
77
>>> b = a**13
>>> for proto in range(pickle.HIGHEST_PROTOCOL + 1):
... print(len(pickle.dumps(b, proto)))
...
444
443
444
444
453
453
vs the attached patch:
>>> for proto in range(pickle.HIGHEST_PROTOCOL + 1):
... print(len(pickle.dumps(a, proto)))
...
71
68
49
49
59
59
>>> for proto in range(pickle.HIGHEST_PROTOCOL + 1):
... print(len(pickle.dumps(b, proto)))
...
444
441
204
204
214
214 Testing for non-default protocols was also added. Let me know if all this does make sense as a PR. |
Yes, this looks reasonable. Go ahead with a PR. |
Oh yes - please do. It's not just pickle size - going through str() makes (un)pickling quadratic time in both directions if components are large. Pickle the component ints instead, and the more recent pickle protocol(s) can do both directions in linear time instead. |
Ok, I did.
Yeah, I noticed speedup too, but size was much more important for may application. BTW, the same issue affects some other stdlib modules, ex. in the Decimal() it will be more efficient to use the tuple (sign, digit_tuple, exponent) instead of dumping strings. Maybe more, simple fgrep suggests me also the ipaddress module, but I think here it's ok;-) |
Not sure why this wasn't closed after pr merging. If this was intentional - let me know and reopen. I'm less sure if something like this will work for a Decimal(). Perhaps, if the constructor will accept an integer as the value[1], not just a tuple of digits. |
You're right that this won't work for decimal because it takes a string constructor. A fancier reduce might do the trick but it would involve modifying the C code (no fun) as well as the Python code. Also, the conversion from decimal to string and back isn't quadratic, so we don't have the same worries. Lastly, really large fractions happen naturally as they interoperate, but oversized decimals are uncommon. |
On Thu, May 20, 2021 at 12:03:38AM +0000, Raymond Hettinger wrote:
Yes, it will be harder. But I think - is possible. E.g. with this trivial patch:
$ git diff
diff --git a/Lib/_pydecimal.py b/Lib/_pydecimal.py
index ff23322ed5..473fb86770 100644
--- a/Lib/_pydecimal.py
+++ b/Lib/_pydecimal.py
@@ -627,6 +627,9 @@ def __new__(cls, value="0", context=None):
self._exp = value[2]
self._is_special = True
else:
+ value = list(value)
+ if isinstance(value[1], int):
+ value[1] = tuple(map(int, str(value[1])))
# process and validate the digits in value[1]
digits = []
for digit in value[1]:
@@ -3731,7 +3734,7 @@ def shift(self, other, context=None):
# Support for pickling, copy, and deepcopy
def __reduce__(self):
- return (self.__class__, (str(self),))
+ return (self.__class__, ((self._sign, int(self._int), self._exp),)) def __copy__(self):
if type(self) is Decimal: Simple test suggests that 2x size difference is possible:
>>> import pickle
>>> from test.support.import_helper import import_fresh_module
>>> P = import_fresh_module('decimal', blocked=['_decimal'])
>>> P.getcontext().prec = 1000
>>> d = P.Decimal('101').exp()
>>> len(pickle.dumps(d))
1045
vs
>>> len(pickle.dumps(d))
468 with the above diff. (Some size reduction will be even if we
Yes, for a speed bonus - we need to do something more clever)
For financial calculations this, probably, is true. But perfectly Let me know if it's worth openning an issue with above improvement. |
I don't think so. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: