reduce uuid.UUID() memory footprint #75160
Comments
memory usage for uuid.UUID seems larger than it has to be. it seems that using __slots__ will save around 100 bytes per instance, which is very significant, e.g. when dealing with large sets of uuids (which are often used as "primary keys" into external data stores).

uuid.UUID has a __setattr__ that prevents any extra attributes from being set:

    def __setattr__(self, name, value):
        raise TypeError('UUID objects are immutable')

...so it seems to me that not having __dict__ should not cause any problems?

before (RES column):

    PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command

with slots:

    PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command

i will open a pr on github shortly.
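For readers who want to see the effect described above without patching the standard library, here is a minimal sketch. `WithDict`, `WithSlots` and `shallow_size` are hypothetical names made up for this illustration and are not part of the `uuid` module; the two classes only mimic the "immutable object with one `int` attribute" shape. Exact byte counts depend on the interpreter version and platform, so none are quoted here.

```python
import sys


class WithDict:
    """Hypothetical stand-in: immutable holder with a per-instance __dict__."""

    def __init__(self, value):
        # Bypass the blocking __setattr__ below.
        object.__setattr__(self, "int", value)

    def __setattr__(self, name, value):
        raise TypeError("objects are immutable")


class WithSlots:
    """Same holder, but __slots__ removes the per-instance __dict__."""

    __slots__ = ("int",)

    def __init__(self, value):
        object.__setattr__(self, "int", value)

    def __setattr__(self, name, value):
        raise TypeError("objects are immutable")


def shallow_size(obj):
    """Size of the instance plus its attribute dict, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


if __name__ == "__main__":
    print("with __dict__ :", shallow_size(WithDict(2**127)))
    print("with __slots__:", shallow_size(WithSlots(2**127)))
```

The comparison is shallow on purpose: the integer payload is the same in both cases, so only the per-instance overhead differs.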
as a follow-up note, i also experimented with keeping the actual value as a bytes object instead of an integer, but that does not lead to less memory being used: a 128-bit integer uses less memory than a 16-byte bytes object (presumably because PyBytesObject has a cached hash() field and a trailing null byte).
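The int-versus-bytes comparison can be checked with `sys.getsizeof`. This is a rough sketch that assumes CPython; the UUID value is an arbitrary example and the sizes will vary between builds, so they are printed rather than stated.

```python
import sys
import uuid

# Arbitrary example value, chosen only so the result is reproducible.
u = uuid.UUID("12345678-1234-5678-1234-567812345678")

as_int = u.int      # 128-bit Python integer
as_bytes = u.bytes  # 16-byte big-endian bytes object

int_size = sys.getsizeof(as_int)
bytes_size = sys.getsizeof(as_bytes)

if __name__ == "__main__":
    print("int  :", int_size, "bytes of memory")
    print("bytes:", bytes_size, "bytes of memory")
```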
This saves memory, but using str(uuid.uuid4()) requires even less memory. Can you explain why someone would like to have 1000000 uuid objects, instead of 1000000 strings? What is the advantage of keeping UUID objects around?
i consider uuids as low level data types, not as fancy containers, similar to how i view datetime objects. given the native support in e.g. postgresql and many other systems, it's common to deal with uuids. of course you can convert to/from strings or numbers, but that is cumbersome in many cases. for comparison, one would typically not convert unicode text from/into utf-8 encoded byte strings either, even though the latter will save memory in many cases.

from experience: converting can lead to nasty bugs, e.g. because you forgot about a conversion, and then a uuid string does not compare equal to a uuid.UUID instance, leaving you puzzled.
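The comparison pitfall mentioned above is easy to reproduce. A small sketch, using an arbitrary example value:

```python
import uuid

text = "12345678-1234-5678-1234-567812345678"  # arbitrary example value
u = uuid.UUID(text)

# A UUID never compares equal to its string form, even though they "look" the same.
equal_to_text = (u == text)

# Lookups in sets and dicts miss for the same reason.
found_in_set = text in {u}

# Comparing like with like works in either direction.
equal_after_parse = (u == uuid.UUID(text))
equal_as_str = (str(u) == text)

if __name__ == "__main__":
    print(equal_to_text, found_in_set, equal_after_parse, equal_as_str)
```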
This change breaks pickle. You should preserve forward and backward pickle compatibility.
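One way to keep pickles interchangeable when a class moves from `__dict__` to `__slots__` is to keep the pickled state dict-shaped. The sketch below is an assumption about how that could look; it is not taken from the PR. `Token` is a hypothetical class, and the approach is simply that `__getstate__` emits the same `{'int': ...}` mapping a `__dict__`-based instance would have produced, while `__setstate__` accepts that mapping.

```python
import pickle


class Token:
    """Hypothetical immutable, slotted class with dict-shaped pickle state."""

    __slots__ = ("int",)

    def __init__(self, value):
        object.__setattr__(self, "int", value)

    def __setattr__(self, name, value):
        raise TypeError("Token objects are immutable")

    def __getstate__(self):
        # Same shape as the __dict__ of a non-slotted version of the class,
        # so code without __slots__ could restore it via __dict__.update().
        return {"int": self.int}

    def __setstate__(self, state):
        # Accepts state written by either variant; bypasses the blocking
        # __setattr__ because the instance has no __dict__ to update.
        object.__setattr__(self, "int", state["int"])


def roundtrip(obj):
    """Pickle and unpickle an object (class must be importable by name)."""
    return pickle.loads(pickle.dumps(obj))
```

Calling `roundtrip(Token(42))` from a regular module should give back a `Token` whose `int` is 42. A real change would also need tests that load pickles produced by the old class layout.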
See new PR which addresses pickle forward and backward compatibility.
I am closing the issue because the pickle issue hasn't been addressed: wouter bolsterlee (the author) hasn't replied for a month and a half. @wouter bolsterlee: if you still want to work on this issue, you should try to address the pickle issue first, then reopen this issue or maybe create a new issue pointing to this one.
Oops. I missed the fact that Tal created PR 9078. Sorry, I am reopening the issue ;-)
Thanks for the suggestion and the original patch, Wouter!