bpo-36694: Do not memoize temporary objects in the C implementation of pickle. #13036

serhiy-storchaka · 2019-05-01T12:28:14Z

This produces more compact pickle data and reduces memory consumption during
pickling and unpickling.

https://bugs.python.org/issue36694

pitrou · 2019-05-10T16:38:39Z

Modules/_pickle.c

@@ -1601,15 +1601,15 @@ memo_get(PicklerObject *self, PyObject *key)
 /* Store an object in the memo, assign it a new unique ID based on the number
   of objects currently stored in the memo and generate a PUT opcode. */
 static int
-memo_put(PicklerObject *self, PyObject *obj)
+memo_put(PicklerObject *self, PyObject *obj, int opt)


I don't understand how you decide whether opt should be 0 or 1. What is the heuristic?

+1 on this. This is critical behavior for cloudpickle :)

This is an interesting question. The rule is opt=0 for objects that save their content after saving themselves. These are non-empty lists, sets, dicts, and general objects with non-trivial elements 2-4 of the tuple returned by __reduce__(). They must be memoized so that reference loops can be detected.
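To illustrate why containers that save their content after themselves have to be memoized, here is a minimal sketch using only the public `pickle` and `pickletools` modules. It does not exercise the patched `memo_put()`; the helper `opcode_names` and the variable names are made up for the illustration, and protocol 2 is chosen only because its memo opcodes (`BINPUT`/`BINGET`) are easy to spot.

```python
import pickle
import pickletools


def opcode_names(data):
    """Return the opcode names of a pickle, in stream order."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


# A list that contains itself: its content is saved after the list itself.
loop = []
loop.append(loop)

data = pickle.dumps(loop, protocol=2)
names = opcode_names(data)

# The list is stored in the memo (BINPUT) before its item is written,
# so the inner reference can be emitted as a memo lookup (BINGET)
# instead of recursing forever.
put_before_get = names.index("BINPUT") < names.index("BINGET")

restored = pickle.loads(data)
is_still_recursive = restored[0] is restored
```

If the list were not in the memo by the time its item is saved, the pickler would have no way to notice that the item is the list it is already writing.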

pitrou · 2019-05-10T16:38:55Z

Also @pierreglaser .

pierreglaser · 2019-05-10T17:32:12Z

Modules/_pickle.c

 {
    char pdata[30];
    Py_ssize_t len;
    Py_ssize_t idx;

    const char memoize_op = MEMOIZE;

-    if (self->fast)
+    if (self->fast || (opt && Py_REFCNT(obj) == 1))


Requiring Py_REFCNT(obj) to be 1 is pretty strong, right? Does this only affect objects created using the C API, i.e. never bound to Python-level variables?

It mostly affects temporary objects. For example, if __reduce__ returns constructor, ((x, y),), then the tuple (x, y) will not be memoized. This is the case for namedtuples.

Another example: if you have a list of unique numbers, strings, tuples, etc., then the items of the list will not be memoized, as the only reference to each item is from the list.
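As a rough way to see how many memo entries such once-referenced items cost, the sketch below counts memo-writing opcodes in an ordinary pickle and compares it with the output of `pickletools.optimize()`, which strips PUT-style opcodes that are never read back. This is a post-processing stand-in, not the mechanism of this patch: the patch avoids creating the entries at pickling time, which `optimize()` cannot do. The helper `count_memo_writes` and the sample data are assumptions for the illustration, and the exact counts depend on the interpreter in use.

```python
import pickle
import pickletools

# Opcodes that store the object on top of the stack into the memo.
MEMO_WRITE_OPS = {"PUT", "BINPUT", "LONG_BINPUT", "MEMOIZE"}


def count_memo_writes(data):
    """Count the opcodes in a pickle that write to the memo."""
    return sum(
        1
        for opcode, arg, pos in pickletools.genops(data)
        if opcode.name in MEMO_WRITE_OPS
    )


# Every tuple and string is referenced only from its container.
items = [(i, str(i)) for i in range(1000)]

data = pickle.dumps(items, protocol=4)
slim = pickletools.optimize(data)  # drops memo writes that are never read

writes_before = count_memo_writes(data)
writes_after = count_memo_writes(slim)
```

Comparing `writes_before` with `writes_after`, and `len(data)` with `len(slim)`, gives an idea of the overhead that memoizing never-reused objects adds to the stream.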

pitrou · 2019-05-31T08:33:16Z

I think this optimization should be restricted to well-known built-in types (tuples, etc.). Skipping memoization for arbitrary user objects risks introducing regressions.

serhiy-storchaka · 2019-05-31T09:06:22Z

For example?

pitrou · 2019-05-31T09:08:41Z

I don't have any example, but I'm not confident that they don't exist. @pierreglaser mentioned cloudpickle, which heavily customizes pickling.

serhiy-storchaka · 2019-05-31T09:39:56Z

cloudpickle customizes pickle using the Python implementation. This optimization is only for the C implementation.

pitrou · 2019-05-31T09:58:04Z

No, cloudpickle will soon be subclassing from the C implementation.

serhiy-storchaka · 2021-01-02T21:28:15Z

I do not see any possibility of regressions.

serhiy-storchaka · 2021-01-04T09:09:59Z

Well, maybe there are some differences with constructors that have side effects. I'll try to write tests for this.

igozali · 2023-02-19T15:22:37Z

I believe I'm also hitting this issue. What are the remaining steps to push this patch forward? Is it just missing tests?

bpo-36694: Do not memoize temporary objects in the C implementation o…

7936816

…f pickle. This produces more optimal pickle data and reduces memory consumption on pickling and unpickling.

serhiy-storchaka added the performance Performance or resource usage label May 1, 2019

the-knights-who-say-ni added the CLA signed label May 1, 2019

bedevere-bot added the awaiting core review label May 1, 2019

pitrou reviewed May 10, 2019

pierreglaser reviewed May 10, 2019

serhiy-storchaka added 3 commits May 10, 2019 23:13

Fix pickling recursive objects.

edcd39c

Merge branch 'master' into pickle-optimize-refcnt

20aab10

Merge branch 'master' into pickle-optimize-refcnt

66979e2

Fix pickletools doctests.

7b1d188

LachlanStuart mentioned this pull request Jan 16, 2020

formula_images memory manager issue metaspace2020/Lithops-METASPACE#55

Closed

serhiy-storchaka added 8 commits January 1, 2021 17:48

Merge branch 'master' into pickle-optimize-refcnt

cf82909

Update Python version numbers.

6444a90

Add comments to recursive objects tests.

9d1b10d

Refactor recursion tests.

b39126a

Merge branch 'master' into pickle-optimize-refcnt

646fe43

Merge branch 'master' into pickle-optimize-refcnt

31eab15

Stabilize dis() tests.

48c12c7

Remove excessively strict tests.

15e2399

Paul-E mannequin mentioned this pull request Apr 10, 2022

Excessive memory use or memory fragmentation when unpickling many small objects #80875

Open

ezio-melotti removed the CLA signed label Jul 13, 2022

serhiy-storchaka marked this pull request as draft December 1, 2023 12:44

bedevere-app bot removed the awaiting core review label Dec 1, 2023
