[FIX] core: reduce memory use of BaseModel._flush() #162442
Draft
+49
−35
Motivation: MemoryError exceptions occur when a large number of records on the same model have dirty fields, which often happens during upgrades.
In the current implementation, the cached data is re-arranged in multiple steps using local data structures. The most problematic is `id_vals[record.id][field.name]`, because it creates a dictionary with a potentially long field name (think studio fields) as key for each dirty record. For thousands of records, this quickly accumulates to tens or even hundreds of MiB in RAM.

The idea of this patch is to fill the `updates` dictionary directly, without creating the intermediate data structures. This way, the `_flush` method only consumes a marginal amount of memory compared to the memory already consumed by the cache.
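For illustration only, here is a minimal, self-contained sketch of that contrast. It is not the actual `BaseModel._flush()` code; the cache layout and the names `dirty_cache`, `old_arrangement`, and `new_arrangement` are assumptions made for this example.

```python
from collections import defaultdict

# Simulated per-model cache of dirty values: {field_name: {record_id: value}}.
dirty_cache = {
    "x_studio_very_long_custom_field_name_1": {rid: rid * 2 for rid in range(1, 50_001)},
    "x_studio_very_long_custom_field_name_2": {rid: "val" for rid in range(1, 50_001)},
}


def old_arrangement(cache):
    """Re-group values per record: id_vals[record_id][field_name] = value.

    This allocates one small dict per dirty record, each keyed by (long) field
    names, which is what accumulates to tens or hundreds of MiB on big flushes.
    """
    id_vals = defaultdict(dict)
    for field_name, values in cache.items():
        for record_id, value in values.items():
            id_vals[record_id][field_name] = value
    return id_vals


def new_arrangement(cache):
    """Build a column-wise `updates` mapping directly from the cache.

    Only one list per column is created, so the extra memory stays marginal
    compared to what the cache itself already holds.
    """
    ids = sorted({rid for values in cache.values() for rid in values})
    updates = {"id": ids}
    for field_name, values in cache.items():
        updates[field_name] = [values.get(rid) for rid in ids]
    return updates


if __name__ == "__main__":
    updates = new_arrangement(dirty_cache)
    print(len(updates["id"]), "records,", len(updates) - 1, "columns to update")
```

The difference the patch exploits is visible here: `new_arrangement` allocates one list per column, while `old_arrangement` allocates one dictionary keyed by field names per dirty record.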
Profiling a flush of 50k records with both versions of the code shows that the reduced space complexity does not increase the runtime. As one can see, the vast majority of the time is spent in the SQL operation in both versions: