[WIP] Global caching #20712

Closed
masterpiga wants to merge 2 commits into darktable-org:master from masterpiga:caching

Conversation

@masterpiga
Collaborator

@masterpiga masterpiga commented Mar 31, 2026

In this post AP documents some dramatic pipeline improvements from more aggressive global caching. Modulo the name-calling and unpleasant attitude, it is quite an interesting write-up.

Out of curiosity, I asked Claude to incorporate these changes into darktable's pixelpipe. I asked it to decompose the changes into self-contained WPs, and then to implement the first three. See pixelpipe_caching.md for the full analysis.

I played with the resulting binary a bit and fixed a couple of crashes; I think it's pretty stable now. If I run with -d pipe, I see quite a lot of cache hits, so the change appears to be effective. However, I don't see a huge difference in interactive usage, though I didn't really try to stress the pipeline.

@TurboGit @jenshannoschwalm are there any benchmarks that you would like to try to measure if there are noticeable interactive speedups? @kofa73 you may also be interested in taking a look.

I don't have a strong opinion about this PR. I still consider it WIP, and I am not even sure it's something we want to incorporate. I decided to start the discussion here because I thought that having a PR with a testable binary would be more productive.

@andriiryzhkov
Collaborator

andriiryzhkov commented Mar 31, 2026

@masterpiga : Very interesting approach.

A couple of questions:

  1. Benchmarking: What's the recommended way to measure the performance difference? Would -d pipe log timestamps be enough to compare pipeline stage timings between this branch and master? Or is there a more structured benchmark scenario you'd suggest?

  2. Test scenarios: Which editing workflows would best demonstrate the caching benefits? I'm guessing images with many active modules (filmic, tone equalizer, denoise profile, masks) and interactive operations like zoom/pan/slider changes would show the biggest difference.

Would be happy to test if that's useful.
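For a rough first pass at (1), one could simply count cache hits in the `-d pipe` logs from both builds. A minimal sketch follows; note that the exact log wording varies between darktable versions, so the `cache.*hit` pattern and the `count_cache_hits` helper name are assumptions, demonstrated here against a synthetic log rather than real darktable output:

```shell
#!/bin/sh
# Hypothetical helper: count pixelpipe cache hits in a `-d pipe` debug log,
# e.g. one captured with `darktable -d pipe 2> pipe.log`.
# The log wording differs between darktable versions, so the pattern below
# is an assumption -- adjust it to match your build's actual output.
count_cache_hits() {
    grep -c -i 'cache.*hit' "$1"
}

# Demo on a synthetic log standing in for a real `-d pipe` capture:
cat > /tmp/pipe_demo.log <<'EOF'
[pixelpipe] module exposure: cache HIT
[pixelpipe] module filmic: cache MISS
[pixelpipe] module denoise: cache HIT
EOF
count_cache_hits /tmp/pipe_demo.log   # prints 2
```

Running the same edit session against this branch and master and comparing the two counts (together with the per-stage timestamps already in the log) would give a crude but reproducible signal.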

@jenshannoschwalm
Collaborator

Just my two cents ...

  1. AP was using a malfunctioning pipe caching as in dt 4.0. We have done a lot since then, so I honestly don't care what AP rants about dt.
  2. BTW, he reorganized the pipe quite a lot (where scaling happens, for example), so a different caching strategy might be appropriate.
  3. In dt we use other very efficient ways to speed up interactive use.
  4. Did you test cache efficiency with current dt at all? The hit rate is very high as I see it.
  5. We don't cache OpenCL memory buffers so far. We could thus possibly avoid a copy to the CL device, although that is not very costly.
  6. A unified cache would simplify things, although the code burden for that is very low.

@masterpiga
Collaborator Author

masterpiga commented Mar 31, 2026

Thanks for chiming in, @andriiryzhkov and @jenshannoschwalm.

As I mentioned above, I didn't find a lot of obvious benefits in interactive usage. OTOH, I have a very fast M4 Pro, so maybe my setup is not one that would benefit much from this change. Hence my question: is there an established way of comparing interactive execution speed after pipeline refactorings? I know I can run a benchmark with darktable-cli, but it's not batch-processing speed that I am after (even though, of course, I wouldn't mind getting some improvements there as well).

@andriiryzhkov (1) my question exactly. (2) yes, ideally you should see most benefits when you have a long chain of modules and you edit something in the top half.

@jenshannoschwalm indeed, I am very ignorant about this space, I understand only superficially what is happening and I am not sure if the ideas have a lot of value given the current state of dt. Consider this PR as an excuse to have a conversation about this topic.

@jenshannoschwalm
Collaborator

If you want to go into pipe performance, the place to start would be mask distortion in the pipe :-) That might be beneficial for all processing ...

@TurboGit
Member

Yes, I fully agree with @jenshannoschwalm: in current dt we have done a lot for speed since then. All this is @jenshannoschwalm's work on the pipe, and @ralfbrown's on many iops, squeezing CPU cycles as much as possible. There may be room for improvement, as always, but we need figures for this.

@jenshannoschwalm
Collaborator

Some more cents :-)

  1. I don't know how and what exactly AP is benchmarking; the worse dt results for module processing could well be explained by the "over-developing", which lets us move the darkroom canvas around without any recalculation.
  2. Using a pinned CL image for caching might be a good one.

@masterpiga
Collaborator Author

OK, thanks for your feedback. Closing this for now, as I understand there are lower-hanging fruit and more promising directions to explore.

@masterpiga masterpiga closed this Apr 1, 2026
@ralfbrown
Collaborator

There is one (rarely-used) module which would be considerably helped by improved caching - liquify. It computes its displacement map at least twice for every pipe run as well as every time the shapes for drawn masks are updated on screen, and that calculation can take hundreds of milliseconds for large/numerous warps. That's part of the reason why I spent a fair bit of time minimizing refreshes a couple of years ago.

So we really should be caching liquify's displacement map as well as its output.
