Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve experience/performance when compositing >1000 images #2286

Closed
rpav opened this issue Jul 8, 2020 · 15 comments
Closed

Improve experience/performance when compositing >1000 images #2286

rpav opened this issue Jul 8, 2020 · 15 comments

Comments

@rpav
Copy link

@rpav rpav commented Jul 8, 2020

Are you using the latest version? Is the version currently in use as reported by npm ls sharp the same as the latest version as reported by npm view sharp dist-tags.latest?

Yes... I upgraded (0.25.4), sadly same issue

What are the steps to reproduce?

I'm processing a bunch of images (1408 sprites around 16x16 or 32x16 to be exact). At somewhere between 1156 and 1393 (exact number unlikely to be relevant), this will crash with a segmentation fault. The gdb stacktrace looks something like this:

#0  0x00007fffedd442b6 in g_hash_table_lookup () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libglib-2.0.so.0
#1  0x00007fffedad029a in g_signal_handler_is_connected ()
   from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#2  0x00007fffee27b5b6 in vips.object_set_member () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#3  0x00007fffee27bab8 in vips_object_set_property ()
   from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#4  0x00007fffedabd7c3 in g_object_set_valist ()
   from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#5  0x00007fffedabe0bf in g_object_set () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#6  0x00007fffee27a1f5 in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#7  0x00007fffee27ac0a in vips_argument_map () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#8  0x00007fffee27acf7 in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#9  0x00007fffedaba113 in g_object_unref () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#10 0x00007fffedabd531 in g_object_set_valist ()
   from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#11 0x00007fffedabe0bf in g_object_set () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#12 0x00007fffedab556d in g_closure_invoke () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#13 0x00007fffedac7aab in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#14 0x00007fffedad314f in g_signal_emit_valist ()
   from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#15 0x00007fffedad3be2 in g_signal_emit () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#16 0x00007fffee27ad69 in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#17 0x00007fffedaba113 in g_object_unref () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#18 0x00007fffee279e96 in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#19 0x00007fffedab556d in g_closure_invoke () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#20 0x00007fffedac7aab in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#21 0x00007fffedad314f in g_signal_emit_valist ()

This continues for "a bit", ending as follows:

#48450 0x00007fffedad314f in g_signal_emit_valist () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#48451 0x00007fffedad3be2 in g_signal_emit () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#48452 0x00007fffee27ad69 in ?? () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#48453 0x00007fffedaba113 in g_object_unref () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#48454 0x00007ffff448c42c in PipelineWorker::Execute() () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/sharp.node
#48455 0x00007ffff4472eb4 in Napi::AsyncWorker::OnExecute(Napi::Env) () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/sharp.node
#48456 0x00007ffff4475eb3 in Napi::AsyncWorker::OnAsyncWorkExecute(napi_env__*, void*) () from /home/rpav/vcs/tileutil.js/node_modules/sharp/build/Release/sharp.node
#48457 0x00005555565b4335 in ?? ()
#48458 0x00007ffff738f422 in start_thread () from /usr/lib/libpthread.so.0
#48459 0x00007ffff72bcbf3 in clone () from /usr/lib/libc.so.6

I don't think this is actually infinite recursion ... I don't know what's going on, specifically, but it will actually complete given enough stack space. I discovered this because it actually works for this image set when run in emacs's eshell! Subsequently using ulimit -s 16384 also completes.

This is however not the greatest workaround as it doesn't appear to scale.

What is the expected behaviour?

Stack usage not dependent on input size. Also not crashing.

Are you able to provide a minimal, standalone code sample, without other dependencies, that demonstrates this problem?

Unsure .. my code looks something like:

sharp({
  create: { /* something like 1024x1024 */ },
})
.composite(arrayOfImages)
.raw().toBuffer()

This seems sufficient to crash it, though I haven't tried this outside of the other code that actually loads the images etc. Note that removing .toBuffer() is sufficient for it to not crash. This is all in an async lambda in an async main.

Are you able to provide a sample image that helps explain the problem?

Specific images don't seem to be the triggering factor.

What is the output of running npx envinfo --binaries --system?

npx: installed 1 in 1.747s

  System:
    OS: Linux 5.6 Arch Linux
    CPU: (12) x64 Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
    Memory: 25.63 GB / 31.31 GB
    Container: Yes
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 14.5.0 - /bin/node
    Yarn: 1.22.4 - /bin/yarn
    npm: 6.14.5 - ~/.nodejs/bin/npm
@rpav rpav added the triage label Jul 8, 2020
@rpav
Copy link
Author

@rpav rpav commented Jul 8, 2020

Also, notably, perhaps related, composite() is by far the slowest part of the utility I'm working on; nearly 5 seconds to composite said 1400 images. Other parts of the process involve walking every sprite pixel-by-pixel to compute borders, and writing sprite bleed, also pixel-by-pixel. In Javascript. E.g. the makeBleed() portion of this by comparison is 0.09s.

So probably, something weird is going on.

@tgrajewski
Copy link

@tgrajewski tgrajewski commented Jul 8, 2020

This was reported before: #1708

@lovell
Copy link
Owner

@lovell lovell commented Jul 9, 2020

Yes, this looks like the same stack overflow as #1470.

@rpav does #1580 better suit your use case?

The best we can do to avoid this is add an fixed limit on the number of images being composited at once. It's hard to know exactly what that limit should be, but I'd prefer a low value as people attempting to composite 100s of images are probably looking for #1580.

In terms of performance, we'd need to try using a tool such as callgrind to see if there's any over-computation that might be solved with a libvips tile cache.

@lovell lovell added enhancement and removed triage labels Jul 9, 2020
@lovell lovell changed the title Segfault processing lots of images, stack overflow? Improve experience/performance when compositing >1000 images Jul 9, 2020
@rpav
Copy link
Author

@rpav rpav commented Jul 9, 2020

It doesn't appear to; that appears to join a number of images in a uniform array. In my case, I'm using maxrects-packer to compute bins and sharp to composite some processed sprites into the image.

At a glance, vips composite is not what I really want here, rather embed or insert. These don't appear to be available via sharp but I'm not sure.

@rpav
Copy link
Author

@rpav rpav commented Jul 9, 2020

Presumably the following ought also produce similar results:

let atlas = sharp({ ... });

await atlas.composite([ images[0] ]);
await atlas.composite([ images[1] ]);

(Obviously, in the "real" case, I do promises.push(atlas.composite([images[i]])) then await them all, but for sake of simplified testing and problem elimination...)

Unfortunately this seems to clear the entire image every time .composite() is called, leaving only one image in the atlas. Surely this is not the intended effect? This is the same even if I change blend modes.

@rpav
Copy link
Author

@rpav rpav commented Jul 9, 2020

For now I have written a simple pixel-by-pixel copy. This cuts runtime to the point where, running linearly, it's not even worth optimizing.

With .composite():

Start at 0.0022215s (0.0022215s)
Loaded/trimmed at 3.9076s (3.9054s)
Packing computed at 4.0802s (0.17257s)
Composited at 9.5329s (5.4527s)
Sprites processed: 1421

With pixel copying:

Loaded/trimmed at 3.4233s (3.4210s)
Packing computed at 3.7387s (0.31510s)
Binned images at 3.8882s (0.14949s)
Built altas at 3.9453s (0.057114s)
Bleed at 4.0371s (0.091836s)
Process finished at 4.4002s (0.36307s)
Sprites processed: 1421

The "Binned images" step is mostly slow due to rotating the image and re-converting it to a buffer. This conversion seems to take the vast majority of the time; caching the unrotated raw image buffer cut this from 2.9s to the current 0.1s. Blitting is the "Built atlas" step. (I'm probably doing something dumb and can cut loading down too.)

In any case, if someone is having performance / crash issues with .composite(), just copying data between buffers with dest.writeUInt32LE(src.readUInt32LE(...), ...) is very fast.

@willemmulder
Copy link

@willemmulder willemmulder commented Aug 27, 2020

Hi @rpav I'm trying to do composite as well with a relatively large amount of images (i.e. stitch them together to form one big image). I have the same problems:

  • doing .composite(...).composite(...) will cause the first composites to be ignored and only the last to be taken into account
  • make a large array and then pushing it to .composite(...) at once, is problematic. Either it fails (after eating 10 gigabytes of RAM when compositing 196 small images) or it's super slow.
  • and (unrelated) when I do .composite().resize().png() there will be an error Error: Image to composite must have same dimensions or smaller which I don't understand...

@lovell Do you have an idea how this could be fixed?
@rpav I don't understand your workaround. Could you give a small code sample?

Thanks!

@rpav
Copy link
Author

@rpav rpav commented Aug 27, 2020

@willemmulder Oh right, I forgot to mention here, I posted my set of tile utilities. This is a pretty quick hack job and doubtless some pretty crappy JS, but you can see here for example, I'm literally doing a naive/simple pixel-by-pixel copy from one buffer to the other. For my current set of ~1500-is images, this takes ~60ms out of ~5s.

The real solution to this I think would be as noted above, including embed or insert rather than just composite. But at this point it's not a priority for me, because other things are considerably slower (loading takes 80% of my runtime!).

@willemmulder
Copy link

@willemmulder willemmulder commented Aug 28, 2020

@rpav Thanks! From what I see, the part of your util that you mentioned then is mainly meant to copy buffers? Which would enable me to do several .composite() operations in sequence, using a series of nested sharp(previous.raw().toBuffer(), {raw:info}) etc but in a more efficient manner. Right? Or am I missing the point here?

@rpav
Copy link
Author

@rpav rpav commented Aug 28, 2020

No, this doesn't change the viability of composite(), it just does the operation I was doing with composite by hand, which was "copy pixels from one buffer to another buffer". In my case, specifically, this is a simple data copy/overwrite rather than composite, which can combine/blend/etc.

@willemmulder
Copy link

@willemmulder willemmulder commented Aug 28, 2020

No, this doesn't change the viability of composite(), it just does the operation I was doing with composite by hand, which was "copy pixels from one buffer to another buffer". In my case, specifically, this is a simple data copy/overwrite rather than composite, which can combine/blend/etc.

Ah I see 👍 Would it be possible to somehow use that for stitching together 20x20 small tiles into one big image? If so, I can probably figure out the rest myself...

@lovell
Copy link
Owner

@lovell lovell commented Feb 16, 2022

Commit c620025 should improve the performance and accuracy of multi-image compositing. This will be in v0.30.2 (a patch release as the previous accuracy could well be considered a bug).

Please note that #1580 remains as a future possible enhancement.

@tgrajewski
Copy link

@tgrajewski tgrajewski commented Feb 17, 2022

Is the stack overflow from issue #1708 also fixed now?

@lovell
Copy link
Owner

@lovell lovell commented Feb 17, 2022

@tgrajewski It's likely, yes, and you're welcome to test locally using npm install --build-from-source lovell/sharp

@lovell
Copy link
Owner

@lovell lovell commented Mar 2, 2022

v0.30.2 now available with this performance improvement.

Please remember that #1580 still exists as a future possible enhancement for joining/stitching non-overlapping images.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants