Why is libvips quick

John Cupitt edited this page Apr 30, 2017 · 1 revision

We have a How it works page which goes into some detail, but briefly:

  • Fully demand-driven: libvips doesn't process entire images. Instead, images are pulled through your computer as a series of small regions. Because only a few regions are in memory at any one time, this reduces memory use.

  • Fast operations: The libvips primitives are implemented carefully and some use techniques like run-time code generation. The convolution operator, for example, will examine the matrix and the image and at run-time write a short SSE3 program to implement exactly that convolution on exactly that image.

  • Threaded image input-output (IO) system: Most image processing libraries put the threading inside operations, so each operation has code, generally using a framework like OpenMP, to run itself over all the available processors. libvips instead puts the threading into the image IO system and gives each thread a separate (very light-weight) copy of the entire image pipeline to work on. This style of horizontal threading makes better use of processor caches and reduces locking.

  • Overlapped input and output: libvips is able to run the load, process and save parts of a program in parallel, even though most image format libraries are single-threaded. It uses a set of threads for input and processing (which queue up on the load library), plus an extra background write-behind thread which runs whenever a set of output scanlines is completed.

  • libvips is (almost) tile-less: Most image processing systems split images into tiles for processing, that is, non-overlapping areas of pixels which can be cached and reused. Ensuring tiles do not overlap forces threads to continually synchronise, and tile edges need special treatment. libvips instead uses regions: rectangular areas of images which can overlap. This removes a lot of housekeeping. It has a set of rules to try to keep region overlap (and therefore recomputation) to a minimum.

  • libvips is (almost) lock-less: Threads need to talk to each other to coordinate their work. On systems with a large number of processors this can become very expensive. libvips has only one mutex on file read and one on file write. The rest of the system does not need any locking or synchronisation.

  • Variety of pixel formats: libvips supports 10 pixel formats, from 8-bit unsigned to 128-bit complex, and almost all operations can work on any format. This means that it can usually process data directly with no need to repack to another format for computation.
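
The demand-driven and region ideas above can be illustrated with a toy model. The sketch below is plain Python and does not use libvips or its API; the names `source`, `invert`, `hblur3` and `render` are hypothetical and exist only for this illustration. Each stage answers requests for a rectangle of pixels by asking the stage before it for whatever area it needs. The blur asks for a slightly wider rectangle than it was asked for, so neighbouring requests overlap, much as regions are described as doing above.

```python
# Toy model of demand-driven evaluation. This is NOT libvips code: all names
# here are made up for illustration.
from typing import Callable, List

Region = List[List[int]]  # rows of 8-bit pixel values
# A node answers a request (left, top, width, height) with that rectangle.
Node = Callable[[int, int, int, int], Region]


def source(pixels: Region) -> Node:
    """Leaf node: serves rectangles from memory (stands in for a file loader)."""
    def fetch(left: int, top: int, width: int, height: int) -> Region:
        return [row[left:left + width] for row in pixels[top:top + height]]
    return fetch


def invert(upstream: Node) -> Node:
    """Point operation: needs exactly the requested rectangle from upstream."""
    def fetch(left: int, top: int, width: int, height: int) -> Region:
        block = upstream(left, top, width, height)
        return [[255 - p for p in row] for row in block]
    return fetch


def hblur3(upstream: Node, image_width: int) -> Node:
    """3-tap horizontal mean: needs one extra pixel on each side, so the
    rectangles it requests for neighbouring outputs overlap."""
    def fetch(left: int, top: int, width: int, height: int) -> Region:
        lo = max(left - 1, 0)
        hi = min(left + width + 1, image_width)
        block = upstream(lo, top, hi - lo, height)
        out = []
        for row in block:
            def at(x: int) -> int:
                # Clamp to the image edge, then index into the fetched block.
                x = min(max(x, 0), image_width - 1)
                return row[x - lo]
            out.append([(at(x - 1) + at(x) + at(x + 1)) // 3
                        for x in range(left, left + width)])
        return out
    return fetch


def render(node: Node, width: int, height: int, tile: int = 4) -> Region:
    """Sink: pulls the image through the pipeline one small rectangle at a time."""
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            w = min(tile, width - left)
            h = min(tile, height - top)
            for dy, row in enumerate(node(left, top, w, h)):
                out[top + dy][left:left + w] = row
    return out


# A small synthetic 10x6 image and a two-stage pipeline built on top of it.
image = [[(x * 7 + y * 13) % 256 for x in range(10)] for y in range(6)]
pipeline = hblur3(invert(source(image)), image_width=10)

small_pieces = render(pipeline, 10, 6, tile=4)
one_piece = render(pipeline, 10, 6, tile=100)
print(small_pieces == one_piece)
```

Nothing is computed until `render` asks for pixels, and no stage ever holds more than the rectangle it was asked for plus its margin. Pulling the image in small pieces gives the same pixels as pulling it in one piece. The cost is that the one-pixel margins are fetched more than once, which is the recomputation the region rules mentioned above try to keep small.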
