New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to improved vips_resize, requires libvips v8.3.1+ #310

Closed
jcupitt opened this Issue Nov 19, 2015 · 48 comments

Comments

3 participants
@jcupitt
Copy link

jcupitt commented Nov 19, 2015

Hi all, I've been looking at vips_resize(), the vips function that sharp uses for downsizing, and I think I've maybe found a couple of things which can improve quality, especially for small downsizes. Would anyone be able to help test the changes?

Code changes in this commit:

jcupitt/libvips@8506ff1

There's a test image here, plus sample shrinks from vipsthumbnail and convert. Put the images in separate tabs and try flipping between them.

http://www.rollthepotato.net/~john/skin/

The test image is an interesting one since it combines fine, high-contrast detail in the hair, plus a strong skin texture (I hope the subject doesn't mind me saying that). The downsize needs to minimise aliasing in the hair while retaining texture in the skin.

The old vips_resize() avoided anti-aliasing nicely, but had the effect of almost airbrushing the skin. The three changes in the patch:

  • reduce the strength of the anti-alias filter slightly
  • does not do the final sharpening pass if we skipped the anti-alias filter
  • fixes a bug where the sharpening would be skipped if there was no box filter

Any thoughts? Would someone be able to compare to photoshop? I remember we had an interesting discussion the last time we went over this code.

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented Nov 19, 2015

Hi John, sharp doesn't currently use vips_resize 😱 although I'd very much like it to (e.g. #299).

One thing that prevents sharp going all-in is the need to support resizing without maintaining aspect ratio. (which _shrink and _affine both support). This could be as simple as something like sharp's approach of averaging the x and y factors - see https://github.com/lovell/sharp/blob/master/src/pipeline.cc#L536

@lovell lovell added the question label Nov 19, 2015

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Nov 19, 2015

You could check your resize code against these changes, I guess. I think you are using a pretty similar algorithm, is that right?

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Nov 19, 2015

I'd forgotten about that request, sorry. I added non-square resizes to vips_resize():

jcupitt/libvips@2620f3b

It'll be poorer quality for non-square, but perhaps that doesn't matter.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Nov 19, 2015

I had a look through pipeline.cc and it looks like these changes aren't going to be relevant to sharp, you've already turned down the anti-alias filter. I'll close.

@jcupitt jcupitt closed this Nov 19, 2015

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented Nov 19, 2015

I feel this is still relevant to sharp. Your changes to vips_resize make it behave much more like the current (guesstimated) level of anti-aliasing.

If it's OK with you I'd like to reopen and make this the task to migrate sharp to using vips_resize.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Nov 19, 2015

Sure, reopen away. It's at least a useful test image.

Here's the previous resize quality issue: #121

It had these two other useful test images: wet hair, featuring a lot of high-contrast edges and some skin texture, though less than the one above, plus ghibli, which has tricky gently-angled lines.

http://www.rollthepotato.net/~john/wet-hair/
http://www.rollthepotato.net/~john/ghibli/

I've added new images for these new settings. wet-hair looks about the same, but the ghibli is a bit better.

@jcupitt jcupitt reopened this Nov 19, 2015

@lovell lovell changed the title help with image downsize quality improvement Switch to vips_resize (requires libvips 8.2.0) Nov 19, 2015

@lovell lovell added enhancement and removed question labels Nov 19, 2015

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented Nov 19, 2015

Thanks John.

On a related note, I've been experimenting with the removal of the anti-alias filter for the linear(-ish) interpolators - see https://github.com/lovell/sharp/blob/look/src/pipeline.cc#L549 - sample outputs at https://github.com/lovell/sharp/tree/look/test/interpolators

This is due to the proposed switch, in the next release of sharp, to a default interpolator setting of bicubic (from bilinear). Avoiding the convolution step with bilinear resizing is ~20% faster and still produces fairly good results (esp. if we assume that using bilinear is itself "good enough").

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Nov 19, 2015

Yes, vipsthumbnail has been bicubic for a while. I think you still need the anti-alias filter and the sharpen, even with bicubic.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Nov 30, 2015

A small note, if it helps someone.

  1. Word thumbnail in resize functions is often used to say, that method is optimized for minimal output size. Than can include ICC profile shrinking and is good only for small images.
  2. It's better to split downsizing to 2 clear steps
    • resize (interpolation filter can be selected according to quality and speed requirements)
    • sharpening (usually "unsharp mask" is used, but more advanced metods are also possible)
@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Dec 1, 2015

Hi @puzrin, I agree on 1.

On 2., some degree of sharpening can be part of the interpolation algorithm. vips_resize() does box filter + anti-alias filter + bicubic + sharpening, and the final (rather mild) sharpening step is really part of the interpolation: it produces the negative lobes of the lanczos that's being approximated.

I think I'd agree that any extra sharpening should be possible but optional. vips has quite a nice cored sharpen which could be useful:

http://www.vips.ecs.soton.ac.uk/supported/current/doc/html/libvips/libvips-convolution.html#vips-sharpen

I think sharp has used this before.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Dec 1, 2015

@jcupitt i've rereaded Nicolas's articles & IM forum topics, also digged sources a bit. Seems i was not right. Pre-plur and post-USM technics are both correct.

In this case Intel's article can be interesting for you for fast blur computation.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Dec 1, 2015

That's an interesting article @puzrin.

vips already does something like what Intel do. vips generates a separable convolution matrix from the sigma of the gaussian, then (at run time) writes a short AVX program which implements exactly that convolution on exactly that image, baking in things like pixel stride and scale. Intel's thing would probably win for large radius blurs, but the vips one should be quicker for small radius.

I'm exaggerating a bit :-( vips uses orc for the run-time code generation, and memory management issues inside orc have been the cause of a lot of crashes for us. I'm not sure if sharp even has this turned on any more :-( There's a new orc with at least some of this fixed, things might be getting better.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Dec 1, 2015

@jcupitt i meaned, vips use classic convolver, with complecity depending on blur radius (sigma + precision). Inte's article use IIR filter with 4 simple passes total (forward/back for horizontal/vertical).

PS. For fun: here you can play with javascript demo of IIR blur, and here you can play with unsharp mask.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Dec 1, 2015

I understand, but the vips one will still be quicker for small radius. The anti-alias blur radius is very small, only three pixels across, so vips can do it in only a couple of instructions.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Dec 1, 2015

@jcupitt if we speak about quality, could you explain why vips does not implements Lanczos (3 lobes) filter? Speed reasons? Any suggested replacements? AFAIK lanczos3 is the most popular default.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Dec 1, 2015

Yes, vips tries to implement a fast approximation of lanczos2. This was the IM default, have they changed it? I should check.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Dec 1, 2015

In ubuntu 14.04lts with ImageMagick 6.7.7-10:

convert -size 100x100 xc:  -set option:filter:verbose 1 -resize 50% null: | grep '^#'
# Resize Filter (for graphing)
#
# filter = SincFast
# window = SincFast
# support = 3
# window-support = 3
# scale-blur = 1
# practical-support = 3

That's lanczos3

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented Dec 1, 2015

"I'm not sure if sharp even has this turned on any more :-( There's a new orc with at least some of this fixed, things might be getting better. "

The pre-compiled version of libvips v8.1.1 used by the latest v0.12.0 of sharp for 64-bit Linux systems includes liborc 0.4.24. No complaints yet :)

@puzrin

This comment has been minimized.

Copy link

puzrin commented Dec 7, 2015

No complaints yet :)

0.12.0 segfaults :(, new liborc did not helped.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Mar 10, 2016

Yes, that's the current one. This branch is doing:

  1. vertical 1D box shrink
  2. horizontal 1D box shrink
  3. anti-alias, sigma = ((1.0 / residual) - 0.5) / 2.5, only blur if residual < 0.9 && sigma > 0.5
  4. vertical 1D lanczos3
  5. horizontal 1D lanczos3

1 and 3 vectorise well, 2, 4, and 5 do not :( But performance is nicely up from the previous best thanks to no longer needing a line cache.

vips 8.2 with a 10k x 10k RGB jpg:

$ time vipsthumbnail wtc.jpg -s 2000
peak rss: 52m
real    0m0.698s
user    0m1.536s
sys    0m0.044s

And with this branch:

$ time vipsthumbnail wtc.jpg -s 2000
peak rss: 40m
real    0m0.450s
user    0m0.792s
sys    0m0.020s
@puzrin

This comment has been minimized.

Copy link

puzrin commented Mar 10, 2016

@jcupitt thanks for clear equations, that will save me some time. Could you clarify some terms?

  • "Box shrink" - what is it?
  • "residual" - downscale ratio? 1000x1000px -> 500x500px, residual = 2?
@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Mar 10, 2016

By box shrink, I just mean a simple integer block average. The residual is the left-over fractional part.

For example, to shrink a 1000 x 1000 pixel image down to 128 x 128, you would first do a 7 x 7 box shrink to get a 142 x 142 image, then lanczos3 with a residual of (128 / 142), about 0.9, to get exactly to 128 x 128.

If the residual is small, you will get nasty aliasing due to undersampling, so you need to have an anti-alias filter between the block shrink and the lanczos3.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Mar 11, 2016

Hm... are there any references available about quality of such 2-stage algorythms (~mipmaping + presize downscale). Nicolas did not mentioned anything similar in his articles on imagemagick site. I've seen such technics somewhere, usually with 2^n preliminary steps, but don't know anything about quality measurements.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Mar 11, 2016

It's just an optimisation. I think Nicolas takes it as a given.

The "correct" way to do a 8:1 three-lobe Lanczos downsample would be to compute the full 2D kernel for each output pixel. This would have a 7 x 7 centre section which was roughly gaussian, then maybe three pixels of negative each side, another three positive and a final three negative. You'd be computing about a 25 x 25 point mask for each output pixel, so that's ~700 multiply-add for each one, with very poor locality and no opportunities for vectorisation. And that's not including mask generation costs.

The box filter is separable and can be vectorised, and box + blur will approximate the centre part of the large mask. The third pass, which does two simple fixed 6-point lanczos, adds the other lobes reusing the results of the nearby averaging operations. Total cost ~61 multiply-add, at least 10 times faster for almost the same result.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Mar 11, 2016

You'd be computing about a 25 x 25 point mask for each output pixel, so that's ~700 multiply-add for each one, with very poor locality and no opportunities for vectorisation. And that's not including mask generation costs.

??? lanczos3 is separable. It's vectorizable too (see skia src). Difference with Box is ~ 6 times (3x due window size + 2x due memory access for precomputed filters). Filters setup make sense, but not much.

Probably, not vectorizable are Nicolas's modifications like LanczosEWA and others.

It's just an optimisation. I think Nicolas takes it as a given.

IMHO, that's not the same from math point of view (i don't say it's bad or will not work at all). That's why i asked if someone else experimented with such things and did quality compare.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Mar 11, 2016

Vectorizable with gcc, I should say. I should maybe have a go with orc.

Lanczos is separable in that you can do two 1D lanczos, but you won't get the same result as one 2D lanczos. The two-pass version will be equal to a one-pass version with a rather "squarish" mask. You can see the effect for example in stepping artefacts:

http://www.imagemagick.org/Usage/resize/#distort_resize

It's close, but not exact. Sadly only a few kernels are truly separable. Moving more of the calculations into a block filter reduces the differences.

(I'm not a mathematician at all, so corrections please!)

@puzrin

This comment has been minimized.

Copy link

puzrin commented Mar 11, 2016

Lanczos is separable in that you can do two 1D lanczos, but you won't get the same result as one 2D lanczos.

Let's be precise, kernel can be separable or not separable. No intermediate states :) .

The two-pass version will be equal to a one-pass version with a rather "squarish" mask. You can see the effect for example in stepping artefacts:

That example is for upscale, while we speak about downscale. Lanczos is not ideal for upscale at all.

http://www.imagemagick.org/Usage/distorts/#resize, see note, not sure it's a "classic" lanczos

The real difference in the above two images is that the Distort Operator uses a two dimensional Elliptical Area Resampling filter method (also known as cylindrical filtering or resampling) for its image processing.


Moving more of the calculations into a block filter reduces the differences.

We can do any math we wish, but question is - how to qualify result and will it continue to be predictable? If we implement known methods, then we can refer to theory + past measurements, done by other people.

@puzrin

This comment has been minimized.

Copy link

puzrin commented Mar 17, 2016

Last days we polished image uploading logic in our project & experimented with sharp. My opinion is that results of sharp 0.13.1 are far from perfect should be improved significantly :)

Resize by sharp:

sharp-0-13-1

Resize by pica (lanczos3, without unsharp mask):

pica

Sizes are a bit different, because i grabbed result from the screen. By i think difference if obvious.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented Mar 18, 2016

https://github.com/jcupitt/libvips/issues/404 has some samples of vips with a new lanczos3 downsizer, it should be in 8.3 and should look better.

@lovell lovell changed the title Switch to vips_resize (requires libvips 8.2.0) Switch to improved vips_resize, requires libvips v8.3.0 Mar 18, 2016

@lovell lovell changed the title Switch to improved vips_resize, requires libvips v8.3.0 Switch to improved vips_resize, requires libvips v8.3.1+ May 8, 2016

@lovell lovell added this to the v0.15.0 milestone May 15, 2016

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented May 15, 2016

The switch to use the new vips_reducev and vips_reduceh operations (defaulting to a Lanczos 3 kernel) is now available on the outfit branch and will be in v0.15.0.

npm install lovell/sharp#outfit
@lovell

This comment has been minimized.

Copy link
Owner

lovell commented May 18, 2016

This is now in master awaiting release as part of v0.15.0. Feedback welcome, as always.

Next step: run benchmarks and update results.

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented May 21, 2016

It looks like resize performance in sharp v0.15.0 (libvips v8.3.1) will be between 10% and 20% faster than previous versions! CPU cache size seems to have the greatest impact on any improvements.

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented May 21, 2016

v0.15.0 now available via npm; enjoy.

@lovell lovell closed this May 21, 2016

@puzrin

This comment has been minimized.

Copy link

puzrin commented May 21, 2016

5740c36d7becc062289db8e9_sm

Hm... don't see quality improvment after upgrade to 0.15. Should i change any resize options?

@puzrin

This comment has been minimized.

Copy link

puzrin commented May 21, 2016

Should i create a new ticket about resize quality? May be i do something wrong - details after resize to small size are auful. I've tried to increase jpeg quality 75% -> 95% - nothing changed.

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented May 21, 2016

@puzrin Yes, please create a new issue with sample images that we can use to fine-tune. Please can you also try with gamma() as this disables the use of JPEG shrink-on-load.

@puzrin

This comment has been minimized.

Copy link

puzrin commented May 26, 2016

It seems, separate ticket not needed. The main problem was jpeg quality param in 0.14.x, that disappeared in 0.15. Other things can be subjective and not enougth to ask for your attention.

https://github.com/nodeca/sharp-resize-test - that's a test repo to quickly check images from different resizers.

@lovell

This comment has been minimized.

Copy link
Owner

lovell commented May 26, 2016

"jpeg quality param in 0.14.x, that disappeared in 0.15"

The quality() option is still there.
http://sharp.dimens.io/en/stable/api/#qualityquality
https://github.com/lovell/sharp/blob/master/test/unit/io.js#L286-L299

Your resize-test repo looks rather handy, thank you for sharing.

@puzrin

This comment has been minimized.

Copy link

puzrin commented May 26, 2016

The quality() option is still there.

I mean in v0.14 passing quality 95% instead of 75% did not improved things in our application. I don't know why. In 0.15 quality difference is visible. Since 0.15 works as expected, we did nod investigated what's up with 0.14 - not actual anymore.

@jcupitt

This comment has been minimized.

Copy link

jcupitt commented May 26, 2016

Since 7.42, vips has disabled chroma subsampling on jpeg save if Q >= 90, could that be part of it?

@puzrin

This comment has been minimized.

Copy link

puzrin commented May 26, 2016

I'd say with Q = 75 quality in 0.15 is better too. At least don't see those big artifacts as before.

To summarize things - previews in our app (nodeca) looked really bad before, but now those are ok. If i notice something strange with ordinary use - i will let you know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment