
Investigate merging attention's saliency features for "smart" crop #295

Closed
rightaway opened this issue Nov 10, 2015 · 43 comments

@rightaway

Often when images are shrunk to generate thumbnails through resizing or cropping, the resulting image doesn't look very good just because of the content and the dimensions of the original image. But if there was a way to generate 'smart' thumbnails that's based on the content of the image, it would allow for much better thumbnails. For example, http://29a.ch/sandbox/2014/smartcrop/examples/testsuite.html.

There's a JS library that implements this https://github.com/jwagner/smartcrop.js, would it be possible to offer similar functionality in sharp?

@lovell
Owner

lovell commented Nov 10, 2015

Hello, might https://github.com/lovell/attention provide what you're looking for?

@rightaway
Author

Interesting!

So is the idea that the return values from attention.region, which are top/left/bottom/right, would be passed to sharp.extract? I imagine not because it doesn't pay attention to the width and height provided by sharp.resize. Is it better then to focus the image on the focal point provided by attention.point?

How would you use the x/y coordinates returned by attention.point in sharp? Ideally if sharp.crop optionally took an x/y coordinate instead of gravity, it could automatically center the image there while still respecting the other values passed to sharp such as the resize width and height.

So for example something like sharp.resize(300, 200).crop(125, 36) would offer up an image that's 300x200 centered at the point 125, 36 in the original image, which would be fantastic!

@lovell
Owner

lovell commented Nov 10, 2015

You've got the idea. It's quite experimental and the performance could probably be improved. I might consider merging some of the features of attention into sharp after #152 but for now you'll have to "do the math" yourself :)
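The "do the math" step can be sketched like this: centre a fixed-size crop on a focal point such as the { x, y } value attention.point returns, clamping so the rectangle stays inside the image. The cropAroundPoint helper is a hypothetical illustration, not part of sharp's or attention's API:

```javascript
// Centre a cropWidth x cropHeight rectangle on a focal point, keeping
// it within the image bounds. Illustrative helper only.
function cropAroundPoint(imageWidth, imageHeight, cropWidth, cropHeight, point) {
  // Ideal top-left corner places the focal point at the crop's centre.
  var left = Math.round(point.x - cropWidth / 2);
  var top = Math.round(point.y - cropHeight / 2);
  // Clamp so the crop never extends beyond the image edges.
  left = Math.max(0, Math.min(left, imageWidth - cropWidth));
  top = Math.max(0, Math.min(top, imageHeight - cropHeight));
  return { left: left, top: top, width: cropWidth, height: cropHeight };
}

// A focal point near the top-left corner gets clamped back into frame:
var region = cropAroundPoint(1000, 800, 300, 200, { x: 125, y: 36 });
```

The resulting rectangle could then be passed to sharp's extract operation (whose exact signature has varied between sharp versions).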

@lovell lovell changed the title Support for 'smart' cropping Investigate merging attention's saliency features for "smart" crop Nov 23, 2015
@homerjam

Would love to see sharp get some attention 😆

@vlapo
Contributor

vlapo commented Dec 15, 2015

+1

@homerjam

homerjam commented Feb 5, 2016

Hi @lovell, just wondering if there was any update on this - I'm about to start using attention in my image processing workflow unless there's an integration on the near horizon? Could I help at all (guessing it's beyond a simple PR though)?

@lovell
Owner

lovell commented Feb 5, 2016

@homerjam This is still planned but with nothing implemented yet. It'd be great to learn which features of attention you find useful to help prioritise what gets added to sharp.

@homerjam

homerjam commented Feb 5, 2016

Cool, well not to worry I will press on with attention as it stands.

First up I'm going to be using attention to find the focal point of an image (coming soon on karinatwiss.com). But I would also find the palette finder really useful - there are lots of cases where I'd like to match the dominant colours (or calculate complementary/opposite colours) of images to background colours.

@puzrin

puzrin commented Feb 5, 2016

Finding the focal point to generate thumbnails is enough for me. I'd guess that's the most in-demand use case.

@jwagner

jwagner commented Feb 11, 2016

Actually I have some integration of smartcrop with sharp. I'll probably release it together with the next release. :)

@lovell
Owner

lovell commented Feb 11, 2016

@jwagner 👍

@lovell
Owner

lovell commented Mar 5, 2016

Commit 2034efc on the needle branch adds an experimental implementation of the entropy-based method suggested by @jcupitt in https://github.com/lovell/attention/issues/8

Here's an example of how you might use this to generate auto-cropped 200px square thumbnails using Streams:

    var transformer = sharp().resize(200, 200).crop(sharp.strategy.entropy);
    readableStream.pipe(transformer).pipe(writableStream);

Feedback very much welcome.

@puzrin

puzrin commented Mar 5, 2016

One question. Do I understand right that the scale can vary? It selects a region with the requested width/height ratio, crops it, and scales it down to the exact size. Correct?

@lovell
Owner

lovell commented Mar 5, 2016

@puzrin The image is resized so at least one dimension is correct, then the edges of the remaining dimension are repeatedly cropped until it too is correct. (I've added this feature in such a way that, in the future, we could also use it to auto-extract a target width and height.)
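The repeated edge-cropping can be sketched with a toy one-dimensional version, in which columnScores stands in for a per-slice entropy measure. This illustrates the strategy only; it is not sharp's actual implementation:

```javascript
// Repeatedly discard whichever edge column scores lower until the
// remaining width matches the target. Scores stand in for entropy.
function trimToWidth(columnScores, targetWidth) {
  var left = 0;
  var right = columnScores.length; // exclusive
  while (right - left > targetWidth) {
    // Drop the edge column that carries less information.
    if (columnScores[left] < columnScores[right - 1]) {
      left += 1;
    } else {
      right -= 1;
    }
  }
  return { left: left, width: right - left };
}

// The low-scoring right-hand columns are trimmed away first:
var kept = trimToWidth([0, 1, 5, 5, 1, 0, 0], 3);
```

The same trimming would run along whichever axis is still too large after the initial resize.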

@puzrin

puzrin commented Mar 5, 2016

Got it. Probably I don't understand how the API was intended to be used. Let me describe my task:

  1. I have images of any size as input. Let's take 3000x1500 as an example.
  2. I want a 170x150 thumbnail as output, with "the most valuable content" in it.

It would be nice to have a simple call for that. I expect this use case to be the most in demand.

@lovell
Owner

lovell commented Mar 5, 2016

@puzrin My thinking here is that we can add further "strategies" more suited to the use case you describe. These might be things like "skin tones", "edges", "contrast" etc.

As with the approach used in attention, these techniques require training with salient region datasets to calculate suitable thresholds.

The initial entropy-based strategy is more about removing the least valuable edges rather than keeping the most valuable/salient regions - I'll try to make the docs clearer - thanks for the feedback!

@puzrin

puzrin commented Mar 5, 2016

Thanks for the explanation. After thinking a bit, a fuzzy edge cut will probably be enough for my needs.

@jcupitt
Contributor

jcupitt commented Mar 5, 2016

Yes, I like the "trim boring edges" strategy; it seems like a simple, reliable way to cut an image down that should need little training.

Most photos will not have a very small detail that you want to cut out. It must be much more common to just want to handle off-centre compositions automatically.

@lovell
Owner

lovell commented Mar 5, 2016

It looks like something in libgobject is causing hist_entropy to segfault with an "Access violation" on Windows. Here's the backtrace:

    msvcrt!strchr+0x2
    libgobject_2_0_0!g_param_spec_pool_lookup+0x8d
    libgobject_2_0_0!g_object_class_find_property+0x1a
    libvips_42!vips_object_get_argument+0x35
    libvips_42!vips_object_set_valist+0xb4
    libvips_42!vips_call_required_optional+0x1fe
    libvips_42!vips_call_split+0x92
    libvips_42!vips_log+0x32
    libvips_42!vips_hist_ismonotonic+0x30b
    libvips_42!vips_object_build+0x19
    libvips_42!vips_cache_operation_buildp+0x48
    libvips_cpp!vips::VImage::call_option_string+0x185
    libvips_cpp!vips::VImage::hist_entropy+0xf7
    sharp!sharp::EntropyCrop+0x11a

I'll investigate.

@lovell
Owner

lovell commented Mar 5, 2016

@jcupitt Should hist_entropy.c#L83 be

    vips_log( t[0], &t[1], NULL )

instead of

    vips_log( t[0], &t[1], 1.0 / sum, 0, NULL )

?

@jcupitt
Contributor

jcupitt commented Mar 5, 2016

Ooops, yes, looks like a copy-paste error.

jcupitt added a commit to libvips/libvips that referenced this issue Mar 6, 2016
there was a copy-paste error in the call to vips_log(), thanks Lovell

see lovell/sharp#295
@jcupitt
Contributor

jcupitt commented Mar 6, 2016

I fixed it in 8.2 and master, and added a test for it. Thanks for spotting the dumbness @lovell!

@lovell
Owner

lovell commented Mar 7, 2016

@jcupitt Fantastic, thank you.

@lovell
Owner

lovell commented Mar 22, 2016

The release of libvips v8.2.3 with the hist_entropy fix means this is now working on Windows too - https://ci.appveyor.com/project/lovell/sharp/build/372/job/3guwidhfb7hm0t3d

@calebshay

I'm looking for something similar enough that I didn't think I should make a new issue: trimming whitespace around an image. I've implemented it before using the vips Ruby bindings, but sharp doesn't expose the vips methods I would need. Ruby implementation below. Note that this implementation assumes that the image is already RGB(A), and would need more smarts to handle other colour spaces.

    def trim(img)
      alpha = nil
      # Remove the alpha channel, if there is one, as it breaks mask creation
      if img.bands == 4
        alpha = img.extract_band(3)
        img = img.extract_band(0, 3)
      end
      mask = img.less(240)
      columns, rows = mask.project
      left = columns.profile_h.min
      right = columns.x_size - columns.fliphor.profile_h.min
      top = rows.profile_v.min
      bottom = rows.y_size - rows.flipver.profile_v.min
      # Put the alpha channel back in, if it had one
      img = img.bandjoin(alpha.clip2fmt(img.band_fmt)) if alpha
      img = img.extract_area(left, top, right - left, bottom - top)
      img
    end

@lovell
Owner

lovell commented Mar 30, 2016

@calebshay This discussion is more about strategies for dealing with cropping-when-resizing. What you describe sounds like automated image extraction so feel free to create a new feature request for this.

(I see the possibility of combining the two approaches in one pipeline, e.g. extract non-whitespace then resize+crop using entropy.)

@lovell
Owner

lovell commented Apr 2, 2016

The entropy-based cropping strategy is in v0.14.0, now available via npm, thanks for all the comments and help here. I'm going to leave this task open to track further additions/improvements from attention and similar modules.

@lovell lovell removed this from the v0.14.0 milestone Apr 2, 2016
@puzrin

puzrin commented Apr 2, 2016

It seems to behave strangely. I've tried to create cropped thumbnails for images with a clear left focus and a clear right focus. Those are detected well by smartcrop.js, but not by sharp 0.14 (with the new crop param).

@puzrin

puzrin commented Apr 27, 2016

@lovell is the previous explanation clear enough, or should I provide more info?

I used this demo to compare result https://29a.ch/sandbox/2014/smartcrop/examples/testbed.html.

@lovell
Owner

lovell commented Apr 27, 2016

@puzrin Thanks, I think we've got plenty of info at the moment. When this is revisited we can look at training/thresholding algorithms with data sets such as this.

@puzrin

puzrin commented Apr 27, 2016

Glad to hear it. My test case is cropping a 4:3 ratio image to 170x150 pixels (downscale + cut the left & right sides a bit). Your link has at least one image (with focus on the left) that's good for checking the algorithm. It should cut such images from one side only.

@puzrin

puzrin commented Jun 4, 2016

@lovell do you have any estimates/priorities for revisiting smartcrop feature?

@lovell
Owner

lovell commented Jun 4, 2016

Hello @puzrin, I still plan to revisit this feature to add "trained" crop strategies. This and #236 seem to be the most popular/requested/useful new features, so I'll look at both of these over the next few months.

@jwagner

jwagner commented Jun 25, 2016

A little update on integrating smartcrop with sharp. I have released smartcrop 1.0 along with smartcrop-sharp now. It's not super efficient right now as the image needs to be decoded twice (once for smartcrop, once for operating on it with sharp). But in practice it works quite well. :)

@homerjam

Oooooh lovely @jwagner, thanks!

@lovell
Owner

lovell commented Oct 3, 2016

Update/teaser:

The following graph shows image count (y-axis) against % error (x-axis) for the existing entropy-based crop strategy (dark blue) vs the attention-based strategy (green). Closer to the origin is closer to the "ground truth" and therefore better, so the attention-based approach is the relative winner in terms of accuracy.

(The MSRA Salient Object Database image set B was used as the source of "ground truth".)

The attention-based strategy is currently ~50% faster than entropy, typically adding <50ms to processing time, but work continues to fine-tune both accuracy and performance.

@jcupitt
Contributor

jcupitt commented Oct 4, 2016

That's a fantastic graph Lovell! Very nice work. I should look at your attention crop code.

@lovell lovell added this to the v0.16.1 milestone Oct 11, 2016
@lovell
Owner

lovell commented Oct 11, 2016

The attention branch adds experimental support for a crop "strategy" based on a slightly modified+simplified version of the original logic in the attention module.

    sharp(input).resize(200, 200).crop(sharp.strategy.attention)...

@lovell
Owner

lovell commented Oct 12, 2016

Commit 18b9991 adds this to the master branch ready for inclusion in v0.16.1.

@rightaway
Author

@lovell What would be the difference between using the original attention package with sharp, vs using sharp.strategy.attention?

In layman's terms what's the difference between the entropy and attention based strategies?

@homerjam

Not sure this qualifies as layman's terms but here's a bit of an explanation...

@lovell
Owner

lovell commented Oct 12, 2016

sharp will be getting an updated+improved version of the focal-point logic from attention, made available via crop() when resizing to fixed dimensions, which I believe is its most common/popular use case.

entropy ranks regions based on their vips_hist_entropy value, or "which bit of the image has the most energy?"

attention converts image regions to the LAB and LCH colourspaces and generates 3 masks:

  1. luminance frequency: edge detection on the L channel via the Sobel operator, or "which bit of the image has the biggest change in brightness?"
  2. colour saturation: include only pixels from the C channel of LCH where the value is >~50%, or "which bit of the image has the most saturated colour?"
  3. skin tones: include only pixels where AB chroma is within a range trained with http://humanae.tumblr.com/ , or "which bit of the image contains humans?"

...then adds them together and finds the maximum value to rank regions.
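As a toy illustration of that final step, per-pixel masks can be summed and the position of the maximum taken as the most salient point. Hand-made arrays stand in for the real luminance/saturation/skin-tone masks here; this sketches the idea rather than sharp's code:

```javascript
// Sum per-pixel masks (flat row-major arrays) and return the
// coordinates of the maximum combined value.
function salientPoint(masks, width) {
  var combined = masks[0].map(function (_, i) {
    return masks.reduce(function (sum, mask) { return sum + mask[i]; }, 0);
  });
  var best = 0;
  for (var i = 1; i < combined.length; i++) {
    if (combined[i] > combined[best]) best = i;
  }
  return { x: best % width, y: Math.floor(best / width) };
}

// 2x2 example: two of the three masks agree on the pixel at (1, 0).
var point = salientPoint([
  [0, 1, 0, 0], // luminance edges
  [0, 1, 0, 0], // colour saturation
  [0, 0, 1, 0]  // skin tones
], 2);
```

The real strategy ranks candidate crop regions by these combined values rather than a single pixel, but the mask-sum-argmax shape is the same.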

@lovell
Owner

lovell commented Oct 13, 2016

v0.16.1 now available via npm. Thanks everyone for your help with this!
