Add support for webp thumbnailing #7586

WGH- · 2020-05-27T23:14:45Z

I tested this by editing a file on my server directly inside the virtualenv; it seems to work fine.

WGH- · 2020-05-27T23:38:23Z

On the other hand, it might make sense to generate webp previews for webp images. This one also works fine. What do you think?

diff --git a/synapse/config/repository.py b/synapse/config/repository.py
index 9d2ce2022..88c37d0fe 100644
--- a/synapse/config/repository.py
+++ b/synapse/config/repository.py
@@ -68,8 +68,10 @@ def parse_thumbnail_requirements(thumbnail_sizes):
         height = size["height"]
         method = size["method"]
         jpeg_thumbnail = ThumbnailRequirement(width, height, method, "image/jpeg")
+        webp_thumbnail = ThumbnailRequirement(width, height, method, "image/webp")
         png_thumbnail = ThumbnailRequirement(width, height, method, "image/png")
         requirements.setdefault("image/jpeg", []).append(jpeg_thumbnail)
+        requirements.setdefault("image/webp", []).append(webp_thumbnail)
         requirements.setdefault("image/gif", []).append(png_thumbnail)
         requirements.setdefault("image/png", []).append(png_thumbnail)
     return {
diff --git a/synapse/rest/media/v1/thumbnailer.py b/synapse/rest/media/v1/thumbnailer.py
index c234ea742..b7c3cbde1 100644
--- a/synapse/rest/media/v1/thumbnailer.py
+++ b/synapse/rest/media/v1/thumbnailer.py
@@ -34,7 +34,7 @@ EXIF_TRANSPOSE_MAPPINGS = {

 class Thumbnailer(object):

-    FORMATS = {"image/jpeg": "JPEG", "image/png": "PNG"}
+    FORMATS = {"image/jpeg": "JPEG", "image/png": "PNG", "image/webp": "WEBP"}

     def __init__(self, input_path):
         self.image = Image.open(input_path)

erikjohnston · 2020-05-29T14:41:04Z

> On the other hand, it might make sense to generate webp previews for webp images. This one also works fine. What do you think?

Given that some devices can't render webp, I think it makes sense for thumbnails to be in a more widely supported format.
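To make the trade-off concrete, here is a minimal sketch of the kind of mapping being discussed. It assumes a simplified stand-in for `parse_thumbnail_requirements`: the name `build_requirements`, the namedtuple shape, and the choice to give WebP sources JPEG thumbnails are illustrative assumptions, not the merged code.

```python
from collections import namedtuple

# Simplified stand-in for Synapse's ThumbnailRequirement (assumed shape).
ThumbnailRequirement = namedtuple(
    "ThumbnailRequirement", ["width", "height", "method", "media_type"]
)


def build_requirements(thumbnail_sizes):
    """Map each source content type to the thumbnails to generate for it."""
    requirements = {}
    for size in thumbnail_sizes:
        width, height, method = size["width"], size["height"], size["method"]
        jpeg = ThumbnailRequirement(width, height, method, "image/jpeg")
        png = ThumbnailRequirement(width, height, method, "image/png")
        requirements.setdefault("image/jpeg", []).append(jpeg)
        # Hypothetical choice: WebP sources get JPEG thumbnails for compatibility.
        requirements.setdefault("image/webp", []).append(jpeg)
        requirements.setdefault("image/gif", []).append(png)
        requirements.setdefault("image/png", []).append(png)
    return requirements


sizes = [{"width": 32, "height": 32, "method": "crop"}]
reqs = build_requirements(sizes)
print(reqs["image/webp"][0].media_type)
```

Switching the WebP line to append a WebP requirement instead would give the alternative behaviour proposed earlier in the thread.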

changelog.d/7586.feature

erikjohnston · 2020-05-29T14:44:20Z

This looks good. I don't suppose you've looked into whether it's easy to add tests for this somehow?

WGH- · 2020-05-29T14:46:57Z

> This looks good. I don't suppose you've looked into whether it's easy to add tests for this somehow?

I looked around a little bit, and didn't find tests that checked all the supported formats. IIRC there were only some tests with small hardcoded PNG files. I can try to refactor these tests, and add WebP there.

erikjohnston · 2020-05-29T14:54:47Z

> > This looks good. I don't suppose you've looked into whether it's easy to add tests for this somehow?

> I looked around a little bit, and didn't find tests that checked all the supported formats. IIRC there were only some tests with small hardcoded PNG files. I can try to refactor these tests, and add WebP there.

Ah. It might be easier to make a new set of tests that poke some of the functions directly? I haven't looked at that bit of the code in a while so don't have any good suggestions off the top of my head. Alternatively there might be tests in sytest?

Anyway, it'd be fab if you could have a look, and feel free to pop into #synapse-dev:matrix.org if you need some pointers. Otherwise, if it turns out to be a royal pain then perhaps we forget about adding a test for now and instead just file an issue to write some general tests for thumbnailing 🙂

WGH- · 2020-06-03T20:33:41Z

I modified the tests to check that webp is also thumbnailed. Previously, there were only png thumbnail tests that checked that the result is bytewise identical to the expected one. Since I doubt that jpeg encoders are guaranteed to be deterministic, I decided to avoid the comparison in the webp test, so it only ensures that the thumbnail exists and some valid image is returned.
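Checking only that "some valid image is returned" can be approximated without a decoder. The following is a minimal stdlib-only sketch, assuming that sniffing the file signature is an acceptable proxy. `sniff_image_type` is a hypothetical helper, not something in this PR, and a matching signature does not prove that the file decodes.

```python
def sniff_image_type(data: bytes):
    """Return a content type guessed from the leading bytes, or None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


jpeg_like = b"\xff\xd8\xff\xe0" + b"\x00" * 16
webp_like = b"RIFF\x1a\x00\x00\x00WEBPVP8 "
print(sniff_image_type(jpeg_like))
print(sniff_image_type(webp_like))
```

A test built on this would only assert the returned content type, so it stays stable across encoder versions.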

erikjohnston · 2020-06-04T09:41:01Z

Awesome, thank you! I wonder if we should thumbnail them as PNGs just to make the thumbnailing deterministic?

erikjohnston · 2020-06-04T09:42:01Z

tests/rest/media/v1/test_media_storage.py

+                None,
+                None,


Suggested change

-                None,
-                None,
+                None,  # Thumbnailing may (?) be non-deterministic, so we just check for success not exact output.
+                None,

WGH- · 2020-06-04

Trying to explain it leads to the obvious question: "why is PNG thumbnailing deterministic, then?" I'd simply omit the self-contradictory explanation of this somewhat broken test altogether, since we're planning on fixing it soon (yes, I'm planning to do that).

erikjohnston · 2020-06-05

Fair enough. Now that we have documented what None means above, I think it's fine to skip this 👍

erikjohnston

It would be good just to add some comments to the parameterised stuff, like below, just to make it easier to see what's going on. Otherwise this is good to go 🎉

erikjohnston · 2020-06-04T09:51:06Z

tests/rest/media/v1/test_media_storage.py

+_TestImage = namedtuple(
+    "_TestImage",
+    ["data", "content_type", "extension", "expected_cropped", "expected_scaled"],
+)


Suggested change

-_TestImage = namedtuple(
-    "_TestImage",
-    ["data", "content_type", "extension", "expected_cropped", "expected_scaled"],
-)
+import attr
+from typing import Optional
+
+
+@attr.s
+class _TestImage:
+    """An image for testing thumbnailing with the expected results
+
+    Attributes:
+        data: The raw image to thumbnail
+        content_type: The type of the image as a content type, e.g. "image/png"
+        extension: The extension associated with the format, e.g. ".png"
+        expected_cropped: The expected bytes from cropped thumbnailing, or None if
+            test should just check for success.
+        expected_scaled: The expected bytes from scaled thumbnailing, or None if
+            test should just check for a valid image returned.
+    """
+
+    data = attr.ib(type=bytes)
+    content_type = attr.ib(type=bytes)
+    extension = attr.ib(type=bytes)
+    expected_cropped = attr.ib(type=Optional[bytes])
+    expected_scaled = attr.ib(type=Optional[bytes])

WGH- · 2020-06-04T14:00:17Z

> I wonder if we should thumbnail them as PNGs just to make the thumbnailing deterministic?

This is a good question. WebP can be either lossless or lossy, and I'd dare say those are two completely different algorithms.

This is mostly a thought experiment (I didn't run real tests), but my reasoning is as follows. I assume that lossy WebP is used where JPEG would be used (photo-like images), and lossless WebP where PNG would be used (text, screenshots, line art).

Lossless WebP -> PNG: ok
Lossless WebP -> JPEG: thumbnail looks poor, but is compressed okayish
Lossy WebP -> PNG: poor compression
Lossy WebP -> JPEG: ok

Since it's just a thumbnail, I think small size should be preferred over quality.

As for tests, to be frank, I dislike the idea of comparing images this way, even for PNGs. Although I doubt there are many ways to compress 1 white pixel with zlib, this approach is very fragile and will break if some larger image is put through it. The default compression level might change, the underlying library might even decide to add some extra PNG metadata chunks, etc.

For lossless conversion, I think the proper test would be to check the resulting file format and raw image data. For lossy conversion, well, in theory we could check that the resulting image is close enough to the expected result, but I think that would be an unnecessary complication, since we would duplicate the testing efforts of the underlying library (Pillow).
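The fragility of byte-for-byte comparison can be illustrated with zlib alone. This is a minimal sketch, assuming a flat byte string as a stand-in for raw pixel data; it does not build a real PNG, it only shows that the compression level changes the encoded bytes while the decoded data stays the same.

```python
import zlib

# Stand-in for raw pixel data: 64x64 RGB, solid white (an assumption for illustration).
pixels = b"\xff" * (64 * 64 * 3)

# Same input, two different compression levels.
fast = zlib.compress(pixels, 1)
best = zlib.compress(pixels, 9)

same_bytes = fast == best
same_pixels = zlib.decompress(fast) == zlib.decompress(best) == pixels
print(same_bytes, same_pixels)
```

Comparing the decoded data rather than the encoded bytes is the check that survives a change of compression settings.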

WGH- · 2020-06-04T14:04:01Z

Deterministic is probably the wrong word. Given the same settings and the same library versions, the result will be the same all the time. However, I doubt anyone guarantees byte-identical results across versions.

erikjohnston · 2020-06-04T14:08:13Z

Fair, I'm sold on all of that. I guess what we actually want to test is that the returned images are the right size, but 🤷‍♂️

WGH- · 2020-06-04T14:27:11Z

> Fair, I'm sold on all of that. I guess what we actually want to test is that the returned images are the right size, but 🤷‍♂️

Right. I think it's not very hard to do: we can create solid color-filled images with Pillow at runtime and thumbnail them in various ways.
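A size-based test needs the expected dimensions for each input. The following is a pure-Python sketch of that calculation for an aspect-preserving "scale" thumbnail. The helper name `scaled_size` and the integer rounding are assumptions for illustration; a real test would generate the source image with Pillow (for example via `Image.new`) and compare the thumbnail's `size` against these values.

```python
def scaled_size(width, height, max_width, max_height):
    """Largest size that fits in the box while keeping the aspect ratio."""
    if max_width * height < max_height * width:
        # Width is the limiting dimension.
        return max_width, (max_width * height) // width
    # Height is the limiting dimension (or the ratios match).
    return (max_height * width) // height, max_height


cases = [((640, 480), (32, 32)), ((480, 640), (32, 32)), ((100, 100), (96, 96))]
for (w, h), (mw, mh) in cases:
    print((w, h), "->", scaled_size(w, h, mw, mh))
```

Because the source images would be solid colour, such a test depends only on geometry and not on encoder output.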

Would it be fine to both change the tests in such a relatively substantial way AND introduce WebP support in one pull request?

erikjohnston · 2020-06-04T14:53:23Z

> > Fair, I'm sold on all of that. I guess what we actually want to test is that the returned images are the right size, but 🤷‍♂️

> Right. I think it's not very hard to do: we can create solid color-filled images with Pillow at runtime, and thumbnail them in various ways

Ooh, interesting. That sounds like it should work.

> Would it be fine to both change the tests in such a relatively substantial way AND introduce WebP support in one pull request?

I don't think we should block webp support on making our tests better. So long as we have a test that will explode if we completely break webp support (which we do), I'm happy.

Closes matrix-org#4382 Signed-off-by: Maxim Plotnikov <wgh@torlan.ru>

WGH- · 2020-06-04T21:16:45Z

Test 600 Outbound federation will ignore a missing event with bad JSON for room version 6... FAIL

Looks like some flaky test.

erikjohnston · 2020-06-05T10:32:00Z

> Test 600 Outbound federation will ignore a missing event with bad JSON for room version 6... FAIL

> Looks like some flaky test.

Yeah, looks like it annoyingly. Will re-run the test then merge! 🎉 Thanks for bearing with the back and forth on this 🙂

(Just as a small note for any future PRs: we'd prefer if you didn't rebase while we're reviewing to make it easier to track the changes you make, we'll squash merge it at the end anyway. It wasn't a problem in this PR though 🙂 )

WGH- · 2020-06-05T10:57:18Z

> (Just as a small note for any future PRs: we'd prefer if you didn't rebase while we're reviewing to make it easier to track the changes you make, we'll squash merge it at the end anyway. It wasn't a problem in this PR though 🙂 )

@erikjohnston what about PRs that logically consist of multiple commits? If I get the commit "separation" right from the start, I can push fixup commits, but what if the separation only becomes clear later on?

erikjohnston · 2020-06-05T10:59:33Z

Certainly I tend to do my PRs with many small commits rather than one big one, though I do squash merge at the end usually. If it makes sense we can do a normal merge, though typically if the commits are separate enough that they warrant being their own commits in the git history after merge they probably should have been separate PRs. There's no hard and fast rule here though

WGH- · 2020-06-05T11:06:55Z

> though typically if the commits are separate enough that they warrant being their own commits in the git history after merge they probably should have been separate PRs

If some new feature requires a bit of preparatory refactoring of existing code, I prefer to have the refactoring in a separate commit in the history. Having them separate will make future archaeology (git bisect, etc.) easier, and smaller patches are usually easier to review than one giant patch. Sure, the refactoring can be a separate PR, but without the new feature it will usually look unnecessary, so it makes sense to review and merge them together.

But that's just like my opinion :)

richvdh · 2020-06-05T13:24:41Z

> If some new feature requires a bit of preparatory refactoring of existing code, I prefer to have the refactoring in a separate commit in the history...

These things might be true, but they are outweighed by the fact that rebasing after a review has taken place takes the reviewer back to the drawing board, making it much harder to land the PR. Better to have the PR landed as one big commit than not landed at all because the reviewer didn't have time to figure out what changed in each round of force-pushes.

@dklimpel

…rg-hotfixes

Synapse 1.15.0rc1 (2020-06-09)
==============================

Features
--------

- Advertise support for Client-Server API r0.6.0 and remove related unstable feature flags. ([\#6585](#6585))
- Add an option to disable autojoining rooms for guest accounts. ([\#6637](#6637))
- For SAML authentication, add the ability to pass email addresses to be added to new users' accounts via SAML attributes. Contributed by Christopher Cooper. ([\#7385](#7385))
- Add admin APIs to allow server admins to manage users' devices. Contributed by @dklimpel. ([\#7481](#7481))
- Add support for generating thumbnails for WebP images. Previously, users would see an empty box instead of preview image. ([\#7586](#7586))
- Support the standardized `m.login.sso` user-interactive authentication flow. ([\#7630](#7630))

Bugfixes
--------

- Allow new users to be registered via the admin API even if the monthly active user limit has been reached. Contributed by @dkimpel. ([\#7263](#7263))
- Fix email notifications not being enabled for new users when created via the Admin API. ([\#7267](#7267))
- Fix str placeholders in an instance of `PrepareDatabaseException`. Introduced in Synapse v1.8.0. ([\#7575](#7575))
- Fix a bug in automatic user creation during first time login with `m.login.jwt`. Regression in v1.6.0. Contributed by @olof. ([\#7585](#7585))
- Fix a bug causing the cross-signing keys to be ignored when resyncing a device list. ([\#7594](#7594))
- Fix metrics failing when there is a large number of active background processes. ([\#7597](#7597))
- Fix bug where returning rooms for a group would fail if it included a room that the server was not in. ([\#7599](#7599))
- Fix duplicate key violation when persisting read markers. ([\#7607](#7607))
- Prevent an entire iteration of the device list resync loop from failing if one server responds with a malformed result. ([\#7609](#7609))
- Fix exceptions when fetching events from a remote host fails. ([\#7622](#7622))
- Make `synctl restart` start synapse if it wasn't running. ([\#7624](#7624))
- Pass device information through to the login endpoint when using the login fallback. ([\#7629](#7629))
- Advertise the `m.login.token` login flow when OpenID Connect is enabled. ([\#7631](#7631))
- Fix bug in account data replication stream. ([\#7656](#7656))

Improved Documentation
----------------------

- Update the OpenBSD installation instructions. ([\#7587](#7587))
- Advertise Python 3.8 support in `setup.py`. ([\#7602](#7602))
- Add a link to `#synapse:matrix.org` in the troubleshooting section of the README. ([\#7603](#7603))
- Clarifications to the admin api documentation. ([\#7647](#7647))

Internal Changes
----------------

- Convert the identity handler to async/await. ([\#7561](#7561))
- Improve query performance for fetching state from a PostgreSQL database. ([\#7567](#7567))
- Speed up processing of federation stream RDATA rows. ([\#7584](#7584))
- Add comment to systemd example to show postgresql dependency. ([\#7591](#7591))
- Refactor `Ratelimiter` to limit the amount of expensive config value accesses. ([\#7595](#7595))
- Convert groups handlers to async/await. ([\#7600](#7600))
- Clean up exception handling in `SAML2ResponseResource`. ([\#7614](#7614))
- Check that all asynchronous tasks succeed and general cleanup of `MonthlyActiveUsersTestCase` and `TestMauLimit`. ([\#7619](#7619))
- Convert `get_user_id_by_threepid` to async/await. ([\#7620](#7620))
- Switch to upstream `dh-virtualenv` rather than our fork for Debian package builds. ([\#7621](#7621))
- Update CI scripts to check the number in the newsfile fragment. ([\#7623](#7623))
- Check if the localpart of a Matrix ID is reserved for guest users earlier in the registration flow, as well as when responding to requests to `/register/available`. ([\#7625](#7625))
- Minor cleanups to OpenID Connect integration. ([\#7628](#7628))
- Attempt to fix flaky test: `PhoneHomeStatsTestCase.test_performance_100`. ([\#7634](#7634))
- Fix typos of `m.olm.curve25519-aes-sha2` and `m.megolm.v1.aes-sha2` in comments, test files. ([\#7637](#7637))
- Convert user directory, state deltas, and stats handlers to async/await. ([\#7640](#7640))
- Remove some unused constants. ([\#7644](#7644))
- Fix type information on `assert_*_is_admin` methods. ([\#7645](#7645))
- Convert registration handler to async/await. ([\#7649](#7649))

bkil · 2021-08-05T15:21:53Z

I agree that creating webp thumbnails would make more sense, and/or that thumbnailing should result in a smaller image than the source. See also: #2667

WGH- force-pushed the webp-thumbnail branch from f4b581a to c04bdfd Compare May 27, 2020 23:17

WGH- marked this pull request as ready for review May 27, 2020 23:38

clokep requested a review from a team May 28, 2020 11:13

erikjohnston reviewed May 29, 2020

WGH- force-pushed the webp-thumbnail branch 2 times, most recently from 24e1761 to ff010f2 Compare June 3, 2020 20:29

WGH- force-pushed the webp-thumbnail branch from ff010f2 to f475dc3 Compare June 3, 2020 20:57

erikjohnston reviewed Jun 4, 2020

Add support for webp thumbnailing

e900708

Closes matrix-org#4382 Signed-off-by: Maxim Plotnikov <wgh@torlan.ru>

WGH- force-pushed the webp-thumbnail branch from f475dc3 to e900708 Compare June 4, 2020 20:39

richvdh assigned erikjohnston Jun 5, 2020

erikjohnston approved these changes Jun 5, 2020

erikjohnston merged commit e55ee7c into matrix-org:develop Jun 5, 2020

juhovan mentioned this pull request Jun 13, 2020

Missing dependency for webp thumbnailing in Docker image #7694

Closed

phil-flex pushed a commit to phil-flex/synapse that referenced this pull request Jun 16, 2020

Add support for webp thumbnailing (matrix-org#7586)

99f9cc4

Closes matrix-org#4382 Signed-off-by: Maxim Plotnikov <wgh@torlan.ru>

clokep mentioned this pull request Jun 16, 2020

We should create JPG thumbnails of PNGs, not PNG thumbnails #2045

Open

clokep mentioned this pull request Jan 27, 2022

WEBPs are thumbnailed as JPEGs, losing transparency/animation #11840

Open

DMRobertson mentioned this pull request Aug 25, 2022

thumbnail byte-larger than original #2667

Open

clokep mentioned this pull request Jan 25, 2023

Thumbnail WebP images as WebP #14890

Open
