Convert images to webP by default #20896

Open
jmaris opened this issue Nov 16, 2022 · 43 comments
@jmaris

jmaris commented Nov 16, 2022

Pitch

When images are uploaded to the server, they should be converted to webp files for storage by default.

Motivation

The current approach of limiting size and heavily using JPEG compression allows us to limit the amount of data stored, as well as the amount of data cached by other instances. However, it has led to complaints concerning quality, such as #20255.

WebP is a relatively new image format but is already supported by all major browsers. It has a considerably better quality-to-size ratio, meaning its use could not only improve image quality but also further reduce the size of stored images.

This has two positive effects: it decreases the overall cost of operating a server, especially over time, and it leaves more space in the cache meaning more posts from other instances can be cached.
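
To make the pitch concrete, here is a minimal sketch of what such a conversion step could look like, using Pillow rather than Mastodon's actual media pipeline (which currently goes through Paperclip/ImageMagick, as discussed further down); the quality value is an arbitrary placeholder, not a tuned recommendation:

```python
from PIL import Image

def convert_upload_to_webp(src_path: str, dest_path: str, quality: int = 82) -> None:
    """Re-encode an uploaded image as lossy WebP for storage."""
    with Image.open(src_path) as im:
        if im.mode in ("RGBA", "LA", "P"):
            # Flatten any transparency onto white so lossy output stays predictable.
            im = im.convert("RGBA")
            background = Image.new("RGB", im.size, (255, 255, 255))
            background.paste(im, mask=im.split()[-1])
            im = background
        else:
            im = im.convert("RGB")
        im.save(dest_path, format="WEBP", quality=quality, method=6)

# convert_upload_to_webp("upload.jpg", "upload.webp")
```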

@jmaris jmaris added the suggestion (Feature suggestion) label Nov 16, 2022
@enn-nafnlaus

enn-nafnlaus commented Nov 16, 2022

As per #20255 :

I'd personally prefer AVIF. Now that Safari has added AVIF support, it's going to propagate pretty quickly (1/3rd of browsers have updated in just the 2 months since it was released), and it'd be months before this change to Mastodon would deploy anyway. Edge will be the only "major" browser without it, and I doubt for very long. It's an extra ~25% compression over WebP for a given quality.

That said: anything is better than this current image-handling system, so I'll vote for whatever's most popular!

Hmm, perhaps the best way to do it would be a serverside storage parameter to let the admin choose the storage format? Different servers would make different choices and we could get real-world comparative data, and servers could change their minds later for all new incoming images thereafter...

@afontenot
Contributor

Note that "legacy" JPEG support will probably be necessary for the foreseeable future, either as a storage format or with converting on the fly by nginx. With Safari, WebP support relies on OS capabilities, needing macOS 11 or later. macOS 11 dropped support for MacBooks released before 2015, which is not that old by current PC standards. Likewise, AVIF will probably only ever be deliverable to select devices even though the bandwidth savings are quite good. Edge still doesn't support it and Safari support is extremely limited at this point.

@enn-nafnlaus

enn-nafnlaus commented Nov 17, 2022

Note that "legacy" JPEG support will probably be necessary for the foreseeable future, either as a storage format or with converting on the fly by nginx. With Safari, WebP support relies on OS capabilities, needing macOS 11 or later. macOS 11 dropped support for MacBooks released before 2015, which is not that old by current PC standards. Likewise, AVIF will probably only ever be deliverable to select devices even though the bandwidth savings are quite good. Edge still doesn't support it and Safari support is extremely limited at this point.

Yes. As per the other thread (I'm not sure why a new one was started), imgproxy solves this. It autoconverts formats as needed for legacy clients. It's hard to say exactly what sort of load (disk space, CPU) imgproxy would add, but given the assumption that "new files are served frequently and old files rarely", probably quite insignificant compared to the savings.
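
For reference, here is a sketch of how a front end (or nginx) could build a signed imgproxy URL that requests WebP output, based on imgproxy's documented URL layout and HMAC-SHA256 signing; the host, key, salt and processing options are all made-up values, and in practice you'd probably let imgproxy negotiate the format from the Accept header instead of hard-coding the extension:

```python
import base64
import hashlib
import hmac

# Dummy values standing in for whatever is configured via IMGPROXY_KEY / IMGPROXY_SALT.
IMGPROXY_KEY = bytes.fromhex("deadbeef")
IMGPROXY_SALT = bytes.fromhex("cafebabe")

def imgproxy_url(source_url: str, extension: str = "webp", width: int = 1280) -> str:
    # Signed imgproxy path: /<signature>/<processing options>/plain/<source url>@<extension>
    path = f"/rs:fit:{width}:0/plain/{source_url}@{extension}".encode()
    digest = hmac.new(IMGPROXY_KEY, IMGPROXY_SALT + path, hashlib.sha256).digest()
    signature = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return f"https://imgproxy.example.com/{signature}{path.decode()}"

print(imgproxy_url("https://files.example.social/media/original.jpg"))
```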

I think there's a general architecture lining up here.

  • caching/CDN sits in front of imgproxy (optional)
  • imgproxy sits in front of Mastodon (if a non-legacy storage format is chosen)
  • An idle thread running at low priority converts uploaded images to more compact formats and does database updates. (if a non-legacy storage format is chosen)
  • Old uploaded images (which have already been quality-degraded) could be lossless or near-lossless compressed further in said idle thread, at even lower priority (optional, if a non-legacy storage format is chosen)
  • The server can configure its preferred formats and conversion thresholds (question: will lenient servers increase the load on stringent servers?)

So some defaults might be:

StorageFormat: None (native legacy formats only (JPEG/GIF/PNG)), WebP, AVIF

SizeLimitForPreconvertedImages: 500k (if AVIF) / 600K (if WebP). Larger than the size limit for nonconverted image files, because (A) they're sparing the server work, and (B) to give users an option to upload "high quality images" where needed (artwork, professional photography, science imagery, etc) if they're willing to do more work on their end.

SizeLimitForConvertedImages: 250k (if AVIF) / 300K (if WebP). The vast majority of images being JPEGs lazily pasted in or uploaded would fall under this category.

SizeLimitForNoCompression: 150k (if AVIF) / 180K (if WebP)

MaxClientsideImageUploadSize: 1.2MB. No point to being more than ~5x larger than SizeLimitForConvertedImages, as the extra detail would just get lost.

MaxServersideImageUploadSize: 1.5MB (a bit of extra leeway)

MaxClientsideGIFUploadSize: 5MB. Remember that "images" can include animated gifs and the like, which can get quite big, but can compress down by an order of magnitude (AVIF).

MaxServersideGIFUploadSize: 6MB (a bit of extra leeway)

ServerImageResLimit - I'm told that Mastodon can't handle images larger than 4092x4092. I'm not sure why.

Sample clientside logic train:

  • Is the image larger than MaxClientsideImageUploadSize?
    • Is it a GIF?
      • Yes: If larger than MaxClientsideGIFUploadSize, reject (or do WebP / AVIF encoding in the browser...? Later feature perhaps...)
      • No: Save (JPEG, or should we try to do better clientside?) with whatever quality factor is needed to get it under MaxClientsideImageUploadSize; if the quality factor would be under ~30, shrink the image ( https://github.com/nodeca/image-blob-reduce) as needed to get the QF back over 30.
  • Is the image previously in StorageFormat, aka intended for high quality, but (A) had to be modified for upload, (B) is larger than SizeLimitForPreconvertedImages or (C) larger than ServerImageResLimit?
    • Warn the user that their image had to be degraded for upload.
  • Does the user have any StripMetadata options chosen in their settings?
    • Yes: Strip the specified metadata from the file.

Sample serverside logic train (sketched in code after the list):

  • Is the file larger than MaxServersideImageUploadSize / MaxServersideGIFUploadSize?
    • Reject
  • Is the file already in StorageFormat?
    • Yes: Is the file smaller than SizeLimitForPreconvertedImages and smaller than ServerImageResLimit on its longest axis?
      • Yes: Do nothing
      • No: Queue it for reencoding (and rescaling if needed), aiming for a quality factor that would achieve a size of SizeLimitForPreconvertedImages
    • No: Is the file smaller than SizeLimitForNoCompression?
      • Yes: Queue the file for lossless compression if there's any gain to be had (and to rescaling if needed)
      • No: Would a low level of lossy compression (~95% QF) bring it to under SizeLimitForConvertedImages?
        • Yes: Queue the file for conversion at low level of lossy compression (and rescaling if needed).
        • No: Queue the file for whatever amount of lossy compression will bring it to SizeLimitForConvertedImages (including rescaling if needed)
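
To make the branching above concrete, here is a rough Python sketch of that serverside decision tree; every constant mirrors a parameter name proposed above, the values are the illustrative WebP defaults from this comment, and "queue: ..." stands in for whatever background job system would actually do the work:

```python
import os
from PIL import Image

# Hypothetical defaults lifted from the parameter names proposed above (WebP figures, in bytes).
SIZE_LIMIT_FOR_PRECONVERTED_IMAGES = 600_000
SIZE_LIMIT_FOR_CONVERTED_IMAGES = 300_000
SIZE_LIMIT_FOR_NO_COMPRESSION = 180_000
MAX_SERVERSIDE_IMAGE_UPLOAD_SIZE = 1_500_000
SERVER_IMAGE_RES_LIMIT = 4092
STORAGE_FORMAT = "WEBP"  # AVIF would need a Pillow plugin; WebP is readable out of the box.

def plan_for_upload(path: str) -> str:
    """Return the action the server would queue for an uploaded image.

    The separate GIF limits and the "~95% QF probe" branch from the list above
    are omitted here for brevity.
    """
    size = os.path.getsize(path)
    if size > MAX_SERVERSIDE_IMAGE_UPLOAD_SIZE:
        return "reject"
    with Image.open(path) as im:
        fmt = (im.format or "").upper()
        longest_axis = max(im.size)
    if fmt == STORAGE_FORMAT:
        if size <= SIZE_LIMIT_FOR_PRECONVERTED_IMAGES and longest_axis <= SERVER_IMAGE_RES_LIMIT:
            return "keep as-is"
        return "queue: re-encode (and rescale if needed) toward SizeLimitForPreconvertedImages"
    if size <= SIZE_LIMIT_FOR_NO_COMPRESSION:
        return "queue: lossless conversion, kept only if it actually shrinks the file"
    return "queue: lossy conversion targeting SizeLimitForConvertedImages"
```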

Something like that.

Thoughts?

@enn-nafnlaus

enn-nafnlaus commented Nov 17, 2022

Also, a note on lossless compression:

  • AVIF is generally worse than WebP at lossless compression. WebP has a special algorithm designed specifically for lossless compression while AVIF uses its normal algorithm in a zero-loss mode, which isn't as effective. So for lossless, even if AVIF is the default storage format, any lossless conversions should default to WebP.

  • JPEG XL is a great option for lossless recompression of existing JPEGs, but browser support is "nearly zero", so it only would make sense for archival files that are unlikely to be requested often. I don't think even imgproxy supports it at present.

  • There are WebAssembly encoders/decoders for AVIF and other formats, so it's technically possible to add support to any browser that doesn't have native image support but does support WebAssembly. Though even that isn't "all browsers".

@jmaris
Author

jmaris commented Nov 17, 2022

The server can configure its preferred formats and conversion thresholds (question: will lenient servers increase the load on stringent servers?)
Thoughts?

My only concern with doing this would be that many servers are now using deduplicated S3 storage for posts. This means they get significant space savings when the same file is uploaded to multiple instances. If an option is added to configure formats/thresholds then we risk removing that possibility (or having servers manually coordinate, which I suppose would also be fine).

In a general sense though, excellent work with the comment; it is extremely detailed.

@enn-nafnlaus

enn-nafnlaus commented Nov 17, 2022

Out of curiosity, how much deduplication are we getting in practice without making use of image fingerprinting? E.g., how often are uploaded images not simply the same, but byte-for-byte identical?

Should image fingerprinting be part of the same update project, or should it be put off to a separate later project? My take on image fingerprinting:

  • Existing hash algorithm (there's several, no need to reinvent the wheel)

  • Very strict similarity requirements (we're not using it to hunt for copyright infringement; better to err on the side of storing the file twice than mixing up different files)

  • If two images hash the same but the latter one is higher quality, the lower-quality one should be replaced.

The third one seems the trickiest to me. I wouldn't want to open up an image-replacement attack vector where a later user can carefully tweak specific pixels so that they hash the same but some key aspect is changed, you know? It's something that needs some thought. Maybe should start doing some tests...
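
As a concrete illustration of the "existing hash algorithm, very strict similarity requirements" idea, here is a sketch using the third-party ImageHash library (perceptual hash plus a small Hamming-distance cutoff); the threshold of 2 is an arbitrary example, not a tested value:

```python
from PIL import Image
import imagehash  # pip install ImageHash

def probably_same_image(path_a: str, path_b: str, max_distance: int = 2) -> bool:
    """Strict perceptual-hash comparison; err on the side of treating images as different."""
    hash_a = imagehash.phash(Image.open(path_a))
    hash_b = imagehash.phash(Image.open(path_b))
    # Subtracting two ImageHash objects yields their Hamming distance.
    return hash_a - hash_b <= max_distance
```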

@xGulfie

xGulfie commented Nov 17, 2022

My only concern with doing this would be that many servers are now using deduplicated S3 storage for posts. This means they get significant space savings when the same file is uploaded to multiple instances.

Yes, please please please talk to the jortage.com people before you do anything that could mess deduplication up

Also users tend to hate webP

@enn-nafnlaus

Nobody is doing anything now; we're discussing what would be optimal for both quality and minimizing server loads, while still maintaining backwards compatibility. :) I for one don't want to waste weeks coding something that's never going to get adopted because of X, Y or Z; it's important to talk things out in advance.

Thus, that said, add in your own thoughts and suggestions! :)

@enn-nafnlaus

enn-nafnlaus commented Nov 17, 2022

The more I think about it, I don't think it's physically possible to hash a higher-quality image together with a lower-quality image without risking an attack vector.

Example: someone uploads a picture:
[image]

Then someone uploads a tweaked higher-quality version:

[image]

Which they've modified.

[image]

How could the algo possibly know that that wasn't truly in the original image and just lost in the lower-quality version? It couldn't.

So I don't think it's possible - without risking attack vectors - to merge together images that have matching hashes when there's any meaningful quality difference between the two.

I think - if we were to do this at all - any images would have to be batched into quality groups (assessed on the basis of both resolution and quality factor, the latter of which would need to be algorithmically assessed, not simply taken from the file), and hash matches only compared within said quality group.

I think a comparison would also have to be done block-by-block on full/near-full resolution images, and each block hashed (and then that large hash checksummed to shrink it), rather than taking a hash of the image as a whole. Because if all but one of the blocks are identical, but one block has been significantly altered, I'd argue that it's not acceptable for the two images to get the same hash.
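
A sketch of that block-by-block idea, with one deviation: instead of checksumming the concatenated block hashes (which would throw away the ability to do near-matching), it compares per-block perceptual hashes directly, so a single significantly altered block is enough to reject the match. The grid size and normalization are arbitrary illustration values:

```python
from PIL import Image
import imagehash  # pip install ImageHash

def block_hashes(path: str, grid: int = 8, normalized_size: int = 512):
    """Perceptual hash of each tile of a normalized copy of the image."""
    im = Image.open(path).convert("RGB").resize((normalized_size, normalized_size))
    tile = normalized_size // grid
    return [
        imagehash.phash(im.crop((x, y, x + tile, y + tile)))
        for y in range(0, normalized_size, tile)
        for x in range(0, normalized_size, tile)
    ]

def blocks_all_match(path_a: str, path_b: str, max_distance: int = 1) -> bool:
    # One significantly altered block is enough to reject the match.
    return all(a - b <= max_distance
               for a, b in zip(block_hashes(path_a), block_hashes(path_b)))
```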

So it's a challenging problem.

Let me run a test to see how often a typical meme is byte-for-byte identical, to see how much impact deduping might have...

@enn-nafnlaus

enn-nafnlaus commented Nov 17, 2022

Okay, I just searched on Twitter for the infamous "Weird Nerds" meme that gets posted constantly:

[image: the "Weird Nerds" meme]

There's lots of variants of this meme out there, but I required specifically ones with no visual difference. Since Twitter strips metadata, this will maximize the odds of a byte-for-byte match. 16 images were found in my quick search. I sorted them by MD5 hash:

051a4654784c230fb92cab296c8c580b Fhnt3djakAEniLX.jpeg 12144
05663589951bfdf3641fdbd0691d9430 FhniddeUUAAWtwW.jpeg 31912
1e73420043b74452f7eb6a033e151ab2 FhLIrWhXoAA_LkU.jpeg 42990
256dd27a8240f32e34e0e04a6139ff51 FhpYqm1WAAAUPBr.jpeg 70261
65bc422c69aa081cdd10ce34013a4dd5 FhRXMizWYAEl9_j.jpeg 46608
87ffd0f13cd63de2cbd88bc5aaef47fd FhjdfbJaAAE2bCY.jpeg 69273
9f0748d2c8586e79388ced4dc8c49856 FhipmHSWQAEgMYp.png 203936
ad8bcc5f8dcdbfa71634bd22337c216b FhJesLqWAAImyyv.jpeg 31781
b06aa418774113d38ad19d7c0b61d078 Fhn8ImpXoAAAIWl.png 580218
bfc7e56f98b42aed9cfffff7d00073b0 FhDep7FXEAAny0T.jpeg 69076
c99c83ee0a780c9de9bdec7049282bc4 Fhngk_SXoAAYlwC.jpeg 69774
c99c83ee0a780c9de9bdec7049282bc4 FhQGWl_WAAAi26Q.jpeg 69774
c99c83ee0a780c9de9bdec7049282bc4 FhUjYLoWIAEXp8m.jpeg 69774
c99c83ee0a780c9de9bdec7049282bc4 FhXzVisWAAE1G0J.jpeg 69774
cd88cb4a4c0e9f653288ec37e1cb115d FhO7oEpXgAAOh1V.jpeg 72595
f2169bbef77bae57cc2dd28c9c4cf93c FhNCOa6WIAEEUJl.jpeg 46946

So, I mean... I guess that's something? 4 out of 16 were duplicates. 25%. But a pretty poor showing, IMHO, for images that are visually identical.

@KRRW

KRRW commented Nov 18, 2022

Might be a bad idea. Some fediverse apps expect images to be jpg, png, or gif files by default; sending webp to these servers might cause errors in displaying the image.

@enn-nafnlaus

Might be a bad idea. Some fediverse apps expect images to be jpg, png, or gif files by default; sending webp to these servers might cause errors in displaying the image.

Why would imgproxy not work?

@jmaris
Author

jmaris commented Nov 18, 2022

Might be a bad idea. Some fediverse apps expect images to be jpg, png, or gif files by default; sending webp to these servers might cause errors in displaying the image.

Of course, implementing this would involve coordinated work over a period of time with app developers and the fediverse at large so that when it launches all apps are ready. It isn't something that could be done overnight.

Out of curiosity, how much deduplication are we getting in practice without making use of image fingerprinting? E.g., how often are uploaded images not simply the same, but byte-for-byte identical?

I'm not sure that we should be doing anything other than basic deduplication (exactly the same image); anything else is too risky IMHO.

@lordkitsuna

Nobody is doing anything now; we're discussing what would be optimal for both quality and minimizing server loads, while still maintaining backwards compatibility. :) I for one don't want to waste weeks coding something that's never going to get adopted because of X, Y or Z; it's important to talk things out in advance.

Thus, that said, add in your own thoughts and suggestions! :)

Personally, since these are self-hosted instances, I feel like it should just be up to the person hosting the instance. If they want to have 50MB JXL or AVIF images, let them. It just means that if a post gets retooted to a server with a different format or size limit, there just needs to be a little message that says "View at origin" or something to that effect that takes the user to the original server hosting it. Or other servers could choose to convert/compress to their requirements.

Forcing a single format and quality on all servers will never make everyone happy. At the very least, make linking to external image hosting less annoying: as far as I can tell, even trying to link an external image ends up getting compressed instead of just embedded. I would greatly prefer it if an externally hosted image were simply embedded without trying to compress a new preview.

@enn-nafnlaus

enn-nafnlaus commented Nov 18, 2022

I'm not sure that we should be doing anything other than basic deduplication (exactly the same image); anything else is too risky IMHO.

I think I'm in agreement here. There's some rather thorny issues that make it more difficult to do right than format conversions. The lowest-hanging fruit is of course just doing JPEG correctly ;) But we really should go further than that and use a better storage format.

It just means that if it gets retooted to a server that is a different format or size limit there just needs to be a little message that says View at origin or something to that effect that takes the user to the original server hosting it. or other servers could choose to convert/compress to their requirements.

Those both seem like valid options. Indeed, I don't see why both can't be done. Replacing a locally hosted image link with a remote-hosted one is not complicated. Likewise, forcing a reencode would just be using a preexisting software pipeline.

ED: The originating server, not the destination server, should decide the preferred behavior (link vs. reencode) in the case that a federated server that's using your content does not support your file size or format. Because otherwise, imagine that you're a tiny server, and some giant server federated to you starts direct linking your images and driving a ton of traffic; it could overwhelm you (offset at least in part by dedup'ing). On the other hand if some tiny federated server is using your content, you may prefer to benefit your users by keeping the quality higher.

@afontenot
Contributor

afontenot commented Nov 19, 2022

I'm puzzled by the drift in the discussion here. The original request was "convert image to WebP by default". Maybe that's implementable, maybe not, but I think the deduplication discussion should probably have another issue if someone wants that implemented.

Federation doesn't necessarily make the image format harder to solve. I think the basis of this request was the assumption (right or wrong) that WebP has sufficiently broad support that it can be made the default. When federating, the ActivityPub activity will have a link to the WebP image instead of a JPEG. Since (nearly) every client supports WebP, this won't be a problem. After all, ActivityPub instances can already send media formats that aren't understood by every receiver. It's up to the receiving instance what to do about that - either reencode it to a format the legacy client understands, or drop it.

If this limitation is seen as a problem, the obvious way to work around it would probably be to extend the ActivityStreams specification to allow including an alternative link with an alternative mediatype for each link. Maybe this would be unwieldy, or maybe it's too late to make a change like this to AS2, IDK.
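
To make that concrete, here is roughly what an attachment could look like, written as Python dicts standing in for the JSON. The single-format form matches how Document attachments already carry a mediaType; the multi-link form is the hypothetical fallback extension being floated here (ActivityStreams allows url to be a list of Link objects, but receivers would still need to understand it):

```python
# Current style: one Document attachment with a single mediaType.
attachment_webp_only = {
    "type": "Document",
    "mediaType": "image/webp",
    "url": "https://files.example.social/media/abc123.webp",
    "name": "alt text for the image",
}

# Hypothetical fallback style: several Links, one per available encoding.
attachment_with_fallback = {
    "type": "Document",
    "name": "alt text for the image",
    "url": [
        {"type": "Link", "mediaType": "image/webp",
         "href": "https://files.example.social/media/abc123.webp"},
        {"type": "Link", "mediaType": "image/jpeg",
         "href": "https://files.example.social/media/abc123.jpg"},
    ],
}
```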

@KRRW

KRRW commented Nov 19, 2022 via email

@afontenot
Contributor

afontenot commented Nov 19, 2022

@KRRW I don't know that we're talking about converting all images to WebP, I think we're talking about converting lossy images. Lossless could be attempted and then discarded if the resulting image size is larger.

in praxis converting images to webp results in larger files.

Not for lossy images. At the same quality, WebP images are almost always smaller than a JPEG, and at the same file size, the WebP image is almost always higher quality. See e.g. https://wyohknott.github.io/image-formats-comparison/#swallowtail&jpg=s&webp=s for some comparisons.

That is why forcing Mastodon to store and serve images as webp is sort of impractical, and puts a lot of unnecessary backend image conversion/compression to a format that is unpredictable and relatively new

Mastodon already supports uploading WebP images (and HEIF and AVIF), it just currently does the conversion the other direction (serving the files as JPEGs). I don't think age is the limiting factor here (WebP is 12 years old and is basically a subset of VP8 which is 14 years old), client support is. And virtually all actively supported clients support WebP.

On the other downside, webp is limited to 16k resolution.

Which is far, far higher than the maximum resolution I've ever seen any Mastodon instance support. I believe the vanilla setting is 1920x1080. Fediverse instances other than Mastodon that are focused on high quality photography (e.g. Pixelfed) will probably want to use other formats (though this is more because lossy WebP requires subsampling), but that doesn't affect Mastodon.

I don't see a huge downside here, although this should probably be an opt-in thing for admins. I think vanilla Mastodon will probably not want to drop support for Safari on MacBooks before 2015.

@enn-nafnlaus

enn-nafnlaus commented Nov 19, 2022

I'm puzzled by the drift in the discussion here. The original request was "convert image to WebP by default"

Deduplication was raised because of the risk that this may reduce the deduplication hit rate. That said, my take is that the impact should be small. The deduplication hit rate is probably already low to begin with, and in most cases conversion shouldn't impact it much if at all. The savings should be by far more than worth it.

That said, only real-world testing can find out for sure.

Despite Google telling people that webp saves storage (in theory), in praxis converting images to webp results in larger files. This is especially true when converting from PNG, APNG, and Animated GIFs, and so do generated images from videos like thumbnails or script-generated screenshots, canvas-generated raster images, and so on.

I don't think this is fair. Yes, there are cases where lossless WebP can be larger than PNG, APNG and animated GIFs. But:

  1. This is the exception, not the rule, with a proper converter.

  2. If it were to happen, you'd just throw the conversion out and keep the original format

  3. This generally only applies to lossless conversions, while most of our desired conversions are quite lossy.

We can very much profile the compression in advance based on the image type, image resolution, desired amount of compression, etc, and verify the benefits in the output.
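
Point 2 above is cheap to implement. A sketch of the "keep whichever is smaller" check for the lossless case, using Pillow's lossless WebP mode; the real pipeline would differ, but the verification step is the point:

```python
import os
from PIL import Image

def lossless_webp_if_smaller(src_path: str) -> str:
    """Try a lossless WebP re-encode and keep it only if it beats the original."""
    candidate = src_path + ".webp"
    with Image.open(src_path) as im:
        if im.mode not in ("RGB", "RGBA"):
            im = im.convert("RGBA")  # e.g. palette PNGs; rendered pixels are preserved
        im.save(candidate, format="WEBP", lossless=True, method=6)
    if os.path.getsize(candidate) < os.path.getsize(src_path):
        return candidate  # the conversion paid off
    os.remove(candidate)  # throw the conversion out, keep the original format
    return src_path
```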

That is why forcing Mastodon to store and serve images as webp is sort of impractical, and puts a lot of unnecessary backend image conversion/compression

Here's the thing. Upload happens once. But download happens again and again and again, while storage costs continue every year. So yes, doing more work at upload time is worth it. But from talking with server operators, compute is rarely the cost-driving factor - it's storage. And images make up the lion's share of storage, like 80% or so. So if we want to scale, we need to handle images better. And if we want to improve quality, we also need to handle images better.

(And indeed, we even have options to offload storage work to the client if needed, such as WebAssembly encoders! That said, this is probably unnecessary work)

to a format that is unpredictable and relatively new.

WebP is not "relatively new"; it's 12 years old. As a reminder of how long ago that was: that was the year of the Haiti earthquake, the Eyjafjallajökull eruption, and the trapped Chilean miners. If you had bought a puppy when WebP was invented, odds are better than not that it would have already died by now. WebP is only 4 years younger than Twitter. And it's 6 years older than Mastodon.

Support for WebP is near universal now. Not 100%, but quite close.

I would call AVIF "relatively new" (though it's already up to ~80% support, and probably would hit ~90% support by the time any changes went out). I would not at all call WebP "relatively new".

On the other downside, webp is limited to 16k resolution.

LOL. Have you checked out Mastodon's current resolution? It (over)aggressively scales everything down to 2MP (just fixing how JPEGs are handled alone would double quality or halve space). By contrast: 16384x16384 is 268MP. According to Gargron the server can't even handle images larger than 4092x4092.

It's not an issue. More specifically, not an option,

forcing mastodon to store images as webp by default is bad.

If you had read the discussion, the consensus seems to be on having the default storage format configurable at the individual server level - legacy formats, WebP, or AVIF. Not "forcing" anything.

Even google search images do not serve webp version of images.

A year ago, only 2 1/2% of websites served WebP images. Today, it's 6 1/2%:

[chart: percentage of websites serving WebP over time]

But, two VERY important caveats.

  1. That's websites that SERVE WebP images. The percentage that uses WebP for storage is surely higher than that.

  2. Large commercial companies have money to burn on storage space, while small independent websites don't get much traffic - thus in both cases, deprioritizing work on upgrading their code bases/architectures. We're neither. We are quite cost constrained. We cannot afford to be wasteful.

If the question is "why did it take this long for companies to start switching to WebP"... while it's not a new format, some browsers' support is "fairly new". Edge added it in late '18, Firefox in early '19, and Safari in late '20. It's basically only running about 2 years ahead of AVIF, even though it's nearly a decade older.

If an instance admin wants to store and serve webp by default, a more viable option is to use an image proxy/CDN to convert from other formats.

If you had read the discussion, you would have seen that what's under discussion IS using imgproxy to serve to clients, to automatically deal with what clients support. The server only needs to deal with storage decisions.

I would advise reading more of the above discussion. That said, I appreciate your inputs! :)

@BlueSwordM

Lossy WebP should not be implemented at all to be honest. It'd be better to use a superior JPEG encoder like mozjpeg until we get JPEG-XL working.

Lossless WebP on the other hand is a different game entirely for encoding PNGs, so I'd be up for that until Mastodon gets JXL support.

@pauloxnet
Contributor

I totally agree with this proposal.

@enn-nafnlaus

When you say JXL support, do you mean as an archival format, or as an archival + serving format? Because we can already do archival with JXL. But the grand total of browsers that support JPEG XL without enabling a debug flag (which was recently even removed from Chrome) is "1": the obscure Pale Moon browser. It's not clear if it's going to come to browsers at all, let alone any time soon. The only option at hand would be a WebAssembly JPEG XL decoder.

That said, we could certainly enable an option to encode it as an archival-only format. It'd take real-world experimentation to figure out what the impact will be (more conversions / caching of recently-requested files), but I suspect that there will be a lot of cache hits for recent files, with comparably few requests for older ones that need conversion.

Again, I think the best solution is to create a variety of options that server administrators can enable based on their own preferences, since it seems that most people agree that something should be done, but differ on what should be done. Said options also encompassing "what you're allowed to deliver to me over the Fediverse" for conservative operators receiving data from more lenient operators.

I'd also argue - actually back to my initial point in the original thread - that the very first step should be as you mentioned, Blue SwordM - 1) improving our existing JPEG logic (size limits, not res limits, apart from the 4092x4092 processing limits), 2) less aggressive shrinking, more aggressive QF reductions, since up to a point reducing QF has less impact on quality for a given file size; and 3) using a better encoder (e.g. mozjpeg). Alongside that, 4) less leniency for giant PNGs; the current handling of them lets them store bigger files than JPEGs. In all cases, with options for administrators to choose their target file sizes, to pick their balance between space / hosting costs and image quality.

From there we can incrementally add options for different serving and archival formats, behind imgproxy and a caching server, and server operators can report back what works well for them in the real world.

Does this sound good?

@afontenot
Contributor

Lossy WebP should not be implemented at all to be honest. It'd be better to use a superior JPEG encoder like mozjpeg until we get JPEG-XL working.

Given that Mastodon is going to continue to convert too large (>1920*1080 pixels) images to mid-bitrate lossy, I'm perplexed why you'd want to insist on using JPEG (even with mozjpeg) when lossy WebP is expected to be superior in quality at the same bitrate (for most web-sized images). Also there is no "we" who can get JPEG-XL working. It's not supported in any browser out of the box and Chrome doesn't intend to pursue it at all. We should take the improvements we can get. Note that as I understand it the reason for proposing WebP over another modern codec like AVIF is that the ActivityPub protocol doesn't currently have a mechanism for specifying alternative formats for images, so the provided file needs to have near-100% support. WebP has that. AVIF still doesn't. JPEG-XL by the looks of things never will; if I could change that I would.

Example: mozjpeg at quality 90 (same setting used by Mastodon) vs. WebP at the same file size [comparison images]

If you look closely you'll notice that the WebP has better details and doesn't exhibit nasty ringing around the sparks. (Note that Mastodon uses ImageMagick for its JPEGs, not mozjpeg, so -quality 90 is actually conservatively large here.)

I have heard, anecdotally, that lossless WebP is inexplicably larger than lossless PNG for some files. I haven't worked with lossless WebP (which is almost a completely different codec than lossy WebP btw) much, so I can't confirm that directly. But it means that you'd want to test individual images for size against a PNG optimizer and pick the best result on a case by case basis.

@enn-nafnlaus

enn-nafnlaus commented Nov 24, 2022

It occasionally is larger on lossless, but it's not common, and as you note, you can always compare the output size with the input size.

The other thing about WebP - and I think this is a key point, so I'll stress it...

-= Mastodon Already Supports WebP Files =-

Search through your server cache, if you run one; you'll find WebP files being shared. Not a ton compared to jpegs and pngs, but they're absolutely in circulation. And - I should add - really tiny compared to them.

My take is that there are arguments for and against all possibilities: 1) better JPEG/PNG handling only; 2) WebP; 3) AVIF; 4) JPEG XL (and even combinations thereof for specific circumstances). So IMHO we should strive to support as many possibilities as possible, let server admins decide, and then we can compare notes based on real-world experiences.

Just my two krónur :)


ED: I'd also like to reiterate, as part of my initial point when I opened this issue:

If you look closely you'll notice that the WebP has better details and doesn't exhibit nasty ringing around the sparks. (Note that Mastodon uses ImageMagick for its JPEGs, not mozjpeg, so -quality 90 is actually conservatively large here.)

... that downscaling a giant image to only 2MP, but then saving with -quality 90, is a ridiculous way to compress images. Not to put too fine a point on it, but if you were to up the quality even further to 100, and scale down even further to compensate, you've basically converted JPEG into BMP. :Þ

To a point (quality ~30-40 or so, give or take), reducing file size is better done by reducing the quality factor than downscaling. That's literally the point of JPEG and other forms of image compression, that they work better for retaining image data for a given file size than simple downscaling. Otherwise we'd just save all images on the internet as downscaled BMPs.
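
In code, the "reduce the quality factor first, and only downscale once it would drop below ~30" approach could look something like this sketch (Pillow again; the starting quality of 90 and the floor of 30 are the numbers mentioned in this thread, not tuned values):

```python
import io
from PIL import Image

def encode_to_target_size(im: Image.Image, target_bytes: int,
                          min_quality: int = 30, fmt: str = "WEBP") -> bytes:
    """Lower the quality factor toward min_quality before resorting to downscaling."""
    im = im.convert("RGB")
    while True:
        for quality in range(90, min_quality - 1, -5):
            buf = io.BytesIO()
            im.save(buf, format=fmt, quality=quality)
            if buf.tell() <= target_bytes:
                return buf.getvalue()
        # Even min_quality is too big: shrink ~10% per axis and try again.
        im = im.resize((max(1, int(im.width * 0.9)), max(1, int(im.height * 0.9))))
```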

@jyrkialakuijala

github.com/libjxl/libjxl has a new classic JPEG encoder (using heuristics originally developed for JPEG XL) that can compress ~25 % more than mozjpeg/libjpeg-turbo, and a new decoder that corresponds to about ~8 % better quality for decompression. It also solves the HDR by allowing more than 8-bits of accuracy in traditional JPEG formalism. Adopting them for lossy compression can be the simplest way to make progress.

One could get some more compression on that by applying the lossless JPEG recompression, i.e., JPEG XL lossless transport for JPEGs (allowing arithmetic coding, better context modeling and some prediction).

WebP is a good replacement for PNGs, if there is unwillingness to go all the way to JPEG XL.

@lordkitsuna

In all cases, with options for administrators to choose their target file sizes, to pick their balance between space / hosting costs and image quality.

From there we can incrementally add options for different serving and archival formats, behind imgproxy and a caching server, and server operators can report back what works well for them in the real world.

Does this sound good?

This sounds perfect: a decent set of defaults, but allow server administrators to pick what they want to do based on their preference. At the end of the day this is a self-hosted and compartmentalized system, with ActivityPub allowing some overlap. If somebody wants to make it so that only people using Firefox Nightly with an experimental flag enabled can view their images by using JPEG XL, or make it so no server is willing to retoot by uploading 100MB PNGs, then why not let them? That's their choice to make, IMO.

@enn-nafnlaus

github.com/libjxl/libjxl has a new classic JPEG encoder (using heuristics originally developed for JPEG XL) that can compress ~25 % more than mozjpeg/libjpeg-turbo, and a new decoder that corresponds to about ~8 % better quality for decompression. It also solves the HDR by allowing more than 8-bits of accuracy in traditional JPEG formalism. Adopting them for lossy compression can be the simplest way to make progress.

I think you misunderstand. Yes, libjxl can compress classic JPEGs, but the output is not itself a classic JPEG; the output is a JPEG XL. Which isn't at all backwards compatible.

[mastodon@chmmr Media]$ cjxl pexels-orig.jpg out.jpg
JPEG XL encoder v0.6.1 [SSE4,SSSE3,Scalar]
Read 4912x7360 image, 69.7 MP/s
Encoding [Container | JPEG, lossless transcode, squirrel | JPEG reconstruction data], 16 threads.
Compressed to 3157418 bytes (0.699 bpp).
4912 x 7360, 31.36 MP/s [31.36, 31.36], 1 reps, 16 threads.
Including container: 3157938 bytes (0.699 bpp).

[mastodon@chmmr Media]$ ls -l pexels-orig.jpg out.jpg
-rw-r--r--. 1 mastodon mastodon 3157938 nóv 25 00:47 out.jpg
-rw-r--r--. 1 meme meme 3732422 nóv 24 14:19 pexels-orig.jpg

[mastodon@chmmr Media]$ identify out.jpg
out.jpg JPG 0x0 16-bit sRGB 3.01164MiB 0.000u 0:00.005
identify: Not a JPEG file: starts with 0x00 0x00 `out.jpg' @ error/jpeg.c/JPEGErrorHandler/339.

@KRRW

KRRW commented Nov 25, 2022 via email

@jyrkialakuijala

| Yes, libjxl can compress classic JPEGs, but the output is not itself a classic JPEG;
| the output is a JPEG XL. Which isn't at all backwards compatible.

I believe you can create actual JPEGs, just the UI for doing it is a bit clunky -- you need to use the benchmark_xl program, and they emerge with an ICC v4 color profile which is not supported by every program.

get libjxl
$ ./ci.sh opt
$ ./build/tools/benchmark_xl --input input.png --output_dir . --codec=jpeg:libjxl:d1.0 --save_compressed

For high quality it produces about 28 % denser JPEGs than other encoders (libjpeg-turbo and mozjpeg).

@enn-nafnlaus

enn-nafnlaus commented Nov 25, 2022

| Yes, libjxl can compress classic JPEGs, but the output is not itself a classic JPEG; | the output is a JPEG XL. Which isn't at all backwards compatible.

I believe you can create actual JPEGs, just the UI for doing it is a bit clunky -- you need to use the benchmark_xl program, and they emerge with an ICC v4 color profile which is not supported by every program.

get libjxl $ ./ci.sh opt $ ./build/tools/benchmark_xl --input input.png --output_dir . --codec=jpeg:libjxl:d1.0 --save_compressed

For high quality it produces about 28 % denser JPEGs than other encoders (libjpeg-turbo and mozjpeg).

Hmm... I can't be bothered to install a non-RPM version of libjxl to get benchmark_xl at the moment, so I'll just have to take your word on it. If you want to do some tests on browser compatibility on the output files and get back to us, that'd be useful information :)

Unfortunately, "CanIUse" doesn't have browser support for ICC profile info at all, so it's not easy to assess what percentage of browsers support it - let alone any other oddities that might exist in the transcoded file.

@mrAceT

mrAceT commented Dec 4, 2022

I second the motion! I'd love (at least the option of) converting all uploaded JPEGs to WebP!

@KRRW

KRRW commented Dec 4, 2022 via email

@mrAceT

mrAceT commented Dec 4, 2022

I was discussing this with some other Mastodon admins.. one of them made a very good remark:

Maybe Mastodon shouldn't download that much stuff in the first place but just save the source url.

my instance has been running for a week and has already reached 1 gigabyte.. mostly ‘double data’ I expect.. so I do indeed think that the option to link to the source URL would be a good one..

@saschanaz

In that case there's a risk that the images can change anytime. Maybe some hash can help?

@enn-nafnlaus

  1. If everything is hotlinked, then I think you run a risk of small servers being swamped by content that gets federated to large servers. Maybe, maybe not, but it's a concern. On the other hand, yes, it would reduce the amount of data getting sucked in.

  2. Hashing is a complicated topic. See here: Convert images to webP by default #20896 (comment)

Hashing exact copies of files is already done (dedup'ing), at least on a per-server basis, and at least from a serving basis (not necessarily a storage perspective). We can definitely do better. The question is where to draw the line.

@KRRW

KRRW commented Dec 5, 2022 via email

@mrAceT

mrAceT commented Dec 5, 2022

There are a lot of (sub) topics crossing through here..

The primary goal (if I understand it all correctly) is the reduction of the enormous amount of space used by Mastodon.. and I do think there is a waste of space.. Space takes energy, and money in one way or another.. so if that can be reduced it's better for the finances of the admins and for "the world in general" (especially when Mastodon gets bigger and bigger)

(going "Bottom up" in the posts for the arguments)

1) hashing
To catch duplicates. Understandable, but too "nice" and different ones get matched; too "strict" and obvious duplicates get missed..
One way or another, there is no perfect solution. I like this #20896 (comment): a simple and elegant solution (MD5 the data). That will catch duplicates. Not perfect, but it will help!
2) hotlinking
When a toot (post) is popular it might swamp a small instance. That's a good argument. How long does such a popular post "live" on Mastodon? A day? A week at the most? What if the current system got a "time limit" of say a week.. (or something smarter: calculated from the number of accounts an instance has? Big instances must cache more than small ones?).
And then remove the cached image and replace it with a hotlink? Best of both worlds?
3) changed images
The argument that a hotlinked image will change. Ehm, so? If the original poster wanted to change the image.. then that is/was his/her prerogative.. that is then the most up-to-date version of that post.. period?
4) (better) compression
I'm guessing 95% (and up) of the multimedia data attached to a toot (post) will be an image. Reducing the amount of data "at the source" is, I think, one of the best ways of tackling this subject. In my opinion WebP is perfect for this. It's very well supported by browsers, the quality loss is negligible and the reduction in size compared to "the average JPEG" is something like 25% in my experience. If an admin has the option to check a box for "convert all uploaded JPEG images to WebP", I'd check that box immediately!
Maybe with a compression setting?
Optionally a checkbox for PNG also? (optionally with option for lossless/compression setting?)
Do a check whether the resulting WebP is smaller than the JPEG (better safe than sorry ;)
JPEG-XL is way too new and not widely supported. Maybe add that "checkbox option" over time?
5) admin: config option for maximum image size
I just checked.. a boosted post with an image had a size of 1080 x 1541 pixels. On my monitor (1920 x 1080) it gets rendered at 548 x 782 when clicked upon. Then I can enlarge it further.. (are there stats on how often this is clicked?)
I'd love the option for a maximum size; say fit within 1000 x 700 (and/or give some extra options?)

All this (and maybe more ideas) have the goal of reducing the "data footprint" of Mastodon. And I do think this deserves more attention! The sooner options/ideas above get implemented, the better it will be!

@enn-nafnlaus

enn-nafnlaus commented Dec 5, 2022

Re, #1: I'd note that my post was md5'ing the files, but presumably you'd get a better hit rate by md5'ing the data (raw, or better, decompressed), even if you require an exact match. Because with raw files, if say some service strips or modifies EXIF or other header data, or presents the same data in a different encoding, then they'll get different hashes for the same contents.
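
For illustration, the difference between hashing the file bytes and hashing the decoded pixel data (the latter survives metadata stripping or a re-mux, though not a re-encode):

```python
import hashlib
from PIL import Image

def file_digest(path: str) -> str:
    """Hash of the bytes on disk; changes if EXIF is stripped or headers differ."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def pixel_digest(path: str) -> str:
    """Hash of the decoded pixels; identical for byte-different but pixel-identical files."""
    with Image.open(path) as im:
        return hashlib.md5(im.convert("RGBA").tobytes()).hexdigest()
```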

Re, other formats, one thing we have to ask ourselves is: when do we leave Paperclip behind? Currently Mastodon uses Paperclip, but it's deprecated and is no longer being maintained.

https://github.com/thoughtbot/paperclip

It's also super-slow and limited. These days, it's been replaced by Rails' ActiveStorage, which AFAIK DOES support WebP (as well as vips, which makes JPEGs much faster than ImageMagick, and with fewer security issues). So do we keep hacking new features into (dead) Paperclip, or do we bite the bullet and switch to ActiveStorage?

@KRRW

KRRW commented Dec 5, 2022 via email

@enn-nafnlaus

Animated gifs are currently converted to mpeg. No need to switch to using WebP for animation.

@KRRW

KRRW commented Dec 5, 2022 via email

@pauloxnet
Contributor

"As of yesterday the code has been merged that removes JPEG-XL support from the Chromium/Chrome web browser code-base."
https://www.phoronix.com/news/Chrome-Drops-JPEG-XL

I think WebP is the right candidate to be used as the default image format in Mastodon due to all the benefits already described in this issue.

@enn-nafnlaus

"As of yesterday the code has been merged that removes JPEG-XL support from the Chromium/Chrome web browser code-base." https://www.phoronix.com/news/Chrome-Drops-JPEG-XL

I think WebP is the right candidate to be used as the default image format in Mastodon due to all the benefits already described in this issue.

Yes - more here. #20896 (comment)

Could still be used as an archival format for old images that are served infrequently. It's quick to convert out of JPEG XL.
