
Streaming image decoding #13

Closed
jakearchibald opened this issue Sep 2, 2019 · 33 comments
Labels: maybe (Ideas that might be in scope, and worth discussing)

Comments

@jakearchibald commented Sep 2, 2019

const decoded = await imageDecoder.decode(input);
const canvas = ...;
canvas.getContext('2d').putImageData(decoded, 0, 0);

The above (from the explainer) suggests that decoding doesn't stream, which feels like a missed opportunity.

In its current state, it feels like it'd be better to change createImageBitmap so it could accept a stream.

Maybe that should happen, whereas a whole new API could expose streamed image decoding.

Images can stream in a few ways:

  • Yielding pixel data from top to bottom.
  • Yielding pixel data with increasing detail (progressive/interlaced).
  • Yielding frames.

This would allow partially-loaded images to be used in things like <canvas>.
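To make those three modes concrete, here is a toy sketch of the kinds of chunks each might yield to a consumer. This is purely illustrative: the generator names and chunk shapes are made up for the example, not a real or proposed API.

// Toy illustration of the three streaming modes; nothing here is a real API.
async function* decodeTopToBottom(encodedBytes) {
  // Mode 1 (e.g. baseline JPEG): yield horizontal bands as rows finish decoding.
  yield { rowStart: 0, rowEnd: 64, pixels: new Uint8ClampedArray(0) };
}
async function* decodeProgressive(encodedBytes) {
  // Mode 2 (e.g. progressive JPEG, interlaced PNG): yield whole-image passes of increasing detail.
  yield { pass: 1, pixels: new Uint8ClampedArray(0) };
}
async function* decodeAnimated(encodedBytes) {
  // Mode 3 (e.g. animated GIF/WebP): yield complete frames over time.
  yield { frameIndex: 0, delayMs: 100, pixels: new Uint8ClampedArray(0) };
}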

@pthatcherg (Contributor)

I added this to the possible WebIDL:

partial interface ImageData {
  readonly attribute ReadableStream readable; // of bytes
};

That would allow the data (the pixel data) to stream, at least from the decoder.

I'm interested to hear what you mean by "streaming frames". That makes it sound a lot like a video decoder, which is hopefully well supported by the previous portions of the explainer. Is there something different you are looking for? Perhaps what you mean is different quality versions of the same image rather than different frames over time?

@jakearchibald (Author)

I'm interested to hear what you mean by "streaming frames". That makes it sound a lot like a video decoder

By yielding frames I mean animated gif/webp. Maybe that could be handled by the video decoder, but the platform doesn't treat these things the same elsewhere.

Perhaps what you mean is different quality versions of the same image rather than different frames over time?

Nah, "yielding pixel data with increasing detail (progressive/interlaced)" was supposed to cover that case.

@jakearchibald (Author)

I added this to the possible WebIDL:

partial interface ImageData {
  readonly attribute ReadableStream readable; // of bytes
};

I don't think that makes sense as ImageData is a synchronous data structure. Maybe you mean ImageBitmap? But even then it seems weird as APIs expect ImageBitmap to represent a 'decoded' image source.

@pthatcherg (Contributor)

A couple of different things at once:

  • Other than containerization (which is an interesting topic for video vs. image codecs in its own right), how is an animated webp file different than a vp8 video?

  • For ImageData.readable, my intention was to allow a WHATWG stream version of .data. I was hoping it could then be piped through transform streams (to edit the image) before being piped elsewhere (such as an encoder).

  • I'm a little new to ImageData vs. ImageBitmap, but reading through the Chromium source code, it appears that ImageData is the lower-level concept of an image source, used in many places, so it seemed like a more fitting structure to represent what would come out of a decoder. But I could be wrong.

  • How could a progressive stream of raw/unencoded bytes of increasing quality work?

@jakearchibald (Author)

Other than containerization (which is an interesting topic for video vs. image codecs in its own right), how is an animated webp file different than a vp8 video?

I'm not sure. But we don't allow <video> to play gifs, or <img> to play muted mp4s. It'd be great if that changed IMO 😄.

For ImageData.readable, my intention was to allow a WHATWG stream version of .data. I was hoping it could then be piped through transform streams (to edit the image) before being piped elsewhere (such as an encoder).

I think that would be better handled by a helper that converts an array/sequence to a stream. But I don't see the benefit.

The benefit of streaming is you can do things in chunks, or ideally in parallel, so a streaming encoder/decoder is only useful if it can provide meaningful data before the whole operation is complete.

How could a progressive stream of raw/unencoded bytes of increasing quality work?

Good question, and the answer would take a lot of careful design, but here are some half-baked ideas:

If the image format only touches pixels once during decode (baseline jpeg, webp), I'd expect the stream to yield data structures like this:

  • Final image width.
  • Final image height.
  • ImageData of some decoded data.
  • Target x position of image data.
  • Target y position of image data.
  • Target width of image data.
  • Target height of image data.

I guess 'multi-scan' formats would have the same format, but the target x & y would always be 0, and the target width & height would be the same as the final width & height.
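To make that concrete, a single chunk from such a stream might look like the following. This is a purely hypothetical shape, not a proposed API; the field names simply mirror the list above.

// Hypothetical chunk from a streaming decode of a 1024×768 baseline JPEG:
// a 64-row band of decoded pixels partway down the final image.
const chunk = {
  width: 1024,                        // final image width
  height: 768,                        // final image height
  imageData: new ImageData(1024, 64), // decoded pixels for this band
  targetX: 0,                         // where the band lands in the final image
  targetY: 192,
  targetWidth: 1024,
  targetHeight: 64,
};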

@pthatcherg (Contributor) commented Sep 4, 2019 via email

@guest271314 (Contributor)

@jakearchibald

In its current state, it feels like it'd be better to change createImageBitmap so it could accept a stream.

Note: the createImageBitmap and ImageCapture.grabFrame() implementations have issues which, until fixed, impact the reliability of those APIs.

The first list item below is an insidious bug. The only way I was able to achieve the expected result, before reading the second list item (which clarified what the issue was), was to assign the ReadableStreamDefaultController to a variable in start() so that close() could be called on it from outside the ReadableStream instance.

@jakearchibald (Author) commented Sep 4, 2019

@pthatcherg

a streaming encoder/decoder is only useful if it can provide meaningful data before the whole operation is complete.

How useful that will be for images is still definitely uncertain. So far, it's just an idea to possibly explore.

Every browser has taken advantage of streaming decoding for over a decade now. I think its usefulness has been well proven.

You originally said stream of bytes

I think I said pixel data. But yes, there needs to be metadata that tells the developer where that data exists in the overall image.

I guess what I'm trying to say, at a higher level, is: All browsers ship streaming image decoders that allow them to improve performance by handling image data of partially decoded images. Let's give developers access to this.

@jakearchibald (Author)

@guest271314 this issue is about streaming image decoding, not getting images from a video. Please start another issue, or post on an existing relevant issue, if you have something different to discuss.

@guest271314 (Contributor)

I only posted here because you mentioned using or modifying createImageBitmap(), which has issues. If you interpret the comments as pertaining only to video, and the information is not useful to you, so be it. Good luck!

@jakearchibald (Author)

@guest271314 The existence of an implementation bug in one browser does not justify the creation of a whole new standard/API.

@w3c deleted a comment from guest271314 Sep 4, 2019
@pthatcherg (Contributor)

@pthatcherg

a streaming encoder/decoder is only useful if it can provide meaningful data before the whole operation is complete.

How useful that will be for images is still definitely uncertain. So far, it's just an idea to possibly explore.

Every browser has taken advantage of streaming decoding for over a decade now. I think its usefulness has been well proven.

I meant more specifically the .readable attribute on ImageData giving a stream of bytes rather than .data for an array of bytes.

The "stream of data structures" thing that you meant (and I misunderstood at first) is something different.

You originally said stream of bytes

I think I said pixel data. But yes, there needs to be metadata that tells the developer where that data exists in the overall image.

Ah.... my mistake, then.

I guess what I'm trying to say, at a higher level, is: All browsers ship streaming image decoders that allow them to improve performance by handling image data of partially decoded images. Let's give developers access to this.

I think we're on the same page now.

Let's see if I can take a shot at what the WebIDL would look like:

interface ImageDecoder {
  // You get more than one as more information becomes available.
  ReadableStream decode((EncodedImageData or ReadableStream) data);
};

interface DecodedImage {
  // Things that aren't available yet are null
  readonly attribute unsigned long? width;
  readonly attribute unsigned long? height;
  readonly attribute unsigned long? targetWidth;
  readonly attribute unsigned long? targetHeight;
  readonly attribute unsigned long? targetX;
  readonly attribute unsigned long? targetY;
  readonly attribute ImageData? imageData;
};

That could probably be cleaned up a bit, but first let's make sure that's generally what you're looking for (and if it's implementable :).

@jakearchibald (Author)

It should probably be a transform stream rather than a function that takes a readable, but seems good!

However, I haven't looked at how streaming codecs actually work, and what they output. They might do something much better, or cater for something we're missing.

@pthatcherg (Contributor)

"Justify" and "justice" have separate etymologies.

But also, wow.

FYI, I deleted a comment that I think was distracting from the meaningful conversation.

@pthatcherg (Contributor)

It should probably be a transform stream rather than a function that takes a readable, but seems good!

However, I haven't looked at how streaming codecs actually work, and what they output. They might do something much better, or cater for something we're missing.

OK, how about this:

interface ImageDecoder {
  attribute WritableStream writable;  // of encoded bytes
  attribute ReadableStream readable;  // of DecodedImage (progressively more info)
};

interface DecodedImage {
  // Things that aren't available yet are null
  readonly attribute unsigned long? width;
  readonly attribute unsigned long? height;
  readonly attribute unsigned long? targetWidth;
  readonly attribute unsigned long? targetHeight;
  readonly attribute unsigned long? targetX;
  readonly attribute unsigned long? targetY;
  readonly attribute ImageData? imageData;
};
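For illustration, here is how that writable/readable pair might be consumed, assuming the hypothetical ImageDecoder and DecodedImage sketched above (a proposal sketch, not a shipped API): pipe the response body into the writable side, and paint each progressively more complete DecodedImage as it arrives.

// Hypothetical usage of the ImageDecoder sketched above; assumes the
// proposed WebIDL in this comment, which is not a shipped API.
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');

const decoder = new ImageDecoder();
const response = await fetch('/photo.jpg');
response.body.pipeTo(decoder.writable); // feed encoded bytes as they download

const reader = decoder.readable.getReader();
let sized = false;
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  if (!sized && value.width !== null && value.height !== null) {
    // Resizing clears a canvas, so size it once, as soon as dimensions are known.
    canvas.width = value.width;
    canvas.height = value.height;
    sized = true;
  }
  if (value.imageData) {
    ctx.putImageData(value.imageData, value.targetX, value.targetY);
  }
}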

@jakearchibald (Author)

👍

This doesn't cater for animated formats, but yeah, it does feel like the video stuff (or something similar) would be in a better position to handle that.

@pthatcherg (Contributor)

I think what I'll do is remove what is there currently for images (I meant it to be a PR and just pushed to the wrong branch anyway) and then make a PR like this.

@pthatcherg (Contributor)

As I'm updating the explainer to represent this, I had the following question:

What is it that the browser can do here that you can't do already in wasm/JS?

@jakearchibald (Author)

In terms of the things discussed in this thread, it can all be done if you throw enough wasm at it.

However, the browser already has well-optimised streaming image decoders, so why not let developers use those rather than importing their own?

@guest271314 (Contributor)

Is this issue aiming for something similar to http://www.http2demo.io and https://http2.akamai.com/demo?

Does not HTTP require the entire resource to be downloaded, e.g., using fetch() even where ReadableStream is used to read the response? Meaning even if "Streaming image decoding" is used at HTTP it is an illusion after the fact of the resource being downloaded (save for EventSource and WebSocket, web-platform-tests/wpt#18335)?

If the total size of ImageData is known beforehand, individual pixels can be set at the ImageData instance similar to what AFAICT is described at whatwg/html#4785, see https://github.com/dsanders11/imagebitmap-getimagedata-demo.

@guest271314 (Contributor)

"Justify" and "justice" have separate etymologies.

But also, wow.

FYI, Wiktionary is not a primary source for the etymology or meaning of English words, terms, or phrases. In fact no such primary source exists, as English is an equivocal language. There is no prohibition against arbitrarily creating and re-defining words or terms, or against providing no definition at all. One example is the term "justice", which is not defined in U.S. or State law. Similarly for the term "justify" or "justifiable". E.g., the phrase "justifiable homicide" can have vastly different interpretations depending on who is being evaluated for the action and whose life was taken: the State can conclude "justifiable"; the family can reject such an assertion, for example in the case of Stephon Clark in Sacramento, California. Another example is the term "cheaper by the dozen" (https://english.stackexchange.com/q/486088), which has several different interpretations, or "etymologies", if you prefer. The closest you can get to a primary source for the meaning of a word, term, or phrase in English is a technical document, e.g., a specification or standard, or a law enacted by a legislative body, where the codified rules of statutory construction apply.

@jakearchibald (Author)

@guest271314

Is this issue aiming for something similar to http://www.http2demo.io and https://http2.akamai.com/demo?

No.

Does not HTTP require the entire resource to be downloaded, e.g., using fetch() even where ReadableStream is used to read the response?

No.

Meaning even if "Streaming image decoding" is used at HTTP it is an illusion after the fact of the resource being downloaded (save for EventSource and WebSocket, web-platform-tests/wpt#18335)?

No.

If the total size of ImageData is known beforehand, individual pixels can be set at the ImageData instance similar to what AFAICT is described at whatwg/html#4785, see https://github.com/dsanders11/imagebitmap-getimagedata-demo.

Yes, but you'd still want to know when pixels changed, and what changed. For progressive/interlaced formats, your proposal doesn't let you provide your own interpolation.

@guest271314 (Contributor)

I have had the concept that, if the pixel dimensions of the image to be displayed are the same, a single "array" could be created for the purposes of this use case, and each pixel that differs could be replaced in the original array when a new "chunk" arrives, avoiding the creation of multiple ImageBitmap or ImageData instances.

Now, mapping which pixels changed would at least require creating a map and keeping track of which pixels change in both directions, which may or may not consume as many resources as creating multiple ImageBitmap and ImageData instances, due to traversing the given "array" and "map" data structures, though they should only need to be created once.

For an animation use case I would consider utilizing the Web Animations API, where images or pixels could be "streamed" to be displayed as a background-image using an "array", async generator, ReadableStream, etc. I created such a concept some time ago, though lost the code. I created another such proof-of-concept for the use case of creating a timeline where the input is a MediaStream from canvas.captureStream(), which does not have an ending (it is a live stream). The basic code:

keyframes.push({
  backgroundImage: `url(${canvas.toDataURL("image/webp")})`,
  width: width + "px",
  height: height + "px",
});
stream.requestFrame();
// ...

let t = Math.floor(duration * 1000);
const animation = picture.animate(keyframes, { duration: t });
animation.play();

I am far better suited to trying to solve challenging use cases where the use case is clear and at least some code exists to test, in order to determine what the bugs are and what is ultimately not possible using current technologies.

What are you not able to achieve right now?

@jakearchibald (Author)

@guest271314

I appreciate you trying to help, but given that you don't understand the streaming nature of HTTP or image decoders, I don't think this conversation is constructive, and will only continue to derail the thread for the rest of us.

Rather than ask me questions about what HTTP and images can/can't do, perhaps do some research. You can get answers to the things you've asked with some pretty basic tests, or by putting your query to a search engine.

@guest271314 (Contributor)

I have done research. That is why I posted the previous link describing HTTP/2. It is not clear what you are trying to achieve that you cannot do right now. If you have an incoming stream of data [0, 10, 255], where the first two elements are x and y and the last is the "color code", you can simply swap the existing color for a different color. That can be drawn onto a canvas or streamed as a background image at a frame rate to any HTML element. By "streaming" an image, are you describing streaming a single image in "chunks", or streaming arbitrary pixels to form a single image for the purpose of an animation? Either procedure should be presently possible.

The accusation of "derailing" is not correct at all. I am asking specific questions attempting to gather what you are actually trying to do that you are not capable of doing right now. The questions posed are no different from any other questions on any board. If you already had the answers to your own question then there would be no need for you to file this issue in the first place; by that fact we are in the exact same space with regard to asking questions.

@guest271314 (Contributor)

@jakearchibald I am at the front end. "The rest of us", since you are speaking for everybody other than yourself, should likewise be able to figure out what you are actually trying to do and to solve the problem that is not clear to this novice coder who just writes code. I do not really care at all whether you or any other person or entity likes the questions posed or not. I am not here to make friends. I am asking technical questions for my own edification. If answering such basic questions is beneath the scale of your status then just state that: you are over-qualified to answer such questions. Which, again, leads back to why you even needed to file this issue in the first place. You can solve your own inquiry using your own expertise. I am certain that in the final analysis the outcome will be useful and clear. Best of luck with your project.

@pthatcherg added the "maybe" label (Ideas that might be in scope, and worth discussing) Sep 18, 2019
@nadavsinai commented Oct 28, 2019

Hi, we at Philips-Algotec are developing a medical imaging application and would benefit very much from this proposed WebCodecs extension, letting us use the browser's decoders in our own code.
In addition, allowing the user to register a custom decoder via JS/WASM would be truly amazing; this would really manifest the low-end extensibility that the Extensible Web Manifesto speaks of.
Something like a module adhering to the ImageDecoder interface which can be registered for a given MIME type:

const jpegXLDecoder = new JpegXLDecoder(); // do your WASM/JS magic here and adhere to the interface
const imageDecoder = new ImageDecoder({ mimetype: 'image/jpegxl', decoder: jpegXLDecoder });
navigator.registerDecoder(imageDecoder); // imaginary API...
// From here on, any <img> tag which loads a source with the right MIME type will use our decoder.
// And also imperatively:
const decoded = await imageDecoder.decode(input); // some streaming input
const canvas = ...;
canvas.getContext('2d').putImageData(decoded, 0, 0);

Actually, supporting non-streaming inputs as part of the ImageDecoder interface makes sense in this case.
What are your thoughts?

@pthatcherg (Contributor)

There are two separate things here:

  1. Making a WASM image decoder and then using it to decode input from JS.

  2. Making a WASM image decoder and then expecting an img tag to use it.

The first I believe you can do already today. No new APIs are needed.

The second is somewhat interesting, but in that case, what is the advantage of using the img tag instead of canvas?

@dalecurtis (Contributor) commented Apr 17, 2020

I've written an explainer and implemented a prototype of how this might work:
https://github.com/dalecurtis/image-decoder-api/blob/master/explainer.md

Please take a look and let me know if that approach sounds good. If so, we may eventually want to merge it with the WebCodecs explainer/spec.

@eeeps commented Aug 27, 2020

Just want to say that I'm starting to try to build a demo of JPEG-XL's native progressiveness, and how it might be useful for low-quality image placeholders or even single-file responsive images. I would absolutely love an ImageDecoder that dealt in Streams!

@dalecurtis (Contributor)

ImageDecoder can already do that. You just give it a ReadableStream as the data value.
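For anyone landing here later, a streaming decode with the WebCodecs ImageDecoder looks roughly like this. This is a sketch rather than a definitive example; check the current spec for exact details, and note that JPEG-XL support and the /progressive.jxl URL are assumptions for the sake of the demo eeeps describes.

// Sketch of streaming decode with the WebCodecs ImageDecoder.
const response = await fetch('/progressive.jxl');
const decoder = new ImageDecoder({
  type: 'image/jxl',   // MIME type of the encoded data (support is assumed here)
  data: response.body, // a ReadableStream: decoding can begin before the download completes
});

// completeFramesOnly: false asks the decoder to also emit partial
// (progressive) versions of the frame as data arrives.
const { image } = await decoder.decode({ frameIndex: 0, completeFramesOnly: false });

const canvas = document.querySelector('canvas');
canvas.getContext('2d').drawImage(image, 0, 0); // decoded frames are drawable directly
image.close(); // release the frame's memory when done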

@chcunningham (Collaborator) commented Feb 19, 2021

Old issue. Dale's explainer supports streaming; it is now implemented in Chrome behind the WebCodecs flag/origin trial and is actively being spec'ed (tracked in #50).
