Improve Decoder/Encoder symmetry #2276

JimBobSquarePants · 2022-10-28T23:45:52Z

Prerequisites

I have written a descriptive pull-request title
I have verified that there are no overlapping pull-requests open
I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
I have provided test coverage for my change (where applicable)

Description

An attempt to make the decoder/encoder patterns more symmetric and also simplify implementation.

Adds abstract ImageDecoder & ImageEncoder types which allow simplified synchronous implementations while ensuring streams are seekable.
Renames IImageDecoderSpecialized<T> to SpecializedImageDecoder<T> and adds a SpecializedImageDecoder<T> implementation that inherits ImageDecoder.
Adds QuantizingEncoder abstraction containing common properties and ensure that they are used across all implementing encoders.
Simplifies decoder/encoder implementations by removing casting and introducing default method proxying by specialized decoders/encoders
Expose default decoding resize method (not sure about this one as we might end up implementing scanline block resize on decode).
Make DecoderOptions properties init only.

There are now two separate pipelines - Our implemented codecs use a synchronous pipeline where the asynchronous implementation is proxied. External implementations are now able to fully implement both sync and async pipelines or inherit our sync pipeline.

JimBobSquarePants · 2022-11-15T12:53:13Z

src/ImageSharp/Formats/IImageDecoder.cs

 /// </summary>
-public interface IImageDecoder : IImageInfoDetector
+public interface IImageDecoder


No reason for a separate interface here.

src/ImageSharp/Formats/ImageDecoder.cs

src/ImageSharp/Formats/ImageEncoder.cs

src/ImageSharp/Formats/Bmp/BmpEncoder.cs

src/ImageSharp/Formats/ImageDecoder.cs

src/ImageSharp/Formats/SpecializedImageDecoder{T}.cs

src/ImageSharp/Formats/QuantizingImageEncoder.cs

src/ImageSharp/Image.FromStream.cs

antonfirsov

This looks definitely better than the initial version!

I've always seen encoders and decoders as types which are supposed to be implemented and only used through the our Image.Load / Image.Save infrastructure, and never directly. I think we should keep this design, since it greatly simplifies things both for us and the users.

If the main reason for changing this is the "specialized" decoding niche case, I think it's definitely not worth the extra layers being added. Instead, we can add an infra for the specialized decoding cases separated from the decoder types.

For example we can create API-s as following:

public class Image
{
    public static Image Load(JpegDecoderOptions options, Stream stream);
}

The only disadvantage of the above design is that 3rd party decoders won't be able to utilize the specialized infra, but it's such a niche (or better even non-existent) case, that I'm fine with it.

Or we can consider subclassing JpegDecoderOptions : DecoderOptions. This would greatly simplify the object model. The disadvantage is that a user can pass JpegDecoderOptions together with a PNG stream, but IMHO it doesn't sound that terrible. We can throw an exception or just ignore it.

A third solution would be to make decoders stateful, merging the decoder role with the "specialized options" role. We could then compose IImageDecoder from the general DecoderOptions. This would also remove the stateless decoder / stateful encoder asymmetry.

antonfirsov · 2022-11-20T00:18:50Z

src/ImageSharp/Formats/ImageDecoder.cs

+            // Our buffered reads may have left the stream in an incorrect non-zero position.
+            // Reset the position of the seekable stream if we did not read to the end to allow additional reads.
+            if (stream.CanSeek && stream.Position != s.Position && s.Position != s.Length)
+            {
+                stream.Position = position + s.Position;
+            }


I'm not sure I'm following every detail here. We will only create ChunkedMemoryStream if stream.CanSeek == false, otherwise the two streams are identical and stream.Position != s.Position will never be true.

Is this a new feature? Do we have a test to demonstrate/validate it?

It's a fix for #2259 where 3 images were embedded in a stream one after the other. We have tests to cover it. I can add a reference to the issue in the code.

I think I'm either missing something, or this code might be buggy.

If stream.CanSeek == false, we are creating a ChunkedMemoryStream and passing it to Action, but in that case the stream.CanSeek && ... condition cannot be ever true, so we won't rewind anything, therefore Action is basically a dummy.

I'll have another read through; ChunkedMemoryStream can seek though.

The issue only really happens because BufferedReadStream will often overshoot the expected position since it does a bulk read. We only use that stream in our internal decoders so I should move the fix to the extension method that uses it making it much more readable and only appear one time.

Wherever it lands, I would unit test the logic, if possible. No sure Issue_2259.png is enough to exercise all the cases.

OK. There was no need for the logic in Image.FromStream so I've removed it. ImageDecoder is the only place now we perform the check which makes sense since we only use BufferedReadStream internally on decoders inheriting that type. The tests I have in place for this provide coverage.

antonfirsov · 2022-11-20T00:31:35Z

src/ImageSharp/Formats/ImageDecoder.cs

+    /// <inheritdoc/>
+    public Image<TPixel> Decode<TPixel>(DecoderOptions options, Stream stream)
+        where TPixel : unmanaged, IPixel<TPixel>
+        => WithSeekableStream(


This is double-doing WithSeekableStream, which was already done in the Image.Load middleware.

It's for when the decoder is called directly. Duplication is avoided because the decoder is always using a seekable stream when called via Image.Load

For me this is a sign, that something is off with the design. Would be nice to figure out how to do it only once.

It's really only a nice to have for our own decoders. I'm happy to remove it and leave the direct decoder usage pipeline throwing instead.

I think I want to keep this. The stream is always seekable if loaded via the Image.Load middleware and our checks in the decoder will avoid a duplicate operation and call the action directly.

When someone calls ImageDecoder.Decode() or the async equivalent directly our checks will avoid unnecessary copying and also ensure that the input stream is cloned to a seekable stream when required.

antonfirsov · 2022-11-20T00:36:59Z

src/ImageSharp/Formats/IImageDecoder.cs

+    /// <param name="cancellationToken">The token to monitor for cancellation requests.</param>
+    /// <returns>A <see cref="Task{Image}"/> representing the asynchronous operation.</returns>
+    /// <exception cref="ImageFormatException">Thrown if the encoded image contains errors.</exception>
+    public Task<Image<TPixel>> DecodeAsync<TPixel>(DecoderOptions options, Stream stream, CancellationToken cancellationToken)


Considering the enormous work, I don't think it's realistic that either us or a 3rd party will ever implement a truly async real-life decoder. If so, I don't see the justification to maintain the infrastructure enabling the decoders to do so.

Even if we think it's realistic, I would at least postpone the introduction of the feature to the time we first implement it, or a user convinces us that it's really needed for them.

There's one library I know that has both sync and async decoders/encoders. TiffLibrary which has an ImageSharp adapter.

I would be happy to drop async from both decode and encode and just use an extension to wrap the synchronous call though. IImageDecoder is a lot of work to implement.

Wow, that's nice! We should keep the async path then.

As a side note I've now decorated the public cancellation token parameters with a default value.

src/ImageSharp/Formats/ImageDecoder.cs

JimBobSquarePants · 2022-11-20T04:05:06Z

@antonfirsov I'm really struggling with a stateful decoder implementation that avoids being called directly via our general middleware. The absolute last pattern I ever want to allow again is for someone to do this.

using var image = Image.Load("my.png", new PngDecoder());

It's a footgun IMO. I've answered quite a few questions on Stack Overflow - (Wrong assumption from explicit decoder usage triggered by bad file extension in this case) and elsewhere, where people have attempted to pass explicit, incorrect decoders in an attempt to decode streams then wondered why it doesn't work. Sometimes it's not their fault - I've seen tons of images over the years working in digital agencies where the image has the wrong file extension.

That's why I have tried to make it as clear as I can that specialized decoder options are an advanced, specialist operation that only individuals with the specific requirement to use those options should use.

General users should never care about, nor require the specialization.

I see 3 levels of usage.

Simple.

using var image  = Image.Load(inStream);

Intermediate.

DecoderOptions options = new(){...};
using var image  = Image.Load(options, inStream);

Advanced.

bool isJpeg = // Some method that is guaranteed to the correct result, maybe via pre-stored mime type.
if(isJpeg)
{
    JpegDecoderOptions options = new(){...}
    using var image  = new JpegDecoder().Decode(options, inStream);
}

The stateless nature of the decoder is there because we use the options in our IImageDecoder methods since we require the ability for the decoders to be initialized without constructor parameters.

Maybe we enforce a decoder factory to ImageFormatManager.SetDecoder(...) that accepted the options and used them to construct instances?

We could then drop the decoder options from the IImageDecoder API?

However, that doesn't mean we would drop passing the general decoder options via the Image.Load middleware. Those general options give us a lot of power.

I dunno, this is hard...

@Webreaper You're my target audience for the Advanced mode here. I'd love to hear your thoughts.

Webreaper · 2022-11-20T08:44:10Z

For me intermediate and advanced are the same, but I see why you called them out separately. I like the beginner / intermediate options, and they make sense to me - the options container object is exactly what makes the code readable but gives extra flexibility. Is probably make the stream the first parameter though, and options second, as that seems more intuitive.

I presume for scaled decoding I'd pass the target size in the options object? But yeah, I'd be very happy with this.

JimBobSquarePants · 2022-11-20T09:53:19Z

@Webreaper Thanks for your input. The main difference between Intermediate and Advanced is that you cannot pass JpegDecoderOptions to any of the Image.Load(...) methods. And yeah, for general scaled decoding you pass a Size via the options. The JpegDecoder will decode the image to the closest size then perform secondary resize for an exact match. To avoid this second resize you'd use the specialized options against the decoder directly.

@antonfirsov I've been reviewing the 2.x branch and the build in IImageDecoder instances are essentially stateless there since they are registered in the ImageFormatManager against the private dictionary.

JimBobSquarePants · 2022-12-08T10:47:59Z

[xUnit.net 00:02:22.17]     LoadAsync_IsCancellable(useMemoryStream: False, file: "Gif/kumin.gif", percentageOfStreamReadToCancel: 0.7) [FAIL]
  Failed LoadAsync_IsCancellable(useMemoryStream: False, file: "Gif/kumin.gif", percentageOfStreamReadToCancel: 0.7) [1 m 1 s]
  Error Message:
   System.TimeoutException : The operation has timed out.
  Stack Trace:
     at SixLabors.ImageSharp.Tests.ImageTests.Decode_Cancellation.LoadAsync_IsCancellable(Boolean useMemoryStream, String file, Double percentageOfStreamReadToCancel) in /home/runner/work/ImageSharp/ImageSharp/tests/ImageSharp.Tests/Image/ImageTests.Decode_Cancellation.cs:line 95
--- End of stack trace from previous location ---

I cannot figure out where this timeout is coming from. The duration (1m 1s) far exceeds the limit of the wait threshold.

There's a second test failing also that used to work just fine.

3m 8s
[xUnit.net 00:01:06.55]     IgnoreMetadata_ControlsWhetherMetadataIsParsed(ignoreMetadata: True) [FAIL]
  Failed IgnoreMetadata_ControlsWhetherMetadataIsParsed(ignoreMetadata: True) [24 ms]
  Error Message:
   Expected 0 undisposed buffers but found 1
Expected: True
Actual:   False
  Stack Trace:
     at SixLabors.ImageSharp.Tests.MemoryAllocatorValidator.TestMemoryDiagnostics.Validate(Int32 expectedAllocationCount) in /home/runner/work/ImageSharp/ImageSharp/tests/ImageSharp.Tests/MemoryAllocatorValidator.cs:line 61
   at SixLabors.ImageSharp.Tests.ValidateDisposedMemoryAllocationsAttribute.After(MethodInfo methodUnderTest) in /home/runner/work/ImageSharp/ImageSharp/tests/ImageSharp.Tests/ValidateDisposedMemoryAllocationsAttribute.cs:line 28

This reverts commit 718bdc4.

antonfirsov · 2022-12-08T15:45:44Z

@JimBobSquarePants the .WaitAsync(TimeSpan.FromSeconds(30)) is a trick to avoid hanging the test run when PausedStream becomes stuck. We should give sufficient time though, otherwise the test will be flaky, depending on the server load.

JimBobSquarePants · 2022-12-08T22:39:10Z

@antonfirsov Ah Ok yeah. It's WaitAsync that throws the exception.

There's something seriously funky going on then. Should we be overriding CopyAsync in PausedStream? We seem to be mixing and matching PausedMemoryStream and PausedStream behaviors through the tests (JpegDecoderTests are mappsing a MemoryStream to PausedStream). Perhaps PausedStream should be PauseFileStream?

…mageSharp into js/decoder-attempt-2

JimBobSquarePants · 2022-12-09T04:22:31Z

https://github.com/SixLabors/ImageSharp/actions/runs/3654144968/jobs/6174307674#step:13:267

I think there's something wrong with the VM. This test passed fine at 8fdd6b0 and we haven't touched anything that could be related since.

JimBobSquarePants · 2022-12-11T22:50:33Z

@antonfirsov Figured out why the metadata test is failing as it's now replicable on my machine. TestFile has a private Image property that is created then cloned whenever we call TestFile.CreateRgba32Image. This image is never disposed.

I'm actually amazed the tests haven't thrown for this before.

I'm now seeing the png file issue on my local machine also so shall investigate.

EDIT. Interesting.... Fixing TestFile somehow made the png issue disappear. Will run a few times to be sure it's not some thread safety thing.

It was a fluke...

…mageSharp into js/decoder-attempt-2

JimBobSquarePants · 2022-12-11T23:29:45Z

@antonfirsov Looks like your tests definitely fail on Windows too.

https://github.com/SixLabors/ImageSharp/actions/runs/3671278744/jobs/6206376200#step:14:338

JimBobSquarePants · 2022-12-12T13:17:49Z

src/ImageSharp/Formats/ImageDecoder.cs

+        long position = stream.CanSeek ? stream.Position : 0;
+        Configuration configuration = options.Configuration;
+        using ChunkedMemoryStream memoryStream = new(configuration.MemoryAllocator);
+        await stream.CopyToAsync(memoryStream, configuration.StreamProcessingBufferSize, cancellationToken).ConfigureAwait(false);


@antonfirsov What I don't understand about the test that is timing out is that the entire stream is copied here in the failing condition. This means we shouldn't even reach the actual decode operation before the position-based cancellation kicks in and an exception is thrown.

I also cannot get the test to fail in any scenario unless the full suite is running.

The position based cancellation will kick in during the copy, since the copy is utilizing reads/writes since 3a7c4f4. When PausedMemoryStream is used, WithSeekableMemoryStreamAsync will threat it as a MemoryStream and avoid wrapping it.

We'll need to press the rerun a few more times to prove it, but I think 85dc0f3 fixed the test issues. I introduced the timeout to handle a scenario where I noticed (or imagined?) that the paused stream was hanging without actually trying to understand why did it hang. Anyways, good news is that I don't see this hang happening anymore.

Instead, we were seeing the tests timing out because a decoder run can actually take longer than 30 seconds on a heavily overloaded 1-core VM.

Our code shouldn't, (at least I couldn't find a reason it should following refactoring) be hanging since the wrapping since the sync wrapping handles exceptions and always returns a Task.

Fingers crossed this has it! I'll trigger the run again now.

That's 2 successful runs in a row. I'll trigger a 3rd now.

I also did it 3 times, should be good now hopefully.

antonfirsov · 2022-12-12T22:24:34Z

Re 2145794: the cache in TestFile was introduced very long time ago to speed-up overall test execution. Instead of removing it, I would either avoid TestFile.GetImage() in decoder tests, or introduce a TestFile.GetImage(..., bool skipCache=false) overload.

Alternatively, we can measure test run times with/without the cache, and drop it if we don't see much difference in test execution times, but I'm not big fan of making this many conceptual changes in a single PR.

Edit: actually, the usage is not that broad as I thought, so probably it's OK that it's gone.

JimBobSquarePants · 2022-12-12T22:44:11Z

Alternatively, we can measure test run times with/without the cache, and drop it if we don't see much difference in test execution times, but I'm not big fan of making this many conceptual changes in a single PR.

I haven't noticed any change in times at all actually.

antonfirsov

One thing I would consider is to introduce shared singletons for now stateless decoders, so

new JpegDecoder().Decode(new JpegDecoderOptions() { ...}, stream);

becomes

JpegDecoder.Shared.Decode(new JpegDecoderOptions() { ...}, stream);

antonfirsov · 2022-12-12T22:45:57Z

src/ImageSharp/Formats/SpecializedImageDecoder{T}.cs

+    /// <param name="cancellationToken">The token to monitor for cancellation requests.</param>
+    /// <returns>The <see cref="Image{TPixel}" />.</returns>
+    /// <exception cref="ImageFormatException">Thrown if the encoded image contains errors.</exception>
+    protected abstract Image<TPixel> Decode<TPixel>(T options, Stream stream, CancellationToken cancellationToken)


Here we still lack the ability to provide no CancellationToken.

That happens on the public caller. This is the protected inner call.

Oops, commented on the wrong method.

If you start typing new JpegDecoder().DecodeAsync( ... you won't see an overload that takes JpegDecoderOptions without also asking for a CancellationToken.

One reason why I aim to add tests that validate all public methods directly it helps avoiding such mistakes.

Ah right... That was intentional. My thought was that more often than not a cancellation token will be passed. I'll update.

Fixed. I've updated all tests calling those APIs directly to remove the default passed token.

…mageSharp into js/decoder-attempt-2

JimBobSquarePants · 2022-12-13T23:57:09Z

@antonfirsov I've introduced shared singletons for all our decoders and updated all references to them. I went with Instance though over Shared to match existing convention.

Please let me know if there's anything else you consider blocking. I'm very keen to get this merged.

antonfirsov

Great work, thanks for the patience!

JimBobSquarePants · 2022-12-14T22:27:21Z

Thanks for really pushing me here @antonfirsov Without your input I would have made many mistakes.

JimBobSquarePants added 4 commits October 28, 2022 23:16

Replace IImageDecoder

19bdecc

Replace IImageEncoder

31def5c

Make decoder options init only (except Configuration cos tests)

5726089

Delete ImageTests.ImageLoadTestBase.Fakes.cs

7d5c05b

JimBobSquarePants requested review from brianpopow, antonfirsov, dlemstra and tocsoft October 28, 2022 23:45

JimBobSquarePants added this to the 3.0.0 milestone Oct 28, 2022

JimBobSquarePants added the API label Oct 28, 2022

JimBobSquarePants added 7 commits October 29, 2022 11:43

Use new not shared options.

93a1700

Merge branch 'main' into js/decoder-attempt-2

391442b

Merge branch 'main' into js/decoder-attempt-2

5669626

Re-introduce IImageDecoder and split decoding pipelines.

1d99413

Re-introduce IImageEncoder and split encoding pipelines.

c550af8

Rename base encoder

b1db34d

Stub code to prevent unnecessary stream copying on decode.

dd88269

JimBobSquarePants added the enhancement label Nov 15, 2022

JimBobSquarePants added 2 commits November 15, 2022 22:51

Update AotCompilerTools.cs

b98e1df

Update SpecializedImageDecoder{T}.cs

1dc7bbf

JimBobSquarePants commented Nov 15, 2022

View reviewed changes

antonfirsov reviewed Nov 20, 2022

View reviewed changes

JimBobSquarePants added 2 commits November 20, 2022 16:02

Use ScaleToTargetSize

afa204b

Enable optimization.

2721ad5

Rename and move configuration module.

0dc9949

JimBobSquarePants mentioned this pull request Nov 20, 2022

Enable Nullable #2282

Merged

4 tasks

Merge branch 'main' into js/decoder-attempt-2

b32755d

Experiment with a much shorter timeout.

718bdc4

Revert "Experiment with a much shorter timeout."

5f488ee

This reverts commit 718bdc4.

antonfirsov added 4 commits December 9, 2022 03:10

disable gif cancellation tests

64b4045

Merge branch 'js/decoder-attempt-2' of https://github.com/SixLabors/I…

475825d

…mageSharp into js/decoder-attempt-2

we should validate cancellation for each decoder separately

9a55652

Disable cancellation tests on Unix entirely

bb22d69

JimBobSquarePants added 2 commits December 12, 2022 09:15

Fix test file

2145794

Merge branch 'js/decoder-attempt-2' of https://github.com/SixLabors/I…

bd6e58c

…mageSharp into js/decoder-attempt-2

Better error handling in action wrapper

1dd2d6e

JimBobSquarePants commented Dec 12, 2022

View reviewed changes

[Experiment] do not timeout LoadAsync_IsCancellable

85dc0f3

antonfirsov reviewed Dec 12, 2022

View reviewed changes

JimBobSquarePants added 3 commits December 13, 2022 09:42

Make decoder cancellation token default.

205e1b3

Merge branch 'js/decoder-attempt-2' of https://github.com/SixLabors/I…

64c7a8e

…mageSharp into js/decoder-attempt-2

Use shared instances for all built-in decoders.

4007a48

JimBobSquarePants requested a review from antonfirsov December 14, 2022 00:07

antonfirsov approved these changes Dec 14, 2022

View reviewed changes

JimBobSquarePants merged commit ca20c92 into main Dec 14, 2022

JimBobSquarePants deleted the js/decoder-attempt-2 branch December 14, 2022 22:27

JimBobSquarePants mentioned this pull request Dec 14, 2022

Enable Nullable ReferenceTypes #2236

Closed

Improve Decoder/Encoder symmetry #2276

Improve Decoder/Encoder symmetry #2276

Conversation

JimBobSquarePants commented Oct 28, 2022 • edited

Prerequisites

Description

Choose a reason for hiding this comment

antonfirsov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JimBobSquarePants Nov 22, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

antonfirsov Nov 20, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JimBobSquarePants commented Nov 20, 2022 • edited

Webreaper commented Nov 20, 2022

JimBobSquarePants commented Nov 20, 2022

JimBobSquarePants commented Dec 8, 2022 • edited

antonfirsov commented Dec 8, 2022

JimBobSquarePants commented Dec 8, 2022

JimBobSquarePants commented Dec 9, 2022

JimBobSquarePants commented Dec 11, 2022 • edited

JimBobSquarePants commented Dec 11, 2022

JimBobSquarePants Dec 12, 2022 • edited

Choose a reason for hiding this comment

antonfirsov Dec 12, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

antonfirsov commented Dec 12, 2022 • edited

JimBobSquarePants commented Dec 12, 2022

antonfirsov left a comment

Choose a reason for hiding this comment

antonfirsov Dec 12, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

antonfirsov Dec 12, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JimBobSquarePants commented Dec 13, 2022

antonfirsov left a comment

Choose a reason for hiding this comment

JimBobSquarePants commented Dec 14, 2022

JimBobSquarePants commented Oct 28, 2022 •

edited

JimBobSquarePants Nov 22, 2022 •

edited

antonfirsov Nov 20, 2022 •

edited

JimBobSquarePants commented Nov 20, 2022 •

edited

JimBobSquarePants commented Dec 8, 2022 •

edited

JimBobSquarePants commented Dec 11, 2022 •

edited

JimBobSquarePants Dec 12, 2022 •

edited

antonfirsov Dec 12, 2022 •

edited

antonfirsov commented Dec 12, 2022 •

edited

antonfirsov Dec 12, 2022 •

edited

antonfirsov Dec 12, 2022 •

edited