WIP Replace Jpeg Decoder #274

JimBobSquarePants · 2017-07-10T14:13:12Z

Prerequisites

I have written a descriptive pull-request title
I have verified that there are no overlapping pull-requests open
I have verified that my change follows the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
I have provided test coverage for my change (where applicable)

Description

This is a WIP pull request to allow us to work collaboratively on a replacement for the current jpeg decoder.

While baseline encoding is fairly straightforward, decoding is much more difficult because it must support both baseline and progressive jpegs and correctly handle images produced by buggy encoders.

Our current decoder can handle some cases but struggles with others. It's fairly fast, but the code is extremely difficult to navigate due to the nature of the original Golang implementation. Its output also differs from that of other implementations, with an error in the ±3 range per pixel component. There are also hints of a memory leak. #151 #224
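
To make the "±3 per pixel component" figure concrete, here is a minimal sketch of how such a difference could be measured between two decoders' outputs. It is written in Python for brevity rather than the project's C#; the function name `max_component_error` and the sample data are hypothetical and not part of ImageSharp or its test suite.

```python
def max_component_error(actual, expected):
    """Return the largest absolute per-component difference between two
    equally sized images given as flat sequences of 0-255 component values."""
    if len(actual) != len(expected):
        raise ValueError("images must have the same number of components")
    return max((abs(a - e) for a, e in zip(actual, expected)), default=0)


# Hypothetical RGB data for two pixels, as produced by two different decoders.
ours = [120, 64, 200, 10, 11, 12]
reference = [118, 65, 203, 10, 11, 12]
worst = max_component_error(ours, reference)
```

A regression test could then assert that `worst` stays below an agreed tolerance for each test image.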

The new jpeg decoder is based upon a jpeg implementation found in Mozilla's PDF.js with several key enhancements.

Fast table lookups for colorspace transforms
Better handling of broken images
Faster Huffman and IDCT implementations
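
As an illustration of the first bullet, table-driven YCbCr to RGB conversion can be sketched as below. This is an assumption about the general technique (precomputed fixed-point tables, in the style used by libjpeg), not the code in this PR; it is written in Python rather than C#, and all names (`CR_R`, `CB_B`, `CR_G`, `CB_G`, `ycbcr_to_rgb`) are hypothetical.

```python
SCALE_BITS = 16
HALF = 1 << (SCALE_BITS - 1)  # rounding term for the fixed-point shift


def _fix(x):
    """Convert a positive float constant to 16.16 fixed point."""
    return int(x * (1 << SCALE_BITS) + 0.5)


# One table entry per possible chroma value, built once up front.
CR_R = [(_fix(1.402) * (i - 128) + HALF) >> SCALE_BITS for i in range(256)]
CB_B = [(_fix(1.772) * (i - 128) + HALF) >> SCALE_BITS for i in range(256)]
# The two green terms stay scaled so they can be summed before shifting.
CR_G = [-_fix(0.714136) * (i - 128) for i in range(256)]
CB_G = [-_fix(0.344136) * (i - 128) + HALF for i in range(256)]


def clamp(v):
    """Clamp a value to the 0-255 component range."""
    return 0 if v < 0 else 255 if v > 255 else v


def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range JFIF YCbCr pixel to RGB using only lookups."""
    r = y + CR_R[cr]
    g = y + ((CB_G[cb] + CR_G[cr]) >> SCALE_BITS)
    b = y + CB_B[cb]
    return clamp(r), clamp(g), clamp(b)
```

The per-pixel work is reduced to four table reads, a few additions, and one shift; the multiplications happen only when the tables are built.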

We can now decode the images in #159, correctly throw in #214, and produce output much closer to the reference implementations in #155 and #245.

However, the new decoder is 2x slower than our original and will require optimization. To do that we need to do the following:

Modularize the code more to allow improved testing
Perform unit and load testing on the decoder components
Optimize the decoder based on test findings.

Decoding a jpeg has two areas of high complexity which require extreme optimization: Huffman decoding and the Inverse Discrete Cosine Transform. We can make large gains in performance by focusing our efforts there.
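
For unit testing an optimized IDCT, it helps to have a slow but obviously correct reference to compare against. The sketch below is a direct evaluation of the standard 8x8 inverse DCT formula, written in Python rather than C#. It is not the implementation in this PR; the name `idct_8x8` is hypothetical, and the +128 level shift that a decoder applies afterwards is left out.

```python
import math


def idct_8x8(block):
    """Direct (slow) 2D inverse DCT of one 8x8 block.

    block[v][u] holds the dequantized coefficient at vertical frequency v
    and horizontal frequency u; the result is indexed as out[y][x].
    """
    def c(k):
        # Normalization factor: 1/sqrt(2) for the DC term, 1 otherwise.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    # Precompute cos((2*x + 1) * u * pi / 16) for all x, u in 0..7.
    cos = [[math.cos((2 * x + 1) * u * math.pi / 16.0) for u in range(8)]
           for x in range(8)]

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += c(u) * c(v) * block[v][u] * cos[x][u] * cos[y][v]
            out[y][x] = total / 4.0
    return out


# A block with only a DC coefficient decodes to a flat block.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 8.0
flat = idct_8x8(dc_only)
```

With a DC coefficient of 8, every sample in `flat` comes out at approximately 1.0, which makes a convenient sanity check before comparing a faster implementation against this one on real blocks.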

The Huffman decoder requires an accurate, fast lookup table implementation (see here and here for a reference implementation in Java), and the IDCT can be sped up using SIMD. (I would like to see if we can do it using Vector<short> rather than the Block8x8F implementation we have in the current decoder, to keep memory usage down. Perhaps @mellinoe can give us some hints there?)
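
The lookup table idea can be sketched as follows: build canonical codes from the JPEG DHT layout (16 code-length counts plus a symbol list), then fill a table indexed by the next `LOOKAHEAD` bits so that short codes decode with a single read. This is a simplified Python illustration under stated assumptions, not the linked Java reference or the code in this PR. All names are hypothetical, the bit stream is modelled as a string of '0'/'1' characters, and the fallback path for codes longer than `LOOKAHEAD` bits is omitted.

```python
LOOKAHEAD = 8  # number of bits peeked at once (an assumed, tunable value)


def build_codes(counts, symbols):
    """counts[i] is the number of codes of length i + 1 (JPEG DHT layout).
    Returns (code, length, symbol) tuples in canonical order."""
    codes = []
    code = 0
    k = 0
    for length in range(1, 17):
        for _ in range(counts[length - 1]):
            codes.append((code, length, symbols[k]))
            code += 1
            k += 1
        code <<= 1  # moving to the next length appends a zero bit
    return codes


def build_lut(codes):
    """Map every LOOKAHEAD-bit prefix to (symbol, code length)."""
    lut = [None] * (1 << LOOKAHEAD)
    for code, length, symbol in codes:
        if length > LOOKAHEAD:
            continue  # longer codes need a slower fallback path (not shown)
        shift = LOOKAHEAD - length
        base = code << shift
        for i in range(1 << shift):
            lut[base + i] = (symbol, length)
    return lut


def decode(bits, lut, count):
    """Decode `count` symbols from a string of '0'/'1' characters."""
    out = []
    pos = 0
    for _ in range(count):
        # Peek LOOKAHEAD bits, zero-padding at the end of the stream.
        window = bits[pos:pos + LOOKAHEAD].ljust(LOOKAHEAD, "0")
        entry = lut[int(window, 2)]
        if entry is None:
            raise ValueError("code longer than LOOKAHEAD bits; fallback not shown")
        symbol, length = entry
        out.append(symbol)
        pos += length  # consume only the bits the code actually used
    return out


# Two 2-bit codes (00, 01) and two 3-bit codes (100, 101).
counts = [0, 2, 2] + [0] * 13
symbols = [5, 6, 3, 4]
lut = build_lut(build_codes(counts, symbols))
decoded = decode("00" "101" "01" "100", lut, 4)
```

Here `decoded` is `[5, 4, 6, 3]`. A real decoder would read from a bit buffer instead of a string and handle marker bytes, but the table construction is the same idea.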

Please have a good dig through the code. If you have any questions please ask. I'd like us to get a much better understanding of the format than we currently have.

Update by @antonfirsov

I have a somewhat different and more critical opinion of this change, and I also have a plan to speed up the decoder while keeping it correct and memory friendly. I'm quite sure there is no easy way to do this, and I really appreciate any help you can provide with the tasks I managed to identify.

antonfirsov · 2017-08-18T19:29:15Z

For everyone's information:
The new decoder should be based on this port, because the stream parsing and the CPU-intensive processing logic are clearly separated here. This makes work on progressive jpeg-s easier. So good job @JimBobSquarePants, this decoder is a really-really good thing! (Despite its messy and slow IDCT and colorspace transformation logic, which has to be replaced :P)

However, I suggest opening a new, clean WIP PR, removing the noise we gathered here. I could describe my implementation plan there and track the work. I also wiped out the description in #192 to keep all information up-to-date.

There are some possibilities for incremental work. We can release our beta with the initial slow variant of the PdfJs decoder (if we find it good enough for a beta), and improve performance later. We need to keep all the classes of the original golang port in the repository though, to keep the useful logic and tests refactored as we add changes.

@JimBobSquarePants thoughts?

JimBobSquarePants · 2017-08-18T23:12:47Z

Hey @antonfirsov ! The colorspace stuff I thought was pretty good! 😝

I agree, let's close this PR and open a new one. Make sure we also capture the EXIF fixes I added, though.

I think the PdfJS based decoder is good enough for beta despite the comparative decrease in speed. It can already successfully decode every single rogue image across all our jpeg issues and produces a very similar output to libjpeg. Make it work; make it work fast.

During beta we can tackle performance issues and ensure we don't regress using your new qa lab code.

So let's do two PRs: one to switch out the decoder, then a second WIP one for performance improvements.

antonfirsov · 2017-08-18T23:21:51Z

@JimBobSquarePants I just merged jpeg-port into my jpeg-lab branch, organizing tests+classes+namespaces in a cleaner way.

For a while I'm planning to do refactors that keep both decoders working, so we can just merge back jpeg-lab into jpeg-port and use this (#274) PR for the switch-out process.

JimBobSquarePants · 2017-08-18T23:29:37Z

Great! 👍

JimBobSquarePants and others added 30 commits June 16, 2017 22:56

Begin port

1555f09

Fix header finder

1abe631

Add js source link

d4d74b4

Use buffer

f63f85a

Remove offset

2c629c7

Fix progressive bool assignment

b025ed6

(╯°□°）╯︵ ┻━┻

3728b82

Can now build huffman tables

0ea7a6f

Begin ProcessStartOfScan

a718bf8

Merge branch 'master' into jpeg-port

cd72206

Can now decode a scan

1629819

Begin second phase of decoding

8bbc63f

Impove disposal

c1025a6

Experiment with new file marker finder

549e61f

Merge branch 'master' into jpeg-port

ba8a5b3

Decoder now doesn't break tests

2f501eb

Fix progressive decoding

4a4e94d

baseline decode works progressive nearly

472d6ba

Fix progressive scan decoding

28a8aca

Can now decode many images

ca9bd35

Merge branch 'master' into jpeg-port

69c15e3

Can now decode that bad progressive image

59c0793

Now decodes all images

827ca83

Fix #159

e2d26eb

use an offset span instead of buffer

76e91db

additional usages of Span

db2b712

fixed Sandbox46 execution

5439240

Rough working better Huffman

ea0abc9

Better Huffman decoding

0f60242

Almost got Huffman LUT working

0323d00

antonfirsov added 10 commits August 18, 2017 16:45

good by GenericFactory!

84852a8

Using Corecompat.System.Drawing as reference encoder/decoder for PNG.…

1562e32

… (Optimizing PNG-s with external tools from now.)

PngDecoder is covered now, and proven to be buggy :P

02eb5f2

covered DetectEdges

1df0010

Merge remote-tracking branch 'origin/antonfirsov/qa-lab' into jpeg-lab

f6904d9

# Conflicts: # tests/Images/External

TestImageProvider.FileProvider cache is now aware of decoder parameters

a103cb8

Merge remote-tracking branch 'origin/antonfirsov/qa-lab' into jpeg-lab

a798d5a

provider.GetImage(new JpegDecoder())

385ed88

let's merge jpeg-port to have the changelog!

c4953b0

Merge remote-tracking branch 'origin/jpeg-port' into jpeg-lab

1959c4a

# Conflicts: # tests/ImageSharp.Tests/TestFile.cs

antonfirsov mentioned this pull request Aug 18, 2017

Improve Jpeg Decoder #192

Closed

antonfirsov added 8 commits August 18, 2017 23:43

grouping files for decoders

7383124

moving a few more files

4676d8a

GolangPort namespaces following folder structure

b6d4f35

move Block8x8F into ImageSharp.Formats.Jpeg.Common

1c75403

adjust PdfJsPort namespaces

80380d9

prefixing GolangPort stuff with Old*** #Round1

51b430b

renaming is hard

493deda

introduced OldJpegDecoder : IImageDecoder for the GolangPort decoder

b3b4827

antonfirsov mentioned this pull request Aug 19, 2017

[WIP] Building an optimized and accurate Jpeg Decoder [deprecated] #298

Merged

10 tasks

JimBobSquarePants added 2 commits August 19, 2017 13:10

Merge pull request #297 from SixLabors/tocsoft/sixlabors_rename

9b34d09

Update namespaces & package names

Merge branch 'tocsoft/mutate-api' into jpeg-port

0baf7d0

Also fix Oilpainting test

antonfirsov mentioned this pull request Aug 24, 2017

Implementing or porting an accurate, SIMD optimized IDCT algorithm #306

Closed

tocsoft merged commit 0baf7d0 into master Sep 14, 2017

tocsoft deleted the jpeg-port branch September 14, 2017 18:25

antonfirsov mentioned this pull request Jun 21, 2021

JpegDecoder: post-process baseline spectral data per MCU-row #1597

Closed
