Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

image/jpeg: unexpected EOF when decoding image file #45902

Open
inliquid opened this issue May 1, 2021 · 19 comments
Open

image/jpeg: unexpected EOF when decoding image file #45902

inliquid opened this issue May 1, 2021 · 19 comments

Comments

@inliquid
Copy link

@inliquid inliquid commented May 1, 2021

What version of Go are you using (go version)?

$ go version
go version go1.16.3 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\***\AppData\Local\go-build
set GOENV=C:\Users\***\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\***\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\***\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.16.3
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\***\AppData\Local\Temp\go-build2404133136=/tmp/go-build -gno-record-gcc-switches

What did you do?

I work with a web platform, where multiple users can post their content, including images. We're moving from legacy solution based on PHP and re-implementing it in Go. One of the functionality required for a platform is resizing/cropping images. We used github.com/disintegration/imaging for that and encountered this problem with that library originally. What happens is that for some particular jpeg images, there is an error when you try to decode it using standard library.

Please note, that Firefox/Chrome/etc browsers open and display that image correctly, as well as imagick (part of legacy solution) does resize/cropping of it without any problems. With image/jpeg I'm getting unexpected EOF when trying to decode an image.

Here is a sample code to reproduce an error:

package main

import (
	"image"
	_ "image/jpeg"
	"os"
)

func main() {
	f, err := os.Open("1.jpg")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	_, _, err = image.Decode(f)
	if err != nil {
		panic(err)
	}
}

Output:

panic: unexpected EOF
...

Archived image:
1.zip

What did you expect to see?

No panic.

What did you see instead?

Panic.

@ZekeLu
Copy link

@ZekeLu ZekeLu commented May 1, 2021

Output from GraphicsMagick indicates this file is an invalid JPEG file (Premature end of JPEG file). Maybe you want to check this issue #10447.

Outputs from "gm identify"
$ gm identify -verbose 1.jpg
Image: 1.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Geometry: 705x725
  Class: DirectClass
  Type: true color
  Depth: 8 bits-per-pixel component
  Channel Depths:
    Red:      8 bits
    Green:    8 bits
    Blue:     8 bits
  Channel Statistics:
    Red:
      Minimum:                     0.00 (0.0000)
      Maximum:                 65535.00 (1.0000)
      Mean:                    58898.59 (0.8987)
      Standard Deviation:      13043.94 (0.1990)
    Green:
      Minimum:                     0.00 (0.0000)
      Maximum:                 65535.00 (1.0000)
      Mean:                    52892.09 (0.8071)
      Standard Deviation:      21055.03 (0.3213)
    Blue:
      Minimum:                     0.00 (0.0000)
      Maximum:                 65535.00 (1.0000)
      Mean:                    55775.47 (0.8511)
      Standard Deviation:      17834.50 (0.2721)
  Filesize: 201.3Ki
  Interlace: No
  Orientation: Unknown
  Background Color: white
  Border Color: #DFDFDF
  Matte Color: #BDBDBD
  Page geometry: 705x725+0+0
  Compose: Over
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  JPEG-Quality: 100
  JPEG-Colorspace: 2
  JPEG-Colorspace-Name: RGB
  JPEG-Sampling-factors: 2x2,1x1,1x1
  Signature: 4daca6d3d493e6799e0f0a383b8f5acb746e7e4287984fe7418877241978b57f
  Profile-EXIF: 50 bytes
    Orientation: 0
    Exif Offset: 38
  Tainted: False
  User Time: 0.010u
  Elapsed Time: 0m:0.009465s
  Pixels Per Second: 51.5Mi
gm identify: Premature end of JPEG file (1.jpg).
@inliquid
Copy link
Author

@inliquid inliquid commented May 1, 2021

@ZekeLu I don't think this jpeg is "invalid", haven't heard of that tool and its output doesn't make any sense to me. I know that file was created by regular user with some very basic tools, such as Adobe Photoshop/scanning software. There are more examples.

What I know for sure is that image is correctly opened by all possible software on my machine:

Browsers:

  • Mozilla Firefox
  • Google Chrome
  • Microsoft Edge

Editors:

  • Affinity Photo
  • Krita
  • Paint.NET
  • FastStone Image Viewer
  • SAI (it's the only from the list to complain, but it opens the file correctly and lets editing)

As I said Image Magic tools resize and crop this image without any error, producing valid results.

@AlexRouSg
Copy link
Contributor

@AlexRouSg AlexRouSg commented May 1, 2021

Zoom in on the right edge. It does look like a slightly corrupted image.
image

As I said Image Magic tools resize and crop this image without any error, producing valid results.

This info is irreverent since it would load the image into it's own internal format which would always be valid to do operations on. So as long as it can load something, it can process it.

cc @nigeltao @robpike as per owners

@inliquid
Copy link
Author

@inliquid inliquid commented May 1, 2021

This info is irreverent since it would load the image into it's own internal format which would always be valid to do operations on. So as long as it can load something, it can process it.

Disagree. This fact clearly indicates that nothing prevents from decoding and processing an image, and Image Magic is de-facto industry standard for all sorts of image processing. It's an addition to a quite decent list of software which doesn't treat that image as "invalid".

My criteria for validity (as well as of my users) if that major web browser can read and display an image. In real world you will see tons of such images.

@AlexRouSg
Copy link
Contributor

@AlexRouSg AlexRouSg commented May 1, 2021

Disagree. This fact clearly indicates that nothing prevents from decoding and processing an image, and Image Magic is de-facto industry standard for all sorts of image processing. It's an addition to a quite decent list of software which doesn't treat that image as "invalid".

I meant being able to do operations (scale/rotate/crop/etc...) on the image is irreverent. The important thing is being able to load it.
Once a image is loaded into a image processing software, the validity of the origional image has no impact on it's ability to processs said image.

@AlexRouSg
Copy link
Contributor

@AlexRouSg AlexRouSg commented May 1, 2021

While waiting for this issue to be resolved, you could use another jpeg library e.g. github.com/pixiv/go-libjpeg/jpeg
I tested it on your example image and it does not error, tho it requires a copy of libjpeg

@slrz
Copy link

@slrz slrz commented May 1, 2021

Image Magic is de-facto industry standard for all sorts of image processing. It's an addition to a quite decent list of software which doesn't treat that image as "invalid".

Use the identify tool (shipped with ImageMagick) on the file and it will tell you that the JPEG file is corrupted/truncated. See an earlier comment where this was already done (using the GraphicsMagick fork of ImageMagick).

@ZekeLu also pointed out that your problem is already discussed in issue #10447.

@ZekeLu
Copy link

@ZekeLu ZekeLu commented May 2, 2021

Sorry that I didn't make it clear that when I said the image is invalid, I meant from the perspective of the encoding standard. I just tried ImageMagick, and it reports the same result: Premature end of JPEG file `1.jpg' @ warning/jpeg.c/JPEGWarningHandler/402. It's not doubt that the image data does not strictly follow the encoding standard. The problem is whether image/jpeg should be more tolerant for such kind of images. And that's the topic of #10447.

Outputs from "magick identify"
./magick identify -verbose 1.jpg
Image:
  Filename: 1.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Mime type: image/jpeg
  Class: DirectClass
  Geometry: 705x725+0+0
  Units: Undefined
  Colorspace: sRGB
  Type: TrueColor
  Base type: Undefined
  Endianness: Undefined
  Depth: 8-bit
  Channel depth:
    Red: 8-bit
    Green: 8-bit
    Blue: 8-bit
  Channel statistics:
    Pixels: 511125
    Red:
      min: 0  (0)
      max: 255 (1)
      mean: 229.157 (0.898656)
      median: 255 (1)
      standard deviation: 50.7915 (0.199182)
      kurtosis: 5.12959
      skewness: -2.32255
      entropy: 0.470499
    Green:
      min: 0  (0)
      max: 255 (1)
      mean: 205.811 (0.8071)
      median: 255 (1)
      standard deviation: 81.8345 (0.32092)
      kurtosis: 0.0176647
      skewness: -1.28635
      entropy: 0.476879
    Blue:
      min: 0  (0)
      max: 255 (1)
      mean: 217.027 (0.851087)
      median: 255 (1)
      standard deviation: 69.3206 (0.271846)
      kurtosis: 2.05107
      skewness: -1.78947
      entropy: 0.486341
  Image statistics:
    Overall:
      min: 0  (0)
      max: 255 (1)
      mean: 217.332 (0.852281)
      median: 255 (1)
      standard deviation: 67.3156 (0.263983)
      kurtosis: 1.83995
      skewness: -1.76201
      entropy: 0.477906
  Rendering intent: Perceptual
  Gamma: 0.454545
  Chromaticity:
    red primary: (0.64,0.33)
    green primary: (0.3,0.6)
    blue primary: (0.15,0.06)
    white point: (0.3127,0.329)
  Matte color: grey74
  Background color: white
  Border color: srgb(223,223,223)
  Transparent color: none
  Interlace: None
  Intensity: Undefined
  Compose: Over
  Page geometry: 705x725+0+0
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  Quality: 100
  Orientation: Undefined
  Profiles:
    Profile-exif: 50 bytes
  Properties:
    date:create: 2021-05-01T23:43:10+00:00
    date:modify: 2021-05-01T23:43:10+00:00
    exif:ExifOffset: 38
    jpeg:colorspace: 2
    jpeg:sampling-factor: 2x2,1x1,1x1
    signature: 89e20c767489d8dd8718fd22fd9319f2b6e7953cbf0be2368ef491acca24a3f0
  Artifacts:
    verbose: true
  Tainted: False
  Filesize: 206142B
  Number pixels: 511125
  Pixels per second: 23.8097MP
  User time: 0.020u
  Elapsed time: 0:01.021
  Version: ImageMagick 7.0.11-6 Q16 x86_64 2021-03-28 https://imagemagick.org
identify: Premature end of JPEG file `1.jpg' @ warning/jpeg.c/JPEGWarningHandler/402.
@nigeltao
Copy link
Contributor

@nigeltao nigeltao commented May 2, 2021

Yeah, duplicate of #10447

@nigeltao nigeltao closed this May 2, 2021
@inliquid
Copy link
Author

@inliquid inliquid commented May 2, 2021

@nigeltao This kind of error was never mentioned in #10447, and error I see absolutely meaningless. It's not premature end of JPEG file or something, it's generic unexpected EOF from io package which makes you think that file cannot be read due to I/O problems. Only this makes it different from the mentioned issue and makes it necessary to address, at least to introduce this kind of errors and correctly report them. So please reopen.

Sorry that I didn't make it clear that when I said the image is invalid, I meant from the perspective of the encoding standard. I just tried ImageMagick, and it reports the same result: Premature end of JPEG file `1.jpg' @ warning/jpeg.c/JPEGWarningHandler/402. It's not doubt that the image data does not strictly follow the encoding standard. The problem is whether image/jpeg should be more tolerant for such kind of images. And that's the topic of #10447.

@ZekeLu As I said I'm using Image Magick bindings to crop/resize the image and it decodes, manipulates given file with no issues. That was the point. I don't care if any specific tool can detect that file is not perfect. You will never have a chance to work only with perfect images in the wild, and it just makes image/jpeg pointless because you simply can't use it in production in such a basic scenario of multi-user web platform. And please have a look at the longer list of applications that I was able to test, that don't treat this file as corrupted: #45902 (comment)

I read that issue and according to comments from @bradfitz, Go should follow what browsers do. Is this approach changed?

While waiting for this issue to be resolved, you could use another jpeg library e.g. github.com/pixiv/go-libjpeg/jpeg
I tested it on your example image and it does not error, tho it requires a copy of libjpeg

@AlexRouSg Thanks, we are using github.com/disintegration/imaging (and it actually is quite popular) which in turn uses image standard library. It lets us have similar set of operations (Crop/Resize/Thumbnail/etc) with modern features such as auto-orientation on all types of images: jpeg, png, gif. So I don't think there is any workaround.

@ZekeLu
Copy link

@ZekeLu ZekeLu commented May 2, 2021

@inliquid, in fact, I'm with you 🤝 . I wish image/jpeg could be more tolerant with not perfect images too! But the fact is that @nigeltao is very overloaded 🤗 . I guess he has a very big backlog, and this feature doesn't get the chance to enter the top.

In my opinion, this issue can not be addressed without #10447 being addressed, and it doesn't help to just leave this one open. #10447 is tagged as NeedsDecision, to push things forward, I think we need someone experienced in image processing to create a proposal first. I wish I could help, but it's sad that I'm not familiar with image processing.

@AlexRouSg
Copy link
Contributor

@AlexRouSg AlexRouSg commented May 2, 2021

Thanks, we are using github.com/disintegration/imaging

All their functions are taking an argument of image.Image this means you can use any lib that gives you a image.Image and you don't have to use their Open function. The jpeg lib I linked you returns a `image.Image.

@inliquid
Copy link
Author

@inliquid inliquid commented May 2, 2021

@AlexRouSg as I said, we want auto orientation feature, and it works with use of imaging.Open call like:

img, err := imaging.Open("test.jpg", imaging.AutoOrientation(true))

or with direct call of imaging.Decode which takes io.Reader instead of file name, and doesn't change a lot. Another thing you skipped for some reason, is that we use similar implementation for all three web raster graphics formats: jpeg, png, and gif.

@AlexRouSg
Copy link
Contributor

@AlexRouSg AlexRouSg commented May 2, 2021

@inliquid Sorry I have a bad habbit of reply too fast, was trying to give you options on how to make it work but I see now that it's likely not worth the effort now that you are using image magic.

@nigeltao
Copy link
Contributor

@nigeltao nigeltao commented May 2, 2021

@nigeltao This kind of error was never mentioned in #10447, and error I see absolutely meaningless. It's not premature end of JPEG file or something, it's generic unexpected EOF from io package which makes you think that file cannot be read due to I/O problems. Only this makes it different from the mentioned issue and makes it necessary to address, at least to introduce this kind of errors and correctly report them. So please reopen.

There are two concerns here.

  1. When image.Decode is given an invalid JPEG, the (image.Image, error) that it returns is (nil, anError) instead of partial results (nonNil, anError).
  2. anError is io.ErrUnexpectedEOF instead of something JPEG-specific.

As for (1), that is a duplicate of #10447, as previously mentioned. Even if we 'fixed' (2), we can't solve your underlying problem (you want as much of the JPEG as you can still decode, even if it's overall technically an invalid JPEG) without solving (1).

As for (2), io.ErrUnexpectedEOF still seems reasonable to me. We asked jpeg.Decode to do something, passing an io.Reader. It couldn't do it, because it hit a premature end (an EOF) of the data stream. Sounds like an unexpected EOF to me.

io.ErrUnexpectedEOF isn't restricted to 'just OS level things' like files and sockets. For example, in the standard library:

  • The compress/zlib package will return io.ErrUnexpectedEOF if it's invalid ZLIB.
  • The encoding/hex package will return io.ErrUnexpectedEOF if it's invalid hexadecimal-encoded data.
@inliquid
Copy link
Author

@inliquid inliquid commented May 4, 2021

As for (1), that is a duplicate of #10447, as previously mentioned. Even if we 'fixed' (2), we can't solve your underlying problem (you want as much of the JPEG as you can still decode, even if it's overall technically an invalid JPEG) without solving (1).

As I said, and as you can clearly see, because you are an active participant of that issue, there is no mention of unexpected EOF as a result of processing possibly "corrupted" file. We spent quite some time investigating it, and our case is real-world scenario, and we searched through open and closed issues related to image before posting it. So I disagree it is "duplicate", if you think it's similar situation underneath (which is not that clear from the error text, it sounds more like a bug of a library) then you could copy my case to #10447 thread, so it would become an "addition" not just "duplicate", to help others who may encounter the same situation. Instead you decided to just close my case.

As for (2), io.ErrUnexpectedEOF still seems reasonable to me. We asked jpeg.Decode to do something, passing an io.Reader.

What's the point? You are talking about some internals of image, which I don't care. I don't even use image directly! Processing of images is tightly coupled with receiving a data (i.e. imaging lib does both for us, opens a file and manipulates it). How do I know that io.ErrUnexpectedEOF has anything to do with "wrong" JPEGs? How do I handle this error? What exactly I, as a caller, should do and how do I distinguish this from I/O errors? Is this a problem with web socket or TCP stream? Or NFS? Should I re-instantiate connection or close it? Should I report a caller that file needs to be resent? There is nothing pointing to the fact that file was analyzed and library could not decode it, the only fact you know is that something is wrong with I/O.

It couldn't do it, because it hit a premature end (an EOF) of the data stream. Sounds like an unexpected EOF to me.

You just said what it sounds: "premature end of [something you just processed] JPEG".

The compress/zlib package will return io.ErrUnexpectedEOF if it's invalid ZLIB.
...

Ok, nice. I don't know if their use of io.ErrUnexpectedEOF correct or not, because I don't have experience report for that, but what I know for sure is that it's wrong in this particular case.

And, again what's the point to referring to them? So if, lets imagine, their use of io error was wrong would they refer to image/jpeg and say the problem will not be fixed because it's used in the same way in image/jpeg? Vicious circle? If it's wrong or bad design, than it should be fixed, as simple as that.

@jfesler
Copy link

@jfesler jfesler commented May 4, 2021

In volunteer projects, It may be more productive to contribute working code.

@nigeltao
Copy link
Contributor

@nigeltao nigeltao commented May 5, 2021

So I disagree it is "duplicate"

As I said, there are two concerns. Concern (1) is that the Go standard library doesn't give you 'best effort' partial results when asked to decode an invalid image. To repeat what other commentators have said above, these images are invalid according to the JPEG specification. Concern (1) is a duplicate of #10447. To repeat, even if we returned a different error, you wouldn't see any difference in:

What did you expect to see?
No panic.

What did you see instead?
Panic.

in your original code snippet.


If it's... bad design, than it should be fixed, as simple as that.

Concern (2) is that it returns io.ErrUnexpectedEOF instead of jpeg.ErrSomethingElse. You're right in that it's not immediately obvious that io.ErrUnexpectedEOF in this context means that it's an invalid JPEG. In hindsight, we might have made a different design decision, but as I've said, there is still a rationale for why it behaves the way it does.

For better or worse, this is how the standard library has behaved (for image/jpeg and for other packages, as I've mentioned) for many years now. Other programs may already treat io.ErrUnexpectedEOF in certain ways. Hyrum's Law might apply. It might not be "as simple as that" to change.

I'll re-open this issue, specifically for people searching for io.ErrUnexpectedEOF when decoding images, but to be clear, I don't expect the actual behavior to change any time soon.

@nigeltao nigeltao reopened this May 5, 2021
@inliquid
Copy link
Author

@inliquid inliquid commented May 5, 2021

To repeat, even if we returned a different error, you wouldn't see any difference in
...
in your original code snippet.

This is just a stupid form. I wrote quite a lot in this thread since that, if you didn't notice.

@cagedmantis cagedmantis added this to the Backlog milestone May 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
7 participants