image/jpeg: add options to partially decode or tolerantly decode invalid images? #10447

Open
pranavraja opened this Issue Apr 14, 2015 · 28 comments

pranavraja commented Apr 14, 2015

go version devel +ce43e1f Mon Apr 13 23:27:35 2015 +0000 linux/amd64

Attempted to use jpeg.Decode on the below image:
https://streamcoimg-a.akamaihd.net/000/340/810/9ae536dd97d2d92fc17a6590509a51c0.jpg

Expected the image to decode successfully, as it displays in a browser.

Actual result:
invalid JPEG format: short Huffman data

Contributor

nigeltao commented Apr 15, 2015

It seems to decode fine for me, on tip. I know your Go version is only a few days old, but can you "git sync" and re-try?

pranavraja commented Apr 15, 2015

Strange, I updated to go version devel +e5b7674 Wed Apr 15 02:28:53 2015 +0000 linux/amd64, and am still getting the same error. Here's my test program:

package main

import (
        "image/jpeg"
        "net/http"
        "fmt"
)

func main() {
        res, err := http.Get("https://streamcoimg-a.akamaihd.net/000/340/810/9ae536dd97d2d92fc17a6590509a51c0.jpg")
        if err != nil {
                panic(err)
        }
        defer res.Body.Close()
        img, err := jpeg.Decode(res.Body)
        if err != nil {
                panic(err)
        }
        fmt.Println(img.Bounds())
}

And here's the output:

panic: invalid JPEG format: short Huffman data

goroutine 1 [running]:
main.main()
        /usr/share/fix-images/check.go:17 +0x182

Member

minux commented Apr 15, 2015

pranavraja commented Apr 15, 2015

Added fmt.Println(runtime.Version())

devel +e5b7674 Wed Apr 15 02:28:53 2015 +0000
panic: invalid JPEG format: short Huffman data

goroutine 1 [running]:
main.main()
        /usr/share/fix-images/check.go:19 +0x28a

Anyway, as long as this is fixed on tip I'm happy to close this.

@pranavraja pranavraja closed this Apr 15, 2015

tenorok commented Jun 13, 2016

Hello!
I tried Go 1.6.2. The image from @pranavraja decodes fine, but with the image http://bubble.ru/system/magazines/mg_n11_01_original.jpg I get the same error.
Could you please explain what the problem is?

Contributor

nigeltao commented Jun 14, 2016

I'm not sure what the problem is, but it's not a regression: I see the same error on the stable release (Go 1.6). We're in code freeze for the upcoming 1.7 release; I'll take a look at it once the tree opens again for 1.8.

@nigeltao nigeltao reopened this Jun 14, 2016

@nigeltao nigeltao added this to the Go1.8 milestone Jun 14, 2016

tenorok commented Jun 24, 2016

I've uploaded the problem image to GitHub for safekeeping, in case the original host removes it.
https://cloud.githubusercontent.com/assets/1322855/16350692/f67e1266-3a68-11e6-96a1-205b396b1ace.jpg

cctse commented Aug 18, 2016

I ran into the same problem. When I use Python to download a JPEG file, it raises "OSError: image file is truncated (53 bytes not processed)". The binary data can be saved to a file with ImageFile.LOAD_TRUNCATED_IMAGES=True, but the truncated pixels are set to black. With Go, all the pixels are processed and nothing is truncated, but decoding still fails with "invalid JPEG format: short Huffman data". The JPEG displays in full in Safari but is truncated in Chrome. Maybe Go needs to truncate the JPEG as Python does.

@quentinmit quentinmit added the NeedsFix label Oct 7, 2016

elektroid commented Oct 9, 2016

I have the same issue with pictures from my phone. I've attached one in case it helps:
https://cloud.githubusercontent.com/assets/6634115/19217480/2c442a36-8ddc-11e6-8392-4b45725b49ef.jpg

$ go version
go version go1.7.1 freebsd/amd64

Member

mattn commented Oct 9, 2016

package main

import (
    "fmt"
    "image/jpeg"
    "net/http"
)

func main() {
    urls := []string{
        "https://streamcoimg-a.akamaihd.net/000/340/810/9ae536dd97d2d92fc17a6590509a51c0.jpg",
        "https://cloud.githubusercontent.com/assets/6634115/19217480/2c442a36-8ddc-11e6-8392-4b45725b49ef.jpg",
    }
    for _, u := range urls {
        res, err := http.Get(u)
        if err != nil {
            panic(err)
        }
        defer res.Body.Close()
        img, err := jpeg.Decode(res.Body)
        if err != nil {
            panic(err)
        }
        fmt.Println(img.Bounds())
    }
}

It fails on the second image, the one @elektroid mentioned.

(0,0)-(1920,1080)
panic: invalid JPEG format: short Huffman data

goroutine 1 [running]:
panic(0x5f5c00, 0xc0421a4060)
        c:/dev/go/src/runtime/panic.go:527 +0x1ae
main.main()
        c:/dev/go-sandbox/jpeg.go:22 +0x21d
exit status 2

First image:

9ae536dd97d2d92fc17a6590509a51c0.jpg: JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=0], baseline, precision 8, 1920x1080, frames 3

Second image:

2c442a36-8ddc-11e6-8392-4b45725b49ef.jpg: JPEG image data, Exif standard: [TIFF image data, big-endian, direntries=9, datetime=2016:09:29 20:09:59, GPS-Data, model=Aquaris M5.5, resolutionunit=2, yresolution=155, xresolution=163], baseline, precision 8, 3120x4160, frames 3

elsonwu commented Oct 14, 2016

Same issue here (+1). Some particular JPEGs cause this problem:

invalid JPEG format: short Huffman data

Contributor

nigeltao commented Oct 15, 2016

@elektroid I'll try to find some time next week to look at it, but FWIW, I get e-mail for every comment on this issue, and somewhere along the mail pipeline, or in my browser's JPEG decoder, that attachment doesn't look like a valid JPEG. I've attached a screenshot from my mail, where I've added a pink ring to emphasize where it breaks down.

[image: invalid]

Contributor

nigeltao commented Oct 15, 2016

@elsonwu can you give more details than "some special jpg will cause this problem"? Can you attach an example? Do other programs (e.g. web browsers, Photoshop) handle those special JPEGs OK or do they also reject them?

elsonwu commented Oct 17, 2016

@nigeltao
[image: 2013-04-06 00 24 50]

This is the photo (it's from one of my users; if they object, I will remove it from here).

It opens in a web browser without any problem. Photoshop shows an error, but I can continue past it and still open and view the image.

Member

bradfitz commented Nov 7, 2016

@nigeltao, what's the status here?

elektroid commented Nov 7, 2016

I switched to "gopkg.in/gographics/imagick.v1/imagick" hoping it would cope with my improper files but it fails to load them too.

Contributor

nigeltao commented Nov 8, 2016

Sorry, I didn't find the time to make a detailed investigation, and there have been no recent changes to Go's image/jpeg package, but it sounds like non-Go software is also reporting errors with some or all of these cases.

Member

bradfitz commented Nov 10, 2016

Yeah, but I can open it in Chrome. I thought we tried to match whatever browsers do.

@rsc rsc modified the milestones: Go1.9, Go1.8 Nov 11, 2016

special commented Nov 19, 2016

All of the failing testcases here and others that I've found are truncated. They don't have complete SOS segments, don't contain an EOI, and raise warnings with other decoders. In @elsonwu's case, it's fixed by appending \x00\xff\xd9. The others are missing more data.

Still, there is a bug in that there's no way to decode truncated images, which seem relatively common and are readable with most other decoders.

My first thought is to return a partially decoded image (if there is one) along with the error; see special@c7a05f3. I don't mind adding docs/tests and submitting that if the approach is ok.

Otherwise, it seems slightly inappropriate to silence legitimate decoding errors, and there's no API for decoding options, so I'm not sure what else to do.

Contributor

nigeltao commented Nov 20, 2016

We could possibly change jpeg.Decode (and the other image codecs) to return (non-nil Image, non-nil error) with partial results if it encounters an error, although that's unusual in general for functions returning (T, err), and certainly not going to happen for Go 1.8.

As for matching whatever browsers do, what browsers do influences how far down the Postel's law slope we slip, but I'm wary of the slippery slope, and according to the JPEG spec, these are invalid images.

jlongman commented Feb 1, 2017

I took a quick look at this for my own reasons and summarized it like this:

  • sid and nancy : incorrectly terminated - no missing bytes
  • parking garage : truncated and incorrectly terminated - missing bytes
  • the guy : strange icc_profile and incorrectly terminated - no missing bytes
  • russian comic book : incorrectly terminated - no missing bytes

I judge these files to be not truncated because the bottom-right block contains pixels, unlike parking garage, which is clearly truncated a couple of rows of blocks early. I didn't check the metadata against the rows of pixels, however, so whole rows of blocks could still be missing.

The ICC_PROFILE may be something that's accepted in JPEG formats, or the format descriptions I was looking at weren't showing the working standard (as opposed to the published standard). It clearly decodes in other decoders in any case.

Anyway, I don't have a solution to your quandary about returning both an error and an image. I suppose you could put the decoded image in the error (ack), but my thought is that the issue might be better titled "image/jpeg: Unable to decode invalid/truncated JPEG image" instead of "image/jpeg: Unable to decode valid JPEG image". Cheers!

@bradfitz bradfitz modified the milestones: Go1.9Maybe, Go1.9 May 23, 2017

korya commented Jun 21, 2017

I have another example of an invalid JPEG image. The problem is that the second half of the file is filled with garbage bytes:

$ xxd bad-image.jpg
00000000: ffd8 ffe0 0010 4a46 4946 0001 0100 0001  ......JFIF......
00000010: 0001 0000 ffdb 0043 0005 0304 0404 0305  .......C........
 ... valid JPEG contents ...
0001fff0: 7ca0 574a 75cf 835d 4b12 cffd 9a0e 8fa1  |.WJu..]K.......
00020000: 7e33 1885 9110 0a4f b753 fff7 9de2 be06  ~3.....O.S......
 ... same 16-byte pattern ...
0003f4e0: 7e33 1885 9110 0a4f b753 fff7 9de2 be06  ~3.....O.S......
0003f4f0: 7e33 1885 9110 0a4f b753 fff7 9de2 be06  ~3.....O.S......

image.Decode fails with invalid JPEG format: missing 0xff00 sequence, but the browsers display the image:

[image: screen shot 2017-06-21 at 09 12 28]

korya commented Jun 21, 2017

Regarding support for invalid JPEG images: I would vote for adding it sooner by keeping the semantics of jpeg.Decode unmodified and adding a new function to decode potentially invalid JPEGs, e.g. jpeg.TryToDecode. Alternatively, if adding such a function to the jpeg package is not desirable, a new experimental package could be added that implements the new decoding semantics.

This way people could start using it today in the rare cases where image.Decode fails but the byte stream is known to be a JPEG file.

By the way, if such package already exists please let me know.

Contributor

nigeltao commented Jun 23, 2017

I'd rather have a different package instead of adding TryToDecode to the standard library for the rest of Go 1.x's lifetime. As a bonus, such a package wouldn't be bound by the standard library feature freeze that we're currently in.

I don't know if any such package already exists, and I won't have time to make one any time soon. Sorry.

Member

bradfitz commented Jun 28, 2017

Related: #20804 for an invalid GIF that browsers decode, but Go doesn't.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jun 28, 2017

@golang golang deleted a comment from KishoreVignesh Jul 15, 2017

@golang golang deleted a comment from KishoreVignesh Jul 15, 2017

@bradfitz bradfitz modified the milestones: Go1.10, Unplanned Nov 16, 2017

Member

rasky commented Nov 29, 2017

@bradfitz can we change the title of this issue? This is now about doing something for invalid images, not valid ones.

@bradfitz bradfitz changed the title from "image/jpeg: Unable to decode valid JPEG image" to "image/jpeg: add options to partially decode or tolerantly decode invalid images?" Nov 29, 2017

@gopherbot gopherbot removed the NeedsFix label Nov 29, 2017

Member

bradfitz commented Nov 29, 2017

@rasky, done.

sheerun commented Oct 9, 2018

Here's another example: [image: __20]
