Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fl_Image_Reader::read_byte() does not return EOF with file reads #271

Closed
wcout opened this issue Sep 7, 2021 · 29 comments
Closed

Fl_Image_Reader::read_byte() does not return EOF with file reads #271

wcout opened this issue Sep 7, 2021 · 29 comments
Assignees
Labels
bug Something isn't working enhancement New feature or request fixed The issue or PR was fixed.

Comments

@wcout
Copy link
Contributor

wcout commented Sep 7, 2021

...because the value of getc(pFile) is cast to unsigned char on return, so EOF (-1) becomes 255 (0xff) and the line in Fl_GIF_Image::load_ gif()

https://github.com/fltk/fltk/blob/master/src/Fl_GIF_Image.cxx#L195

will not detect the EOF condition.

A hand-crafted fake GIF shows the effect: endless loop.
bug.zip

@erco77
Copy link
Contributor

erco77 commented Sep 7, 2021

Good catch.

Well, while read_byte() is a public method, the .h file is private (.h? Shouldn't it be .H?), so I guess there's at least two ways to solve:

  1. We could try to 'fix' the function by having it return an int so that both -1 and valid data 0x00-0xff can be returned.

-or-

  1. We could leave the function as is, and just document that to test for errors one has to use ferror() on the stream to check for errors, and warn not to test the return for -1.

Not sure which is best.

@wcout
Copy link
Contributor Author

wcout commented Sep 8, 2021

Hm, ferror() didn't work for EOF, feof() did:

@@ -74,7 +74,9 @@ Fl_Image_Reader::~Fl_Image_Reader() {
 // Read a single byte from memory or a file
 uchar Fl_Image_Reader::read_byte() {
   if (pIsFile) {
-    return getc(pFile);
+    int c = getc(pFile);
+    printf("getc: %d ferror: %d feof: %d\n", c, ferror(pFile), feof(pFile));
+    return c;
   } else if (pIsData) {
     return *pData++;
   } else {

gives:

getc: -1 ferror: 0 feof: 1

@erco77
Copy link
Contributor

erco77 commented Sep 8, 2021

Good info; if we go the (2) route we should make mention of that.

The docs for getc(3) do say it returns -1 on EOF /or/ error, and I suppose one is left to test either feof() or ferror() to determine what actually happened.

@Albrecht-S
Copy link
Member

Historically the code in question has been directly in the image loading functions, w/o the Fl_Image_Reader class. IIRC that old code didn't check for any errors. The prerequisite for loading a particular image was that it is valid data.

Nevertheless this 'GH Issue' is a valid issue, but it's not only the read_byte() function that has issues. There are other read_*() functions that use read_byte() or do similar unchecked operations. A quick look at the code seems to unveil that the methods that read from memory don't check for "end of memory" as well. BTW:

int open(const char *imagename, const unsigned char *data);

lacks a size argument, hence we can't check for "end of memory" with this code.

At this point in time I'm not sure whether going the (1) or (2) route would be the better choice. Probably the most simple "fix" would be to use the feof() check in some "strategic" places in the code (since reading after feof() should return EOF again, hence we could delay the check until we need it). But my gut feeling is that it would be better to change all read_*() methods to return signed values and test for return values less than 0. (Nitpicking: IIRC EOF is not clearly defined as -1 by the standard but can be "anything", it is implementation defined - but I assume it's always -1.)

@Albrecht-S
Copy link
Member

The docs for getc(3) do say it returns -1 on EOF /or/ error, and I suppose one is left to test either feof() or ferror() to determine what actually happened.

Yep, the POSIX man page says this explicitly:

The ferror() or feof() functions must be used to distinguish between an error condition and an end-of-file condition.

Note: the POSIX man page of getc() refers to fgetc(), hence the link above to fgetc().

@Albrecht-S
Copy link
Member

the .h file is private (.h? Shouldn't it be .H?)

No, not according to the CMP.

@wcout
Copy link
Contributor Author

wcout commented Sep 8, 2021

Historically the code in question has been directly in the image loading functions, w/o the Fl_Image_Reader class. IIRC that old code didn't check for any errors.

FLTK 1.3 has the same issue with the NEXTBYTE macro:

if (i<0) {

@Albrecht-S
Copy link
Member

There could be another way when reading from a file:

  1. Let read_byte() and others (read_word(), read_dword() etc.) check for errors and store the result (EOF or ERROR) in the Fl_Image_Reader object. Then add methods like Fl_Image_Reader::eof() and Fl_Image_Reader::error() to check for existing errors (aka get_last_error()).

We should think about not reading any further bytes if a previous read operation yielded EOF or ERROR to stay consistent if the caller doesn't check for EOF or ERROR immediately.

If we went that route I'd recommend to return 0 rather than 255 (uchar(-1)) in case of EOF or ERROR.

@Albrecht-S
Copy link
Member

Reading from memory could be enhanced as follows: we add another constructor with an additional size argument to Fl_BMP_Image and Fl_GIF_Image (and maybe other image formats?). Note that Fl_PNG_Image already has a constructor with a size argument (I didn't check other image types yet). Adding this new constructor would be a step forward for more consistency. As soon as we have a size argument we can check for "end of memory" (buffer overrun).

The drawback is that we can't easily remove the old constructor(s) which should then be deprecated and necessarily have to assume "infinite memory/size". These existing constructors do already document explicitly that buffer overruns are not checked.

@erco77
Copy link
Contributor

erco77 commented Sep 9, 2021

Wouldn't option (3) just be duplicating the separate error/eof flags in the FILE struct? Apparently that's how ferror()/feof() do what they do. If we do want our own ::eof() and ::error() detection, we can just wrap feof() and ferror() on the FILE* I'd think, no need for separate flags.

But yes, any EOF detected from any of the read functions should immediately return the indication and wind back out of the image reader "gracefully" on an incomplete read.

Not sure if that means returning a black image, or deallocating the image and returning a NULL image with zero for the x()/y() sizes.

In my film production days, we'd often code the image reader library to show an "X" frame (black image with a white X to all 4 corners) as a visual indication of an image that didn't load completely.

@wcout
Copy link
Contributor Author

wcout commented Sep 9, 2021

I have tested `bug.gif' with this change:

diff --git a/src/Fl_GIF_Image.cxx b/src/Fl_GIF_Image.cxx
index d747ac480..188665758 100644
--- a/src/Fl_GIF_Image.cxx
+++ b/src/Fl_GIF_Image.cxx
@@ -192,7 +192,7 @@ void Fl_GIF_Image::load_gif_(Fl_Image_Reader &rdr)
   for (;;) {
 
     int i = rdr.read_byte();
-    if (i<0) {
+    if (rdr.error()) {
       Fl::error("Fl_GIF_Image: %s - unexpected EOF", rdr.name());
       w(0); h(0); d(0); ld(ERR_FORMAT);
       return;
@@ -242,11 +242,23 @@ void Fl_GIF_Image::load_gif_(Fl_Image_Reader &rdr)
       break; // okay, this is the image we want
     } else {
       Fl::warning("%s: unknown gif code 0x%02x", rdr.name(), i);
-      blocklen = 0;
+      // though this an unknown block id, treat it like a GIF block
+      // in order to skip it correctly
+      ch = rdr.read_byte();
+      if (rdr.error()) blocklen = 0;
+      else {
+        ch = rdr.read_byte();
+        blocklen = rdr.error() ? 0 : ch;
+      }
     }
 
     // skip the data:
-    while (blocklen>0) {while (blocklen--) {ch = rdr.read_byte();} blocklen = rdr.read_byte();}
+    while (!rdr.error() && blocklen>0) {
+      while (!rdr.error() && blocklen--) {
+       ch = rdr.read_byte();
+       }
+       blocklen = rdr.read_byte();
+    }
   }
 
   if (BitsPerPixel >= CodeSize)
diff --git a/src/Fl_Image_Reader.h b/src/Fl_Image_Reader.h
index 93f5edc0d..a8f0f5396 100644
--- a/src/Fl_Image_Reader.h
+++ b/src/Fl_Image_Reader.h
@@ -71,6 +71,7 @@ public:
   // return the name or filename for this reader
   const char *name() { return pName; }
 
+  int error() const { return pFile ? (feof(pFile) || ferror(pFile)) ? 1 : 0 : 0; }
 private:
 
   // open() sets this if we read from a file

I have also changed the handling of unknown block id's, otherwise the file would choke on every character read in with the Fl::warning("unknown gif code").

@Albrecht-S
Copy link
Member

Wouldn't option (3) just be duplicating the separate error/eof flags in the FILE struct?

Yes if we're reading from a file, but no if we're reading from memory. The advantage is that we can make it consistent for both cases.

@wcout Great job!

@Albrecht-S Albrecht-S self-assigned this Sep 20, 2021
@Albrecht-S Albrecht-S added bug Something isn't working enhancement New feature or request labels Sep 20, 2021
Albrecht-S pushed a commit that referenced this issue Sep 25, 2021
This is part 1 and a prerequisite for the fix of issue #271.
It enables the user of this internal class (Fl_{BMP|GIF}_Image)
to test for read errors and EOF (end of file) while reading.

The method used to read data from memory got an optional third
argument 'const long datasize = -1)' to limit the size of the
memory block of data provided to the image reader. Default is -1
which means "unlimited" (backwards compatibility).

Using only two arguments (w/o size limit) is deprecated and should
only be done if the data size is not available.
@Albrecht-S Albrecht-S added the active Somebody is working on it label Sep 25, 2021
Albrecht-S pushed a commit that referenced this issue Sep 25, 2021
Fix: Fl_Image_Reader::seek() would not clear the error flag when
reading from memory.
@Albrecht-S
Copy link
Member

Albrecht-S commented Sep 25, 2021

The two commits mentioned above add error detection to Fl_Image_Reader as discussed above. I also added Fl_Image_Reader::tell() which can be used in error cases to help debugging (output the current memory/file offset in error messages).

Still to do: use the new feature(s) in Fl_GIF_Image and Fl_BMP_Image. This is work in progress: I'm almost done with it but this needs some cleanup. I'm also still testing some special images that are either broken or can't be decoded by our code.

FWIW: this image file (resize.gif) appears to be displayed with random colors or crashes the current (FLTK) GIF decoder. Here is an upscaled image as displayed by gimp. If someone feels inclined to find the bug...

Review of the new Fl_Image_Reader code would be appreciated.

@Albrecht-S
Copy link
Member

FTR: I found the culprit in the image mentioned above (resize.gif) and our FLTK decoder. The problem is that ColorMapSize == 2 (i.e. 0, 1) but the transparent color index is set to 2, i.e. beyond the existing colormap. Our current code could not deal with this. My new code extends the colormap and now it works. :-)

Here is an image of the resultant display in FLTK with:

  • window background: light yellow
  • image box background: pink
  • hence transparent pixels are pink

resize_fltk_fixed

Next step: clean up code and commit...

Albrecht-S pushed a commit to Albrecht-S/fltk that referenced this issue Sep 26, 2021
- add error and EOF checks
- fix transparent pixel index outside ColorMap (Issue fltk#271)
- fix Fl_GIF_Image decoder bug (Issue fltk#274)

Work in progress, there's still room for improvements.
Albrecht-S pushed a commit that referenced this issue Sep 27, 2021
- add error and EOF checks
- fix transparent pixel index outside ColorMap (#271)
- fix Fl_GIF_Image decoder bug (#274)
- add Fl_Image_Reader::skip(unsigned int)
- use new skip() method in GIF reader
@Albrecht-S
Copy link
Member

This issue has been fixed in commit 1d847fe.

The same error and EOF checking has been applied to Fl_BMP_Image in commit e0d630e.

I consider this issue fixed. Test reports, feedback and suggestions for further improvement would be appreciated. TIA.

@Albrecht-S Albrecht-S added fixed The issue or PR was fixed. and removed active Somebody is working on it labels Sep 27, 2021
@Albrecht-S
Copy link
Member

Albrecht-S commented Sep 27, 2021

Note: the suggested modification in #271 (comment) in lines 242++ has not been applied. These lines of code concern data that is not necessarily followed by "data blocks" that can be skipped this way. The original file bug.gif issues the following warnings:

bug.gif: unknown GIF code 0x73 at offset 13
bug.gif: unknown GIF code 0x6c at offset 14
bug.gif: unknown GIF code 0x61 at offset 15
bug.gif: unknown GIF code 0x6a at offset 16
bug.gif: unknown GIF code 0x64 at offset 17
bug.gif: unknown GIF code 0x6c at offset 18
bug.gif: unknown GIF code 0x6b at offset 19
bug.gif: unknown GIF code 0x73 at offset 20
bug.gif: unknown GIF code 0x61 at offset 21
bug.gif: unknown GIF code 0x6a at offset 22
bug.gif: unknown GIF code 0x64 at offset 23
bug.gif: unknown GIF code 0x6b at offset 24
bug.gif: unknown GIF code 0x6c at offset 25
bug.gif: unknown GIF code 0x6a at offset 26
bug.gif: unknown GIF code 0x6b at offset 27
bug.gif: unknown GIF code 0x6b at offset 28
[229] Fl_GIF_Image: bug.gif - unexpected EOF or read error at offset 29

Hint: interpreting GIF code 0x73 as a block size would cause a read error and obfuscate better error diagnostics. IMHO there's nothing we can do in such a case.

EDIT: ... except maybe terminating the decoding immediately rather than reading further...

@wcout
Copy link
Contributor Author

wcout commented Sep 28, 2021

Hint: interpreting GIF code 0x73 as a block size would cause a read error and obfuscate better error diagnostics. IMHO there's nothing we can do in such a case.

Well, that's debatable, but not really important. The main drawback is a (possible) thousands of error lines in the console. Terminating though may be not such a good idea, because it could break some special GIF's - who knows...

@wcout
Copy link
Contributor Author

wcout commented Sep 28, 2021

I was just skimming through the new code. Will test later, as I'm busy right now.

What I notice: Is there maybe a leak when CHECK_ERROR returns prematurely after line

uchar *Image = new uchar[Width*Height];

At normal exit the memory is released:

delete[] Image;

...but not in CHECK_ERROR.

@wcout
Copy link
Contributor Author

wcout commented Sep 28, 2021

Ah, and FWIW cppcheck meant this line could be const:

const char *name() { return pName; }

Albrecht-S pushed a commit that referenced this issue Sep 28, 2021
This could happen if a read error or end of file was encountered.
Albrecht-S pushed a commit that referenced this issue Sep 28, 2021
@Albrecht-S
Copy link
Member

@wcout wrote:

Is there maybe a leak when CHECK_ERROR returns prematurely ...

Good catch, thanks! Fixed in 32e02a6.

cppcheck meant this line could be const ...

Thanks for this as well, fixed in 66c0bf4.

@Albrecht-S
Copy link
Member

Referring to this comment: I agree it's debatable, but the error message "unknown GIF code 0x..." is in a context (the main read loop) where only a few different codes are to be expected. I read the "grammar" in the GIF spec (Appendix B) and I believe that finding other codes in this place is a serious error, i.e. a broken GIF file. I'll double check this though.

In other places there's the notion of data (sub)blocks with a blocklen and a terminating block. If we encounter an "unknown GIF extension" we can (and do) skip and thus ignore the data blocks and continue reading. But that's definitely another issue.

BTW, we don't (yet) parse "empty" GIF files correctly. This one is from the pygif test suite mentioned here:

zero-height.gif: unknown GIF code 0x3b at offset 19
[229] Fl_GIF_Image: zero-height.gif - unexpected EOF or read error at offset 20

The point here is that 0x3b indicates the "trailer" (which we don't recognize). After that - I think - we should terminate the reader. This happens only if we don't find a valid image which means the Fl_GIF_Image object is empty (invalid). The GIF docs say in Appendix B:

The grammar indicates that it is possible for a GIF Data Stream to contain the Header, the Logical Screen Descriptor, a Global Color Table and the GIF Trailer. This special case is used to load a GIF decoder with a Global Color Table, in preparation for subsequent Data Streams without color tables at all.

Since we don't read "subsequent Data Streams" this is a moot point and we should exit gracefully - but I don't know yet what to do with the image object. Mark it as invalid (ERR_FORMAT) or create an image with w() = h() = d() = ld() = 0?

@wcout
Copy link
Contributor Author

wcout commented Sep 28, 2021

the error message "unknown GIF code 0x..." is in a context (the main read loop) where only a few different codes are to be expected. I read the "grammar" in the GIF spec (Appendix B) and I believe that finding other codes in this place is a serious error, i.e. a broken GIF file.

After reading up now...you are right, this is not the place for block skipping. Thanks for your explanations!

Since we don't read "subsequent Data Streams" this is a moot point and we should exit gracefully - but I don't know yet what to do with the image object. Mark it as invalid (ERR_FORMAT) or create an image with w() = h() = d() = ld() = 0?

The GIF viewers I tested with, treat it as an invalid GIF. This seems to me also the best solution in order to discern this case from image descriptor with zero width/height.

@wcout
Copy link
Contributor Author

wcout commented Sep 30, 2021

I have tested all files on my hard drive with the new version and it works perfectly 👍

Hope you don't mind some nitpicking:

As you certainly know this line is technically not necessary:

if (Image)

Out of curiosity:
What do these pXXXX variable names in Fl_Image_Reader mean? p like private? I do not see this scheme used in the other FLTK files.

@Albrecht-S
Copy link
Member

Great, thanks for confirmation.

As you certainly know this line is technically not necessary: ...

I usually use it anyway (but not always) to emphasize that it's only done when an image was allocated, but you're right, I know that.

What do these pXXXX variable names in Fl_Image_Reader mean? p like private? I do not see this scheme used in the other FLTK files.

I don't know either. It's not the official FLTK coding style, but sometimes contributed code doesn't conform to it. Private member variables should have a trailing underscore instead. Personally I think the 'p' is a bad choice, it could mean private, public, protected, pointer, ..., maybe more. I chose to keep the existing style and added the new variables with a 'p' prefix as well to avoid changing too much unrelated code but I'm not happy with this. BTW: the CamelCase variable names in Fl_GIF_Image are also not FLTK style.

@Albrecht-S
Copy link
Member

I'm leaving this issue open because I intend to work on the remaining open points:

  • parse the GIF trailer 0x3b correctly (terminate image reading with error/warning message)
  • improve unknown GIF code behavior: terminate with error message

@wcout
Copy link
Contributor Author

wcout commented Sep 30, 2021

it could mean private, public, protected, pointer

Hope not pointer, for then pError would be wrong ;)
[I came to this only because I tried to make it public (in order to save on the number of functions call to gif_error()), and asked myself if that would change the name, if private was the meaning of p]

@wcout
Copy link
Contributor Author

wcout commented Sep 30, 2021

improve unknown GIF code behavior: terminate with error message

BTW: I came across a file (it maybe was posted in one the FLTK groups previously), that has this error, but still loads ok then:
black_white.zip

@Albrecht-S
Copy link
Member

Albrecht-S commented Sep 30, 2021

BTW: I came across a file (it maybe was posted in one the FLTK groups previously), that has this error, but still loads ok then:
black_white.zip

IMHO it doesn't "load ok" in the FLTK viewer (fl_pixmap_browser): it displays an image, but that's not "ok". As you can see in the error messages, the first 24 lines seem to show the contents of what should be a global color table (8 gray values). The last line says "black_white.gif does not have a color table, using default" (which could be considered a "feature"). However, such a broken image could eventually lead to anything, including breaking an otherwise correct FLTK program (if it crashes later for whatever reason). So the only safe solution is IMHO to terminate decoding and ...

All the other "image viewers" I tested (including Firefox and Chrome) don't display an image, some of them issue an error message, others don't.

FTR: the broken image has only one bit wrong: the "Global Color Table Flag" is 0 (must be 1). I fixed it with a hex editor (s/0x22/0xa2/ - see below) et voilà: it works. It's a nicely animated GIF image.

$ xxd -g1 black_white.gif | head -1; xxd -g1 black_white_fixed.gif | head -1
00000000: 47 49 46 38 39 61 58 02 58 02 22 06 00 99 99 99  GIF89aX.X.".....
00000000: 47 49 46 38 39 61 58 02 58 02 a2 06 00 99 99 99  GIF89aX.X.......

Image attached:
black_white_fixed.gif

@Albrecht-S
Copy link
Member

The two remaining issues have been fixed in commit cc82b74:

  • parse the GIF trailer 0x3b correctly (terminate image reading with error/warning message)
  • improve unknown GIF code behavior: terminate with error message

This concludes the work on this issue and related code.

I'm closing this issue now. Thanks for all contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request fixed The issue or PR was fixed.
Projects
None yet
Development

No branches or pull requests

3 participants