This repository has been archived by the owner on Jul 22, 2018. It is now read-only.
Make rawspeed much more resilient to crashing #141
Merged
Conversation
Rawspeed works by reading the whole file into memory and then indexing into it. Together with using file inputs as offsets, this can lead to out-of-bounds reads in memory. Implement a better internal API to make these checks both easier to write and harder to skip. This covers three main cases:

1) Whenever we access data directly from the file, we used to do something like:

    uchar8 *data = mFile->getData(offset);

Replace that with:

    uchar8 *data = mFile->getData(offset, len);

and then do proper bounds checking to make sure we can safely read len bytes from the returned pointer.

2) When we want to read a byte stream we would often do something like:

    ByteStream in(mFile->getData(off), mFile->getSize()-off);

which basically means we allow reading from the offset all the way to the end of the file. Make that simpler by just doing:

    ByteStream in(mFile, off);

and have that do all the needed bounds checking.

3) When we want to read a byte stream up to a certain number of bytes, we would do something like:

    ByteStream in(mFile->getData(off), len);

which may be unsafe if len > mFile->getSize()-off. Instead do:

    ByteStream in(mFile, off, len);

and that will make sure the bounds checking is done properly.
FileMap includes by default 16 bytes of extra space so that things like BitPumpMSB can safely read a bit beyond the size of the file. Take that into account properly when doing bounds checking. This fixes some broken D800 files that had the image data go all the way to the end of the file.
TiffEntry parsing can lead to out-of-bounds access. Guard against that with getData() checks instead of CHECKSIZE, and bail out of processing IFDs or entries when the checks fail, but don't bail out of the whole file because of that. Raw formats being what they are, they are still parseable even if some entries are broken.
CIFF has a similar method for reading where the number of entries is fetched from the file. Do the same checking of offsets and handling of errors that was already done for TIFF.
Don't force there to be an extra byte at the end of a TiffEntry/CiffEntry for it to be valid. As far as I can tell this didn't break any files, but it seems wrong anyway, so go ahead and fix it.
Integer overflow is tricky. Make sure offset+size never overflows by doing the calculation in 64 bits.
When offset == size we're exactly 1 byte past the end of the file, so reject that location.
When a TIFF entry is empty (count == 0), instead of doing a NULL dereference on calls like getInt(), just return some empty data.
If a TIFF entry had an empty string we'd try to allocate a zero-sized array and then write \0 to the position before that array, resulting in SIGSEGV. Instead just return the empty string.
A TIFF entry's item count is a 32-bit integer, but the corresponding byte size can overflow 32 bits if the count is very large and the element size is 4 or 8 bytes. Reject the TIFF entry if the byte size exceeds UINT32_MAX, as we don't handle files larger than that anyway.
A common pattern in the code was isValid(offset+count) which is actually wrong as offset+count is the first byte after the region we want. Create a new API that's isValid(offset,count) that does the right thing by default and convert all the users to it.
It's easy enough to generate an infinite looping TIFF by having a SubIFD that points towards itself (or other more complex indirections). Avoid crashing on this by setting a maximum TIFF depth and aborting once reached.
Just as TIFF allows for infinite-looping SubIFDs, so does CIFF (the CRW base format). Avoid those loops the same way we already did for TIFF.
At the root level, TIFF has the notion of the next IFD, and if that's made into a loop we get a chain of heap-allocating calls, eventually running out of memory. Fix that by limiting the max number of SubIFDs the root one can have to 100, which should be more than plenty for any sane file.
This isn't actually used anywhere so just remove it.
Now that the API itself does enough bounds checking we don't need to have these macros around anymore.
The TIFF metadata block may not have been found, so check for that before trying to dereference the pointer to look for make and model.
When doing bad-pixel processing it makes no sense to keep going when a lot of pixels are bad. Bail out after 500. This avoids a pathological case where all pixels are marked as bad and raw processing becomes O(n^2) or worse.
A malformed DNG file can cause a huge memory allocation and memset by claiming to have a huge CFA array. Bail out earlier as the decode is never going to work anyway.
klauspost added a commit that referenced this pull request on Mar 19, 2016: Make rawspeed much more resilient to crashing
Thanks a bunch!
:) there should be quite a few more on the way, afl-fuzz is really good at finding issues.
As discussed in #137, I've made the getData() API safer. I've also been running rawspeed under afl-fuzz and fixing quite a few crashing bugs. I've run regression testing against the rawsamples.ch sample set and everything is still working as it should.