Implement a safer TiffEntry/CiffEntry way of accessing data #142

pedrocr · 2016-02-21T15:51:08Z

After the work in PR #141 I've come up with a sketch for a further API change to achieve two things at once:

Make the TiffEntry/CiffEntry APIs safer against out of bounds access
Finally remove the TiffIFD/TiffIFDBE split, as discussed in PR Rename 'BE' to "Swap", #27

So here's the basic sketch:

Remove getIntArray()/getShortArray() functions and instead make getInt()/getShort()/getData() take an optional integer argument that's the array offset (and in the case of getData() a count as well)
Make TiffEntry/CiffEntry no longer have a pointer to data but instead just keep around the reference to the underlying FileMap
Move the get4BE() and friends macros into the FileMap API
Have TiffEntry/CiffEntry know the host and file endianness (pushing into FileMap is also possible but sometimes there are BE bits inside a LE file and vice-versa)
Now when doing getInt()/getShort()/getData() just call into FileMap with getNBE()/getNLE()/getData() insuring proper bounds checks. It is also endian clean so there's no need to keep around the swapped versions of TiffIFD/TiffEntry. It also avoids the memory copying in TiffEntryBE.

Since the TiffEntry API is used for metadata only the performance implications should be minimal if any. How does the plan sound?

pedrocr · 2016-03-25T16:58:37Z

@klauspost any opinion on this? getShortArray() and getIntArray() make bounds checking non-obvious so I'd rather just move everything into getInt() and getShort() as described. Any objection to that?

axxel · 2016-11-30T03:25:18Z

@pedrocr : looking at the memcpy vs aligned int cast issue (see #164) I came from one thing to the next and ended up thinking about how to improve the code duplication situation around the 5 more-or-less equivalent bitpump implementations and the two ByteStream(Swap) variants, which is related to your ideas about removing TiffEntryBE. Are you still pursuing this idea? And if so, got this stuck just because @klauspost did not give any feedback to your suggestions?

LebedevRI · 2016-11-30T11:53:47Z

As far i'm concerned this got implemented and merged a while ago.

axxel · 2016-11-30T12:15:54Z

I was referring to the second point and the 'sketch', especially the FileMap API changes regarding byte swapping.

pedrocr · 2016-11-30T12:45:31Z

@axxel I haven't worked on this in a while but it's still a simplification opportunity in the code. Of the sketch I describe above only the first point is implemented. I've been toying with a rust implementation for a raw loading library and have implemented it like that:

https://github.com/pedrocr/rawloader/blob/master/src/decoders/tiff.rs

There is no TiffEntry/TiffEntryBE split, there's only a single TiffEntry that gets instantiated with an endianness and then an Endian implementation that abstracts away the swapping of bytes:

https://github.com/pedrocr/rawloader/blob/master/src/decoders/basics.rs

Works quite well. As for the BitPumps I've so far only implemented two and did them with code duplication so far but the rawspeed way of just having a base class and then implementing filling sounds good to me. BitPumps are very sensitive performance wise so optimizing for less code isn't ideal. I did that initially for the MSB/LSB pumps and the performance penalty was significant.

axxel · 2016-11-30T13:23:56Z

@pedrocr Are you intending to pursue those sketched out plans further?

As to the potential performance implications: I am well aware of them and would in no way sacrifice performance (my motivation looking into this in the first place, was my worry about a performance drop caused by the memcpy() fix for the alignment issues on some platforms). And I see further performance improvement opportunities in the handling of the byteswapping. Therefore my asking about your plans in this regard.

pedrocr · 2016-11-30T13:26:49Z

@axxel not really, if you're interested go ahead

pedrocr mentioned this issue Apr 2, 2016

Sync code changes from darktable #149

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement a safer TiffEntry/CiffEntry way of accessing data #142

Implement a safer TiffEntry/CiffEntry way of accessing data #142

pedrocr commented Feb 21, 2016

pedrocr commented Mar 25, 2016

axxel commented Nov 30, 2016

LebedevRI commented Nov 30, 2016

axxel commented Nov 30, 2016

pedrocr commented Nov 30, 2016

axxel commented Nov 30, 2016

pedrocr commented Nov 30, 2016

Implement a safer TiffEntry/CiffEntry way of accessing data #142

Implement a safer TiffEntry/CiffEntry way of accessing data #142

Comments

pedrocr commented Feb 21, 2016

pedrocr commented Mar 25, 2016

axxel commented Nov 30, 2016

LebedevRI commented Nov 30, 2016

axxel commented Nov 30, 2016

pedrocr commented Nov 30, 2016

axxel commented Nov 30, 2016

pedrocr commented Nov 30, 2016