
Reduce overhead reading byte/sbyte arrays from IndexedReader #405

Merged
1 commit merged into main on Feb 6, 2024

Conversation

drewnoakes (Owner)

Traces gathered over the test suite show:

- ~2.7% CPU spent in `IndexedReader.GetByte(int)`
- ~0.4% CPU spent in `IndexedReader.GetSByte(int)`

[profiler trace screenshots]

Looking at the main callers shows loops in `TiffReader` that call these methods once per byte. That approach accrues overhead per byte due to bounds checking and virtual dispatch.

Instead, use the `Span<byte>` overload of `GetBytes`, which performs the bounds check once and then copies the data in a single call.
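
A minimal sketch of the change in loop shape, assuming reader methods of the form `GetByte(int)` and a `GetBytes` overload that fills a `Span<byte>` (the exact signatures in the library may differ):

```csharp
using System;
using System.Runtime.InteropServices;

// Before: one virtual call and one bounds check per byte.
static byte[] ReadBytesPerByte(IndexedReader reader, int offset, int count)
{
    var bytes = new byte[count];
    for (int i = 0; i < count; i++)
        bytes[i] = reader.GetByte(offset + i);
    return bytes;
}

// After: bounds are validated once, then the data is copied in a single call.
static byte[] ReadBytesBulk(IndexedReader reader, int offset, int count)
{
    var bytes = new byte[count];
    reader.GetBytes(offset, bytes.AsSpan());
    return bytes;
}

// sbyte has the same layout as byte, so the destination can be
// reinterpreted and filled with the same single GetBytes call.
static sbyte[] ReadSBytesBulk(IndexedReader reader, int offset, int count)
{
    var sbytes = new sbyte[count];
    reader.GetBytes(offset, MemoryMarshal.Cast<sbyte, byte>(sbytes.AsSpan()));
    return sbytes;
}
```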

It may be possible to give a similar treatment to the handling of other TIFF format codes, though they're not currently showing up in traces and would be more complex to implement due to byte-ordering issues (`byte` and `sbyte` being immune to those).
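
For illustration, a hedged sketch of why the wider format codes are less straightforward: each value must be decoded with the file's byte order, whereas `byte`/`sbyte` can be block-copied as-is. Here `isBigEndian` is a stand-in for however the reader actually exposes its byte order:

```csharp
using System;
using System.Buffers.Binary;

// Decoding ushort values from a raw span must honour the TIFF byte order,
// so a straight block copy of the underlying bytes is not enough.
static ushort[] ReadUInt16s(ReadOnlySpan<byte> raw, bool isBigEndian)
{
    var values = new ushort[raw.Length / 2];
    for (int i = 0; i < values.Length; i++)
    {
        var slice = raw.Slice(i * 2, 2);
        values[i] = isBigEndian
            ? BinaryPrimitives.ReadUInt16BigEndian(slice)
            : BinaryPrimitives.ReadUInt16LittleEndian(slice);
    }
    return values;
}
```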

@iamcarbon (Collaborator)

Looks good!

@drewnoakes (Owner, Author)

Once these changes are in I'll gather some new traces and see what surfaces.

We're so IO-bound that I think we'll need to rethink how we read from disk/network to get much more improvement on the perf side.

drewnoakes merged commit 1837bff into main on Feb 6, 2024
2 checks passed
drewnoakes deleted the reduce-reader-get-byte-overhead branch on February 6, 2024 at 22:10
@drewnoakes (Owner, Author)

> We're so IO-bound that I think we'll need to rethink how we read from disk/network to get much more improvement on the perf side.

As part of that, a good goal would be to enable async IO so that we're not blocking threads on IO operations. The way we currently read data, byte by byte, doesn't lend itself well to async, as the per-call overhead adds up. I lean towards async IO that pulls larger chunks of data from the file, stores those chunks as `Memory<byte>`, and then moves parsing to subsequent, non-IO steps.
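
A rough sketch of that direction, with hypothetical names (`ChunkedSource`) and a hypothetical chunk size, just to illustrate separating async IO from synchronous parsing:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical sketch: read large chunks asynchronously up front,
// then hand the buffered Memory<byte> segments to a synchronous parser.
public sealed class ChunkedSource
{
    private const int ChunkSize = 64 * 1024;
    private readonly List<ReadOnlyMemory<byte>> _chunks = new();

    public IReadOnlyList<ReadOnlyMemory<byte>> Chunks => _chunks;

    public static async Task<ChunkedSource> ReadAsync(Stream stream)
    {
        var source = new ChunkedSource();
        while (true)
        {
            var buffer = new byte[ChunkSize];
            int read = await stream.ReadAsync(buffer.AsMemory());
            if (read == 0)
                break;
            source._chunks.Add(buffer.AsMemory(0, read));
        }
        return source;
    }
}

// Usage: all IO happens here, without blocking a thread...
//   var source = await ChunkedSource.ReadAsync(stream);
// ...and parsing runs afterwards over in-memory chunks:
//   foreach (var chunk in source.Chunks) Parse(chunk.Span);
```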
