ARROW-5605: [C++] Verify Flatbuffer messages in more places to prevent crashes due to bad inputs #4573
crepererum wants to merge 7 commits into apache:master
Issue: ARROW-5605
wesm left a comment:
Only really DRY issues. Have you run arrow-ipc-read-write-benchmark to assess performance changes if any?
cpp/src/arrow/ipc/message.cc
    if (!flatbuf::VerifyMessageBuffer(verifier)) {
      return Status::IOError("Invalid flatbuffers message.");
    }
    message_ = flatbuf::GetMessage(data);
Can you factor this into a helper function to avoid code duplication?
A helper function doesn't work here because of the early IOError return.
And honestly, I was not able to find a nice macro that does this. The issue is that you would need to cover two things: the early exit and the assignment to message_ (or another variable).
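One way to cover both the early exit and the assignment is a helper that returns a Status and writes its result through an out parameter, with the call site wrapped in a macro modeled on Arrow's RETURN_NOT_OK. The following is only a minimal sketch under stated assumptions: `Status` here is a reduced stand-in rather than `arrow::Status`, and `FakeMessage`, `FakeVerify`, `VerifyMessage`, and `MessageReader` are hypothetical names used for illustration, not the code in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Reduced stand-in for arrow::Status (assumption: only ok/IOError are needed).
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status IOError(std::string msg) { return Status(false, std::move(msg)); }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// Modeled on Arrow's RETURN_NOT_OK: return early if the expression failed.
#define RETURN_NOT_OK(expr)    \
  do {                         \
    Status _st = (expr);       \
    if (!_st.ok()) return _st; \
  } while (false)

// Hypothetical stand-in for the flatbuffers-generated message view.
struct FakeMessage {
  const std::uint8_t* data = nullptr;
  std::int64_t size = 0;
};

// Hypothetical stand-in for flatbuf::VerifyMessageBuffer: this toy check
// accepts only non-empty buffers whose first byte is 0x2A.
bool FakeVerify(const std::uint8_t* data, std::int64_t size) {
  return data != nullptr && size >= 1 && data[0] == 0x2A;
}

// The helper: verification and "assignment" in one place. The result is
// written to *out only when verification succeeds.
Status VerifyMessage(const std::uint8_t* data, std::int64_t size, FakeMessage* out) {
  assert(out != nullptr);
  if (!FakeVerify(data, size)) {
    return Status::IOError("Invalid flatbuffers message.");
  }
  out->data = data;
  out->size = size;
  return Status::OK();
}

// Call site: the macro supplies the early exit, the out parameter supplies
// the assignment to message_.
class MessageReader {
 public:
  Status Open(const std::uint8_t* data, std::int64_t size) {
    RETURN_NOT_OK(VerifyMessage(data, size, &message_));
    return Status::OK();
  }
  const FakeMessage& message() const { return message_; }

 private:
  FakeMessage message_;
};
```

With this shape each call site shrinks to a single line, and the error message lives in one place.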
    if (sparse_tensor == nullptr) {
      return Status::IOError(
          "Header-type of flatbuffer-encoded Message is not SparseTensor.");
    }
This assertion about the message type could be handled more generically (since we can pass in the expected union value), which would also help reduce code duplication.
And I have the same issue here: how do I pull that code out so that it includes both the assignment and the return?
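The generic type check suggested above could take the expected union value as an argument and hand the typed header back through an out parameter. This is a rough sketch only: `HeaderType`, `FakeMessage`, `SparseTensorHeader`, `RecordBatchHeader`, and `GetHeaderAs` are hypothetical stand-ins for the flatbuffers-generated types, and `Status` is again a reduced stand-in rather than `arrow::Status`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Reduced stand-in for arrow::Status (assumption: only ok/IOError are needed).
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status IOError(std::string msg) { return Status(false, std::move(msg)); }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// Hypothetical stand-ins for the generated header union and its members.
enum class HeaderType { NONE, RecordBatch, SparseTensor };
struct RecordBatchHeader { int length; };
struct SparseTensorHeader { int non_zero_length; };

// Hypothetical message view: a type tag plus an untyped pointer to the header.
struct FakeMessage {
  HeaderType header_type;
  const void* header;
};

// Generic check: the caller passes the expected union value and a readable
// type name; *out is assigned only when the tag matches and the header exists.
template <typename T>
Status GetHeaderAs(const FakeMessage& msg, HeaderType expected,
                   const char* type_name, const T** out) {
  assert(out != nullptr);
  if (msg.header_type != expected || msg.header == nullptr) {
    return Status::IOError(
        std::string("Header-type of flatbuffer-encoded Message is not ") +
        type_name + ".");
  }
  *out = static_cast<const T*>(msg.header);
  return Status::OK();
}
```

A caller would then write something like `GetHeaderAs(msg, HeaderType::SparseTensor, "SparseTensor", &sparse_tensor)` and propagate a failed status with the usual early-return macro.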
Do we have a doc on how to run fuzzing builds somewhere?

@crepererum will you have time this week to look into my comments? Would be great to get this into 0.14.0

Is Friday sufficient? If not, I can try to get it through tomorrow, but that's nothing I can promise.

Sure, that's fine

For the benchmarks: Before: After: I have to admit though that the benchmarks are extremely flaky, even with

I'm taking care of the refactoring. Done shortly
While the first commit (fix ReadRecordBatch validation) is sufficient to fix ARROW-5605, I took the time to fix very similar issues elsewhere in the IPC code base, so that our users are probably protected and the fuzzer does not run straight into a similar problem again.