New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C++] arrow::ipc::StreamDecoder
may return a broken RecordBatch with Consume(const uint8_t* data, int64_t size)
API
#39163
Comments
kou
added a commit
to kou/arrow
that referenced
this issue
Dec 10, 2023
…(data) We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls.
kou
added a commit
to kou/arrow
that referenced
this issue
Dec 10, 2023
…(data) We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls.
kou
added a commit
to kou/arrow
that referenced
this issue
Dec 11, 2023
…(data) We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls.
kou
added a commit
to kou/arrow
that referenced
this issue
Jan 5, 2024
…(data) We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls.
kou
added a commit
that referenced
this issue
Jan 6, 2024
…#39164) ### Rationale for this change We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls. ### What changes are included in this PR? * Add missing copies. * Clean up existing buffer copy codes. * Change tests to use ephemeral `data` to detect this case. * Add `copy_record_batch` option to `CollectListener` to copy decoded record batches. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * Closes #39163 * Closes: #39163 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
clayburn
pushed a commit
to clayburn/arrow
that referenced
this issue
Jan 23, 2024
…(data) (apache#39164) ### Rationale for this change We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls. ### What changes are included in this PR? * Add missing copies. * Clean up existing buffer copy codes. * Change tests to use ephemeral `data` to detect this case. * Add `copy_record_batch` option to `CollectListener` to copy decoded record batches. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * Closes apache#39163 * Closes: apache#39163 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
dgreiss
pushed a commit
to dgreiss/arrow
that referenced
this issue
Feb 19, 2024
…(data) (apache#39164) ### Rationale for this change We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls. ### What changes are included in this PR? * Add missing copies. * Clean up existing buffer copy codes. * Change tests to use ephemeral `data` to detect this case. * Add `copy_record_batch` option to `CollectListener` to copy decoded record batches. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * Closes apache#39163 * Closes: apache#39163 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
zanmato1984
pushed a commit
to zanmato1984/arrow
that referenced
this issue
Feb 28, 2024
…(data) (apache#39164) ### Rationale for this change We need to copy data for metadata message. Because it may be used in subsequent `Consume(data)` calls. We can't assume that the given `data` is still valid in subsequent `Consume(data)`. We also need to copy buffered `data` because it's used in subsequent `Consume(data)` calls. ### What changes are included in this PR? * Add missing copies. * Clean up existing buffer copy codes. * Change tests to use ephemeral `data` to detect this case. * Add `copy_record_batch` option to `CollectListener` to copy decoded record batches. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * Closes apache#39163 * Closes: apache#39163 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug, including details regarding any error messages, version, and platform.
arrow::ipc::StreamDecoder::Consume(const uint8_t* data, int64_t size)
can't assume the given data is still valid after the nextConsume(const uint8_t* data, int64_t size)
call. So we need to copy the given data for metadata message. Because the metadata message is used in the next body message. The next body message may be happen in the nextConsume(const uint8_t* data, int64_t size)
call.Component(s)
C++
The text was updated successfully, but these errors were encountered: