Arithmetic overflow leads to segfault in concat_batches
#3123
Comments
More investigation suggests that this happens when
Thank you for the report and reproducer. I plan to look into this next week unless somebody else gets there first. I suspect MutableArrayData is not doing a checked operation when it should be; this should return an error, or at the very least panic. As for the use of i32, this is a Javaism that unfortunately made its way into the Arrow specification. You could try using LargeUtf8, which uses i64 offsets, but you may also want to use smaller batches.
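To make the i32 versus i64 point concrete, here is a minimal sketch in plain Rust. It does not use the arrow crate, and the helper names (total_value_bytes, fits_i32_offsets, fits_i64_offsets) are hypothetical. It assumes the last offset of a concatenated string array equals the total number of value bytes, so that total has to fit in the offset type.

```rust
/// Hypothetical helper (not an arrow-rs API): sums the string bytes held by
/// each batch, returning None if the sum itself overflows u64.
fn total_value_bytes(batch_value_bytes: &[u64]) -> Option<u64> {
    batch_value_bytes
        .iter()
        .try_fold(0u64, |acc, &n| acc.checked_add(n))
}

/// True if the final offset of the concatenated array fits in an i32,
/// which is what Utf8-style offsets require.
fn fits_i32_offsets(batch_value_bytes: &[u64]) -> bool {
    matches!(total_value_bytes(batch_value_bytes), Some(t) if t <= i32::MAX as u64)
}

/// True if the final offset fits in an i64, as used by LargeUtf8-style offsets.
fn fits_i64_offsets(batch_value_bytes: &[u64]) -> bool {
    matches!(total_value_bytes(batch_value_bytes), Some(t) if t <= i64::MAX as u64)
}

fn main() {
    const GIB: u64 = 1 << 30;
    // Three batches with 1 GiB of string data each (illustrative sizes only).
    let batches = [GIB, GIB, GIB];
    println!("fits i32 offsets: {}", fits_i32_offsets(&batches));
    println!("fits i64 offsets: {}", fits_i64_offsets(&batches));
}
```

Under these assumptions, two 1 GiB batches already exceed i32::MAX, which is why either wider offsets or smaller concatenated batches avoid the problem.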
Describe the bug
I can reliably reproduce a bug where concat_batches performs an addition that overflows (see https://github.com/apache/arrow-rs/blob/master/arrow-data/src/transform/utils.rs#L39). In release mode, the result for my application and data is a segfault on Linux. When I run the same application in debug mode, I get a panic instead, since arithmetic overflow is checked in debug builds.
To Reproduce
I've tried to minimize this as much as I can. Check out the repo https://github.com/msalib/broken-arrow and run the following command:
You should see a panic with a backtrace (I'm running on x86-64 Linux). That repo includes a Cargo.toml which enables overflow-checks in release builds because I got tired of waiting for slower debug builds. If you run the same command with just one less copy of redacted.parquet, it will succeed without panicking.
Expected behavior
The code should not segfault. Either the arithmetic should be checked for overflow so we get a panic or, preferably, the overflow should be avoided altogether.
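As a rough illustration of the checked alternative, the sketch below appends i32 offsets with checked_add and reports an error instead of wrapping. It is plain Rust and not the arrow-rs implementation; OffsetOverflow and append_offsets_checked are hypothetical names introduced only for this example.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch (not part of arrow-rs).
#[derive(Debug, PartialEq)]
struct OffsetOverflow {
    last_offset: i32,
    length: usize,
}

/// Hypothetical helper: starting from `last_offset`, produces one new offset
/// per value length, failing instead of wrapping when an offset would no
/// longer fit in an i32.
fn append_offsets_checked(
    last_offset: i32,
    lengths: &[usize],
) -> Result<Vec<i32>, OffsetOverflow> {
    let mut out = Vec::with_capacity(lengths.len());
    let mut last = last_offset;
    for &length in lengths {
        // Both the usize -> i32 conversion and the addition can fail.
        let next = i32::try_from(length)
            .ok()
            .and_then(|len| last.checked_add(len))
            .ok_or(OffsetOverflow { last_offset: last, length })?;
        out.push(next);
        last = next;
    }
    Ok(out)
}

fn main() {
    // Unchecked two's-complement arithmetic wraps to a negative offset.
    let wrapped = (i32::MAX - 1).wrapping_add(2);
    println!("wrapped offset: {}", wrapped);

    // The checked version surfaces the same situation as an error.
    match append_offsets_checked(i32::MAX - 1, &[2]) {
        Ok(offsets) => println!("offsets: {:?}", offsets),
        Err(e) => println!("overflow detected: {:?}", e),
    }
}
```

A negative offset later used to index a values buffer is the kind of state that can end in an out-of-bounds access, so failing at the addition keeps the error close to its cause.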
Additional context
Here's the full backtrace from my application (code and data not included here because they are proprietary, but we're using color-eyre and tracing so we get a more detailed backtrace):
Thanks so much! We rely heavily on arrow+parquet and they've made our lives a lot easier!