ARROW-6901: [Rust] [Parquet] Increment total_num_rows when closing a row group #5672
bw-matthew wants to merge 3 commits into apache:master
Will need to work on a fix and then reverse those changes.
This can be run with:
```
git submodule update --init
(
cd rust
PARQUET_TEST_DATA="$(git rev-parse --show-toplevel)/cpp/submodules/parquet-testing/data" ARROW_TEST_DATA="$(git rev-parse --show-toplevel)/testing/data" cargo +nightly-2019-09-25 test
)
```
It has failed linting; going to make those changes.
I changed it to i64 because RowGroupMetaData uses i64 for its num_rows field (and ColumnChunkMetaData uses i64 for num_values). None of these values can reasonably be negative, so I understand what you are saying. If you think it would be better to change them, I can do that, but I'm cautious about that approach: when I tried it, it started interacting with things like the Thrift conversion.
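To illustrate the trade-off: Thrift has no unsigned integer types, so counts in the footer metadata are signed. One option is to keep i64 internally and convert only at the edges. The sketch below is not code from this PR; `sum_row_counts` and `row_count_as_u64` are hypothetical helper names used only for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: sum per-row-group counts while staying in i64,
/// the type the metadata structs already use. Returns None on overflow.
fn sum_row_counts(row_groups: &[i64]) -> Option<i64> {
    row_groups
        .iter()
        .try_fold(0i64, |acc, &n| acc.checked_add(n))
}

/// Hypothetical helper: convert to u64 only where a caller needs an
/// unsigned count. Returns None for a negative (invalid) value.
fn row_count_as_u64(num_rows: i64) -> Option<u64> {
    u64::try_from(num_rows).ok()
}
```

Keeping i64 throughout avoids a conversion at every Thrift boundary; the cost is that a negative value has to be rejected explicitly wherever it matters.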
sadikovi left a comment
Looks good. Thanks for fixing it.
Did you test that manually?
I had a private branch with the fixes, but somehow I forgot to merge it upstream back then.
This PR updates the round trip tests to check that the correct value is present when reading the file metadata from the written file. Thanks for merging this.
Just tested my branch again; it writes the correct row count now.
This means that the total_num_rows written to the file will accurately reflect the row groups that were successfully written.
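A minimal sketch of that behaviour, assuming a simplified writer that only tracks row counts. `SketchFileWriter` and its methods are hypothetical stand-ins for illustration, not the parquet crate's actual writer types or the code in this PR.

```rust
/// Hypothetical, simplified stand-in for a Parquet file writer.
/// It tracks row counts only and writes no data.
#[derive(Debug, Default)]
struct SketchFileWriter {
    /// Running row count; i64 to match the num_rows field in the metadata.
    total_num_rows: i64,
    /// Row counts of the row groups that were closed successfully.
    row_group_rows: Vec<i64>,
}

impl SketchFileWriter {
    /// Record a finished row group and add its rows to the file total.
    fn close_row_group(&mut self, num_rows: i64) -> Result<(), String> {
        if num_rows < 0 {
            // A failed close leaves the running total untouched.
            return Err(format!("invalid row count: {}", num_rows));
        }
        self.row_group_rows.push(num_rows);
        self.total_num_rows += num_rows;
        Ok(())
    }

    /// Finish the file and return the total that would go into the footer.
    fn close(self) -> i64 {
        self.total_num_rows
    }
}
```

Because the total is only updated inside `close_row_group`, a row group that fails to close never contributes to the count returned by `close`.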