Fix parquet data page offset calculation in parquet writer
Cherry-pick of trinodb/trino#10722

ParquetWriter::flush gets the data page offset, which is a long, by
calling OutputStreamSliceOutput::size(), which returns an int. As a
result, flushing fails with an integer overflow exception when writing
files larger than ~2 GB.
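
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual OutputStreamSliceOutput code: ByteCounter and advance are hypothetical names, and the use of Math.toIntExact for the int-typed size() is an assumption chosen to illustrate a checked narrowing conversion.

```java
// Hypothetical stand-in for an output that tracks its size as a long.
// ByteCounter, advance, and the method bodies are illustrative assumptions.
class OffsetOverflowSketch {
    static final class ByteCounter {
        private long bytesWritten;

        void advance(long bytes) {
            bytesWritten += bytes;
        }

        // Full-width size: safe for any file length.
        long longSize() {
            return bytesWritten;
        }

        // int-typed size: Math.toIntExact throws ArithmeticException
        // once the count exceeds Integer.MAX_VALUE.
        int size() {
            return Math.toIntExact(bytesWritten);
        }
    }

    public static void main(String[] args) {
        ByteCounter out = new ByteCounter();
        out.advance(3L * 1024 * 1024 * 1024); // pretend 3 GiB were already written

        long safeOffset = out.longSize();
        System.out.println("offset via longSize(): " + safeOffset);

        try {
            // Assigning to a long does not help: the int conversion runs first.
            long brokenOffset = out.size();
            System.out.println("offset via size(): " + brokenOffset);
        } catch (ArithmeticException e) {
            System.out.println("size() failed: " + e.getMessage());
        }
    }
}
```

Declaring the receiving variable as long, as the original line did, is not enough on its own; the value has to stay a long along the whole call path.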

Co-authored-by: Saulius Valatka <saulius.vl@gmail.com>
2 people authored and pettyjamesm committed Feb 18, 2022
1 parent 953f44d commit 4d8aad1
Showing 1 changed file with 1 addition and 1 deletion.
@@ -211,7 +211,7 @@ private void flush()
         List<BufferData> bufferDataList = builder.build();
 
         // update stats
-        long stripeStartOffset = outputStream.size();
+        long stripeStartOffset = outputStream.longSize();
         List<ColumnMetaData> metadatas = bufferDataList.stream()
                 .map(BufferData::getMetaData)
                 .collect(toImmutableList());