Use ChunkedSliceOutput to chunk and buffer writes in parquet #18564

raunaqmorarka · 2023-08-07T07:57:42Z

Description

Avoids separate calls to output stream for writing page headers and page data.
Avoids holding on to extra memory for each buffered page due to over-allocation by compressors.
Adds memory usage accounting for buffered pages.
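
The first point can be illustrated with a minimal sketch that uses only the JDK. `CountingOutputStream` and `PageWriteSketch` are hypothetical names, not classes from this PR, and a single combined array stands in for ChunkedSliceOutput here. It only shows that buffering the page header and page data together reduces the number of calls reaching the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical destination stream that counts how many write calls reach it.
final class CountingOutputStream
        extends OutputStream
{
    private final ByteArrayOutputStream delegate = new ByteArrayOutputStream();
    private int writeCalls;

    @Override
    public void write(int b)
    {
        writeCalls++;
        delegate.write(b);
    }

    @Override
    public void write(byte[] bytes, int offset, int length)
    {
        writeCalls++;
        delegate.write(bytes, offset, length);
    }

    int writeCalls()
    {
        return writeCalls;
    }

    byte[] toByteArray()
    {
        return delegate.toByteArray();
    }
}

final class PageWriteSketch
{
    private PageWriteSketch() {}

    // Two calls to the output stream: one for the header, one for the data.
    static void writeSeparately(OutputStream out, byte[] header, byte[] data)
            throws IOException
    {
        out.write(header, 0, header.length);
        out.write(data, 0, data.length);
    }

    // Header and data are buffered together first, so the stream sees one call.
    static void writeBuffered(OutputStream out, byte[] header, byte[] data)
            throws IOException
    {
        byte[] combined = new byte[header.length + data.length];
        System.arraycopy(header, 0, combined, 0, header.length);
        System.arraycopy(data, 0, combined, header.length, data.length);
        out.write(combined, 0, combined.length);
    }
}
```

Both methods produce the same bytes; only the number of calls to the stream differs.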

Additional context and related issues

Should help with the memory usage accounting problems described in #18557

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive, Hudi, Iceberg, Delta
* Improve memory usage accounting of the Parquet writer. ({issue}`18564`)

lib/trino-parquet/src/main/java/io/trino/parquet/writer/PrimitiveColumnWriter.java

findepi · 2023-08-07T12:58:24Z

> Avoids separate calls to output stream for writing page headers and page data.

not sure i understand the goal here.
are we saving on http requests? i think the output stream is buffering internally?

can you expand a bit the commit rationale so that it's more obvious what's the objective?

raunaqmorarka · 2023-08-07T13:17:19Z

> > Avoids separate calls to output stream for writing page headers and page data.
>
> not sure i understand the goal here. are we saving on http requests? i think the output stream is buffering internally?
>
> can you expand a bit the commit rationale so that it's more obvious what's the objective?

Yes, there are varying levels of buffering in different output stream implementations. It's not clear to me that every implementation does that, or that we should rely on that always being the case. In any case, that's not the main benefit of this change.

The main objectives here are:

* Avoids holding on to extra memory for each buffered page due to over-allocation by compressors.
* Adds memory usage accounting for buffered pages.

Also, ORC writer uses ChunkedSliceOutput in OrcOutputBuffer#writeChunkToOutputStream in a similar way.
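
As a rough illustration of the over-allocation point, here is a sketch that uses the JDK's `Deflater` as a stand-in compressor. The class and method names and the worst-case size estimate are assumptions made for this example and are not taken from the Parquet writer. The compressor writes into a buffer sized for the worst case, and the extra copy trims it so that a buffered page retains only the bytes it actually uses.

```java
import java.util.Arrays;
import java.util.zip.Deflater;

final class TrimCompressedBufferSketch
{
    private TrimCompressedBufferSketch() {}

    // Compression result: the backing array may be much larger than the used prefix.
    static final class Compressed
    {
        final byte[] buffer;
        final int length;

        Compressed(byte[] buffer, int length)
        {
            this.buffer = buffer;
            this.length = length;
        }
    }

    static Compressed compressOverAllocated(byte[] input)
    {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            // Assumed worst-case estimate for this sketch; grown below if it is ever too small.
            byte[] buffer = new byte[input.length + (input.length / 10) + 64];
            int length = 0;
            while (!deflater.finished()) {
                if (length == buffer.length) {
                    buffer = Arrays.copyOf(buffer, buffer.length * 2);
                }
                length += deflater.deflate(buffer, length, buffer.length - length);
            }
            return new Compressed(buffer, length);
        }
        finally {
            deflater.end();
        }
    }

    // The extra byte array copy: keep only the bytes that were actually written.
    static byte[] trim(Compressed compressed)
    {
        return Arrays.copyOf(compressed.buffer, compressed.length);
    }
}
```

Holding on to `compressed.buffer` for every buffered page would retain the whole worst-case allocation; holding on to the result of `trim` retains only `compressed.length` bytes, at the cost of one copy.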

raunaqmorarka · 2023-08-07T14:41:29Z

Insert benchmarks.pdf

There is a big increase in the reported peak memory usage due to the improved accounting, and a small decrease in performance due to the extra byte array copy required to get rid of over-allocation in the compression buffer.

lukasz-stec

lgtm % comments

sopel39 · 2023-08-09T10:15:56Z

lib/trino-parquet/src/main/java/io/trino/parquet/writer/PrimitiveColumnWriter.java

@@ -297,6 +297,7 @@ public long getBufferedBytes()
    public long getRetainedBytes()


Is it called after a flush? (Is there something like a flush here?)

raunaqmorarka · 2023-08-09

There is no "flush" here; there is a "close" followed by a getBuffer to extract the buffered pages.
This is called by the connector page sink for every writer after writing each page.
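
The lifecycle described in this reply could look roughly like the sketch below. `BufferedPagesSketch`, its fields, and the `reset` step are hypothetical and only illustrate the idea; the actual PrimitiveColumnWriter may differ. Retained bytes grow as pages are buffered, so a caller polling `getRetainedBytes` after each page sees the buffered data accounted for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer that buffers exact-sized pages and tracks their memory.
final class BufferedPagesSketch
{
    private final List<byte[]> pages = new ArrayList<>();
    private long retainedBytes;
    private boolean closed;

    void addPage(byte[] exactSizedPage)
    {
        if (closed) {
            throw new IllegalStateException("writer is closed");
        }
        pages.add(exactSizedPage);
        // Account for every buffered page as soon as it is retained.
        retainedBytes += exactSizedPage.length;
    }

    // Can be polled by the caller after each page is written.
    long getRetainedBytes()
    {
        return retainedBytes;
    }

    void close()
    {
        closed = true;
    }

    // Buffered pages are extracted only after close.
    List<byte[]> getBuffer()
    {
        if (!closed) {
            throw new IllegalStateException("writer must be closed before extracting pages");
        }
        return new ArrayList<>(pages);
    }

    // Assumed step for this sketch: release the buffered pages for the next batch.
    void reset()
    {
        pages.clear();
        retainedBytes = 0;
        closed = false;
    }
}
```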

Avoid unnecessary usage of BytesInput in parquet writer

f57c2a6

Allows simplifying some of the code working with page sizes to use ints instead of longs

cla-bot bot added the cla-signed label Aug 7, 2023

raunaqmorarka requested review from sopel39, electrum, findepi and gaurav8297 August 7, 2023 07:57

raunaqmorarka mentioned this pull request Aug 7, 2023

Memory tracking issue: worker OOM in PrimitiveColumnWriter #18557

Closed

github-actions bot added the tests:hive label Aug 7, 2023

raunaqmorarka force-pushed the pqw-output branch from 846c8da to 0720356 Compare August 7, 2023 08:30

findepi requested a review from martint August 7, 2023 12:52

findepi reviewed Aug 7, 2023

lib/trino-parquet/src/main/java/io/trino/parquet/writer/PrimitiveColumnWriter.java

raunaqmorarka requested review from findepi and lukasz-stec August 7, 2023 16:58

lukasz-stec approved these changes Aug 8, 2023

raunaqmorarka force-pushed the pqw-output branch from 0720356 to 47d6c15 Compare August 9, 2023 09:07

raunaqmorarka requested a review from lukasz-stec August 9, 2023 09:10

sopel39 approved these changes Aug 9, 2023

sopel39 reviewed Aug 9, 2023

lib/trino-parquet/src/main/java/io/trino/parquet/writer/ParquetDataOutput.java

raunaqmorarka force-pushed the pqw-output branch 3 times, most recently from f1df74a to dc324d9 Compare August 9, 2023 20:03

raunaqmorarka added 2 commits August 10, 2023 10:15

Use ChunkedSliceOutput to chunk and buffer writes in parquet

64745a1

Avoids separate calls to output stream for writing page headers and page data.
Avoids holding on to extra memory for each buffered page due to over-allocation by compressors.

Add memory usage accounting for buffered pages in writer

f01d196

raunaqmorarka force-pushed the pqw-output branch from dc324d9 to f01d196 Compare August 10, 2023 04:45

raunaqmorarka added the performance label Aug 10, 2023

raunaqmorarka merged commit ddf82b3 into trinodb:master Aug 10, 2023
67 checks passed

raunaqmorarka deleted the pqw-output branch August 10, 2023 12:50

raunaqmorarka mentioned this pull request Aug 10, 2023

Release notes for 423 #18288

Closed

colebow added this to the 423 milestone Aug 10, 2023

colebow mentioned this pull request Aug 10, 2023

Add Trino 423 release notes #18496

Merged
