
Parquet Writer Rework: Support complex types #2832

Merged (30 commits) on Dec 23, 2021

Conversation

@Mytherin (Collaborator) commented Dec 22, 2021

This PR refactors the Parquet writer into a nested structure similar to that of the Parquet reader. This adds support for complex types (arbitrarily nested lists and structs) and fixes some issues found in the Parquet reader along the way. The following now works:

CREATE TABLE list_of_lists AS SELECT * FROM (VALUES
	([[1, 2, 3], [4, 5], [], [6, 7]]),
	([[8, NULL, 10], NULL, []]),
	([]),
	(NULL),
	([[11, 12, 13, 14], [], NULL, [], [], [15], [NULL, NULL, NULL]])
) tbl(i);

COPY list_of_lists TO 'complex_list.parquet' (FORMAT 'parquet');


SELECT * FROM parquet_scan('complex_list.parquet');
-- [[1, 2, 3], [4, 5], [], [6, 7]]
-- [[8, NULL, 10], NULL, []]
-- []
-- NULL
-- [[11, 12, 13, 14], [], NULL, [], [], [15], [NULL, NULL, NULL]]

Fixes #2557 and #2815, supersedes #2821

Refactor

The Parquet writer now creates recursive writer objects (ColumnWriter), mirroring the recursive reader objects. Writers can have child writers, and there are two special-case writers for complex types (StructColumnWriter and ListColumnWriter).
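A minimal Python sketch of the recursive writer hierarchy described above. The class names mirror the PR (ColumnWriter, StructColumnWriter), but everything else — the method signatures, the flat list output — is illustrative, not DuckDB's actual C++ implementation:

```python
class ColumnWriter:
    """Base writer: prepare() is the first pass, write() the second."""
    def prepare(self, values):
        raise NotImplementedError
    def write(self, values, out):
        raise NotImplementedError

class PrimitiveColumnWriter(ColumnWriter):
    """Leaf writer for a plain column (hypothetical name)."""
    def prepare(self, values):
        return len(values)          # stand-in for page-size accounting
    def write(self, values, out):
        out.extend(v for v in values if v is not None)

class StructColumnWriter(ColumnWriter):
    """Writer with one child writer per struct field."""
    def __init__(self, children):
        self.children = children
    def prepare(self, values):
        # fan each field's values out to the matching child writer;
        # a NULL struct propagates NULL to every child
        return sum(child.prepare([None if row is None else row[i]
                                  for row in values])
                   for i, child in enumerate(self.children))
    def write(self, values, out):
        for i, child in enumerate(self.children):
            child.write([None if row is None else row[i] for row in values], out)
```

A ListColumnWriter would follow the same pattern, with a single child writer for the list's element type.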

Writing a row group to the file happens in two passes. The first pass over the data (Prepare) (a) recursively sets up the definition and repetition levels, and (b) figures out how many pages to write (for regular columns), so we don't exceed the 2^31-byte uncompressed page size limit.
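To illustrate what the first pass computes, here is a sketch of standard Parquet (Dremel-style) definition/repetition levels for a single nesting level — a nullable list of nullable ints, so the maximum definition level is 3 (list present, element present, value non-NULL) and the maximum repetition level is 1. This is a toy model of the scheme, not the PR's code:

```python
def levels(rows):
    """Compute (definition, repetition) level pairs for a column of
    type: optional list of optional int (max def = 3, max rep = 1)."""
    out = []
    for row in rows:
        if row is None:
            out.append((0, 0))              # the list itself is NULL
        elif len(row) == 0:
            out.append((1, 0))              # list present but empty
        else:
            for i, v in enumerate(row):
                rep = 0 if i == 0 else 1    # 0 starts a new record
                out.append((3 if v is not None else 2, rep))
    return out
```

Deeper nesting (as in the list-of-lists example above) extends the same scheme with higher maximum levels, one per nesting step.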

The second pass over the data (BeginWrite, Write, FinalizeWrite) performs the actual write into a temporary buffer, after which the data is compressed and written to the file. The write into the temporary buffer is necessary even when compression is disabled, because the page header must record the exact uncompressed size before any page data is written.
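The buffer-then-header ordering can be sketched as follows. This is a toy stand-in (struct/zlib in place of Parquet's Thrift page header and codecs), showing only why the buffered write must precede the header:

```python
import struct
import zlib

def write_page(out, values, compress=True):
    """Encode a page body into a temporary buffer first, so the exact
    uncompressed size is known before the header is emitted."""
    body = b"".join(struct.pack("<i", v) for v in values)   # temporary buffer
    uncompressed_size = len(body)       # only known after the buffered write
    payload = zlib.compress(body) if compress else body
    # stand-in page header: (uncompressed_size, compressed_size)
    out.extend(struct.pack("<ii", uncompressed_size, len(payload)))
    out.extend(payload)
```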

Row Group Size

This PR also adds the ROW_GROUP_SIZE option to the Parquet writer, e.g.:

COPY tbl TO 'file.parquet' (FORMAT 'PARQUET', ROW_GROUP_SIZE 25000);

The default row group size is 100,000 rows.
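The resulting row-group boundaries can be sketched as simple chunking (an illustrative helper, not DuckDB code):

```python
def row_group_offsets(total_rows, row_group_size=100_000):
    """Split total_rows into [start, end) row-group ranges."""
    return [(start, min(start + row_group_size, total_rows))
            for start in range(0, total_rows, row_group_size)]
```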

@darthf1 commented Dec 22, 2021

Awesome! I guess this also fixes #2640

@Mytherin Mytherin linked an issue Dec 22, 2021 that may be closed by this pull request
@Mytherin Mytherin merged commit c9573d2 into duckdb:master Dec 23, 2021
@Mytherin Mytherin deleted the complexwriter branch January 17, 2022 21:17