
[1.0.x] Core: Increase inferred column metrics limit to 100. #5933

Merged: 1 commit merged into apache:1.0.x on Oct 7, 2022

Conversation

@nastra (Contributor) commented Oct 7, 2022

No description provided.

@nastra nastra added this to the Iceberg 1.0.0 Release milestone Oct 7, 2022
@nastra nastra requested a review from rdblue October 7, 2022 07:55
@rdblue rdblue merged commit e2bb9ad into apache:1.0.x Oct 7, 2022
gaborkaszab pushed a commit to gaborkaszab/iceberg that referenced this pull request Oct 24, 2022
@haydenflinner

If I have a table with more than 100 columns, what are the downsides of being above this parameter's value? I don't see it documented here: https://iceberg.apache.org/docs/latest/configuration/

I only ask because I have a table that is basically a collection of events. Upstream, each event carries some metadata in a dict. Using a column per key in that metadata dict felt like it would compress better than storing a {"key1": 123} map in every row, since the key names are relatively static and the values would benefit from columnar compression. The majority of such columns are empty for any particular partition, which I assume is near-zero storage/runtime overhead. For example, file 1's rows will have the metadata dict {"abc": 1234} repeated through virtually the whole GB of data, while file 2 may instead have {"def": "foo"} in most rows.
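A minimal sketch of how this limit can be tuned, assuming the change corresponds to the table property write.metadata.metrics.max-inferred-column-defaults (whose default this PR raises to 100) and using the standard Iceberg Java Table.updateProperties() API. The loadTable() helper and the column names are hypothetical:

    // Sketch: adjusting Iceberg's inferred column metrics limit.
    // Assumes a Table handle is already loaded from a catalog.
    import org.apache.iceberg.Table;

    public class MetricsConfigSketch {
        public static void main(String[] args) {
            Table table = loadTable(); // hypothetical helper

            // Raise the cap on how many leading columns get default
            // min/max/null-count metrics in manifest files (assumed
            // property name; this PR bumps its default to 100).
            table.updateProperties()
                .set("write.metadata.metrics.max-inferred-column-defaults", "200")
                .commit();

            // Or keep the cap and override metrics collection per column;
            // modes such as "full", "counts", "truncate(16)", and "none"
            // exist. These column names are made up for illustration.
            table.updateProperties()
                .set("write.metadata.metrics.column.event_time", "full")
                .set("write.metadata.metrics.column.raw_payload", "none")
                .commit();
        }

        private static Table loadTable() {
            throw new UnsupportedOperationException("load from your catalog");
        }
    }

If the property behaves the way its name suggests, columns past the inferred limit simply get no default stats, so filters on them cannot benefit from file-level pruning; per-column overrides are one way to keep stats on the columns you actually filter by in a wide table.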

@nastra nastra deleted the 1.0.x-increase-column-metrics branch June 1, 2023 13:11