Hey everyone, I have a question about DuckDB's behaviour when writing Parquet files that use dictionary encoding (plain dictionary) with the RLE / bit-packing hybrid encoding for the dictionary indices.
Consider the following example, tested with DuckDB version 1.5.2:
DuckDB writes a Parquet file containing a single bit-packed run of 256 elements, even though a run of 16 elements would have sufficed. Is this intentional?
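To make the size difference concrete, here is a minimal sketch of how a single bit-packed run is sized in the RLE / bit-packing hybrid encoding, as I understand the spec: the run header is a ULEB128 varint of `(number of 8-value groups << 1) | 1`, followed by the packed values. This is not the example from above and it does not call DuckDB; the helper names and the bit width of 4 are assumptions chosen purely for illustration.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def bit_packed_run_size(num_values: int, bit_width: int) -> tuple:
    """Return (header bytes, payload byte count) for one bit-packed run.

    Hypothetical helper: bit-packed runs are counted in groups of 8 values,
    and each group occupies exactly `bit_width` bytes.
    """
    num_groups = (num_values + 7) // 8            # round up to whole groups
    header = encode_uleb128((num_groups << 1) | 1)  # low bit 1 marks bit-packed
    payload_bytes = num_groups * bit_width
    return header, payload_bytes


BIT_WIDTH = 4  # assumed; the real width depends on the dictionary size

for n in (16, 256):
    header, payload = bit_packed_run_size(n, BIT_WIDTH)
    print(f"{n} values: header=0x{header.hex()}, payload={payload} bytes")
```

Under these assumptions, a run of 16 values has header `0x05` and an 8-byte payload, while a run of 256 values has header `0x41` and a 128-byte payload, so the longer run costs 120 extra bytes per page at this bit width.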
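I haven't found anything in the Parquet spec that says whether this is allowed, but the following quote could be read as implying that longer-than-necessary bit-packed runs are not intended:
> For data pages, the 3 pieces of information are encoded back to back, after the page header. No padding is allowed in the data page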
Source