Is parquet using delta encoding for positions? #1552
I'm having trouble confirming as parquet-tools is not building for me right now, but I wanted to check if we think that our
To get delta encoding, you need to set disableDictionaryEncoding to true. This effectively enables the Parquet 2 format, which supports delta encoding. There was a discussion recently on the Spark lists that the vectorized Spark SQL reader only supports Parquet 1, so you may see worse performance with delta encoding from Spark SQL. Truth be told, I looked at file sizes a while back and moving to delta encoding didn't seem to make a huge difference.
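For anyone wondering why delta encoding looks attractive for a sorted position column in the first place, here is a toy sketch in Python. It is only an illustration of the general idea, not Parquet's actual delta encoding, which as far as I know works on blocks of values with bit packing. The function names and the sample positions are made up for the example.

```python
def delta_encode(values):
    """Return the first value followed by successive differences."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by taking a running sum."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values


def bits_needed(n):
    """Bits required to store the magnitude of n (at least 1)."""
    return max(1, abs(n).bit_length())


# Hypothetical sorted alignment start positions on one contig.
positions = [1_000_000, 1_000_012, 1_000_040, 1_000_041, 1_000_100]
deltas = delta_encode(positions)

# Width needed per value if stored as-is vs. as deltas (after the first value).
plain_width = max(bits_needed(p) for p in positions)
delta_width = max(bits_needed(d) for d in deltas[1:])

print(deltas)
print(plain_width, delta_width)
```

On sorted data the differences are small, so each one fits in far fewer bits than the raw position. A general-purpose compressor applied on top of the plain values may already recover much of that redundancy, which would be consistent with the file sizes not changing much.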