Is parquet using delta encoding for positions? #1552
I'm having trouble confirming as parquet-tools is not building for me right now, but I wanted to check if we think that our
To get delta encoding, you need to set disableDictionaryEncoding to true. This effectively enables the Parquet 2 format, which supports delta encoding. There was a discussion recently on the Spark lists that the vectorized Spark SQL reader only supports Parquet 1, so you may see worse read performance from Spark SQL on files written with delta encoding. Truth be told, I compared file sizes a while back, and moving to delta encoding didn't seem to make a huge difference.
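For anyone wondering why delta encoding is attractive for positions at all, here is a rough sketch of the idea in plain Python. It is not Parquet's actual implementation (the real delta encoding works on blocks and bit-packs the deltas relative to a per-block minimum), and the function names and sample positions are made up for illustration. It just assumes positions are sorted, so successive differences are small compared with the absolute values.

```python
def delta_encode(positions):
    """Return the first value followed by successive differences."""
    if not positions:
        return []
    deltas = [positions[0]]
    for prev, cur in zip(positions, positions[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by taking a running sum of the deltas."""
    positions = []
    total = 0
    for d in deltas:
        total += d
        positions.append(total)
    return positions


def bits_needed(values):
    """Bits per value if every value is stored at the width of the largest one."""
    return max(v.bit_length() for v in values) if values else 0


# Hypothetical sorted alignment start positions (illustrative only).
positions = [1_000_000, 1_000_050, 1_000_120, 1_000_121, 1_000_400]
deltas = delta_encode(positions)

print(deltas)
print("bits per raw position:", bits_needed(positions))
print("bits per delta (after the first value):", bits_needed(deltas[1:]))
```

On sorted data the deltas fit in far fewer bits than the raw positions. On unsorted data the deltas can be large or negative, and dictionary or general-purpose compression may already capture most of the redundancy, which could be one reason the file sizes did not change much.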