Feat: Replace Parquet File Writer with Gzipped Jsonl File Writer #60
I tested this successfully with source-shopify and source-klaviyo.
Very clean! Looks good to me.
@bindipankhudi - Just FYI, there is apparently a bug in DuckDB where loading a blank JSON file hangs indefinitely. Because of this, I had to refactor the no-data handling slightly: when there is no data we no longer create a batch, but we still loop through all the streams and finalize each of them. During the finalize step, if a stream has no batches, we exit after making sure the table exists.
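To make the no-data path concrete, here is a minimal sketch of the control flow described above. It is not the code from this PR: `finalize_streams`, `batches_by_stream`, and the `tables` dict (a stand-in for the real destination tables) are hypothetical names chosen for illustration.

```python
from pathlib import Path


def finalize_streams(
    batches_by_stream: dict[str, list[Path]],
    tables: dict[str, list[Path]],
) -> None:
    """Finalize every stream, including streams that received no records.

    `tables` is a hypothetical stand-in for the destination: it maps a table
    name to the batch files that have been loaded into it.
    """
    for stream_name, batch_files in batches_by_stream.items():
        # Always make sure the destination table exists.
        tables.setdefault(stream_name, [])

        if not batch_files:
            # No batches: exit early so that no empty file is ever loaded.
            continue

        # Assumption: "loading" is modeled as recording the batch files.
        tables[stream_name].extend(batch_files)
```

With this shape, a stream with zero records still ends up with a table, and the loader never has to read a blank file.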
Resolves: #50
Add Jsonl file writer.
This file writer better supports variable schemas that were breaking the Parquet writer.
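As a rough illustration of why JSONL tolerates variable schemas, the sketch below writes records with differing keys to a gzipped JSONL file using only the Python standard library. The function names `write_gzipped_jsonl` and `read_gzipped_jsonl` are hypothetical and are not the writer class added in this PR.

```python
import gzip
import json
from pathlib import Path
from typing import Any, Iterable


def write_gzipped_jsonl(records: Iterable[dict[str, Any]], path: Path) -> int:
    """Write one JSON object per line to a gzip-compressed file.

    Each line is serialized independently, so records do not need to share
    the same set of keys. Returns the number of records written.
    """
    count = 0
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
            count += 1
    return count


def read_gzipped_jsonl(path: Path) -> list[dict[str, Any]]:
    """Read the records back, skipping any blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because no file-level schema is fixed up front, a record that adds or omits a field simply becomes a line with different keys.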