
Any way to Read parquet records before appending to file #120

Open
govthamreddy opened this issue Nov 20, 2020 · 2 comments

Comments

@govthamreddy

Is there any way to read the converted Parquet record before it is written into the Parquet file?
My requirement is to write a Parquet file directly to Azure Data Lake without storing it locally.

@dobesv
Contributor

dobesv commented Dec 28, 2020

Using the parquets package I've been able to stream out Parquet data without saving to a temporary file; it supports Node streams. It writes the data out one row group at a time, though, so you do need enough memory to hold a full row group. This is a limitation of the Parquet format.
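For anyone looking for a concrete starting point, here is a minimal sketch of that approach. It assumes the parquets fork keeps parquetjs's ParquetSchema / ParquetWriter.openStream API; the schema fields and the PassThrough-based plumbing are placeholders, and the actual Azure Data Lake upload call is not shown.

```typescript
// Minimal sketch (untested): stream Parquet output into a Node stream instead of a file.
// Assumes the `parquets` package exposes the same ParquetSchema / ParquetWriter.openStream
// API as parquetjs; field names and types below are illustrative only.
import { ParquetSchema, ParquetWriter } from 'parquets';
import { PassThrough } from 'stream';

const schema = new ParquetSchema({
  id:    { type: 'UTF8' },
  value: { type: 'DOUBLE' },
});

async function rowsToParquetStream(
  rows: Array<{ id: string; value: number }>
): Promise<PassThrough> {
  // Any Node writable works here; a PassThrough gives us a readable side we can
  // pipe onward (e.g. into a cloud upload) without touching the local disk.
  const out = new PassThrough();

  const writer = await ParquetWriter.openStream(schema, out);
  for (const row of rows) {
    // Rows are buffered in memory and flushed one row group at a time,
    // which is why a full row group must fit in memory.
    await writer.appendRow(row);
  }
  await writer.close(); // flushes the final row group and the Parquet footer

  return out; // pipe this to the destination stream
}
```

The readable side of the PassThrough can then be handed to whatever streaming upload API the destination offers (for Azure Blob Storage, something like BlockBlobClient.uploadStream), so no temporary file is ever written to disk.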

@vishald2509

> Using the parquets package I've been able to stream out Parquet data without saving to a temporary file; it supports Node streams. It writes the data out one row group at a time, though, so you do need enough memory to hold a full row group. This is a limitation of the Parquet format.

Can you please provide more insight from a code perspective? It would be very helpful.
