Line dependent buffer #14512
… than the default, otherwise is too small.
Thanks for the PR! LGTM - only the serialization is broken by this PR. Perhaps we can delete the old setting and add in a new one instead of changing the type?
Thanks! One more minor comment
Thanks!
Line dependent buffer (duckdb/duckdb#14512) Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
In this PR, we modify the buffer size to be dependent on the `max_line_size` option.

This ensures that our buffers are always appropriately sized when `max_line_size` is set correctly, allowing us to read CSV files with very large lines.

Summary:

By default:
- `max_line_size`: 2097152 bytes (~2MB)
- `buffer_size`: 16 * `max_line_size` (~32MB)
- `per_thread_bytes`: 4 * `max_line_size` (~8MB)

The `buffer_size` and `per_thread_bytes` will only be adjusted if `max_line_size` is set to a value larger than the default.
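
The sizing rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the summary, not the actual DuckDB implementation: the function name `derive_sizes` and the constant names are hypothetical, and the use of `max()` reflects one reading of "only adjusted if `max_line_size` is larger than the default".

```python
# Hypothetical sketch of the sizing rule described in this PR.
# Names are illustrative; they are not taken from the DuckDB source.

DEFAULT_MAX_LINE_SIZE = 2_097_152  # bytes (~2MB), as stated in the summary
BUFFER_MULTIPLIER = 16             # buffer_size = 16 * max_line_size
PER_THREAD_MULTIPLIER = 4          # per_thread_bytes = 4 * max_line_size


def derive_sizes(max_line_size: int = DEFAULT_MAX_LINE_SIZE) -> dict:
    """Return buffer_size and per_thread_bytes for a given max_line_size.

    Assumption: values at or below the default leave both sizes at
    their defaults; only larger values scale them up.
    """
    effective = max(max_line_size, DEFAULT_MAX_LINE_SIZE)
    return {
        "buffer_size": BUFFER_MULTIPLIER * effective,
        "per_thread_bytes": PER_THREAD_MULTIPLIER * effective,
    }


if __name__ == "__main__":
    print(derive_sizes())            # defaults
    print(derive_sizes(10_000_000))  # a line limit larger than the default
```

Under this reading, a `max_line_size` of 10000000 would give a `buffer_size` of 160000000 bytes and a `per_thread_bytes` of 40000000 bytes, while any value at or below the default keeps the ~32MB and ~8MB sizes.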