Reduce colreaders #8248
# Conflicts: # extension/parquet/parquet_reader.cpp
- remove debug test
Thanks for the PR! Looks good to me. One minor comment:
@@ -399,7 +399,6 @@ void ParquetReader::InitializeSchema() {
    if (file_meta_data->schema.size() < 2) {
        throw FormatException("Need at least one non-root column in the file");
    }
    auto root_reader = CreateReader();
Can this not stay in InitializeSchema? It seems like we are initializing the root_reader only right before calling InitializeSchema anyway.
That's right, moved it to InitializeSchema(). Thanks!
Thanks!
The ReadStatistics call was creating a root_reader for every column; now we do this only once, at initialization.
This reduces the number of root_reader creations by the number of columns (PropagateStatistics does this for all columns).
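To illustrate the effect described above, here is a minimal, self-contained sketch of the pattern. It is not the actual DuckDB code: ToyRootReader, ToyParquetReader, ReadStatisticsPerCall and CountConstructions are hypothetical names invented for this example, and the construction counter exists only to make the difference visible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a root column reader that is costly to build.
struct ToyRootReader {
    static int constructions; // counts how many readers were built

    explicit ToyRootReader(std::size_t count) : column_count(count) {
        ++constructions;
    }

    // Pretend to look up the statistics of a single column.
    std::string Stats(std::size_t column_index) const {
        assert(column_index < column_count);
        return "stats for column " + std::to_string(column_index);
    }

    std::size_t column_count;
};
int ToyRootReader::constructions = 0;

// "Before" pattern: every statistics lookup builds its own root reader.
std::string ReadStatisticsPerCall(std::size_t column_count, std::size_t column_index) {
    ToyRootReader root_reader(column_count);
    return root_reader.Stats(column_index);
}

// "After" pattern: the root reader is built once at initialization and reused.
class ToyParquetReader {
public:
    explicit ToyParquetReader(std::size_t column_count)
        : root_reader(new ToyRootReader(column_count)) {
    }

    std::string ReadStatistics(std::size_t column_index) const {
        return root_reader->Stats(column_index);
    }

private:
    std::unique_ptr<ToyRootReader> root_reader;
};

// Reads statistics for every column and returns how many root readers were built.
int CountConstructions(bool reuse, std::size_t column_count) {
    ToyRootReader::constructions = 0;
    if (reuse) {
        ToyParquetReader reader(column_count);
        for (std::size_t i = 0; i < column_count; i++) {
            reader.ReadStatistics(i);
        }
    } else {
        for (std::size_t i = 0; i < column_count; i++) {
            ReadStatisticsPerCall(column_count, i);
        }
    }
    return ToyRootReader::constructions;
}
```

In this toy model, CountConstructions(false, 8) returns 8 and CountConstructions(true, 8) returns 1. The exact counts in the real reader may differ; the sketch only shows why moving the construction out of the per-column path removes one creation per column.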