-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert batch_size to config option #2771
Conversation
), | ||
ConfigDefinition::new_u32( | ||
OPT_BATCH_SIZE, | ||
"Default batch size while creating new batches, it's especially useful for \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this description can be improved but that seemed outside the scope of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the basic pattern looks good to me 👍
I had some questions about type (u32 vs u64) and seeing if we an avoid having to repeat option names so many times but otherwise 👍
Thank you @andygrove
/// Customize batch size | ||
pub fn with_batch_size(mut self, n: usize) -> Self { | ||
pub fn with_batch_size(self, n: usize) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach (with a named function with_batch_size
that calls into the generic implementation below). Seems like a good idea to me
👍
// batch size must be greater than zero | ||
assert!(n > 0); | ||
self.batch_size = n; | ||
self | ||
self.set_u32(OPT_BATCH_SIZE, n as u32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use u64
as batch size (why are we storing as a u32 and then casting back and forth to usize
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to u64
but it still requires casting back and forth to usize
.
map.insert( | ||
BATCH_SIZE.to_owned(), | ||
format!("{}", self.config_options.get_u32(OPT_BATCH_SIZE)), | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it is possible to avoid having to enumerate all options again in this function -- what about maybe converting all entries in self.config_options
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consolidate BATCH_SIZE
with OPT_BATCH_SIZE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should. Ballista currently relies on this method and the keys used here. Once all the properties are converted I will update Ballista to use the new config_option
method and delete this method,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we use these names internally as well so I ended up removing the duplicate option. This will require Ballista to look for the new name but that is a small change, I filed apache/datafusion-ballista#73 to track this.
map.insert( | ||
BATCH_SIZE.to_owned(), | ||
format!("{}", self.config_options.get_u32(OPT_BATCH_SIZE)), | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consolidate BATCH_SIZE
with OPT_BATCH_SIZE
?
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Codecov Report
@@ Coverage Diff @@
## master #2771 +/- ##
==========================================
+ Coverage 84.95% 85.15% +0.20%
==========================================
Files 272 272
Lines 48221 48096 -125
==========================================
- Hits 40964 40958 -6
+ Misses 7257 7138 -119
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great to me
pub fn to_props(&self) -> HashMap<String, String> { | ||
let mut map = HashMap::new(); | ||
map.insert(BATCH_SIZE.to_owned(), format!("{}", self.batch_size)); | ||
// copy configs from config_options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Which issue does this PR close?
Part of #2756
This just moves a single config but I wanted to get reviews on this approach before moving all of them.
Rationale for this change
Start moving existing configs to the new config mechanism
What changes are included in this PR?
batch_size
to new config mechanismAre there any user-facing changes?
SessionConfig::batch_size
is now a method rather than an attribute.