Contributor

@victoria-yining-huang victoria-yining-huang commented Jun 4, 2025

We want to support setting batch_size in two ways: from the application code, when a Batch class is instantiated (Batch(batch_size=n)), or from the deployment_config file.

The override logic is:

  • the value from the application code is the default value
  • the value from the config file takes precedence. This allows the value to differ between regions
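
The precedence rule above can be sketched as a small helper. This is a minimal illustration, not the code in this PR: `resolve_batch_size` is a hypothetical name, the flat `{"batch_size": ...}` config shape is assumed from the test parameters quoted later in the conversation, and a plain `int` stands in for the real batch size type.

```python
from typing import Any, Mapping


def resolve_batch_size(default: int, loaded_config: Mapping[str, Any]) -> int:
    """Return the config-file value if present, else the application default.

    Hypothetical helper: the config file wins, the value from the
    application code is the fallback.
    """
    return loaded_config.get("batch_size", default)


# A region with an override uses it; a region without one falls back.
print(resolve_batch_size(100, {"batch_size": 50}))  # 50
print(resolve_batch_size(100, {}))  # 100
```

Because the default always comes from the application code, a region whose config omits batch_size still gets a usable value.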

Collaborator

@fpacifici fpacifici left a comment

Some high-level comments.

@victoria-yining-huang victoria-yining-huang force-pushed the vic/add_batch_size_to_config branch from 615bb60 to e391e2e Compare June 9, 2025 16:36
@victoria-yining-huang victoria-yining-huang marked this pull request as ready for review June 9, 2025 18:54
Collaborator

@fpacifici fpacifici left a comment

Please see the inline comments.
I think this approach is more complex than it needs to be. If you delegate the override process to each step, you will not need an app_config: Any attribute, which has unspecified behavior for all the subclasses that do not allow overrides.
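
One way to read the delegation suggested here, as a rough sketch only: each step type decides what it accepts from the loaded config, so no shared untyped attribute is needed. The `Step` base class, its `name` field, and the `int` batch size are assumptions made for illustration; only `Batch`, `batch_size`, and `override_config` appear in the PR.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class Step:
    """Hypothetical base step: by default nothing can be overridden."""

    name: str

    def override_config(self, loaded_config: Mapping[str, Any]) -> None:
        # Steps that do not support overrides simply ignore the loaded config.
        pass


@dataclass
class Batch(Step):
    """Hypothetical batching step that knows which of its fields may change."""

    batch_size: int

    def override_config(self, loaded_config: Mapping[str, Any]) -> None:
        # Only this subclass knows that batch_size is overridable.
        if "batch_size" in loaded_config:
            self.batch_size = loaded_config["batch_size"]


steps = [Step(name="parse"), Batch(name="batch", batch_size=100)]
for step in steps:
    step.override_config({"batch_size": 50})

print(steps[1].batch_size)  # 50
```

With this shape, a step that has nothing to override needs no extra state, and the override behavior of each subclass is explicit in its own method.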

Collaborator

@fpacifici fpacifici left a comment

Please revert the nullability of batch_size and see the other inline comments.

    def __post_init__(self) -> None:
        self.ctx.register(self)

    def override_config(self, loaded_config: Mapping[str, Any] | None) -> Any:
Collaborator

I think this function should not return anything, so the annotation should be -> None.

Also, is there any difference for the function between being passed None and being passed an empty Mapping? Supporting both hints that such a difference exists.
If they mean the same thing, please disallow one by only allowing Mapping[str, Any].

Contributor Author

Done, loaded_config only accepts Mapping[str, Any] now.
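
A sketch of how a caller might keep None out of the signature, assuming "no override" is represented by an empty mapping. The `step_overrides` helper and the per-step layout of the deployment config are hypothetical; the PR does not show how the file is structured.

```python
from typing import Any, Mapping


def step_overrides(
    deployment_config: Mapping[str, Mapping[str, Any]], step_name: str
) -> Mapping[str, Any]:
    """Return the overrides for one step (hypothetical helper).

    A step that is missing from the config is represented by an empty
    mapping, never by None, so callers handle a single type.
    """
    return deployment_config.get(step_name, {})


# Assumed layout: one section of overrides per step name.
config = {"mybatch": {"batch_size": 50}}
print(step_overrides(config, "mybatch"))  # {'batch_size': 50}
print(step_overrides(config, "other"))  # {}
```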

# TODO: Use concept of custom triggers to close window
# by either size or time
-   batch_size: MeasurementUnit
+   batch_size: Optional[MeasurementUnit] = None
Collaborator

This is not supposed to be optional. There must always be a default; otherwise this introduces an implicit requirement that, in some cases, every region must provide an override for batch_size.
If you require a default, then we know there is a fallback value for any region that does not specify this field.

Contributor Author

default is mandatory now
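
To illustrate what a mandatory default means in practice, here is a minimal sketch with a simplified stand-in for the real class. The `name` field and the `int` type are assumptions; the actual field type in the PR is MeasurementUnit.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    """Simplified stand-in for the real step class."""

    name: str
    batch_size: int  # required: there is no "= None" fallback


# Omitting batch_size fails when the object is built, not later
# when windowing is accessed.
try:
    Batch(name="mybatch")  # type: ignore[call-arg]
except TypeError as exc:
    error = str(exc)
    print(error)

ok = Batch(name="mybatch", batch_size=100)
print(ok.batch_size)  # 100
```

Because the field can never be None, no later assertion is needed to guard against a missing value.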

Comment on lines 410 to 412
        assert (
            self.batch_size is not None
        ), f"{self.name} config must be set before windowing is accessed"
Collaborator

Not allowing None also allows you to remove this.

In general, only support nullability when you need it. Avoiding it removes a lot of corner cases that you would otherwise have to test and maintain.

Contributor Author

removed

    "loaded_batch_size, default_batch_size, expected",
    [
        pytest.param({"batch_size": 50}, 100, 50, id="Have both loaded and default values"),
        pytest.param({"batch_size": 50}, None, 50, id="Only has loaded config file value"),
Collaborator

This scenario is not supposed to be allowed.

Contributor Author

@victoria-yining-huang victoria-yining-huang Jun 12, 2025

Removed. In this scenario the code will now error out at runtime.

@victoria-yining-huang
Contributor Author

@fpacifici made the default value mandatory

@victoria-yining-huang victoria-yining-huang merged commit 119dada into main Jun 12, 2025
17 checks passed