-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Standard Ensembler with Route Name Path to SDK #248
Add Standard Ensembler with Route Name Path to SDK #248
Conversation
7647b93
to
b943038
Compare
@@ -27,7 +27,6 @@ on: | |||
- "docs/**" | |||
- "engines/pyfunc-ensembler-job/**" | |||
- "engines/pyfunc-ensembler-service/**" | |||
- "sdk/**" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the addition of SDK e2e tests, I believe it's probably useful to make the main CI pipeline run (e2e tests) whenever changes to the SDK are made.
experiment_mappings: List[Dict[str, str]] | ||
experiment_mappings: List[ | ||
turing.generated.models.EnsemblerStandardConfigExperimentMappings | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced the type hint because it was always of the wrong type 👀
""" | ||
self.experiment_mappings = experiment_mappings | ||
self.route_name_path = route_name_path | ||
self.experiment_mappings = experiment_mappings if experiment_mappings else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since route_name_path
and experiment_mappings
are optional fields, we'll need to handle such edge cases. when both are empty, we'll be letting the API return an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just configure the default []
value in the parameters?
experiment_mappings: List[Dict[str, str]] = [],
If None
is configured, typically the typing used should be Optional
, rather than a List
in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If None is configured, typically the typing used should be Optional, rather than a List in this case
Agree with this - Optional is more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just configure the default [] value in the parameters?
I believe there's an issue with using mutable default arguments in Python, which can potentially lead to unexpected behaviour - this causes warnings to pop up on certain IDEs:
https://medium.com/geekculture/do-not-use-objects-as-default-arguments-in-python-1c940212db2e
Thanks for the reminder on Optional
, I'll add that in to the type hint.
a425c09
to
1683035
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @deadlycoconuts ! 🚀 Left some comments to be addressed.
): | ||
""" | ||
Method to create a new standard ensembler | ||
|
||
:param experiment_mappings: configured mappings between routes and treatments | ||
:param route_name_path: configured routh name path that points to the route name within a given treatment config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo routh
-> route
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow thanks for catching that!
): | ||
""" | ||
Method to create a new standard ensembler | ||
|
||
:param experiment_mappings: configured mappings between routes and treatments | ||
:param route_name_path: configured routh name path that points to the route name within a given treatment config | ||
:param experiment_mappings: configured mappings between routes and treatments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: formatting is misaligned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting that! Strange black didn't fix that 🤔
""" | ||
self.experiment_mappings = experiment_mappings | ||
self.route_name_path = route_name_path | ||
self.experiment_mappings = experiment_mappings if experiment_mappings else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just configure the default []
value in the parameters?
experiment_mappings: List[Dict[str, str]] = [],
If None
is configured, typically the typing used should be Optional
, rather than a List
in this case.
@@ -545,6 +582,9 @@ def to_open_api(self) -> OpenApiModel: | |||
) | |||
for experiment_mapping in self.experiment_mappings | |||
], | |||
route_name_path=self.route_name_path | |||
if self.route_name_path is not None | |||
else "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe an empty string "" is required because the API validator requires this field to be set right? Also, in what cases will this value be None
, asking this because the default value should already be "" based on how you initialise this class, so I think this is not required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same question about why we need to manipulate the value here - can we just use None
for experiment_mappings and route_name_path if they are not set? The API doesn't require them to be sent in, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe an empty string "" is required because the API validator requires this field to be set right?
can we just use None for experiment_mappings and route_name_path if they are not set? The API doesn't require them to be sent in, right?
Yeah the OpenAPI classes needs some value to be set and we can't just pass a None
value directly. One way to go around this is to selectively pass non-None
values into the kwargs
dict:
def to_open_api(self) -> OpenApiModel:
kwargs = {}
if self.experiment_mappings is not None:
kwargs["experiment_mappings"] = self.experiment_mappings
if self.route_name_path is not None:
kwargs["route_name_path"] = self.route_name_path
return turing.generated.models.EnsemblerStandardConfig(
**kwargs,
)
I've refactored the StandardRouterEnsemblerConfig
class a little so we aren't manipulating the values of route_name_path
in the to_open_api
method but I've also changed the default values of experiment_mappings
and route_name_path
members to be None
.
Also, in what cases will this (
route_name_path
) value be None[?]
In general, one of route_name_path
or experiment_mappings
will be None
but not the other. This corresponds to the scenario when the user only wants the standard ensembler to be explicitly by the experiment mappings OR the route name path.
Note also that the SDK doesn't perform any validations to ensure one and only of these two fields are set, but the API will perform this and return an error if this property is violated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the OpenAPI classes needs some value to be set and we can't just pass a None value directly.
I've managed to change this by setting the nullable
field as True
within the OpenAPI specs for both experiment_mappings
and route_name_path
.
@@ -0,0 +1,123 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test necessary? I'm just thinking we don't include any E2E tests for all other ensembler types too. And, by this logic, shouldn't we also add an experiments_mapping ensembler test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good question. Probably doesn't make sense to have a test for the "route name path" specifically. But we don't have any other e2e test for the standard ensembler at the moment. So I think it's useful to add this. Likewise, we have a test case internally using experiment_mappings, with the proprietary experiment engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, by this logic, shouldn't we also add an experiments_mapping ensembler test?
But we don't have any other e2e test for the standard ensembler at the moment.
We do!! There's an e2e test that uses the standard ensembler with the proprietary experiment engine (it's found in the blandly named 01_create_router_test
file). We can perhaps rename that 01
test with a more explicit name to make it appear more explicitly as a test using the standard ensembler with experiment mappings if that helps 🤔
Is this test necessary?
Well, it is an SDK mirror of the 09_deploy_router_with_route_name_path_std_ensembler_test
that we introduced for the API and so far we've decided to maintain both sets of e2e tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do!! There's an e2e test that uses the standard ensembler with the proprietary experiment engine
It's using the docker ensembler, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, oops I got ensemblers confused with the proprietary experiment engine 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it wouldn't take too much effort to set up an e2e test with the proprietary experiment engine and a standard ensembler 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the e2e tests are not meant to test every combination of values (that's expected to be done by the unit/integration tests). So, IMO renaming this test to 09_deploy_router_with_std_ensembler_test
, to have it represent std ensembler in general, may suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I've just renamed the test :)
""" | ||
self.experiment_mappings = experiment_mappings | ||
self.route_name_path = route_name_path | ||
self.experiment_mappings = experiment_mappings if experiment_mappings else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If None is configured, typically the typing used should be Optional, rather than a List in this case
Agree with this - Optional is more appropriate.
@@ -545,6 +582,9 @@ def to_open_api(self) -> OpenApiModel: | |||
) | |||
for experiment_mapping in self.experiment_mappings | |||
], | |||
route_name_path=self.route_name_path | |||
if self.route_name_path is not None | |||
else "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same question about why we need to manipulate the value here - can we just use None
for experiment_mappings and route_name_path if they are not set? The API doesn't require them to be sent in, right?
self.experiment_mappings = experiment_mappings | ||
self.route_name_path = route_name_path | ||
self.experiment_mappings = ( | ||
experiment_mappings # sif experiment_mappings else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just self.experiment_mappings = experiment_mappings
? And, I think the comment is self-explanatory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahah thanks for catching the comment that's not meant to be a comment 😅
experiment_mappings: | ||
items: | ||
$ref: '#/components/schemas/EnsemblerStandardConfig_experiment_mappings' | ||
nullable: true | ||
type: array | ||
route_name_path: | ||
nullable: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two nullable
conditions have to be added to force the autogenerated openapi classes to accept None
values for both experiment_mappings
and route_name_path
.
85a8ced
to
2e6e523
Compare
2e6e523
to
b0df608
Compare
Thanks everyone for the comments! I'll be merging this now! :) |
Context
With the changes introduced in #237 and #233 to support the usage of Standard Ensemblers with Custom Experiment Engines, the Turing API has been updated to handle different request/response payloads, most notably the addition of the
route_name_path
field in the standard ensembler configuration, thus necessitating a change to the Turing SDK to also handle these changes.This PR introduces minor changes to the
EnsemblerStandardConfig
andStandardRouterEnsemblerConfig
to support these changes as well as a new e2e test to mirror that of the newest API test introduced in #237.Main Modifications
.github/workflows/turing.yaml
- Removal of thepaths-ignore
value representing all sdk files in the reposdk/e2e/09_deploy_router_with_route_name_path_std_ensembler_test.py
- Addition of an e2e test that corresponds to the latest API e2e test added in Apply new Standard Ensembler routing policy with Custom Experiment Engines #237sdk/turing/router/config/router_ensembler_config.py
- Addition of theroute_name_path
field toEnsemblerStandardConfig
andStandardRouterEnsemblerConfig