-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[3/n] Allows user to disable snapshots from SDK #1400
[3/n] Allows user to disable snapshots from SDK #1400
Conversation
…to eng-3000-allow-users-to-opt-out-of-data-snapshots-3
return count < 10 | ||
|
||
|
||
def deploy(client, resource_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vsreekanti would like your input to confirm this call chain should result in:
- reviews, op_artf and check_disabled_artf to be deleted
- outputs of
metrics
andcheck
are persisted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I feel allowing disabling check artf is very confusing. I'm wondering if we should just return an error if the user attempts to do so. Maybe the same applies to params.
Disabling metric artf makes sense to me though, since esp. table metrics can have sensitive content as they are just a grammar sugar over regular operators.
sdk/aqueduct/client.py
Outdated
disable_snapshots: | ||
If set 'True', disables snapshots for all artifacts not generating from params / metrics / checks | ||
by calling `.disable_snapshot()` on each of these artifacts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence seems fragmented and I am a bit confused about what it means. Pasting it to GPT gave me the following:
disable_snapshots:
If set to 'True', this option disables snapshots for all artifacts that are not generated from parameters, metrics, or checks. It achieves this by calling the `.disable_snapshot()` method on each of these artifacts.
Is this what you're trying to say? @likawind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thank you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you paste a loom demo of the manual_qa_tests?
def enable_snapshot(self) -> None: | ||
self._dag.update_artifact_should_persist(self._artifact_id, True) | ||
|
||
def disable_snapshot(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the set_
prefix for any write methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been discussed in https://aqueducthq.slack.com/archives/C01U6RREC7M/p1685489357239879 and given it's this close to release, I'd stick with the decision than making more back-and-forth to the channel. We've been using other verbs like bound
max
over artifacts so I don't see it as a huge misalignment as long as they start with verbs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I didn't realize this was user-facing actually. That's good for now yea
sdk/aqueduct/models/dag.py
Outdated
def update_artifact_should_persist(self, artifact_id: uuid.UUID, should_persist: bool) -> None: | ||
self.must_get_artifact(artifact_id).should_persist = should_persist | ||
|
||
def disable_snapshots_from_non_params_metrics_checks(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally find this name a little confusing. I'd prefer if we kept the method more generic disable_snapshots()
and added a nice docstring here indicating that parameters, metric and check artifacts are excluded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about disable_snapshots_not_from_params_metrics_checks
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also prefer my methods to be meaningful, but the description needs to stop somewhere. In this case, it feels like we're leaking implementation details into the method name. Specifically, I'd rather not be listing things in my names (params, metrics, checks), since it's not scalable or very readable imo (also not consistent with rest of code). Nbd if you want to keep it, it just jumped out at me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more worried that disable_snapshot
is misleading because we don't actually disable all snapshots. I do agree that this name is too long and what you mentioned about listing stuff makes sense. Welcome to advices as long as we somehow indicate only some snapshots get disabled here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about things like disable_user_snapshots
but this is also inaccurate..
disable_user_data_snapshots
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to disable_sensitive_snapshots
. sensitive
may confuse reader but that should lead them to the comment where we clarify things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mm does disable_op_snapshots
or disable_op_output_snapshots
work? No strong opinions as long as we document it in the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah sounds good, I guess I just don't want it look too harmless so that caller reader would try to look at the docs, rather than skip over it with some wrong assumptions
@@ -14,6 +14,7 @@ class ArtifactMetadata(BaseModel): | |||
# If true, this artifact name is expected to be unique in the DAG. | |||
explicitly_named: bool = False | |||
from_local_data: bool = False | |||
should_persist: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment here about what should_persist
means.
f557240
into
eng-3000-allow-users-to-opt-out-of-data-snapshots-2
Describe your changes and why you are making these changes
This PR adds SDK changes to disable snapshots on both artf and wf level. The API is implemented based on https://aqueducthq.slack.com/archives/C01U6RREC7M/p1685489357239879 . And the behavior looks like the following:
publish_flow
, this is equivalent to calldisable_snapshots
for all artifacts from non param / metrics / checks operators. In other words, if the user explicitly disable snapshots for those excluded artifacts, the option will be respected.Related issue number (if any)
ENG-3000
Tests
Verified the manual QA as a part of #1401 .
Checklist before requesting a review
python3 scripts/run_linters.py -h
for usage).run_integration_test
: Runs integration testsskip_integration_test
: Skips integration tests (Should be used when changes are ONLY documentation/UI)