Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[batch] Add always_copy_output option #11884

Merged
merged 7 commits into from
Nov 15, 2022
Merged

Conversation

jigold
Copy link
Contributor

@jigold jigold commented Jun 2, 2022

CHANGE_LOG:

  • Added Job.always_copy_output when using the ServiceBackend. The default behavior is False which
    is a breaking change from the previous behavior to always copy output files irregardless of the job's
    completion state.

Thoughts on introducing this "breaking" change? I think it's okay and we will announce the change on Zulip. I think the benefits of getting rid of this not so intuitive behavior where we copy files despite the main container's completion state outweighs possibly breaking someone's pipeline that relied on files being there when the copy step could have failed as well.

@daniel-goldstein
Copy link
Contributor

daniel-goldstein commented Jun 2, 2022

I think the proposed new default and the option to change it is much more intuitive than the current behavior and worth a change. Though, I think it would be most polite to announce it on zulip/email list a week or two in advance of the release (which you may already planned to do).

@daniel-goldstein
Copy link
Contributor

daniel-goldstein commented Jun 22, 2022

Another thing I just thought of, I think we should add raise some form of error if a job that is always_run has inputs from a job that is not always_copy_output. I can't imagine something like that being intentional, but if there is a valid use case, maybe we have a warning instead of an error.

Copy link
Contributor

@daniel-goldstein daniel-goldstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs rebase fix

@jigold
Copy link
Contributor Author

jigold commented Jun 24, 2022

I'm not sure how I feel about the warning / error suggestion when always run jobs have inputs that aren't always copied out. Cleanup jobs shouldn't care whether the outputs don't get copied out on failure.

@jigold jigold force-pushed the copy-on-success branch from 5d8be5c to 5c23da2 Compare June 24, 2022 18:08
@daniel-goldstein
Copy link
Contributor

Do cleanup jobs take input files from the parent job? IIUC, if I have an always_run job that takes an input from a job that failed, and whose outputs were not written, the input container for the always_run job would fail.

@jigold
Copy link
Contributor Author

jigold commented Jun 29, 2022

Hopefully the new warning gets at the case you were thinking of as well as a more general problem beyond always_copy_output


if self._always_run:
warnings.warn('A job marked as always run has a resource file dependency on another job. If the dependent job fails, '
f'the always run job with the following command will not run:\n{command}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be the always run job with the following command may not succeed. It will run, right?

@jigold
Copy link
Contributor Author

jigold commented Jul 8, 2022

Let's leave the WIP tag on, but this should be all set.

danking
danking previously requested changes Oct 3, 2022
**Version 0.2.96**

- Added ``Job.always_copy_output`` when using the ``ServiceBackend``. The default behavior is `False` which
is a breaking change from the previous behavior to always copy output files irregardless of the job's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

irregardless => regardless

@jigold jigold dismissed stale reviews from daniel-goldstein and danking October 26, 2022 19:46

stale

@jigold
Copy link
Contributor Author

jigold commented Oct 27, 2022

We cannot take off WIP and merge this until November 1st.

danking
danking previously requested changes Nov 2, 2022
Copy link
Contributor

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we making the server side change? If we only change the client and also always send the setting then we preserve the behavior of old versions of the hailtop.batch library while also switching to a different behavior.

@jigold jigold dismissed danking’s stale review November 2, 2022 14:08

maybe answered?

@daniel-goldstein
Copy link
Contributor

Agreed I think this properly preserves behavior for old clients

@daniel-goldstein
Copy link
Contributor

@danking This is assigned to me but since you're the last person to leave changes_requested it seems appropriate that you re-review.

Copy link
Contributor

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jigold jigold removed the WIP label Nov 15, 2022
@danking danking merged commit ddb5f5d into hail-is:main Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants