HTCONDOR-2344 Better skip_if_dataflow documentation #2350

Closed

@GregThain GregThain commented Mar 28, 2024

<Insert PR description here, and leave checklist below for code review.>

HTCondor Pull Request Checklist for internal reviewers

  • Verify that (GitHub thinks) the merge is clean. If it isn't, and you're confident you can resolve the conflicts, do so. Otherwise, send it back to the original developer.
  • Verify that the related Jira ticket exists and has a target version number and that it is correct.
  • Verify that the Jira ticket is in review status and is assigned to the reviewer.
  • Verify that the Jira ticket (HTCONDOR-xxx) is mentioned at the beginning of the title. Edit the title if it is not.
  • Verify that the branch destination of the PR matches the target version of the ticket
  • Check for correctness of change
  • Check for regression test(s) of new features and bugfixes (if the feature doesn't require root)
  • Check for documentation, if needed (documentation build logs)
  • Check for version history, if needed
  • Check BaTLab dashboard for successful build (https://batlab.chtc.wisc.edu/results/workspace.php) and test for either the PR or a workspace build by the developer that has the Jira ticket as a comment.
  • Check that each commit message references the Jira ticket (HTCONDOR-xxx)

After the above

  • Hit the merge button if the pull request is approved and it is not a security patch (security changes require 2 additional reviews)
  • If the pull request is approved, take the ticket out of review state
  • Assign the Jira ticket back to the developer

@Todd-L-Miller Todd-L-Miller left a comment

Adding the complete example is good, as is mentioning the conditions for a dataflow job to be skipped early.

I like the mention of DAGs at the end.

Dataflow Jobs
'''''''''''''

A **dataflow job** is a job that might not need to run because its desired
outputs already exist. To skip such a job, add the following line to your
submit file: :index:`dataflow<single: arguments; example>`
outputs already exist, and are more up-to-date than the input files.

Suggested change
outputs already exist, and are more up-to-date than the input files.
outputs already exist, and don't need to be recomputed.

skip_if_dataflow = True

queue

A dataflow job meets any of the following criteria:

This sentence doesn't make any sense, and the whole concept is compromised by calling it "skip-if-dataflow"; "dataflow" makes sense as a type of job, not the state of a job (e.g., already completed). We can't change the submit command name at this point, but this makes the word-smithing harder and more important.

We can say two different things here: "HTCondor assumes that you've specified all of your job's inputs and outputs; if you haven't, but set :subcom:`skip_if_dataflow` anyway, HTCondor could skip your job even if it should have been re-run." Or we can say "these are the technical conditions which require a job to be re-run/allow it to be skipped."

Actually, we should probably say both, but separately.

skip_if_dataflow = True

queue

A dataflow job meets any of the following criteria:

Maybe

Suggested change
A dataflow job meets any of the following criteria:
HTCondor assumes that you've specified all of your job's inputs and outputs; if you haven't, but set :subcom:`skip_if_dataflow` anyway, HTCondor could skip your job even if it should have been re-run.
All of the following conditions must be true for HTCondor to skip the job:

?

Comment on lines 560 to 562
* Output files exist, are newer than input files
* Execute file is newer than input files
* Standard input file is newer than input files

These don't seem like the right conditions, but I remember being confused by this before. It seems like the conditions ought to be:

  • Output files exist and are younger than (the youngest of) the input files, the standard input file, and the executable.

However, the existing list matches what the code actually does, where "A dataflow job meets any of the following criteria" means "The job is skipped if any of the following are true."
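
To make the difference between the two readings concrete, here is a minimal sketch in Python. It is not HTCondor's implementation. The function names (`skip_as_documented`, `skip_as_proposed`, `touch`) and file names are hypothetical, and it assumes that "newer than input files" means newer than the most recently modified input and that at least one input file is listed.

```python
import os
import tempfile


def mtime(path):
    """Return the file's modification time in seconds since the epoch."""
    return os.stat(path).st_mtime


def skip_as_documented(outputs, inputs, executable, stdin_file=None):
    """Literal reading of the existing list: skip if ANY one condition is true."""
    newest_input = max(mtime(p) for p in inputs)  # assumes at least one input
    outputs_newer = bool(outputs) and all(
        os.path.exists(p) and mtime(p) > newest_input for p in outputs
    )
    executable_newer = mtime(executable) > newest_input
    stdin_newer = stdin_file is not None and mtime(stdin_file) > newest_input
    return outputs_newer or executable_newer or stdin_newer


def skip_as_proposed(outputs, inputs, executable, stdin_file=None):
    """Proposed rule: skip only if every output exists and is newer than
    every input file, the standard input file, and the executable."""
    if not outputs or not all(os.path.exists(p) for p in outputs):
        return False
    deps = list(inputs) + [executable]
    if stdin_file is not None:
        deps.append(stdin_file)
    return min(mtime(p) for p in outputs) > max(mtime(p) for p in deps)


def touch(path, when):
    """Create an empty file and set its access and modification times to `when`."""
    with open(path, "w"):
        pass
    os.utime(path, (when, when))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        inp, out, exe = (os.path.join(d, n) for n in ("in.dat", "out.dat", "job.sh"))
        touch(inp, 100)  # input written first
        touch(out, 200)  # output produced from it
        touch(exe, 300)  # executable edited after the output was produced
        print(skip_as_documented([out], [inp], exe))  # True: the job is skipped
        print(skip_as_proposed([out], [inp], exe))    # False: the job is re-run
```

In this scenario the executable changed after the output was written, so the output is stale. The "any of" reading still skips the job, while the single combined condition re-runs it.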

@Todd-L-Miller

This has been merged into HTCONDOR-1899; we're going to simplify the documentation by fixing the code (a little).
