Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move CalcJob.presubmit call from CalcJob.run to Waiting.execute #3666

Merged
merged 1 commit into from
Dec 14, 2019

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Dec 13, 2019

Fixes #3664

The presubmit calls through to prepare_for_submission which is the
place where the CalcJob implementation writes the input files based on
the inputs nodes in a temporary sandbox folder. The contents of which
are then stored in the repository of the node before the process
transitions to the waiting state in await of the upload task. This
upload task will then copy over the contents of the node repository to
the working directory on the remote computer.

There is a use case to limit which files are copied to the node's
repository from the sandbox folder used by prepare_for_submission.
This means that the sandbox folder will have to be passed to the
upload_calculation. However, since the creation of this happens in the
CalcJob.run call, which is quite far removed from the eventual use,
this opens it up for a lot of problems. The time between folder
population by prepare_for_submission and eventual use in the upload
task can be significant, among other things because the upload task will
have to wait for transport. To make this feasible the creation of the
sandbox folder has to be moved closer to the upload task.

Here we move the presubmit call from CalcJob.run to the upload
transport task which will create the sandbox folder and pass it into the
upload_calculation. This limits the time frame in which there is a
chance for the contents of the sandbox to get lost. In addition, this
now gives additional freedom of deciding which files from the sandbox
are permanently stored in the node's repository for extra provenance.

@sphuber sphuber requested a review from muhrin December 13, 2019 14:14
Copy link
Contributor

@muhrin muhrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from my comment looks good

calcinfo = self.prepare_for_submission(folder)
scheduler = computer.get_scheduler()
for code in codes:
if not code.can_run_on(computer):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the equivalent of the previous is_local/get_local_executable/get_content_list check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a different check, this one was just done in the run method while the one you reference was being done in presubmit. I just moved the former here as it makes more sense here. I realize now I can merge the two loops. I will do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@muhrin I merged the two checks in the same loop, so this should be good to go

The `presubmit` calls through to `prepare_for_submission` which is the
place where the `CalcJob` implementation writes the input files based on
the inputs nodes in a temporary sandbox folder. The contents of which
are then stored in the repository of the node before the process
transitions to the waiting state in await of the upload task. This
upload task will then copy over the contents of the node repository to
the working directory on the remote computer.

There is a use case to limit which files are copied to the node's
repository from the sandbox folder used by `prepare_for_submission`.
This means that the sandbox folder will have to be passed to the
`upload_calculation`. However, since the creation of this happens in the
`CalcJob.run` call, which is quite far removed from the eventual use,
this opens it up for a lot of problems. The time between folder
population by `prepare_for_submission` and eventual use in the upload
task can be significant, among other things because the upload task will
have to wait for transport. To make this feasible the creation of the
sandbox folder has to be moved closer to the upload task.

Here we move the `presubmit` call from `CalcJob.run` to the upload
transport task which will create the sandbox folder and pass it into the
`upload_calculation`. This limits the time frame in which there is a
chance for the contents of the sandbox to get lost. In addition, this
now gives additional freedom of deciding which files from the sandbox
are permanently stored in the node's repository for extra provenance.
Copy link
Contributor

@muhrin muhrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@sphuber sphuber merged commit b8f40b1 into aiidateam:develop Dec 14, 2019
@sphuber sphuber deleted the fix_3664_move_presubmit_call branch December 14, 2019 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move call of CalcJob.prepare_for_submission to upload transport task
2 participants