Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the execmanager.upload_calculation idempotent'ish #3146

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Jul 8, 2019

Fixes #3145

The upload_calculation would cause an exception if called multiple
times for the same calculation, which can happen if the first time that
the runner was working on it got interrupted, for example due to a
daemon shutdown. The reason is that the second time around the adding of
the remote_folder data node will raise a uniqueness exception,
because there can only be one output with the same label.

Note that full idem-potency is impossible, but this change should make
the problem a lot less likely to occur. The idea is to delay the actual
attaching of the remote folder data node to the last moment possible.
This way, if the method is called again and the folder is already there,
we can be reasonably sure that the files were already retrieved
successfully and we simply return, leaving the call a no-op. This is
done in the beginning of the function to check if the output node already
exists using the LinkManager.first() call. If the node exists, the
upload function has apparently already been called before and reached
the end of the function where it adds the remote folder. This means all
the files were already successfully uploaded so we can safely skip it.

@sphuber sphuber force-pushed the fix_3142_execmanager_upload_calculation_idempotence branch from 2ab663c to ddb6d95 Compare July 9, 2019 13:27
@sphuber sphuber changed the title [BLOCKED] Make the execmanager.upload_calculation idempotent as best as possible Make the execmanager.upload_calculation idempotent as best as possible Jul 9, 2019
@sphuber sphuber changed the title Make the execmanager.upload_calculation idempotent as best as possible Make the execmanager.upload_calculation idempotent'ish Jul 9, 2019
@sphuber
Copy link
Contributor Author

sphuber commented Jul 9, 2019

@giovannipizzi this is now unblocked and ready for review

@giovannipizzi
Copy link
Member

Could you put a comment also here when calling add_incoming that this should be the last thing to do in the function? (And also in this case put a note in the issue about transactions)

The `upload_calculation` would cause an exception if called multiple
times for the same calculation, which can happen if the first time that
the runner was working on it got interrupted, for example due to a
daemon shutdown. The reason is that the second time around the adding of
the `remote_folder` data node will raise a uniqueness exception,
because there can only be one output with the same label.

Note that full idem-potency is impossible, but this change should make
the problem a lot less likely to occur. The idea is to delay the actual
attaching of the remote folder data node to the last moment possible.
This way, if the method is called again and the folder is already there,
we can be reasonably sure that the files were already retrieved
successfully and we simply return, leaving the call a no-op. This is
done in the beginning of the function to check if the output node already
exists using the `LinkManager.first()` call. If the node exists, the
upload function has apparently already been called before and reached
the end of the function where it adds the remote folder. This means all
the files were already successfully uploaded so we can safely skip it.
@sphuber sphuber force-pushed the fix_3142_execmanager_upload_calculation_idempotence branch from ddb6d95 to 629385b Compare July 9, 2019 13:39
@sphuber
Copy link
Contributor Author

sphuber commented Jul 9, 2019

Done. I also noticed I made a mistake in the return value of the initial check. Now it also returns the correct tuple

@sphuber sphuber merged commit b215ed5 into aiidateam:develop Jul 9, 2019
@sphuber sphuber deleted the fix_3142_execmanager_upload_calculation_idempotence branch July 9, 2019 14:09
d-tomerini pushed a commit to d-tomerini/aiida_core that referenced this pull request Sep 30, 2019
)

The `upload_calculation` would cause an exception if called multiple
times for the same calculation, which can happen if the first time that
the runner was working on it got interrupted, for example due to a
daemon shutdown. The reason is that the second time around the adding of
the `remote_folder` data node will raise a uniqueness exception,
because there can only be one output with the same label.

Note that full idem-potency is impossible, but this change should make
the problem a lot less likely to occur. The idea is to delay the actual
attaching of the remote folder data node to the last moment possible.
This way, if the method is called again and the folder is already there,
we can be reasonably sure that the files were already retrieved
successfully and we simply return, leaving the call a no-op. This is
done in the beginning of the function to check if the output node already
exists using the `LinkManager.first()` call. If the node exists, the
upload function has apparently already been called before and reached
the end of the function where it adds the remote folder. This means all
the files were already successfully uploaded so we can safely skip it.
d-tomerini pushed a commit to d-tomerini/aiida_core that referenced this pull request Oct 16, 2019
)

The `upload_calculation` would cause an exception if called multiple
times for the same calculation, which can happen if the first time that
the runner was working on it got interrupted, for example due to a
daemon shutdown. The reason is that the second time around the adding of
the `remote_folder` data node will raise a uniqueness exception,
because there can only be one output with the same label.

Note that full idem-potency is impossible, but this change should make
the problem a lot less likely to occur. The idea is to delay the actual
attaching of the remote folder data node to the last moment possible.
This way, if the method is called again and the folder is already there,
we can be reasonably sure that the files were already retrieved
successfully and we simply return, leaving the call a no-op. This is
done in the beginning of the function to check if the output node already
exists using the `LinkManager.first()` call. If the node exists, the
upload function has apparently already been called before and reached
the end of the function where it adds the remote folder. This means all
the files were already successfully uploaded so we can safely skip it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the execmanager.upload_calculation idempotent as best as possible
2 participants