Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CalcJob: make sure local_copy_list files do not end up in node repo #4415

Merged

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Sep 30, 2020

Fixes #4414

The concept of the local_copy_list is to provide a possibility to
CalcJob plugins to write files to the remote working directory but
that are not also copied to the calculation job's repository folder.
However, due to commit 9dfad2e this
guarantee is broken.

The relevant commit refactored the handling of the local_copy_list in
the upload_calculation method to allow the target filepaths in the
list to contain nested paths with subdirectories that might not yet
necessarily exist. The approach was to first write all files to the
sandbox folder, where it is easier to deal with non-existing
directories. To make sure that these files weren't then also copied to
the node's repository folder, the copied files were also added to the
provenance_exclude_list. However, the logic in that part of the code
did not normalize filepaths, which caused files to be copied that
shouldm't have. The reason is that the provenance_exclude_list could
contain ./path/file_a.txt, which would be compared to the relative
path path/file_a.txt which references the same file, but the strings
are not equal.

The solution is to ensure that all paths are fully normalized before
they are compared. This will turn the relative path ./path/file_a.txt
into path/file_a.txt.

@codecov
Copy link

codecov bot commented Sep 30, 2020

Codecov Report

Merging #4415 into release/1.4.2 will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@                Coverage Diff                @@
##           release/1.4.2    #4415      +/-   ##
=================================================
+ Coverage          79.23%   79.24%   +0.02%     
=================================================
  Files                475      475              
  Lines              34831    34827       -4     
=================================================
+ Hits               27594    27596       +2     
+ Misses              7237     7231       -6     
Flag Coverage Δ
#django 73.07% <100.00%> (-<0.01%) ⬇️
#sqlalchemy 72.31% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/engine/daemon/execmanager.py 62.13% <100.00%> (+0.17%) ⬆️
aiida/common/hashing.py 94.90% <0.00%> (-0.20%) ⬇️
aiida/engine/transports.py 96.37% <0.00%> (ø)
aiida/cmdline/utils/echo.py 80.00% <0.00%> (ø)
aiida/orm/utils/managers.py 92.07% <0.00%> (ø)
aiida/restapi/common/utils.py 71.70% <0.00%> (ø)
aiida/orm/utils/builders/code.py 83.34% <0.00%> (ø)
aiida/tools/calculations/base.py 80.00% <0.00%> (ø)
aiida/cmdline/utils/repository.py 100.00% <0.00%> (ø)
aiida/engine/processes/process.py 92.69% <0.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c4a8b4...739eb14. Read the comment docs.

The concept of the `local_copy_list` is to provide a possibility to
`CalcJob` plugins to write files to the remote working directory but
that are not also copied to the calculation job's repository folder.
However, due to commit 9dfad2e this
guarantee is broken.

The relevant commit refactored the handling of the `local_copy_list` in
the `upload_calculation` method to allow the target filepaths in the
list to contain nested paths with subdirectories that might not yet
necessarily exist. The approach was to first write all files to the
sandbox folder, where it is easier to deal with non-existing
directories. To make sure that these files weren't then also copied to
the node's repository folder, the copied files were also added to the
`provenance_exclude_list`. However, the logic in that part of the code
did not normalize filepaths, which caused files to be copied that
shouldm't have. The reason is that the `provenance_exclude_list` could
contain `./path/file_a.txt`, which would be compared to the relative
path `path/file_a.txt` which references the same file, but the strings
are not equal.

The solution is to ensure that all paths are fully normalized before
they are compared. This will turn the relative path `./path/file_a.txt`
into `path/file_a.txt`.
@sphuber sphuber force-pushed the fix/4414/local-copy-list-stored-repo branch from aff24c5 to 739eb14 Compare September 30, 2020 10:45
@sphuber sphuber changed the base branch from develop to release/1.4.2 September 30, 2020 10:46
Copy link
Member

@giovannipizzi giovannipizzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick fix!

@giovannipizzi
Copy link
Member

I guess we need to remove the required flag from the 3.5 tests?

@sphuber
Copy link
Contributor Author

sphuber commented Sep 30, 2020

I guess we need to remove the required flag from the 3.5 tests?

Not really. This is going to be merged in release/1.4.2 which still supports Python 3.5. That is why I am merging this straight into the release branch and not into develop. Not sure why the tests didn't run though. The release/1.4.2 branch and this PR branch still define 3.5 in the workflow yml.

@sphuber sphuber merged commit 91df33e into aiidateam:release/1.4.2 Sep 30, 2020
@sphuber sphuber deleted the fix/4414/local-copy-list-stored-repo branch September 30, 2020 21:43
@espenfl
Copy link
Contributor

espenfl commented Oct 1, 2020

Thanks for issuing a fix for this asap. Greatly appreciated. What releases have this issue? We need to block them in the VASP plugin.

@sphuber
Copy link
Contributor Author

sphuber commented Oct 1, 2020

@espenfl v1.4 and v1.4.1 are affected by this bug. I will release a patch release a.s.a.p. Just waiting for one other potential bug that needs to be fixed.

@espenfl
Copy link
Contributor

espenfl commented Oct 1, 2020

@sphuber Thanks. I will block those. Also, I will test to see if 1.4 triggers our tests for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Files in local_copy_list permanently stored in the repository
3 participants