Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in upload_calculation for CalcJobs with local codes #3707

Merged
merged 1 commit into from
Jan 7, 2020

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Jan 6, 2020

Fixes #3705

This code path was not tested at all and so the remaining occurrence of
get_folder_list which is part of the old repository interface that was
removed in v1.0.0 went unnoticed.

Copy link
Member

@greschd greschd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I assume you checked that the test fails before the fix?

with NamedTemporaryFile(mode='wb+') as handle:
handle.write(code.get_object_content(f, mode='rb'))
handle.flush()
handle.seek(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the seek really necessary here? Doesn't transport.put open it itself anyway? If it's necessary, might warrant a comment as to why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it is not necessary, I will remove it. The code is a copy of a block a few lines lower. There I already put a comment that it should be improved using node.open once transport.put supports filelike objects to be passed, which is issue #2579 . I decided not to copy the comment again.

Copy link
Member

@ramirezfranciscof ramirezfranciscof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the PR because of the request, but I am adding the disclaimer that I'm still not very familiar with the objects handled here, so take my comments with a grain of salt.

_, node = launch.run_get_node(ArithmeticAddCalculation, **inputs)
uploaded_files = os.listdir(node.dry_run_info['folder'])
for filename in self.local_code.list_object_names():
self.assertTrue(filename in uploaded_files)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to understand that the test is comparing the files uploaded (node.dry_run_info['folder']) with the ones set up in the code (local_code.list_object_names()), but I couldn't find anything that tests that the configuration is correct (i.e. line 46 orm.Code(...)), or at least couldn't grep any orm.Code in aiida/backends/tests/orm/* and I didn't find anything in aiida/backends/tests/engine/test_process_builder. So if there is a problem with that it might not trigger this assertion? Perhaps we could either add another test or compare explicitly with the expected files?

handle.write(code.get_object_content(f, mode='rb'))
handle.flush()
handle.seek(0)
transport.put(handle.name, f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be because I'm still not familiar with transport and code objects, but from the current commit description (which only mentions that get_folder_list no longer exists) I don't understand why the change in the parameters of transport.put(...). It seems to change the loop over folders for a loop over objects (file names, correct?), but then I don't understand why you make a temporary file and pass the name of that instead of something related to the looping objects. Also, I don't understand the process of writing content, flushing and rewinding said temporary file.

Again, I'm not sure if this merits any code comment, but maybe just a brief explanation for this changes in the commit message?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That change is because the new API doesn't expose directly an absolute path to the file to be put on the remote computer (which wouldn't be possible on an object store backend). Since transport.put wants a filename, the content is first copied to a temporary file.

This part was clear to me, but I have been doing a fair bit of migrating between these two.. so maybe the commit could indeed be more explicit about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will include the reasoning for this and a reference to #2579 which will allow this hack to be removed.

@sphuber
Copy link
Contributor Author

sphuber commented Jan 6, 2020

Looks good. I assume you checked that the test fails before the fix?

Yes, it failed.

@@ -139,8 +139,12 @@ def upload_calculation(node, transport, calc_info, folder, inputs=None, dry_run=
for code in input_codes:
if code.is_local():
# Note: this will possibly overwrite files
for f in code.get_folder_list():
transport.put(code.get_abs_path(f), f)
for f in code.list_object_names():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work if list_object_names contains directories, right? Is that ensured not to happen elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should never contain directotries. If you look at the Code.set_files implementation, which is called to store the files of the local code, it will copy all the files to top level without nesting.

    def set_files(self, files):
        """
        Given a list of filenames (or a single filename string),
        add it to the path (all at level zero, i.e. without folders).
        Therefore, be careful for files with the same name!

I just noticed that I changed that code when the repository interface changed, and I am not sure it does 100% the same. The new implementation checks os.path.isfile which wasn't there explicitly. Old implementation:

for f in files:
    self.repository.add_path(f, os.path.split(f)[1])

However that also does not seem compatible with the docstring which says that you can pass both filenames and directories. I don't think that add_path was doing the correct hing either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, keeping it as files-only seems like a safe thing to do for now -- we can always add directory compatibility if it's needed. Might be good to assert it in the current piece of code, so it's a bit more explicit this is not supported?

Side note: This Therefore, be careful for files with the same name! could easily be improved by just checking first there are no duplicate file names, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but that seems a job for another time. If ever anyone actually starts even using local codes..

This code path was not tested at all and so the remaining occurrence of
`get_folder_list` which is part of the old repository interface that was
removed in `v1.0.0` went unnoticed.

The fix was forced to write files to temporary files and flush them
instead of using a filelike object, because `Transport.put` only accepts
absolute filepaths for the moment.
@sphuber sphuber merged commit 0e9be49 into aiidateam:develop Jan 7, 2020
@sphuber sphuber deleted the fix_3705_local_code branch January 7, 2020 09:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error when using local code: uses inexistent 'get_folder_list' method
3 participants