Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support current link types/link rules (support for workchains and workflows) at the import/export #1764

Merged
merged 34 commits into from
Aug 13, 2018

Conversation

szoupanos
Copy link
Contributor

This pull request implements the new export rules for nodes and links that take into account the changes for the correct export (and import) of workchains and workfunctions.

More info can be found at issues #1102 #1758

(Please don't merge it until I do some final cleaning - I will remove the BLOCKED flag)

…k what we do about the exported links since not all the predecessors are now exported (e.g. Calc-Calc in the reverse way is not exported)
…nced by a Calc or if it is exported independently). Tests added
…ceptions that make the import of a bunch of import files to stop
@codecov-io
Copy link

codecov-io commented Jul 18, 2018

Codecov Report

Merging #1764 into release_v0.12.2 will increase coverage by 0.02%.
The diff coverage is 62.2%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##           release_v0.12.2    #1764      +/-   ##
===================================================
+ Coverage            54.48%   54.51%   +0.02%     
===================================================
  Files                  246      246              
  Lines                32441    32602     +161     
===================================================
+ Hits                 17677    17772      +95     
- Misses               14764    14830      +66
Impacted Files Coverage Δ
aiida/cmdline/commands/rehash.py 33.33% <0%> (ø) ⬆️
aiida/common/archive.py 18.18% <0%> (-0.38%) ⬇️
aiida/cmdline/commands/importfile.py 7.69% <0%> (-0.42%) ⬇️
aiida/tools/dbexporters/tcod.py 55.87% <100%> (ø) ⬆️
aiida/cmdline/commands/exportfile.py 11.74% <42.85%> (+0.71%) ⬆️
aiida/orm/importexport.py 76.21% <66.12%> (-2.65%) ⬇️
aiida/backends/sqlalchemy/models/group.py 86.66% <0%> (-3.34%) ⬇️
aiida/orm/data/base.py 89.74% <0%> (ø) ⬆️
aiida/orm/implementation/general/code.py 65.85% <0%> (+1.46%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 003caf8...2fcc0e7. Read the comment docs.

@szoupanos
Copy link
Contributor Author

This is thoroughly tested in Django. When importing the export files mentioned above to an SQLA profile, I get exceptions similar to those reported at issue #1710

@szoupanos
Copy link
Contributor Author

After the fix for #1710 at PR #1769, import finished correctly at SQLA (without exceptions).
I imported several groups from a Django profile to an SQLA profile and I have to check the results
There are problems with the hashing under SQLA - I will create a separate ticket.

@szoupanos
Copy link
Contributor Author

Any hashing problems found were reported at issue #1771 and solved at the corresponding pull request. Import for SQLA also worked OK.

I exported the sample Django database that I got from Giovanni, and mentioned above, imported it to SQLA and the nodes and links seem to be imported correctly. Rehashing the SQLA database failed so node hashes were not checked.

@szoupanos
Copy link
Contributor Author

Re-hashing problems were fixed and the aforementioned hashes match.

@szoupanos szoupanos changed the title [BLOCKED] Support current link types/link rules (support for workchains and workflows) at the import/export Support current link types/link rules (support for workchains and workflows) at the import/export Aug 12, 2018
@szoupanos
Copy link
Contributor Author

szoupanos commented Aug 12, 2018

If there are no objections (i.e. from Leopold - @ltalirz), I would say that this should be merged (since it is a lot of work and starts to be old enough, the default scenario seems to work without any problems under Django and SQLA etc). Any smaller problems found can be fixed in different tickets/issues.

@ltalirz
Copy link
Member

ltalirz commented Aug 13, 2018

@szoupanos no objections, your Honor :-)
I can approve but perhaps you want to wait for the approval of someone more familiar with the details?

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from a few minor comments that should be easy to fix, I think that we should really put some effort soon towards refactoring this code. The import_tree and export_tree are enormous and make it difficult to follow. I wouldn't require it for this PR, but I do think this has to happen soon. Graph traversal based on different sets of rules (and export/import that is just an extension of that) is an integral part of AiiDA and should be robust. Therefore it would serve us well if this is abstracted and defined rigorously in one place, which export/import functionality can then just leverage.

help='Store only the nodes that are explicitly given, without exporting the parents')
@click.option('-O', '--no-calc-outputs', is_flag=True, default=False,
help='If a calculation is included in the list of nodes to export, do not export its outputs')
@click.option('-IF', '--input_forward', is_flag=True, default=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to maintain the convention of only using single letters for single-letter-options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but it seems that there are overlaps and many of the single letters are already reserved. For example which letter would you give to --call_reversed and to --create_reversed flags. Furthermore, I would like the flags that override the default graph traversal for the export to be as uniform (among them) as possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use -I, -C, -R and -X for example. Note that they don't overlap with any of the other flags in this command, nor are they necessarily less intuitive, specially given the long form is available. Which also the long form options should not use underscores, i.e. --create_reversed should be --create-reversed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also consider adding no single-letter version.
These are just abbreviations, which make sense for the options that appear in many places (and people will thus memorize their meaning)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, although they have another function besides easy memorization. If an option is used very often, having a short-hand single letter version is a lot faster to use, even with autocomplete functionality

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference the POSIX cli flag convention that we maintain: https://www.gnu.org/software/libc/manual/html_node/Argument-Syntax.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@click.option('-IF', '--input_forward', is_flag=True, default=False,
help='Follow forward INPUT links (recursively) when calculating the node '
"set to export. By default this is switched off.")
@click.option('-CrR', '--create_reversed', is_flag=True, default=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use show_default=True to have click automatically generate the default value in the help string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

from aiida.backends.utils import load_dbenv
load_dbenv()
from aiida.backends.utils import load_dbenv, is_dbenv_loaded
if not is_dbenv_loaded():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please replace this with the aiida.cmdline.utils.decorators.with_dbenv decorator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that, that only exists in develop. We should make sure that we update this when we merge 0.12.2 into develop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we will add it when it goes to develop - I will add a comment at that line. Do we have any other ways to remember it?

@@ -97,6 +97,8 @@ def add_link_from(self, src, label=None, link_type=LinkType.UNSPECIFIED):

if link_type is LinkType.CREATE and \
len(self.get_inputs(link_type=LinkType.CREATE)) > 0:
for l in self.get_inputs(link_type=LinkType.CREATE):
print "=====> ", l, l.dbnode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this print supposed to stay?

@@ -1629,20 +1809,27 @@ def test_links_for_workflows(self):
o1.add_link_from(w1, 'return', link_type=LinkType.RETURN)

uuids_wanted = set(_.uuid for _ in (w1,o1,i1))
links_wanted = [l for l in self.get_all_node_links() if l[3] in ('createlink', 'inputlink')]
# links_wanted = [l for l in self.get_all_node_links() if l[3] in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code

@@ -381,6 +386,10 @@ def import_data_dj(in_path,ignore_unknown_nodes=False,
"is neither a (possibly compressed) tar file, "
"nor a zip file.")

if not folder.get_content_list():
print "The provided file/folder is empty. Exiting silently"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Exiting silently" while printing is a bit weird no? Would it be better to raise an exception and catch it in the caller, who can then decide what to print to the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main idea is to avoid a more "violent" exception that will occur later when the metadata.json will not be found at the folder. The silent exit makes sense when running import in a script and an empty import files, doesn't stop a full script of 100 imports to run.

I can of course raise an exception and catch it and exit silently at import_data or one level above but I don't see a lot of advantages. The only advantage that I see is that I can unify the behaviour of import_data_dj & import_data_sqla.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these functions will be called mostly from verdi commands, we need to be able to control how the output is printed. Hard coding print statements in these functions makes that impossible. Just above this line, a ValueError is also raised to signal that the passed export file or folder is unusable. I don't see a reason why not to do the same here. But I notice that there are currently a load more print statements, so unless all of those are refactored, I guess changing just this one line won't accomplish too much

# If it is a Data node
else:
curr_node_id = to_be_visited[Data].pop()
# If it is already visited continue to the next node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate line


:param what: a list of Django database entries; they can belong to different
models.
:param what: a list of Django database entries; they can belong to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a backend independent function right?

@szoupanos
Copy link
Contributor Author

@sphuber Thanks for the comments! For the general comment, I would agree if the necessary manpower is found.

@@ -41,20 +41,33 @@ def cli(self, *args):
help='Export the given groups by pk')
@click.option('-g', '--group_names', multiple=True, type=str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also replace underscores with dashes in option names?
This is the way it's done in aiida/cmdline/params/options and we should adopt it aiida-wide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done to all the arguments of verdi export

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not pushed, right?

@szoupanos
Copy link
Contributor Author

OK, I think that I addressed everything.

ltalirz
ltalirz previously approved these changes Aug 13, 2018
@sphuber sphuber merged commit 915dcf9 into aiidateam:release_v0.12.2 Aug 13, 2018
@sphuber
Copy link
Contributor

sphuber commented Aug 13, 2018

Thanks a million @szoupanos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants