FIX: respect nested output namespaces in `Process.exposed_outputs` #4863

sphuber · 2021-04-19T14:04:14Z

The `exposed_outputs` method was detecting nested namespaces in the
dictionary of outputs of a node by using the namespace separator
character of the port namespace class. However, this character, which is
the `.`, gets converted to `__` when the link is stored in the database.
This transformation is necessary since the `.` is a reserved character
to enable attribute based dereferencing, e.g.

    node.outputs.some_output

Since link labels can contain underscores, we use a double underscore
which, together with the guarantee that link labels don't have leading
or terminating underscores, can uniquely determine the namespaces in any
given link label.

The solution is to not manually reconstruct the namespaces in the output
dictionary but simply use the `get_outgoing().nested()` method which
does it for us.
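
As a rough illustration of the label convention described above, the following self-contained sketch converts a dotted port name into the stored link label form and rebuilds a nested dictionary from flat labels. The helper names `port_to_link_label` and `nest_link_labels` are hypothetical and not part of aiida-core; in the actual fix this job is delegated to `get_outgoing().nested()`.

```python
# Hypothetical helpers for illustration only; not aiida-core API.
PORT_SEPARATOR = '.'    # separator used in port namespaces
LINK_SEPARATOR = '__'   # separator used in stored link labels


def port_to_link_label(port_name):
    """Convert a dotted port name to the label form stored on the link."""
    return port_name.replace(PORT_SEPARATOR, LINK_SEPARATOR)


def nest_link_labels(flat_outputs):
    """Rebuild nested namespaces from a flat mapping of link labels to values."""
    nested = {}
    for label, value in flat_outputs.items():
        # Everything before the last separator is a namespace, the rest is the leaf.
        *namespaces, leaf = label.split(LINK_SEPARATOR)
        current = nested
        for namespace in namespaces:
            current = current.setdefault(namespace, {})
        current[leaf] = value
    return nested


flat = {
    port_to_link_label('sub.sub_nested.result'): 1,
    port_to_link_label('sub.other_result'): 2,
    'top_level': 3,
}
nested = nest_link_labels(flat)
```

Splitting these stored labels on `.` instead of `__` would find no separator at all and leave every label as a single top-level key, which mirrors the problem described above.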

ramirezfranciscof · 2021-04-19T14:46:56Z

Does this also address #4623 ? I would think it should, or at least be intimately related. It was mentioned by @bosonie in issue #3533 but I've seen no comment regarding that.

JPchico · 2021-04-19T14:55:38Z

> Does this also address #4623 ? I would think it should, or at least be intimately related. It was mentioned by @bosonie in issue #3533 but I've seen no comment regarding that.

I do not think it does: `node.get_outgoing(link_label_filter=name)` would still fail to find the proper name, since it would pick just the top-level label instead of something like `namespace__port`, which is the kind of label one would have to pass to `link_label_filter`.

bosonie · 2021-04-19T14:56:23Z

> I do not think it does, since I think that node.get_outgoing(link_label_filter=name) would still fail to get the proper name as it would pick just the top level label instead of having something like namespace__port which is the kind of label that one would have to pass to link_label_filter.

Correct
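
To make the point above concrete, here is a minimal sketch that uses plain strings rather than a real database query. It assumes the filter is compared against the complete stored label; `match_link_labels` is a hypothetical stand-in and not the implementation of `get_outgoing`.

```python
# Hypothetical stand-in for a label filter; assumes exact matching on the stored label.
def match_link_labels(stored_labels, link_label_filter):
    """Return the stored labels that are equal to the given filter."""
    return [label for label in stored_labels if label == link_label_filter]


stored_labels = ['namespace__port', 'other_output']

# Passing only the top-level name finds nothing under this assumption.
top_level_only = match_link_labels(stored_labels, 'namespace')

# Passing the full stored label finds the link.
full_label = match_link_labels(stored_labels, 'namespace__port')
```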

codecov · 2021-04-19T16:39:19Z

Codecov Report

Merging #4863 (9668717) into develop (f1aab1f) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop    #4863   +/-   ##
========================================
  Coverage    79.62%   79.62%           
========================================
  Files          519      519           
  Lines        37095    37095           
========================================
  Hits         29532    29532           
  Misses        7563     7563

| Flag | Coverage Δ |
|---|---|
| django | `74.36% <100.00%> (ø)` |
| sqlalchemy | `73.24% <100.00%> (-0.02%)` ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| aiida/engine/processes/process.py | `91.73% <100.00%> (ø)` |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sphuber · 2021-04-26T07:36:38Z

@ramirezfranciscof could you review this today? This needs to be released in 1.6.2 as soon as possible, as it is blocking some users.

ramirezfranciscof

Sorry for the delay, LGTM!

BTW: I'm a bit curious about having tests for this in test_process.py and test_work_chain.py. It would seem that the main difference is that the first calls the methods directly (tests that the methods work), whereas the second goes through the engine running system (i.e. tests that these methods are actually called during run).

If this is so, perhaps the name / location of the tests is not very indicative of their purpose and it would be good to open an issue to at some point deal with that. Am I missing something else @sphuber ?

sphuber · 2021-04-26T09:02:45Z

Thanks for the quick review @ramirezfranciscof

> BTW: I'm a bit curious about having tests for this in test_process.py and test_work_chain.py. It would seem that the main difference is that the first calls the methods directly (tests that the methods work), whereas the second goes through the engine running system (i.e. tests that these methods are actually called during run).

Well, `exposed_outputs` is defined on the Process class and so I added the tests in the corresponding unit test file. That being said, exposing outputs really only makes sense for workflow type processes, because typically you want to expose the ports of a subprocess that you want to call, which is only something that workflows can do. Calculation processes therefore don't really have a use for it. But that is a different question really. I think that unit tests for a particular method should be done in the file of the respective class. If we move the method to another (sub)class, then the unit tests move as well.

ramirezfranciscof · 2021-04-26T10:23:46Z

> Well, the exposed_outputs is defined on the Process class and so I added the tests in the corresponding unit test. That being said, exposing outputs really only makes sense for workflow type processes because typically you want to expose the ports of a subprocess that you want to call, which is only something that workflows can do. Calculation processes therefore don't really have a use. But that is a different question really. I think that unit tests for a particular method should be done in the file of the respective class. If we move the method to another (sub)class, then the unit tests move as well.

Mmm ok, I think we are saying a similar thing. Let me reformulate maybe: I agree with the following points...

- Since exposed_outputs is defined on the Process class then the tests for the methods should be on test_process.py (I'm not sure why it has to be defined there instead of in the workchain subclass, but anyways I'm ok with that, it was not the issue).
- It is a good idea to test that the engine is calling this method and setting the right thing when we run or submit. And since it only makes sense to do this with a workchain, this should be what gets built.

My point was more to question whether the "integration" tests should be in test_work_chain.py. Now that I take a second look, I see that all (or at least most) of the tests in test_work_chain.py are done through run calls to the engine, which is the same way we seem to be testing all specific processes (test_calc_job.py, test_calcfunction.py, test_workfunction.py).

Now, I think I still have the question of why we don't test the methods of specific subclasses of Process more directly (without going through the engine). And perhaps also whether we should make it clearer where the tests for direct method calls should go, and where the "integration" ones that go through the engine should go.

sphuber · 2021-04-26T11:33:44Z

I don't think that there should be a distinction whether integration tests go into test_process or test_work_chain. They both should contain unit tests. Now, writing pure unit tests for these complex classes is tricky and would often require a lot of mocking. We are simply using the full execution to do the mocking for us. You can argue that this makes it an integration test, but I don't think it matters too much. I think the idea should be that whenever reasonably possible we should attempt to isolate the direct interface that we are testing and just mock the required state, but in the case of processes, this can be so difficult that simply running them may be a viable solution as well.
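
The trade-off described here can be sketched with a toy example. `ToyProcess` and its attributes are hypothetical and far simpler than a real `Process`; the point is only to contrast mocking the required state with letting a full run set it up.

```python
from unittest import mock


class ToyProcess:
    """Hypothetical stand-in for a process whose state is set up while running."""

    def __init__(self):
        self.outputs = None

    def run(self):
        # In a real process this state would be produced by the engine.
        self.outputs = {'sub': {'result': 1}}
        return self.exposed_result()

    def exposed_result(self):
        return self.outputs['sub']['result']


# Isolated style: mock only the state that the method under test needs.
isolated = ToyProcess()
with mock.patch.object(isolated, 'outputs', {'sub': {'result': 42}}):
    isolated_value = isolated.exposed_result()

# "Integration" style: let the full run set up the state.
full_run_value = ToyProcess().run()
```

For a class this small the isolated variant is trivial; the comment above argues that for real processes the amount of state to mock can make the second style the more practical one.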

mbercx · 2021-07-05T06:01:12Z

While trying to add a note on tab-completion for the outputs for the tutorial, I ran into the following deprecation warning:

In [1]: cj_node = load_node(311)
/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/managers.py:98: AiidaDeprecationWarning: dereferencing nodes with links containing double underscores is deprecated, simply replace the double underscores with a single dot instead. For example:
`self.inputs.some__label` can be written as `self.inputs.some.label` instead.
Support for double underscores will be removed in the future.
  'Support for double underscores will be removed in the future.', AiidaDeprecationWarning
In [2]: cj_node.outputs. <TAB>

The deprecation warning is printed when I try to tab-complete the outputs. @ramirezfranciscof guided me to this PR. I didn't look in much detail into what is responsible, but if I revert to aiida-core==1.6.1, before this change was released, I no longer have this problem.

sphuber · 2021-07-05T07:38:59Z

This is not the corresponding PR, but #4625 is. I am a bit surprised that the deprecation warning already triggers at tab completion though. It should only trigger when actually retrieving a node through a label containing double underscores. So I am confused how this is happening. Still, does the tab-completion work, despite the warning?

mbercx · 2021-07-05T08:47:53Z

> This is not the corresponding PR, but #4625 is. I am bit surprised that the deprecation warning triggers already at tab completion though. It should only trigger when actually retrieving a node through a label containing double underscores. So I am confused how this is happening. Still, does the tab-completion still work, despite the warning?

Yes, but it prints every time the tab-completion is attempted. 😅

unkcpz · 2021-07-06T03:42:53Z

By printing a traceback right before the deprecation warning I got the following information:

Traceback (most recent call last):                                                                                      
  File "/home/unkcpz/miniconda3/envs/aiida-core-dev/lib/python3.8/site-packages/jedi/cache.py", line 110, in wrapper
    return dct[key]
KeyError: ((), frozenset())  

During handling of the above exception, another exception occurred:
                              
Traceback (most recent call last):
  File "/home/unkcpz/Projects/python-code/aiida_core/aiida/orm/utils/managers.py", line 88, in _get_node_by_link_label
    node = attribute_dict[label]                                                                                        
KeyError: '__wrapped__'

I have no idea where this `__wrapped__` key comes from, but escaping it solves the issue.

sphuber · 2021-07-06T07:45:21Z

The `__wrapped__` dunder attribute is set when a function is wrapped by a decorator that uses `functools.wraps`, and it refers to the original wrapped function (see Python docs). The wrapping seems to come from this jedi cache that is present. I have no idea where this is coming from though and who is asking to cache this stuff. The problem, though, is that in our code we warn whenever the label contains a double underscore, which is also the case for dunder names, but we should ignore those. I split this issue off to #5010 and am submitting a PR to fix it.
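
A small sketch of the two points above, under stated assumptions: the first part shows that `functools.wraps` sets `__wrapped__` on a decorated function, and the second shows one possible way to skip dunder names before emitting the warning. `passthrough` and `should_warn_about_label` are hypothetical helpers, not the code in `managers.py`, and the actual fix for #5010 may differ.

```python
import functools


def passthrough(func):
    """Decorator using functools.wraps, which sets ``__wrapped__`` on the wrapper."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def original():
    return 'result'


decorated = passthrough(original)


def should_warn_about_label(label):
    """Hypothetical check: warn on double underscores, but not for dunder names."""
    is_dunder = label.startswith('__') and label.endswith('__')
    return '__' in label and not is_dunder
```

With this check a label such as `some__label` would still trigger the deprecation warning, while an attribute lookup for `__wrapped__` during tab-completion would not.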

sphuber mentioned this pull request Apr 19, 2021

Process.exposed_outputs does not work with nested namespaces #3533

Closed

sphuber force-pushed the fix/3533/exposed-outputs-nested-namespaces branch from 8c9a87e to 6867517 Compare April 19, 2021 16:00

sphuber force-pushed the fix/3533/exposed-outputs-nested-namespaces branch from 6867517 to e8b11c9 Compare April 19, 2021 20:22

unkcpz mentioned this pull request Apr 21, 2021

NodeLinkManager support dot separated namespace attributes retrieving #4625

Merged

sphuber requested review from chrisjsewell and ramirezfranciscof April 26, 2021 07:36

ramirezfranciscof approved these changes Apr 26, 2021

Merge branch 'develop' into fix/3533/exposed-outputs-nested-namespaces

9668717

ramirezfranciscof self-assigned this Apr 26, 2021

sphuber merged commit 42455ac into aiidateam:develop Apr 26, 2021

sphuber deleted the fix/3533/exposed-outputs-nested-namespaces branch April 26, 2021 09:21

sphuber mentioned this pull request Jul 6, 2021

Tab-completion for the outputs attribute of a Node issues a warning #5010

Closed
