Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise when calling Node.objects.delete for node with incoming links #4168

Merged
merged 1 commit into from
Jun 19, 2020

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Jun 9, 2020

Fixes #4167

This guard was already in place for a node that has outgoing links but
if a node had only incoming links, the method would happily delete it.
This means that one could delete output nodes for example as long as
they had not been used as inputs. Note also that this would leave the
link in place as the deletion is not cascaded automatically.

Deletion through the objects collection currently cannot deal with
automated cascading rules on links and nodes. These rules are
implemented in the ORM itself through the graph traversal rules for
deletion. Nodes that are not isolated should therefore only be deleted
through the utility function that makes sure to leave the resulting
provenance graph in a coherent state.

Regression test for storing a Node with (nested) repository
content with caching.
"""
class TestNodeDelete(AiidaTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these tests actually need the AiidaTestCase, or could that just be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, just copy/paste error, will remove it

@ramirezfranciscof
Copy link
Member

  1. So, this might be tangential to the PR itself, but how does this nested Collection classes work? Is there any advantage to having them inside the Class they are collection (other than being able to call all of them Collection without name clashes)? Is this a typical coding architecture or did you design this specifically for AiiDA?

Btw I think the docstring here is wrong, no?

class Node(Entity, metaclass=AbstractNodeMeta):
(...)
    class Collection(EntityCollection):
        """The collection of AuthInfo entries."""
(...)
  1. I understand this is another path to deleting nodes. Why can't we currently just call the delete_nodes method from inside of here? I mean, ideally both the command line and any other way of deleting nodes should go through the same path as much as possible right?

@sphuber
Copy link
Contributor Author

sphuber commented Jun 9, 2020

1. So, this might be tangential to the PR itself, but how does this nested `Collection` classes work? Is there any advantage to having them inside the Class they are collection (other than being able to call all of them `Collection` without name clashes)? Is this a typical coding architecture or did you design this specifically for AiiDA?

This is a common coding paradigm and the magic is in the Entity base class.

class Entity:
    """An AiiDA entity"""

    _objects = None

    # Define our collection type
    Collection = Collection

    @classproperty
    def objects(cls, backend=None):  # pylint: disable=no-self-argument
        """
        Get a collection for objects of this type.

        :param backend: the optional backend to use (otherwise use default)
        :type backend: :class:`aiida.orm.implementation.Backend`

        :return: an object that can be used to access entities of this type
        :rtype: :class:`aiida.orm.Collection`
        """
        backend = backend or get_manager().get_backend()
        return cls.Collection.get_collection(cls, backend)

By doing it like this, we just have to define the specific Collection class in each Entity subclass and the objects property works correctly.

Btw I think the docstring here is wrong, no?

Yes, it is incorrect.

1. I understand this is another path to deleting nodes. Why can't we currently just call the `delete_nodes` method from inside of here? I mean, ideally both the command line and any other way of deleting nodes should go through the same path as much as possible right?

Because the delete functionality of a node in AiiDA is quite complicated and often has big (unforeseen) consequences. A simple ORM method doesn't really convey this. Its signature would also become a lot more complex. The caller would have to be able to pass in the traversal rules. Will it always just delete everything it needs to without confirmation? If you do want some feedback, you are starting to build in a-back-and-forth that much more resembles CLI commands. That shouldn't be implemented in a simple ORM method.

This is all to say that we could just implement the delete method this way, allowing traversal rules to be passed, but it should not provide any functionality for dry-runs or confirmations. Which would make this very dangerous, as users will find the method in the API, maybe in the shell through auto-completion, and without realizing delete a huge portion of their provenance graph.

@ramirezfranciscof
Copy link
Member

ramirezfranciscof commented Jun 9, 2020

This is a common coding paradigm and the magic is in the Entity base class.

What is it called? I would like to read more about it. I tried googling "nested collection" but aparently this is not the correct name of the concept, he-he.

Because the delete functionality of a node in AiiDA is quite complicated and often has big (unforeseen) consequences. A simple ORM method doesn't really convey this. Its signature would also become a lot more complex. The caller would have to be able to pass in the traversal rules. Will it always just delete everything it needs to without confirmation? If you do want some feedback, you are starting to build in a-back-and-forth that much more resembles CLI commands. That shouldn't be implemented in a simple ORM method.

This is all to say that we could just implement the delete method this way, allowing traversal rules to be passed, but it should not provide any functionality for dry-runs or confirmations. Which would make this very dangerous, as users will find the method in the API, maybe in the shell through auto-completion, and without realizing delete a huge portion of their provenance graph.

Uhm, no, I agree that there are different uses that need to have slightly different behaviours. What I meant to say perhaps is that it could pay of to do some thinking considering all ways in which AiiDA aproaches the node deletion process that would allow us to structure the code into a more consistent shape. When I worked on the delete_nodes and the underlying graph_traversal I tried as much as possible to distribute and organize the code into functions that would have clear responsabilities in order to avoid repeating code and "leaking" functionality. I just feel like a bit of a pity that I was unaware of this alternate deletion path then, as perhaps I could have designed a more versatile tool that could be used here as well.

@sphuber sphuber force-pushed the fix/4167/node-objects-delete branch from c12dde2 to ff2a08a Compare June 9, 2020 10:42
@sphuber sphuber requested a review from greschd June 9, 2020 10:50
@codecov
Copy link

codecov bot commented Jun 9, 2020

Codecov Report

Merging #4168 into develop will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4168      +/-   ##
===========================================
+ Coverage    78.89%   78.93%   +0.04%     
===========================================
  Files          467      467              
  Lines        34503    34505       +2     
===========================================
+ Hits         27217    27232      +15     
+ Misses        7286     7273      -13     
Flag Coverage Δ
#django 70.85% <100.00%> (+0.03%) ⬆️
#sqlalchemy 71.73% <100.00%> (+0.04%) ⬆️
Impacted Files Coverage Δ
aiida/orm/nodes/node.py 92.91% <100.00%> (+1.70%) ⬆️
aiida/orm/implementation/django/nodes.py 94.42% <0.00%> (+1.02%) ⬆️
aiida/orm/implementation/sqlalchemy/nodes.py 95.33% <0.00%> (+1.87%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3a5853e...f58b4c7. Read the comment docs.

@sphuber
Copy link
Contributor Author

sphuber commented Jun 9, 2020

What is it called? I would like to read more about it. I tried googling "nested collection" but aparently this is not the correct name of the concept, he-he.

Not sure of the official name, but it is not necessarily related to "collections". It could be any abstract class that has another subclass as attribute that is exposed through its properties or method.

Uhm, no, I agree that there are different uses that need to have slightly different behaviours. What I meant to say perhaps is that it could pay of to do some thinking considering all ways in which AiiDA aproaches the node deletion process that would allow us to structure the code into a more consistent shape. When I worked on the delete_nodes and the underlying graph_traversal I tried as much as possible to distribute and organize the code into functions that would have clear responsabilities in order to avoid repeating code and "leaking" functionality. I just feel like a bit of a pity that I was unaware of this alternate deletion path then, as perhaps I could have designed a more versatile tool that could be used here as well.

Fair enough. It is always a good idea to think about and discuss the design and I agree that the design around deleting entities could benefit quite significantly

@sphuber sphuber force-pushed the fix/4167/node-objects-delete branch from ff2a08a to 430b1c2 Compare June 9, 2020 20:20
@sphuber
Copy link
Contributor Author

sphuber commented Jun 19, 2020

@ramirezfranciscof did you want to further discuss this or do you think this can be merged before the release

@ramirezfranciscof
Copy link
Member

Ah no, that's ok, I didn't do an approve review because I thought @greschd was doing the review but now I see it was just a comment. I'll do it now.

Copy link
Member

@ramirezfranciscof ramirezfranciscof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go, thanks!

@greschd greschd removed their request for review June 19, 2020 07:51
This guard was already in place for a node that has outgoing links but
if a node had only incoming links, the method would happily delete it.
This means that one could delete output nodes for example as long as
they had not been used as inputs. Note also that this would leave the
link in place as the deletion is not cascaded automatically.

Deletion through the objects collection currently cannot deal with
automated cascading rules on links and nodes. These rules are
implemented in the ORM itself through the graph traversal rules for
deletion. Nodes that are not isolated should therefore only be deleted
through the utility function that makes sure to leave the resulting
provenance graph in a coherent state.
@sphuber sphuber force-pushed the fix/4167/node-objects-delete branch from 430b1c2 to f58b4c7 Compare June 19, 2020 08:13
@sphuber sphuber merged commit 3a57c03 into aiidateam:develop Jun 19, 2020
@sphuber sphuber deleted the fix/4167/node-objects-delete branch June 19, 2020 09:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calling Node.objects.delete will not raise even if there are incoming links
3 participants