GroupPaths: A class to provide attribute/key access to Groups, with delimited labels. #3613

Merged 1 commit on Apr 8, 2020

chrisjsewell
Member

@chrisjsewell chrisjsewell commented Dec 9, 2019

Fixes #3589

Heya, this is some code that I've been using for a while, to manage my groups as a pseudo folder system, which has been super useful. From the tests you should be able to understand its functionality. For example, if you have three groups labelled ['f1/f2/f3a', 'f1/f2/f3b', 'f1/f2/f3-c/f4a'], you can do:

grouppaths = GroupPaths()
sub_group = grouppaths.f1.f2
for node in sub_group["f3a"].nodes:
    pass  # do something with each node
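
To make the folder analogy concrete, here is a minimal sketch in plain Python (no AiiDA required) of how `/`-delimited labels can be read as a virtual hierarchy. The helper name `child_names` is hypothetical and is not part of this PR; it only illustrates the idea, using the three labels from the example above.

```python
def child_names(labels, parent="", delimiter="/"):
    """Return the sorted names of the direct children of `parent`.

    `labels` is an iterable of group labels. This helper is hypothetical
    and only mimics a virtual hierarchy derived from delimited labels.
    """
    prefix = parent + delimiter if parent else ""
    children = set()
    for label in labels:
        if not label.startswith(prefix):
            continue
        remainder = label[len(prefix):]
        if remainder:
            # Keep only the first path component below `parent`.
            children.add(remainder.split(delimiter)[0])
    return sorted(children)


labels = ["f1/f2/f3a", "f1/f2/f3b", "f1/f2/f3-c/f4a"]
print(child_names(labels))           # ['f1']
print(child_names(labels, "f1/f2"))  # ['f3-c', 'f3a', 'f3b']
```

Note that `f3-c` is not a valid Python identifier, which is presumably why key access is offered alongside attribute access.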

It appears to relate closely to 'Group Nesting' in the aiidacw2019-plan 😀

@giovannipizzi
Member

We discussed some API changes with @chrisjsewell and @greschd, which @chrisjsewell will now implement.

@giovannipizzi
Member

Three questions that came to mind:

  • Are different group types differentiated by the API? Or is only the name used, so that users might unexpectedly see different things for different group types?
  • Shall we rename the auto-group labels from `Verdi autogroup on 2017-11-06 12:26:18` to something like `autogroup/20171106_122618`?
  • Shall we instead display the concept of group types as a top-level folder? Or would this be confusing?

Any feedback is welcome

@greschd
Member

greschd commented Dec 10, 2019

* shall we rename the auto-group labels from `Verdi autogroup on 2017-11-06 12:26:18` to something like `autogroup/20171106_122618`?

Seems nice. However, is this a backwards-incompatible change? We could create both, although that is also kind of ugly.

Do we have some way of specifying feature flags, e.g. in the configuration?

@chrisjsewell
Member Author

Morning!
8329485 implements pretty much all the discussed GroupPath API, except some points for discussion, which are denoted with # TODO in the source code.
If you look in the test module, it covers all the methods (I think), and so should give you an idea of how things are working. At the moment it's written with pytest, as it's quicker for development, and I expect you to have migrated to pytest by the end of this week 😜

Member

@giovannipizzi giovannipizzi left a comment

Maybe you weren't done yet, but here's my review - I think very little is left to do!

@chrisjsewell
Member Author

@giovannipizzi and @sphuber this is ready for 'final' review.
The only change that is definitely needed is for the tests. As you'll see, I'm currently using pytest, with the clear_database fixture. This works on my computer, but looks to be failing above due to no User being set the first time clear_database is called (but ok for subsequent tests!?). @ltalirz is this something that needs fixing within the fixture manager? Otherwise, I can just rewrite them in unittest.

@chrisjsewell
Member Author

One other thing I forgot. It would be ideal to include tab completion of paths. I know you have dynamic autocompletion for verdi data sub-commands, but I don't think it is currently possible to do this for arguments, considering I have an open PR on this very issue! click-contrib/click-completion#27

@ramirezfranciscof
Member

@giovannipizzi and @sphuber this is ready for 'final' review.
The only change that is definitely needed is for the tests. As you'll see, I'm currently using pytest, with the clear_database fixture. This works on my computer, but looks to be failing above due to no User being set the first time clear_database is called (but ok for subsequent tests!?). @ltalirz is this something that needs fixing within the fixture manager? Otherwise, I can just rewrite them in unittest.

@chrisjsewell @ltalirz

Why have the tests failed? Is it because of the problem we solved last week, or is it something else related to the quoted comment above? If everything is worked out now and you can rebase, I can then review the PR.

@ltalirz
Member

ltalirz commented Feb 26, 2020

Ah, this is a known issue: since the AiiDA core tests aren't yet fully migrated to pytest, there can be cases where the test database is completely empty, without even a user.

@greschd has just added a clear_database_before_test fixture that you can use to circumvent this. #3783

@chrisjsewell
Member Author

Yep @ramirezfranciscof, this is ready for review now 👍

Member

@ramirezfranciscof ramirezfranciscof left a comment

Thanks for the contribution @chrisjsewell ! This looks like a very interesting tool to help browse one's database in an easy way. I have made some comments regarding some of the functionalities/behavior that I found a bit unexpected but let me know if these had already been discussed during the coding week and were agreed upon as is. As an extra thing I would also ask you to move the test you added from its current location (aiida/backends/tests/tools/groups) to the new more external test section (test/tools/groups).

Since this is still one of my first PR reviews, it would be good if somebody else (@ltalirz @CasperWA @greschd , I think @sphuber is kind of busy right now...) could take a quick look to check that I didn't miss something obvious, or even check whether my comments are relevant or whether I should be focusing on other stuff ("yo' dog, I heard you like reviews...").

@chrisjsewell
Member Author

Thanks @ramirezfranciscof, I will work through these (it may take a little while though!)

@danieleongari

danieleongari commented Mar 27, 2020

Hi,
what is the status of this PR?
We are looking forward to this, because it would be very helpful for our use case! Thanks!

@chrisjsewell
Member Author

I’ll finish it off next week 👍

@codecov

codecov bot commented Apr 4, 2020

Codecov Report

Merging #3613 into develop will increase coverage by 0.08%.
The diff coverage is 90.95%.

@@             Coverage Diff             @@
##           develop    #3613      +/-   ##
===========================================
+ Coverage    78.00%   78.09%   +0.08%     
===========================================
  Files          457      458       +1     
  Lines        33714    33924     +210     
===========================================
+ Hits         26300    26494     +194     
- Misses        7414     7430      +16     

Flag          Coverage Δ
#django       70.14% <90.95%> (+0.13%) ⬆️
#sqlalchemy   70.98% <90.95%> (+0.13%) ⬆️

Impacted Files                           Coverage Δ
aiida/tools/groups/paths.py              90.80% <90.80%> (ø)
aiida/cmdline/commands/cmd_group.py      89.77% <91.66%> (+0.35%) ⬆️
aiida/transports/plugins/local.py        80.46% <0.00%> (+0.25%) ⬆️
aiida/manage/tests/pytest_fixtures.py    91.83% <0.00%> (+4.08%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 38c4684...ec6d7ea.

@chrisjsewell
Member Author

All changes made, over to you guys...

Contributor

@sphuber sphuber left a comment

Hi @chrisjsewell, thanks a lot. It is looking good. I am a bit late to the party with the review, but I still have some questions; sorry for that. Since these are mostly bigger design questions, it would be good to discuss them in person today with the other people involved and hash them out before starting to adapt the code. Once we have decided on these open questions, I think it shouldn't amount to much work whichever way we decide to go, and it should be easy to implement. The features that would be non-breaking we can even decide to add at a later point in time, so as not to block this PR. We should just make sure that we are certain about those that we cannot change easily after merging. Shall we perhaps meet after the 9 am meeting today?

@chrisjsewell
Member Author

chrisjsewell commented Apr 6, 2020

As discussed, CLI improvements to open in a separate "enhancements" issue, when this PR is merged:

  • Add --depth CLI option to work with --recursive
  • Add node count option; parent paths will return a count of all descendant nodes, e.g.
      $ verdi group path ls --recursive --count
      Path       Nodes
      -------  -------
      a             10
      a/b            4
      a/c            6
  • Option to report paths relative to base path, e.g.
      $ verdi group path ls a
      a/b
      a/c
      $ verdi group path ls a --relative
      b
      c
  • make delimiter publicly configurable
  • dealing with multiple type_string
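
The proposed `--count` and `--relative` behaviour can be sketched in plain Python as follows. This is only an illustration of the idea, not the CLI implementation: the function names `descendant_counts` and `relative_to` are hypothetical, and the sketch assumes a node is counted once per group it belongs to (so nodes shared between groups would be counted more than once).

```python
from collections import Counter


def descendant_counts(group_sizes, delimiter="/"):
    """Sum node counts over each path and all of its ancestors.

    `group_sizes` maps a group label to the number of nodes it holds.
    """
    totals = Counter()
    for label, size in group_sizes.items():
        parts = label.split(delimiter)
        # Credit the group's nodes to the group itself and every ancestor.
        for depth in range(1, len(parts) + 1):
            totals[delimiter.join(parts[:depth])] += size
    return dict(sorted(totals.items()))


def relative_to(path, base, delimiter="/"):
    """Strip `base` from the front of `path`, like the proposed --relative."""
    prefix = base + delimiter
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not below {base!r}")
    return path[len(prefix):]


sizes = {"a/b": 4, "a/c": 6}
for path, count in descendant_counts(sizes).items():
    print(f"{path:<8}{count}")
print([relative_to(p, "a") for p in ("a/b", "a/c")])
```

With the sizes above, the parent path `a` receives 4 + 6 = 10, matching the table in the proposal.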

@sphuber
Contributor

sphuber commented Apr 6, 2020

Another thing I just realized @chrisjsewell , in terms of enhancements for the future, is to maybe make the currently hardcoded delimiter in GroupPath configurable. Or would allowing this flexibility make the implementation/behavior too complicated?

@chrisjsewell
Member Author

Nope, that's fine; it's already a private variable and just needs to be added to the __init__ args: https://github.com/aiidateam/aiida-core/blob/72e5923876f534a08619202215a7063db1f150ae/aiida/tools/groups/paths.py#L72

@chrisjsewell
Member Author

Introduced a cyclic import by moving the files??

Messages
========

aiida/cmdline/params/options/conditional.py
  Line: 1
    pylint: cyclic-import / Cyclic import (aiida.engine.processes -> aiida.engine.processes.workchains -> aiida.engine.processes.workchains.restart)

@sphuber
Contributor

sphuber commented Apr 6, 2020

No I think you just got unlucky and this happened in a previous commit that you merged in. These cyclic-imports are not always trustworthy and are notoriously difficult to track down.

@chrisjsewell
Member Author

What's the course of action here 😕 ? Do I mess around with imports until it disappears, or do we turn a blind eye 🙈

"""Return the concrete group associated with this path."""
try:
    if self.type_string is not None:
        return orm.Group.objects.get(label=self.path)
Contributor

I just noticed a potential problem with this implementation. Consider the following

Group(label='a', type_string='user').store()
Group(label='a', type_string='auto.import').store()
path = GroupPath('a', type_string=None)
path.group  # This will raise, because the filters do not uniquely determine a single group.

You catch the NotExistent case but not the MultipleObjectsError that will be thrown in my example. Did you think about this and make the decision to ignore it, or is this something that wasn't discussed during the lifetime of the PR? I am pointing this out here because quite a bit of the rest of the functionality may depend on this.

To be honest, I am not quite sure what should happen here. Maybe we should make the type_string required. In this way, one can only use the GroupPath to iterate over a specific type of groups, but at least you know that the label will guarantee uniqueness then.
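
The ambiguity can be reproduced without a database. The following is a minimal sketch under stated assumptions: groups are modelled as plain `(label, type_string)` tuples, `get_unique` is a hypothetical helper, and the two exception classes are local stand-ins that merely share their names with the errors mentioned above.

```python
class NotExistent(Exception):
    """Local stand-in for the 'no match' error (not AiiDA's own class)."""


class MultipleObjectsError(Exception):
    """Local stand-in for the 'more than one match' error."""


def get_unique(groups, label, type_string=None):
    """Return the single (label, type_string) pair matching the filters."""
    matches = [
        group for group in groups
        if group[0] == label and (type_string is None or group[1] == type_string)
    ]
    if not matches:
        raise NotExistent(label)
    if len(matches) > 1:
        raise MultipleObjectsError(label)
    return matches[0]


groups = [("a", "user"), ("a", "auto.import")]
print(get_unique(groups, "a", type_string="user"))
try:
    get_unique(groups, "a")
except MultipleObjectsError:
    print("label 'a' is ambiguous without a type_string")
```

Requiring a `type_string` sidesteps the second case, since a label is then expected to identify at most one group.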

Member Author

No, I wouldn't say it has been specifically addressed. I agree that we should err on the side of disallowing a None type_string for now, and this can always be addressed in a later PR.

On a related note, how does your group sub-classing PR discriminate group classes; via the type string?

Member Author

see 5141b0d

Contributor

Yes, exactly, the type_string is what is used to load the correct subclass. I am actually just now working on updating the QueryBuilder such that querying with subclasses works properly, and this is also based on the type_string, which itself is determined by the entry point name. However, working on this made me realize that using core as the entry point name for the base Group class is not ideal, since this will never match any groups whose entry point does not start with core., which is to say all groups from external plugins... Not sure what the best approach is here. Setting the entry point for the Group base class to an empty string is not really possible, I think, so I will just have to make an exception for it in the query builder filtering logic.
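
The problem described here can be illustrated with a small sketch. It assumes, purely for illustration, that subclass matching is a prefix test on the type string; the function names and the example type strings (`core.auto`, `myplugin.special`) are hypothetical and this is not the QueryBuilder's actual logic.

```python
BASE_TYPE_STRING = "core"  # assumed entry point name of the base Group class


def matches(query_type, stored_type):
    """Naive rule: match stored groups whose type string starts with the queried one."""
    return stored_type.startswith(query_type)


def matches_with_exception(query_type, stored_type):
    """Same rule, but the base class is special-cased to match every group."""
    if query_type == BASE_TYPE_STRING:
        return True
    return stored_type.startswith(query_type)


for stored in ("core.auto", "myplugin.special"):
    print(stored, matches("core", stored), matches_with_exception("core", stored))
```

Under the naive rule, a query for the base class misses the plugin group; the special case restores the expectation that the base class matches all groups.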

Member Author

@chrisjsewell chrisjsewell Apr 6, 2020

👍
This is the use case where I could envisage you might want to start using GroupPath with multiple type_string. But yeh, I don't yet have a concrete solution for how to deal with path clashes in that instance.

Member Author

@chrisjsewell chrisjsewell Apr 6, 2020

So in 22ee75a, as discussed, I have changed the group property to the get_group method. This will allow for the later extension of replacing type_string with type_cls, at which point the association of a path to only 0 or 1 groups cannot be guaranteed.
Within this extension, all "iterative" methods would still work fine (like children and walk), but for methods where you directly access a group (get_group, delete_group, get_or_create_group), you would then want to add arguments/logic to ensure only one group is selected.

@chrisjsewell
Member Author

chrisjsewell commented Apr 6, 2020

@ltalirz I note that your pgsu package is not yet on Conda. Should this not have been picked up before the merge, with .github/workflows/test-install.yml?

@sphuber
Contributor

sphuber commented Apr 6, 2020

@ltalirz I note that your pgsu package is not yet on Conda. Should this have not been picked up before the merge, with .github/workflows/test-install.yml?

I think the test-install.yml is currently only triggered on merge into develop. Leo has already submitted the request for the recipe to be added.

@chrisjsewell
Member Author

I think the test-install.yml is currently only triggered on merge into develop

#3892 was a merge into develop though, wasn't it?

@sphuber
Contributor

sphuber commented Apr 6, 2020

#3892 was a merge into develop though, wasn't it?

Yes, what I meant was that the workflow only runs on new commits on develop. It is not run for PRs.

@chrisjsewell
Member Author

ah ok fair enough

Groups can be used to store nodes in AiiDA, but do not have any builtin
hierarchy themselves. However, often it may be useful to think of groups
as folders on a filesystem and the nodes within them as the files.

Building this functionality directly on the database would require
significant changes, but a virtual hierarchy based on the group labels
can be readily provided. This is what the new utility class `GroupPath`
facilitates. It allows group labels to be interpreted as the hierarchy
of groups. Example: consider one has groups with the following labels

    group/sub/a
    group/sub/b
    group/other/c

One could see this as the group `group` containing the sub groups `sub`
and `other`, with `sub` containing `a` and `b` itself. The `GroupPath`
class allows one to exploit this hierarchical naming::

    path = GroupPath('group')
    path.sub.a.get_group()  # will return group with label `group/sub/a`

It can also be used to create groups that do not yet exist:

    path = GroupPath()
    path.some.group.get_or_create_group()

This will create a `Group` with the label `some/group`. The `GroupPath`
class implements many other useful methods to make the traversing and
manipulating of groups a lot easier.
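
How attribute and key access can build up a delimited label is sketched below in plain Python. This is a toy class under stated assumptions, not the `GroupPath` implementation from this PR: the name `LabelPath` is hypothetical, it never touches the database, and the `delimiter` argument is included only to illustrate the configurable delimiter discussed in the conversation.

```python
class LabelPath:
    """Toy path object that builds delimited labels (not the PR's GroupPath)."""

    def __init__(self, path="", delimiter="/"):
        self._path = path
        self._delimiter = delimiter

    @property
    def path(self):
        """The full delimited label represented by this object."""
        return self._path

    def __getitem__(self, key):
        # Key access also covers names that are not valid identifiers.
        joined = self._delimiter.join(part for part in (self._path, key) if part)
        return LabelPath(joined, self._delimiter)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        return self[name]

    def __repr__(self):
        return f"LabelPath({self._path!r})"


root = LabelPath()
print(root.some.group.path)                          # some/group
print(root["f1"]["f3-c"].path)                       # f1/f3-c
print(LabelPath("group", delimiter=".").sub.a.path)  # group.sub.a
```

One design consequence of this approach is that a child named like an existing attribute (here `path`) can only be reached through key access.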
@sphuber sphuber dismissed stale reviews from ramirezfranciscof and giovannipizzi April 8, 2020 18:00

comments addressed

@sphuber sphuber merged commit b14243e into aiidateam:develop Apr 8, 2020
@greschd
Member

greschd commented Apr 8, 2020

Great stuff, thanks @chrisjsewell!

@chrisjsewell
Member Author

thanks 😁


Possibility to bundle Groups
7 participants