-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
♻️ REFACTOR: Delegate RepositoryBackend
control to the Backend
#5169
♻️ REFACTOR: Delegate RepositoryBackend
control to the Backend
#5169
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #5169 +/- ##
===========================================
- Coverage 81.01% 81.00% -0.01%
===========================================
Files 535 535
Lines 37392 37401 +9
===========================================
+ Hits 30289 30292 +3
- Misses 7103 7109 +6
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Hi! In the meantime, just to know where we're heading towards (so that it's clear and, if we all agree, will make it faster to review further PRs), @chrisjsewell did you already write down somewhere (e.g. in an issue) a minimal description of how you envision the classes to behave eventually? (In addition, having this written down will make it easy to move it to the design docs). |
(I guess it's this discussion thread? #5154 |
Bear with me @giovannipizzi, I am literally just writing some UML diagrams for this (before and after)! |
Ok this is done in #5172 (comment) (see also https://www.figma.com/file/oluW9WVn1v1yDj6pWpTiQ7/aiida-backend-design?node-id=11%3A266) |
Thanks for the ping and waiting for me to get a chance to comment. I am perfectly happy to discuss restructuring the part of the code described in the OP and fully agree that there are quite some improvements to be had. But I also agree with @giovannipizzi that it would make sense to first work out in schematic's what design we want to achieve and discuss on this. The PRs should then be much easier to review and approve. The code we have now is the result of many step-wise improvements which are not ideal to get to good solutions. So I would put this PR on hold and organize the discussion and go from there. |
@sphuber I have already created the design schematic in https://www.figma.com/file/oluW9WVn1v1yDj6pWpTiQ7/aiida-backend-design?node-id=11%3A266 As discussed in #5172 (comment) |
RepositoryBackend
control to the Backend
RepositoryBackend
control to the Backend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @chrisjsewell just two questions about the method deletions on the Repository
class
def initialise(self, **kwargs: Any) -> None: | ||
"""Initialise the repository if it hasn't already been initialised. | ||
|
||
:param kwargs: keyword argument that will be passed to the ``initialise`` call of the backend. | ||
""" | ||
self.backend.initialise(**kwargs) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this done now? Is this no longer needed? Or do you just think that the BackendRepository
shouldn't even define an initialise
method but that should be the responsibility of the implementation? It seems to me that if we remove this here, we should probably also remove it from the BackendRepository
interface, shouldn't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for this and the delete
method below, my feeling is you shouldn't need direct access to them from the Repository
, which just provides the interface to the repository for a single Node
s file tree (perhaps in fact the name Repository
is a little misleading and should be NodeRepository
or something to denote it is only scoped to a single node).
If you are going to initialise/delete the whole repository, then you should be doing it via the BackendRepository
, which is what now happens.
(It's a bit like having a Node().delete()
that actually deletes every Node
in the database)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these were only really required before, because Profile.get_repository()
returned Repository
, whereas now Backend.get_repository()
returns BackendRepository
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree with that if Repository
was indeed designed to only be used with a single node, but it is not the case. It can have access to the entire repository in principle. It is in fact the aiida.orm.nodes.repository.NodeRepositoryMixin
that scopes access to the repository for just the files of a single node. Note that that class then also doesn't have a path to Repository.delete
. So I would keep these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that that class then also doesn't have a path to
Repository.delete
.
Not sure what mean by this; you can still call Repository.backend.erase()
.
Eurgh, I'm kind of against this, because you should never ever be calling Repository.delete
to delete the Container
, and nowhere in the code do you ever have to.
Again, it would be like if you asked your computer to delete a folder, and it went ahead and deleted your whole file system, i.e. since there is a many-to-one correspondence of Repository
to BackendRepository
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what mean by this; you can still call Repository.backend.erase().
Sure, but what I am saying is that from the public Node
interface, there is no way to delete the entire repo. Of course you can do it from Node._repository.delete()
but if you are accessing protected attributes there is all sorts of things you can break.
Eurgh, I'm kind of against this, because you should never ever be calling Repository.delete to delete the Container, and nowhere in the code do you ever have to.
The point is that I wrote the Repository
interface as a generic one and not one that imposed the AiiDA concept that it is node-centric and from the node you only get access to part of the repository contents. I see the Repository
as a generic interface to the entire repo and in that sense I think it makes sense to have a delete method and initialise, since some backend implementations might need it. The fact that the public facing API of AiiDA currently doesn't make use of it (as it shouldn't) is not a reason to remove it. I think it is perfectly fair to have such a method that could for example be called by maintenance operations. If you then have to go directly through to the backend implementation defeats the purpose of having the Repository
front end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree to disagree; you also cannot delete the entire database from the front-end.
Plus, I still think it will be overly confusing to people, the distinction between RepositoryBackend
-> Repository
-> NodeRepositoryMixin
; Repository
essentially sits somewhere between the back and front-end
I think it makes sense to have a delete method and initialise, since some backend implementations might need it.
The Repository
class is not part of the backend, so backend implementations cannot do anything with it.
Anyhow, I re-added these redundant methods back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyhow, I re-added these redundant methods back
You forgot to reinstate the test though. Waiting for the approval would have probably caught that mistake.
I still think it will be overly confusing to people
If with "people" you mean users, I think that is not the problem. The Repository
is not and was not intended to be part of the public API. It is a generic interface to any repository that could be used. It intentionally does not embed the concept that the contents are viewed from the perspective of a node instead of its entirety.
I think it makes sense to have a delete method and initialise, since some backend implementations might need it.
The Repository class is not part of the backend, so backend implementations cannot do anything with it.
What I meant here is implementations of the "repository backend", i.e. RepositoryBackend
, some of which may require some sort of initialization, as is the case for the disk object store implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant here is implementations of the "repository backend", i.e. RepositoryBackend, some of which may require some sort of initialization, as is the case for the disk object store implementation.
yes this exactly where the initialise method is already; RepositoryBackend.initialise
, and why it is redundant to have Repository.initialise
def delete(self) -> None: | ||
"""Delete the repository. | ||
|
||
.. important:: This will not just delete the contents of the repository but also the repository itself and all | ||
of its assets. For example, if the repository is stored inside a folder on disk, the folder may be deleted. | ||
|
||
""" | ||
self.backend.erase() | ||
self.reset() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
…iidateam#5169) This commit moves `Profile.get_repository() -> Repository`, to `Backend.get_repository() -> BackendRepository`. The original delegation, (1) did not allow different backends to implement different repository classes, and (2) made it impossible to implement/instantiate a `Backend` that was not directly dependent on the currently loaded `Profile`. Additionally, the commit improves the docstring of the `Backend`, and fixes a minor issue, that `verdi profile delete` left an empty repository folder for that profile.
This PR is a result of observations in #5154 and #5145:
By the current placement of
get_repository
on theProfile
class, this caused two principal issues:Backend
implementations to theDiskObjectStoreRepositoryBackend
Backend
that was not directly dependent on the currently loadedProfile
.These issues, for example, preclude the possibility of implementing a "full" archive backend and instantiating it, whilst a "main"
Backend
is also loaded.This PR moves the
get_repository
method to theBackend
class, addressing both these issues.It also removes the
instatiate
anddelete
methods from the (node)Repository
class (which were just piping through to the backend), since a single node should not have "direct" access to deleting the whole profile's object store 😬.Additionally, it improves the docstring of the
Backend
and fixes a minor issue, thatverdi profile delete
left an empty repository folder for that profile.Note, this is likely the first in the number of PRs to address #5154:
No offence to the people involved (hindsight is always 20/20) but the whole abstraction and flow of control/information between
Config
,Profile
,Manager
,BackendManager
, andBackend
is IMO a bit of a mess really (plus some unnecessary use of global variables, which should always be avoided wherever possible).This greatly hampers implementing desirable features like profile switching.
Firtly, the flow from
Profile
->Backend
could be improved by having theBackend
instantiate with aProfile
(and having this also set theBackend
on theProfile
, to ensure only oneBackend
is instantiated per profile).With this PR, but already with
Backend.get_session
, we are pulling information from the profile via a global variable, rather than via aself.profile
attribute.This would then also allow for the
ENGINE
andSESSION_FACTORY
global variables, inaiida/backends/djsite/__init__.py
andaiida/backends/sqlalchemy/__init__.py
, to be removed; delegating responsibility for database session management to the relevant backend.Similarly,
Config.delete_profile
andManager._load_backend
(currently hard coded to sqla/django backends), should delegate some of their responsibilities to theBackend
implementation.