Improve the backup mechanism of the configuration file #3581
The `Config` class was creating a new backup each time `store` was called, even when the configuration content had not changed. Since backup files were recently made unique through a timestamp, this led to many backups that were essentially clones. To improve this, the `Config.store` method now only writes the file to disk if the contents in memory have changed. This is done by comparing the checksum of the file on disk with that of the in-memory contents as written to a temporary file. If the checksums differ, a backup of the existing file is created and the new contents written to the temporary file are copied to the actual configuration file location. This new approach also guarantees that the backup is always created before the original file is overwritten.
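The mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `file_checksum` and `store_if_changed`, the choice of SHA-256, the JSON serialization, and the timestamp format of the backup name are all assumptions made for the example.

```python
import hashlib
import json
import os
import shutil
import tempfile
from datetime import datetime


def file_checksum(filepath):
    """Return the SHA-256 hex digest of the contents of a file (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(filepath, 'rb') as handle:
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def store_if_changed(dictionary, filepath):
    """Write `dictionary` as JSON to `filepath` only if it differs from what is on disk.

    Returns the path of the backup that was created, or None if no backup was made
    (either the file did not exist yet or its contents were unchanged).
    """
    # Serialize the in-memory contents to a temporary file first.
    with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as handle:
        json.dump(dictionary, handle, indent=4, sort_keys=True)
        tmp_path = handle.name

    backup_path = None
    try:
        if os.path.exists(filepath):
            if file_checksum(filepath) == file_checksum(tmp_path):
                return None  # Contents unchanged: write nothing, create no backup.
            # Contents differ: back up the existing file *before* overwriting it.
            timestamp = datetime.now().strftime('%Y%m%d-%H%M%S.%f')
            backup_path = '{}.{}'.format(filepath, timestamp)
            shutil.copy(filepath, backup_path)
        shutil.copy(tmp_path, filepath)
    finally:
        os.remove(tmp_path)  # The temporary file is always cleaned up.
    return backup_path
```

Calling `store_if_changed` twice in a row with the same dictionary leaves a single file on disk; only a call with different contents produces a timestamped backup next to it.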
aiida/manage/configuration/config.py
@@ -39,14 +38,22 @@ class Config(object): # pylint: disable=too-many-public-methods
    def from_file(cls, filepath):
        """Instantiate a configuration object from the contents of a given file.

        .. note:: if the filepath does not exist it will be created with the default configuration.
Where in the function is the "default content" written to the file?
It is not. The file is touched but is initially empty. The `Config` instance is then created with the default configuration but is not stored explicitly. This is done by the caller of `from_file` if needed. In our case this is done in `load_config`, which calls `Config.from_file`. If I add to the docstring that the file is touched but the content is not written, is that clear enough?
Taking a step back, would it not be cleaner if `.from_file` simply raised an exception when the filepath being passed does not exist? After all, the task of a `.from_file` method is to instantiate a `Config` object from the filepath being passed.
If one wants to create a default configuration, shouldn't one simply call the constructor instead, followed by `.store()`?
I find this intermediate step confusing, where the configuration file is created but is empty.
This design followed more or less directly from your other request (with which I agree) to not have warning messages printed in the `Config` constructor. So the detection of whether the configuration has to be migrated, as well as the actual migration (including the warning messages and backup generation), has to be done elsewhere. I think putting this in the `from_file` class method makes a lot of sense. Then the question was just how to deal with non-existing files. I think it is fine to put this here as well. This way one can always just call `from_file`, also in all the unittest utils, which then don't have to worry about creating the file first or creating an initial default profile.
> Then the question was just how to deal with non-existing files. I think it is fine to put this here as well. This way one can always just call `from_file` also in all the unittest utils that don't have to worry about creating the file first or creating an initial default profile

I agree that it is a little bit less code to write, and so I'm not totally against it (but I would still say that this stretches the meaning of what a `.from_file` method is supposed to do).
Let's keep it for the moment. In that case, the question still remains whether this intermediate stage is necessary, where the configuration file is simply "touched" but without the actual content.
If the configuration file does not exist, can we not simply write the default content to it?
No, you are right, the indirection is not necessary. I will make that final simplification.
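For illustration, the simplification agreed on here (writing the default content directly when the file does not exist, rather than touching an empty file) might look something like the sketch below. `SimpleConfig`, its `DEFAULT` contents, and its constructor signature are hypothetical stand-ins and do not reproduce the actual `Config` class.

```python
import copy
import json
import os


class SimpleConfig:
    """Hypothetical, stripped-down stand-in for a configuration class."""

    DEFAULT = {'profiles': {}}  # Assumed default content, for illustration only.

    def __init__(self, filepath, dictionary=None):
        self.filepath = filepath
        # Deep copy so that instances never share or mutate the class-level default.
        self.dictionary = copy.deepcopy(self.DEFAULT) if dictionary is None else dictionary

    @classmethod
    def from_file(cls, filepath):
        """Load the configuration from `filepath`, creating it with defaults if missing."""
        if not os.path.exists(filepath):
            # No intermediate "touched but empty" state: store the defaults right away.
            config = cls(filepath)
            config.store()
            return config
        with open(filepath) as handle:
            return cls(filepath, json.load(handle))

    def store(self):
        """Write the in-memory contents to disk."""
        with open(self.filepath, 'w') as handle:
            json.dump(self.dictionary, handle, indent=4)
        return self
```

With this shape, callers such as test utilities can call `from_file` on a path that does not exist yet and always end up with a valid, non-empty file on disk.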
Thanks @sphuber
One question: if I understand correctly, you are saying there are many calls to `store()` that are, essentially, unnecessary (since the file content actually isn't changing).
Would it be possible to simply remove those?
This was mostly happening during running the unittests.
Do you know whether this is an unintended consequence of
I think I found the problem and it is exactly because
I would certainly agree that I would say that
That only happens though when the specified
Thanks for the patience @sphuber ;-)
Looks good to me now!
Not at all, the code ended up way better as a result of your review. Seems the system works :) Thanks a lot.
    config.store()
else:
    # If the configuration file needs to be migrated first create a specific backup so it can easily be reverted
    if config_needs_migrating(config):
Sorry, one last thing I noticed: if the config file needs migrating, will this now result in two backups?
Won't the `.store` method take care of this backup automatically?
Ah no, because there is no storing going on here...
One could potentially replace the call to `_backup` with a `.store` though.
We could have, but for the reporting it is nice to write the filename of the created backup, which is only returned by `_backup`. So I think it is fine to keep it like this.
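The point about reporting can be made concrete with a small sketch: a backup helper that returns the path it wrote lets the migration code tell the user exactly which file to restore. The names `backup`, `needs_migrating`, `load_and_migrate`, the `version` key, and `CURRENT_VERSION` are hypothetical and only loosely modelled on the `_backup` and `config_needs_migrating` calls discussed in this thread.

```python
import json
import os
import shutil
from datetime import datetime

CURRENT_VERSION = 3  # Hypothetical schema version, for illustration only.


def needs_migrating(dictionary):
    """Return True if the stored configuration is older than the current version."""
    return dictionary.get('version', 0) < CURRENT_VERSION


def backup(filepath):
    """Copy `filepath` to a timestamped sibling file and return the backup path."""
    timestamp = datetime.now().strftime('%Y%m%d-%H%M%S.%f')
    backup_path = '{}.{}'.format(filepath, timestamp)
    shutil.copy(filepath, backup_path)
    return backup_path


def load_and_migrate(filepath):
    """Load a configuration, backing it up first if it has to be migrated.

    Returns the (possibly migrated) dictionary and the backup path, or None
    for the path if no migration was needed.
    """
    with open(filepath) as handle:
        dictionary = json.load(handle)

    backup_path = None
    if needs_migrating(dictionary):
        # Explicit backup, so its filename is available for the report below.
        backup_path = backup(filepath)
        print('configuration migrated, original backed up to {}'.format(backup_path))
        dictionary['version'] = CURRENT_VERSION  # Stand-in for the real migration.
    return dictionary, backup_path
```

Nothing is written back to the original file in this sketch, which mirrors the observation above that no storing happens at this point, so only one backup is created.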
Fixes #3580