Problems with ood-daemon restore backup data #269
After analyzing the actual recovery data and the local data, and reviewing the whole remote restore process, we found that the problem is caused by the order of the uni restore steps combined with the existing activation detection of ood-daemon.

Restore process

The current uni restore recovery steps are defined as follows: CYFS/src/component/cyfs-backup-lib/src/backup/restore_status.rs Lines 8 to 16 in 89110cc
It can be seen that the restore step restores the key-data first, then the objects and chunks. The {cyfs}/etc/desc directory, which holds the device.desc/device.sec identity files, is part of the key-data, so the files in etc/desc are recovered first during the whole restoration process.

Binding detection mechanism of ood-daemon

The binding detection logic of ood-daemon is that if the identity file (device.desc/device.sec) in the

So the problem arises: after the gateway is started, the remote restore process is still running, so the local objects and chunks are still incomplete at this time. When the gateway loads the root-state, in the case of loading the root, there will be CYFS/src/component/cyfs-lib/src/prelude/named_object_cache.rs Lines 142 to 148 in 89110cc
CYFS/src/component/cyfs-lib/src/prelude/named_object_cache.rs Lines 261 to 281 in 89110cc
So the gateway will initialize a new global-state whose content is completely empty, resulting in the above problem.
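The failure mode can be illustrated with a minimal sketch. It assumes that a root object missing from the local store is handled the same way as a first start, which is one plausible reading of the behaviour described here and not a copy of the code in named_object_cache.rs. `GlobalState`, `ObjectStore` and `load_or_init_root` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the gateway's global-state (not a CYFS type).
#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct GlobalState {
    entries: Vec<String>,
}

/// Hypothetical local object store, keyed by object id.
struct ObjectStore {
    objects: HashMap<String, GlobalState>,
}

/// Assumed behaviour: if the root object is not in the local store,
/// fall back to a fresh, empty state instead of reporting an error.
fn load_or_init_root(store: &ObjectStore, root_id: &str) -> GlobalState {
    match store.objects.get(root_id) {
        Some(state) => state.clone(),
        // While a restore is still running, the root may simply not have
        // arrived yet; this branch silently replaces it with an empty state.
        None => GlobalState::default(),
    }
}
```

Under this assumption, a gateway that starts before the objects are restored cannot tell "not restored yet" apart from "never existed", which matches the empty root-state seen after the restore.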
This problem can be fixed from the following two sides:

1. Adjust the steps of uni restore

Give priority to restoring objects and chunks, and restore key-data last, so that {cyfs}/etc/desc is released only after objects and chunks are restored, which avoids the above problem. Also, during the key-data recovery process, we need to improve the order in which files are recovered: the db files should be recovered first and the etc directory last, to ensure that all the data has been recovered by the time ood-daemon detects the identity files in the etc/desc dir.

2. Improve the binding detection logic of the ood-daemon service

Since the remote restore is carried out inside ood-daemon, ood-daemon can stop detecting bindings while there is a remote restore task, and continue with the binding monitor logic only after the restore task is completed. However, this improvement is limited and can only play a supplementary role: if an external independent process is used for the restore operation, then ood-daemon has no way to know the progress of the restore.
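The two proposals can be sketched roughly as follows. This is only an illustration under stated assumptions: `RestorePhase`, `restore_phases`, `order_key_data` and `BindMonitor` are hypothetical names, the enum does not mirror the one in restore_status.rs, and treating "db files" as paths ending in `.db` is a guess made for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical restore phases (not the actual enum in restore_status.rs).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RestorePhase {
    Objects,
    Chunks,
    KeyData,
}

/// Proposal 1: restore bulk data first and key-data last.
fn restore_phases() -> [RestorePhase; 3] {
    [RestorePhase::Objects, RestorePhase::Chunks, RestorePhase::KeyData]
}

/// Rank a key-data path: db files first, everything under etc/ last.
fn key_data_rank(path: &str) -> u8 {
    if path.starts_with("etc/") {
        2
    } else if path.ends_with(".db") {
        0 // assumption: db files are recognised by their extension
    } else {
        1
    }
}

/// Order key-data entries so the identity files in etc/desc appear last.
fn order_key_data(mut paths: Vec<String>) -> Vec<String> {
    // sort_by_key is stable, so entries with the same rank keep their order.
    paths.sort_by_key(|p| key_data_rank(p));
    paths
}

/// Proposal 2: skip bind detection while an in-process restore is running.
struct BindMonitor {
    restore_in_progress: AtomicBool,
}

impl BindMonitor {
    /// The device counts as bound only if the identity files exist and no
    /// restore task is currently running inside the same process.
    fn is_bound(&self, identity_files_present: bool) -> bool {
        identity_files_present && !self.restore_in_progress.load(Ordering::Acquire)
    }
}
```

The flag in `BindMonitor` only helps when the restore runs inside the same process, which is why the second proposal is described as supplementary; the ordering in the first proposal holds regardless of which process performs the restore.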
…d-daemon-restore-backup-data' into main
The first option above has been adopted, including the following two major changes. 1.
The above two points ensure that, even with the existing bind detection logic of ood-daemon, an ood can be restored and bound in a single restore operation. See 3cd354a
According to feedback from some related products, OOD has been found to be missing data during gateway use after restoring data according to the following process:
In this case, the ood is in an unbound state and only the ood-daemon process is available
This process may take longer, depending on the size of the backup data, etc.
{cyfs}/etc/desc directory is backed up.

It's found that in some cases the data is missing, especially when accessing the root-state to get the dec app's state: the state is completely different from the state of the backed-up source ood.