Relying on modification time may lead to unexpected results (issues with backup-list, and backup-fetch .. LATEST) #694
I realized that, technically, in the example above with backups A and B, A (which started earlier but finished later than B) may be considered the "fresher" one, because when we restore from it, we replay more WALs and reach a fresher state. However, I would still choose B, which started later, because it would allow me to fetch less data from the archive during

If the speed is the same for both nodes or B is slower, ordering by creation time and by modification time would lead to the same result (first A, then B, so B is considered newer). In other words, using creation time rather than modification time looks more attractive and logically correct in any case.
@x4m checking in on this issue for your input.
Testing 1.1 that should have this issue fixed. What we have:
1.1 doesn't seem to have this problem fixed:
As we see here, it shows 3 backups for September 22, but in reality we had only one (0000000500060895000000C3; the two others are definitely older, see their LSNs). I also see that the output of

We need to ignore modtime and use either LSN or creation time, order by it, and present it in the output explicitly.
cc @x4m
Hi! Can you please also provide the output of the
@usernamedt here it is (
meanwhile, the output of
Sorting for the short version differs from that for the long one.
Ok, then it appears that WAL-G sorts the backups by creation time only when launched with the '--detail' flag. I think this difference in behavior is not cool and needs a fix. I can look into it in the near future.
@usernamedt right, this matches my understanding. Thanks.
I've written a plan to implement the use of the backup creation time in the generic backup-list handler. Hopefully, I or somebody else will put in the effort to implement it.
As I can see (from https://github.com/wal-g/wal-g/blob/master/internal/delete_handler.go#L92, https://github.com/wal-g/wal-g/blob/master/internal/delete_handler.go#L47) modification time is considered as the parameter to order full backups.
This may lead to undesired results when we list full backups or, which is even worse and more important, want to restore from the LATEST full backup.
Two examples where using modification time may cause issues:

- `backup-push`:
  - `backup-push` on one node of a 10 TiB database. Let's name this backup, say, A;
  - `backup-push` on another node, with better disks: backup B;
  - `backup-list` shows B as older, which is not true, and `backup-fetch /path/to/pgdata LATEST` will work with A, not B, which is also a mistake.
- `backup-list` output looks messy, and `backup-fetch /path/to/pgdata LATEST` is not reliable anymore; it cannot be used because it may produce unpredictable results.

Note that WAL-E `backup-list` orders backups differently: it uses LSNs to order them (however, it also shows modification time and doesn't show creation time).

I propose using either LSNs or creation time and treating modification time only as additional data, which by no means may be used when deciding which of the available full backups is the "LATEST".
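The A/B scenario above can be sketched in Go as follows. This is a minimal illustration, not WAL-G's implementation: the `Backup` struct, its fields, and the timestamps are hypothetical, chosen only so that A starts first and finishes last.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Backup is a simplified, hypothetical record; it is not WAL-G's actual type.
type Backup struct {
	Name     string
	Started  time.Time // when backup-push began (creation time)
	Modified time.Time // when the backup was last written in storage
}

func byStart(b Backup) time.Time    { return b.Started }
func byModified(b Backup) time.Time { return b.Modified }

// latestBy returns the name of the backup that sorts last under the given
// time key, or "" when there are no backups.
func latestBy(backups []Backup, key func(Backup) time.Time) string {
	if len(backups) == 0 {
		return ""
	}
	sorted := make([]Backup, len(backups))
	copy(sorted, backups)
	sort.SliceStable(sorted, func(i, j int) bool {
		return key(sorted[i]).Before(key(sorted[j]))
	})
	return sorted[len(sorted)-1].Name
}

// exampleBackups models the scenario: A starts first on a slow node and
// finishes last; B starts later on a faster node and finishes first.
func exampleBackups() []Backup {
	day := time.Date(2020, time.January, 1, 0, 0, 0, 0, time.UTC)
	return []Backup{
		{Name: "A", Started: day, Modified: day.Add(10 * time.Hour)},
		{Name: "B", Started: day.Add(1 * time.Hour), Modified: day.Add(4 * time.Hour)},
	}
}

func main() {
	backups := exampleBackups()
	fmt.Println("LATEST by modification time:", latestBy(backups, byModified))
	fmt.Println("LATEST by start time:", latestBy(backups, byStart))
}
```

Under these assumed timestamps, ordering by modification time picks A as the latest backup, while ordering by start time picks B, which is the behavior proposed here.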