Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relying on modification time may lead to unexpected results (issues with backup-list, and backup-fetch .. LATEST) #694

Open
NikolayS opened this issue Jun 29, 2020 · 10 comments

Comments

@NikolayS
Copy link
Contributor

NikolayS commented Jun 29, 2020

As I can see (from https://github.com/wal-g/wal-g/blob/master/internal/delete_handler.go#L92, https://github.com/wal-g/wal-g/blob/master/internal/delete_handler.go#L47) modification time is considered as the parameter to order full backups.

This may lead to undesired results when we list full backups or, which is even worse and more important, want to restore from the LATEST full backup.

Two examples when using modification time may cause issues

  1. Race condition in the case of overlapping backup-push:
    • we start backup-push on one node of a 10 TiB database. Let's name this backup, say, A;
    • 1 hour later we start backup-push on another node, with better disks – backup B;
    • B is finished sooner than A, therefore it has older modification time;
    • as a result, backup-list shows B as older, which is not true, and backup-fetch /path/to/pgdata LATEST will work with A, not B, which is also a mistake.
  2. If some external tool does something with old backups, the modification time is changed. I understand that WAL-G doesn't expect this, but this is exactly what may easily happen (see https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2297#note_368324826 – old backups are moved to colder/cheaper storage, this action changes their modification time). As a result, backup-list output looks messy, and backup-fetch /path/to/pgdata LATEST is not reliable anymore, it cannot be used because it may produce unpredicted results.

Note that WAL-E backup-list orders backups differently: it uses LSNs to order them (however, it also shows modification time and doesn't show creation time).

I propose using either LSNs or creation time and considering modification time only as an additional data, which by no means may be used when making decisions which one of the available full backups is the "LATEST".

@NikolayS NikolayS changed the title Relying on modification time may lead to unexpected results (issues with backup-list, and backup-fetch LATEST) Relying on modification time may lead to unexpected results (issues with backup-list, and backup-fetch .. LATEST) Jun 29, 2020
@NikolayS
Copy link
Contributor Author

NikolayS commented Jun 29, 2020

I realized, that technically, in the example above with backups A and B, A (which started earlier but finished later than B) may be considered as a "fresher" one because when we restore from it, we replay more WALs and achieve a fresher state. However, still, I would choose B which is started later -- because it would allow me to fetch fewer data from the archive during backup-fetch and restore faster which is very important in the case of DR. To reach a later point in time, I would fetch additional WALs using wal-fetch anyway.

If the speed is the same for both nodes or B is slower, ordering by creation time and modification time would lead to the same results (first A, then B, so B is considered as newer).

In other words, the use of creation time rather than modification time looks more attractive and logically correct in any case.

@NikolayS
Copy link
Contributor Author

cc @x4m @g0djan

@ikbenale
Copy link

ikbenale commented Sep 8, 2020

@x4m checking in on this issue for your input.

@NikolayS
Copy link
Contributor Author

NikolayS commented Sep 23, 2021

Testing 1.1 that should have this issue fixed.

What we have:

  • daily full backups stored in GCS
  • older backups automatically moved to a "colder" storage, which updates their modtime, so they pop up in the tail

1.1 doesn't seem to have this problem fixed:

$ sudo  /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g --version
wal-g version v1.1	cadf598	2021.08.17_18:17:11	PostgreSQL
$ sudo  /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g backup-list | tail -10
base_000000050005B94000000022 2021-09-19T10:27:44Z 000000050005B94000000022
base_000000050005BB4C000000B5 2021-09-20T09:37:42Z 000000050005BB4C000000B5
base_000000050005E9DB000000EE 2021-09-20T10:08:28Z 000000050005E9DB000000EE
base_0000000500060413000000F9 2021-09-20T12:41:36Z 0000000500060413000000F9
base_000000050006063800000080 2021-09-21T08:30:01Z 000000050006063800000080
base_000000050005BD6A00000000 2021-09-21T10:12:21Z 000000050005BD6A00000000
base_000000050005EBB70000000F 2021-09-21T20:33:29Z 000000050005EBB70000000F
base_0000000500060895000000C3 2021-09-22T08:38:31Z 0000000500060895000000C3
base_000000050005EE1000000049 2021-09-22T15:30:13Z 000000050005EE1000000049
base_000000050005BF8900000015 2021-09-22T17:39:19Z 000000050005BF8900000015

-- as we see here, it shows 3 backups for September 22, but in reality, we had only one (0000000500060895000000C3; two others are definitely older, see their LSNs)

I also see that the output of backup-list shows modified in the second column, not created – this seems to be inconsistent with the supposed order.

We need to ignore modtime and use either LSN or creation time, order by it, and present in the output explicitly.

@NikolayS
Copy link
Contributor Author

cc @x4m

@usernamedt
Copy link
Member

Testing 1.1 that should have this issue fixed.

What we have:

  • daily full backups stored in GCs
  • older backups automatically moved to a "colder" storage, which updates their modtime, so they pop up in the tail

1.1 doesn't seem to have this problem fixed:

$ sudo  /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g --version
wal-g version v1.1	cadf598	2021.08.17_18:17:11	PostgreSQL
$ sudo  /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g backup-list | tail -10
base_000000050005B94000000022 2021-09-19T10:27:44Z 000000050005B94000000022
base_000000050005BB4C000000B5 2021-09-20T09:37:42Z 000000050005BB4C000000B5
base_000000050005E9DB000000EE 2021-09-20T10:08:28Z 000000050005E9DB000000EE
base_0000000500060413000000F9 2021-09-20T12:41:36Z 0000000500060413000000F9
base_000000050006063800000080 2021-09-21T08:30:01Z 000000050006063800000080
base_000000050005BD6A00000000 2021-09-21T10:12:21Z 000000050005BD6A00000000
base_000000050005EBB70000000F 2021-09-21T20:33:29Z 000000050005EBB70000000F
base_0000000500060895000000C3 2021-09-22T08:38:31Z 0000000500060895000000C3
base_000000050005EE1000000049 2021-09-22T15:30:13Z 000000050005EE1000000049
base_000000050005BF8900000015 2021-09-22T17:39:19Z 000000050005BF8900000015

-- as we see here, it shows 3 backups for September 22, but in reality, we had only one (0000000500060895000000C3; two others are definitely older, see their LSNs)

I also see that the output of backup-list shows modified in the second column, not created – this seems to be inconsistent with the supposed order.

We need to ignore modtime and use either LSN or creation time, order by it, and present in the output explicitly.

Hi! Can you please also provide the output of the wal-g backup-list --detail command?

@NikolayS
Copy link
Contributor Author

NikolayS commented Sep 29, 2021

@usernamedt here it is (sudo /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g backup-list --detail | tail -10):

name                          modified             wal_segment_backup_start start_time           finish_time          hostname    data_dir        pg_version start_lsn        finish_lsn       is_permanent
...
base_0000000500060413000000F9 2021-09-20T12:41:36Z 0000000500060413000000F9 2021-09-20T04:10:20Z 2021-09-20T12:41:35Z hostnameXXX /path/to/pgdata 120006     1693333688920552 1694422252556592 false
base_000000050006063800000080 2021-09-21T08:30:01Z 000000050006063800000080 2021-09-21T00:00:02Z 2021-09-21T08:30:00Z hostnameXXX /path/to/pgdata 120007     1695689596396096 1696320379778216 false
base_0000000500060895000000C3 2021-09-22T08:38:31Z 0000000500060895000000C3 2021-09-22T00:00:02Z 2021-09-22T08:38:30Z hostnameXXX /path/to/pgdata 120007     1698289175215536 1699026661799232 false
base_0000000500060ADB00000083 2021-09-23T08:38:11Z 0000000500060ADB00000083 2021-09-23T00:00:02Z 2021-09-23T08:38:10Z hostnameXXX /path/to/pgdata 120007     1700787784252472 1701710696775192 false
base_0000000500060D5B0000001D 2021-09-24T08:40:43Z 0000000500060D5B0000001D 2021-09-24T00:00:02Z 2021-09-24T08:40:42Z hostnameXXX /path/to/pgdata 120007     1703534855636720 1704097980979656 false
base_0000000500060F75000000C8 2021-09-25T08:40:48Z 0000000500060F75000000C8 2021-09-25T00:00:02Z 2021-09-25T08:40:47Z hostnameXXX /path/to/pgdata 120007     1705848401506504 1706147095008432 false
base_0000000500061039000000BC 2021-09-26T08:33:26Z 0000000500061039000000BC 2021-09-26T00:00:02Z 2021-09-26T08:33:25Z hostnameXXX /path/to/pgdata 120007     1706690014373488 1706986898397688 false
base_00000005000610EE000000E5 2021-09-27T08:40:38Z 00000005000610EE000000E5 2021-09-27T00:00:02Z 2021-09-27T08:40:37Z hostnameXXX /path/to/pgdata 120006     1707468093682336 1708190998879856 false
base_0000000500061353000000A9 2021-09-28T08:45:10Z 0000000500061353000000A9 2021-09-28T00:00:02Z 2021-09-28T08:45:09Z hostnameXXX /path/to/pgdata 120007     1710099898883816 1710808253201512 false
base_00000005000615B600000040 2021-09-29T08:44:02Z 00000005000615B600000040 2021-09-29T00:00:02Z 2021-09-29T08:44:01Z hostnameXXX /path/to/pgdata 120007     1712722372283792 1713393884340464 false

meanwhile, the output of sudo /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g backup-list | tail -10:

name                          modified             wal_segment_backup_start
...
base_000000050005C4D9000000F0 2021-09-26T10:33:30Z 000000050005C4D9000000F0
base_00000005000610EE000000E5 2021-09-27T08:40:38Z 00000005000610EE000000E5
base_000000050005F68E0000009F 2021-09-27T13:10:13Z 000000050005F68E0000009F
base_000000050005C6E3000000B3 2021-09-27T13:42:11Z 000000050005C6E3000000B3
base_0000000500061353000000A9 2021-09-28T08:45:10Z 0000000500061353000000A9
base_000000050005C90B00000073 2021-09-28T09:49:05Z 000000050005C90B00000073
base_000000050005F8FB0000006E 2021-09-28T15:34:33Z 000000050005F8FB0000006E
base_00000005000615B600000040 2021-09-29T08:44:02Z 00000005000615B600000040
base_000000050005FB8000000089 2021-09-29T12:51:08Z 000000050005FB8000000089
base_000000050005CB30000000BD 2021-09-29T13:32:37Z 000000050005CB30000000BD

Sorting for the short version differs from that for the long one.

@usernamedt
Copy link
Member

@usernamedt here it is (sudo /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g backup-list --detail | tail -10):


name                          modified             wal_segment_backup_start start_time           finish_time          hostname    data_dir        pg_version start_lsn        finish_lsn       is_permanent

...

base_0000000500060413000000F9 2021-09-20T12:41:36Z 0000000500060413000000F9 2021-09-20T04:10:20Z 2021-09-20T12:41:35Z hostnameXXX /path/to/pgdata 120006     1693333688920552 1694422252556592 false

base_000000050006063800000080 2021-09-21T08:30:01Z 000000050006063800000080 2021-09-21T00:00:02Z 2021-09-21T08:30:00Z hostnameXXX /path/to/pgdata 120007     1695689596396096 1696320379778216 false

base_0000000500060895000000C3 2021-09-22T08:38:31Z 0000000500060895000000C3 2021-09-22T00:00:02Z 2021-09-22T08:38:30Z hostnameXXX /path/to/pgdata 120007     1698289175215536 1699026661799232 false

base_0000000500060ADB00000083 2021-09-23T08:38:11Z 0000000500060ADB00000083 2021-09-23T00:00:02Z 2021-09-23T08:38:10Z hostnameXXX /path/to/pgdata 120007     1700787784252472 1701710696775192 false

base_0000000500060D5B0000001D 2021-09-24T08:40:43Z 0000000500060D5B0000001D 2021-09-24T00:00:02Z 2021-09-24T08:40:42Z hostnameXXX /path/to/pgdata 120007     1703534855636720 1704097980979656 false

base_0000000500060F75000000C8 2021-09-25T08:40:48Z 0000000500060F75000000C8 2021-09-25T00:00:02Z 2021-09-25T08:40:47Z hostnameXXX /path/to/pgdata 120007     1705848401506504 1706147095008432 false

base_0000000500061039000000BC 2021-09-26T08:33:26Z 0000000500061039000000BC 2021-09-26T00:00:02Z 2021-09-26T08:33:25Z hostnameXXX /path/to/pgdata 120007     1706690014373488 1706986898397688 false

base_00000005000610EE000000E5 2021-09-27T08:40:38Z 00000005000610EE000000E5 2021-09-27T00:00:02Z 2021-09-27T08:40:37Z hostnameXXX /path/to/pgdata 120006     1707468093682336 1708190998879856 false

base_0000000500061353000000A9 2021-09-28T08:45:10Z 0000000500061353000000A9 2021-09-28T00:00:02Z 2021-09-28T08:45:09Z hostnameXXX /path/to/pgdata 120007     1710099898883816 1710808253201512 false

base_00000005000615B600000040 2021-09-29T08:44:02Z 00000005000615B600000040 2021-09-29T00:00:02Z 2021-09-29T08:44:01Z hostnameXXX /path/to/pgdata 120007     1712722372283792 1713393884340464 false

meanwhile, the output of sudo /usr/bin/envdir /etc/wal-g.d/env /opt/wal-g/bin/wal-g backup-list | tail -10:


name                          modified             wal_segment_backup_start

...

base_000000050005C4D9000000F0 2021-09-26T10:33:30Z 000000050005C4D9000000F0

base_00000005000610EE000000E5 2021-09-27T08:40:38Z 00000005000610EE000000E5

base_000000050005F68E0000009F 2021-09-27T13:10:13Z 000000050005F68E0000009F

base_000000050005C6E3000000B3 2021-09-27T13:42:11Z 000000050005C6E3000000B3

base_0000000500061353000000A9 2021-09-28T08:45:10Z 0000000500061353000000A9

base_000000050005C90B00000073 2021-09-28T09:49:05Z 000000050005C90B00000073

base_000000050005F8FB0000006E 2021-09-28T15:34:33Z 000000050005F8FB0000006E

base_00000005000615B600000040 2021-09-29T08:44:02Z 00000005000615B600000040

base_000000050005FB8000000089 2021-09-29T12:51:08Z 000000050005FB8000000089

base_000000050005CB30000000BD 2021-09-29T13:32:37Z 000000050005CB30000000BD

Sorting for the short version differs from that for the long one.

Ok, then it appears that WAL-G sorts the backups using the creation time only when being launched with the '--detail' flag. I think that this behavior difference is not cool and needs a fix. I can look into it in the nearby future.

@NikolayS
Copy link
Contributor Author

@usernamedt right, this matches my understanding. Thanks.

@usernamedt
Copy link
Member

@usernamedt right, this matches my understanding. Thanks.

I've written a plan to implement the use of the backup creation time in the generic backup-list handler. Hopefully, I or somebody else will put an effort to implement it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants