backupccl: add deprecation notice for backup cmd with explicit subdir #79447
One nuance to note: if the latest backup in a collection is one with a user defined subdir, an incremental backup with LATEST will not issue a warning. I think this is fine.
@@ -792,6 +792,14 @@ func backupPlanHook(
initialDetails.Destination.Subdir = latestFileName
} else if subdir != "" {
initialDetails.Destination.Subdir = "/" + strings.TrimPrefix(subdir, "/")
// Deprecation notice for `BACKUP INTO` syntax with an explicit subdir.
// Remove this once the syntax is deleted in 22.2.
p.BufferClientNotice(ctx,
Just to make sure I understand correctly - we want to deprecate this in BACKUP, but not RESTORE, right? So they can restore from arbitrary points in a series of incremental layers?
that's correct! Users of SHOW BACKUP and RESTORE should still be able to pick the specific full backup in a collection.
Nice, thank you!
Release note (sql change): Previously, BACKUP commands allowed the user to specify a custom subdirectory name for their backups via `BACKUP .. INTO <subdir> IN <collectionURI>`. After this change, this will no longer be supported. Users can only create a full backup via `BACKUP ... INTO <collectionURI>` or an incremental backup on the latest full backup in their collection via `BACKUP ... INTO LATEST IN <collectionURI>`. This deprecation also removes the need to address a bug in `SHOW BACKUPS IN`, which cannot display user-defined subdirectories.
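The branch structure in the review diff can be sketched as a small standalone Go function. This is only an illustration, not the code in `backupPlanHook`: the function name `resolveSubdir`, the `latestSubdir` parameter, and the notice text are all hypothetical, since the diff above is truncated before the actual message.

```go
package main

import "strings"

// deprecationNotice is a hypothetical message; the real notice text is not
// visible in the truncated diff.
const deprecationNotice = "BACKUP INTO with an explicit subdirectory is deprecated"

// resolveSubdir returns the subdirectory a backup would target and any client
// notice to buffer. latestSubdir is an assumed stand-in for whatever the real
// code resolves LATEST to.
func resolveSubdir(subdir, latestSubdir string) (resolved, notice string) {
	switch {
	case strings.EqualFold(subdir, "LATEST"):
		// Incremental backup on the most recent full backup. No notice is
		// issued, even if that backup lives in a user-defined subdirectory.
		return latestSubdir, ""
	case subdir != "":
		// Explicit subdirectory: normalize the leading slash and warn.
		return "/" + strings.TrimPrefix(subdir, "/"), deprecationNotice
	default:
		// No subdirectory: a new full backup, nothing to warn about.
		return "", ""
	}
}
```

Under these assumptions, `resolveSubdir("custom", ...)` yields `/custom` plus the notice, while `resolveSubdir("LATEST", "/custom")` yields `/custom` with no notice, which is the nuance noted in the PR description.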
b228a8c to 7d2d1d2
bors r=benbardin
Build succeeded:
Huh, I thought we wanted to deprecate this only if there is not an existing backup there? I believe it is perfectly fine to say The deprecation here was motivated by our desire to limit how new possible values of
hm, i guess you're right. At first I didn't see a use case for creating an incremental backup for the non-latest full backup in a collection; but if something goes wrong with the latest full backup chain in a collection, you may want to increment on an earlier backup. I'll fix this.
You could also be backing up AS OF a particular time that is before the end time of the chain in
hmmmmm, now that is an intriguing use case I hadn't thought about, which implies we should clarify some things in the code:
I think we should discuss next week and create another issue with a GA blocker? I think we never caught this bc all of the AS OF SYSTEM TIME tests use the old backup syntax :( Another complication: this ambiguity is in 21.2.
Wouldn't we still want that to be done on top of LATEST?
Eh, not necessarily. Say it is now 22-04-08 09:15, and you have backup /22/4/7 with inc backups at 06, 12, 18 and then /22/4/8 with one inc backup at 06. You look at the logs and go "oh no, we ran this bad UPDATE last night at 21:30". You might then do
Sorry, I was ambiguous with "we" :) Users for sure might want to do what you describe, but don't we as CRDB developers think that's too complicated to be a good idea? I.e. the simplicity of "all incremental backups go on LATEST" seems really powerful to me - kinda feels like asking for trouble to help users make trees of backups that we can't really help them manage.
Eh, I dunno. I think these should be separate concerns. "LATEST" as just a shorthand/helper for picking a value of
What if we add the following guardrail: if the user passes AS OF SYSTEM TIME, fail fast if the LATEST subdir (or the explicitly passed time-based subdir) is a timestamp after the provided AOST?
Release note (sql change): cockroachdb#79447 mistakenly warned users that explicit time-based backup subdirectories were deprecated. CRDB still welcomes these! Non-time-based subdirectories will be deprecated.
I'd step back:
More concretely, that'd probably mean:
sounds good. I can file a GA-blocker issue.
Informs cockroachdb#79674, cockroachdb#79672
The changes in this PR sprang out of discussions in cockroachdb#79447
Release note (sql change): This patch introduces a few UX guardrails: command. After further discussion, we realized explicit subdirectories were useful for running incremental backups, but not for full backups. To that end, this PR throws a deprecation warning if: a) a user passes a subdirectory; b) there does not already exist a full backup in that subdirectory.
Discussion in cockroachdb#79447 also led to the discovery of two bugs for backups with AS OF SYSTEM TIME:
Previously, a user could run an AS OF SYSTEM TIME incremental backup with an end time earlier than the previous backup's end time, which could lead to an out-of-order incremental backup chain. This PR causes the incremental backup to fail if the AS OF SYSTEM TIME is less than the previous backup's end time.
Previously, if a user ran `BACKUP INTO dest AS OF SYSTEM TIME t` and a full backup subdirectory already existed at t, the backup would mistakenly increment on that full backup instead of failing. Without the IN keyword, the user expects a full backup, not an incremental backup. In this patch, the full backup fails when it detects a full backup already exists at the resolved subdirectory.
Informs cockroachdb#79674, cockroachdb#79672
The changes in this PR sprang out of discussions in cockroachdb#79447, which deprecated the ability to pass an explicit subdirectory in any backup command. This PR tweaks that deprecation warning and adds further UX guardrails.
Release note (sql change): This patch introduces a few UX guardrails, including one breaking change:
[breaking]: After further discussion, we realized explicit subdirectories were useful for running incremental backups, but not for full backups. To that end, this PR throws an error if: a) a user passes a subdirectory; b) there does not already exist a full backup in that subdirectory. The user can enable this deprecated syntax by switching the new bulkio.backup.deprecated_full_backup_with_subdir cluster setting to true.
Discussion in cockroachdb#79447 also led to the discovery of two bugs for backups with AS OF SYSTEM TIME:
Previously, a user could run an AS OF SYSTEM TIME incremental backup with an end time earlier than the previous backup's end time, which could lead to an out-of-order incremental backup chain. This PR causes the incremental backup to fail if the AS OF SYSTEM TIME is less than the previous backup's end time.
Previously, if a user ran `BACKUP INTO dest AS OF SYSTEM TIME t` and a full backup subdirectory already existed at t, the backup would mistakenly increment on that full backup instead of failing. Without the IN keyword, the user expects a full backup, not an incremental backup. In this patch, the full backup fails when it detects a full backup already exists at the resolved subdirectory.
Informs cockroachdb#79674, cockroachdb#79672
The changes in this PR sprang out of discussions in cockroachdb#79447, which deprecated the ability to pass an explicit subdirectory in any backup command. This PR tweaks that deprecation warning and adds further UX guardrails.
Release note (backward-incompatible change): This patch introduces a few UX guardrails, including one breaking change:
[breaking]: After further discussion, we realized explicit subdirectories were useful for running incremental backups, but not for full backups. To that end, this PR throws an error if: a) a user passes a subdirectory; b) there does not already exist a full backup in that subdirectory. The user can enable this deprecated syntax by switching the new bulkio.backup.deprecated_full_backup_with_subdir cluster setting to true.
Discussion in cockroachdb#79447 also led to the discovery of two bugs for backups with AS OF SYSTEM TIME:
Previously, a user could run an AS OF SYSTEM TIME incremental backup with an end time earlier than the previous backup's end time, which could lead to an out-of-order incremental backup chain. This PR causes the incremental backup to fail if the AS OF SYSTEM TIME is less than the previous backup's end time.
Previously, if a user ran `BACKUP INTO dest AS OF SYSTEM TIME t` and a full backup subdirectory already existed at t, the backup would mistakenly increment on that full backup instead of failing. Without the IN keyword, the user expects a full backup, not an incremental backup. In this patch, the full backup fails when it detects a full backup already exists at the resolved subdirectory.
79799: backupccl: add UX guardrails during backup subdirectory resolution r=dt a=msbutler
Informs #79674, #79672
The changes in this PR sprang out of discussions in #79447, which deprecated the ability to pass an explicit subdirectory in any backup command. This PR tweaks that deprecation warning and adds further UX guardrails.
Release note (backward-incompatible change): This patch introduces a few UX guardrails, including one breaking change:
[breaking]: After further discussion, we realized explicit subdirectories were useful for running incremental backups, but not for full backups. To that end, this PR throws an error if: a) a user passes a subdirectory; b) there does not already exist a full backup in that subdirectory. The user can enable this deprecated syntax by switching the new bulkio.backup.deprecated_full_backup_with_subdir cluster setting to true.
Discussion in #79447 also led to the discovery of two bugs for backups with AS OF SYSTEM TIME:
Previously, a user could run an AS OF SYSTEM TIME incremental backup with an end time earlier than the previous backup's end time, which could lead to an out-of-order incremental backup chain. This PR causes the incremental backup to fail if the AS OF SYSTEM TIME is less than the previous backup's end time.
Previously, if a user ran `BACKUP INTO dest AS OF SYSTEM TIME t` and a full backup subdirectory already existed at t, the backup would mistakenly increment on that full backup instead of failing. Without the IN keyword, the user expects a full backup, not an incremental backup. In this patch, the full backup fails when it detects a full backup already exists at the resolved subdirectory.
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
backupccl: add deprecation notice for backup cmd with explicit subdir
Release note (sql change): Previously, BACKUP commands allowed the user to specify a custom subdirectory name for their backups via `BACKUP .. INTO <subdir> IN <collectionURI>`. After this change, this will no longer be supported. Users can only create a full backup via `BACKUP ... INTO <collectionURI>` or an incremental backup on the latest full backup in their collection via `BACKUP ... INTO LATEST IN <collectionURI>`. This deprecation also removes the need to address a bug in `SHOW BACKUPS IN`, which cannot display user-defined subdirectories.