Restructured DR backup section for clarity and thoroughness #20873
Restructured backup and dr section for clarity and thoroughness
✅ Deploy Preview for cockroachdb-interactivetutorials-docs canceled.
✅ Deploy Preview for cockroachdb-api-docs canceled.
✅ Netlify Preview
msbutler
left a comment
thanks for getting this started! let me know if you have clarifying questions about my feedback.
| ## Disaster recovery | ||
| When cluster virtualization is enabled, [backup]({% link {{ page.version.version }}/backup.md %}) and [restore]({% link {{ page.version.version }}/restore.md %}) commands are scoped to the virtual cluster by default. |
nit: i think this line can be removed. i don't think it adds much.
+1
| Cockroach Labs recommends that you regularly [back up]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#full-backups) your _application virtual cluster (app VC)_. Only the app VC's data and settings are included in these backups, and data and settings for other virtual clusters or for the _system virtual cluster (system VC)_ are omitted. If needed, you can [restore](#restore-a-virtual-cluster) these backups to a new app VC. Use the following process to back up your app VC. | ||
| To back up a virtual cluster: | ||
| 1. [Connect](#connect-to-a-virtual-cluster) to the app VC as a user with the `admin` role on the app VC: |
nit: i would say "a user with the BACKUP privilege". You don't need to be admin to take a backup. Reference doc: https://www.cockroachlabs.com/docs/stable/security-reference/authorization#supported-privileges
Before getting specific about the privilege type, we could make a generic statement that says 'connect to the app VC as a user with supported privileges {link to the doc michael linked}. In this example, we connect to the app VC as a user with the BACKUP privilege:'
| ~~~ | ||
| For details about restoring a backup of a virtual cluster, refer to [Restore a virtual cluster](#restore-a-virtual-cluster). | ||
| 1. [Perform a full backup]({% link {{ page.version.version }}/backup.md %}#back-up-a-cluster): |
we recommend that users run backups via backup schedules (ref). Schedules have a nicer UX (no need to manually take backups), and they manage the backup's protected timestamp.
I am curious though, as a new hire, do you think our backups docs should more explicitly point customers to use schedules?
I think it would be nice to have both. "Perform a one off full backup or create a backup schedule so that backups can be automatically taken on your behalf at a set frequency" (that wasn't great wording but something along those lines) and then we have a code snippet for setting a backup schedule too.
hrm, maybe we gotta align here lol. What is the use case for taking a one off full backup?
if we look at our current docs for backup and restore, 'scheduled backups' is like a subsection of that. If we want to restructure our backup/restore docs to emphasize scheduled backups more, we can do that, but I think that I'm trying to match what our docs say today. @peachdawnleach it'd be good to hear your opinion here, but I do agree w Michael that we need to show an example of creating a backup schedule
@alicia-l2 @msbutler I think it's a good idea to at least have instructions for both, unless we want to say that customers should never take a one off full backup for any reason. If there's any case where it would be necessary, it's worth having, but if we're 100% anti- one off backups then we should remove it
I'm cool with both.
| 1. [Back up the cluster]({% link {{ page.version.version }}/backup.md %}), and include the `INCLUDE_ALL_SECONDARY_TENANTS` flag in the `BACKUP` command. All virtual clusters and the system virtual cluster are included in the backup. | ||
| You can also back up your system VC to preserve metadata such as users and cluster settings. Use the following process to back up your system VC. | ||
| 1. [Connect](#connect-to-the-system-virtual-cluster) to the system VC as a user with the `admin` role on the system VC: |
nit: same comment about admin role
| {% include_cached copy-clipboard.html %} | ||
| ~~~ sql | ||
| BACKUP INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s'; |
same thing about schedules.
alicia-l2
left a comment
Thanks - great start!
| ## Disaster recovery | ||
| When cluster virtualization is enabled, [backup]({% link {{ page.version.version }}/backup.md %}) and [restore]({% link {{ page.version.version }}/restore.md %}) commands are scoped to the virtual cluster by default. |
+1
| When connected to a virtual cluster from the DB Console, metrics which measure SQL and related activity show data scoped to the virtual cluster. All other metrics are collected system-wide and display the same data on all virtual clusters including the system virtual cluster. | ||
| ## Disaster recovery |
Can we change this to "Backup and Restore?" Was it always 'disaster recovery?'
It was disaster recovery previously but I'm happy to change it
Yeah, could we change it? Lots of folks use Backup and Restore for other use cases (like replicating data into a development cluster for testing)
| When cluster virtualization is enabled, [backup]({% link {{ page.version.version }}/backup.md %}) and [restore]({% link {{ page.version.version }}/restore.md %}) commands are scoped to the virtual cluster by default. | ||
| ### Back up a virtual cluster | ||
| Cockroach Labs recommends that you regularly [back up]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#full-backups) your _application virtual cluster (app VC)_. Only the app VC's data and settings are included in these backups, and data and settings for other virtual clusters or for the _system virtual cluster (system VC)_ are omitted. If needed, you can [restore](#restore-a-virtual-cluster) these backups to a new app VC. Use the following process to back up your app VC. |
I would prefer if the first sentence were a little more generic. like "Cockroach Labs recommends that you regularly back up your data. When working with virtual clusters, backups should be performed on the application virtual cluster."
And then, we can add some sort of 'note' about system virtual clusters and how you can back those up too if you want to keep a record of your system settings somewhere.
Can we also remove the "If needed, you can also restore these backups to a new app VC?" We already have that Restore section down below, and IMO it flows a bit nicer if we keep this focused on backups.
| Cockroach Labs recommends that you regularly [back up]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#full-backups) your _application virtual cluster (app VC)_. Only the app VC's data and settings are included in these backups, and data and settings for other virtual clusters or for the _system virtual cluster (system VC)_ are omitted. If needed, you can [restore](#restore-a-virtual-cluster) these backups to a new app VC. Use the following process to back up your app VC. | ||
| To back up a virtual cluster: | ||
| 1. [Connect](#connect-to-a-virtual-cluster) to the app VC as a user with the `admin` role on the app VC: |
Before getting specific about the privilege type, we could make a generic statement that says 'connect to the app VC as a user with supported privileges {link to the doc michael linked}. In this example, we connect to the app VC as a user with the BACKUP privilege:'
| ~~~ | ||
| For details about restoring a backup of a virtual cluster, refer to [Restore a virtual cluster](#restore-a-virtual-cluster). | ||
| 1. [Perform a full backup]({% link {{ page.version.version }}/backup.md %}#back-up-a-cluster): |
I think it would be nice to have both. "Perform a one off full backup or create a backup schedule so that backups can be automatically taken on your behalf at a set frequency" (that wasn't great wording but something along those lines) and then we have a code snippet for setting a backup schedule too.
| ### Back up the entire cluster | ||
| {% include_cached copy-clipboard.html %} | ||
| ~~~ sql | ||
| BACKUP INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s'; |
I wonder if in this code snippet and in the system vc code snippet we should change the URI example to be: 'external://backup_s3/app' and 'external://backup_s3/system' respectively. If the user is taking both system and app backups, then won't we run into the collision issue if we just use 'external://backup_s3' for both? @msbutler
sounds reasonable to me.
hey @peachdawnleach, did you intend to push an update to github or are you still working on addressing the feedback?
@msbutler I'm still working on some updates before pushing. I got sidetracked with some more time-sensitive stuff yesterday but am hoping to have this updated today.
no worries! just wanted to double check
Added sections for each level of backup and for schedules, as well as other changes from review
alicia-l2
left a comment
Left some comments. Thanks for the great work restructuring; I think it's a great start!
| For details, refer to [Work with virtual clusters]({% link {{ page.version.version }}/work-with-virtual-clusters.md %}#upgrade-a-cluster). | ||
| ### Disaster recovery |
This should say 'backup and restore' right? To match the change that we made on the other cluster virtualization page?
Or, if we want to include 'PCR' in this i'm cool w that, but we should have an intro sentence about disaster recovery generically IMO.
| ## Backup and restore | ||
| When cluster virtualization is enabled, [backup]({% link {{ page.version.version }}/backup.md %}) and [restore]({% link {{ page.version.version }}/restore.md %}) commands are scoped to the virtual cluster by default. | ||
| Cockroach Labs recommends that you regularly [back up]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#full-backups) your data. When using virtual clusters, perform backups on the _application virtual cluster (app VC)_. Only the app VC's data and settings are included in these backups, and data and settings for other virtual clusters or for the _system virtual cluster (system VC)_ are omitted. |
nit: do we want to have a final sentence which is kind of saying 'you don't need this data in the system virtual cluster' or like basically reassuring users that the system virtual cluster data is not critical
| 1. [Connect to the virtual cluster](#connect-to-a-virtual-cluster) you want to back up as a user with the `admin` role on the virtual cluster. | ||
| 1. [Back up the cluster]({% link {{ page.version.version }}/backup.md %}). Only the virtual cluster's data and settings are included in the backup, and data and settings for other virtual clusters or for the system virtual cluster is omitted. | ||
| Use the following process to create a schedule for a cluster-level backup of your app VC. In this example, the schedule takes revision history for the backup every day at midnight. |
In this example, the schedule takes revision history for the backup every day at midnight.
can we add 'also takes revision history for the backup every day at midnight, but you can configure your backups how you'd like.' And then can we link to https://deploy-preview-20873--cockroachdb-docs.netlify.app/docs/v25.4/backup-and-restore-overview this page where there are all the options of diff types of backups?
uber nit: i'd prefer providing a backup example without options (e.g revision history). We don't want users reaching for non default backup flavors because they are in general examples.
@msbutler what is the default for this? Like if they don't explicitly configure revision history, will no revision history be taken at all? I want to make sure they know what they're getting with the default if that's what we're putting here
in general, a user does not need to specify any backup or restore options for the command to run successfully. If the user omits the revision history option:
- a full backup taken at time t1: is a full scan of the live data as of t1, e.g. SELECT .. AOST t1
- an inc backup from t1-t3: grabs the latest updates to live data as of t3.
If you haven't done so already (or find this unfamiliar or confusing), i highly recommend watching my dr molting session that explains this with more examples https://drive.google.com/file/d/1nRuIdY5cjKrXG6GRyastDda68u7HCQ5r/view
fwiw i don't think you need to explain the nuances of revision vs non revision history backups here.
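For readers following the thread, the default (no options) behavior described above could be sketched roughly as follows. This is only an illustration, not text from the PR: the `external://backup_s3/app` URI is borrowed from the diff under review, and the `-10s` offset simply mirrors the existing example.

```sql
-- Full backup with no options: captures the live data as of the
-- AS OF SYSTEM TIME timestamp.
BACKUP INTO 'external://backup_s3/app' AS OF SYSTEM TIME '-10s';

-- Incremental backup with no options: appended to the most recent
-- full backup in the same collection.
BACKUP INTO LATEST IN 'external://backup_s3/app' AS OF SYSTEM TIME '-10s';
```

Neither statement sets `revision_history`, which matches the suggestion to keep the general example free of non-default options.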
| For information on scheduling backups at different levels or with other options, consult [CREATE SCHEDULE FOR BACKUP]({% link {{ page.version.version }}/create-schedule-for-backup.md %}). | ||
| ### Back up your app VC |
Should we rephrase to like 'take a one-off appvc backup?' and also explain that sometimes users would want to take a one-off backup for a separate copy.
| BACKUP INTO 'external://backup_s3/app' AS OF SYSTEM TIME '-10s'; | ||
| ~~~ | ||
| ### Back up a database from your app VC |
I'm wondering if we should combine these sections? Like "How to take backups of specific objects from within your app VC?"
the main thing we're trying to get across is that you have to connect to the app VC and take the backup from within the app vc.
It would also be nice to get a code snippet of the SQL command to use to log in to the app VC.
It would also be nice to get a code snippet of the SQL command to use to log in to the app VC.
I think this is already in there?
{% include_cached copy-clipboard.html %}
~~~ shell
cockroach sql --url \
"postgresql://root@{primary node IP or hostname}:26257?options=-ccluster={app_virtual_cluster_name}&sslmode=verify-full" \
--certs-dir "certs"
~~~
i also don't think we need additional examples for database and table backups. we can provide a cluster level backup and then point to the backup docs. the flow is the same.
synced offline with alicia. i think we can remove the db,table,schema level examples below.
| BACKUP bank.customers, bank.accounts INTO 'external://backup_s3/app/table' AS OF SYSTEM TIME '-10s'; | ||
| ~~~ | ||
| ### Back up a schema from your app VC |
usually users don't need to back up a schema - i think we should remove this - but appreciate that you were so comprehensive on it! For future reference most users just do full cluster, db, or table.
Changes from second round of reviews
broken link
alicia-l2
left a comment
one small nit but other than that LGTM! thanks!
| 1. [Connect to the virtual cluster](#connect-to-a-virtual-cluster) you want to back up as a user with the `admin` role on the virtual cluster. | ||
| 1. [Back up the cluster]({% link {{ page.version.version }}/backup.md %}). Only the virtual cluster's data and settings are included in the backup, and data and settings for other virtual clusters or for the system virtual cluster is omitted. | ||
| Use the following process to create a schedule for a cluster-level backup of your app VC. In this example, the schedule takes an incremental backup every day at midnight, and a full backup weekly. Consult [CREATE SCHEDULE FOR BACKUP]({% link {{ page.version.version }}/create-schedule-for-backup.md %}#parameters) for a full list of backup schedule options. |
can we remove 'cluster-level' and just say backup? and then add 'This happens to be a cluster-level backup' in the '2. Create a backup schedule:' line
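As a point of reference for this thread, a schedule matching the description in the diff (an incremental backup every day at midnight and a full backup weekly, with no extra backup options) might look like the sketch below. The label `app_vc_schedule` is a hypothetical name, the URI is taken from the examples in this PR, and `first_run` is included only to show where schedule options go.

```sql
-- Run from a SQL session connected to the app VC.
-- With no backup targets listed, this happens to be a cluster-level backup.
CREATE SCHEDULE app_vc_schedule
  FOR BACKUP INTO 'external://backup_s3/app'
  RECURRING '@daily'       -- incremental backups, daily at midnight
  FULL BACKUP '@weekly'    -- full backups, weekly
  WITH SCHEDULE OPTIONS first_run = 'now';
```

Running `SHOW SCHEDULES` afterwards should list one `@daily` and one `@weekly` entry, similar to the sample output quoted further down in the review.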
| ## Backup and restore | ||
| When cluster virtualization is enabled, [backup]({% link {{ page.version.version }}/backup.md %}) and [restore]({% link {{ page.version.version }}/restore.md %}) commands are scoped to the virtual cluster by default. | ||
| Cockroach Labs recommends that you regularly [back up]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#full-backups) your data. When using virtual clusters, perform backups on the _application virtual cluster (app VC)_. Only the app VC's data and settings are included in these backups, and data and settings for other virtual clusters or for the _system virtual cluster (system VC)_ are omitted. You can also back up your system VC to preserve metadata such as users and cluster settings, but this data is usually not critical. |
nit: why "data and settings"? could we just say "data"?
also, a user may misinterpret the line "You can also back up your system VC to preserve metadata such as users and cluster settings" to mean that their app vc users are stored in the system tenant. this is not true.
I would rephrase "You can also back up your system VC to preserve metadata related to your system vc"
| schedule_id | name | status | first_run | schedule | backup_stmt | ||
| ---------------------+----------------+------------------------------------------------+----------------------------------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------- | ||
| 588796190000218113 | schedule_label | PAUSED: Waiting for initial backup to complete | NULL | @daily | BACKUP INTO LATEST IN 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached | ||
| 588796190012702721 | schedule_label | ACTIVE | 2020-09-10 16:52:17.280821+00:00 | @weekly | BACKUP INTO 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached |
nit: revision history is in the result here.
| ~~~ | ||
| You can restore a backup of a virtual cluster to: | ||
| You can also take one-off backups of a single [database]({% link {{ page.version.version }}/backup.md %}#back-up-a-database) or [table]({% link {{ page.version.version }}/backup.md %}#back-up-a-table-or-view) on your app VC. For more information on backup options, consult [BACKUP]({% link {{ page.version.version }}/backup.md %}). |
remove "single" and include "database(s)"
| ### Back up your system VC | ||
| You can also back up your system VC to preserve metadata such as users and cluster settings. Use the following process to back up your system VC. |
s/preserve metadata such as users and cluster settings/ preserve metadata stored in the system VC/r
| - The original virtual cluster on the original CockroachDB cluster. | ||
| - A different virtual cluster on the original CockroachDB cluster. | ||
| - A different virtual cluster on a different CockroachDB cluster with cluster virtualization enabled. | ||
| If needed, you can restore a full backup to a new app VC with no user-created databases or tables. To restore your app VC from the latest full backup: |
remove "full". Also, the line "no user-created databases or tables" only applies to cluster level restores, so you may want to say this requirement is specific to the example below.
More small changes from review
@msbutler just added your requested changes :)
florence-crl
left a comment
lgtm pending suggestions
Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>
Fixes: DOC-15097
Restructured backup and dr section for clarity and thoroughness