Add small changes from PCR overhaul #20304

peachdawnleach · 2025-09-08T17:24:09Z

Working on a minor overhaul of the PCR docs; these are the smaller, less involved changes. Expecting to have separate PRs for the individual larger changes soon.

netlify · 2025-09-08T17:24:33Z

✅ Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name	Link
🔨 Latest commit	`c15ab3c`
🔍 Latest deploy log	https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/68c9b7776e7b4600084f68b8

netlify · 2025-09-08T17:24:33Z

✅ Deploy Preview for cockroachdb-api-docs canceled.

Name	Link
🔨 Latest commit	`c15ab3c`
🔍 Latest deploy log	https://app.netlify.com/projects/cockroachdb-api-docs/deploys/68c9b777ef3be60008637079

github-actions · 2025-09-08T17:24:36Z

Files changed:

src/current/_includes/v25.3/known-limitations/physical-cluster-replication.md:

src/current/v25.3/physical-cluster-replication-overview.md
src/current/v25.3/known-limitations.md
src/current/v25.3/set-up-physical-cluster-replication.md

src/current/_includes/v25.3/known-limitations/physical-cluster-replication.md (Error: Circular reference found. Your build will fail.)

src/current/_includes/v25.3/physical-replication/interface-virtual-cluster.md:

src/current/v25.3/physical-cluster-replication-technical-overview.md

src/current/_includes/v25.3/physical-replication/phys-rep-sql-pages.md:

src/current/v25.3/create-virtual-cluster.md
src/current/v25.3/failover-replication.md
src/current/v25.3/physical-cluster-replication-monitoring.md
src/current/v25.3/physical-cluster-replication-overview.md
src/current/v25.3/physical-cluster-replication-technical-overview.md
src/current/v25.3/set-up-physical-cluster-replication.md

netlify · 2025-09-08T17:31:29Z

✅ Netlify Preview

Name	Link
🔨 Latest commit	`c15ab3c`
🔍 Latest deploy log	https://app.netlify.com/projects/cockroachdb-docs/deploys/68c9b77736ae8c0007aeda03
😎 Deploy Preview	https://deploy-preview-20304--cockroachdb-docs.netlify.app

alicia-l2

lgtm other than some nits! Could you request a review from @msbutler as well?

src/current/v25.3/failover-replication.md

alicia-l2 · 2025-09-08T21:49:47Z

src/current/v25.3/set-up-physical-cluster-replication.md

 {% include_cached copy-clipboard.html %}
 ~~~ shell
-cockroach encode-uri {replication user}:{password}@{node IP or hostname}:26257 --ca-cert certs/ca.crt --inline
+cockroach encode-uri {replication user}:{password}@{node IP or hostname}:26257 --ca-cert {path to certs directory}/certs/ca.crt --inline


just to check, are you planning to add the CREATE EXTERNAL CONNECTION stuff after this PR?

Yes- hoping to talk to you more about this soon

msbutler · 2025-09-10T11:50:31Z

@peachdawnleach @alicia-l2 is the plan to update the docs for older major versions as well?

alicia-l2 · 2025-09-10T13:16:14Z

> @peachdawnleach @alicia-l2 is the plan to update the docs for older major versions as well?

Yes, backport to 24.1

msbutler · 2025-09-10T18:17:10Z

hey @peachdawnleach , could you coax the docs review automation to create a list of preview pages i can review? It's hard to tell for the markdown files what the changes will actually reflect.

Also, is it standard for this patch to contain updates to previous release versions? We'd like these changes to be backported through 24.1.

Added a number of small changes from PCR overhaul doc and fixed broken links and typos

More various changes
More various changes - will elaborate as needed in comments

Moved info
Moved info to a more relevant section

Fixed broken link
Accidentally broke a link - fixed typo

Fixed more broken links
Lots of broken links from these updates- hopefully fixed the last ones

Minor fixes from review
A few minor changes based on Alicia's review

peachdawnleach · 2025-09-11T13:06:04Z

Hi @msbutler - the build isn't working right now while I'm working on backporting, but this preview is an accurate representation of the changes made.

msbutler

I left too many comments :D. Perhaps @alicia-l2 should take a first pass at my hot takes.

msbutler · 2025-09-11T13:51:02Z

src/current/v25.3/physical-cluster-replication-overview.md

-{{site.data.alerts.callout_danger}}
-The standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster.
+{{site.data.alerts.callout_info}}
+The entire standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster at the time of [failover]({% link {{ page.version.version }}/failover-replication.md %}).


what do you mean by "entire"? Imho this sentence would be clearer if it were: "the system virtual cluster on the standby must always be at the same version as, or one non-skippable version ahead of, the system virtual cluster on the primary".

Some other thoughts:

Recall that the replicating tenant on the standby is a pure reflection (including its version) of the app tenant on the primary.

Innovation releases make this "one major version" language hard to talk about. We should use the same language used in our backup/restore docs.

I don't think these constraints are specific to failover time. They always apply. See the magic doc.

cc @alicia-l2

yeah, let's remove 'at the time of failover.' I think we need to add another page to describe upgrades, let's keep as is for now (after removing 'at the time of failover').

msbutler · 2025-09-11T13:59:58Z

src/current/v25.3/physical-cluster-replication-technical-overview.md

 {% include {{ page.version.version }}/physical-replication/interface-virtual-cluster.md %}

-This separation of concerns means that the replication stream can operate without affecting work happening in a virtual cluster.
+If you utilize the read from standby feature in PCR, the standby cluster has an additional reader virtual cluster which is a copy of the application virtual cluster. 


i don't think "copy" is the right term here, as it implies that it's a copy of all user data. We could instead write: "additional reader virtual cluster which safely serves read requests on the replicating virtual cluster". You could also provide some analogy with symlinks if you wanted.

Yeah let's do that, eventually we're going to have another page for this btw

msbutler · 2025-09-11T14:01:00Z

src/current/v25.3/physical-cluster-replication-technical-overview.md

-This separation of concerns means that the replication stream can operate without affecting work happening in a virtual cluster.
+If you utilize the read from standby feature in PCR, the standby cluster has an additional reader virtual cluster which is a copy of the application virtual cluster. 
+
+This separation of controls and data means that the replication stream can operate without affecting work happening in a virtual cluster.


"without affecting work happening in a virtual cluster." i don't understand what this means. is this referring specifically to the primary host cluster?

I think we can remove this line. It's basically trying to describe cluster virtualization, which we already have a section of docs dedicated to.

msbutler · 2025-09-11T14:04:12Z

src/current/v25.3/physical-cluster-replication-technical-overview.md

 The stream initialization proceeds as follows:

-1. The standby's consumer job connects via its system virtual cluster to the primary cluster and starts the primary cluster's physical stream producer job.
+1. The standby's consumer job connects to the primary cluster via the standby's system virtual cluster and starts the primary cluster's physical stream producer job.


nit: remove "physical". fwiw, the actual job type is "REPLICATION STREAM PRODUCER"

msbutler · 2025-09-11T14:06:06Z

src/current/v25.3/physical-cluster-replication-technical-overview.md

 ### Failover and promotion process

-The tracked replicated time and the advancing protected timestamp allows the replication stream to also track _retained time_, which is a timestamp in the past indicating the lower bound that the replication stream could fail over to. Therefore, the _failover window_ for a replication job falls between the retained time and the replicated time.
+The tracked replicated time and the advancing protected timestamp allow the replication stream to also track _retained time_, which is a timestamp in the past indicating the lower bound that the replication stream could fail over to. The retained time can be up to 4 hours in the past, due to the protected timestamp. Therefore, the _failover window_ for a replication job falls between the retained time and the replicated time.


nit: i don't see what the "tracked" and "advancing" adjectives are doing here. both the replicated time and the protected timestamp are "tracked" and "advancing". I'd remove them.

I think we should keep them, the fact that they're tracked/advancing isn't necessarily obvious to users who aren't already familiar with PCR.
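
For anyone following this thread, the failover window described in the diff above can be illustrated with a minimal Python sketch. It is not CockroachDB code: `in_failover_window` is a hypothetical helper, the timestamps are made up, and treating both bounds as inclusive is an assumption made only for illustration.

```python
from datetime import datetime, timedelta, timezone


def in_failover_window(target: datetime, retained_time: datetime,
                       replicated_time: datetime) -> bool:
    """Hypothetical helper: True if `target` lies between the retained time
    and the replicated time. Inclusive bounds are an assumption."""
    return retained_time <= target <= replicated_time


# Illustrative values only. The replicated time trails "now" by 30 seconds,
# and the retained time is placed 4 hours back, the upper bound quoted in
# the diff above.
now = datetime(2025, 9, 11, 14, 0, 0, tzinfo=timezone.utc)
replicated_time = now - timedelta(seconds=30)
retained_time = now - timedelta(hours=4)

# One hour ago falls inside the window; five hours ago is older than the
# retained time and falls outside it.
print(in_failover_window(now - timedelta(hours=1), retained_time, replicated_time))
print(in_failover_window(now - timedelta(hours=5), retained_time, replicated_time))
```

The sketch only models the window between the two timestamps; it says nothing about planned failovers to a future timestamp, which the overview page describes separately.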

msbutler · 2025-09-11T14:08:58Z

src/current/v25.3/set-up-physical-cluster-replication.md

    ~~~

-### Create a user for the standby cluster
+### Create a replication user and password


nit: "Create a user with replication privileges"

looks like this wasn't changed

You're right- this was in two places and only got changed in one. Thanks!

msbutler · 2025-09-11T14:21:05Z

src/current/v25.3/failover-replication.md

 You can replicate data from an existing CockroachDB cluster that does not have [cluster virtualization]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) enabled to a standby cluster with cluster virtualization enabled. For instructions on setting up a PCR in this way, refer to [Set up PCR from an existing cluster]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster).

-After a [failover](#failover) to the standby cluster, you may want to then set up PCR from the original standby cluster, which is now the primary, to another cluster, which will become the standby. There are couple of ways to set up a new standby, and some considerations.
+After a [failover](#failover) to the standby cluster, you must set up PCR from the original standby cluster, which is now the primary, to another cluster, which will become the standby. There are multiple ways to set up a new standby, and some considerations.


imho: i think this tutorial could be more careful about always using "original primary" and "original standby" when describing failback.

why the change from "you may want to" to "you must"? What if PCR is used for a one time migration?

@alicia-l2 I can't really speak to why this change was made, but it was in the google doc I worked from.

@msbutler I agree that this could be more careful and specific- I'll add that to my list of more substantial changes that I'll be working on in upcoming PRs.

We can keep it as 'you may want'

Some changes based on a review from Michael Butler

peachdawnleach · 2025-09-11T19:37:57Z

Moving the backport to a separate branch/PR as it's fairly complicated with replacing old terminology

one more change from review

peachdawnleach · 2025-09-15T15:59:21Z

Hi @alicia-l2 and @msbutler - all requested changes have been addressed, am I good to proceed to docs review on this? Thanks!

msbutler · 2025-09-15T17:45:42Z

src/current/v25.3/set-up-physical-cluster-replication.md

    ~~~

-### Create a user for the standby cluster
+### Create a replication user and password


looks like this wasn't changed

Small reword from review

Fixed broken links

…kroachdb/docs into 20250902-DOC-14757-small-pcr-changes

peachdawnleach · 2025-09-15T20:02:12Z

Changes ready for docs review: preview here

florence-crl

lgtm pending suggestions

src/current/v25.3/create-virtual-cluster.md

src/current/v25.3/failover-replication.md

florence-crl · 2025-09-16T17:10:12Z

src/current/v25.3/failover-replication.md


- [From the original standby cluster (after it was promoted during failover) to the original primary cluster](#fail-back-to-the-original-primary-cluster).
- [After the PCR stream used an existing cluster as the primary cluster](#fail-back-after-pcr-from-an-existing-cluster).
+- [From the original standby cluster (after it was promoted during failover) to the original primary cluster](#fail-back-to-the-original-primary-cluster). If this failback is initiated within 24 hours of the failover, PCR replicates the net-new changes from the standby cluster to the primary cluster, so you do not need to re-seed the primary cluster.


What is meant by "re-seed the primary cluster"? Please clarify.

Also should "re-seed" not have a hyphen and instead be "reseed"? Below "reseeding" is used in src/current/v25.3/physical-cluster-replication-overview.md on line 34.

florence-crl · 2025-09-16T17:23:04Z

src/current/v25.3/physical-cluster-replication-overview.md

 - **Improved RPO and RTO**: Depending on workload and deployment configuration, [replication lag]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}) between the primary and standby is generally in the tens-of-seconds range. The failover process from the primary cluster to the standby should typically happen within five minutes when completing a failover to the latest replicated time using [`LATEST`]({% link {{ page.version.version }}/alter-virtual-cluster.md %}#synopsis).
 - **Failover to a timestamp in the past or the future**: In the case of logical disasters or mistakes, you can [fail over]({% link {{ page.version.version }}/failover-replication.md %}) from the primary to the standby cluster to a timestamp in the past. This means that you can return the standby to a timestamp before the mistake was replicated to the standby. Furthermore, you can plan a failover by specifying a timestamp in the future.
- **Fast failback**: Switch back from the promoted standby cluster to the original primary cluster after a failover event without an initial scan.
+- **Fast failback**: Switch back from the promoted standby cluster to the original primary cluster after a failover event without reseeding data for an initial scan.


similar to above comment: maybe clarify "reseeding data".

src/current/v25.3/set-up-physical-cluster-replication.md

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

Changes from docs review

* Various changes

Added a number of small changes from PCR overhaul doc and fixed broken links and typos
More various changes
More various changes - will elaborate as needed in comments
Moved info
Moved info to a more relevant section
Fixed broken link
Accidentally broke a link - fixed typo
Fixed more broken links
Lots of broken links from these updates- hopefully fixed the last ones
Minor fixes from review
A few minor changes based on Alicia's review

* Adjustments from review

Some changes based on a review from Michael Butler

* changed 'must' to 'may want to'

one more change from review

* Small change from review

Small reword from review

* Fixed broken links

Fixed broken links

* Apply suggestions from code review

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

* Changes from docs review

Changes from docs review

---------

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

* Various changes

Added a number of small changes from PCR overhaul doc and fixed broken links and typos
More various changes
More various changes - will elaborate as needed in comments
Moved info
Moved info to a more relevant section
Fixed broken link
Accidentally broke a link - fixed typo
Fixed more broken links
Lots of broken links from these updates- hopefully fixed the last ones
Minor fixes from review
A few minor changes based on Alicia's review

* Full backport

Fully backported DOC-14757 changes as far as v24.1.

* Add small changes from PCR overhaul (#20304)

* Various changes

Added a number of small changes from PCR overhaul doc and fixed broken links and typos
More various changes
More various changes - will elaborate as needed in comments
Moved info
Moved info to a more relevant section
Fixed broken link
Accidentally broke a link - fixed typo
Fixed more broken links
Lots of broken links from these updates- hopefully fixed the last ones
Minor fixes from review
A few minor changes based on Alicia's review

* Adjustments from review

Some changes based on a review from Michael Butler

* changed 'must' to 'may want to'

one more change from review

* Small change from review

Small reword from review

* Fixed broken links

Fixed broken links

* Apply suggestions from code review

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

* Changes from docs review

Changes from docs review

---------

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

* Backported changes from review

Backported changes from review

* Fixed broken links

* Fixed redirects

* Removed accidental plus signs

* Apply suggestions from code review

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

* Update src/current/v24.1/failover-replication.md

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

---------

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

peachdawnleach requested a review from alicia-l2 September 8, 2025 20:00

alicia-l2 approved these changes Sep 8, 2025

msbutler self-requested a review September 9, 2025 17:37

peachdawnleach force-pushed the 20250902-DOC-14757-small-pcr-changes branch from 2337283 to e4f1917 Compare September 10, 2025 18:33

msbutler reviewed Sep 11, 2025

peachdawnleach and others added 2 commits September 11, 2025 15:18

Merge branch 'main' into 20250902-DOC-14757-small-pcr-changes

32ce109

Adjustments from review

2aff171

Some changes based on a review from Michael Butler

peachdawnleach and others added 2 commits September 12, 2025 11:08

changed 'must' to 'may want to'

21fb607

one more change from review

Merge branch 'main' into 20250902-DOC-14757-small-pcr-changes

6502a4b

msbutler approved these changes Sep 15, 2025

peachdawnleach and others added 4 commits September 15, 2025 14:15

Small change from review

19d8a36

Small reword from review

Merge branch 'main' into 20250902-DOC-14757-small-pcr-changes

5068833

Fixed broken links

76f2685

Fixed broken links

Merge branch '20250902-DOC-14757-small-pcr-changes' of github.com:coc…

ea5f3b5

…kroachdb/docs into 20250902-DOC-14757-small-pcr-changes

florence-crl self-requested a review September 15, 2025 20:42

florence-crl approved these changes Sep 16, 2025

peachdawnleach and others added 2 commits September 16, 2025 14:37

Apply suggestions from code review

02854a2

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

Changes from docs review

8bfbfac

Changes from docs review

Merge branch 'main' into 20250902-DOC-14757-small-pcr-changes

c15ab3c

peachdawnleach enabled auto-merge (squash) September 16, 2025 19:16

peachdawnleach merged commit 268762c into main Sep 16, 2025
6 checks passed

peachdawnleach deleted the 20250902-DOC-14757-small-pcr-changes branch September 16, 2025 19:40

peachdawnleach mentioned this pull request Sep 16, 2025

Small PCR changes backport #20363

Merged
