@peachdawnleach peachdawnleach commented Oct 6, 2025

Addresses: DOC-13854

Adding information about the new read from standby feature in PCR.

Added new pages and added them to the TOC
Fixed broken links
…om:cockroachdb/docs into 2025-10-01-doc-13854-add-read-from-standby
Build was failing because summary was too short
@msbutler msbutler self-requested a review October 7, 2025 12:57

@msbutler msbutler left a comment

nice start!

SELECT region, SUM(amount) FROM orders GROUP BY region;
~~~

The results of queries on the standby cluster reflect the state of the primary cluster as of the replicated time.

nit: i think a more accurate way to say this is: "reads are always served at a historical time that approaches the replicated time." (it's not quite the replicated time currently)

Contributor Author

Would it be accurate to word this as "The results of queries on the standby cluster reflect the state of the primary cluster as of a historical time that approaches the replicated time"?

@msbutler is this due to the lag between primary and standby AND readervc?

@peachdawnleach yeah i like that language.

@alicia-l2 the lag between the AOST the user provides and the AOST actually used is due to a very annoying technical bug i really want to address. It has to do with the fact that the reader tenant descriptors are updated after the replicated data comes in on the replicating tenant.

@msbutler do we have a github item for this bug? imo we should get this in for 26.1 if we can.

i would market this as a "known limitation" :D

cockroachdb/cockroach#155369

Contributor Author

Added this line to the doc

The output provides the following information:
- the replication status of the standby cluster
- the timestamp of the most recently applied event on the standby cluster
- any lag relative to the primary cluster

to see the replicated time used by reader queries specifically, we could point users to the resolved time posted by the standby poller job on the reader vc. maybe that's a follow up item for this page cc @alicia-l2

"resolved time posted by the standby poller job on the reader vc."

how does one get to this?

run SHOW JOBS on the reader tenant. Or view the job on the db console of the reader tenant.

Contributor Author

Does this language work here?

"For the actual replicated time of a specific query on the ReaderVC, find the resolved time posted by the standby poller job on the ReaderVC. You can find this information by viewing the job on the DB Console, or by running SHOW JOBS on the ReaderVC."

@msbutler msbutler Oct 14, 2025

"For the actual replicated time of a specific query on the ReaderVC, find the resolved time posted by the standby poller job on the ReaderVC. You can find this information by viewing the job on the DB Console, or by running SHOW JOBS on the ReaderVC."

i don't think that's quite right either.

I would be in favor of removing this "Monitor replication lag" section altogether, because monitoring PCR lag is a separate user flow from monitoring the reader tenant workflow. It feels a bit redundant to explain SHOW VIRTUAL CLUSTER WITH REPLICATION STATUS here and over here.

Above, where you write "historical time that approaches the replicated time.", you could instead link "replicated time" to the show tenant with replication status page here.

I will sync with alicia on the correct ux for monitoring reader tenant queries.

Looks good to me! Can we remove 'poller' though? That's an internal term.
Can we also say "Standby cluster's DB Console"?

@alicia-l2 what do you think of my point about removing the "monitor replication lag" section?

Oh yeah, tbh I think that once we address that bug you were talking about we can then fix this. I'm fine removing. @peachdawnleach

@alicia-l2 alicia-l2 left a comment

great start! one thing to note is that we probably also have to edit the SQL syntax pages for ALTER VIRTUAL CLUSTER: https://deploy-preview-20502--cockroachdb-docs.netlify.app/docs/v25.3/alter-virtual-cluster


## How the read from standby feature works

PCR uses cluster virtualization to separate each cluster's control plane from its data plane. A cluster always has one control plane, called a _system virtual cluster (SystemVC)_, and at least one data plane, called an _app virtual cluster (AppVC)_. A cluster's SystemVC manages PCR jobs and cluster metadata, and is not used for application queries. All data tables, system tables, and cluster settings in the standby cluster's AppVC are identical to those in the primary cluster's AppVC. The standby cluster's AppVC itself remains offline during replication.

"The standby cluster's AppVC itself remains offline during replication."

we should say the reason, @msbutler can you help here?

Small changes based on tech review

@alicia-l2 alicia-l2 left a comment

lgtm just some tiny nits - thanks!

@rmloveland rmloveland self-requested a review October 16, 2025 14:42

@rmloveland rmloveland left a comment

LGTM, non-blocking comments to take or leave

Added links
Added see also links
@peachdawnleach peachdawnleach enabled auto-merge (squash) October 17, 2025 16:53
@peachdawnleach peachdawnleach merged commit 886dfe2 into main Oct 17, 2025
6 checks passed
@peachdawnleach peachdawnleach deleted the 2025-10-01-doc-13854-add-read-from-standby branch October 17, 2025 17:14