
c2c: replicate system tenant span configurations #106823

Closed
4 tasks done
msbutler opened this issue Jul 14, 2023 · 1 comment · Fixed by #110048
Assignees
Labels
C-wishlist A wishlist feature.

Comments

@msbutler
Collaborator

msbutler commented Jul 14, 2023

For the destination cluster to honor a replicating app tenant's zone configurations, we need to replicate the system tenant's span configuration table, as outlined in section two of this design doc. This project has a few steps:

Producer Side

  • Refactor the event stream to decode span config KV updates into span config records and send over only the updates relevant to the replicating tenant (sketched below)

Destination Side

  • Within the ingestion job, spin up the span config client for replicating the system tenant span config updates
  • Write these span config records to the online system span config table.
  • Buffer up all initial scan keys on resume and write them transactionally

Jira issue: CRDB-29731
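
A minimal Go sketch of the producer-side step above: decode a span config KV update into a record and forward only the ones whose target falls in the replicating tenant's keyspace. The `Record` type and the prefix-based relevance check are illustrative stand-ins, not the actual spanconfig or stream producer code.

```go
// Sketch only: decode-and-filter step for span config records, with
// hypothetical types; the real producer code differs.
package main

import (
	"fmt"
	"strings"
)

// Record is a hypothetical decoded span configuration record.
type Record struct {
	Target string // span the configuration applies to
	Config string // serialized configuration
}

// relevantToTenant reports whether a record's target lies inside the
// replicating tenant's keyspace, approximated here by a key prefix.
func relevantToTenant(r Record, tenantPrefix string) bool {
	return strings.HasPrefix(r.Target, tenantPrefix)
}

func main() {
	records := []Record{
		{Target: "/Tenant/2/Table/104", Config: "num_replicas=5"},
		{Target: "/Tenant/3/Table/104", Config: "num_replicas=3"},
	}
	const tenantPrefix = "/Tenant/2/"
	for _, r := range records {
		if relevantToTenant(r, tenantPrefix) {
			fmt.Println("emit:", r.Target) // only tenant 2's record is emitted
		}
	}
}
```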

@msbutler msbutler added the C-wishlist A wishlist feature. label Jul 14, 2023
@msbutler msbutler self-assigned this Jul 14, 2023
msbutler added a commit to msbutler/cockroach that referenced this issue Jul 14, 2023
This patch plumbs the forSpanConfigs arg through the streamclient.Create()
interface and modifies producer job creation to return a specification for
streaming the spanConfiguration table if that arg is passed.

A future PR will plumb this info through the StreamPartition spec.

It's worth noting that we don't necessarily need this plumbing: if we see that
the tenant ID is for the system tenant, then we could automatically set up a span
config stream. I think this extra plumbing is worth it because it makes
everything a bit easier to read.

Informs cockroachdb#106823

Release note: None
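
For illustration, plumbing a flag like this through a client constructor could look roughly like the functional-options sketch below. The `Client`, `Option`, and `WithSpanConfigs` names are hypothetical; the real `streamclient.Create()` signature differs.

```go
// Illustrative sketch of threading a "span configs" flag through a client
// constructor via functional options. Not the actual streamclient API.
package main

import "fmt"

type Client struct {
	forSpanConfigs bool
}

type Option func(*Client)

// WithSpanConfigs marks the client as streaming the system tenant's span
// configuration table rather than the tenant keyspace itself.
func WithSpanConfigs() Option {
	return func(c *Client) { c.forSpanConfigs = true }
}

func NewClient(opts ...Option) *Client {
	c := &Client{}
	for _, o := range opts {
		o(c)
	}
	return c
}

func main() {
	c := NewClient(WithSpanConfigs())
	fmt.Println("span config stream:", c.forSpanConfigs)
}
```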
msbutler added a commit to msbutler/cockroach that referenced this issue Jul 21, 2023
This patch adds a new call to the stream client and the new
`crdb_internal.start_span_config_replication_stream()` SQL call, which will
begin a replication stream on the _system_ tenant's span configuration table to
stream updates specific to the passed-in application tenant. The ForSpanConfigs
flag is now persisted to the producer job record to indicate this flavour of
replication stream.

A future PR will plumb this info through the StreamPartition spec.

Informs cockroachdb#106823

Release note: None
msbutler added a commit to msbutler/cockroach that referenced this issue Jul 21, 2023
If the user creates a stream client for span config updates, they will now
receive span config updates in their subscription.

Informs cockroachdb#106823

Release note: None
msbutler added a commit to msbutler/cockroach that referenced this issue Aug 25, 2023
This patch modifies the span config event stream to emit a checkpoint event
containing the rangefeed frontier after the event stream processes each
rangefeed cache flush.

The span config client can then use this information while processing updates.
Specifically, the subscription.Next() call may return a
checkpoint which indicates that all updates up to a given frontier have been
emitted by the rangefeed.

This patch also fixes two bugs:
- prevents sending an empty batch of updates
- prevents sending system target span config updates

Informs cockroachdb#106823

Release note: None
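
A hedged sketch of how a consumer might handle the checkpoint events described above: apply update batches as they arrive and advance a local frontier whenever a checkpoint shows up, since a checkpoint means every update up to that timestamp has been emitted. The `Event` and `Subscription` types here are illustrative, not the actual stream client API.

```go
// Sketch only: distinguishing update batches from checkpoint events on a
// hypothetical subscription.
package main

import "fmt"

type Event struct {
	Updates    []string // non-empty for a batch of span config updates
	Checkpoint int64    // non-zero for a checkpoint carrying the rangefeed frontier
}

type Subscription interface {
	Next() (Event, bool) // ok=false when the stream is done
}

// consume applies update batches and advances the local frontier whenever a
// checkpoint arrives.
func consume(sub Subscription) (frontier int64) {
	for {
		ev, ok := sub.Next()
		if !ok {
			return frontier
		}
		switch {
		case ev.Checkpoint != 0:
			frontier = ev.Checkpoint
		case len(ev.Updates) > 0:
			fmt.Printf("applying %d update(s)\n", len(ev.Updates))
		}
	}
}

// fakeSub is a toy Subscription used only to exercise consume.
type fakeSub struct{ events []Event }

func (f *fakeSub) Next() (Event, bool) {
	if len(f.events) == 0 {
		return Event{}, false
	}
	ev := f.events[0]
	f.events = f.events[1:]
	return ev, true
}

func main() {
	sub := &fakeSub{events: []Event{
		{Updates: []string{"cfg1", "cfg2"}},
		{Checkpoint: 100},
	}}
	fmt.Println("frontier:", consume(sub))
}
```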
msbutler added a commit to msbutler/cockroach that referenced this issue Aug 25, 2023
This patch spawns a new goroutine on the stream ingestion job coordinator which
in turn creates a special span config client and subscription for replicating
span config updates. As updates come in, the coordinator will buffer updates
and deletes with the same source side commit timestamp and write them in a
transaction to the destination side system span configuration table when it
observes a new update with a newer timestamp.

The ingestion side assumes each replicated update is unique and in timestamp
order, which is enforced by the producer side logic built in cockroachdb#108356. This
assumption simplifies the destination side ingestion logic which must write
updates with the same source side transaction commit timestamp at the
same new timestamp on the destination side. This invariant ensures a span
configuration's target (i.e. the span that a configuration applies to) never
overlaps with any other span configuration target. Otherwise, c2c would break the
span config reconciliation system.

Note that cutover will not revert any ingested span config updates, as we
wouldn't want to issue revert range requests on an online system table in
the system tenant. That being said, a future PR will need to teach the
destination side application tenant to conduct a full span reconciliation job
immediately after cutover, which will safely revert any span config updates
that committed after the cutover timestamp.

Informs cockroachdb#106823

Release note: None
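
A small sketch of the buffering rule described above, under the assumption (stated in the commit message) that replicated updates arrive unique and in timestamp order: records sharing a source-side commit timestamp are buffered and flushed together once a newer timestamp is observed. The `Update` type and the flush callback are hypothetical stand-ins for the ingestion code.

```go
// Sketch only: group replicated span config updates by source-side commit
// timestamp and flush each group as a unit.
package main

import "fmt"

// Update is a hypothetical replicated span config record.
type Update struct {
	Timestamp int64  // source-side commit timestamp (simplified)
	Target    string // span the config applies to
	Config    string // serialized config, or "" for a delete
}

// ingest buffers updates that share a commit timestamp and flushes each
// buffer once an update with a newer timestamp arrives, so records that
// committed together on the source also commit together on the destination.
func ingest(updates []Update, flush func([]Update)) {
	var buf []Update
	for _, u := range updates {
		if len(buf) > 0 && u.Timestamp != buf[0].Timestamp {
			flush(buf)
			buf = nil
		}
		buf = append(buf, u)
	}
	if len(buf) > 0 {
		flush(buf)
	}
}

func main() {
	ingest([]Update{
		{Timestamp: 1, Target: "/Tenant/2/Table/104", Config: "num_replicas=3"},
		{Timestamp: 1, Target: "/Tenant/2/Table/105", Config: "num_replicas=5"},
		{Timestamp: 2, Target: "/Tenant/2/Table/104", Config: ""}, // later delete
	}, func(batch []Update) {
		fmt.Printf("flushing %d update(s) at source ts %d\n", len(batch), batch[0].Timestamp)
	})
}
```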
craig bot pushed a commit that referenced this issue Aug 25, 2023
108359: backupccl: hookup tracing aggregator events from the restore job r=stevendanna a=adityamaru

This change builds on top of #107994 and wires up each restore
data processor to emit TracingAggregatorEvents to the job coordinator.
These events are periodically flushed to files in the `job_info`
table and are consumable via the DBConsole Job Details page.

Fixes: #100126
Release note: None

109291: streamingccl: stream span config checkpoints r=stevendanna a=msbutler

This patch modifies the span config event stream to emit a checkpoint event
containing the rangefeed frontier after the event stream processes each
rangefeed cache flush.

The span config client can then use this information while processing updates.
Specifically, the subscription.Next() call may return a
checkpoint which indicates that all updates up to a given frontier have been
emitted by the rangefeed.

This patch also fixes two bugs:
- prevents sending an empty batch of updates
- prevents sending system target span config updates

Informs #106823

Release note: None

Co-authored-by: adityamaru <adityamaru@gmail.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Aug 30, 2023
107557: streamingest: replicate span configs relevant to replicating tenant r=stevendanna a=msbutler

This patch spawns a new goroutine on the stream ingestion job coordinator which
in turn creates a special span config client and subscription for replicating
span config updates. As updates come in, the coordinator will buffer updates
and deletes with the same source side commit timestamp and write them in a
transaction to the destination side system span configuration table when it
observes a new update with a newer timestamp.

The ingestion side assumes each replicated update is unique and in timestamp
order, which is enforced by the producer side logic built in #108356. This
assumption simplifies the destination side ingestion logic which must write
updates with the same source side transaction commit timestamp at the
same new timestamp on the destination side. This invariant ensures a span
configuration's target (i.e. the span that a configuration applies to) never
overlaps with any other span configuration target. Otherwise, c2c would break the
span config reconciliation system.

Note that cutover will not revert any ingested span config updates, as we
wouldn't want to issue revert range requests on an online system table in
the system tenant. That being said, a future PR will need to teach the
destination side application tenant to conduct a full span reconciliation job
immediately after cutover, which will safely revert any span config updates
that committed after the cutover timestamp.

A future PR will also improve the ingestion logic during an initial scan on
Resume. Because the job does not maintain a protected timestamp on the span
config table, the job may miss updates if it resumes after a long pause period.
So, the ingestor will need to buffer all initial scan updates and flush them in
one transaction along with a tenant span wide delete, in order to maintain the
span config table invariants.

Informs #106823

Release note: None

Co-authored-by: Michael Butler <butler@cockroachlabs.com>
msbutler added a commit to msbutler/cockroach that referenced this issue Sep 5, 2023
Previously, the spanConfigIngestor treated updates from the rangefeed initial
scan the same way as regular updates: the ingestor would incrementally update
the destination side span config table by flushing updates with the same source
side transaction timestamp. To understand why this is problematic, recall that
an initial scan contains all the latest span config records for the relevant
tenant. So, some of these updates may have already been written to the
destination side. Further, the initial scan does not replicate source side span
config updates that have since been written over by later updates. To
summarize, it's not possible to replicate a consistent view of the span
config table by writing incremental updates surfaced by the initial scan.

In this patch, the ingestor now buffers all updates surfaced by the initial
scan and flushes these updates in one transaction which also deletes all
existing span config records for the replicating tenant.

Fixes cockroachdb#106823

Release note: None
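
A toy sketch of the initial-scan handling described above: buffer everything the scan surfaces, then flush it in one "transaction" that first deletes all existing span config records for the replicating tenant. The in-memory map stands in for the system span configuration table and is illustrative only.

```go
// Sketch only: replace-all semantics for the rangefeed initial scan,
// rather than applying it incrementally.
package main

import "fmt"

func main() {
	// Existing destination-side records for the tenant, possibly stale.
	table := map[string]string{
		"/Tenant/2/Table/104": "num_replicas=3",
		"/Tenant/2/Table/999": "gc.ttlseconds=60", // no longer exists on the source
	}

	// All records surfaced by the initial scan: a complete, consistent
	// snapshot of the source-side table for this tenant.
	initialScan := map[string]string{
		"/Tenant/2/Table/104": "num_replicas=5",
		"/Tenant/2/Table/105": "num_replicas=3",
	}

	// "Transaction": delete everything for the tenant, then write the snapshot.
	// Applying the scan incrementally instead would leave /Table/999 behind.
	for k := range table {
		delete(table, k)
	}
	for k, v := range initialScan {
		table[k] = v
	}

	fmt.Println("records after initial scan flush:", len(table))
}
```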
craig bot pushed a commit that referenced this issue Sep 8, 2023
110048: c2c: write all span config updates from initial scan in one transaction r=stevendanna a=msbutler

Previously, the spanConfigIngestor treated updates from the rangefeed initial scan the same way as regular updates: the ingestor would incrementally update the destination side span config table by flushing updates with the same source side transaction timestamp. To understand why this is problematic, recall that an initial scan contains all the latest span config records for the relevant tenant. So, some of these updates may have already been written to the destination side. Further, the initial scan does not replicate source side span config updates that have since been written over by later updates. To summarize, it's not possible to replicate a consistent view of the span config table by writing incremental updates surfaced by the initial scan.

In this patch, the ingestor now buffers all updates surfaced by the initial scan and flushes these updates in one transaction which also deletes all existing span config records for the replicating tenant.

Fixes #106823

Release note: None

110215: changefeedccl: Fix test data race r=miretskiy a=miretskiy

Fixes #110155

Release note: None

Co-authored-by: Michael Butler <butler@cockroachlabs.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@craig craig bot closed this as completed in 7b1505e Sep 8, 2023
Disaster Recovery Backlog automation moved this from Triage to Done Sep 8, 2023
msbutler added a commit to msbutler/cockroach that referenced this issue Sep 11, 2023
This patch adds a test that ensures that a replicating tenant's regional
constraints are obeyed in the destination cluster. This test serves as an
end-to-end test of the span config replication work tracked in cockroachdb#106823.

Epic: none

Release note: None
msbutler added a commit to msbutler/cockroach that referenced this issue Sep 12, 2023
This patch adds a test that ensures that a replicating tenant's regional
constraints are obeyed in the destination cluster. This test serves as an
end-to-end test of the span config replication work tracked in cockroachdb#106823.

This patch also sets the following source system tenant cluster settings in
the c2c e2e framework: kv.rangefeed.closed_timestamp_refresh_interval: 200ms,
kv.closed_timestamp.side_transport_interval: 50ms. CDC e2e tests also set
these cluster settings.

Epic: none

Release note: None
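
For reference, the two cluster settings named above can be applied to the source system tenant with ordinary SQL; the sketch below does so from Go via database/sql. The connection string is a placeholder for a local insecure cluster.

```go
// Sketch only: applying the cluster settings mentioned above to a running
// cluster. Requires the lib/pq driver and a reachable CockroachDB node.
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	for _, stmt := range []string{
		"SET CLUSTER SETTING kv.rangefeed.closed_timestamp_refresh_interval = '200ms'",
		"SET CLUSTER SETTING kv.closed_timestamp.side_transport_interval = '50ms'",
	} {
		if _, err := db.Exec(stmt); err != nil {
			log.Fatal(err)
		}
	}
}
```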
craig bot pushed a commit that referenced this issue Sep 12, 2023
110150: cli: fix debug pebble commands on encrypted stores r=RaduBerinde a=RaduBerinde

Currently the debug pebble commands only work correctly on an
encrypted store if the encrypted store's path is `cockroach-data` or
the store directory is passed using `--store` (in addition to being
passed to the pebble subcommand itself). What's worse, knowledge of
this subtle fact was lost among team members.

The root cause is that we are trying to resolve encryption options
using the server config.  The difficulty is that there are a bunch of
different commands and there is no unified way to obtain the store
directory of interest

To fix this, we create `autoDecryptFS`. This is a `vfs.FS`
implementation which is able to automatically detect encrypted paths
and use the correct unencrypted FS. It does this by having a list of
known encrypted stores (the ones in the `--enterprise-encryption`
flag), and looking for any of these paths as ancestors of any path in
an operation. This new implementation replaces `swappableFS` and
`absoluteFS`.

We also improve the error message when we try to open an encrypted
store without setting up the key correctly.

Fixes: #110121

Release note (bug fix): `cockroach debug pebble` commands now work
correctly with encrypted stores which don't use the default
`cockroach-data` path without having to also pass `--store`.
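
A rough sketch of the ancestor-lookup idea behind `autoDecryptFS`: given the store directories named in `--enterprise-encryption`, classify a path by walking its parent directories. This is only an illustration of the approach, not the actual implementation.

```go
// Sketch only: decide whether a path belongs to a known encrypted store by
// checking its ancestor directories.
package main

import (
	"fmt"
	"path/filepath"
)

// encryptedAncestor returns the known encrypted store that contains path,
// if any, by checking each of the path's ancestor directories.
func encryptedAncestor(path string, encryptedStores map[string]bool) (string, bool) {
	p := filepath.Clean(path)
	for {
		if encryptedStores[p] {
			return p, true
		}
		parent := filepath.Dir(p)
		if parent == p { // reached the filesystem root
			return "", false
		}
		p = parent
	}
}

func main() {
	stores := map[string]bool{"/data/encrypted-store": true}
	for _, p := range []string{
		"/data/encrypted-store/000001.sst",
		"/data/plain-store/000001.sst",
	} {
		if store, ok := encryptedAncestor(p, stores); ok {
			fmt.Printf("%s -> use decrypting FS for %s\n", p, store)
		} else {
			fmt.Printf("%s -> use plain FS\n", p)
		}
	}
}
```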

110173: sql: optimize persistedsqlstats flush size check r=j82w a=j82w

Problem:
The `persistedsqlstats` size check to make sure the table is not 1.5x the max size is done on every flush, which happens on every node every 10 minutes by default. This can cause serialization issues as it is over the entire table. The check is unnecessary most of the time, because it should only fail if the compaction job is failing.

Solution:
1. Reduce the check interval so the check runs only once an hour by default, and make it configurable (sketched below).
2. The system table is split into 8 shards. Instead of checking the count of the entire table, limit the check to only one shard. This reduces the scope of the check and the chance of serialization issues.

This was previously reverted because of a flaky test, since the size check is only done on a single shard. The tests are updated to increase the limit and the number of statements to make sure every shard has data.

Fixes: #109619

Release note (sql change): The persistedsqlstats table max size check is now done once an hour instead of every 10 minutes. This reduces the risk of serialization errors on the statistics tables.
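
A small sketch of the interval-gated check from point 1 above, assuming a hypothetical `sizeChecker` helper: the expensive check runs at most once per configurable interval rather than on every flush.

```go
// Sketch only: run an expensive check at most once per interval.
package main

import (
	"fmt"
	"time"
)

type sizeChecker struct {
	interval  time.Duration
	lastCheck time.Time
}

// maybeCheck runs check only if interval has elapsed since the last run.
func (c *sizeChecker) maybeCheck(now time.Time, check func() error) error {
	if now.Sub(c.lastCheck) < c.interval {
		return nil // skip: checked recently
	}
	c.lastCheck = now
	return check()
}

func main() {
	c := &sizeChecker{interval: time.Hour}
	now := time.Now()
	for i := 0; i < 3; i++ {
		// Simulate flushes 10 minutes apart; only the first triggers the check.
		_ = c.maybeCheck(now.Add(time.Duration(i)*10*time.Minute), func() error {
			fmt.Println("running size check")
			return nil
		})
	}
}
```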

110264: c2c: add region constraints replication test r=msbutler a=msbutler

This patch adds a test that ensures that a replicating tenant's regional
constraints are obeyed in the destination cluster. This test serves as an
end-to-end test of the span config replication work tracked in #106823.

This patch also sets the following source system tenant cluster settings in
the c2c e2e framework: kv.rangefeed.closed_timestamp_refresh_interval: 200ms,
kv.closed_timestamp.side_transport_interval: 50ms. CDC e2e tests also set
these cluster settings.

Informs #109059

Release note: None

110334: roachtest: ensure c2c/shutdown tests start destination tenant with online node r=stevendanna a=msbutler

An earlier patch #110033 introduced a change that starts the destination tenant from any destination node, but did not consider whether that node was shut down. If the driver attempts to connect to the shut-down node, the roachtest fails. This patch ensures that the tenant is started on a node that will be online.

Fixes #110317

Release note: None

110364: upgrade: remove buggy TTL repair r=rafiss a=ecwall

Fixes #110363

The TTL descriptor repair in FirstUpgradeFromReleasePrecondition incorrectly
removes TTL fields from table descriptors after incorrectly comparing the
table descriptor's TTL job schedule ID to a set of job IDs.

This change removes the repair until tests are properly added.

Release note (bug fix): Remove buggy TTL descriptor repair. Previously,
upgrading from 22.2.X to 23.1.9 incorrectly removed TTL storage params from
tables (visible via `SHOW CREATE TABLE <ttl-table>;`) while attempting to
repair table descriptors. This resulted in the node that attempts to run the
TTL job crashing due to a panic caused by the missing TTL storage params.
Clusters currently on 22.2.X should NOT be upgraded to 23.1.9 and should
be upgraded to 23.1.10 or later directly.

110431: workflows: stale.yml: update action version r=RaduBerinde a=RaduBerinde

The stale bot closes issues as "completed" instead of "not planned". More recent versions have added a configuration setting for this, and it defaults to "not planned". This commit updates the action to the latest version.

Epic: none
Release note: None

110451: engineccl: skip BenchmarkTimeBoundIterate r=RaduBerinde a=jbowens

This benchmark's assertions have recently become flaky.

Epic: none
Informs: #110299
Release note: none

Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Co-authored-by: j82w <jwilley@cockroachlabs.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
Co-authored-by: Evan Wall <wall@cockroachlabs.com>
Co-authored-by: RaduBerinde <radu@cockroachlabs.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>