-
Notifications
You must be signed in to change notification settings - Fork 22
Changefeed Performance
The overall throughput and latency of Replicator is driven by the performance of the source changefeed and the target's ability to accept writes. There are a variety of configuration knobs that can be tweaked.
Operators with more than one changefeed should consider the use of changefeed metrics labels.
SET CLUSTER SETTING changefeed.experimental_poll_interval = '10ms';
SET CLUSTER SETTING changefeed.mux_rangefeed.enabled = true;
SET CLUSTER SETTING kv.rangefeed.closed_timestamp_refresh_interval = '3s';
SET CLUSTER SETTING kv.rangefeed.catchup_scan_concurrency = 64; --default is 8
SET CLUSTER SETTING kv.rangefeed.concurrent_catchup_iterators = 64; --default is 16
SET CLUSTER SETTING kv.rangefeed.scheduler.enabled = true;
-- Enable schema-locking on each table that is part of the changefeed.
-- This can be set and reset on the fly. It allows the changefeed emitter
-- to skip schema versioning checks in the steady-state.
ALTER TABLE foo_tbl SET(schema_locked=t);
ALTER TABLE foo_tbl RESET schema_locked; -- If/when schema changes are necessary.
-- Minimum required WITH options for webhook delivery.
CREATE CHANGEFEED FOR TABLE YCSB.USERTABLE
INTO 'webhook-https://127.0.0.1:30004/ycsb/public?insecure_tls_skip_verify=true'
WITH updated, resolved='1s', min_checkpoint_frequency='1s',
webhook_sink_config='{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';
CockroachDB changefeeds guarantee at-least-once delivery of each message, but do not
guarantee idempotent redelivery of mutations under all circumstances
in the default configuration. That is, for some row R
, it would be possible to see
a sequence of row timestamps (1, 2, 3, 2, 4)
or just (1, 2, 3, 2)
.
Replicator's default, fully-consistent mode inherently debounces repeated deliveries. However, to prevent a row from being rolled back in eventually-consistent modes (BestEffort or Immediate), Replicator must write "stub" entries to the staging tables to know which versions of a row have already been applied to the target.
Idempotent changefeed redelivery can be configured by setting the
changefeed.frontier_checkpoint_frequency
cluster setting to 0. Once this has been done,
the --assumeIdempotent
flag may be passed to the start
command to disable the extra
version-tracking queries.
In deployments with high network latency, consider bandwidth delay product effects and optimize for larger transfer windows.
Pre-splitting tables across a larger number of ranges also provides additional opportunities for concurrent data transfer between CockroachDB and Replicator.
Additional guidance can be found at CRDB Advanced Chagnefeed Configuration
The replicator start
command supports a --discard
flag that will discard incoming changefeed
payloads and immediately return a 200 OK
response. This mode can be used to test the maximum
throughput that any particular source cluster and Replicator deployment can achieve. An accompanying
--discardDelay
flag can add an artificial delay to this mode to help with BDP-related testing.