diff --git a/src/current/_data/redirects.yml b/src/current/_data/redirects.yml
index dc181e03936..b8fd9300a93 100644
--- a/src/current/_data/redirects.yml
+++ b/src/current/_data/redirects.yml
@@ -299,6 +299,12 @@
- destination: molt/migrate-data-load-and-replication.md
sources: [':version/migrate-from-postgres.md']
+- destination: molt/migrate-load-replicate.md
+ sources: ['molt/migrate-data-load-replicate-only.md']
+
+- destination: molt/migrate-resume-replication.md
+ sources: ['molt/migrate-replicate-only.md']
+
- destination: molt/migration-overview.md
sources: [':version/migration-overview.md']
diff --git a/src/current/_includes/molt/crdb-to-crdb-migration.md b/src/current/_includes/molt/crdb-to-crdb-migration.md
new file mode 100644
index 00000000000..5973c74b129
--- /dev/null
+++ b/src/current/_includes/molt/crdb-to-crdb-migration.md
@@ -0,0 +1,3 @@
+{{site.data.alerts.callout_info}}
+For CockroachDB-to-CockroachDB migrations, contact your account team for guidance.
+{{site.data.alerts.end}}
\ No newline at end of file
diff --git a/src/current/_includes/molt/fetch-data-load-output.md b/src/current/_includes/molt/fetch-data-load-output.md
index de9085a7dc8..78a8af7a07c 100644
--- a/src/current/_includes/molt/fetch-data-load-output.md
+++ b/src/current/_includes/molt/fetch-data-load-output.md
@@ -1,5 +1,16 @@
1. Check the output to observe `fetch` progress.
+ {% if page.name == "migrate-load-replicate.md" %}
+
+ The following message shows the appropriate values for the `--backfillFromSCN` and `--scn` replication flags to use when [starting Replicator](#start-replicator):
+
+ {% include_cached copy-clipboard.html %}
+ ~~~
+ replication-only mode should include the following replicator flags: --backfillFromSCN 26685444 --scn 26685786
+ ~~~
+
+ {% endif %}
+
A `starting fetch` message indicates that the task has started:
@@ -16,7 +27,7 @@
~~~ json
- {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"2358840","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
+ {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"26685786","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
~~~
@@ -68,7 +79,7 @@
~~~
{% if page.name != "migrate-bulk-load.md" %}
- This message includes a `cdc_cursor` value. You must set the `--defaultGTIDSet` replication flag to this value when starting [`replication-only` mode](#replicate-changes-to-cockroachdb):
+ This message includes a `cdc_cursor` value. You must set the `--defaultGTIDSet` replication flag to this value when [starting Replicator](#start-replicator):
{% include_cached copy-clipboard.html %}
~~~
@@ -81,15 +92,4 @@
~~~ json
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"2358840","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
~~~
-
-
- {% if page.name == "migrate-data-load-replicate-only.md" %}
-
- The following message shows the appropriate values for the `--backfillFromSCN` and `--scn` replication flags to use when [starting`replication-only` mode](#replicate-changes-to-cockroachdb):
-
- {% include_cached copy-clipboard.html %}
- ~~~
- replication-only mode should include the following replicator flags: --backfillFromSCN 26685444 --scn 26685786
- ~~~
-
- {% endif %}
\ No newline at end of file
+
\ No newline at end of file
diff --git a/src/current/_includes/molt/fetch-metrics.md b/src/current/_includes/molt/fetch-metrics.md
index a432d7f5fa9..b4c37bb4cd9 100644
--- a/src/current/_includes/molt/fetch-metrics.md
+++ b/src/current/_includes/molt/fetch-metrics.md
@@ -1,3 +1,5 @@
+### Fetch metrics
+
By default, MOLT Fetch exports [Prometheus](https://prometheus.io/) metrics at `http://127.0.0.1:3030/metrics`. You can override the address with `--metrics-listen-addr '{host}:{port}'`, where the endpoint will be `http://{host}:{port}/metrics`.
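+
+For example, a minimal Prometheus scrape config for this endpoint (a sketch; adjust the target to match your `--metrics-listen-addr` setting):
+
+~~~ yaml
+scrape_configs:
+  - job_name: 'molt_fetch'
+    # The default metrics path is /metrics, matching the endpoint above
+    static_configs:
+      - targets: ['127.0.0.1:3030']
+~~~
+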
Cockroach Labs recommends monitoring the following metrics during data load:
diff --git a/src/current/_includes/molt/fetch-replication-output.md b/src/current/_includes/molt/fetch-replication-output.md
index 28b8248c586..84496eda6b4 100644
--- a/src/current/_includes/molt/fetch-replication-output.md
+++ b/src/current/_includes/molt/fetch-replication-output.md
@@ -19,8 +19,8 @@
DEBUG [Jan 22 13:52:40] upserted rows conflicts=0 duration=7.620208ms proposed=1 target="\"molt\".\"migration_schema\".\"employees\"" upserted=1
~~~
- {% if page.name != "migrate-replicate-only.md" %}
+ {% if page.name != "migrate-resume-replication.md" %}
{{site.data.alerts.callout_success}}
- If replication is interrupted, you can [resume replication]({% link molt/migrate-replicate-only.md %}).
+ If replication is interrupted, you can [resume replication]({% link molt/migrate-resume-replication.md %}).
{{site.data.alerts.end}}
{% endif %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/fetch-replicator-flags.md b/src/current/_includes/molt/fetch-replicator-flags.md
deleted file mode 100644
index 2551ebc6bbc..00000000000
--- a/src/current/_includes/molt/fetch-replicator-flags.md
+++ /dev/null
@@ -1,92 +0,0 @@
-In the `molt fetch` command, use `--replicator-flags` to pass options to the included `replicator` process that handles continuous replication. For details on all available flags, refer to the [MOLT Fetch documentation]({% link molt/molt-fetch.md %}#replication-flags).
-
-{% if page.name == "migrate-data-load-replicate-only.md" %}
-
-| Flag | Description |
-|-----------------|----------------------------------------------------------------------------------------------------------------|
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-
-
-
-| Flag | Description |
-|--------------------|-------------------------------------------------------------------------------------------------------------------------------------|
-| `--defaultGTIDSet` | **Required.** Default GTID set for changefeed. |
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-| `--userscript` | Path to a userscript that enables table filtering from MySQL sources. Refer to [Table filter userscript](#table-filter-userscript). |
-
-Replication from MySQL requires `--defaultGTIDSet`, which sets the starting GTID for replication. You can find this value in the `cdc_cursor` field of the `fetch complete` message after the [initial data load](#load-data-into-cockroachdb) completes.
-
-
-
-| Flag | Description |
-|---------------------|--------------------------------------------------------------------------------------------------------------------------------------|
-| `--scn` | **Required.** Snapshot System Change Number (SCN) for the initial changefeed starting point. |
-| `--backfillFromSCN` | **Required.** SCN of the earliest active transaction at the time of the snapshot. Ensures no transactions are skipped. |
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-| `--userscript` | Path to a userscript that enables table filtering from Oracle sources. Refer to [Table filter userscript](#table-filter-userscript). |
-
-Replication from Oracle requires `--scn` and `--backfillFromSCN`, which specify the snapshot SCN and the earliest active transaction SCN, respectively. You can find these values in the message `replication-only mode should include the following replicator flags` after the [initial data load](#load-data-into-cockroachdb) completes.
-
-
-{% elsif page.name == "migrate-replicate-only.md" %}
-| Flag | Description |
-|-------------------|----------------------------------------------------------------------------------------------------------------|
-| `--stagingSchema` | **Required.** Staging schema name for the changefeed checkpoint table. |
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-
-Resuming replication requires `--stagingSchema`, which specifies the staging schema name used as a checkpoint. MOLT Fetch [logs the staging schema name]({% link molt/migrate-data-load-replicate-only.md %}#replicate-changes-to-cockroachdb) as the `staging database name` when it starts replication. For example:
-
-~~~ json
- {"level":"info","time":"2025-02-10T14:28:13-05:00","message":"staging database name: _replicator_1749699789613149000"}
-~~~
-
-
-{{site.data.alerts.callout_info}}
-When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript]({% link molt/migrate-data-load-replicate-only.md %}?filters=mysql#table-filter-userscript).
-{{site.data.alerts.end}}
-
-
-
-{{site.data.alerts.callout_info}}
-When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript]({% link molt/migrate-data-load-replicate-only.md %}?filters=oracle#table-filter-userscript).
-{{site.data.alerts.end}}
-
-
-{% elsif page.name == "migrate-data-load-and-replication.md" %}
-| Flag | Description |
-|-----------------|----------------------------------------------------------------------------------------------------------------|
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-
-
-{{site.data.alerts.callout_info}}
-When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript](#table-filter-userscript).
-{{site.data.alerts.end}}
-
-
-
-{{site.data.alerts.callout_info}}
-When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript](#table-filter-userscript).
-{{site.data.alerts.end}}
-
-
-{% elsif page.name == "migrate-failback.md" %}
-| Flag | Description |
-|--------------------|--------------------------------------------------------------------------------------------------------------------------------------|
-| `--stagingSchema` | **Required.** Staging schema name for the changefeed checkpoint table. |
-| `--tlsCertificate` | Path to the server TLS certificate for the webhook sink. Refer to [Secure failback for changefeed](#secure-changefeed-for-failback). |
-| `--tlsPrivateKey` | Path to the server TLS private key for the webhook sink. Refer to [Secure failback for changefeed](#secure-changefeed-for-failback). |
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-
-- Failback requires `--stagingSchema`, which specifies the staging schema name used as a checkpoint. MOLT Fetch [logs the staging schema name]({% link molt/migrate-data-load-replicate-only.md %}#replicate-changes-to-cockroachdb) when it starts replication:
-
- ~~~ shell
- staging database name: _replicator_1749699789613149000
- ~~~
-
-- When configuring a [secure changefeed](#secure-changefeed-for-failback) for failback, you **must** include `--tlsCertificate` and `--tlsPrivateKey`, which specify the paths to the server certificate and private key for the webhook sink connection.
-
-{% else %}
-| Flag | Description |
-|-----------------|----------------------------------------------------------------------------------------------------------------|
-| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
-{% endif %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/fetch-secure-connection-strings.md b/src/current/_includes/molt/fetch-secure-connection-strings.md
deleted file mode 100644
index a1cd259cf42..00000000000
--- a/src/current/_includes/molt/fetch-secure-connection-strings.md
+++ /dev/null
@@ -1,34 +0,0 @@
-To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
-
-- Avoid plaintext connection strings.
-
-- URL-encode connection strings for the source database and [CockroachDB]({% link {{site.current_cloud_version}}/connect-to-the-database.md %}) so special characters in passwords are handled correctly.
-
- - Given a password `a$52&`, pass it to the `molt escape-password` command with single quotes:
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt escape-password --password 'a$52&'
- ~~~
-
- Use the encoded password in your `--source` connection string. For example:
-
- ~~~
- --source 'postgres://migration_user:a%2452%26@localhost:5432/replicationload'
- ~~~
-
-- Provide your connection strings as environment variables. For example:
-
- ~~~ shell
- export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
- export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
- ~~~
-
- Afterward, reference the environment variables as follows:
-
- ~~~
- --source $SOURCE
- --target $TARGET
- ~~~
-
-- If possible, use an external secrets manager to load the environment variables from stored secrets.
\ No newline at end of file
diff --git a/src/current/_includes/molt/fetch-table-filter-userscript.md b/src/current/_includes/molt/fetch-table-filter-userscript.md
index c64c5adf595..5a2b5abc0ff 100644
--- a/src/current/_includes/molt/fetch-table-filter-userscript.md
+++ b/src/current/_includes/molt/fetch-table-filter-userscript.md
@@ -34,8 +34,8 @@ api.configureSource("defaultdb.migration_schema", {
});
~~~
-Pass the userscript to MOLT Fetch with the `--userscript` [replication flag](#replication-flags):
+Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replication-flags):
~~~
---replicator-flags "--userscript table_filter.ts"
+--userscript table_filter.ts
~~~
\ No newline at end of file
diff --git a/src/current/_includes/molt/migration-prepare-database.md b/src/current/_includes/molt/migration-prepare-database.md
index ea632aad30f..e3bbb27c183 100644
--- a/src/current/_includes/molt/migration-prepare-database.md
+++ b/src/current/_includes/molt/migration-prepare-database.md
@@ -1,6 +1,6 @@
#### Create migration user on source database
-Create a dedicated migration user (e.g., `MIGRATION_USER`) on the source database. This user is responsible for reading data from source tables during the migration. You will pass this username in the [source connection string](#source-connection-string).
+Create a dedicated migration user (for example, `MIGRATION_USER`) on the source database. This user is responsible for reading data from source tables during the migration. You will pass this username in the [source connection string](#source-connection-string).
{% include_cached copy-clipboard.html %}
@@ -12,11 +12,31 @@ Grant the user privileges to connect, view schema objects, and select the tables
{% include_cached copy-clipboard.html %}
~~~ sql
-GRANT CONNECT ON DATABASE source_database TO MIGRATION_USER;
-GRANT USAGE ON SCHEMA migration_schema TO MIGRATION_USER;
-GRANT SELECT ON ALL TABLES IN SCHEMA migration_schema TO MIGRATION_USER;
-ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema GRANT SELECT ON TABLES TO MIGRATION_USER;
+GRANT CONNECT ON DATABASE source_database TO migration_user;
+GRANT USAGE ON SCHEMA migration_schema TO migration_user;
+GRANT SELECT ON ALL TABLES IN SCHEMA migration_schema TO migration_user;
+ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema GRANT SELECT ON TABLES TO migration_user;
~~~
+
+{% if page.name != "migrate-bulk-load.md" %}
+Grant the user the `SUPERUSER` attribute (recommended for configuring replication):
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER USER migration_user WITH SUPERUSER;
+~~~
+
+Alternatively, grant the following permissions to create replication slots, access replication data, create publications, and add tables to publications:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER USER migration_user WITH LOGIN REPLICATION;
+GRANT CREATE ON DATABASE source_database TO migration_user;
+ALTER TABLE migration_schema.table_name OWNER TO migration_user;
+~~~
+
+Run the `ALTER TABLE ... OWNER TO` statement for each table that you intend to replicate.
+{% endif %}
@@ -29,9 +49,19 @@ Grant the user privileges to select only the tables you migrate:
{% include_cached copy-clipboard.html %}
~~~ sql
-GRANT SELECT ON source_database.* TO MIGRATION_USER@'%';
+GRANT SELECT ON source_database.* TO 'migration_user'@'%';
FLUSH PRIVILEGES;
~~~
+
+{% if page.name != "migrate-bulk-load.md" %}
+For replication, grant additional privileges for binlog access:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'migration_user'@'%';
+FLUSH PRIVILEGES;
+~~~
+{% endif %}
@@ -61,7 +91,12 @@ GRANT EXECUTE_CATALOG_ROLE TO C##MIGRATION_USER;
GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER;
-- Access to necessary V$ views
+GRANT SELECT ON V_$LOG TO C##MIGRATION_USER;
+GRANT SELECT ON V_$LOGFILE TO C##MIGRATION_USER;
+GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##MIGRATION_USER;
+GRANT SELECT ON V_$ARCHIVED_LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$DATABASE TO C##MIGRATION_USER;
+GRANT SELECT ON V_$LOG_HISTORY TO C##MIGRATION_USER;
-- Direct grants to specific DBA views
GRANT SELECT ON ALL_USERS TO C##MIGRATION_USER;
@@ -124,7 +159,13 @@ GRANT SELECT, FLASHBACK ON migration_schema.tbl TO MIGRATION_USER;
{% if page.name != "migrate-bulk-load.md" %}
#### Configure source database for replication
+{{site.data.alerts.callout_info}}
+Connect to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary), **not** a replica. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.
+{{site.data.alerts.end}}
+
+Verify that you are connected to the primary by running the following query, which must return `false`:
+
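+{% include_cached copy-clipboard.html %}
+~~~ sql
+SELECT pg_is_in_recovery(); -- returns f (false) on the primary
+~~~
+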
Enable logical replication by setting `wal_level` to `logical` in `postgresql.conf` or in the SQL shell. For example:
{% include_cached copy-clipboard.html %}
@@ -134,22 +175,72 @@ ALTER SYSTEM SET wal_level = 'logical';
-For MySQL **8.0 and later** sources, enable [global transaction identifiers (GTID)](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html) consistency. Set the following values in `mysql.cnf`, in the SQL shell, or as flags in the `mysql` start command:
+Enable [global transaction identifiers (GTID)](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html) and configure binary logging. Set `binlog-row-metadata` or `binlog-row-image` to `full` to provide complete metadata for replication.
-- `--enforce-gtid-consistency=ON`
-- `--gtid-mode=ON`
-- `--binlog-row-metadata=full`
+Configure binlog retention to ensure GTIDs remain available throughout the migration:
-For MySQL **5.7** sources, set the following values. Note that `binlog-row-image` is used instead of `binlog-row-metadata`. Set `server-id` to a unique integer that differs from any other MySQL server you have in your cluster (e.g., `3`).
+- MySQL 8.0.1+: Set `binlog_expire_logs_seconds` (default: 2592000 = 30 days) based on your migration timeline.
+- MySQL < 8.0: Set `expire_logs_days`, or manually manage retention by setting `max_binlog_size` and using `PURGE BINARY LOGS BEFORE NOW() - INTERVAL 1 HOUR` (adjusting the interval as needed). Force binlog rotation with `FLUSH BINARY LOGS` if needed.
+- Managed services: Refer to provider-specific configuration for [Amazon RDS](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/mysql-stored-proc-configuring.html) or [Google Cloud SQL](https://cloud.google.com/sql/docs/mysql/flags#mysql-b).
-- `--enforce-gtid-consistency=ON`
-- `--gtid-mode=ON`
-- `--binlog-row-image=full`
-- `--server-id={ID}`
-- `--log-bin=log-bin`
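+
+For example, to check and increase binlog retention on MySQL 8.0.1+ (a sketch; size the retention window to your migration timeline):
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Check the current retention window (seconds)
+SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
+
+-- Retain binlogs for 7 days (604800 seconds)
+SET GLOBAL binlog_expire_logs_seconds = 604800;
+~~~
+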
+{% comment %}
+{{site.data.alerts.callout_info}}
+GTID replication sends all database changes to Replicator. To limit replication to specific tables or schemas, use a userscript.
+{{site.data.alerts.end}}
+{% endcomment %}
+
+| Version | Configuration |
+|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| MySQL 5.6 | `--gtid-mode=on` `--enforce-gtid-consistency=on` `--server-id={unique_id}` `--log-bin=mysql-binlog` `--binlog-format=row` `--binlog-row-image=full` `--log-slave-updates=ON` |
+| MySQL 5.7 | `--gtid-mode=on` `--enforce-gtid-consistency=on` `--binlog-row-image=full` `--server-id={unique_id}` `--log-bin=log-bin` |
+| MySQL 8.0+ | `--gtid-mode=on` `--enforce-gtid-consistency=on` `--binlog-row-metadata=full` |
+| MariaDB | `--log-bin` `--server_id={unique_id}` `--log-basename=master1` `--binlog-format=row` `--binlog-row-metadata=full` |
+##### Enable ARCHIVELOG and FORCE LOGGING
+
+Enable `ARCHIVELOG` mode for LogMiner to access archived redo logs:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Check current log mode
+SELECT log_mode FROM v$database;
+
+-- Enable ARCHIVELOG (requires database restart)
+SHUTDOWN IMMEDIATE;
+STARTUP MOUNT;
+ALTER DATABASE ARCHIVELOG;
+ALTER DATABASE OPEN;
+
+-- Verify ARCHIVELOG is enabled
+SELECT log_mode FROM v$database; -- Expected: ARCHIVELOG
+~~~
+
+Enable supplemental logging for primary keys:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Enable minimal supplemental logging for primary keys
+ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
+
+-- Verify supplemental logging status
+SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
+-- Expected:
+-- SUPPLEMENTAL_LOG_DATA_MIN: IMPLICIT (or YES)
+-- SUPPLEMENTAL_LOG_DATA_PK: YES
+~~~
+
+Enable `FORCE LOGGING` to ensure all changes are logged:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE FORCE LOGGING;
+
+-- Verify FORCE LOGGING is enabled
+SELECT force_logging FROM v$database; -- Expected: YES
+~~~
+
##### Create source sentinel table
Create a checkpoint table called `_replicator_sentinel` in the Oracle schema you will migrate:
@@ -219,11 +310,11 @@ ON
~~~
~~~
- GROUP# MEMBER START_SCN END_SCN
-_________ _________________________________________ ____________ ______________________
- 3 /opt/oracle/oradata/ORCLCDB/redo03.log 1232896 9295429630892703743
- 2 /opt/oracle/oradata/ORCLCDB/redo02.log 1155042 1232896
- 1 /opt/oracle/oradata/ORCLCDB/redo01.log 1141934 1155042
+ GROUP# MEMBER START_SCN END_SCN
+_________ _________________________________________ ____________ ______________________
+ 3 /opt/oracle/oradata/ORCLCDB/redo03.log 1232896 9295429630892703743
+ 2 /opt/oracle/oradata/ORCLCDB/redo02.log 1155042 1232896
+ 1 /opt/oracle/oradata/ORCLCDB/redo01.log 1141934 1155042
3 rows selected.
~~~
diff --git a/src/current/_includes/molt/migration-stop-replication.md b/src/current/_includes/molt/migration-stop-replication.md
index 9da45f9be36..feb97edff1d 100644
--- a/src/current/_includes/molt/migration-stop-replication.md
+++ b/src/current/_includes/molt/migration-stop-replication.md
@@ -1,4 +1,6 @@
+{% if page.name != "migrate-failback.md" %}
1. Stop application traffic to your source database. **This begins downtime.**
+{% endif %}
1. Wait for replication to drain, which means that all transactions that occurred on the source database have been fully processed and replicated to CockroachDB. There are two ways to determine that replication has fully drained:
- When replication is caught up, you will not see new `upserted rows` logs.
diff --git a/src/current/_includes/molt/molt-connection-strings.md b/src/current/_includes/molt/molt-connection-strings.md
index f33426755e3..a180ab210b2 100644
--- a/src/current/_includes/molt/molt-connection-strings.md
+++ b/src/current/_includes/molt/molt-connection-strings.md
@@ -4,6 +4,12 @@ Define the connection strings for the [source](#source-connection-string) and [t
The `--source` flag specifies the connection string for the source database:
+{% if page.name != "migrate-bulk-load.md" %}
+{{site.data.alerts.callout_info}}
+The source connection **must** point to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary). Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.
+{{site.data.alerts.end}}
+{% endif %}
+
~~~
--source 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'
@@ -66,4 +72,4 @@ For details, refer to [Connect using a URL]({% link {{site.current_cloud_version
#### Secure connections
-{% include molt/fetch-secure-connection-strings.md %}
\ No newline at end of file
+{% include molt/molt-secure-connection-strings.md %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/molt-docker.md b/src/current/_includes/molt/molt-docker.md
index e6a29879b6c..0e02705974b 100644
--- a/src/current/_includes/molt/molt-docker.md
+++ b/src/current/_includes/molt/molt-docker.md
@@ -1,11 +1,9 @@
-For details on pulling Docker images, see [Docker image](#docker-image).
+#### Performance
-### Performance
-
-MOLT Fetch and Verify are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.
+MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.
{% if page.name == "molt-fetch.md" %}
-### Authentication
+#### Authentication
When using MOLT Fetch with [cloud storage](#bucket-path), it is necessary to specify volumes and environment variables, as described in the following sections for [Google Cloud Storage](#google-cloud-storage) and [Amazon S3](#amazon-s3).
@@ -17,7 +15,7 @@ docker run -it cockroachdb/molt fetch ...
For more information on `docker run`, see the [Docker documentation](https://docs.docker.com/reference/cli/docker/container/run/).
-#### Google Cloud Storage
+##### Google Cloud Storage
If you are using [Google Cloud Storage](https://cloud.google.com/storage/docs/access-control) for [cloud storage](#bucket-path):
@@ -44,7 +42,7 @@ docker run \
For details on Google Cloud Storage authentication, see [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials).
-#### Amazon S3
+##### Amazon S3
If you are using [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-iam.html) for [cloud storage](#bucket-path):
@@ -61,20 +59,34 @@ docker run \
~~~
{% endif %}
-### Local connection strings
+#### Local connection strings
When testing locally, specify the host as follows:
- For macOS, use `host.docker.internal`. For example:
-~~~
---source 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
---target "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
-~~~
+ {% if page.name == "molt-replicator.md" %}
+ ~~~
+ --sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
+ --targetConn "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
+ ~~~
+ {% else %}
+ ~~~
+ --source 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
+ --target "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
+ ~~~
+ {% endif %}
- For Linux and Windows, use `172.17.0.1`. For example:
-~~~
---source 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
---target "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
-~~~
\ No newline at end of file
+ {% if page.name == "molt-replicator.md" %}
+ ~~~
+ --sourceConn 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
+ --targetConn "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
+ ~~~
+ {% else %}
+ ~~~
+ --source 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
+ --target "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
+ ~~~
+ {% endif %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/molt-install.md b/src/current/_includes/molt/molt-install.md
index f18561a03fe..86b224ba26f 100644
--- a/src/current/_includes/molt/molt-install.md
+++ b/src/current/_includes/molt/molt-install.md
@@ -14,20 +14,9 @@ The following binaries are included:
- `molt`
- `replicator`
-Both `molt` and `replicator` must be in your current **working directory**. To use replication features, `replicator` must be located either in the same directory as `molt` or in a directory directly beneath `molt`. For example, either of the following would be valid:
-
-~~~
-/migration-project/ # Your current working directory
-├── molt # MOLT binary
-└── replicator # Replicator binary
-~~~
-
-~~~
-/migration-project/ # Your current working directory
-├── molt # MOLT binary
-└── bin/ # Subdirectory
- └── replicator # Replicator binary
-~~~
+{{site.data.alerts.callout_success}}
+For ease of use, keep both `molt` and `replicator` in your current working directory.
+{{site.data.alerts.end}}
To display the current version of each binary, run `molt --version` and `replicator --version`.
@@ -39,16 +28,19 @@ MOLT Fetch is supported on Red Hat Enterprise Linux (RHEL) 9 and above.
{{site.data.alerts.end}}
{% endif %}
-### Docker image
+### Docker images
+
+{% if page.name != "molt-replicator.md" %}
+#### MOLT Fetch
-[Docker multi-platform images](https://hub.docker.com/r/cockroachdb/molt/tags) containing both the AMD and ARM binaries are available. To pull the latest image for PostgreSQL and MySQL:
+[Docker multi-platform images](https://hub.docker.com/r/cockroachdb/molt/tags) containing both the AMD and ARM `molt` and `replicator` binaries are available. To pull the latest image for PostgreSQL and MySQL:
{% include_cached copy-clipboard.html %}
~~~ shell
docker pull cockroachdb/molt
~~~
-To pull a specific version (e.g., `1.1.3`):
+To pull a specific version (for example, `1.1.3`):
{% include_cached copy-clipboard.html %}
~~~ shell
@@ -61,5 +53,22 @@ To pull the latest image for Oracle (note that only `linux/amd64` is supported):
~~~ shell
docker pull cockroachdb/molt:oracle-latest
~~~
+{% endif %}
+
+{% if page.name != "molt-fetch.md" %}
+#### MOLT Replicator
+
+Standalone [Docker images for MOLT Replicator](https://hub.docker.com/r/cockroachdb/replicator/tags) are also available:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+docker pull cockroachdb/replicator
+~~~
-{% if page.name != "molt.md" %}For details on running in Docker, refer to [Docker usage](#docker-usage).{% endif %}
+To pull a specific version (for example, `v1.1.1`):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+docker pull cockroachdb/replicator:v1.1.1
+~~~
+{% endif %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/molt-limitations.md b/src/current/_includes/molt/molt-limitations.md
index 00fc35f014c..4e41fb29b93 100644
--- a/src/current/_includes/molt/molt-limitations.md
+++ b/src/current/_includes/molt/molt-limitations.md
@@ -1,5 +1,7 @@
### Limitations
+#### Fetch limitations
+
- `OID LOB` types in PostgreSQL are not supported, although similar types like `BYTEA` are supported.
@@ -13,14 +15,12 @@
- Only tables with [primary key]({% link {{ site.current_cloud_version }}/primary-key.md %}) types of [`INT`]({% link {{ site.current_cloud_version }}/int.md %}), [`FLOAT`]({% link {{ site.current_cloud_version }}/float.md %}), or [`UUID`]({% link {{ site.current_cloud_version }}/uuid.md %}) can be sharded with [`--export-concurrency`]({% link molt/molt-fetch.md %}#best-practices).
{% if page.name != "migrate-bulk-load.md" %}
-#### Replication limitations
+#### Replicator limitations
-
-- Replication modes require write access to the PostgreSQL primary instance. MOLT cannot create replication slots or run replication against a read replica.
-
+- Replication modes require a connection to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary). MOLT cannot obtain replication checkpoints or transaction metadata from replicas.
-- MySQL replication is supported only with GTID-based configurations. Binlog-based features that do not use GTID are not supported.
+- MySQL replication is supported only with [GTID](https://dev.mysql.com/doc/refman/8.0/en/replication-gtids.html)-based configurations. Binlog-based features that do not use GTID are not supported.
diff --git a/src/current/_includes/molt/molt-secure-connection-strings.md b/src/current/_includes/molt/molt-secure-connection-strings.md
new file mode 100644
index 00000000000..fb9e82a01b5
--- /dev/null
+++ b/src/current/_includes/molt/molt-secure-connection-strings.md
@@ -0,0 +1,52 @@
+- To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
+
+ - Avoid plaintext connection strings.
+
+ - Provide your connection strings as environment variables. For example:
+
+ ~~~ shell
+ export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
+ export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
+ ~~~
+
+ Afterward, reference the environment variables in MOLT commands:
+
+ {% if page.name == "molt-replicator.md" %}
+ ~~~
+ --sourceConn $SOURCE
+ --targetConn $TARGET
+ ~~~
+ {% else %}
+ ~~~
+ --source $SOURCE
+ --target $TARGET
+ ~~~
+ {% endif %}
+
+ - If possible, use an external secrets manager to load the environment variables from stored secrets.
+
+- Use TLS-enabled connection strings to encrypt data in transit from MOLT to the database. When using TLS certificates, ensure that the certificate files are accessible to the MOLT binary on the machine where it runs.
+
+ For example, a PostgreSQL connection string with TLS certificates:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~
+ postgresql://migration_user@db.example.com:5432/appdb?sslmode=verify-full&sslrootcert=/etc/molt/certs/ca.pem&sslcert=/etc/molt/certs/client.crt&sslkey=/etc/molt/certs/client.key
+ ~~~
+
+- URL-encode connection strings for the source database and [CockroachDB]({% link {{site.current_cloud_version}}/connect-to-the-database.md %}) so special characters in passwords are handled correctly.
+
+ - Given a password `a$52&`, pass it to the `molt escape-password` command with single quotes:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ molt escape-password --password 'a$52&'
+ ~~~
+
+ Use the encoded password in your connection string. For example:
+
+ ~~~
+ postgres://migration_user:a%2452%26@localhost:5432/replicationload
+ ~~~
+
+- Remove `sslmode=disable` from production connection strings.
\ No newline at end of file
diff --git a/src/current/_includes/molt/molt-setup.md b/src/current/_includes/molt/molt-setup.md
index 6a04a2c6615..00f9f642a3b 100644
--- a/src/current/_includes/molt/molt-setup.md
+++ b/src/current/_includes/molt/molt-setup.md
@@ -14,7 +14,7 @@
- Create a CockroachDB [{{ site.data.products.cloud }}]({% link cockroachcloud/create-your-cluster.md %}) or [{{ site.data.products.core }}]({% link {{ site.current_cloud_version }}/install-cockroachdb-mac.md %}) cluster.
- Install the [MOLT (Migrate Off Legacy Technology)]({% link releases/molt.md %}#installation) tools.
-- Review the MOLT Fetch [best practices]({% link molt/molt-fetch.md %}#best-practices).
+- Review the [Fetch]({% link molt/molt-fetch.md %}#best-practices) and {% if page.name != "migrate-bulk-load.md" %}[Replicator]({% link molt/molt-replicator.md %}#best-practices){% endif %} best practices.
- Review [Migration Strategy]({% link molt/migration-strategy.md %}).
@@ -37,7 +37,33 @@
{% include molt/migration-create-sql-user.md %}
-## Configure data load
+{% if page.name != "migrate-bulk-load.md" %}
+### Configure GC TTL
+
+Before starting the [initial data load](#start-fetch), configure the [garbage collection (GC) TTL]({% link {{ site.current_cloud_version }}/configure-replication-zones.md %}#gc-ttlseconds) on the source CockroachDB cluster to ensure that historical data remains available when replication begins. The GC TTL must be long enough to cover the full duration of the data load.
+
+Increase the GC TTL before starting the data load. For example, to set the GC TTL to 24 hours:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = 86400;
+~~~
+
+{{site.data.alerts.callout_info}}
+The GC TTL must be longer than the expected duration of the initial data load.
+{{site.data.alerts.end}}
+
+Once replication has started successfully (which automatically protects its own data range), you can restore the GC TTL to its original value. For example, to restore to 5 minutes:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = 300;
+~~~
+
+For details, refer to [Protect Changefeed Data from Garbage Collection]({% link {{ site.current_cloud_version }}/protect-changefeed-data.md %}).
+{% endif %}
+
+## Configure Fetch
When you run `molt fetch`, you can configure the following options for data load:
@@ -46,10 +72,7 @@ When you run `molt fetch`, you can configure the following options for data load
- [Table handling mode](#table-handling-mode): Determine how existing target tables are initialized before load.
- [Schema and table filtering](#schema-and-table-filtering): Specify schema and table names to migrate.
- [Data load mode](#data-load-mode): Choose between `IMPORT INTO` and `COPY FROM`.
-- [Metrics](#metrics): Configure metrics collection during the load.
-{% if page.name != "migrate-bulk-load.md" %}
-- [Replication flags](#replication-flags): Configure the `replicator` process.
-{% endif %}
+- [Fetch metrics](#fetch-metrics): Configure metrics collection during initial data load.
### Connection strings
@@ -71,12 +94,4 @@ When you run `molt fetch`, you can configure the following options for data load
{% include molt/fetch-data-load-modes.md %}
-### Metrics
-
-{% include molt/fetch-metrics.md %}
-
-{% if page.name == "migrate-data-load-and-replication.md" %}
-### Replication flags
-
-{% include molt/fetch-replicator-flags.md %}
-{% endif %}
\ No newline at end of file
+{% include molt/fetch-metrics.md %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/molt-troubleshooting-failback.md b/src/current/_includes/molt/molt-troubleshooting-failback.md
new file mode 100644
index 00000000000..1667cc4c690
--- /dev/null
+++ b/src/current/_includes/molt/molt-troubleshooting-failback.md
@@ -0,0 +1,64 @@
+### Failback issues
+
+If the changefeed shows connection errors in `SHOW CHANGEFEED JOB`, refer to the following sections.
+
+##### Connection refused
+
+~~~
+transient error: Post "https://replicator-host:30004/molt/public": dial tcp [::1]:30004: connect: connection refused
+~~~
+
+This indicates that Replicator is down, the webhook URL is incorrect, or the port is misconfigured.
+
+**Resolution:** Verify that MOLT Replicator is running on the port specified in the changefeed `INTO` configuration. Confirm the host and port are correct.
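+
+As a quick reachability check, you can issue a request to the webhook endpoint from a CockroachDB node (a sketch; substitute your Replicator host and webhook port). If the port is unreachable, this fails with the same `connection refused` error:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+curl -k -I https://replicator-host:30004/
+~~~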
+
+##### Incorrect schema path errors
+
+This error occurs when the [CockroachDB changefeed]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}) webhook URL path does not match the target database schema naming convention:
+
+~~~
+transient error: 400 Bad Request: unknown schema:
+~~~
+
+The webhook URL path is specified in the `INTO` clause when you [create the changefeed]({% link molt/migrate-failback.md %}#create-the-cockroachdb-changefeed). For example: `webhook-https://replicator-host:30004/database/schema`.
+
+**Resolution:** Verify the webhook path format matches your target database type:
+
+- PostgreSQL or CockroachDB targets: Use `/database/schema` format. For example, `webhook-https://replicator-host:30004/migration_schema/public`.
+- MySQL targets: Use `/database` format (schema is implicit). For example, `webhook-https://replicator-host:30004/migration_schema`.
+- Oracle targets: Use `/DATABASE` format in uppercase. For example, `webhook-https://replicator-host:30004/MIGRATION_SCHEMA`.
+
+For details on configuring the webhook sink URI, refer to [Webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink).
+
+##### GC threshold error
+
+~~~
+batch timestamp * must be after replica GC threshold
+~~~
+
+This indicates that the changefeed is starting from an invalid cursor whose data has already been garbage collected.
+
+**Resolution:** Double-check the cursor to ensure it represents a valid range that has not been garbage collected, or extend the GC TTL on the source CockroachDB cluster:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = {gc_ttl_in_seconds};
+~~~
+
+##### Duplicated data re-application
+
+This occurs when resuming a changefeed from an earlier cursor re-applies data that was already replicated, causing excessive duplication.
+
+**Resolution:** Clear the staging database to prevent duplication. **This deletes all checkpoints and buffered data**, so use with caution:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+DROP DATABASE _replicator;
+~~~
+
+For more targeted cleanup, delete mutations from specific staging tables:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+DELETE FROM _replicator.employees WHERE true;
+~~~
\ No newline at end of file
diff --git a/src/current/_includes/molt/molt-troubleshooting.md b/src/current/_includes/molt/molt-troubleshooting-fetch.md
similarity index 71%
rename from src/current/_includes/molt/molt-troubleshooting.md
rename to src/current/_includes/molt/molt-troubleshooting-fetch.md
index e885bf5edb7..8c9d26031d4 100644
--- a/src/current/_includes/molt/molt-troubleshooting.md
+++ b/src/current/_includes/molt/molt-troubleshooting-fetch.md
@@ -1,11 +1,15 @@
-## Troubleshooting
+### Fetch issues
##### Fetch exits early due to mismatches
-`molt fetch` exits early in the following cases, and will output a log with a corresponding `mismatch_tag` and `failable_mismatch` set to `true`:
+When run in `none` or `truncate-if-exists` mode, `molt fetch` exits early in the following cases, and outputs a log message with a corresponding `mismatch_tag` and `failable_mismatch` set to `true`:
- A source table is missing a primary key.
- A source primary key and target primary key have mismatching types.
+ {{site.data.alerts.callout_success}}
+ These restrictions (missing or mismatching primary keys) can be bypassed with [`--skip-pk-check`]({% link molt/molt-fetch.md %}#skip-primary-key-matching).
+ {{site.data.alerts.end}}
+
- A [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) primary key has a different [collation]({% link {{site.current_cloud_version}}/collate.md %}) on the source and target.
- A source and target column have mismatching types that are not [allowable mappings]({% link molt/molt-fetch.md %}#type-mapping).
- A target table is missing a column that is in the corresponding source table.
@@ -39,23 +43,7 @@ GRANT SELECT, FLASHBACK ON migration_schema.orders TO C##MIGRATION_USER;
##### Table or view does not exist
-If the Oracle migration user lacks privileges on certain tables, you may receive errors stating that the table or view does not exist. Either use `--table-filter` to [limit the tables to be migrated](#schema-and-table-filtering), or grant the migration user `SELECT` privileges on all objects in the schema. Refer to [Create migration user on source database](#create-migration-user-on-source-database).
-
-{% if page.name != "migrate-bulk-load.md" %}
-##### Missing redo logs or unavailable SCN
-
-If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.
-
-Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.
-
-##### Missing replicator flags
-
-If required `--replicator-flags` are missing, ensure that the necessary flags for your mode are included. For details, refer to [Replication flags](#replication-flags).
-
-##### Replicator lag
-
-If the `replicator` process is lagging significantly behind the current Oracle SCN, you may see log messages like: `replicator is catching up to the current SCN at 5000 from 1000…`. This indicates that replication is progressing but is still behind the most recent changes on the source database.
-{% endif %}
+If the Oracle migration user lacks privileges on certain tables, you may receive errors stating that the table or view does not exist. Either use `--table-filter` to {% if page.name != "migrate-load-replicate.md" %}[limit the tables to be migrated]({% link molt/migrate-load-replicate.md %}#schema-and-table-filtering){% else %}[limit the tables to be migrated](#schema-and-table-filtering){% endif %}, or grant the migration user `SELECT` privileges on all objects in the schema. Refer to {% if page.name != "migrate-load-replicate.md" %}[Create migration user on source database]({% link molt/migrate-load-replicate.md %}#create-migration-user-on-source-database){% else %}[Create migration user on source database](#create-migration-user-on-source-database){% endif %}.
##### Oracle sessions remain open after forcefully stopping `molt` or `replicator`
diff --git a/src/current/_includes/molt/molt-troubleshooting-replication.md b/src/current/_includes/molt/molt-troubleshooting-replication.md
new file mode 100644
index 00000000000..d7d8bd2fe34
--- /dev/null
+++ b/src/current/_includes/molt/molt-troubleshooting-replication.md
@@ -0,0 +1,253 @@
+### Forward replication issues
+
+##### Performance troubleshooting
+
+If MOLT Replicator appears hung or performs poorly:
+
+1. Enable trace logging with `-vv` to get more visibility into the replicator's state and behavior.
+
+1. If MOLT Replicator is in an unknown, hung, or erroneous state, collect performance profiles to include with support tickets. Replace `{host}` and `{metrics-port}` with your Replicator host and the port specified by `--metricsAddr`:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~shell
+ curl '{host}:{metrics-port}/debug/pprof/trace?seconds=15' > trace.out
+ curl '{host}:{metrics-port}/debug/pprof/profile?seconds=15' > profile.out
+ curl '{host}:{metrics-port}/debug/pprof/goroutine?seconds=15' > gr.out
+ curl '{host}:{metrics-port}/debug/pprof/heap?seconds=15' > heap.out
+ ~~~
+
+1. Monitor lag metrics and adjust performance parameters as needed.
+
+##### Unable to create publication or slot
+
+This error occurs when logical replication is not supported.
+
+**Resolution:** If you are connected to a replica, connect to the primary instance instead. Replicas cannot create or manage logical replication slots or publications.
+
+Verify that the source database supports logical replication by checking the `wal_level` parameter on PostgreSQL:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SHOW wal_level;
+~~~
+
+If `wal_level` is not set to `logical`, update it and restart PostgreSQL:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+ALTER SYSTEM SET wal_level = 'logical';
+~~~
+
+##### Replication slot already exists
+
+~~~
+ERROR: replication slot "molt_slot" already exists
+~~~
+
+**Resolution:** Either create a new slot with a different name, or drop the existing slot to start fresh:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SELECT pg_drop_replication_slot('molt_slot');
+~~~
+
+{{site.data.alerts.callout_danger}}
+Dropping a replication slot can be destructive and delete data that is not yet replicated. Only use this if you want to restart replication from the current position.
+{{site.data.alerts.end}}
+
+##### Publication does not exist
+
+~~~
+run CREATE PUBLICATION molt_fetch FOR ALL TABLES;
+~~~
+
+**Resolution:** Create the publication on the source database. Ensure you also create the replication slot:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+CREATE PUBLICATION molt_publication FOR ALL TABLES;
+SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
+~~~
+
+##### Could not connect to PostgreSQL
+
+~~~
+could not connect to source database: failed to connect to `user=migration_user database=source_database`
+~~~
+
+**Resolution:** Verify the connection details including user, host, port, and database name. Ensure the database name in your `--sourceConn` connection string matches exactly where you created the publication and slot. Verify you're connecting to the same host and port where you ran the `CREATE PUBLICATION` and `SELECT pg_create_logical_replication_slot()` commands. Check if TLS certificates need to be included in the connection URI.
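+
+As a quick connectivity check, you can connect with `psql` using the same connection string (a sketch; substitute your own values):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+psql 'postgres://migration_user@localhost:5432/source_database?sslmode=verify-full' -c 'SELECT 1;'
+~~~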
+
+##### Wrong replication slot name
+
+~~~
+run SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput'); in source database
+~~~
+
+**Resolution:** {% if page.name != "migrate-load-replicate.md" %}[Create the replication slot]({% link molt/migrate-load-replicate.md %}#configure-source-database-for-replication){% else %}[Create the replication slot](#configure-source-database-for-replication){% endif %} or verify the correct slot name:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
+~~~
+
+
+{% if page.name == "migrate-resume-replication.md" %}
+##### Resuming from stale location
+
+**Resolution:** Clear the `_replicator.memo` table to remove stale checkpoints (GTID checkpoints for MySQL sources, SCN checkpoints for Oracle sources). For PostgreSQL, the replication slot on the source database tracks progress automatically; clearing the memo table is only necessary if the replication slot was destroyed and you need to restart replication from a specific LSN.
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+DELETE FROM _replicator.memo WHERE true;
+~~~
+{% endif %}
+
+##### Repeated binlog syncing restarts
+
+If Replicator repeatedly restarts binlog syncing or starts replication from an unexpectedly old location, this indicates an invalid or purged GTID. When an invalid GTID is provided, the binlog syncer will fall back to the first valid GTID.
+
+**Resolution:** Verify the GTID set is valid and **not** purged:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Check if GTID is in executed set
+SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_executed) AS in_executed;
+
+-- Check if GTID is purged
+SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_purged) AS in_purged;
+~~~
+
+Interpret the results as follows:
+
+- If `in_executed` returns `1` and `in_purged` returns `0`, the GTID is valid for replication.
+- If `in_purged` returns `1`, the GTID has been purged and you must find a newer consistent point.
+- If both return `0`, the GTID doesn't exist in the records and is invalid.
+
+If the GTID is purged or invalid, follow these steps:
+
+1. Increase binlog retention by configuring `binlog_expire_logs_seconds` in MySQL:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ sql
+ -- Increase binlog retention (example: 7 days = 604800 seconds)
+ SET GLOBAL binlog_expire_logs_seconds = 604800;
+ ~~~
+
+ {{site.data.alerts.callout_info}}
+ For managed MySQL services (such as Amazon RDS, Google Cloud SQL, or Azure Database for MySQL), binlog retention is typically configured through the provider's console or CLI. Consult your provider's documentation for how to adjust binlog retention settings.
+ {{site.data.alerts.end}}
+
+1. Get a current GTID set to restart replication:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ sql
+ -- For MySQL < 8.0:
+ SHOW MASTER STATUS;
+ -- For MySQL 8.0+:
+ SHOW BINARY LOG STATUS;
+ ~~~
+
+ ~~~
+ +---------------+----------+--------------+------------------+-------------------------------------------+
+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+ +---------------+----------+--------------+------------------+-------------------------------------------+
+ | binlog.000005 | 197 | | | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
+ +---------------+----------+--------------+------------------+-------------------------------------------+
+ ~~~
+
+ Use the `Executed_Gtid_Set` value for the `--defaultGTIDSet` flag.
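+
+    For example (a sketch; the value shown is taken from the sample output above):
+
+    ~~~
+    --defaultGTIDSet '77263736-7899-11f0-81a5-0242ac120002:1-38'
+    ~~~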
+
+##### Invalid GTID format
+
+Invalid GTIDs can occur when GTIDs are purged due to insufficient binlog retention, when connecting to a replica instead of the primary host, or when passing a GTID that has a valid format but does not exist in the binlog history.
+
+**Resolution:** Use a valid GTID from `SHOW MASTER STATUS` (MySQL < 8.0) or `SHOW BINARY LOG STATUS` (MySQL 8.0+) and ensure you're connecting to the primary host. If GTIDs are being purged, increase binlog retention.
+
+{% if page.name == "migrate-resume-replication.md" %}
+##### Stale GTID from cache
+
+**Resolution:** Clear the `_replicator` database memo table:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+DELETE FROM _replicator.memo WHERE true;
+~~~
+{% endif %}
+
+##### Table/column names exceed 30 characters
+
+Oracle LogMiner excludes tables and columns with names longer than 30 characters from redo logs.
+
+**Resolution:** Rename tables and columns to 30 characters or fewer before migration.
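+
+To list tables whose names exceed the limit, you can query the `ALL_TABLES` view (a sketch; assumes the migration user can read this view):
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SELECT owner, table_name FROM all_tables WHERE LENGTH(table_name) > 30;
+~~~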
+
+##### Unsupported data types
+
+LogMiner and replication do not support:
+
+- Long `BLOB`/`CLOB`s (4000+ characters)
+- User-defined types (UDTs)
+- Nested tables
+- Varrays
+- `GEOGRAPHY` and `GEOMETRY`
+
+**Resolution:** Convert unsupported data types or exclude affected tables from replication.
+
+##### LOB column UPDATE statements
+
+UPDATE statements that only modify LOB columns are not supported by Oracle LogMiner.
+
+**Resolution:** Avoid LOB-only updates during replication, or use Binary Reader for Oracle 12c.
+
+##### JSONB null handling
+
+SQL NULL and JSON null values are not distinguishable in JSON payloads during replication.
+
+**Resolution:** Avoid using nullable JSONB columns where the distinction between SQL NULL and JSON null is important.
+
+##### Missing redo logs or unavailable SCN
+
+If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.
+
+**Resolution:** Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.
+
+##### Replicator lag
+
+If the `replicator` process is lagging significantly behind the current Oracle SCN, you may see log messages like: `replicator is catching up to the current SCN at 5000 from 1000…`. This indicates that replication is progressing but is still behind the most recent changes on the source database.
+
+
+##### Schema drift errors
+
+Indicates source and target schemas are mismatched:
+
+~~~
+WARNING: schema drift detected in "database"."table" at payload object offset 0: unexpected columns: column_name
+~~~
+
+**Resolution:** Align schemas or use userscripts to transform data.
+
+##### Apply flow failures
+
+Apply flow failures occur when the target database encounters error conditions during apply operations, such as unique constraint violations, an unavailable target database, or incorrect data (missing or extraneous columns):
+
+~~~
+WARNING: warning during tryCommit: ERROR: duplicate key value violates unique constraint
+ERROR: maximum number of retries (10) exceeded
+~~~
+
+**Resolution:** Check target database constraints and connection stability. MOLT Replicator will log warnings for each retry attempt. If you see warnings but no final error, the apply succeeded after retrying. If all retry attempts are exhausted, Replicator will surface a final error and restart the apply loop to continue processing.
\ No newline at end of file
diff --git a/src/current/_includes/molt/optimize-replicator-performance.md b/src/current/_includes/molt/optimize-replicator-performance.md
new file mode 100644
index 00000000000..e9a6b27dce7
--- /dev/null
+++ b/src/current/_includes/molt/optimize-replicator-performance.md
@@ -0,0 +1,17 @@
+Configure the following [`replicator` flags]({% link molt/molt-replicator.md %}#flags) to optimize replication throughput and resource usage. Test different combinations in a pre-production environment to find the optimal balance of stability and performance for your workload.
+
+{{site.data.alerts.callout_info}}
+The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.
+{{site.data.alerts.end}}
+
+| Flag | Description |
+|---------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--parallelism` | Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity. |
+| `--flushSize` | Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency. |
+| `--targetApplyQueueSize` | Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption. |
+| `--targetMaxPoolSize` | Set larger than `--parallelism` by a safety factor to avoid exhausting target pool connections. Replicator enforces that `--parallelism` does not exceed 80% of this value. |
+| `--collapseMutations` | Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state. |
+| `--enableParallelApplies` | Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase `--targetMaxPoolSize` or reduce `--parallelism`. |
+| `--flushPeriod` | Set to the maximum allowable time between flushes (for example, `10s` if data must be applied within 10 seconds). Works with `--flushSize` to control when buffered mutations are committed to the target. |
+| `--quiescentPeriod` | If constraint violations resolve quickly on your workload, lower this value to make retries more frequent and reduce latency. Do not lower it if constraint violations take time to resolve. |
+| `--scanSize` | Applies to {% if page.name != "migrate-failback.md" %}[failback]({% link molt/migrate-failback.md %}){% else %}failback{% endif %} (`replicator start`) scenarios **only**. Balance memory usage and throughput. Increase to read more rows at once from the CockroachDB staging cluster for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability. |
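+
+For example, a PostgreSQL replication command tuned for throughput might combine these flags as follows. This is a sketch: the flag values are illustrative starting points, and the connection strings, slot name, and schema names are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical \
+  --sourceConn $SOURCE \
+  --targetConn $TARGET \
+  --targetSchema defaultdb.public \
+  --slotName cdc_slot \
+  --stagingSchema _replicator \
+  --stagingCreateSchema \
+  --parallelism 32 \
+  --targetMaxPoolSize 64 \
+  --flushSize 1000 \
+  --flushPeriod 10s
+~~~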
\ No newline at end of file
diff --git a/src/current/_includes/molt/oracle-migration-prerequisites.md b/src/current/_includes/molt/oracle-migration-prerequisites.md
index 8fb1422a5da..54eedc22e93 100644
--- a/src/current/_includes/molt/oracle-migration-prerequisites.md
+++ b/src/current/_includes/molt/oracle-migration-prerequisites.md
@@ -2,7 +2,7 @@
#### Oracle Instant Client
-Install Oracle Instant Client on the machine that will run `molt` and `replicator`:
+Install Oracle Instant Client on the machine that will run `molt` and `replicator`. If you use the MOLT Replicator binary (instead of Docker), the Oracle Instant Client libraries must be accessible at `/usr/lib`.
- On macOS ARM machines, download the [Oracle Instant Client](https://www.oracle.com/database/technologies/instant-client/macos-arm64-downloads.html#ic_osx_inst). After installation, you should have a new directory at `/Users/$USER/Downloads/instantclient_23_3` containing `.dylib` files. Set the `LD_LIBRARY_PATH` environment variable to this directory:
@@ -17,55 +17,11 @@ Install Oracle Instant Client on the machine that will run `molt` and `replicato
~~~ shell
sudo apt-get install -yqq --no-install-recommends libaio1t64
sudo ln -s /usr/lib/x86_64-linux-gnu/libaio.so.1t64 /usr/lib/x86_64-linux-gnu/libaio.so.1
- curl -o /tmp/ora-libs.zip https://replicator.cockroachdb.com/third_party/instantclient-basiclite-linux-amd64.zip
- unzip -d /tmp /tmp/ora-libs.zip
+ unzip -d /tmp /tmp/instantclient-basiclite-linux-amd64.zip
sudo mv /tmp/instantclient_21_13/* /usr/lib
export LD_LIBRARY_PATH=/usr/lib
~~~
-{% if page.name != "migrate-bulk-load.md" %}
-#### Enable `ARCHIVELOG`
-
-Enable `ARCHIVELOG` mode on the Oracle database. This is required for Oracle LogMiner, Oracle's built-in changefeed tool that captures DML events for replication.
-
-{% include_cached copy-clipboard.html %}
-~~~ sql
-SELECT log_mode FROM v$database;
-SHUTDOWN IMMEDIATE;
-STARTUP MOUNT;
-ALTER DATABASE ARCHIVELOG;
-ALTER DATABASE OPEN;
-SELECT log_mode FROM v$database;
-~~~
-
-~~~
-LOG_MODE
---------
-ARCHIVELOG
-
-1 row selected.
-~~~
-
-Enable supplemental primary key logging for logical replication:
-
-{% include_cached copy-clipboard.html %}
-~~~ sql
-ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
-SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
-~~~
-
-~~~
-SUPPLEMENTAL_LOG_DATA_MIN SUPPLEMENTAL_LOG_DATA_PK
-------------------------- ------------------------
-IMPLICIT YES
-
-1 row selected.
-~~~
-
-Enable `FORCE_LOGGING` to ensure that all data changes are captured for the tables to migrate:
-
-{% include_cached copy-clipboard.html %}
-~~~ sql
-ALTER DATABASE FORCE LOGGING;
-~~~
-{% endif %}
\ No newline at end of file
+ {{site.data.alerts.callout_success}}
+ You can also download Oracle Instant Client directly from the Oracle site for [Linux ARM64](https://www.oracle.com/database/technologies/instant-client/linux-arm-aarch64-downloads.html) or [Linux x86-64](https://www.oracle.com/database/technologies/instant-client/linux-x86-64-downloads.html).
+ {{site.data.alerts.end}}
\ No newline at end of file
diff --git a/src/current/_includes/molt/replicator-flags-usage.md b/src/current/_includes/molt/replicator-flags-usage.md
new file mode 100644
index 00000000000..99f93b4ba0b
--- /dev/null
+++ b/src/current/_includes/molt/replicator-flags-usage.md
@@ -0,0 +1,74 @@
+The following [MOLT Replicator]({% link molt/molt-replicator.md %}) flags are **required** for continuous replication. For details on all available flags, refer to the [MOLT Replicator documentation]({% link molt/molt-replicator.md %}#flags).
+
+{% if page.name == "migrate-load-replicate.md" %}
+
+| Flag | Description |
+|-------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--slotName` | **Required.** PostgreSQL replication slot name. Must match the slot name specified with `--pglogical-replication-slot-name` in the [MOLT Fetch command](#start-fetch). |
+| `--targetSchema` | **Required.** Target schema name on CockroachDB where tables will be replicated. |
+| `--stagingSchema` | **Required.** Staging schema name for replication metadata and checkpoints. |
+| `--stagingCreateSchema` | **Required.** Automatically create the staging schema if it does not exist. Include this flag when starting replication for the first time. |
+| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
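+
+Assembled into a command, these flags might look like the following minimal sketch; the connection strings, slot name, and schema names are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical \
+  --sourceConn $SOURCE \
+  --targetConn $TARGET \
+  --slotName cdc_slot \
+  --targetSchema defaultdb.public \
+  --stagingSchema _replicator \
+  --stagingCreateSchema \
+  --metricsAddr :30005
+~~~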
+
+
+
+| Flag | Description |
+|-------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--targetSchema` | **Required.** Target schema name on CockroachDB where tables will be replicated. |
+| `--defaultGTIDSet`      | **Required.** Default GTID set for the changefeed. |
+| `--stagingSchema` | **Required.** Staging schema name for replication metadata and checkpoints. |
+| `--stagingCreateSchema` | **Required.** Automatically create the staging schema if it does not exist. Include this flag when starting replication for the first time. |
+| `--fetchMetadata` | Explicitly fetch column metadata for MySQL versions that do not support `binlog_row_metadata`. Requires `SELECT` permissions on the source database or `PROCESS` privileges. |
+| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
+| `--userscript` | Path to a userscript that enables table filtering from MySQL sources. Refer to [Table filter userscript](#table-filter-userscript). |
+
+You can find the starting GTID in the `cdc_cursor` field of the `fetch complete` message after the [initial data load](#start-fetch) completes.
+
+
+
+| Flag | Description |
+|-------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|
+| `--sourceSchema` | **Required.** Source schema name on Oracle where tables will be replicated from. |
+| `--targetSchema` | **Required.** Target schema name on CockroachDB where tables will be replicated. |
+| `--scn` | **Required.** Snapshot System Change Number (SCN) for the initial changefeed starting point. |
+| `--backfillFromSCN` | **Required.** SCN of the earliest active transaction at the time of the snapshot. Ensures no transactions are skipped. |
+| `--stagingSchema` | **Required.** Staging schema name for replication metadata and checkpoints. |
+| `--stagingCreateSchema` | **Required.** Automatically create the staging schema if it does not exist. Include this flag when starting replication for the first time. |
+| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
+| `--userscript` | Path to a userscript that enables table filtering from Oracle sources. Refer to [Table filter userscript](#table-filter-userscript). |
+
+You can find the SCN values in the message `replication-only mode should include the following replicator flags` after the [initial data load](#start-fetch) completes.
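+
+For example, a minimal sketch of a command using the SCN values from that message; the connection strings and schema names are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator oraclelogminer \
+  --sourceConn $SOURCE_CDB \
+  --sourcePDBConn $SOURCE_PDB \
+  --targetConn $TARGET \
+  --sourceSchema migration_schema \
+  --targetSchema defaultdb.public \
+  --scn 26685786 \
+  --backfillFromSCN 26685444 \
+  --stagingSchema _replicator \
+  --stagingCreateSchema
+~~~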
+
+
+{% elsif page.name == "migrate-resume-replication.md" %}
+| Flag | Description |
+|-------------------|----------------------------------------------------------------------------------------------------------------|
+| `--stagingSchema` | **Required.** Staging schema name for the changefeed checkpoint table. |
+| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
+
+The staging schema was created during [initial replication setup]({% link molt/migrate-load-replicate.md %}#start-replicator) with `--stagingCreateSchema`.
+
+
+{{site.data.alerts.callout_info}}
+When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript]({% link molt/migrate-load-replicate.md %}#table-filter-userscript).
+{{site.data.alerts.end}}
+
+
+{% elsif page.name == "migrate-failback.md" %}
+| Flag | Description |
+|--------------------|--------------------------------------------------------------------------------------------------------------------------------------|
+| `--stagingSchema` | **Required.** Staging schema name for the changefeed checkpoint table. |
+| `--bindAddr` | **Required.** Network address to bind the webhook sink for the changefeed. For example, `:30004`. |
+| `--tlsCertificate` | Path to the server TLS certificate for the webhook sink. Refer to [TLS certificate and key](#tls-certificate-and-key). |
+| `--tlsPrivateKey` | Path to the server TLS private key for the webhook sink. Refer to [TLS certificate and key](#tls-certificate-and-key). |
+| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
+
+- The staging schema is first created during [initial replication setup]({% link molt/migrate-load-replicate.md %}#start-replicator) with `--stagingCreateSchema`.
+
+- When configuring a [secure changefeed](#tls-certificate-and-key) for failback, you **must** include `--tlsCertificate` and `--tlsPrivateKey`, which specify the paths to the server certificate and private key for the webhook sink connection.
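+
+For example, a minimal sketch of a failback invocation; the staging schema name and certificate paths are placeholders, and a complete command also includes the staging and target connection strings:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator start \
+  --stagingSchema _replicator \
+  --bindAddr :30004 \
+  --tlsCertificate ./certs/server.crt \
+  --tlsPrivateKey ./certs/server.key \
+  --metricsAddr :30005
+~~~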
+
+{% else %}
+| Flag | Description |
+|-----------------|----------------------------------------------------------------------------------------------------------------|
+| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
+{% endif %}
\ No newline at end of file
diff --git a/src/current/_includes/molt/replicator-flags.md b/src/current/_includes/molt/replicator-flags.md
index 950acac388b..4dea042a028 100644
--- a/src/current/_includes/molt/replicator-flags.md
+++ b/src/current/_includes/molt/replicator-flags.md
@@ -1,56 +1,54 @@
-### Replication flags
-
-The following flags are set with [`--replicator-flags`](#global-flags) and can be used in any [Fetch mode](#fetch-mode) that involves replication.
-
-| Flag | Type | Description |
-|----------------------------------------|------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `--applyTimeout` | `DURATION` | The maximum amount of time to wait for an update to be applied.
**Default:** `30s` |
-| `--dlqTableName` | `IDENT` | The name of a table in the target schema for storing dead-letter entries.
**Default:** `replicator_dlq` |
-| `--enableParallelApplies` | `BOOL` | Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage.
**Default:** `false` |
-| `--flushPeriod` | `DURATION` | Flush queued mutations after this duration.
**Default:** `1s` |
-| `--flushSize` | `INT` | Ideal batch size to determine when to flush mutations.
**Default:** `30s` |
-| `--logDestination` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. |
-| `--logFormat` | `STRING` | Choose log output format: `"fluent"`, `"text"`.
**Default:** `"text"` |
-| `--maxRetries` | `INT` | Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure.
**Default:** `10` |
-| `--metricsAddr` | `STRING` | A `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. |
-| `--parallelism` | `INT` | The number of concurrent database transactions to use.
**Default:** `16` |
-| `--quiescentPeriod` | `DURATION` | How often to retry deferred mutations.
**Default:** `10s` |
-| `--retireOffset` | `DURATION` | How long to delay removal of applied mutations.
**Default:** `24h0m0s` |
-| `--retryInitialBackoff` | `DURATION` | Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation.
**Default:** `25ms` |
-| `--retryMaxBackoff` | `DURATION` | Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors.
**Default:** `2s` |
-| `--retryMultiplier` | `INT` | Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database.
**Default:** `2` |
-| `--scanSize` | `INT` | The number of rows to retrieve from the staging database used to store metadata for [replication modes](#fetch-mode).
**Default:** `10000` |
-| `--schemaRefresh` | `DURATION` | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.
**Default:** `1m0s` |
-| `--sourceConn` | `STRING` | The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to [Oracle replication flags](#oracle-replication-flags). |
-| `--stageDisableCreateTableReaderIndex` | `BOOL` | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact.
**Default:** `false` |
-| `--stageMarkAppliedLimit` | `INT` | Limit the number of mutations to be marked applied in a single statement.
**Default:** `100000` |
-| `--stageSanityCheckPeriod` | `DURATION` | How often to validate staging table apply order (`-1` to disable).
**Default:** `10m0s` |
-| `--stageSanityCheckWindow` | `DURATION` | How far back to look when validating staging table apply order.
**Default:** `1h0m0s` |
-| `--stageUnappliedPeriod` | `DURATION` | How often to report the number of unapplied mutations in staging tables (`-1` to disable).
**Default:** `1m0s` |
-| `--stagingConn` | `STRING` | The staging database's connection string. |
-| `--stagingCreateSchema` | | Automatically create the staging schema if it does not exist. |
-| `--stagingIdleTime` | `DURATION` | Maximum lifetime of an idle connection.
**Default:** `1m0s` |
-| `--stagingJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.
**Default:** `15s` |
-| `--stagingMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.
**Default:** `5m0s` |
-| `--stagingMaxPoolSize` | `INT` | The maximum number of staging database connections.
**Default:** `128` |
-| `--stagingSchema` | `STRING` | Name of the CockroachDB schema that stores replication metadata. **Required** each time [`--mode replication-only`](#replication-only) is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction. For details, refer to [Resume replication](#resume-replication).
**Default:** `_replicator.public` |
-| `--targetApplyQueueSize` | `INT` | Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (`pglogical`) sources, and replaces the deprecated `--copierChannel` and `--stageCopierChannelSize` flags. |
-| `--targetConn` | `STRING` | The target database's connection string. |
-| `--targetIdleTime` | `DURATION` | Maximum lifetime of an idle connection.
**Default:** `1m0s` |
-| `--targetJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.
**Default:** `15s` |
-| `--targetMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.
**Default:** `5m0s` |
-| `--targetMaxPoolSize` | `INT` | The maximum number of target database connections.
**Default:** `128` |
-| `--targetSchema` | `STRING` | The SQL database schema in the target cluster to update. |
-| `--targetStatementCacheSize` | `INT` | The maximum number of prepared statements to retain.
**Default:** `128` |
-| `--taskGracePeriod` | `DURATION` | How long to allow for task cleanup when recovering from errors.
**Default:** `1m0s` |
-| `--timestampLimit` | `INT` | The maximum number of source timestamps to coalesce into a target transaction.
**Default:** `1000` |
-| `--userscript` | `STRING` | The path to a TypeScript configuration script. For example, `--userscript 'script.ts'`. |
-| `-v`, `--verbose` | `COUNT` | Increase logging verbosity to `debug`; repeat for `trace`. |
-
-##### PostgreSQL replication flags
-
-The following flags are set with [`--replicator-flags`](#global-flags) and can be used in any [Fetch mode](#fetch-mode) that involves replication from a [PostgreSQL source database](#source-and-target-databases).
+### Global flags
+
+| Flag | Type | Description |
+|----------------------------------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--applyTimeout` | `DURATION` | The maximum amount of time to wait for an update to be applied.
**Default:** `30s` |
+| `--dlqTableName` | `IDENT` | The name of a table in the target schema for storing dead-letter entries.
**Default:** `replicator_dlq` |
+| `--enableParallelApplies` | `BOOL` | Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage.
**Default:** `false` |
+| `--flushPeriod` | `DURATION` | Flush queued mutations after this duration.
**Default:** `1s` |
+| `--flushSize` | `INT` | Ideal batch size to determine when to flush mutations.
**Default:** `1000` |
+| `--logDestination` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. |
+| `--logFormat` | `STRING` | Choose log output format: `"fluent"`, `"text"`.
**Default:** `"text"` |
+| `--maxRetries` | `INT` | Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure.
**Default:** `10` |
+| `--metricsAddr` | `STRING` | A `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. |
+| `--parallelism` | `INT` | The number of concurrent database transactions to use.
**Default:** `16` |
+| `--quiescentPeriod` | `DURATION` | How often to retry deferred mutations.
**Default:** `10s` |
+| `--retireOffset` | `DURATION` | How long to delay removal of applied mutations.
**Default:** `24h0m0s` |
+| `--retryInitialBackoff` | `DURATION` | Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation.
**Default:** `25ms` |
+| `--retryMaxBackoff` | `DURATION` | Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors.
**Default:** `2s` |
+| `--retryMultiplier` | `INT` | Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database.
**Default:** `2` |
+| `--scanSize` | `INT` | The number of rows to retrieve from the staging database used to store metadata for replication.
**Default:** `10000` |
+| `--schemaRefresh` | `DURATION` | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.
**Default:** `1m0s` |
+| `--sourceConn` | `STRING` | The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to [Oracle replication flags](#oraclelogminer-replication-flags). |
+| `--stageDisableCreateTableReaderIndex` | `BOOL` | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact.
**Default:** `false` |
+| `--stageMarkAppliedLimit` | `INT` | Limit the number of mutations to be marked applied in a single statement.
**Default:** `100000` |
+| `--stageSanityCheckPeriod` | `DURATION` | How often to validate staging table apply order (`-1` to disable).
**Default:** `10m0s` |
+| `--stageSanityCheckWindow` | `DURATION` | How far back to look when validating staging table apply order.
**Default:** `1h0m0s` |
+| `--stageUnappliedPeriod` | `DURATION` | How often to report the number of unapplied mutations in staging tables (`-1` to disable).
**Default:** `1m0s` |
+| `--stagingConn` | `STRING` | The staging database's connection string. |
+| `--stagingCreateSchema` | | Automatically create the staging schema if it does not exist. |
+| `--stagingIdleTime` | `DURATION` | Maximum lifetime of an idle connection.
**Default:** `1m0s` |
+| `--stagingJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.
**Default:** `15s` |
+| `--stagingMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.
**Default:** `5m0s` |
+| `--stagingMaxPoolSize` | `INT` | The maximum number of staging database connections.
**Default:** `128` |
+| `--stagingSchema` | `STRING` | Name of the CockroachDB schema that stores replication metadata. **Required** each time `replicator` is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction.
**Default:** `_replicator.public` |
+| `--targetApplyQueueSize` | `INT` | Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (`pglogical`) sources, and replaces the deprecated `--copierChannel` and `--stageCopierChannelSize` flags. |
+| `--targetConn` | `STRING` | The target database's connection string. |
+| `--targetIdleTime` | `DURATION` | Maximum lifetime of an idle connection.
**Default:** `1m0s` |
+| `--targetJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.
**Default:** `15s` |
+| `--targetMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.
**Default:** `5m0s` |
+| `--targetMaxPoolSize` | `INT` | The maximum number of target database connections.
**Default:** `128` |
+| `--targetSchema` | `STRING` | The SQL database schema in the target cluster to update. |
+| `--targetStatementCacheSize` | `INT` | The maximum number of prepared statements to retain.
**Default:** `128` |
+| `--taskGracePeriod` | `DURATION` | How long to allow for task cleanup when recovering from errors.
**Default:** `1m0s` |
+| `--timestampLimit` | `INT` | The maximum number of source timestamps to coalesce into a target transaction.
**Default:** `1000` |
+| `--userscript` | `STRING` | The path to a TypeScript configuration script. For example, `--userscript 'script.ts'`. |
+| `-v`, `--verbose` | `COUNT` | Increase logging verbosity. Use `-v` for `debug` logging or `-vv` for `trace` logging. |
+
+### `pglogical` replication flags
+
+The following flags are used when replicating from a [PostgreSQL source database](#source-connection-strings).
| Flag | Type | Description |
|---------------------|------------|---------------------------------------------------------------------------------|
@@ -58,31 +56,31 @@ The following flags are set with [`--replicator-flags`](#global-flags) and can b
| `--slotName` | `STRING` | The replication slot in the source database.
**Default:** `"replicator"` |
| `--standbyTimeout` | `DURATION` | How often to report WAL progress to the source server.
**Default:** `5s` |
-##### MySQL replication flags
+### `mylogical` replication flags
-The following flags are set with [`--replicator-flags`](#global-flags) and can be used in any [Fetch mode](#fetch-mode) that involves replication from a [MySQL source database](#source-and-target-databases).
+The following flags are used when replicating from a [MySQL source database](#source-connection-strings).
-| Flag | Type | Description |
-|--------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `--defaultGTIDSet` | `STRING` | Default GTID set, in the format `source_uuid:min(interval_start)-max(interval_end)`. **Required** the first time [`--mode replication-only`](#replication-only) is run, as the GTID set provides a replication marker for streaming changes. For details, refer to [Replicate changes](#replication-only). |
-| `--fetchMetadata` | | Fetch column metadata explicitly, for older versions of MySQL that don't support `binlog_row_metadata`. |
-| `--replicationProcessID` | `UINT32` | The replication process ID to report to the source database.
**Default:** `10` |
+| Flag | Type | Description |
+|--------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--defaultGTIDSet`       | `STRING` | Default GTID set, in the format `source_uuid:min(interval_start)-max(interval_end)`. **Required** the first time `replicator` is run, as the GTID set provides a replication marker for streaming changes. |
+| `--fetchMetadata` | | Fetch column metadata explicitly, for older versions of MySQL that do not support `binlog_row_metadata`. |
+| `--replicationProcessID` | `UINT32` | The replication process ID to report to the source database.
**Default:** `10` |
-##### Oracle replication flags
+### `oraclelogminer` replication flags
-The following flags are set with [`--replicator-flags`](#global-flags) and can be used in any [Fetch mode](#fetch-mode) that involves replication from an [Oracle source database](#source-and-target-databases).
+The following flags are used when replicating from an [Oracle source database](#source-connection-strings).
-| Flag | Type | Description |
-|------------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `--scn` | `INT` | The snapshot System Change Number (SCN) queried by MOLT Fetch for the initial data load. |
-| `--backfillFromSCN` | `INT` | The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. |
-| `--sourcePDBConn` | `STRING` | Connection string for the Oracle pluggable database (PDB). Only required when using an [Oracle multitenant configuration](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html). `--sourceConn`](#replication-flags) **must** be included. |
-| `--schema-filter` | `STRING` | Restricts replication to the specified Oracle PDB schema (user). Set to the PDB user that owns the tables you want to replicate. Without this flag, replication will be attempted on tables from other users. |
-| `--oracle-application-users` | `STRING` | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., `PDB_USER`). |
+| Flag | Type | Description |
+|------------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--sourceSchema` | `STRING` | **Required.** Source schema name on Oracle where tables will be replicated from. |
+| `--scn` | `INT` | The snapshot System Change Number (SCN) from the initial data load. **Required** the first time `replicator` is run, as the SCN provides a replication marker for streaming changes. |
+| `--backfillFromSCN` | `INT` | The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. |
+| `--sourcePDBConn` | `STRING` | Connection string for the Oracle pluggable database (PDB). Only required when using an [Oracle multitenant configuration](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html). [`--sourceConn`](#global-flags) **must** be included. |
+| `--oracle-application-users` | `STRING` | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (for example, `PDB_USER`). |
-##### Failback replication flags
+### `start` failback flags
-The following flags are set with [`--replicator-flags`](#global-flags) and can be used in [`failback` mode](#failback).
+The following flags are used for failback from CockroachDB.
| Flag | Type | Description |
|----------------------------|------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -102,3 +100,14 @@ The following flags are set with [`--replicator-flags`](#global-flags) and can b
| `--tlsCertificate` | `STRING` | A path to a PEM-encoded TLS certificate chain. |
| `--tlsPrivateKey` | `STRING` | A path to a PEM-encoded TLS private key. |
| `--tlsSelfSigned` | | If true, generate a self-signed TLS certificate valid for `localhost`. |
+
+### `make-jwt` flags
+
+The following flags are used with the [`make-jwt` command](#token-quickstart) to generate JWT tokens for changefeed authentication.
+
+| Flag | Type | Description |
+|-----------------|----------|----------------------------------------------------------------------------------|
+| `-a`, `--allow` | `STRING` | One or more `database.schema` identifiers. Can be repeated for multiple schemas. |
+| `--claim`       |          | If set, print a minimal JWT claim instead of signing a token. |
+| `-k`, `--key` | `STRING` | The path to a PEM-encoded private key to sign the token with. |
+| `-o`, `--out` | `STRING` | A file to write the token to. |
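+
+For example, a minimal sketch that signs a token allowing access to one schema; the key and output paths are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator make-jwt -k ec_key.pem -a 'defaultdb.public' -o token.txt
+~~~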
\ No newline at end of file
diff --git a/src/current/_includes/molt/replicator-metrics.md b/src/current/_includes/molt/replicator-metrics.md
new file mode 100644
index 00000000000..538e0898e3e
--- /dev/null
+++ b/src/current/_includes/molt/replicator-metrics.md
@@ -0,0 +1,37 @@
+### Replicator metrics
+
+MOLT Replicator can export [Prometheus](https://prometheus.io/) metrics by setting the `--metricsAddr` flag to a port (for example, `--metricsAddr :30005`). Metrics are not enabled by default. When enabled, metrics are available at the path `/_/varz`. For example: `http://localhost:30005/_/varz`.
+
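+To scrape these metrics with Prometheus, point a scrape job at the `--metricsAddr` endpoint. The following minimal configuration is a sketch that assumes Replicator serves metrics on `localhost:30005`:
+
+~~~ yaml
+scrape_configs:
+  - job_name: replicator
+    metrics_path: /_/varz
+    static_configs:
+      - targets: ['localhost:30005']
+~~~
+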
+Cockroach Labs recommends monitoring the following metrics during replication:
+
+{% if page.name == "migrate-failback.md" %}
+| Metric Name | Description |
+|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
+| `commit_to_stage_lag_seconds` | Time between when a mutation is written to the source CockroachDB cluster and when it is written to the staging database. |
+| `source_commit_to_apply_lag_seconds` | End-to-end lag from when a mutation is written to the source CockroachDB cluster to when it is applied to the target database. |
+| `stage_mutations_total` | Number of mutations staged for application to the target database. |
+| `apply_conflicts_total` | Number of rows that experienced a compare-and-set (CAS) conflict. |
+| `apply_deletes_total` | Number of rows deleted. |
+| `apply_duration_seconds` | Length of time it took to successfully apply mutations. |
+| `apply_errors_total` | Number of times an error was encountered while applying mutations. |
+| `apply_resolves_total` | Number of rows that experienced a compare-and-set (CAS) conflict and which were resolved. |
+| `apply_upserts_total` | Number of rows upserted. |
+| `target_apply_queue_depth` | Number of batches in the target apply queue. Indicates how backed up the applier flow is between receiving changefeed data and applying it to the target database. |
+| `target_apply_queue_utilization_percent` | Utilization percentage (0.0-100.0) of the target apply queue capacity. Use this to understand how close the queue is to capacity and to set alerting thresholds for backpressure conditions. |
+| `core_parallelism_utilization_percent` | Current utilization percentage of the applier flow parallelism capacity. Shows what percentage of the configured parallelism is actively being used. |
+{% else %}
+| Metric Name | Description |
+|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
+| `commit_to_stage_lag_seconds` | Time between when a mutation is written to the source database and when it is written to the staging database. |
+| `source_commit_to_apply_lag_seconds` | End-to-end lag from when a mutation is written to the source database to when it is applied to the target CockroachDB cluster. |
+| `apply_conflicts_total` | Number of rows that experienced a compare-and-set (CAS) conflict. |
+| `apply_deletes_total` | Number of rows deleted. |
+| `apply_duration_seconds` | Length of time it took to successfully apply mutations. |
+| `apply_errors_total` | Number of times an error was encountered while applying mutations. |
+| `apply_resolves_total` | Number of rows that experienced a compare-and-set (CAS) conflict and which were resolved. |
+| `apply_upserts_total` | Number of rows upserted. |
+{% endif %}
+
+You can use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize the metrics. For Oracle-specific metrics, import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json).
+
+To check MOLT Replicator health when metrics are enabled, run `curl http://localhost:30005/_/healthz` (replacing the port with your `--metricsAddr` value). This returns a status code of `200` if Replicator is running.
\ No newline at end of file
diff --git a/src/current/_includes/v23.1/sidebar-data/migrate.json b/src/current/_includes/v23.1/sidebar-data/migrate.json
index 351d9254e37..ce9bbcd78fc 100644
--- a/src/current/_includes/v23.1/sidebar-data/migrate.json
+++ b/src/current/_includes/v23.1/sidebar-data/migrate.json
@@ -23,22 +23,16 @@
"/molt/migrate-bulk-load.html"
]
},
- {
- "title": "Load and Replicate",
- "urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
{
"title": "Load and Replicate Separately",
"urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v23.2/sidebar-data/migrate.json b/src/current/_includes/v23.2/sidebar-data/migrate.json
index 351d9254e37..ce9bbcd78fc 100644
--- a/src/current/_includes/v23.2/sidebar-data/migrate.json
+++ b/src/current/_includes/v23.2/sidebar-data/migrate.json
@@ -23,22 +23,16 @@
"/molt/migrate-bulk-load.html"
]
},
- {
- "title": "Load and Replicate",
- "urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
{
"title": "Load and Replicate Separately",
"urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v24.1/sidebar-data/migrate.json b/src/current/_includes/v24.1/sidebar-data/migrate.json
index 351d9254e37..ce9bbcd78fc 100644
--- a/src/current/_includes/v24.1/sidebar-data/migrate.json
+++ b/src/current/_includes/v24.1/sidebar-data/migrate.json
@@ -23,22 +23,16 @@
"/molt/migrate-bulk-load.html"
]
},
- {
- "title": "Load and Replicate",
- "urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
{
"title": "Load and Replicate Separately",
"urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v24.2/sidebar-data/migrate.json b/src/current/_includes/v24.2/sidebar-data/migrate.json
index 351d9254e37..ce9bbcd78fc 100644
--- a/src/current/_includes/v24.2/sidebar-data/migrate.json
+++ b/src/current/_includes/v24.2/sidebar-data/migrate.json
@@ -23,22 +23,16 @@
"/molt/migrate-bulk-load.html"
]
},
- {
- "title": "Load and Replicate",
- "urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
{
"title": "Load and Replicate Separately",
"urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v24.3/sidebar-data/migrate.json b/src/current/_includes/v24.3/sidebar-data/migrate.json
index 351d9254e37..ce9bbcd78fc 100644
--- a/src/current/_includes/v24.3/sidebar-data/migrate.json
+++ b/src/current/_includes/v24.3/sidebar-data/migrate.json
@@ -23,22 +23,16 @@
"/molt/migrate-bulk-load.html"
]
},
- {
- "title": "Load and Replicate",
- "urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
{
"title": "Load and Replicate Separately",
"urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v25.1/sidebar-data/migrate.json b/src/current/_includes/v25.1/sidebar-data/migrate.json
index 351d9254e37..77b969311fb 100644
--- a/src/current/_includes/v25.1/sidebar-data/migrate.json
+++ b/src/current/_includes/v25.1/sidebar-data/migrate.json
@@ -26,19 +26,13 @@
{
"title": "Load and Replicate",
"urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
- {
- "title": "Load and Replicate Separately",
- "urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v25.2/sidebar-data/migrate.json b/src/current/_includes/v25.2/sidebar-data/migrate.json
index 7762f069cbc..aa7bb4f6646 100644
--- a/src/current/_includes/v25.2/sidebar-data/migrate.json
+++ b/src/current/_includes/v25.2/sidebar-data/migrate.json
@@ -26,19 +26,13 @@
{
"title": "Load and Replicate",
"urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
- {
- "title": "Load and Replicate Separately",
- "urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v25.3/sidebar-data/migrate.json b/src/current/_includes/v25.3/sidebar-data/migrate.json
index 351d9254e37..77b969311fb 100644
--- a/src/current/_includes/v25.3/sidebar-data/migrate.json
+++ b/src/current/_includes/v25.3/sidebar-data/migrate.json
@@ -26,19 +26,13 @@
{
"title": "Load and Replicate",
"urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
- {
- "title": "Load and Replicate Separately",
- "urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/_includes/v25.4/sidebar-data/migrate.json b/src/current/_includes/v25.4/sidebar-data/migrate.json
index 351d9254e37..77b969311fb 100644
--- a/src/current/_includes/v25.4/sidebar-data/migrate.json
+++ b/src/current/_includes/v25.4/sidebar-data/migrate.json
@@ -26,19 +26,13 @@
{
"title": "Load and Replicate",
"urls": [
- "/molt/migrate-data-load-and-replication.html"
- ]
- },
- {
- "title": "Load and Replicate Separately",
- "urls": [
- "/molt/migrate-data-load-replicate-only.html"
+ "/molt/migrate-load-replicate.html"
]
},
{
"title": "Resume Replication",
"urls": [
- "/molt/migrate-replicate-only.html"
+ "/molt/migrate-resume-replication.html"
]
},
{
@@ -64,6 +58,12 @@
"/molt/molt-fetch.html"
]
},
+ {
+ "title": "Replicator",
+ "urls": [
+ "/molt/molt-replicator.html"
+ ]
+ },
{
"title": "Verify",
"urls": [
diff --git a/src/current/advisories/a144650.md b/src/current/advisories/a144650.md
index cf13b97f1db..969f5c24488 100644
--- a/src/current/advisories/a144650.md
+++ b/src/current/advisories/a144650.md
@@ -106,7 +106,7 @@ Follow these steps after [`detect_144650.sh` finds a corrupted job or problemati
#### MOLT Fetch
-By default, MOLT Fetch uses [`IMPORT INTO`]({% link v25.1/import-into.md %}) to load data into CockroachDB, and can therefore be affected by this issue. [As recommended in the migration documentation]({% link molt/migrate-data-load-replicate-only.md %}#stop-replication-and-verify-data), a run of [MOLT Fetch]({% link molt/molt-fetch.md %}) should be followed by a run of [MOLT Verify]({% link molt/molt-verify.md %}) to ensure that all data on the target side matches the data on the source side.
+By default, MOLT Fetch uses [`IMPORT INTO`]({% link v25.1/import-into.md %}) to load data into CockroachDB, and can therefore be affected by this issue. [As recommended in the migration documentation]({% link molt/migrate-load-replicate.md %}#stop-replication-and-verify-data), a run of [MOLT Fetch]({% link molt/molt-fetch.md %}) should be followed by a run of [MOLT Verify]({% link molt/molt-verify.md %}) to ensure that all data on the target side matches the data on the source side.
- If you ran MOLT Verify after completing your MOLT Fetch run, and Verify did not find mismatches, then MOLT Fetch was unaffected by this issue.
diff --git a/src/current/images/molt/migration_flow.svg b/src/current/images/molt/migration_flow.svg
index f6e0bddb271..e66dbb43359 100644
--- a/src/current/images/molt/migration_flow.svg
+++ b/src/current/images/molt/migration_flow.svg
diff --git a/src/current/molt/migrate-bulk-load.md b/src/current/molt/migrate-bulk-load.md
index d7184cdc098..666e5d1ac31 100644
--- a/src/current/molt/migrate-bulk-load.md
+++ b/src/current/molt/migrate-bulk-load.md
@@ -5,15 +5,17 @@ toc: true
docs_area: migrate
---
-Use `data-load` mode to perform a one-time bulk load of source data into CockroachDB.
+Perform a one-time bulk load of source data into CockroachDB.
+
+{% include molt/crdb-to-crdb-migration.md %}
{% include molt/molt-setup.md %}
-## Load data into CockroachDB
+## Start Fetch
Perform the bulk load of the source data.
-1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data into CockroachDB, specifying [`--mode data-load`]({% link molt/molt-fetch.md %}#fetch-mode) to perform a one-time data load. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It limits the migration to a single schema and filters for three specific tables. The [data load mode](#data-load-mode) defaults to `IMPORT INTO`. Include the `--ignore-replication-check` flag to skip replication checkpoint queries, which eliminates the need to configure the source database for logical replication.
+1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data into CockroachDB. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It limits the migration to a single schema and filters for three specific tables. The [data load mode](#data-load-mode) defaults to `IMPORT INTO`. Include the `--ignore-replication-check` flag to skip replication checkpoint queries, which eliminates the need to configure the source database for logical replication.
{% include_cached copy-clipboard.html %}
@@ -25,7 +27,6 @@ Perform the bulk load of the source data.
--table-filter 'employees|payments|orders' \
--bucket-path 's3://migration/data/cockroach' \
--table-handling truncate-if-exists \
- --mode data-load \
--ignore-replication-check
~~~
@@ -40,7 +41,6 @@ Perform the bulk load of the source data.
--table-filter 'employees|payments|orders' \
--bucket-path 's3://migration/data/cockroach' \
--table-handling truncate-if-exists \
- --mode data-load \
--ignore-replication-check
~~~
@@ -58,7 +58,6 @@ Perform the bulk load of the source data.
--table-filter 'employees|payments|orders' \
--bucket-path 's3://migration/data/cockroach' \
--table-handling truncate-if-exists \
- --mode 'data-load' \
--ignore-replication-check
~~~
@@ -77,7 +76,9 @@ Perform the bulk load of the source data.
Perform a cutover by resuming application traffic, now to CockroachDB.
-{% include molt/molt-troubleshooting.md %}
+## Troubleshooting
+
+{% include molt/molt-troubleshooting-fetch.md %}
## See also
diff --git a/src/current/molt/migrate-data-load-and-replication.md b/src/current/molt/migrate-data-load-and-replication.md
index d2fb266d59c..e69de29bb2d 100644
--- a/src/current/molt/migrate-data-load-and-replication.md
+++ b/src/current/molt/migrate-data-load-and-replication.md
@@ -1,115 +0,0 @@
----
-title: Load and Replicate
-summary: Learn how to migrate data from a source database (such as PostgreSQL, MySQL, or Oracle) into a CockroachDB cluster.
-toc: true
-docs_area: migrate
----
-
-{% assign tab_names_html = "Load and replicate;Replicate separately" %}
-{% assign html_page_filenames = "migrate-data-load-and-replication.html;migrate-data-load-replicate-only.html" %}
-
-{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %}
-
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
-{{site.data.alerts.end}}
-
-Use `data-load-and-replication` mode to perform a one-time bulk load of source data and start continuous replication in a single command.
-
-{{site.data.alerts.callout_success}}
-You can also [load and replicate separately]({% link molt/migrate-data-load-replicate-only.md %}) using `data-load` and `replicate-only`.
-{{site.data.alerts.end}}
-
-{% include molt/molt-setup.md %}
-
-## Load data into CockroachDB
-
-Start the initial load of data into the target database. Continuous replication of changes will start once the data load is complete.
-
-1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB, specifying `--mode data-load-and-replication` to perform an initial load followed by continuous replication. In this example, the `--metricsAddr :30005` [replication flag](#replication-flags) enables a Prometheus endpoint at `http://localhost:30005/_/varz` where replication metrics will be served. You can use these metrics to [verify that replication has drained](#stop-replication-and-verify-data) in a later step.
-
-
- Specify a replication slot name with `--pglogical-replication-slot-name`. This is required for [replication after data load](#replicate-changes-to-cockroachdb).
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --bucket-path 's3://migration/data/cockroach' \
- --table-handling truncate-if-exists \
- --pglogical-replication-slot-name cdc_slot \
- --replicator-flags '--metricsAddr :30005' \
- --mode data-load-and-replication
- ~~~
-
-
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --bucket-path 's3://migration/data/cockroach' \
- --table-handling truncate-if-exists \
- --replicator-flags '--metricsAddr :30005 --userscript table_filter.ts' \
- --mode data-load-and-replication
- ~~~
-
-
-
- The command assumes an Oracle Multitenant (CDB/PDB) source. `--source-cdb` specifies the container database (CDB) connection string.
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --source-cdb $SOURCE_CDB \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --bucket-path 's3://migration/data/cockroach' \
- --table-handling truncate-if-exists \
- --replicator-flags '--metricsAddr :30005 --userscript table_filter.ts' \
- --mode data-load-and-replication
- ~~~
-
-
-{% include molt/fetch-data-load-output.md %}
-
-## Replicate changes to CockroachDB
-
-1. Continuous replication begins immediately after `fetch complete`.
-
-{% include molt/fetch-replication-output.md %}
-
-## Stop replication and verify data
-
-Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data is consistent. This ensures that the data load was successful.
-
-{% include molt/migration-stop-replication.md %}
-
-{% include molt/verify-output.md %}
-
-## Modify the CockroachDB schema
-
-{% include molt/migration-modify-target-schema.md %}
-
-## Cutover
-
-Perform a cutover by resuming application traffic, now to CockroachDB.
-
-{% include molt/molt-troubleshooting.md %}
-
-## See also
-
-- [Migration Overview]({% link molt/migration-overview.md %})
-- [Migration Strategy]({% link molt/migration-strategy.md %})
-- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
-- [MOLT Fetch]({% link molt/molt-fetch.md %})
-- [MOLT Verify]({% link molt/molt-verify.md %})
-- [Migration Failback]({% link molt/migrate-failback.md %})
\ No newline at end of file
diff --git a/src/current/molt/migrate-data-load-replicate-only.md b/src/current/molt/migrate-data-load-replicate-only.md
deleted file mode 100644
index dc7aafc3a90..00000000000
--- a/src/current/molt/migrate-data-load-replicate-only.md
+++ /dev/null
@@ -1,163 +0,0 @@
----
-title: Load and Replicate Separately
-summary: Learn how to migrate data from a source database (such as PostgreSQL, MySQL, or Oracle) into a CockroachDB cluster.
-toc: true
-docs_area: migrate
----
-
-{% assign tab_names_html = "Load and replicate;Replicate separately" %}
-{% assign html_page_filenames = "migrate-data-load-and-replication.html;migrate-data-load-replicate-only.html" %}
-
-{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %}
-
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
-{{site.data.alerts.end}}
-
-Perform an initial bulk load of the source data using `data-load` mode, then use `replication-only` mode to replicate ongoing changes to the target.
-
-{{site.data.alerts.callout_success}}
-You can also [load and replicate in a single command]({% link molt/migrate-data-load-and-replication.md %}) using `data-load-and-replication`.
-{{site.data.alerts.end}}
-
-{% include molt/molt-setup.md %}
-
-## Load data into CockroachDB
-
-Perform the initial load of the source data.
-
-1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB, specifying [`--mode data-load`]({% link molt/molt-fetch.md %}#fetch-mode) to perform a one-time data load. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It also limits the migration to a single schema and filters three specific tables to migrate. The [data load mode](#data-load-mode) defaults to `IMPORT INTO`.
-
-
- Specify a replication slot name with `--pglogical-replication-slot-name`. This is required for [replication in a subsequent step](#replicate-changes-to-cockroachdb).
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --bucket-path 's3://migration/data/cockroach' \
- --table-handling truncate-if-exists \
- --pglogical-replication-slot-name cdc_slot \
- --mode data-load
- ~~~
-
-
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --bucket-path 's3://migration/data/cockroach' \
- --table-handling truncate-if-exists \
- --mode data-load
- ~~~
-
-
-
- The command assumes an Oracle Multitenant (CDB/PDB) source. `--source-cdb` specifies the container database (CDB) connection string.
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --source-cdb $SOURCE_CDB \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --bucket-path 's3://migration/data/cockroach' \
- --table-handling truncate-if-exists \
- --mode data-load
- ~~~
-
-
-{% include molt/fetch-data-load-output.md %}
-
-## Verify the data load
-
-Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data is consistent. This ensures that the data load was successful.
-
-{% include molt/verify-output.md %}
-
-## Replicate changes to CockroachDB
-
-With initial load complete, start replication of ongoing changes on the source to CockroachDB.
-
-
-{% include molt/fetch-replicator-flags.md %}
-
-1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to start replication on CockroachDB, specifying [`--mode replication-only`]({% link molt/molt-fetch.md %}#fetch-mode). In this example, the `--metricsAddr :30005` replication flag enables a Prometheus endpoint at `http://localhost:30005/_/varz` where replication metrics will be served. You can use these metrics to [verify that replication has drained](#stop-replication-and-verify-data) in a later step.
-
-
- Be sure to specify the same `--pglogical-replication-slot-name` value that you provided in [Load data into CockroachDB](#load-data-into-cockroachdb).
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --table-filter 'employees|payments|orders' \
- --pglogical-replication-slot-name cdc_slot \
- --replicator-flags '--metricsAddr :30005' \
- --mode replication-only
- ~~~
-
-
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --table-filter 'employees|payments|orders' \
- --non-interactive \
- --replicator-flags '--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 --metricsAddr :30005 --userscript table_filter.ts' \
- --mode replication-only
- ~~~
-
-
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --source-cdb $SOURCE_CDB \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --replicator-flags '--backfillFromSCN 26685444 --scn 26685786 --metricsAddr :30005 --userscript table_filter.ts' \
- --mode 'replication-only'
- ~~~
-
-
-{% include molt/fetch-replication-output.md %}
-
-## Stop replication and verify data
-
-{% include molt/migration-stop-replication.md %}
-
-1. Repeat [Verify the data load](#verify-the-data-load) to verify the updated data.
-
-## Modify the CockroachDB schema
-
-{% include molt/migration-modify-target-schema.md %}
-
-## Cutover
-
-Perform a cutover by resuming application traffic, now to CockroachDB.
-
-{% include molt/molt-troubleshooting.md %}
-
-## See also
-
-- [Migration Overview]({% link molt/migration-overview.md %})
-- [Migration Strategy]({% link molt/migration-strategy.md %})
-- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
-- [MOLT Fetch]({% link molt/molt-fetch.md %})
-- [MOLT Verify]({% link molt/molt-verify.md %})
-- [Migration Failback]({% link molt/migrate-failback.md %})
\ No newline at end of file
diff --git a/src/current/molt/migrate-failback.md b/src/current/molt/migrate-failback.md
index cfd6dd8a693..b6a39d78d24 100644
--- a/src/current/molt/migrate-failback.md
+++ b/src/current/molt/migrate-failback.md
@@ -1,16 +1,14 @@
---
title: Migration Failback
-summary: Learn how to fail back from a CockroachDB cluster to a PostgreSQL or MySQL database.
+summary: Learn how to fail back from a CockroachDB cluster to a PostgreSQL, MySQL, or Oracle database.
toc: true
docs_area: migrate
---
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
+{{site.data.alerts.callout_info}}
+These instructions assume you have already [installed MOLT and completed the prerequisites]({% link molt/migrate-load-replicate.md %}#before-you-begin) for your source dialect.
{{site.data.alerts.end}}
-If issues arise during migration, run MOLT Fetch in `failback` mode after stopping replication and before writing to CockroachDB. This ensures that data remains consistent on the source in case you need to roll back the migration.
-
@@ -19,182 +17,291 @@ If issues arise during migration, run MOLT Fetch in `failback` mode after stoppi
## Prepare the CockroachDB cluster
-[Enable rangefeeds]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}#enable-rangefeeds) on the CockroachDB cluster:
+{{site.data.alerts.callout_success}}
+For details on enabling CockroachDB changefeeds, refer to [Create and Configure Changefeeds]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}).
+{{site.data.alerts.end}}
+
+If you are migrating to a CockroachDB {{ site.data.products.core }} cluster, [enable rangefeeds]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}#enable-rangefeeds) on the cluster:
{% include_cached copy-clipboard.html %}
~~~ sql
SET CLUSTER SETTING kv.rangefeed.enabled = true;
~~~
-
-## Grant Oracle user permissions
+Use the following optional settings to increase changefeed throughput.
+
+{{site.data.alerts.callout_danger}}
+The following settings can impact source cluster performance and stability, especially SQL foreground latency during writes. For details, refer to [Advanced Changefeed Configuration]({% link {{ site.current_cloud_version }}/advanced-changefeed-configuration.md %}).
+{{site.data.alerts.end}}
+
+To lower changefeed emission latency at the cost of increased SQL foreground latency:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SET CLUSTER SETTING kv.rangefeed.closed_timestamp_refresh_interval = '250ms';
+~~~
+
+To lower the [closed timestamp]({% link {{ site.current_cloud_version }}/architecture/transaction-layer.md %}#closed-timestamps) lag duration:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SET CLUSTER SETTING kv.closed_timestamp.target_duration = '1s';
+~~~
+
+To improve catch-up speed at the cost of increased cluster CPU usage:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SET CLUSTER SETTING kv.rangefeed.concurrent_catchup_iterators = 64;
+~~~
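+
+To confirm that a setting took effect, read it back with `SHOW CLUSTER SETTING`. For example:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SHOW CLUSTER SETTING kv.closed_timestamp.target_duration;
+~~~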
+
+## Grant target database user permissions
+
+You should have already created a migration user on the target database (your original source database) with the necessary privileges. Refer to [Create migration user on source database]({% link molt/migrate-load-replicate.md %}#create-migration-user-on-source-database).
-You should have already created a migration user on the source database with the necessary privileges. Refer to [Create migration user on source database]({% link molt/migrate-data-load-replicate-only.md %}?filters=oracle#create-migration-user-on-source-database).
+For failback replication, grant the user additional privileges to write data back to the target database:
-Grant the Oracle user additional `INSERT` and `UPDATE` privileges on the tables to fail back:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Grant INSERT and UPDATE on tables to fail back to
+GRANT INSERT, UPDATE ON ALL TABLES IN SCHEMA migration_schema TO migration_user;
+ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema GRANT INSERT, UPDATE ON TABLES TO migration_user;
+~~~
+
+
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Grant SELECT, INSERT, and UPDATE on tables to fail back to
+GRANT SELECT, INSERT, UPDATE ON source_database.* TO 'migration_user'@'%';
+FLUSH PRIVILEGES;
+~~~
+
+
{% include_cached copy-clipboard.html %}
~~~ sql
+-- Grant SELECT, INSERT, UPDATE, and FLASHBACK on tables to fail back to
GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.employees TO MIGRATION_USER;
GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.payments TO MIGRATION_USER;
GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.orders TO MIGRATION_USER;
~~~
-## Configure failback
-
-Configure the MOLT Fetch connection strings and filters for `failback` mode, ensuring that the CockroachDB changefeed is correctly targeting your original source.
+## Configure Replicator
-### Connection strings
+When you run `replicator`, you can configure the following options for replication:
-In `failback` mode, the `--source` and `--target` connection strings are reversed from other migration modes:
+- [Connection strings](#connection-strings): Specify URL-encoded source and target connections.
+- [TLS certificate and key](#tls-certificate-and-key): Configure secure TLS connections.
+- [Replication flags](#replication-flags): Specify required and optional flags to configure replicator behavior.
+- [Tuning parameters](#tuning-parameters): Optimize failback performance and resource usage.
+- [Replicator metrics](#replicator-metrics): Monitor failback replication performance.
-`--source` is the CockroachDB connection string. For example:
+### Connection strings
-~~~
---source 'postgres://crdb_user@localhost:26257/defaultdb?sslmode=verify-full'
-~~~
+For failback, MOLT Replicator uses `--targetConn` to specify the destination database where you want to replicate CockroachDB changes, and `--stagingConn` for the CockroachDB staging database.
-`--target` is the connection string of the database you migrated from.
+`--targetConn` is the connection string of the database you want to replicate changes to (the database you originally migrated from).
-
For example:
+
~~~
---target 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full'
+--targetConn 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full'
~~~
-For example:
-
~~~
---target 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt'
+--targetConn 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt'
~~~
-For example:
+~~~
+--targetConn 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1'
+~~~
+
+
+`--stagingConn` is the CockroachDB connection string for staging operations:
~~~
---target 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1'
+--stagingConn 'postgres://crdb_user@localhost:26257/defaultdb?sslmode=verify-full'
~~~
-{{site.data.alerts.callout_info}}
-With Oracle Multitenant deployments, `--source-cdb` is **not** necessary for `failback`.
-{{site.data.alerts.end}}
-
+#### Secure connections
+
+{% include molt/molt-secure-connection-strings.md %}
-### Secure changefeed for failback
+### TLS certificate and key
-`failback` mode creates a [CockroachDB changefeed]({% link {{ site.current_cloud_version }}/change-data-capture-overview.md %}) and sets up a [webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink) to pass change events from CockroachDB to the failback target. In production, you should override the [default insecure changefeed]({% link molt/molt-fetch.md %}#default-insecure-changefeed) with secure settings.
+Always use **secure TLS connections** for failback replication to protect data in transit. Do **not** use insecure configurations in production: avoid the `--disableAuthentication` and `--tlsSelfSigned` Replicator flags, and the `insecure_tls_skip_verify=true` query parameter in the changefeed webhook URI.
-Provide these overrides in a JSON file. At minimum, the JSON should include the base64-encoded client certificate ([`client_cert`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#client-cert)), key ([`client_key`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#client-key)), and CA ([`ca_cert`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#ca-cert)) for the webhook sink.
+Generate self-signed TLS certificates or certificates from an external CA. Ensure the TLS server certificate and key are accessible on the MOLT Replicator host machine via a relative or absolute file path. When you [start failback with Replicator](#start-replicator), specify the paths with `--tlsCertificate` and `--tlsPrivateKey`. For example:
{% include_cached copy-clipboard.html %}
-~~~ json
-{
- "sink_query_parameters": "client_cert={base64 cert}&client_key={base64 key}&ca_cert={base64 CA cert}"
-}
+~~~ shell
+replicator start \
+... \
+--tlsCertificate ./certs/server.crt \
+--tlsPrivateKey ./certs/server.key
~~~
-{{site.data.alerts.callout_success}}
-In the `molt fetch` command, use `--replicator-flags` to specify the paths to the server certificate and key for the webhook sink. Refer to [Replication flags](#replication-flags).
-{{site.data.alerts.end}}
-
-Pass the JSON file path to `molt` via `--changefeeds-path`. For example:
+The client certificates defined in the changefeed webhook URI must correspond to the server certificates specified in the `replicator` command. This ensures proper TLS handshake between the changefeed and MOLT Replicator. To include client certificates in the changefeed webhook URL, encode them with `base64` and then URL-encode the output with `jq`:
{% include_cached copy-clipboard.html %}
-~~~
---changefeeds-path 'changefeed-secure.json'
+~~~ shell
+base64 -i ./client.crt | jq -R -r '@uri'
+base64 -i ./client.key | jq -R -r '@uri'
+base64 -i ./ca.crt | jq -R -r '@uri'
~~~
-Because the changefeed runs inside the CockroachDB cluster, the `--changefeeds-path` file must reference a webhook endpoint address reachable by the cluster, not necessarily your local workstation.
+When you [create the changefeed](#create-the-cockroachdb-changefeed), pass the encoded certificates in the changefeed URL, where `client_cert`, `client_key`, and `ca_cert` are [webhook sink parameters]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-parameters). For example:
-For details, refer to [Changefeed override settings]({% link molt/molt-fetch.md %}#changefeed-override-settings).
+{% include_cached copy-clipboard.html %}
+~~~ sql
+CREATE CHANGEFEED FOR TABLE table1, table2
+INTO 'webhook-https://host:port/database/schema?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}'
+WITH ...;
+~~~
+
+For additional details on the webhook sink URI, refer to [Webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink).
### Replication flags
-{% include molt/fetch-replicator-flags.md %}
+{% include molt/replicator-flags-usage.md %}
-## Fail back from CockroachDB
+
+### Tuning parameters
-Start failback to the source database.
+{% include molt/optimize-replicator-performance.md %}
+
-1. Cancel replication to CockroachDB by entering `ctrl-c` to issue a `SIGTERM` signal to the `fetch` process. This returns an exit code `0`.
+{% include molt/replicator-metrics.md %}
-1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to fail back to the source database, specifying `--mode failback`. In this example, we filter the `migration_schema` schema and the `employees`, `payments`, and `orders` tables, configure the staging schema with `--replicator-flags`, and use `--changefeeds-path` to provide the secure changefeed override.
+## Stop forward replication
+
+{% include molt/migration-stop-replication.md %}
+
+## Start Replicator
+
+1. Run the [MOLT Replicator]({% link molt/molt-replicator.md %}) `start` command to begin failback replication from CockroachDB to your source database. In this example, `--metricsAddr :30005` enables a Prometheus endpoint for monitoring replication metrics, and `--bindAddr :30004` sets up the webhook endpoint for the changefeed.
+
+ `--stagingSchema` specifies the staging database name (`_replicator` in this example) used for replication checkpoints and metadata. This staging database was created during [initial forward replication]({% link molt/migrate-load-replicate.md %}#start-replicator) when you first ran MOLT Replicator with `--stagingCreateSchema`.
-
{% include_cached copy-clipboard.html %}
~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --replicator-flags '--stagingSchema _replicator_1739996035106984000 --tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key' \
- --mode failback \
- --changefeeds-path 'changefeed-secure.json'
+ replicator start \
+ --targetConn $TARGET \
+ --stagingConn $STAGING \
+ --stagingSchema _replicator \
+ --metricsAddr :30005 \
+ --bindAddr :30004 \
+ --tlsCertificate ./certs/server.crt \
+ --tlsPrivateKey ./certs/server.key \
+ -v
+ ~~~
+
+## Create the CockroachDB changefeed
+
+Create a CockroachDB changefeed to send changes to MOLT Replicator.
+
+1. After [ensuring that forward replication has fully drained](#stop-forward-replication), get the current logical timestamp from CockroachDB:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ sql
+ SELECT cluster_logical_timestamp();
+ ~~~
+
+ ~~~
+ cluster_logical_timestamp
+ ----------------------------------
+ 1759246920563173000.0000000000
+ ~~~
+
+1. Create the CockroachDB changefeed pointing to the MOLT Replicator webhook endpoint. Use `cursor` to specify the logical timestamp from the preceding step.
+
+ {{site.data.alerts.callout_info}}
+ Ensure that only **one** changefeed points to MOLT Replicator at a time to avoid mixing streams of incoming data.
+ {{site.data.alerts.end}}
+
+ {{site.data.alerts.callout_success}}
+ For details on the webhook sink URI, refer to [Webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink).
+ {{site.data.alerts.end}}
+
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ sql
+ CREATE CHANGEFEED FOR TABLE employees, payments, orders
+ INTO 'webhook-https://replicator-host:30004/migration_schema/public?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}'
+ WITH updated, resolved = '250ms', min_checkpoint_frequency = '250ms', initial_scan = 'no', cursor = '1759246920563173000.0000000000', webhook_sink_config = '{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';
~~~
{% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --replicator-flags '--stagingSchema _replicator_1739996035106984000 --tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key' \
- --mode failback \
- --changefeeds-path 'changefeed-secure.json'
+ ~~~ sql
+ CREATE CHANGEFEED FOR TABLE employees, payments, orders
+ INTO 'webhook-https://replicator-host:30004/migration_schema?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}'
+ WITH updated, resolved = '250ms', min_checkpoint_frequency = '250ms', initial_scan = 'no', cursor = '1759246920563173000.0000000000', webhook_sink_config = '{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';
~~~
{% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --replicator-flags '--stagingSchema _replicator_1739996035106984000 --tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key --userscript table_filter.ts' \
- --mode failback \
- --changefeeds-path 'changefeed-secure.json'
+ ~~~ sql
+ CREATE CHANGEFEED FOR TABLE employees, payments, orders
+ INTO 'webhook-https://replicator-host:30004/MIGRATION_SCHEMA?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}'
+ WITH updated, resolved = '250ms', min_checkpoint_frequency = '250ms', initial_scan = 'no', cursor = '1759246920563173000.0000000000', webhook_sink_config = '{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';
~~~
-
- {{site.data.alerts.callout_info}}
- With Oracle Multitenant deployments, while `--source-cdb` is required for other `fetch` modes, it is **not** necessary for `failback`.
- {{site.data.alerts.end}}
-1. Check the output to observe `fetch progress`.
-
- A `starting replicator` message indicates that the task has started:
+ The output shows the job ID:
- ~~~ json
- {"level":"info","time":"2025-02-20T15:55:44-05:00","message":"starting replicator"}
+ ~~~
+ job_id
+ -----------------------
+ 1101234051444375553
~~~
- The `staging database name` message contains the name of the staging schema:
+1. Monitor the changefeed status, specifying the job ID:
- ~~~ json
- {"level":"info","time":"2025-02-11T14:56:20-05:00","message":"staging database name: _replicator_1739303283084207000"}
+ ~~~ sql
+ SHOW CHANGEFEED JOB 1101234051444375553;
~~~
- A `creating changefeed` message indicates that a changefeed will be passing change events from CockroachDB to the failback target:
+ ~~~
+ job_id | ... | status | running_status | ...
+ ----------------------+-----+---------+-------------------------------------------+----
+ 1101234051444375553 | ... | running | running: resolved=1759246920563173000,0 | ...
+ ~~~
+
+ To confirm the changefeed is active and replicating changes to the target database, check that `status` is `running` and `running_status` shows `running: resolved={timestamp}`.
- ~~~ json
- {"level":"info","time":"2025-02-20T15:55:44-05:00","message":"creating changefeed on the source CRDB database"}
+ {{site.data.alerts.callout_danger}}
+ `running: resolved` may be reported even if data is not actually reaching the target. When Replicator logs show no incoming requests, the cause is typically incorrect host/port configuration or a network connectivity issue. If this happens, rule out connectivity problems with the check following these steps.
+ {{site.data.alerts.end}}
+
+1. Verify that Replicator is reporting incoming HTTP requests from the changefeed. To do so, check the MOLT Replicator logs. Since you enabled debug logging with `-v`, you should see periodic HTTP request successes:
+
+ ~~~
+ DEBUG [Aug 25 11:52:47] httpRequest="&{0x14000b068c0 45 200 3 9.770958ms false false}"
+ DEBUG [Aug 25 11:52:48] httpRequest="&{0x14000d1a000 45 200 3 13.438125ms false false}"
~~~
+ These debug messages confirm successful changefeed connections to MOLT Replicator. You can disable verbose logging after verifying the connection.
+
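+If `running: resolved` is reported but Replicator logs show no incoming requests, confirm that the webhook endpoint is reachable from the CockroachDB cluster. A quick connectivity sketch using `curl` from a machine on the cluster's network (`replicator-host` and the certificate path are placeholders):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Any HTTP response, even an error status, proves the endpoint is reachable
+curl -v --cacert ./certs/ca.crt https://replicator-host:30004/
+~~~
+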
+## Troubleshooting
+
+{% include molt/molt-troubleshooting-failback.md %}
+
## See also
+- [MOLT Replicator]({% link molt/molt-replicator.md %})
- [Migration Overview]({% link molt/migration-overview.md %})
- [Migration Strategy]({% link molt/migration-strategy.md %})
-- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
- [MOLT Fetch]({% link molt/molt-fetch.md %})
-- [MOLT Verify]({% link molt/molt-verify.md %})
\ No newline at end of file
diff --git a/src/current/molt/migrate-load-replicate.md b/src/current/molt/migrate-load-replicate.md
new file mode 100644
index 00000000000..18ea2a49a75
--- /dev/null
+++ b/src/current/molt/migrate-load-replicate.md
@@ -0,0 +1,290 @@
+---
+title: Load and Replicate
+summary: Learn how to migrate data from a source database (such as PostgreSQL, MySQL, or Oracle) into a CockroachDB cluster.
+toc: true
+docs_area: migrate
+---
+
+Perform an initial bulk load of the source data using [MOLT Fetch]({% link molt/molt-fetch.md %}), then use [MOLT Replicator]({% link molt/molt-replicator.md %}) to replicate ongoing changes to the target.
+
+{% include molt/crdb-to-crdb-migration.md %}
+
+{% include molt/molt-setup.md %}
+
+## Start Fetch
+
+Perform the initial load of the source data.
+
+1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It also limits the migration to a single schema and filters three specific tables to migrate. The [data load mode](#data-load-mode) defaults to `IMPORT INTO`.
+
+
+ You **must** include `--pglogical-replication-slot-name` and `--pglogical-publication-and-slot-drop-and-recreate` to automatically create the publication and replication slot during the data load.
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ molt fetch \
+ --source $SOURCE \
+ --target $TARGET \
+ --schema-filter 'migration_schema' \
+ --table-filter 'employees|payments|orders' \
+ --bucket-path 's3://migration/data/cockroach' \
+ --table-handling truncate-if-exists \
+ --pglogical-replication-slot-name molt_slot \
+ --pglogical-publication-and-slot-drop-and-recreate
+ ~~~
+
+
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ molt fetch \
+ --source $SOURCE \
+ --target $TARGET \
+ --schema-filter 'migration_schema' \
+ --table-filter 'employees|payments|orders' \
+ --bucket-path 's3://migration/data/cockroach' \
+ --table-handling truncate-if-exists
+ ~~~
+
+
+
+ The command assumes an Oracle Multitenant (CDB/PDB) source. `--source-cdb` specifies the container database (CDB) connection string.
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ molt fetch \
+ --source $SOURCE \
+ --source-cdb $SOURCE_CDB \
+ --target $TARGET \
+ --schema-filter 'migration_schema' \
+ --table-filter 'employees|payments|orders' \
+ --bucket-path 's3://migration/data/cockroach' \
+ --table-handling truncate-if-exists
+ ~~~
+
+
+{% include molt/fetch-data-load-output.md %}
+
+## Verify the data load
+
+Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data is consistent. This ensures that the data load was successful.
+
+{% include molt/verify-output.md %}
+
+## Configure Replicator
+
+When you run `replicator`, you can configure the following options for replication:
+
+- [Replication connection strings](#replication-connection-strings): Specify URL-encoded source and target database connections.
+- [Replication flags](#replication-flags): Specify required and optional flags to configure replicator behavior.
+- [Tuning parameters](#tuning-parameters): Optimize replication performance and resource usage.
+- [Replicator metrics](#replicator-metrics): Monitor replication progress and performance.
+
+### Replication connection strings
+
+MOLT Replicator uses `--sourceConn` and `--targetConn` to specify the source and target database connections.
+
+`--sourceConn` specifies the connection string of the source database:
+
+
+~~~
+--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
+~~~
+
+
+
+~~~
+--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
+~~~
+
+
+
+~~~
+--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'
+~~~
+
+For Oracle Multitenant databases, also specify `--sourcePDBConn` with the PDB connection string:
+
+~~~
+--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'
+~~~
+
+
+`--targetConn` specifies the target CockroachDB connection string:
+
+~~~
+--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
+~~~
+
+{{site.data.alerts.callout_success}}
+Follow best practices for securing connection strings. Refer to [Secure connections](#secure-connections).
+{{site.data.alerts.end}}
+
+### Replication flags
+
+{% include molt/replicator-flags-usage.md %}
+
+
+### Tuning parameters
+
+{% include molt/optimize-replicator-performance.md %}
+
+
+{% include molt/replicator-metrics.md %}
+
+## Start Replicator
+
+With initial load complete, start replication of ongoing changes on the source to CockroachDB using [MOLT Replicator]({% link molt/molt-replicator.md %}).
+
+{{site.data.alerts.callout_info}}
+MOLT Fetch captures a consistent point-in-time checkpoint at the start of the data load (shown as `cdc_cursor` in the fetch output). Starting replication from this checkpoint ensures that all changes made during and after the data load are replicated to CockroachDB, preventing data loss or duplication. The following steps use the checkpoint values from the fetch output to start replication at the correct position.
+{{site.data.alerts.end}}
+
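+The checkpoint values appear in the JSON log lines that `molt fetch` emits. If you captured that output to a file, a quick sketch with `jq` (assuming a log file named `fetch.log`) pulls the `cdc_cursor` from the final summary line:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Extract the cdc_cursor checkpoint from the fetch summary output
+jq -r 'select(.message == "fetch complete") | .cdc_cursor' fetch.log
+~~~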
+
+1. Run the `replicator` command, using the same slot name that you specified with `--pglogical-replication-slot-name` in the [Fetch command](#start-fetch). Use `--stagingSchema` to specify a unique name for the staging database, and include `--stagingCreateSchema` to have MOLT Replicator automatically create the staging database:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ replicator pglogical \
+ --sourceConn $SOURCE \
+ --targetConn $TARGET \
+ --targetSchema defaultdb.public \
+ --slotName molt_slot \
+ --stagingSchema _replicator \
+ --stagingCreateSchema \
+ --metricsAddr :30005 \
+ -v
+ ~~~
+
+
+
+1. Run the `replicator` command, specifying the GTID from the [checkpoint recorded during data load](#start-fetch). Use `--stagingSchema` to specify a unique name for the staging database, and include `--stagingCreateSchema` to have MOLT Replicator automatically create the staging database:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ replicator mylogical \
+ --sourceConn $SOURCE \
+ --targetConn $TARGET \
+ --targetSchema defaultdb.public \
+ --defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 \
+ --stagingSchema _replicator \
+ --stagingCreateSchema \
+ --metricsAddr :30005 \
+ --userscript table_filter.ts \
+ -v
+ ~~~
+
+ {{site.data.alerts.callout_success}}
+ For MySQL versions that do not support `binlog_row_metadata`, include `--fetchMetadata` to explicitly fetch column metadata. This requires additional permissions on the source MySQL database. Grant `SELECT` permissions with `GRANT SELECT ON source_database.* TO 'migration_user'@'localhost';`. If that is insufficient for your deployment, use `GRANT PROCESS ON *.* TO 'migration_user'@'localhost';`, though this is more permissive, because it allows the user to view all processes and server status.
+ {{site.data.alerts.end}}
+
+
+
+1. Run the `replicator` command, specifying the backfill and starting SCN from the [checkpoint recorded during data load](#start-fetch). Use `--stagingSchema` to specify a unique name for the staging database, and include `--stagingCreateSchema` to have MOLT Replicator automatically create the staging database:
+
+ {% include_cached copy-clipboard.html %}
+ ~~~ shell
+ replicator oraclelogminer \
+ --sourceConn $SOURCE \
+ --sourcePDBConn $SOURCE_PDB \
+ --targetConn $TARGET \
+ --sourceSchema migration_schema \
+ --targetSchema defaultdb.public \
+ --backfillFromSCN 26685444 \
+ --scn 26685786 \
+ --stagingSchema _replicator \
+ --stagingCreateSchema \
+ --metricsAddr :30005 \
+ --userscript table_filter.ts \
+ -v
+ ~~~
+
+ {{site.data.alerts.callout_info}}
+ When filtering out tables in a schema with a userscript, replication performance may decrease because filtered tables are still included in LogMiner queries and processed before being discarded.
+ {{site.data.alerts.end}}
+
+
+## Verify replication
+
+1. Verify that Replicator is processing changes successfully. To do so, check the MOLT Replicator logs. Since you enabled debug logging with `-v`, you should see connection and row processing messages:
+
+
+ You should see periodic primary keepalive messages:
+
+ ~~~
+ DEBUG [Aug 25 14:38:10] primary keepalive received ReplyRequested=false ServerTime="2025-08-25 14:38:09.556773 -0500 CDT" ServerWALEnd=0/49913A58
+ DEBUG [Aug 25 14:38:15] primary keepalive received ReplyRequested=false ServerTime="2025-08-25 14:38:14.556836 -0500 CDT" ServerWALEnd=0/49913E60
+ ~~~
+
+ When rows are successfully replicated, you should see debug output like the following:
+
+ ~~~
+ DEBUG [Aug 25 14:40:02] upserted rows conflicts=0 duration=7.855333ms proposed=1 target="\"molt\".\"public\".\"tbl1\"" upserted=1
+ DEBUG [Aug 25 14:40:02] progressed to LSN: 0/49915DD0
+ ~~~
+
+
+
+ You should see binlog syncer connection and row processing:
+
+ ~~~
+ [2025/08/25 15:29:09] [info] binlogsyncer.go:463 begin to sync binlog from GTID set 77263736-7899-11f0-81a5-0242ac120002:1-38
+ [2025/08/25 15:29:09] [info] binlogsyncer.go:409 Connected to mysql 8.0.43 server
+ INFO [Aug 25 15:29:09] connected to MySQL version 8.0.43
+ ~~~
+
+ When rows are successfully replicated, you should see debug output like the following:
+
+ ~~~
+ DEBUG [Aug 25 15:29:38] upserted rows conflicts=0 duration=1.801ms proposed=1 target="\"molt\".\"public\".\"tbl1\"" upserted=1
+ ~~~
+
+
+
+ When transactions are read from the Oracle source, you should see registered transaction IDs (XIDs):
+
+ ~~~
+ DEBUG [Jul 3 15:55:12] registered xid 0f001f0040060000
+ DEBUG [Jul 3 15:55:12] registered xid 0b001f00bb090000
+ ~~~
+
+ When rows are successfully replicated, you should see debug output like the following:
+
+ ~~~
+ DEBUG [Jul 3 15:55:12] upserted rows conflicts=0 duration=2.620009ms proposed=13 target="\"molt_movies\".\"USERS\".\"CUSTOMER_CONTACT\"" upserted=13
+ DEBUG [Jul 3 15:55:12] upserted rows conflicts=0 duration=2.212807ms proposed=16 target="\"molt_movies\".\"USERS\".\"CUSTOMER_DEVICE\"" upserted=16
+ ~~~
+
+
+ These messages confirm successful replication. You can disable verbose logging after verifying the connection.
+
+## Stop replication and verify data
+
+{% include molt/migration-stop-replication.md %}
+
+1. Repeat [Verify the data load](#verify-the-data-load) to verify the updated data.
+
+## Modify the CockroachDB schema
+
+{% include molt/migration-modify-target-schema.md %}
+
+## Cutover
+
+Perform a cutover by resuming application traffic, now to CockroachDB.
+
+## Troubleshooting
+
+{% include molt/molt-troubleshooting-fetch.md %}
+{% include molt/molt-troubleshooting-replication.md %}
+
+## See also
+
+- [Migration Overview]({% link molt/migration-overview.md %})
+- [Migration Strategy]({% link molt/migration-strategy.md %})
+- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
+- [MOLT Fetch]({% link molt/molt-fetch.md %})
+- [MOLT Verify]({% link molt/molt-verify.md %})
+- [Migration Failback]({% link molt/migrate-failback.md %})
diff --git a/src/current/molt/migrate-replicate-only.md b/src/current/molt/migrate-replicate-only.md
index ae030f6f25c..e69de29bb2d 100644
--- a/src/current/molt/migrate-replicate-only.md
+++ b/src/current/molt/migrate-replicate-only.md
@@ -1,85 +0,0 @@
----
-title: Resume Replication
-summary: Restart ongoing replication using an existing staging schema checkpoint.
-toc: true
-docs_area: migrate
----
-
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
-{{site.data.alerts.end}}
-
-Use `replication-only` mode to resume replication to CockroachDB after an interruption, without reloading data.
-
-{{site.data.alerts.callout_info}}
-These steps assume that you previously started replication. Refer to [Load and Replicate]({% link molt/migrate-data-load-and-replication.md %}#replicate-changes-to-cockroachdb) or [Load and Replicate Separately]({% link molt/migrate-data-load-replicate-only.md %}#replicate-changes-to-cockroachdb).
-{{site.data.alerts.end}}
-
-
-
-
-
-
-
-## Resume replication after interruption
-
-{% include molt/fetch-replicator-flags.md %}
-
-1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to start replication on CockroachDB, specifying [`--mode replication-only`]({% link molt/molt-fetch.md %}#fetch-mode).
-
-
- Be sure to specify the same `--pglogical-replication-slot-name` value that you provided on [data load]({% link molt/migrate-data-load-replicate-only.md %}#load-data-into-cockroachdb).
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --pglogical-replication-slot-name cdc_slot \
- --replicator-flags '--stagingSchema _replicator_1749699789613149000 --metricsAddr :30005' \
- --mode replication-only
- ~~~
-
-
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --non-interactive \
- --replicator-flags '--stagingSchema _replicator_1749699789613149000 --metricsAddr :30005 --userscript table_filter.ts' \
- --mode replication-only
- ~~~
-
-
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt fetch \
- --source $SOURCE \
- --source-cdb $SOURCE_CDB \
- --target $TARGET \
- --schema-filter 'migration_schema' \
- --table-filter 'employees|payments|orders' \
- --replicator-flags '--stagingSchema _replicator_1749699789613149000 --metricsAddr :30005 --userscript table_filter.ts' \
- --mode 'replication-only'
- ~~~
-
-
- Replication resumes from the last checkpoint without performing a fresh load.
-
-{% include molt/fetch-replication-output.md %}
-
-## See also
-
-- [Migration Overview]({% link molt/migration-overview.md %})
-- [Migration Strategy]({% link molt/migration-strategy.md %})
-- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
-- [MOLT Fetch]({% link molt/molt-fetch.md %})
-- [MOLT Verify]({% link molt/molt-verify.md %})
-- [Migration Failback]({% link molt/migrate-failback.md %})
\ No newline at end of file
diff --git a/src/current/molt/migrate-resume-replication.md b/src/current/molt/migrate-resume-replication.md
new file mode 100644
index 00000000000..2ccd0b68cf3
--- /dev/null
+++ b/src/current/molt/migrate-resume-replication.md
@@ -0,0 +1,97 @@
+---
+title: Resume Replication
+summary: Resume replication after an interruption.
+toc: true
+docs_area: migrate
+---
+
+Resume replication using [MOLT Replicator]({% link molt/molt-replicator.md %}) by running `replicator` with the same arguments used during [initial replication setup]({% link molt/migrate-load-replicate.md %}?filters=postgres#start-replicator). Replicator will automatically resume from the saved checkpoint in the existing staging schema.
+
+{{site.data.alerts.callout_info}}
+These instructions assume you have already started replication at least once. To start replication for the first time, refer to [Load and Replicate]({% link molt/migrate-load-replicate.md %}#start-replicator).
+{{site.data.alerts.end}}
+
+
+
+
+
+
+
+## Resume replication after interruption
+
+
+Run the [MOLT Replicator]({% link molt/molt-replicator.md %}) `pglogical` command using the same `--stagingSchema` value from your [initial replication command]({% link molt/migrate-load-replicate.md %}?filters=postgres#start-replicator).
+
+Be sure to specify the same `--slotName` value that you used during your [initial replication command]({% link molt/migrate-load-replicate.md %}?filters=postgres#start-replicator). The replication slot on the source PostgreSQL database automatically tracks the LSN (Log Sequence Number) checkpoint, so replication will resume from where it left off.
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical \
+--sourceConn $SOURCE \
+--targetConn $TARGET \
+--targetSchema defaultdb.public \
+--slotName molt_slot \
+--stagingSchema _replicator \
+--metricsAddr :30005 \
+-v
+~~~
+
+
+
+Run the [MOLT Replicator]({% link molt/molt-replicator.md %}) `mylogical` command using the same `--stagingSchema` value from your [initial replication command]({% link molt/migrate-load-replicate.md %}?filters=mysql#start-replicator).
+
+Replicator will automatically use the saved GTID (Global Transaction Identifier) from the `memo` table in the staging schema (in this example, `_replicator.memo`) and track advancing GTID checkpoints there. To have Replicator start from a different GTID instead of resuming from the checkpoint, clear the `memo` table with `DELETE FROM _replicator.memo;` and run the `replicator` command with a new `--defaultGTIDSet` value.
+
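+To inspect the saved checkpoint before resuming, you can query the staging database directly on CockroachDB. This is a sketch; the `memo` table's column layout is internal to Replicator and may vary by version:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- View the saved GTID checkpoint rows in the staging schema
+SELECT * FROM _replicator.memo;
+~~~
+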
+{{site.data.alerts.callout_success}}
+For MySQL versions that do not support `binlog_row_metadata`, include `--fetchMetadata` to explicitly fetch column metadata. This requires additional permissions on the source MySQL database. Grant `SELECT` permissions with `GRANT SELECT ON source_database.* TO 'migration_user'@'localhost';`. If that is insufficient for your deployment, use `GRANT PROCESS ON *.* TO 'migration_user'@'localhost';`, though this is more permissive, because it allows the user to view all processes and server status.
+{{site.data.alerts.end}}
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator mylogical \
+--sourceConn $SOURCE \
+--targetConn $TARGET \
+--targetSchema defaultdb.public \
+--stagingSchema _replicator \
+--metricsAddr :30005 \
+--userscript table_filter.ts \
+-v
+~~~
+
+
+
+Run the [MOLT Replicator]({% link molt/molt-replicator.md %}) `oraclelogminer` command using the same `--stagingSchema` value from your [initial replication command]({% link molt/migrate-load-replicate.md %}?filters=oracle#start-replicator).
+
+Replicator will automatically find the correct restart SCN (System Change Number) from the `_oracle_checkpoint` table in the staging schema. The restart point is determined by the uncommitted row with the smallest `startscn` column value.
+
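+To inspect the checkpoint state before resuming, you can query the staging database directly. This is a sketch; columns other than `startscn` are internal to Replicator and may vary by version:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- View checkpoint rows; the restart point is the uncommitted row with the smallest startscn
+SELECT * FROM _replicator._oracle_checkpoint ORDER BY startscn ASC;
+~~~
+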
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator oraclelogminer \
+--sourceConn $SOURCE \
+--sourcePDBConn $SOURCE_PDB \
+--sourceSchema migration_schema \
+--targetConn $TARGET \
+--stagingSchema _replicator \
+--metricsAddr :30005 \
+--userscript table_filter.ts
+~~~
+
+{{site.data.alerts.callout_info}}
+When filtering out tables in a schema with a userscript, replication performance may decrease because filtered tables are still included in LogMiner queries and processed before being discarded.
+{{site.data.alerts.end}}
+
+
+Replication resumes from the last checkpoint without performing a fresh load. Monitor the metrics endpoint at `http://localhost:30005/_/varz` to track replication progress.
+
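+For a quick spot check, you can fetch the raw Prometheus output from the metrics endpoint (a sketch; metric names vary by source dialect):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Print the first few Prometheus metrics exposed by Replicator
+curl -s http://localhost:30005/_/varz | head -n 20
+~~~
+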
+## Troubleshooting
+
+{% include molt/molt-troubleshooting-replication.md %}
+
+## See also
+
+- [Migration Overview]({% link molt/migration-overview.md %})
+- [Migration Strategy]({% link molt/migration-strategy.md %})
+- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
+- [MOLT Fetch]({% link molt/molt-fetch.md %})
+- [MOLT Verify]({% link molt/molt-verify.md %})
+- [Migration Failback]({% link molt/migrate-failback.md %})
\ No newline at end of file
diff --git a/src/current/molt/migrate-to-cockroachdb.md b/src/current/molt/migrate-to-cockroachdb.md
index 60aabe29810..a64832549da 100644
--- a/src/current/molt/migrate-to-cockroachdb.md
+++ b/src/current/molt/migrate-to-cockroachdb.md
@@ -7,13 +7,14 @@ docs_area: migrate
MOLT Fetch supports various migration flows using [MOLT Fetch modes]({% link molt/molt-fetch.md %}#fetch-mode).
-| Migration flow | Mode | Description | Best for |
-|----------------------------------------------------------------------------------------|----------------------------------------------------|--------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|
-| [Bulk load]({% link molt/migrate-bulk-load.md %}) | `--mode data-load` | Perform a one-time bulk load of source data into CockroachDB. | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) |
-| [Data load and replication]({% link molt/migrate-data-load-and-replication.md %}) | `--mode data-load-and-replication` | Load source data, then replicate subsequent changes continuously. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations |
-| [Data load then replication-only]({% link molt/migrate-data-load-replicate-only.md %}) | `--mode data-load`, then `--mode replication-only` | Load source data first, then start replication in a separate task. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations |
-| [Resume replication]({% link molt/migrate-replicate-only.md %}) | `--mode replication-only` | Resume replication from a checkpoint after interruption. | Resuming interrupted migrations, post-load sync |
-| [Failback]({% link molt/migrate-failback.md %}) | `--mode failback` | Replicate changes from CockroachDB back to the source database. | [Rollback]({% link molt/migrate-failback.md %}) scenarios |
+{% include molt/crdb-to-crdb-migration.md %}
+
+| Migration flow                                                       | Tools                        | Description                                                                                   | Best for                                                                                                  |
+|----------------------------------------------------------------------|------------------------------|-----------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
+| [Bulk load]({% link molt/migrate-bulk-load.md %})                    | MOLT Fetch                   | Perform a one-time bulk load of source data into CockroachDB.                                 | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime)   |
+| [Load and replicate]({% link molt/migrate-load-replicate.md %})      | MOLT Fetch + MOLT Replicator | Load source data using MOLT Fetch, then replicate subsequent changes using MOLT Replicator.  | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations                 |
+| [Resume replication]({% link molt/migrate-resume-replication.md %})  | MOLT Replicator              | Resume replication from a checkpoint after interruption.                                      | Resuming interrupted migrations, post-load sync                                                           |
+| [Failback]({% link molt/migrate-failback.md %})                      | MOLT Replicator              | Replicate changes from CockroachDB back to the source database.                               | [Rollback]({% link molt/migrate-failback.md %}) scenarios                                                 |
### Bulk load
@@ -23,14 +24,13 @@ For migrations that tolerate downtime, use `data-load` mode to perform a one-tim
To minimize downtime during migration, MOLT Fetch supports replication streams that sync ongoing changes from the source database to CockroachDB. Instead of performing the entire data load during a planned downtime window, you can perform an initial load followed by continuous replication. Writes are only briefly paused to allow replication to drain before final cutover. The length of the pause depends on the volume of write traffic and the amount of replication lag between the source and CockroachDB.
-- Use `data-load-and-replication` mode to perform both steps in one task. Refer to [Load and Replicate]({% link molt/migrate-data-load-and-replication.md %}).
-- Use `data-load` followed by `replication-only` to perform the steps separately. Refer to [Load and Replicate Separately]({% link molt/migrate-data-load-replicate-only.md %}).
+- Use MOLT Fetch for data loading followed by MOLT Replicator for replication. Refer to [Load and replicate]({% link molt/migrate-load-replicate.md %}).
### Recovery and rollback strategies
If the migration is interrupted or you need to abort cutover, MOLT Fetch supports safe recovery flows:
-- Use `replication-only` to resume a previously interrupted replication stream. Refer to [Resume Replication]({% link molt/migrate-replicate-only.md %}).
+- Use MOLT Replicator to resume a previously interrupted replication stream. Refer to [Resume Replication]({% link molt/migrate-resume-replication.md %}).
- Use `failback` to reverse the migration, syncing changes from CockroachDB back to the original source. This ensures data consistency on the source so that you can retry later. Refer to [Migration Failback]({% link molt/migrate-failback.md %}).
## See also
diff --git a/src/current/molt/migration-overview.md b/src/current/molt/migration-overview.md
index f74cf9f9fd7..163dc26d8aa 100644
--- a/src/current/molt/migration-overview.md
+++ b/src/current/molt/migration-overview.md
@@ -28,7 +28,7 @@ A migration to CockroachDB generally follows this sequence:
1. Load data into CockroachDB: Use [MOLT Fetch]({% link molt/molt-fetch.md %}) to bulk-ingest your source data.
1. (Optional) Verify consistency before replication: Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the data loaded into CockroachDB is consistent with the source.
1. Finalize target schema: Recreate indexes or constraints on CockroachDB that you previously dropped to facilitate data load.
-1. Replicate ongoing changes: Enable continuous replication with [MOLT Fetch]({% link molt/molt-fetch.md %}#replication-only) to keep CockroachDB in sync with the source.
+1. Replicate ongoing changes: Enable continuous replication with [MOLT Replicator]({% link molt/molt-replicator.md %}) to keep CockroachDB in sync with the source.
1. Verify consistency before cutover: Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the CockroachDB data is consistent with the source.
1. Cut over to CockroachDB: Redirect application traffic to the CockroachDB cluster.
@@ -38,7 +38,7 @@ For more details, refer to [Migration flows](#migration-flows).
[MOLT (Migrate Off Legacy Technology)]({% link releases/molt.md %}) is a set of tools for schema conversion, data load, replication, and validation. Migrations with MOLT are resilient, restartable, and scalable to large data sets.
-MOLT [Fetch](#fetch) and [Verify](#verify) are CLI-based to maximize control, automation, and visibility during the data load and replication stages.
+MOLT [Fetch](#fetch), [Replicator](#replicator), and [Verify](#verify) are CLI-based to maximize control, automation, and visibility during the data load and replication stages.
@@ -55,10 +55,16 @@ MOLT [Fetch](#fetch) and [Verify](#verify) are CLI-based to maximize control, au
@@ -67,6 +73,8 @@ MOLT [Fetch](#fetch) and [Verify](#verify) are CLI-based to maximize control, au
+{% include molt/crdb-to-crdb-migration.md %}
+
### Schema Conversion Tool
The [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) converts a source database schema to a CockroachDB-compatible schema. The tool performs the following actions:
@@ -77,15 +85,22 @@ The [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
### Fetch
-[MOLT Fetch]({% link molt/molt-fetch.md %}) performs the core data migration to CockroachDB. It supports:
+[MOLT Fetch]({% link molt/molt-fetch.md %}) performs the initial data load to CockroachDB. It supports:
- [Multiple migration flows](#migration-flows) via `IMPORT INTO` or `COPY FROM`.
- Data movement via [cloud storage, local file servers, or direct copy]({% link molt/molt-fetch.md %}#data-path).
- [Concurrent data export]({% link molt/molt-fetch.md %}#best-practices) from multiple source tables and shards.
-- [Continuous replication]({% link molt/molt-fetch.md %}#replication-only), enabling you to minimize downtime before cutover.
- [Schema transformation rules]({% link molt/molt-fetch.md %}#transformations).
- After exporting data with `IMPORT INTO`, safe [continuation]({% link molt/molt-fetch.md %}#fetch-continuation) to retry failed or interrupted tasks from specific checkpoints.
-- [Failback]({% link molt/molt-fetch.md %}#failback), which replicates changes from CockroachDB back to the original source via a secure changefeed.
+
+### Replicator
+
+[MOLT Replicator]({% link molt/molt-replicator.md %}) provides continuous replication capabilities for minimal-downtime migrations. It supports:
+
+- Continuous replication from source databases to CockroachDB.
+- [Multiple consistency modes]({% link molt/molt-replicator.md %}#consistency-modes) for balancing throughput and transactional guarantees.
+- Failback replication from CockroachDB back to source databases.
+- [Performance tuning]({% link molt/molt-replicator.md %}#optimize-performance) for high-throughput workloads.
### Verify
@@ -97,39 +112,33 @@ The [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %})
## Migration flows
-MOLT Fetch supports various migration flows using [MOLT Fetch modes]({% link molt/molt-fetch.md %}#fetch-mode).
-
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
-{{site.data.alerts.end}}
+MOLT supports various migration flows using [MOLT Fetch]({% link molt/molt-fetch.md %}) for data loading and [MOLT Replicator]({% link molt/molt-replicator.md %}) for ongoing replication.
-| Migration flow | Mode | Description | Best for |
-|----------------------------------------------------------------------------------------|----------------------------------------------------|--------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|
-| [Bulk load]({% link molt/migrate-bulk-load.md %}) | `--mode data-load` | Perform a one-time bulk load of source data into CockroachDB. | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) |
-| [Data load and replication]({% link molt/migrate-data-load-and-replication.md %}) | `--mode data-load-and-replication` | Load source data, then replicate subsequent changes continuously. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations |
-| [Data load then replication-only]({% link molt/migrate-data-load-replicate-only.md %}) | `--mode data-load`, then `--mode replication-only` | Load source data first, then start replication in a separate task. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations |
-| [Resume replication]({% link molt/migrate-replicate-only.md %}) | `--mode replication-only` | Resume replication from a checkpoint after interruption. | Resuming interrupted migrations, post-load sync |
-| [Failback]({% link molt/migrate-failback.md %}) | `--mode failback` | Replicate changes from CockroachDB back to the source database. | [Rollback]({% link molt/migrate-failback.md %}) scenarios |
+| Migration flow | Tools | Description | Best for |
+|------------------------------------------------------------------------|------------------------------|----------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|
+| [Bulk load]({% link molt/migrate-bulk-load.md %}) | MOLT Fetch | Perform a one-time bulk load of source data into CockroachDB. | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) |
+| [Data load and replication]({% link molt/migrate-load-replicate.md %}) | MOLT Fetch + MOLT Replicator | Load source data with Fetch, then replicate subsequent changes continuously with Replicator. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations |
+| [Resume replication]({% link molt/migrate-resume-replication.md %}) | MOLT Replicator | Resume replication from a checkpoint after interruption. | Resuming interrupted migrations, post-load sync |
+| [Failback]({% link molt/migrate-failback.md %}) | MOLT Replicator | Replicate changes from CockroachDB back to the source database. | [Rollback]({% link molt/migrate-failback.md %}) scenarios |
### Bulk load
-For migrations that tolerate downtime, use `data-load` mode to perform a one-time bulk load of source data into CockroachDB. Refer to [Bulk Load]({% link molt/migrate-bulk-load.md %}).
+For migrations that tolerate downtime, use MOLT Fetch in `data-load` mode to perform a one-time bulk load of source data into CockroachDB. Refer to [Bulk Load]({% link molt/migrate-bulk-load.md %}).
### Migrations with minimal downtime
-To minimize downtime during migration, MOLT Fetch supports replication streams that continuously synchronize changes from the source database to CockroachDB. Instead of loading all data during a planned downtime window, you can run an initial load followed by continuous replication. Writes are paused only briefly to allow replication to drain before the final cutover. The duration of this pause depends on the volume of write traffic and the replication lag between the source and CockroachDB.
+To minimize downtime during migration, use MOLT Fetch for the initial data load, followed by MOLT Replicator for continuous replication. Rather than loading all data during a planned downtime window, you run an initial load and then continuously replicate ongoing changes. Writes are paused only briefly to allow replication to drain before the final cutover. The duration of this pause depends on the volume of write traffic and the replication lag between the source and CockroachDB.
-- Use `data-load-and-replication` mode to perform both steps in one task. Refer to [Load and Replicate]({% link molt/migrate-data-load-and-replication.md %}).
-- Use `data-load` followed by `replication-only` to perform the steps separately. Refer to [Load and Replicate Separately]({% link molt/migrate-data-load-replicate-only.md %}).
+Refer to [Load and Replicate]({% link molt/migrate-load-replicate.md %}) for detailed instructions.
### Recovery and rollback strategies
-If the migration is interrupted or cutover must be aborted, MOLT Fetch provides safe recovery options:
+If the migration is interrupted or cutover must be aborted, MOLT Replicator provides safe recovery options:
-- Use `replication-only` to resume a previously interrupted replication stream. Refer to [Resume Replication]({% link molt/migrate-replicate-only.md %}).
-- Use `failback` to reverse the migration, synchronizing changes from CockroachDB back to the original source. This ensures data consistency on the source so that you can retry the migration later. Refer to [Migration Failback]({% link molt/migrate-failback.md %}).
+- Resume a previously interrupted replication stream. Refer to [Resume Replication]({% link molt/migrate-resume-replication.md %}).
+- Use `failback` mode to reverse the migration, synchronizing changes from CockroachDB back to the original source. This ensures data consistency on the source so that you can retry the migration later. Refer to [Migration Failback]({% link molt/migrate-failback.md %}).
## See also
- [Migration Strategy]({% link molt/migration-strategy.md %})
-- [MOLT Releases]({% link releases/molt.md %})
\ No newline at end of file
+- [MOLT Releases]({% link releases/molt.md %})
diff --git a/src/current/molt/migration-strategy.md b/src/current/molt/migration-strategy.md
index e6616ba46fe..eb2b46f65f5 100644
--- a/src/current/molt/migration-strategy.md
+++ b/src/current/molt/migration-strategy.md
@@ -41,7 +41,7 @@ It's important to fully [prepare the migration](#prepare-for-migration) in order
- *Minimal downtime* impacts as few customers as possible, ideally without impacting their regular usage. If your application is intentionally offline at certain times (e.g., outside business hours), you can migrate the data without users noticing. Alternatively, if your application's functionality is not time-sensitive (e.g., it sends batched messages or emails), you can queue requests while the system is offline and process them after completing the migration to CockroachDB.
- MOLT enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime), using continuous replication of source changes to CockroachDB.
+ MOLT enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime), using [MOLT Replicator]({% link molt/molt-replicator.md %}) for continuous replication of source changes to CockroachDB.
- *Reduced functionality* takes some, but not all, application functionality offline. For example, you can disable writes but not reads while you migrate the application data, and queue data to be written after completing the migration.
@@ -151,7 +151,7 @@ Performing a dry run is highly recommended. In addition to demonstrating how lon
*Cutover* is the process of switching application traffic from the source database to CockroachDB. Once the source data is fully migrated to CockroachDB, switch application traffic to the new database to end downtime.
-MOLT enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime), using continuous replication of source changes to CockroachDB.
+MOLT enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime), using [MOLT Replicator]({% link molt/molt-replicator.md %}) for continuous replication of source changes to CockroachDB.
To safely cut over when using replication:
diff --git a/src/current/molt/molt-fetch.md b/src/current/molt/molt-fetch.md
index 04a44e8e1d5..7ba02734849 100644
--- a/src/current/molt/molt-fetch.md
+++ b/src/current/molt/molt-fetch.md
@@ -7,66 +7,56 @@ docs_area: migrate
MOLT Fetch moves data from a source database into CockroachDB as part of a [database migration]({% link molt/migration-overview.md %}).
-MOLT Fetch uses [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}) to move the source data to cloud storage (Google Cloud Storage, Amazon S3, or Azure Blob Storage), a local file server, or local memory. Once the data is exported, MOLT Fetch can load the data into a target CockroachDB database and replicate changes from the source database. For details, refer to [Migration phases](#migration-phases).
+MOLT Fetch uses [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}) to move the source data to cloud storage (Google Cloud Storage, Amazon S3, or Azure Blob Storage), a local file server, or local memory. Once the data is exported, MOLT Fetch loads the data into a target CockroachDB database. For details, refer to [Migration phases](#migration-phases).
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
-{{site.data.alerts.end}}
+## Terminology
+
+- *Shard*: A portion of a table's data, exported in parallel with other shards during the data export phase. Tables are divided into shards to enable parallel processing. For details, refer to [Table sharding](#table-sharding).
+- *Continuation token*: An identifier that marks the progress of a fetch task. Used to resume data loading from the point of interruption if a fetch task fails. For details, refer to [Fetch continuation](#fetch-continuation).
+- *Intermediate files*: Temporary data files written to cloud storage or a local file server during the data export phase. These files are used to stage exported data before importing it into CockroachDB during the data import phase. For details, refer to [Data path](#data-path).
+
+## Prerequisites
-## Supported databases
+### Supported databases
-The following source databases are currently supported:
+The following source databases are supported:
- PostgreSQL 11-16
- MySQL 5.7, 8.0 and later
- Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
-## Installation
-
-{% include molt/molt-install.md %}
-
-## Setup
-
-Complete the following items before using MOLT Fetch:
-
-- Follow the recommendations in [Best practices](#best-practices) and [Security recommendations](#security-recommendations).
-
-- Ensure that the source and target schemas are identical, unless you enable automatic schema creation with the [`drop-on-target-and-recreate`](#target-table-handling) option. If you are creating the target schema manually, review the behaviors in [Mismatch handling](#mismatch-handling).
+### Database configuration
-- Ensure that the SQL user running MOLT Fetch has [`SELECT` privileges]({% link {{site.current_cloud_version}}/grant.md %}#supported-privileges) on the source and target CockroachDB databases, along with the required privileges to run [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}#required-privileges) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}#required-privileges) (depending on the command used for [data movement](#data-load-mode)) on CockroachDB, as described on their respective pages.
+Ensure that the source and target schemas are identical, unless you enable automatic schema creation with the [`drop-on-target-and-recreate`](#target-table-handling) option. If you are creating the target schema manually, review the behaviors in [Mismatch handling](#mismatch-handling).
-- If you plan to use continuous replication, using either the MOLT Fetch [replication feature](#data-load-and-replication) or an [external change data capture (CDC) tool](#cdc-cursor), you must configure the source database for replication. Refer to the tutorial setup steps for [PostgreSQL]({% link molt/migrate-data-load-and-replication.md %}#configure-source-database-for-replication), [MySQL]({% link molt/migrate-data-load-and-replication.md %}?filters=mysql#configure-source-database-for-replication), and [Oracle]({% link molt/migrate-data-load-and-replication.md %}?filters=oracle#configure-source-database-for-replication).
-
- {{site.data.alerts.callout_success}}
- If you will not use replication (for example, during a [bulk load migration]({% link molt/migrate-bulk-load.md %}) or when doing a one-time data export from a read replica), you can skip replication setup and use the [`--ignore-replication-check`](#global-flags) flag with MOLT Fetch. This flag instructs MOLT Fetch to skip querying for replication checkpoints (such as `pg_current_wal_insert_lsn()` on PostgreSQL, `gtid_executed` on MySQL, and `CURRENT_SCN` on Oracle).
- {{site.data.alerts.end}}
+{{site.data.alerts.callout_info}}
+MOLT Fetch does not support migrating sequences. If your source database contains sequences, refer to the [guidance on indexing with sequential keys]({% link {{site.current_cloud_version}}/sql-faqs.md %}#how-do-i-generate-unique-slowly-increasing-sequential-numbers-in-cockroachdb). If a sequential key is necessary in your CockroachDB table, you must create it manually. After using MOLT Fetch to load the data onto the target, but before cutover, make sure to update each sequence's current value using [`setval()`]({% link {{site.current_cloud_version}}/functions-and-operators.md %}#sequence-functions) so that new inserts continue from the correct point.
+{{site.data.alerts.end}}
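+
+For example, a minimal sketch of advancing a sequence on the target after the load completes, assuming a hypothetical `employees` table whose `id` column is backed by a manually created sequence `employees_id_seq`:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Advance the sequence past the highest loaded value so that new inserts
+# do not collide with migrated rows (all names are hypothetical).
+cockroach sql --url 'postgresql://{username}:{password}@{host}:{port}/{database}' \
+  -e "SELECT setval('employees_id_seq', (SELECT max(id) FROM employees));"
+~~~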
-- URL-encode the connection strings for the source database and [CockroachDB]({% link {{site.current_cloud_version}}/connect-to-the-database.md %}). This ensures that the MOLT tools can parse special characters in your password.
+If you plan to use cloud storage for the data migration, follow the steps in [Cloud storage security](#cloud-storage-security).
- - Given a password `a$52&`, pass it to the `molt escape-password` command with single quotes:
+### User permissions
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- molt escape-password --password 'a$52&'
- ~~~
+The SQL user running MOLT Fetch requires specific privileges on both the source and target databases:
- Substitute the following encoded password in your original connection url string:
+| Database | Required Privileges | Details |
+|--------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
+| PostgreSQL source | `CONNECT` on database.<br>`USAGE` on schema.<br>`SELECT` on tables to migrate. | [Create PostgreSQL migration user]({% link molt/migrate-bulk-load.md %}#create-migration-user-on-source-database) |
+| MySQL source | `SELECT` on tables to migrate. | [Create MySQL migration user]({% link molt/migrate-bulk-load.md %}?filters=mysql#create-migration-user-on-source-database) |
+| Oracle source | `CONNECT` and `CREATE SESSION`.<br>`SELECT` and `FLASHBACK` on tables to migrate.<br>`SELECT` on metadata views (`ALL_USERS`, `DBA_USERS`, `DBA_OBJECTS`, `DBA_SYNONYMS`, `DBA_TABLES`). | [Create Oracle migration user]({% link molt/migrate-bulk-load.md %}?filters=oracle#create-migration-user-on-source-database) |
+| CockroachDB target | `SELECT`, `INSERT`, `UPDATE`, `DELETE` on target tables.<br>For `IMPORT INTO`: `SELECT`, `INSERT`, `DROP` on target tables. Optionally `EXTERNALIOIMPLICITACCESS` for implicit cloud storage authentication.<br>For `COPY FROM`: `admin` role. | [Create CockroachDB user]({% link molt/migrate-bulk-load.md %}#create-the-sql-user) |
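+
+As a hypothetical sketch, the PostgreSQL source privileges above could be granted as follows (the user, password, database, and schema names are placeholders):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Create a migration user and grant the read privileges listed above.
+psql 'postgresql://{admin_user}:{password}@{host}:{port}/{database}' <<'EOF'
+CREATE USER migration_user WITH PASSWORD 'changeme';
+GRANT CONNECT ON DATABASE {database} TO migration_user;
+GRANT USAGE ON SCHEMA migration_schema TO migration_user;
+GRANT SELECT ON ALL TABLES IN SCHEMA migration_schema TO migration_user;
+EOF
+~~~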
- ~~~
- a%2452%26
- ~~~
+## Installation
- - Use the encoded password in your connection string. For example:
+{% include molt/molt-install.md %}
- ~~~
- postgres://postgres:a%2452%26@localhost:5432/replicationload
- ~~~
+### Docker usage
-- If you plan to use cloud storage for the data migration, follow the steps in [Secure cloud storage](#secure-cloud-storage).
+{% include molt/molt-docker.md %}
## Migration phases
-MOLT Fetch operates in distinct phases to move data from source databases to CockroachDB. These phases can be executed together using an integrated mode ([`data-load-and-replication`]({% link molt/migrate-data-load-and-replication.md %})) or separately using targeted modes ([`data-load` and `replication-only`]({% link molt/migrate-data-load-replicate-only.md %})). For details on available modes, refer to [Fetch mode](#fetch-mode).
+MOLT Fetch operates in distinct phases to move data from source databases to CockroachDB. For details on available modes, refer to [Fetch mode](#fetch-mode).
### Data export phase
@@ -82,69 +72,6 @@ MOLT Fetch loads the exported data into the target CockroachDB database. The pro
- [Data movement](#data-load-mode)
- [Target table handling](#target-table-handling)
-### Replication phase
-
-For minimal-downtime migrations, MOLT Fetch can continuously replicate ongoing changes from the source. The system monitors source database transaction logs to capture ongoing writes, applies changes to CockroachDB in near real-time through streaming replication, and provides metrics to track replication lag and ensure data consistency. For details, refer to:
-
-- [Replicate changes](#replication-only)
-- [Resume replication](#resume-replication)
-
-## Best practices
-
-{{site.data.alerts.callout_success}}
-To verify that your connections and configuration work properly, run MOLT Fetch in a staging environment before migrating any data in production. Use a test or development environment that closely resembles production.
-{{site.data.alerts.end}}
-
-- To prevent connections from terminating prematurely during the [data export phase](#data-export-phase), set the following to high values on the source database:
-
- - **Maximum allowed number of connections.** MOLT Fetch can export data across multiple connections. The number of connections it will create is the number of shards ([`--export-concurrency`](#global-flags)) multiplied by the number of tables ([`--table-concurrency`](#global-flags)) being exported concurrently.
-
- {{site.data.alerts.callout_info}}
- With the default numerical range sharding, only tables with [primary key]({% link {{ site.current_cloud_version }}/primary-key.md %}) types of [`INT`]({% link {{ site.current_cloud_version }}/int.md %}), [`FLOAT`]({% link {{ site.current_cloud_version }}/float.md %}), or [`UUID`]({% link {{ site.current_cloud_version }}/uuid.md %}) can be sharded. PostgreSQL users can enable [`--use-stats-based-sharding`](#global-flags) to use statistics-based sharding for tables with primary keys of any data type. For details, refer to [Table sharding](#table-sharding).
- {{site.data.alerts.end}}
-
- - **Maximum lifetime of a connection.**
-
-- For PostgreSQL sources using [`--use-stats-based-sharding`](#global-flags), run [`ANALYZE`]({% link {{ site.current_cloud_version }}/create-statistics.md %}) on source tables before migration to ensure optimal shard distribution. This is especially important for large tables where even distribution can significantly improve export performance.
-
-- If a PostgreSQL database is set as a [source](#source-and-target-databases), ensure that [`idle_in_transaction_session_timeout`](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT) on PostgreSQL is either disabled or set to a value longer than the duration of the [data export phase](#data-export-phase). Otherwise, the connection will be prematurely terminated. To estimate the time needed to export the PostgreSQL tables, you can perform a dry run and sum the value of [`molt_fetch_table_export_duration_ms`](#metrics) for all exported tables.
-
-- To prevent memory outages during `READ COMMITTED` [data export](#data-export-phase) of tables with large rows, estimate the amount of memory used to export a table:
-
- ~~~
- --row-batch-size * --export-concurrency * average size of the table rows
- ~~~
-
- If you are exporting more than one table at a time (i.e., [`--table-concurrency`](#global-flags) is set higher than `1`), add the estimated memory usage for the tables with the largest row sizes. Ensure that you have sufficient memory to run `molt fetch`, and adjust `--row-batch-size` accordingly. For details on how concurrency and sharding interact, refer to [Table sharding](#table-sharding).
-
-- If a table in the source database is much larger than the other tables, [filter and export the largest table](#schema-and-table-selection) in its own `molt fetch` task. Repeat this for each of the largest tables. Then export the remaining tables in another task.
-
-- When using [`IMPORT INTO`](#data-load-mode) during the [data import phase](#data-import-phase) to load tables into CockroachDB, if the fetch task terminates before the import job completes, the hanging import job on the target database will keep the table offline. To make this table accessible again, [manually resume or cancel the job]({% link {{site.current_cloud_version}}/import-into.md %}#view-and-control-import-jobs). Then resume `molt fetch` using [continuation](#fetch-continuation), or restart the task from the beginning.
-
-- Ensure that the machine running MOLT Fetch is large enough to handle the amount of data being migrated. Fetch performance can sometimes be limited by available resources, but should always be making progress. To identify possible resource constraints, observe the `molt_fetch_rows_exported` [metric](#metrics) for decreases in the number of rows being processed. You can use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view metrics. For details on optimizing export performance through sharding, refer to [Table sharding](#table-sharding).
-
-- {% include molt/molt-drop-constraints-indexes.md %}
-
-- MOLT Fetch does not support migrating sequences. If your source database contains sequences, refer to the [guidance on indexing with sequential keys]({% link {{site.current_cloud_version}}/sql-faqs.md %}#how-do-i-generate-unique-slowly-increasing-sequential-numbers-in-cockroachdb). If a sequential key is necessary in your CockroachDB table, you must create it manually. After using MOLT Fetch to load and replicate the data onto the target, but before cutover, make sure to update each sequence's current value using [`setval()`]({% link {{site.current_cloud_version}}/functions-and-operators.md %}#sequence-functions) so that new inserts continue from the correct point.
-
-## Security recommendations
-
-Cockroach Labs **strongly** recommends the following:
-
-### Secure connections
-
-- Use secure connections to the source and [target CockroachDB database]({% link {{site.current_cloud_version}}/connection-parameters.md %}#additional-connection-parameters) whenever possible.
-- When performing [failback](#failback), use a secure changefeed connection by [overriding the default configuration](#changefeed-override-settings).
-- By default, insecure connections (i.e., `sslmode=disable` on PostgreSQL; `sslmode` not set on MySQL) are disallowed. When using an insecure connection, `molt fetch` returns an error. To override this check, you can enable the `--allow-tls-mode-disable` flag. Do this **only** when testing, or if a secure SSL/TLS connection to the source or target database is not possible.
-
-### Connection strings
-
-{% include molt/fetch-secure-connection-strings.md %}
-
-### Secure cloud storage
-
-{% include molt/fetch-secure-cloud-storage.md %}
-
## Commands
| Command | Usage |
@@ -161,58 +88,54 @@ Cockroach Labs **strongly** recommends the following:
### Global flags
-| Flag | Description |
-|------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `--source` | (Required) Connection string used to connect to the Oracle PDB (in a CDB/PDB architecture) or to a standalone database (non‑CDB). For details, refer to [Source and target databases](#source-and-target-databases). |
-| `--source-cdb` | Connection string for the Oracle container database (CDB) when using a multitenant (CDB/PDB) architecture. Omit this flag on a non‑multitenant Oracle database. For details, refer to [Source and target databases](#source-and-target-databases). |
-| `--target` | (Required) Connection string for the target database. For details, refer to [Source and target databases](#source-and-target-databases). |
-| `--allow-tls-mode-disable` | Allow insecure connections to databases. Secure SSL/TLS connections should be used by default. This should be enabled **only** if secure SSL/TLS connections to the source or target database are not possible. |
-| `--assume-role` | Service account to use for assume role authentication. `--use-implicit-auth` must be included. For example, `--assume-role='user-test@cluster-ephemeral.iam.gserviceaccount.com' --use-implicit-auth`. For details, refer to [Cloud Storage Authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}). |
-| `--bucket-path` | The path within the [cloud storage](#bucket-path) bucket where intermediate files are written (e.g., `'s3://bucket/path'` or `'gs://bucket/path'`). Only the URL path is used; query parameters (e.g., credentials) are ignored. To pass in query parameters, use the appropriate flags: `--assume-role`, `--import-region`, `--use-implicit-auth`. |
-| `--case-sensitive` | Toggle case sensitivity when comparing table and column names on the source and target. To disable case sensitivity, set `--case-sensitive=false`. If `=` is **not** included (e.g., `--case-sensitive false`), the flag is interpreted as `--case-sensitive` (i.e., `--case-sensitive=true`).<br>**Default:** `false` |
-| `--changefeeds-path` | Path to a JSON file that contains changefeed override settings for [failback](#failback), when enabled with `--mode failback`. If not specified, an insecure default configuration is used, and `--allow-tls-mode-disable` must be included. For details, see [Fail back to source database](#failback). |
-| `--cleanup` | Whether to delete intermediate files after moving data using [cloud or local storage](#data-path). **Note:** Cleanup does not occur on [continuation](#fetch-continuation). |
-| `--compression` | Compression method for data when using [`IMPORT INTO`](#data-load-mode) (`gzip`/`none`).<br>**Default:** `gzip` |
-| `--continuation-file-name` | Restart fetch at the specified filename if the task encounters an error. `--fetch-id` must be specified. For details, see [Fetch continuation](#fetch-continuation). |
-| `--continuation-token` | Restart fetch at a specific table, using the specified continuation token, if the task encounters an error. `--fetch-id` must be specified. For details, see [Fetch continuation](#fetch-continuation). |
-| `--crdb-pts-duration` | The duration for which each timestamp used in data export from a CockroachDB source is protected from garbage collection. This ensures that the data snapshot remains consistent. For example, if set to `24h`, each timestamp is protected for 24 hours from the initiation of the export job. This duration is extended at regular intervals specified in `--crdb-pts-refresh-interval`.<br>**Default:** `24h0m0s` |
-| `--crdb-pts-refresh-interval` | The frequency at which the protected timestamp's validity is extended. This interval maintains protection of the data snapshot until data export from a CockroachDB source is completed. For example, if set to `10m`, the protected timestamp's expiration will be extended by the duration specified in `--crdb-pts-duration` (e.g., `24h`) every 10 minutes while export is not complete.<br>**Default:** `10m0s` |
-| `--direct-copy` | Enables [direct copy](#direct-copy), which copies data directly from source to target without using an intermediate store. |
-| `--export-concurrency` | Number of shards to export at a time per table, each on a dedicated thread. This controls how many shards are created for each individual table during the [data export phase](#data-export-phase) and is distinct from `--table-concurrency`, which controls how many tables are processed simultaneously. The total number of concurrent threads is the product of `--export-concurrency` and `--table-concurrency`. Tables can be sharded with a range-based or stats-based mechanism. For details, refer to [Table sharding](#table-sharding).<br>**Default:** `4` |
-| `--filter-path` | Path to a JSON file defining row-level filters for the [data import phase](#data-import-phase). Refer to [Selective data movement](#selective-data-movement). |
-| `--fetch-id` | Restart fetch task corresponding to the specified ID. If `--continuation-file-name` or `--continuation-token` are not specified, fetch restarts for all failed tables. |
-| `--flush-rows` | Number of rows before the source data is flushed to intermediate files. **Note:** If `--flush-size` is also specified, the fetch behavior is based on the flag whose criterion is met first. |
-| `--flush-size` | Size (in bytes) before the source data is flushed to intermediate files. **Note:** If `--flush-rows` is also specified, the fetch behavior is based on the flag whose criterion is met first. |
-| `--ignore-replication-check` | Skip querying for replication checkpoints such as `pg_current_wal_insert_lsn()` on PostgreSQL, `gtid_executed` on MySQL, and `CURRENT_SCN` on Oracle. This option is intended for use in non-replication modes, such as scenarios with planned application downtime or testing. It eliminates the need for replication-specific configurations like GTID-based replication in MySQL, logical replication in PostgreSQL, and access to Oracle views such as `V$DATABASE`, `V$SESSION`, and `V$TRANSACTION`. |
-| `--import-batch-size` | The number of files to be imported at a time to the target database during the [data import phase](#data-import-phase). This applies only when using [`IMPORT INTO`](#data-load-mode) for data movement. **Note:** Increasing this value can improve the performance of full-scan queries on the target database shortly after fetch completes, but very high values are not recommended. If any individual file in the import batch fails, you must [retry](#fetch-continuation) the entire batch.<br>**Default:** `1000` |
-| `--import-region` | The region of the [cloud storage](#bucket-path) bucket. This applies only to [Amazon S3 buckets](#bucket-path). Set this flag only if you need to specify an `AWS_REGION` explicitly when using [`IMPORT INTO`](#data-load-mode) for data movement. For example, `--import-region=ap-south-1`. |
-| `--local-path` | The path within the [local file server](#local-path) where intermediate files are written (e.g., `data/migration/cockroach`). `--local-path-listen-addr` must be specified. |
-| `--local-path-crdb-access-addr` | Address of a [local file server](#local-path) that is **publicly accessible**. This flag is only necessary if CockroachDB cannot reach the local address specified with `--local-path-listen-addr` (e.g., when moving data to a CockroachDB {{ site.data.products.cloud }} deployment). `--local-path` and `--local-path-listen-addr` must be specified.<br>**Default:** Value of `--local-path-listen-addr`. |
-| `--local-path-listen-addr` | Write intermediate files to a [local file server](#local-path) at the specified address (e.g., `'localhost:3000'`). `--local-path` must be specified. |
-| `--log-file` | Write messages to the specified log filename. If no filename is provided, messages write to `fetch-{datetime}.log`. If `"stdout"` is provided, messages write to `stdout`. |
-| `--logging` | Level at which to log messages (`trace`/`debug`/`info`/`warn`/`error`/`fatal`/`panic`).<br>**Default:** `info` |
-| `--metrics-listen-addr` | Address of the Prometheus metrics endpoint, which has the path `{address}/metrics`. For details on important metrics to monitor, see [Metrics](#metrics).<br>**Default:** `'127.0.0.1:3030'` |
-| `--mode` | Configure the MOLT Fetch behavior: `data-load`, `data-load-and-replication`, `replication-only`, `export-only`, `import-only`, or `failback`. For details, refer to [Fetch mode](#fetch-mode).<br>**Default:** `data-load` |
-| `--non-interactive` | Run the fetch task without interactive prompts. This is recommended **only** when running `molt fetch` in an automated process (i.e., a job or continuous integration). |
-| `--pglogical-publication-name` | If set, the name of the [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html) that will be created or used for replication. Used in [`replication-only`](#replication-only) mode.<br>**Default:** `molt_fetch` |
-| `--pglogical-publication-and-slot-drop-and-recreate` | If set, drops the [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html) and slots if they exist and then recreates them. Used in [`replication-only`](#replication-only) mode. |
-| `--pglogical-replication-slot-name` | The name of a replication slot to create before taking a snapshot of data (e.g., `'fetch'`). **Required** in order to perform continuous [replication](#data-load-and-replication) from a source PostgreSQL database. |
-| `--pglogical-replication-slot-plugin` | The output plugin used for logical replication under `--pglogical-replication-slot-name`.<br>**Default:** `pgoutput` |
-| `--pprof-listen-addr` | Address of the pprof endpoint.<br>**Default:** `'127.0.0.1:3031'` |
-| `--replicator-flags` | If continuous [replication](#data-load-and-replication) is enabled with `--mode data-load-and-replication`, `--mode replication-only`, or `--mode failback`, specify [replication flags](#replication-flags) to override. For example: `--replicator-flags "--tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key"` |
-| `--row-batch-size` | Number of rows per shard to export at a time. For details on sharding, refer to [Table sharding](#table-sharding). See also [Best practices](#best-practices).<br>**Default:** `100000` |
-| `--schema-filter` | Move schemas that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression).<br>**Default:** `'.*'` |
-| `--skip-pk-check` | Skip primary-key matching to allow data load when source or target tables have missing or mismatched primary keys. Disables sharding and bypasses `--export-concurrency` and `--row-batch-size` settings. Refer to [Skip primary key matching](#skip-primary-key-matching).<br>**Default:** `false` |
-| `--table-concurrency` | Number of tables to export at a time. The number of concurrent threads is the product of `--export-concurrency` and `--table-concurrency`.<br>**Default:** `4` |
-| `--table-exclusion-filter` | Exclude tables that match a specified [POSIX regular expression](https://wikipedia.org/wiki/Regular_expression).<br>This value **cannot** be set to `'.*'`, which would cause every table to be excluded.<br>**Default:** Empty string |
-| `--table-filter` | Move tables that match a specified [POSIX regular expression](https://wikipedia.org/wiki/Regular_expression).<br>**Default:** `'.*'` |
-| `--table-handling` | How tables are initialized on the target database (`none`/`drop-on-target-and-recreate`/`truncate-if-exists`). For details, see [Target table handling](#target-table-handling).<br>**Default:** `none` |
-| `--transformations-file` | Path to a JSON file that defines transformations to be performed on the target schema during the fetch task. Refer to [Transformations](#transformations). |
-| `--type-map-file` | Path to a JSON file that contains explicit type mappings for automatic schema creation, when enabled with `--table-handling drop-on-target-and-recreate`. For details on the JSON format and valid type mappings, see [type mapping](#type-mapping). |
-| `--use-console-writer` | Use the console writer, which has cleaner log output but introduces more latency.<br>**Default:** `false` (log as structured JSON) |
-| `--use-copy` | Use [`COPY FROM`](#data-load-mode) to move data. This makes tables queryable during data load, but is slower than using `IMPORT INTO`. For details, refer to [Data movement](#data-load-mode). |
-| `--use-implicit-auth` | Use [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}) for [cloud storage](#bucket-path) URIs. |
-| `--use-stats-based-sharding` | Enable statistics-based sharding for PostgreSQL sources. This allows sharding of tables with primary keys of any data type and can create more evenly distributed shards compared to the default numerical range sharding. Requires PostgreSQL 11+ and access to `pg_stats`. For details, refer to [Table sharding](#table-sharding). |
+| Flag | Description |
+|---------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--source` | (Required) Connection string used to connect to the Oracle PDB (in a CDB/PDB architecture) or to a standalone database (non‑CDB). For details, refer to [Source and target databases](#source-and-target-databases). |
+| `--source-cdb` | Connection string for the Oracle container database (CDB) when using a multitenant (CDB/PDB) architecture. Omit this flag on a non‑multitenant Oracle database. For details, refer to [Source and target databases](#source-and-target-databases). |
+| `--target` | (Required) Connection string for the target database. For details, refer to [Source and target databases](#source-and-target-databases). |
+| `--allow-tls-mode-disable` | Allow insecure connections to databases. Secure SSL/TLS connections should be used by default. This should be enabled **only** if secure SSL/TLS connections to the source or target database are not possible. |
+| `--assume-role` | Service account to use for assume role authentication. `--use-implicit-auth` must be included. For example, `--assume-role='user-test@cluster-ephemeral.iam.gserviceaccount.com' --use-implicit-auth`. For details, refer to [Cloud Storage Authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}). |
+| `--bucket-path` | The path within the [cloud storage](#bucket-path) bucket where intermediate files are written (e.g., `'s3://bucket/path'` or `'gs://bucket/path'`). Only the URL path is used; query parameters (e.g., credentials) are ignored. To pass in query parameters, use the appropriate flags: `--assume-role`, `--import-region`, `--use-implicit-auth`. |
+| `--case-sensitive` | Toggle case sensitivity when comparing table and column names on the source and target. To disable case sensitivity, set `--case-sensitive=false`. If `=` is **not** included (e.g., `--case-sensitive false`), the flag is interpreted as `--case-sensitive` (i.e., `--case-sensitive=true`).<br>**Default:** `false` |
+| `--cleanup` | Whether to delete intermediate files after moving data using [cloud or local storage](#data-path). **Note:** Cleanup does not occur on [continuation](#fetch-continuation). |
+| `--compression` | Compression method for data when using [`IMPORT INTO`](#data-load-mode) (`gzip`/`none`).<br>**Default:** `gzip` |
+| `--continuation-file-name` | Restart fetch at the specified filename if the task encounters an error. `--fetch-id` must be specified. For details, see [Fetch continuation](#fetch-continuation). |
+| `--continuation-token` | Restart fetch at a specific table, using the specified continuation token, if the task encounters an error. `--fetch-id` must be specified. For details, see [Fetch continuation](#fetch-continuation). |
+| `--crdb-pts-duration` | The duration for which each timestamp used in data export from a CockroachDB source is protected from garbage collection. This ensures that the data snapshot remains consistent. For example, if set to `24h`, each timestamp is protected for 24 hours from the initiation of the export job. This duration is extended at regular intervals specified in `--crdb-pts-refresh-interval`.<br>**Default:** `24h0m0s` |
+| `--crdb-pts-refresh-interval` | The frequency at which the protected timestamp's validity is extended. This interval maintains protection of the data snapshot until data export from a CockroachDB source is completed. For example, if set to `10m`, the protected timestamp's expiration will be extended by the duration specified in `--crdb-pts-duration` (e.g., `24h`) every 10 minutes while export is not complete.<br>**Default:** `10m0s` |
+| `--direct-copy` | Enables [direct copy](#direct-copy), which copies data directly from source to target without using an intermediate store. |
+| `--export-concurrency` | Number of shards to export at a time per table, each on a dedicated thread. This controls how many shards are created for each individual table during the [data export phase](#data-export-phase) and is distinct from `--table-concurrency`, which controls how many tables are processed simultaneously. The total number of concurrent threads is the product of `--export-concurrency` and `--table-concurrency`. Tables can be sharded with a range-based or stats-based mechanism. For details, refer to [Table sharding](#table-sharding).<br>**Default:** `4` |
+| `--export-retry-max-attempts` | Maximum number of retry attempts for source export queries when connection failures occur. Only supported for PostgreSQL and CockroachDB sources.<br>**Default:** `3` |
+| `--export-retry-max-duration` | Maximum total duration for retrying source export queries. If `0`, no time limit is enforced. Only supported for PostgreSQL and CockroachDB sources.<br>**Default:** `5m0s` |
+| `--filter-path` | Path to a JSON file defining row-level filters for the [data import phase](#data-import-phase). Refer to [Selective data movement](#selective-data-movement). |
+| `--fetch-id` | Restart fetch task corresponding to the specified ID. If `--continuation-file-name` or `--continuation-token` are not specified, fetch restarts for all failed tables. |
+| `--flush-rows` | Number of rows before the source data is flushed to intermediate files. **Note:** If `--flush-size` is also specified, the fetch behavior is based on the flag whose criterion is met first. |
+| `--flush-size` | Size (in bytes) before the source data is flushed to intermediate files. **Note:** If `--flush-rows` is also specified, the fetch behavior is based on the flag whose criterion is met first. |
+| `--ignore-replication-check` | Skip querying for replication checkpoints such as `pg_current_wal_insert_lsn()` on PostgreSQL, `gtid_executed` on MySQL, and `CURRENT_SCN` on Oracle. This option is intended for use during bulk load migrations or when doing a one-time data export from a read replica. |
+| `--import-batch-size` | The number of files to be imported at a time to the target database during the [data import phase](#data-import-phase). This applies only when using [`IMPORT INTO`](#data-load-mode) for data movement. **Note:** Increasing this value can improve the performance of full-scan queries on the target database shortly after fetch completes, but very high values are not recommended. If any individual file in the import batch fails, you must [retry](#fetch-continuation) the entire batch.<br>**Default:** `1000` |
+| `--import-region` | The region of the [cloud storage](#bucket-path) bucket. This applies only to [Amazon S3 buckets](#bucket-path). Set this flag only if you need to specify an `AWS_REGION` explicitly when using [`IMPORT INTO`](#data-load-mode) for data movement. For example, `--import-region=ap-south-1`. |
+| `--local-path` | The path within the [local file server](#local-path) where intermediate files are written (e.g., `data/migration/cockroach`). `--local-path-listen-addr` must be specified. |
+| `--local-path-crdb-access-addr` | Address of a [local file server](#local-path) that is **publicly accessible**. This flag is only necessary if CockroachDB cannot reach the local address specified with `--local-path-listen-addr` (e.g., when moving data to a CockroachDB {{ site.data.products.cloud }} deployment). `--local-path` and `--local-path-listen-addr` must be specified.<br>**Default:** Value of `--local-path-listen-addr`. |
+| `--local-path-listen-addr` | Write intermediate files to a [local file server](#local-path) at the specified address (e.g., `'localhost:3000'`). `--local-path` must be specified. |
+| `--log-file` | Write messages to the specified log filename. If no filename is provided, messages write to `fetch-{datetime}.log`. If `"stdout"` is provided, messages write to `stdout`. |
+| `--logging` | Level at which to log messages (`trace`/`debug`/`info`/`warn`/`error`/`fatal`/`panic`).<br>**Default:** `info` |
+| `--metrics-listen-addr` | Address of the Prometheus metrics endpoint, which has the path `{address}/metrics`. For details on important metrics to monitor, refer to [Monitoring](#monitoring).<br>**Default:** `'127.0.0.1:3030'` |
+| `--mode` | Configure the MOLT Fetch behavior: `data-load`, `export-only`, or `import-only`. For details, refer to [Fetch mode](#fetch-mode).<br>**Default:** `data-load` |
+| `--non-interactive` | Run the fetch task without interactive prompts. This is recommended **only** when running `molt fetch` in an automated process (i.e., a job or continuous integration). |
+| `--pprof-listen-addr` | Address of the pprof endpoint.<br>**Default:** `'127.0.0.1:3031'` |
+| `--row-batch-size` | Number of rows per shard to export at a time. For details on sharding, refer to [Table sharding](#table-sharding). See also [Best practices](#best-practices).<br>**Default:** `100000` |
+| `--schema-filter` | Move schemas that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression).<br>**Default:** `'.*'` |
+| `--skip-pk-check` | Skip primary-key matching to allow data load when source or target tables have missing or mismatched primary keys. Disables sharding and bypasses `--export-concurrency` and `--row-batch-size` settings. Refer to [Skip primary key matching](#skip-primary-key-matching).<br>**Default:** `false` |
+| `--table-concurrency` | Number of tables to export at a time. The number of concurrent threads is the product of `--export-concurrency` and `--table-concurrency`.<br>**Default:** `4` |
+| `--table-exclusion-filter` | Exclude tables that match a specified [POSIX regular expression](https://wikipedia.org/wiki/Regular_expression).<br>This value **cannot** be set to `'.*'`, which would cause every table to be excluded.<br>**Default:** Empty string |
+| `--table-filter` | Move tables that match a specified [POSIX regular expression](https://wikipedia.org/wiki/Regular_expression).<br>**Default:** `'.*'` |
+| `--table-handling` | How tables are initialized on the target database (`none`/`drop-on-target-and-recreate`/`truncate-if-exists`). For details, see [Target table handling](#target-table-handling).<br>**Default:** `none` |
+| `--transformations-file` | Path to a JSON file that defines transformations to be performed on the target schema during the fetch task. Refer to [Transformations](#transformations). |
+| `--type-map-file` | Path to a JSON file that contains explicit type mappings for automatic schema creation, when enabled with `--table-handling drop-on-target-and-recreate`. For details on the JSON format and valid type mappings, see [type mapping](#type-mapping). |
+| `--use-console-writer` | Use the console writer, which has cleaner log output but introduces more latency.<br>**Default:** `false` (log as structured JSON) |
+| `--use-copy` | Use [`COPY FROM`](#data-load-mode) to move data. This makes tables queryable during data load, but is slower than using `IMPORT INTO`. For details, refer to [Data movement](#data-load-mode). |
+| `--use-implicit-auth` | Use [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}) for [cloud storage](#bucket-path) URIs. |
+| `--use-stats-based-sharding` | Enable statistics-based sharding for PostgreSQL sources. This allows sharding of tables with primary keys of any data type and can create more evenly distributed shards compared to the default numerical range sharding. Requires PostgreSQL 11+ and access to `pg_stats`. For details, refer to [Table sharding](#table-sharding). |
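+
+For example, a hypothetical invocation that tunes export parallelism (connection strings and bucket path are placeholders). With these settings, MOLT Fetch opens up to `--export-concurrency` x `--table-concurrency` = 8 concurrent export threads:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# 4 shards per table x 2 tables at a time = up to 8 concurrent export threads.
+molt fetch \
+--source 'postgresql://{username}:{password}@{host}:{port}/{database}' \
+--target 'postgresql://{username}:{password}@{host}:{port}/{database}' \
+--bucket-path 's3://{bucket}/{path}' \
+--table-handling truncate-if-exists \
+--export-concurrency 4 \
+--table-concurrency 2
+~~~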
### `tokens list` flags
@@ -222,7 +145,6 @@ Cockroach Labs **strongly** recommends the following:
| `--conn-string` | (Required) Connection string for the target database. For details, see [List active continuation tokens](#list-active-continuation-tokens). |
| `-n`, `--num-results` | Number of results to return. |
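+
+For example, a sketch of listing up to 10 active continuation tokens (the connection string is a placeholder):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+molt fetch tokens list \
+--conn-string 'postgresql://{username}:{password}@{host}:{port}/{database}' \
+--num-results 10
+~~~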
-{% include molt/replicator-flags.md %}
## Usage
@@ -231,293 +153,72 @@ The following sections describe how to use the `molt fetch` [flags](#flags).
### Source and target databases
{{site.data.alerts.callout_success}}
-Follow the recommendations in [Connection strings](#connection-strings).
+Follow the recommendations in [Connection security](#connection-security).
{{site.data.alerts.end}}
-#### `--source`
-
`--source` specifies the connection string of the source database.
-PostgreSQL or CockroachDB:
+PostgreSQL or CockroachDB connection string:
{% include_cached copy-clipboard.html %}
~~~
--source 'postgresql://{username}:{password}@{host}:{port}/{database}'
~~~
-MySQL:
+MySQL connection string:
{% include_cached copy-clipboard.html %}
~~~
--source 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
~~~
-Oracle:
+Oracle connection string:
{% include_cached copy-clipboard.html %}
~~~
--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'
~~~
-In Oracle migrations, the `--source` connection string specifies a PDB (in [Oracle Multitenant databases](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html)) or single database. The `{username}` corresponds to the owner of the tables you will migrate.
-
-#### `--source-cdb`
-
-The `--source-cdb` flag specifies the connection string for the Oracle container database (CDB) in an Oracle Multitenant deployment. Omit this flag on a non‑multitenant Oracle database.
+For Oracle Multitenant databases, `--source-cdb` specifies the container database (CDB) connection, and `--source` specifies the pluggable database (PDB):
{% include_cached copy-clipboard.html %}
~~~
---source oracle://{username}:{password}@{host}:{port}/{service_name}
---source-cdb oracle://{username}:{password}@{host}:{port}/{container_service}
+--source 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'
+--source-cdb 'oracle://{username}:{password}@{host}:{port}/{cdb_service_name}'
~~~
-#### `--target`
-
`--target` specifies the [CockroachDB connection string]({% link {{site.current_cloud_version}}/connection-parameters.md %}#connect-using-a-url):
{% include_cached copy-clipboard.html %}
~~~
---target 'postgresql://{username}:{password}@{host}:{port}/{database}
+--target 'postgresql://{username}:{password}@{host}:{port}/{database}'
~~~
### Fetch mode
-`--mode` specifies the MOLT Fetch behavior:
+`--mode` specifies the MOLT Fetch behavior.
-- [Load data into CockroachDB](#data-load)
-- [Load data and replicate changes to CockroachDB](#data-load-and-replication)
-- [Replicate changes to CockroachDB](#replication-only)
-- [Export and import the data only](#export-only-and-import-only)
-- [Fail back to source database](#failback)
-
-{{site.data.alerts.callout_danger}}
-MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
-{{site.data.alerts.end}}
-
-#### `data-load`
-
-`data-load` (default) instructs MOLT Fetch to load the source data into CockroachDB. It does not replicate any subsequent changes on the source.
+`data-load` (default) instructs MOLT Fetch to load the source data into CockroachDB:
{% include_cached copy-clipboard.html %}
~~~
--mode data-load
~~~
-If the source is a PostgreSQL database and you intend to [replicate changes](#replication-only) afterward, **also** specify a replication slot name with `--pglogical-replication-slot-name`. MOLT Fetch will create a replication slot with this name. For example, the following snippet instructs MOLT Fetch to create a slot named `replication_slot` to use for replication:
-
-{% include_cached copy-clipboard.html %}
-~~~
---mode data-load
---pglogical-replication-slot-name 'replication_slot'
-~~~
-
-{{site.data.alerts.callout_success}}
-In case you need to rename your [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html), also include `--pglogical-publication-name` to specify the new publication name and `--pglogical-publication-and-slot-drop-and-recreate` to ensure that the publication and replication slot are created in the correct order. For details on these flags, refer to [Global flags](#global-flags).
-{{site.data.alerts.end}}
-
-#### `data-load-and-replication`
-
-{{site.data.alerts.callout_info}}
-Before using this option, the source database **must** be configured for continuous replication, as described in [Setup](#replication-setup).
-{{site.data.alerts.end}}
-
-`data-load-and-replication` instructs MOLT Fetch to load the source data into CockroachDB, and replicate any subsequent changes on the source. This enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime).
-
-{% include_cached copy-clipboard.html %}
-~~~
---mode data-load-and-replication
-~~~
-
-If the source is a PostgreSQL database, you **must** also specify a replication slot name with `--pglogical-replication-slot-name`. MOLT Fetch will create a replication slot with this name. For example, the following snippet instructs MOLT Fetch to create a slot named `replication_slot` to use for replication:
-
-{% include_cached copy-clipboard.html %}
-~~~
---mode data-load-and-replication
---pglogical-replication-slot-name 'replication_slot'
-~~~
-
-{{site.data.alerts.callout_success}}
-In case you need to rename your [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html), also include `--pglogical-publication-name` to specify the new publication name and `--pglogical-publication-and-slot-drop-and-recreate` to ensure that the publication and replication slot are created in the correct order. For details on these flags, refer to [Global flags](#global-flags).
-{{site.data.alerts.end}}
-
-Continuous replication begins once the initial load is complete, as indicated by a `fetch complete` message in the output. If replication is interrupted, you can [resume replication](#resume-replication).
-
-To cancel replication, enter `ctrl-c` to issue a `SIGTERM` signal. This returns an exit code `0`. If replication fails, a non-zero exit code is returned.
-
-To customize the replication behavior (an advanced use case), use `--replicator-flags` to specify one or more [replication-specific flags](#replication-flags).
-
-{% include_cached copy-clipboard.html %}
-~~~
---mode data-load-and-replication
---replicator-flags "--applyTimeout '1h' --parallelism 64"
-~~~
-
-#### `replication-only`
-
-{{site.data.alerts.callout_info}}
-Before using this option:
-
-- The source database **must** be configured for continuous replication, as described in [Setup](#replication-setup).
-- The `replicator` binary **must** be located either in the same directory as `molt` or in a directory directly beneath `molt`. Both `molt` and `replicator` must be in your current working directory for MOLT to locate the replicator binary.
-{{site.data.alerts.end}}
-
-`replication-only` instructs MOLT Fetch to replicate ongoing changes on the source to CockroachDB, using the specified replication marker. This assumes you have already run [`--mode data-load`](#data-load) to load the source data into CockroachDB. This enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime).
-
-- For a PostgreSQL source, you should have already created a replication slot when [loading data](#data-load). Specify the same replication slot name using `--pglogical-replication-slot-name`. For example:
-
- {% include_cached copy-clipboard.html %}
- ~~~
- --mode replication-only
- --pglogical-replication-slot-name 'replication_slot'
- ~~~
-
- {{site.data.alerts.callout_success}}
- In case you want to run `replication-only` without already having loaded data (e.g., for testing), also include `--pglogical-publication-and-slot-drop-and-recreate` to ensure that the publication and replication slot are created in the correct order. For details on this flag, refer to [Global flags](#global-flags).
- {{site.data.alerts.end}}
-
-
-- For a MySQL source, replication requires specifying a starting GTID set with the `--defaultGTIDSet` replication flag. After the initial data load completes, locate the [`cdc_cursor`](#cdc-cursor) value in the `fetch complete` log output and use it as the GTID set. For example:
-
- {% include_cached copy-clipboard.html %}
- ~~~ shell
- --mode replication-only \
- --replicator-flags "--defaultGTIDSet b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21"
- ~~~
-
-If replication is interrupted, you can [resume replication](#resume-replication).
-
-##### Resume replication
-
-`replication-only` can be used to resume replication if it is interrupted in either `data-load-and-replication` or `replication-only` mode.
-
-Specify the staging schema with the [`--stagingSchema` replication flag](#replication-flags). MOLT Fetch outputs the schema name as `staging database name: {schema_name}` after the initial replication run.
-
-{% include_cached copy-clipboard.html %}
-~~~
---mode replication-only
---replicator-flags "--stagingSchema {schema_name}"
-~~~
-
-You **must** include the `--stagingSchema` replication flag when resuming replication, as the schema provides a replication marker for streaming changes.
-
-To cancel replication, enter `ctrl-c` to issue a `SIGTERM` signal. This returns an exit code `0`. If replication fails, a non-zero exit code is returned.
-
-#### `export-only` and `import-only`
-
-`export-only` instructs MOLT Fetch to export the source data to the specified [cloud storage](#bucket-path) or [local file server](#local-path). It does not load the data into CockroachDB.
+`export-only` instructs MOLT Fetch to export the source data to the specified [cloud storage](#bucket-path) or [local file server](#local-path). It does not load the data into CockroachDB:
{% include_cached copy-clipboard.html %}
~~~
--mode export-only
~~~
-`import-only` instructs MOLT Fetch to load the source data in the specified [cloud storage](#bucket-path) or [local file server](#local-path) into the CockroachDB target.
+`import-only` instructs MOLT Fetch to load the source data in the specified [cloud storage](#bucket-path) or [local file server](#local-path) into the CockroachDB target:
{% include_cached copy-clipboard.html %}
~~~
--mode import-only
~~~
-#### `failback`
-
-{{site.data.alerts.callout_danger}}
-Before using `failback` mode, refer to the [technical advisory]({% link advisories/a123371.md %}) about a bug that affects changefeeds on CockroachDB v22.2, v23.1.0 to v23.1.21, v23.2.0 to v23.2.5, and testing versions of v24.1 through v24.1.0-rc.1.
-{{site.data.alerts.end}}
-
-If you encounter issues after moving data to CockroachDB, you can use `failback` mode to replicate changes on CockroachDB back to the initial source database. In case you need to roll back the migration, this ensures that data is consistent on the initial source database.
-
-`failback` mode creates a [CockroachDB changefeed]({% link {{ site.current_cloud_version }}/change-data-capture-overview.md %}) and sets up a [webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink) to pass change events from CockroachDB to the failback target. In production, you should **secure the connection** by specifying [changefeed override settings](#changefeed-override-settings) in a JSON file. These settings override the [default insecure changefeed](#default-insecure-changefeed) values, which are suited for testing only. Include the [`--changefeeds-path`](#global-flags) flag to indicate the path to the JSON file.
-
-{% include_cached copy-clipboard.html %}
-~~~
---mode failback
---changefeeds-path 'changefeed-settings.json'
-~~~
-
-When running `molt fetch --mode failback`, `--source` is the CockroachDB connection string and `--target` is the connection string of the database you migrated from. `--table-filter` specifies the tables to watch for change events. For example:
-
-{% include_cached copy-clipboard.html %}
-~~~
---source 'postgresql://{username}:{password}@{host}:{port}/{database}'
---target 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
---table-filter 'employees|payments'
-~~~
-
-{{site.data.alerts.callout_info}}
-MySQL 5.7 and later are supported as MySQL failback targets.
-{{site.data.alerts.end}}
-
-##### Changefeed override settings
-
-You can specify the following [`CREATE CHANGEFEED` parameters]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#parameters) in the override JSON. If any parameter is not specified, its [default value](#default-insecure-changefeed) is used.
-
-- The following [`CREATE CHANGEFEED` sink URI parameters]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#sink-uri):
- - `host`: The hostname or IP address of the [webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink) where change events are sent. The applicable certificates of the failback target (i.e., the [source database](#source-and-target-databases) from which you migrated) **must** be located on this machine.
- - `port`: The port of the [webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink).
- - `sink_query_parameters`: A comma-separated list of [`CREATE CHANGEFEED` query parameters]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#query-parameters). This includes the base64-encoded client certificate ([`client_cert`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#client-cert)), key ([`client_key`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#client-key)), and CA ([`ca_cert`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#ca-cert)) for a secure webhook sink.
-- The following [`CREATE CHANGEFEED` options]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#options):
- - [`resolved`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#resolved)
- - [`min_checkpoint_frequency`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#min-checkpoint-frequency)
- - [`initial_scan`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#initial-scan)
- - [`webhook_sink_config`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#webhook-sink-config)
-
-{{site.data.alerts.callout_info}}
-If there is already a running CockroachDB changefeed with the same webhook sink URL (excluding query parameters) and [watched tables]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}), the existing changefeed is used for `failback`.
-{{site.data.alerts.end}}
-
-**Use a secure changefeed connection whenever possible.** The [default insecure configuration](#default-insecure-changefeed) is **not** recommended in production. To secure the changefeed connection, define `sink_query_parameters` in the JSON as follows:
-
-{% include_cached copy-clipboard.html %}
-~~~ json
-{
- "sink_query_parameters": "client_cert={base64 cert}&client_key={base64 key}&ca_cert={base64 CA cert}"
-}
-~~~
-
-`client_cert`, `client_key`, and `ca_cert` are [webhook sink parameters]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-parameters) that must be base64- and URL-encoded (for example, use the command `base64 -i ./client.crt | jq -R -r '@uri'`).
-
-In the `molt fetch` command, also include [`--replicator-flags`](#failback-replication-flags) to specify the paths to the server certificate and key that correspond to the client certs defined in `sink_query_parameters`. For example:
-
-{% include_cached copy-clipboard.html %}
-~~~
---changefeeds-path 'changefeed-secure.json'
---replicator-flags "--tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key"
-~~~
-
-For a complete example of using `molt fetch` in `failback` mode, see [Fail back securely from CockroachDB](#fail-back-securely-from-cockroachdb).
-
-##### Default insecure changefeed
-
-{{site.data.alerts.callout_danger}}
-Insecure configurations are **not** recommended. In production, run failback with a secure changefeed connection. For details, see [Changefeed override settings](#changefeed-override-settings).
-{{site.data.alerts.end}}
-
-When `molt fetch --mode failback` is run without specifying `--changefeeds-path`, the following [`CREATE CHANGEFEED` parameters]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#parameters) are used for the changefeed:
-
-~~~ json
-{
- "host": "localhost",
- "port": 30004,
- "sink_query_parameters": "insecure_tls_skip_verify=true",
- "resolved": "1s",
- "min_checkpoint_frequency": "1s",
- "initial_scan": "no",
- "webhook_sink_config": "{\"Flush\":{\"Bytes\":1048576}}"
-}
-~~~
-
-The default parameters specify a local webhook sink (`"localhost"`) and an insecure sink connection (`"insecure_tls_skip_verify=true"`), which are suited for testing only. In order to run `failback` with the default insecure configuration, you must also include the following flags:
-
-{% include_cached copy-clipboard.html %}
-~~~
---allow-tls-mode-disable
---replicator-flags '--tlsSelfSigned --disableAuthentication'
-~~~
-
-{{site.data.alerts.callout_info}}
-Either `--changefeeds-path`, which overrides the default insecure configuration; or `--allow-tls-mode-disable`, which enables the use of the default insecure configuration, must be specified in `failback` mode. Otherwise, `molt fetch` will error.
-{{site.data.alerts.end}}
-
### Data load mode
MOLT Fetch can use either [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}) to load data into CockroachDB.
@@ -603,10 +304,10 @@ Because stats-based sharding analyzes the entire table, running `--use-stats-bas
MOLT Fetch can move the source data to CockroachDB via [cloud storage](#bucket-path), a [local file server](#local-path), or [directly](#direct-copy) without an intermediate store.
-#### `--bucket-path`
+#### Bucket path
{{site.data.alerts.callout_success}}
-Only the path specified in `--bucket-path` is used. Query parameters, such as credentials, are ignored. To authenticate cloud storage, follow the steps in [Secure cloud storage](#secure-cloud-storage).
+Only the path specified in `--bucket-path` is used. Query parameters, such as credentials, are ignored. To authenticate cloud storage, follow the steps in [Secure cloud storage](#cloud-storage-security).
{{site.data.alerts.end}}
`--bucket-path` instructs MOLT Fetch to write intermediate files to a path within [Google Cloud Storage](https://cloud.google.com/storage/docs/buckets), [Amazon S3](https://aws.amazon.com/s3/), or [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs) to which you have the necessary permissions. Use additional [flags](#global-flags), shown in the following examples, to specify authentication or region parameters as required for bucket access.
@@ -640,7 +341,7 @@ Connect to an Azure Blob Storage container with [implicit authentication]({% lin
--use-implicit-auth
~~~
-#### `--local-path`
+#### Local path
`--local-path` instructs MOLT Fetch to write intermediate files to a path within a [local file server]({% link {{site.current_cloud_version}}/use-a-local-file-server.md %}). `local-path-listen-addr` specifies the address of the local file server. For example:
@@ -665,7 +366,7 @@ For example, if you are migrating to CockroachDB {{ site.data.products.cloud }},
[Cloud storage](#bucket-path) is often preferable to a local file server, which can require considerable disk space.
{{site.data.alerts.end}}
-#### `--direct-copy`
+#### Direct copy
`--direct-copy` specifies that MOLT Fetch should use `COPY FROM` to move the source data directly to CockroachDB without an intermediate store:
@@ -698,7 +399,7 @@ By default, MOLT Fetch moves all data from the [`--source`](#source-and-target-d
### Selective data movement
-Use `--filter-path` to specify the path to a JSON file that defines row-level filtering for data load. This enables you to move a subset of data in a table, rather than all data in the table. To apply row-level filters during replication, you need a [userscript](#filter-path-userscript-for-replication).
+Use `--filter-path` to specify the path to a JSON file that defines row-level filtering for data load. This enables you to move a subset of data in a table, rather than all data in the table. To apply row-level filters during replication, use [MOLT Replicator]({% link molt/molt-replicator.md %}) with userscripts.
{% include_cached copy-clipboard.html %}
~~~
@@ -741,10 +442,9 @@ The JSON file should contain one or more entries in `filters`, each with a `reso
If the expression references columns that are not indexed, MOLT Fetch will emit a warning like: `filter expression 'v > 100' contains column 'v' which is not indexed. This may lead to performance issues.`
{{site.data.alerts.end}}
+{% comment %}
#### `--filter-path` userscript for replication
-The `--filter-path` flag applies **only** when loading data, and is ignored for replication. For example, when `--filter-path` is combined with [`--mode data-load-and-replication`](#data-load-and-replication), rows are only filtered on data load. During subsequent replication, all rows on the migrated tables will appear on the target.
-
To use `--filter-path` with replication, create and save a TypeScript userscript (e.g., `filter-script.ts`). The following script ensures that only rows where `v > 100` are replicated to `defaultdb.public.t1`:
{% include_cached copy-clipboard.html %}
@@ -762,12 +462,13 @@ api.configureSource("defaultdb.public", {
});
~~~
-When you run `molt fetch`, apply the userscript with the `--userscript` [replication flag](#replication-flags):
+Apply the userscript with the `--userscript` replication flag:
{% include_cached copy-clipboard.html %}
~~~
---replicator-flags "--userscript 'filter-script.ts'"
+--userscript 'filter-script.ts'
~~~
+{% endcomment %}
### Target table handling
@@ -798,34 +499,15 @@ When using the `drop-on-target-and-recreate` option, MOLT Fetch creates a new Co
#### Mismatch handling
-If either [`none`](#target-table-handling) or [`truncate-if-exists`](#target-table-handling) is set, `molt fetch` loads data into the existing tables on the target CockroachDB database. If the target schema mismatches the source schema, `molt fetch` will exit early in [certain cases](#exit-early), and will need to be re-run from the beginning.
+If either [`none`](#target-table-handling) or [`truncate-if-exists`](#target-table-handling) is set, `molt fetch` loads data into the existing tables on the target CockroachDB database. If the target schema mismatches the source schema, `molt fetch` will exit early in certain cases, and will need to be re-run from the beginning. For details, refer to [Fetch exits early due to mismatches](#fetch-exits-early-due-to-mismatches).
{{site.data.alerts.callout_info}}
This does not apply when [`drop-on-target-and-recreate`](#target-table-handling) is specified, since this option automatically creates a compatible CockroachDB schema.
{{site.data.alerts.end}}
-`molt fetch` exits early in the following cases, and will output a log with a corresponding `mismatch_tag` and `failable_mismatch` set to `true`:
-
-- A source table is missing a primary key.
-- A source and table primary key have mismatching types.
- {{site.data.alerts.callout_success}}
- These restrictions (missing or mismatching primary keys) can be bypassed with [`--skip-pk-check`](#skip-primary-key-matching).
- {{site.data.alerts.end}}
-
-- A [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) primary key has a different [collation]({% link {{site.current_cloud_version}}/collate.md %}) on the source and target.
-- A source and target column have mismatching types that are not [allowable mappings](#type-mapping).
-- A target table is missing a column that is in the corresponding source table.
-- A source column is nullable, but the corresponding target column is not nullable (i.e., the constraint is more strict on the target).
-
-`molt fetch` can continue in the following cases, and will output a log with a corresponding `mismatch_tag` and `failable_mismatch` set to `false`:
-
-- A target table has a column that is not in the corresponding source table.
-- A source column has a `NOT NULL` constraint, and the corresponding target column is nullable (i.e., the constraint is less strict on the target).
-- A [`DEFAULT`]({% link {{site.current_cloud_version}}/default-value.md %}), [`CHECK`]({% link {{site.current_cloud_version}}/check.md %}), [`FOREIGN KEY`]({% link {{site.current_cloud_version}}/foreign-key.md %}), or [`UNIQUE`]({% link {{site.current_cloud_version}}/unique.md %}) constraint is specified on a target column and not on the source column.
-
#### Skip primary key matching
-`--skip-pk-check` removes the [requirement that source and target tables share matching primary keys](#exit-early) for data load. When this flag is set:
+`--skip-pk-check` removes the [requirement that source and target tables share matching primary keys](#fetch-exits-early-due-to-mismatches) for data load. When this flag is set:
- The data load proceeds even if the source or target table lacks a primary key, or if their primary key columns do not match.
- [Table sharding](#table-sharding) is disabled. Each table is exported in a single batch within one shard, bypassing `--export-concurrency` and `--row-batch-size`. As a result, memory usage and execution time may increase due to full table scans.
@@ -1048,12 +730,12 @@ SHOW CREATE TABLE computed;
### Fetch continuation
-If MOLT Fetch fails while loading data into CockroachDB from intermediate files, it exits with an error message, fetch ID, and [continuation token](#list-active-continuation-tokens) for each table that failed to load on the target database. You can use this information to continue the task from the *continuation point* where it was interrupted. For an example, see [Continue fetch after encountering an error](#continue-fetch-after-encountering-an-error).
+If MOLT Fetch fails while loading data into CockroachDB from intermediate files, it exits with an error message, fetch ID, and [continuation token](#list-active-continuation-tokens) for each table that failed to load on the target database. You can use this information to continue the task from the *continuation point* where it was interrupted.
Continuation is only possible under the following conditions:
- All data has been exported from the source database into intermediate files on [cloud](#bucket-path) or [local storage](#local-path).
-- The *initial load* of source data into the target CockroachDB database is incomplete. This means that ongoing [replication](#data-load-and-replication) of source data has not begun.
+- The *initial load* of source data into the target CockroachDB database is incomplete.
{{site.data.alerts.callout_info}}
Only one fetch ID and set of continuation tokens, each token corresponding to a table, are active at any time. See [List active continuation tokens](#list-active-continuation-tokens).
@@ -1117,244 +799,199 @@ A change data capture (CDC) cursor is written to the output as `cdc_cursor` at t
{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"}
~~~
-Use the `cdc_cursor` value as the starting GTID set for MySQL replication by passing it to the `--defaultGTIDSet` replication flag (refer to [Replication flags](#replication-flags)).
+Use the `cdc_cursor` value as the checkpoint for MySQL or Oracle replication with [MOLT Replicator]({% link molt/molt-replicator.md %}#replication-checkpoints).
You can also use the `cdc_cursor` value with an external change data capture (CDC) tool to continuously replicate subsequent changes from the source database to CockroachDB.
-### Metrics
+## Security
-By default, MOLT Fetch exports [Prometheus](https://prometheus.io/) metrics at `127.0.0.1:3030/metrics`. You can configure this endpoint with the `--metrics-listen-addr` [flag](#global-flags).
+Cockroach Labs strongly recommends the following security practices.
+
+### Connection security
+
+{% include molt/molt-secure-connection-strings.md %}
{{site.data.alerts.callout_info}}
-If [replication](#fetch-mode) is active, metrics from the `replicator` process are enabled by setting the `--metricsAddr` [replication flag](#replication-flags), and are served at `http://host:port/_/varz`.
+By default, insecure connections (i.e., `sslmode=disable` on PostgreSQL; `sslmode` not set on MySQL) are disallowed. When using an insecure connection, `molt fetch` returns an error. To override this check, you can enable the `--allow-tls-mode-disable` flag. Do this **only** when testing, or if a secure SSL/TLS connection to the source or target database is not possible.
{{site.data.alerts.end}}
-Cockroach Labs recommends monitoring the following metrics:
+### Cloud storage security
-| Metric Name | Description |
-|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
-| `molt_fetch_num_tables` | Number of tables that will be moved from the source. |
-| `molt_fetch_num_task_errors` | Number of errors encountered by the fetch task. |
-| `molt_fetch_overall_duration` | Duration (in seconds) of the fetch task. |
-| `molt_fetch_rows_exported` | Number of rows that have been exported from a table. For example: `molt_fetch_rows_exported{table="public.users"}` |
-| `molt_fetch_rows_imported` | Number of rows that have been imported from a table. For example: `molt_fetch_rows_imported{table="public.users"}` |
-| `molt_fetch_table_export_duration_ms` | Duration (in milliseconds) of a table's export. For example: `molt_fetch_table_export_duration_ms{table="public.users"}` |
-| `molt_fetch_table_import_duration_ms` | Duration (in milliseconds) of a table's import. For example: `molt_fetch_table_import_duration_ms{table="public.users"}` |
+{% include molt/fetch-secure-cloud-storage.md %}
-You can also use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view the preceding metrics.
+## Common workflows
-## Docker usage
+### Bulk data load
-{% include molt/molt-docker.md %}
+To perform a bulk data load migration from your source database to CockroachDB, run the `molt fetch` command with the required flags.
-## Examples
+Specify the source and target database connections. For connection string formats, refer to [Source and target databases](#source-and-target-databases):
-The following examples demonstrate how to issue `molt fetch` commands to load data into CockroachDB. These examples assume that [secure connections](#secure-connections) to the source and target database are used.
+{% include_cached copy-clipboard.html %}
+~~~
+--source $SOURCE
+--target $TARGET
+~~~
-{{site.data.alerts.callout_success}}
-After successfully running MOLT Fetch, you can run [`molt verify`]({% link molt/molt-verify.md %}) to confirm that replication worked successfully without missing or mismatched rows.
-{{site.data.alerts.end}}
+Specify how to move data to CockroachDB. Use [cloud storage](#bucket-path) for intermediate file storage:
-### Load PostgreSQL data via S3 with continuous replication
+{% include_cached copy-clipboard.html %}
+~~~
+--bucket-path 's3://bucket/path'
+~~~
-The following `molt fetch` command uses [`IMPORT INTO`](#data-load-mode) to load a subset of tables from a PostgreSQL database to CockroachDB.
+Alternatively, use a [local file server](#local-path) for intermediate storage:
{% include_cached copy-clipboard.html %}
-~~~ shell
-molt fetch \
---source 'postgres://postgres:postgres@localhost/molt' \
---target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
---table-handling 'truncate-if-exists' \
---table-filter 'employees' \
---bucket-path 's3://migration/data/cockroach' \
---cleanup \
---pglogical-replication-slot-name 'replication_slot' \
---mode data-load-and-replication
+~~~
+--local-path '/migration/data/cockroach'
+--local-path-listen-addr 'localhost:3000'
~~~
-- `--table-handling` specifies that existing tables on CockroachDB should be truncated before the source data is loaded.
-- `--table-filter` filters for tables with the `employees` string in the name.
-- `--bucket-path` specifies a directory on an [Amazon S3 bucket](#data-path) where intermediate files will be written.
-- `--cleanup` specifies that the intermediate files should be removed after the source data is loaded.
-- `--pglogical-replication-slot-name` specifies a replication slot name to be created on the source PostgreSQL database. This is used in continuous [replication](#data-load-and-replication).
-- `--mode data-load-and-replication` starts continuous [replication](#data-load-and-replication) of data from the source database to CockroachDB after the fetch task succeeds.
-
-If the fetch task succeeds, the output displays a `fetch complete` message like the following:
+Alternatively, use [direct copy](#direct-copy) to move data to CockroachDB without intermediate storage:
-~~~ json
-{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
+{% include_cached copy-clipboard.html %}
+~~~
+--direct-copy
~~~
-{{site.data.alerts.callout_info}}
-If the fetch task encounters an error, it will exit and can be [continued](#continue-fetch-after-encountering-an-error).
-{{site.data.alerts.end}}
+Optionally, filter which schemas and tables to migrate. By default, all schemas and tables are migrated. For details, refer to [Schema and table selection](#schema-and-table-selection):
-Continuous [replication](#data-load-and-replication) begins immediately afterward:
+{% include_cached copy-clipboard.html %}
+~~~
+--schema-filter 'public'
+--table-filter '.*user.*'
+~~~
-~~~ json
-{"level":"info","time":"2024-05-13T14:33:07-04:00","message":"starting replicator"}
-{"level":"info","time":"2024-05-13T14:36:22-04:00","message":"creating publication"}
+Specify how to handle target tables. By default, `--table-handling` is set to `none`, which loads data without changing existing data in the tables. For details, refer to [Target table handling](#target-table-handling):
+
+{% include_cached copy-clipboard.html %}
~~~
+--table-handling truncate-if-exists
+~~~
+
+When performing a bulk load without subsequent replication, use `--ignore-replication-check` to skip querying for replication checkpoints (such as `pg_current_wal_insert_lsn()` on PostgreSQL, `gtid_executed` on MySQL, and `CURRENT_SCN` on Oracle). This is appropriate when:
-To cancel replication, enter `ctrl-c` to issue a `SIGTERM` signal.
+- Performing a one-time data migration with no plan to replicate ongoing changes.
+- Exporting data from a read replica where replication checkpoints are unavailable.
-### Load MySQL data via GCP with continuous replication
+{% include_cached copy-clipboard.html %}
+~~~
+--ignore-replication-check
+~~~
-The following `molt fetch` command uses [`COPY FROM`](#data-load-mode) to load a subset of tables from a MySQL database to CockroachDB.
+At minimum, the `molt fetch` command should include the source, target, data path, and `--ignore-replication-check` flags:
{% include_cached copy-clipboard.html %}
~~~ shell
molt fetch \
---source 'mysql://root:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \
---target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
---table-handling 'truncate-if-exists' \
---table-filter 'employees' \
---bucket-path 'gs://migration/data/cockroach' \
---use-copy \
---cleanup \
---mode data-load-and-replication
-~~~
-
-- `--source` specifies the MySQL connection string and the certificates in URL-encoded format. Secure connections should be used by default. Refer to [Setup](#setup).
-- `--table-handling` specifies that existing tables on CockroachDB should be truncated before the source data is loaded.
-- `--table-filter` filters for tables with the `employees` string in the name.
-- `--bucket-path` specifies a directory on an [Google Cloud Storage bucket](#data-path) where intermediate files will be written.
-- `--use-copy` specifies that `COPY FROM` is used to load the tables, keeping the source tables online and queryable but loading the data more slowly than `IMPORT INTO`.
-- `--cleanup` specifies that the intermediate files should be removed after the source data is loaded.
-- `--mode data-load-and-replication` starts continuous [replication](#data-load-and-replication) of data from the source database to CockroachDB after the fetch task succeeds.
-
-If the fetch task succeeds, the output displays a `fetch complete` message like the following:
-
-~~~ json
-{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
+--source $SOURCE \
+--target $TARGET \
+--bucket-path 's3://bucket/path' \
+--ignore-replication-check
~~~
-{{site.data.alerts.callout_info}}
-If the fetch task encounters an error, it will exit and can be [continued](#continue-fetch-after-encountering-an-error).
-{{site.data.alerts.end}}
+For detailed steps, refer to [Bulk load migration]({% link molt/migrate-bulk-load.md %}).
-Continuous [replication](#data-load-and-replication) begins immediately afterward:
+### Load before replication
-~~~ json
-{"level":"info","time":"2024-05-13T14:33:07-04:00","message":"starting replicator"}
-~~~
+To perform an initial data load before setting up ongoing replication with [MOLT Replicator]({% link molt/molt-replicator.md %}), run the `molt fetch` command without `--ignore-replication-check`. This captures replication checkpoints during the data load.
-To cancel replication, enter `ctrl-c` to issue a `SIGTERM` signal.
+The workflow is the same as [Bulk data load](#bulk-data-load), except:
-### Load CockroachDB data via direct copy
+- Exclude `--ignore-replication-check`. MOLT Fetch will query and record replication checkpoints.
+- After the data load completes, check the [CDC cursor](#cdc-cursor) in the output for the checkpoint value to use with MOLT Replicator.
-The following `molt fetch` command uses [`COPY FROM`](#data-load-mode) to load all tables directly from one CockroachDB database to another.
+At minimum, the `molt fetch` command should include the source, target, and data path flags:
{% include_cached copy-clipboard.html %}
~~~ shell
molt fetch \
---source 'postgres://root@localhost:26257/defaultdb?sslmode=disable' \
---target 'postgres://root@localhost:26258/defaultdb?sslmode=disable' \
---table-handling 'none' \
---direct-copy \
---allow-tls-mode-disable
+--source $SOURCE \
+--target $TARGET \
+--bucket-path 's3://bucket/path'
~~~
-- `--source` specifies `sslmode=disable` to establish an insecure connection. By default, insecure connections are disallowed and should be used **only** for testing or if a secure SSL/TLS connection to the source or target database is not possible.
-- `--table-handling` specifies that existing tables on the target CockroachDB database should not be modified before the source data is loaded.
-- `--direct-copy` specifies that `COPY FROM` is used to load the tables directly, without creating intermediate files.
-- `--allow-tls-mode-disable` enables insecure connections to the source and target databases. Refer to [Secure connections](#secure-connections).
-
-### Continue fetch after encountering an error
-
-If the fetch task encounters an error, it exits with an error message, fetch ID, and continuation token for each table that failed to load on the target database. You can use these values to [continue the fetch task](#fetch-continuation) from where it was interrupted.
+The output will include a `cdc_cursor` value at the end of the fetch task:
~~~ json
-{"level":"info","table":"public.tbl1","file_name":"shard_01_part_00000001.csv","message":"creating or updating token for duplicate key value violates unique constraint \"tbl1_pkey\"; Key (id)=(22) already exists."}
-{"level":"info","table":"public.tbl1","continuation_token":"5e7c7173-101c-4539-9b8d-28fad37d0240","message":"created continuation token"}
-{"level":"info","fetch_id":"87bf8dc0-803c-4e26-89d5-3352576f92a7","message":"continue from this fetch ID"}
+{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"}
~~~
-To retry a specific table, reissue the initial `molt fetch` command and include the fetch ID and a continuation token:
+Use this `cdc_cursor` value when starting MOLT Replicator to ensure replication begins from the correct position. For detailed steps, refer to [Load and replicate]({% link molt/migrate-load-replicate.md %}).
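+
+For example, for a MySQL source, pass the value to Replicator's `--defaultGTIDSet` flag (using the `cdc_cursor` value from the sample output above):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--defaultGTIDSet 'b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21'
+~~~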
-{{site.data.alerts.callout_success}}
-You can use `molt fetch tokens list` to list all active continuation tokens. Refer to [List active continuation tokens](#list-active-continuation-tokens).
-{{site.data.alerts.end}}
+## Monitoring
-{% include_cached copy-clipboard.html %}
-~~~ shell
-molt fetch \
-... \
---fetch-id '87bf8dc0-803c-4e26-89d5-3352576f92a7' \
---continuation-token '5e7c7173-101c-4539-9b8d-28fad37d0240'
-~~~
+### Metrics
-To retry all tables that failed, exclude `--continuation-token` from the command. When prompted, type `y` to clear all active continuation tokens. To avoid the prompt (e.g., when running `molt fetch` in a job), include the `--non-interactive` flag:
+By default, MOLT Fetch exports [Prometheus](https://prometheus.io/) metrics at `127.0.0.1:3030/metrics`. You can configure this endpoint with the `--metrics-listen-addr` [flag](#global-flags).
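+
+For example, to serve metrics on a different address (the value shown is illustrative):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--metrics-listen-addr '0.0.0.0:3030'
+~~~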
-{% include_cached copy-clipboard.html %}
-~~~ shell
-molt fetch \
-... \
---fetch-id '87bf8dc0-803c-4e26-89d5-3352576f92a7' \
---non-interactive
-~~~
+Cockroach Labs recommends monitoring the following metrics:
-### Fail back securely from CockroachDB
+| Metric Name | Description |
+|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
+| `molt_fetch_num_tables` | Number of tables that will be moved from the source. |
+| `molt_fetch_num_task_errors` | Number of errors encountered by the fetch task. |
+| `molt_fetch_overall_duration` | Duration (in seconds) of the fetch task. |
+| `molt_fetch_rows_exported` | Number of rows that have been exported from a table. For example: `molt_fetch_rows_exported{table="public.users"}` |
+| `molt_fetch_rows_imported` | Number of rows that have been imported from a table. For example: `molt_fetch_rows_imported{table="public.users"}` |
+| `molt_fetch_table_export_duration_ms` | Duration (in milliseconds) of a table's export. For example: `molt_fetch_table_export_duration_ms{table="public.users"}` |
+| `molt_fetch_table_import_duration_ms` | Duration (in milliseconds) of a table's import. For example: `molt_fetch_table_import_duration_ms{table="public.users"}` |
-{{site.data.alerts.callout_danger}}
-Before using `failback` mode, refer to the [technical advisory]({% link advisories/a123371.md %}) about a bug that affects changefeeds on CockroachDB v22.2, v23.1.0 to v23.1.21, v23.2.0 to v23.2.5, and testing versions of v24.1 through v24.1.0-rc.1.
-{{site.data.alerts.end}}
+You can also use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view the preceding metrics.
-The following `molt fetch` command uses [`failback` mode](#failback) to securely replicate changes from CockroachDB back to a MySQL database. This assumes that you migrated data from MySQL to CockroachDB, and want to keep the data consistent on MySQL in case you need to roll back the migration.
+## Best practices
-{% include_cached copy-clipboard.html %}
-~~~ shell
-molt fetch \
---source 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
---target 'mysql://root:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \
---table-filter 'employees|payments' \
---non-interactive \
---logging debug \
---replicator-flags "--tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key" \
---mode failback \
---changefeeds-path 'changefeed-secure.json'
-~~~
-
-- `--source` specifies the connection string of the CockroachDB database to which you migrated.
-- `--target` specifies the connection string of the MySQL database acting as the failback target.
-- `--table-filter` specifies that the `employees` and `payments` tables should be watched for change events.
-- `--replicator-flags` specifies the paths to the server certificate (`--tlsCertificate`) and key (`--tlsPrivateKey`) that correspond to the client certs defined by `sink_query_parameters` in the changefeed override JSON file.
-- `--changefeeds-path` specifies the path to `changefeed-secure.json`, which contains the following setting override:
-
- {% include_cached copy-clipboard.html %}
- ~~~ json
- {
- "sink_query_parameters": "client_cert={base64 cert}&client_key={base64 key}&ca_cert={base64 CA cert}"
- }
+### Test and validate
+
+Before migrating any data in production, verify that your connections and configuration work properly by running MOLT Fetch in a test or development environment that closely resembles production.
+
+### Configure the source database and connection
+
+- To prevent connections from terminating prematurely during the [data export phase](#data-export-phase), set the following to high values on the source database:
+
+ - **Maximum allowed number of connections.** MOLT Fetch can export data across multiple connections. The number of connections it will create is the number of shards ([`--export-concurrency`](#global-flags)) multiplied by the number of tables ([`--table-concurrency`](#global-flags)) being exported concurrently.
+
+ {{site.data.alerts.callout_info}}
+ With the default numerical range sharding, only tables with [primary key]({% link {{ site.current_cloud_version }}/primary-key.md %}) types of [`INT`]({% link {{ site.current_cloud_version }}/int.md %}), [`FLOAT`]({% link {{ site.current_cloud_version }}/float.md %}), or [`UUID`]({% link {{ site.current_cloud_version }}/uuid.md %}) can be sharded. PostgreSQL users can enable [`--use-stats-based-sharding`](#global-flags) to use statistics-based sharding for tables with primary keys of any data type. For details, refer to [Table sharding](#table-sharding).
+ {{site.data.alerts.end}}
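+
+    For example, with `--export-concurrency 4` and `--table-concurrency 4` (illustrative values), MOLT Fetch can open up to 4 * 4 = 16 concurrent connections to the source.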
+
+ - **Maximum lifetime of a connection.**
+
+- If a PostgreSQL database is set as a [source](#source-and-target-databases), ensure that [`idle_in_transaction_session_timeout`](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT) on PostgreSQL is either disabled or set to a value longer than the duration of the [data export phase](#data-export-phase). Otherwise, the connection will be prematurely terminated. To estimate the time needed to export the PostgreSQL tables, you can perform a dry run and sum the value of [`molt_fetch_table_export_duration_ms`](#monitoring) for all exported tables.
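+
+  For example, a minimal way to check and disable the timeout on the source PostgreSQL database (the role name `molt_user` is illustrative):
+
+  {% include_cached copy-clipboard.html %}
+  ~~~ sql
+  -- Check the current value:
+  SHOW idle_in_transaction_session_timeout;
+
+  -- Disable the timeout (0 = disabled) for the migration user only:
+  ALTER ROLE molt_user SET idle_in_transaction_session_timeout = 0;
+  ~~~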
+
+### Optimize performance
+
+- {% include molt/molt-drop-constraints-indexes.md %}
+
+- For PostgreSQL sources using [`--use-stats-based-sharding`](#global-flags), run [`ANALYZE`]({% link {{ site.current_cloud_version }}/create-statistics.md %}) on source tables before migration to ensure optimal shard distribution. This is especially important for large tables where even distribution can significantly improve export performance.
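+
+  For example, on a source PostgreSQL database (the table name is illustrative):
+
+  {% include_cached copy-clipboard.html %}
+  ~~~ sql
+  ANALYZE employees;
+  ~~~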
+
+- To prevent memory outages during `READ COMMITTED` [data export](#data-export-phase) of tables with large rows, estimate the amount of memory used to export a table:
+
+ ~~~
+ --row-batch-size * --export-concurrency * average size of the table rows
~~~
- `client_cert`, `client_key`, and `ca_cert` are [webhook sink parameters]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-parameters) that must be base64- and URL-encoded (for example, use the command `base64 -i ./client.crt | jq -R -r '@uri'`).
+ If you are exporting more than one table at a time (i.e., [`--table-concurrency`](#global-flags) is set higher than `1`), add the estimated memory usage for the tables with the largest row sizes. Ensure that you have sufficient memory to run `molt fetch`, and adjust `--row-batch-size` accordingly. For details on how concurrency and sharding interact, refer to [Table sharding](#table-sharding).
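+
+  For example, assuming a `--row-batch-size` of 100,000 rows, an `--export-concurrency` of 4 shards, and an average row size of 1 KB (all illustrative values):
+
+  ~~~
+  100,000 rows * 4 shards * 1 KB per row = 400 MB per exported table
+  ~~~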
- {{site.data.alerts.callout_success}}
- For details on the default changefeed settings and how to override them, see [Changefeed override settings](#changefeed-override-settings).
- {{site.data.alerts.end}}
+- If a table in the source database is much larger than the other tables, [filter and export the largest table](#schema-and-table-selection) in its own `molt fetch` task. Repeat this for each of the largest tables. Then export the remaining tables in another task.
-The preceding `molt fetch` command issues the equivalent [`CREATE CHANGEFEED`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}) command, using the default and explicitly overriden changefeed settings:
+- Ensure that the machine running MOLT Fetch is large enough to handle the amount of data being migrated. Fetch performance can be limited by available resources, but the task should always make steady progress. To identify possible resource constraints, observe the `molt_fetch_rows_exported` [metric](#monitoring) for decreases in the number of rows being processed. You can use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view metrics. For details on optimizing export performance through sharding, refer to [Table sharding](#table-sharding).
-{% include_cached copy-clipboard.html %}
-~~~ sql
-CREATE CHANGEFEED FOR TABLE employees, payments
- INTO 'webhook-https://localhost:30004/defaultdb/public?client_cert={base64 cert}&client_key={base64 key}&ca_cert={base64 CA cert}'
- WITH updated, resolved = '1s', min_checkpoint_frequency = '1s', initial_scan = 'no', cursor = '2024-09-11T16:33:35Z', webhook_sink_config = '{\"Flush\":{\"Bytes\":1048576,\"Frequency\":\"1s\"}}'
-~~~
+### Import and continuation handling
-The initial output looks like the following:
+- When using [`IMPORT INTO`](#data-load-mode) during the [data import phase](#data-import-phase) to load tables into CockroachDB, if the fetch task terminates before the import job completes, the hanging import job on the target database will keep the table offline. To make this table accessible again, [manually resume or cancel the job]({% link {{site.current_cloud_version}}/import-into.md %}#view-and-control-import-jobs). Then resume `molt fetch` using [continuation](#fetch-continuation), or restart the task from the beginning.
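+
+  For example, a minimal sketch of restoring access to the offline table on the target CockroachDB cluster (the job ID shown is illustrative):
+
+  {% include_cached copy-clipboard.html %}
+  ~~~ sql
+  -- Find the ID of the hanging IMPORT job:
+  SHOW JOBS;
+
+  -- Then either resume or cancel the job:
+  RESUME JOB 1234567890123456789;
+  CANCEL JOB 1234567890123456789;
+  ~~~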
-~~~
-INFO [Sep 11 11:03:54] Replicator starting -buildmode=exe -compiler=gc CGO_CFLAGS= CGO_CPPFLAGS= CGO_CXXFLAGS= CGO_ENABLED=1 CGO_LDFLAGS= GOARCH=arm64 GOOS=darwin vcs=git vcs.modified=true vcs.revision=c948b78081a37aacf37a82eac213aa91a2828f92 vcs.time="2024-08-19T13:39:37Z"
-INFO [Sep 11 11:03:54] Server listening address="[::]:30004"
-DEBUG [Sep 11 11:04:00] httpRequest="&{0x14000156ea0 0 401 32 101.042µs false false}"
-DEBUG [Sep 11 11:04:00] httpRequest="&{0x14000018b40 0 401 32 104.417µs false false}"
-DEBUG [Sep 11 11:04:01] httpRequest="&{0x140000190e0 0 401 32 27.958µs false false}"
-~~~
+## Troubleshooting
+
+{% include molt/molt-troubleshooting-fetch.md %}
## See also
- [Migration Overview]({% link molt/migration-overview.md %})
- [Migration Strategy]({% link molt/migration-strategy.md %})
+- [MOLT Replicator]({% link molt/molt-replicator.md %})
- [MOLT Verify]({% link molt/molt-verify.md %})
+- [Load and replicate]({% link molt/migrate-load-replicate.md %})
+- [Resume Replication]({% link molt/migrate-resume-replication.md %})
+- [Migration Failback]({% link molt/migrate-failback.md %})
\ No newline at end of file
diff --git a/src/current/molt/molt-replicator.md b/src/current/molt/molt-replicator.md
new file mode 100644
index 00000000000..1a395a61a18
--- /dev/null
+++ b/src/current/molt/molt-replicator.md
@@ -0,0 +1,697 @@
+---
+title: MOLT Replicator
+summary: Learn how to use the MOLT Replicator tool to continuously replicate changes from source databases to CockroachDB.
+toc: true
+docs_area: migrate
+---
+
+MOLT Replicator continuously replicates changes from a source database to CockroachDB as part of a [database migration]({% link molt/migration-overview.md %}). It supports migrations to CockroachDB with minimal downtime, and can also replicate changes from CockroachDB back to your source database in failback scenarios, preserving a rollback option during the migration window.
+
+MOLT Replicator consumes change data from PostgreSQL [logical replication](https://www.postgresql.org/docs/current/logical-replication.html) streams, MySQL [GTID-based replication](https://dev.mysql.com/doc/refman/8.0/en/replication-gtids.html), Oracle [LogMiner](https://docs.oracle.com/en/database/oracle/oracle-database/21/sutil/oracle-logminer-utility.html), and [CockroachDB changefeeds]({% link {{ site.current_cloud_version }}/change-data-capture-overview.md %}) (for failback). For details, refer to [How it works](#how-it-works).
+
+## Terminology
+
+- *Checkpoint*: The position in the source database's transaction log from which replication begins or resumes: LSN (PostgreSQL), GTID (MySQL), or SCN (Oracle).
+- *Staging database*: A CockroachDB database used by Replicator to store replication metadata, checkpoints, and buffered mutations. Specified with `--stagingSchema` and automatically created with `--stagingCreateSchema`. For details, refer to [Staging database](#staging-database).
+- *Forward replication*: Replicate changes from a source database (PostgreSQL, MySQL, or Oracle) to CockroachDB during a migration. For usage details, refer to [Forward replication with initial load](#forward-replication-with-initial-load).
+- *Failback*: Replicate changes from CockroachDB back to the source database. Used for migration rollback or to maintain data consistency on the source during migration. For usage details, refer to [Failback to source database](#failback-to-source-database).
+
+## Prerequisites
+
+### Supported databases
+
+MOLT Replicator supports the following source and target databases:
+
+- PostgreSQL 11-16
+- MySQL 5.7, 8.0 and later
+- Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
+- CockroachDB (all currently [supported versions]({% link releases/release-support-policy.md %}#supported-versions))
+
+### Database configuration
+
+The source database must be configured for replication:
+
+| Database | Configuration Requirements | Details |
+|-------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------|
+| PostgreSQL source | <ul><li>Enable logical replication by setting `wal_level = logical`.</li></ul> | [Configure PostgreSQL for replication]({% link molt/migrate-load-replicate.md %}#configure-source-database-for-replication) |
+| MySQL source | <ul><li>Enable [global transaction identifiers (GTID)](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html) and configure binary logging. Set `binlog-row-metadata` or `binlog-row-image` to `full`.</li><li>Configure sufficient binlog retention for the migration duration.</li></ul> | [Configure MySQL for replication]({% link molt/migrate-load-replicate.md %}?filters=mysql#configure-source-database-for-replication) |
+| Oracle source | <ul><li>Install [Oracle Instant Client]({% link molt/migrate-load-replicate.md %}?filters=oracle#oracle-instant-client).</li><li>[Enable `ARCHIVELOG` mode]({% link molt/migrate-load-replicate.md %}?filters=oracle#enable-archivelog-and-force-logging), supplemental logging for primary keys, and `FORCE LOGGING`.</li><li>[Create the sentinel table]({% link molt/migrate-load-replicate.md %}#create-source-sentinel-table) (`_replicator_sentinel`) in the source schema.</li><li>Grant and verify [LogMiner privileges]({% link molt/migrate-load-replicate.md %}#grant-logminer-privileges).</li></ul> | [Configure Oracle for replication]({% link molt/migrate-load-replicate.md %}?filters=oracle#configure-source-database-for-replication) |
+| CockroachDB source (failback) | | [Configure CockroachDB for replication]({% link molt/migrate-failback.md %}#prepare-the-cockroachdb-cluster) |
+
+### User permissions
+
+The SQL user running MOLT Replicator requires specific privileges on both the source and target databases:
+
+| Database | Required Privileges | Details |
+|------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| PostgreSQL source | <ul><li>`SUPERUSER` role (recommended), or the following granular permissions:<ul><li>`CREATE` and `SELECT` on the database and tables to replicate.</li><li>Table ownership for adding tables to publications.</li><li>`LOGIN` and `REPLICATION` privileges to create replication slots and access replication data.</li></ul></li></ul> | [Create PostgreSQL migration user]({% link molt/migrate-load-replicate.md %}#create-migration-user-on-source-database) |
+| MySQL source | <ul><li>`SELECT` on tables to replicate.</li><li>`REPLICATION SLAVE` and `REPLICATION CLIENT` privileges for binlog access.</li><li>For `--fetchMetadata`, either `SELECT` on the source database or `PROCESS` globally.</li></ul> | [Create MySQL migration user]({% link molt/migrate-load-replicate.md %}?filters=mysql#create-migration-user-on-source-database) |
+| Oracle source | <ul><li>`SELECT`, `INSERT`, and `UPDATE` on the `_replicator_sentinel` table.</li><li>`SELECT` on `V$` views (`V$LOG`, `V$LOGFILE`, `V$LOGMNR_CONTENTS`, `V$ARCHIVED_LOG`, `V$LOG_HISTORY`).</li><li>`SELECT` on `SYS.V$LOGMNR_*` views (`SYS.V$LOGMNR_DICTIONARY`, `SYS.V$LOGMNR_LOGS`, `SYS.V$LOGMNR_PARAMETERS`, `SYS.V$LOGMNR_SESSION`).</li><li>`LOGMINING` privilege.</li><li>`EXECUTE` on `DBMS_LOGMNR`.</li><li>For Oracle Multitenant, the user must be a common user (prefixed with `C##`) with privileges granted on both the CDB and PDB.</li></ul> | <ul><li>[Create Oracle migration user]({% link molt/migrate-load-replicate.md %}?filters=oracle#create-migration-user-on-source-database)</li><li>[Create sentinel table]({% link molt/migrate-load-replicate.md %}#create-source-sentinel-table)</li><li>[Grant LogMiner privileges]({% link molt/migrate-load-replicate.md %}#grant-logminer-privileges)</li></ul> |
+| CockroachDB target (forward replication) | <ul><li>`ALL` on the target database.</li><li>`CREATE` on the schema.</li><li>`SELECT`, `INSERT`, `UPDATE`, and `DELETE` on target tables.</li><li>`CREATEDB` privilege for creating the staging schema.</li></ul> | [Create CockroachDB user]({% link molt/migrate-load-replicate.md %}#create-the-sql-user) |
+| PostgreSQL, MySQL, or Oracle target (failback) | <ul><li>`SELECT`, `INSERT`, and `UPDATE` on tables to fail back to.</li><li>For Oracle, `FLASHBACK` is also required.</li></ul> | <ul><li>[Grant PostgreSQL user permissions]({% link molt/migrate-failback.md %}#grant-target-database-user-permissions)</li><li>[Grant MySQL user permissions]({% link molt/migrate-failback.md %}?filter=mysql#grant-target-database-user-permissions)</li><li>[Grant Oracle user permissions]({% link molt/migrate-failback.md %}?filter=oracle#grant-target-database-user-permissions)</li></ul> |
+
+## Installation
+
+{% include molt/molt-install.md %}
+
+### Docker usage
+
+{% include molt/molt-docker.md %}
+
+## How it works
+
+MOLT Replicator supports forward replication from PostgreSQL, MySQL, and Oracle, and failback from CockroachDB:
+
+- PostgreSQL source ([`pglogical`](#commands)): MOLT Replicator uses [PostgreSQL logical replication](https://www.postgresql.org/docs/current/logical-replication.html), which is based on publications and replication slots. You create a publication for the target tables, and a slot marks consistent replication points. MOLT Replicator consumes this logical feed directly and applies the data in sorted batches to the target.
+
+- MySQL source ([`mylogical`](#commands)): MOLT Replicator relies on [MySQL GTID-based replication](https://dev.mysql.com/doc/refman/8.0/en/replication-gtids.html) to read change data from MySQL binlogs. It works with MySQL versions that support GTID-based replication and applies transactionally consistent feeds to the target. Binlog features that do not use GTIDs are not supported.
+
+- Oracle source ([`oraclelogminer`](#commands)): MOLT Replicator uses [Oracle LogMiner](https://docs.oracle.com/en/database/oracle/oracle-database/21/sutil/oracle-logminer-utility.html) to capture change data from Oracle redo logs. Both Oracle Multitenant (CDB/PDB) and single-tenant Oracle architectures are supported. Replicator periodically queries LogMiner-populated views and processes transactional data in ascending SCN windows for reliable throughput while maintaining consistency.
+
+- Failback from CockroachDB ([`start`](#commands)): MOLT Replicator acts as an HTTP webhook sink for a single CockroachDB changefeed. Replicator receives mutations from source cluster nodes, can optionally buffer them in a CockroachDB staging cluster, and then applies time-ordered transactional batches to the target database. Mutations are applied as [`UPSERT`]({% link {{ site.current_cloud_version }}/upsert.md %}) or [`DELETE`]({% link {{ site.current_cloud_version }}/delete.md %}) statements while respecting [foreign-key]({% link {{ site.current_cloud_version }}/foreign-key.md %}) and table dependencies.
+
+### Consistency modes
+
+MOLT Replicator supports three consistency modes for balancing throughput and transactional guarantees:
+
+1. *Consistent* (failback mode only, default for CockroachDB sources): Preserves per-row order and source transaction atomicity. Concurrent transactions are controlled by `--parallelism`.
+
+1. *BestEffort* (failback mode only): Relaxes atomicity across tables that do not have foreign key constraints between them, while maintaining coherence within FK-connected groups. Enable it with `--bestEffortOnly`, or allow Replicator to enter it automatically by setting `--bestEffortWindow` to a positive duration such as `1s` (see the example after this list).
+
+ {{site.data.alerts.callout_info}}
+ For independent tables (with no foreign key constraints), BestEffort mode applies changes immediately as they arrive, without waiting for the resolved timestamp. This provides higher throughput for tables that have no relationships with other tables.
+ {{site.data.alerts.end}}
+
+1. *Immediate* (default for PostgreSQL, MySQL, and Oracle sources): Applies updates as they arrive to Replicator, with no buffering or waiting for resolved timestamps. For CockroachDB sources, this mode provides the highest throughput but requires that the target schema have no foreign key constraints.
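+
+For example, a minimal sketch of enabling automatic entry into BestEffort mode (the duration shown is illustrative):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--bestEffortWindow '1s'
+~~~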
+
+## Commands
+
+MOLT Replicator provides the following commands:
+
+| Command | Description |
+|------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `pglogical` | Replicate from PostgreSQL source to CockroachDB target using logical replication. |
+| `mylogical` | Replicate from MySQL source to CockroachDB target using GTID-based replication. |
+| `oraclelogminer` | Replicate from Oracle source to CockroachDB target using Oracle LogMiner. |
+| `start` | Replicate from CockroachDB source to PostgreSQL, MySQL, or Oracle target ([failback mode](#failback-to-source-database)). Requires a CockroachDB changefeed with rangefeeds enabled. |
+| `make-jwt` | Generate JWT tokens for authorizing changefeed connections in failback scenarios. Supports signing tokens with RSA or EC keys, or generating claims for external JWT providers. For details, refer to [JWT authentication](#jwt-authentication). |
+| `version` | Display version information and Go module dependencies with checksums. For details, refer to [Supply chain security](#supply-chain-security). |
+
+For command-specific flags and examples, refer to [Usage](#usage) and [Common workflows](#common-workflows).
+
+## Flags
+
+{% include molt/replicator-flags.md %}
+
+## Usage
+
+### Replicator commands
+
+MOLT Replicator provides four replication commands, one for each supported source. For detailed workflows, refer to [Common workflows](#common-workflows).
+
+Use `pglogical` to replicate from PostgreSQL to CockroachDB:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical
+~~~
+
+Use `mylogical` to replicate from MySQL to CockroachDB:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator mylogical
+~~~
+
+Use `oraclelogminer` to replicate from Oracle to CockroachDB:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator oraclelogminer
+~~~
+
+Use `start` to replicate from CockroachDB to PostgreSQL, MySQL, or Oracle (failback):
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator start
+~~~
+
+### Source connection strings
+
+{{site.data.alerts.callout_success}}
+Follow the security recommendations in [Connection security and credentials](#connection-security-and-credentials).
+{{site.data.alerts.end}}
+
+`--sourceConn` specifies the connection string of the source database for forward replication.
+
+{{site.data.alerts.callout_info}}
+The source connection string **must** point to the primary instance of the source database. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.
+{{site.data.alerts.end}}
+
+PostgreSQL connection string:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
+~~~
+
+MySQL connection string:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
+~~~
+
+Oracle connection string:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'
+~~~
+
+For Oracle Multitenant databases, `--sourcePDBConn` specifies the pluggable database (PDB) connection. `--sourceConn` specifies the container database (CDB):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceConn 'oracle://{username}:{password}@{host}:{port}/{cdb_service_name}'
+--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'
+~~~
+
+For failback, `--stagingConn` specifies the CockroachDB connection string:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--stagingConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
+~~~
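+
+For example, a fully resolved PostgreSQL source connection string might look like the following; the credentials, host, and database name are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceConn 'postgresql://migration_user:password@pg.example.com:5432/migration_db'
+~~~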
+
+### Target connection strings
+
+`--targetConn` specifies the connection string of the target CockroachDB database for forward replication:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
+~~~
+
+{{site.data.alerts.callout_info}}
+For failback, `--targetConn` specifies the original source database (PostgreSQL, MySQL, or Oracle). For details, refer to [Failback to source database](#failback-to-source-database).
+{{site.data.alerts.end}}
+
+### Replication checkpoints
+
+MOLT Replicator requires a checkpoint value to start replication from the correct position in the source database's transaction log.
+
+For PostgreSQL, use `--slotName` to specify the [replication slot created during the data load]({% link molt/migrate-load-replicate.md %}#start-fetch). The slot automatically tracks the LSN (Log Sequence Number):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--slotName molt_slot
+~~~
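+
+To confirm that the slot exists on the source before starting Replicator, you can optionally query `pg_replication_slots`:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Run on the PostgreSQL source; shows the slot and its last flushed position.
+SELECT slot_name, confirmed_flush_lsn FROM pg_replication_slots WHERE slot_name = 'molt_slot';
+~~~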
+
+For MySQL, use `--defaultGTIDSet` with the GTID set from the [MOLT Fetch output]({% link molt/migrate-load-replicate.md %}?filters=mysql#start-fetch):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'
+~~~
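+
+To sanity-check this value against the source's current GTID state, you can optionally query the MySQL global variable:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Run on the MySQL source; shows all executed GTIDs.
+SELECT @@GLOBAL.gtid_executed;
+~~~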
+
+For Oracle, use `--scn` and `--backfillFromSCN` with the SCN values from the [MOLT Fetch output]({% link molt/migrate-load-replicate.md %}?filters=oracle#start-fetch):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--scn 26685786
+--backfillFromSCN 26685444
+~~~
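+
+To compare these values against the current SCN on the Oracle source, one option is to query `V$DATABASE`:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Run on the Oracle source; shows the current system change number.
+SELECT CURRENT_SCN FROM V$DATABASE;
+~~~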
+
+### Staging database
+
+The staging database stores replication metadata, checkpoints, and buffered mutations. Specify the staging database with `--stagingSchema` and create it automatically with `--stagingCreateSchema`:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--stagingSchema _replicator
+--stagingCreateSchema
+~~~
+
+The staging database is used to:
+
+- Store checkpoints that enable resuming from the correct point after interruptions.
+- Buffer mutations before applying them to the target in transaction order.
+- Maintain consistency for time-ordered transactional batches while respecting table dependencies.
+- Provide restart capabilities after failures.
+
+## Security
+
+Cockroach Labs **strongly** recommends the following security practices:
+
+### Connection security and credentials
+
+{% include molt/molt-secure-connection-strings.md %}
+
+### CockroachDB changefeed security
+
+For failback scenarios, secure the connection from CockroachDB to MOLT Replicator using TLS certificates. Generate TLS certificates using self-signed certificates, certificate authorities like Let's Encrypt, or your organization's certificate management system.
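+
+For example, a minimal self-signed certificate for testing (not production) could be generated with OpenSSL; the subject name and output paths are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Generate a self-signed server certificate and key (testing only).
+openssl req -x509 -newkey rsa:4096 -days 365 -nodes \
+  -keyout ./certs/server.key -out ./certs/server.crt \
+  -subj "/CN=replicator.example.com"
+~~~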
+
+#### TLS from CockroachDB to Replicator
+
+Configure MOLT Replicator with server certificates using the `--tlsCertificate` and `--tlsPrivateKey` flags to specify the certificate and private key file paths. For example:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator start \
+--tlsCertificate ./certs/server.crt \
+--tlsPrivateKey ./certs/server.key \
+...
+~~~
+
+These server certificates must correspond to the client certificates specified in the changefeed webhook URL to ensure a successful TLS handshake.
+
+Encode client certificates for changefeed webhook URLs:
+
+- **Webhook URLs**: Use both URL encoding and base64 encoding: `base64 -i ./client.crt | jq -R -r '@uri'`
+- **Non-webhook contexts**: Use base64 encoding only: `base64 -w 0 ca.cert`
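+
+For example, the encoded certificate and key might be captured in shell variables and embedded in the changefeed webhook URL; the host, port, and file names here are assumptions for illustration:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Encode the client certificate and key for use as URL query parameters.
+CLIENT_CERT=$(base64 -i ./client.crt | jq -R -r '@uri')
+CLIENT_KEY=$(base64 -i ./client.key | jq -R -r '@uri')
+
+# Hypothetical webhook sink URL for the changefeed:
+# webhook-https://replicator.example.com:30004/?client_cert=$CLIENT_CERT&client_key=$CLIENT_KEY
+~~~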
+
+#### JWT authentication
+
+You can use JSON Web Tokens (JWT) to authorize incoming changefeed connections and restrict writes to a subset of SQL databases or user-defined schemas in the target cluster.
+
+Replicator supports JWT claims that allow writes to specific databases, schemas, or all of them. JWT tokens must be signed using RSA or EC keys. HMAC and `None` signatures are automatically rejected.
+
+To configure JWT authentication:
+
+1. Add PEM-formatted public signing keys to the `_replicator.jwt_public_keys` table in the staging database.
+
+1. To revoke a specific token, add its `jti` value to the `_replicator.jwt_revoked_ids` table in the staging database.
+
+The Replicator process re-reads these tables every minute to pick up changes.
+
+To pass the JWT token from the changefeed to the Replicator webhook sink, use the [`webhook_auth_header` option]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#options):
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+CREATE CHANGEFEED ... WITH webhook_auth_header='Bearer {token}';
+~~~
+
+##### Token quickstart
+
+The following example uses `OpenSSL` to generate keys, but any PEM-encoded RSA or EC keys will work.
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Generate an EC private key using OpenSSL.
+openssl ecparam -out ec.key -genkey -name prime256v1
+
+# Write the public key components to a separate file.
+openssl ec -in ec.key -pubout -out ec.pub
+
+# Upload the public key for all instances of Replicator to find it.
+cockroach sql -e "INSERT INTO _replicator.jwt_public_keys (public_key) VALUES ('$(cat ec.pub)')"
+
+# Reload configuration, or wait one minute.
+killall -HUP replicator
+
+# Generate a token which can write to the ycsb.public schema.
+# The key can be decoded using the debugger at https://jwt.io.
+# Add the contents of out.jwt to the CREATE CHANGEFEED command:
+# WITH webhook_auth_header='Bearer {out.jwt}'
+replicator make-jwt -k ec.key -a ycsb.public -o out.jwt
+~~~
+
+##### External JWT providers
+
+The `make-jwt` command also supports a `--claim` [flag](#make-jwt-flags), which prints a JWT claim that can be signed by your existing JWT provider. The PEM-formatted public key or keys for that provider must be inserted into the `_replicator.jwt_public_keys` table. The `iss` (issuer) and `jti` (token ID) fields will likely be specific to your auth provider, but the custom claim must be retained in its entirety.
+
+You can repeat the `-a` [flag](#make-jwt-flags) to create a claim for multiple schemas:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator make-jwt -a 'database.schema' --claim
+~~~
+
+~~~json
+{
+ "iss": "replicator",
+ "jti": "d5ffa211-8d54-424b-819a-bc19af9202a5",
+ "https://github.com/cockroachdb/replicator": {
+ "schemas": [
+ [
+ "database",
+ "schema"
+ ]
+ ]
+ }
+}
+~~~
+
+{{site.data.alerts.callout_info}}
+For details on the `make-jwt` command flags, refer to [`make-jwt` flags](#make-jwt-flags).
+{{site.data.alerts.end}}
+
+### Production considerations
+
+- Avoid `--disableAuthentication` and `--tlsSelfSigned` flags in production environments. These flags should only be used for testing or development purposes.
+
+### Supply chain security
+
+Use the `version` command to verify the integrity of your MOLT Replicator build and identify potential upstream vulnerabilities.
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator version
+~~~
+
+The output includes:
+
+- Module name
+- go.mod checksum
+- Version
+
+Use this information to determine whether your build may be subject to vulnerabilities from upstream packages. Cockroach Labs uses Dependabot to automatically upgrade Go modules, and the team regularly merges Dependabot updates to address security issues.
+
+## Common workflows
+
+### Forward replication with initial load
+
+To start replication after an [initial data load with MOLT Fetch]({% link molt/migrate-load-replicate.md %}#start-fetch), use the `pglogical` command:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical
+~~~
+
+To start replication after an [initial data load with MOLT Fetch]({% link molt/migrate-load-replicate.md %}?filters=mysql#start-fetch), use the `mylogical` command:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator mylogical
+~~~
+
+To start replication after an [initial data load with MOLT Fetch]({% link molt/migrate-load-replicate.md %}?filters=oracle#start-fetch), use the `oraclelogminer` command:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator oraclelogminer
+~~~
+
+Specify the source and target database connections. For connection string formats, refer to [Source connection strings](#source-connection-strings) and [Target connection strings](#target-connection-strings):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceConn $SOURCE
+--targetConn $TARGET
+~~~
+
+For Oracle Multitenant databases, also specify the PDB connection:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourcePDBConn $SOURCE_PDB
+~~~
+
+Specify the source Oracle schema to replicate from:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--sourceSchema migration_schema
+~~~
+
+To replicate from the correct position, specify the appropriate checkpoint value.
+
+For PostgreSQL, use `--slotName` to specify the slot created during the data load, which automatically tracks the LSN (Log Sequence Number) checkpoint:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--slotName molt_slot
+~~~
+
+For MySQL, use `--defaultGTIDSet` with the value from the `cdc_cursor` field in the MOLT Fetch output:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'
+~~~
+
+For Oracle, use the `--scn` and `--backfillFromSCN` values from the MOLT Fetch output:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--scn 26685786
+--backfillFromSCN 26685444
+~~~
+
+Use `--stagingSchema` to specify the staging database. Use `--stagingCreateSchema` to create it automatically on first run:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--stagingSchema _replicator
+--stagingCreateSchema
+~~~
+
+At minimum, the `replicator` command should include the following flags:
+
+For PostgreSQL:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical \
+--sourceConn $SOURCE \
+--targetConn $TARGET \
+--slotName molt_slot \
+--stagingSchema _replicator \
+--stagingCreateSchema
+~~~
+
+For detailed steps, refer to [Load and replicate]({% link molt/migrate-load-replicate.md %}#start-replicator).
+
+For MySQL:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator mylogical \
+--sourceConn $SOURCE \
+--targetConn $TARGET \
+--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29' \
+--stagingSchema _replicator \
+--stagingCreateSchema
+~~~
+
+For detailed steps, refer to [Load and replicate]({% link molt/migrate-load-replicate.md %}?filters=mysql#start-replicator).
+
+For Oracle:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator oraclelogminer \
+--sourceConn $SOURCE \
+--sourcePDBConn $SOURCE_PDB \
+--sourceSchema migration_schema \
+--targetConn $TARGET \
+--scn 26685786 \
+--backfillFromSCN 26685444 \
+--stagingSchema _replicator \
+--stagingCreateSchema
+~~~
+
+For detailed steps, refer to [Load and replicate]({% link molt/migrate-load-replicate.md %}?filters=oracle#start-replicator).
+
+### Resume after interruption
+
+When resuming replication after an interruption, MOLT Replicator automatically uses the stored checkpoint to resume from the correct position.
+
+Rerun the same `replicator` command used during [forward replication](#forward-replication-with-initial-load), specifying the same `--stagingSchema` value as before. Omit `--stagingCreateSchema` and any checkpoint flags (such as `--defaultGTIDSet`, `--scn`, or `--backfillFromSCN`); the stored checkpoint takes precedence. For example:
+
+For PostgreSQL:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator pglogical \
+--sourceConn $SOURCE \
+--targetConn $TARGET \
+--slotName molt_slot \
+--stagingSchema _replicator
+~~~
+
+For detailed steps, refer to [Resume replication]({% link molt/migrate-resume-replication.md %}).
+
+For MySQL:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator mylogical \
+--sourceConn $SOURCE \
+--targetConn $TARGET \
+--stagingSchema _replicator
+~~~
+
+For detailed steps, refer to [Resume replication]({% link molt/migrate-resume-replication.md %}?filters=mysql).
+
+For Oracle:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator oraclelogminer \
+--sourceConn $SOURCE \
+--sourcePDBConn $SOURCE_PDB \
+--sourceSchema migration_schema \
+--targetConn $TARGET \
+--stagingSchema _replicator
+~~~
+
+For detailed steps, refer to [Resume replication]({% link molt/migrate-resume-replication.md %}?filters=oracle).
+
+### Failback to source database
+
+When replicating from CockroachDB back to the source database, MOLT Replicator acts as a webhook sink for a CockroachDB changefeed.
+
+Use the `start` command for failback:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator start
+~~~
+
+Specify the target database connection (the database you originally migrated from). For connection string formats, refer to [Target connection strings](#target-connection-strings):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--targetConn $TARGET
+~~~
+
+Specify the CockroachDB connection string; for details, refer to [Connect using a URL]({% link {{ site.current_cloud_version }}/connection-parameters.md %}#connect-using-a-url):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--stagingConn $STAGING
+~~~
+
+Specify the staging database name. This should be the same staging database created during [Forward replication with initial load](#forward-replication-with-initial-load):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--stagingSchema _replicator
+~~~
+
+Specify a webhook endpoint address for the changefeed to send changes to. For example:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--bindAddr :30004
+~~~
+
+Specify TLS certificate and private key file paths for secure webhook connections:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--tlsCertificate ./certs/server.crt
+--tlsPrivateKey ./certs/server.key
+~~~
+
+At minimum, the `replicator` command should include the following flags:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+replicator start \
+--targetConn $TARGET \
+--stagingConn $STAGING \
+--stagingSchema _replicator \
+--bindAddr :30004 \
+--tlsCertificate ./certs/server.crt \
+--tlsPrivateKey ./certs/server.key
+~~~
+
+For detailed steps, refer to [Migration failback]({% link molt/migrate-failback.md %}).
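+
+On the CockroachDB side, the changefeed that feeds Replicator might look like the following sketch; the table name, host, port, and options are assumptions, and the exact syntax is covered in the failback tutorial:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- Hypothetical changefeed pointing at the Replicator webhook endpoint.
+CREATE CHANGEFEED FOR TABLE migration_schema.employees
+  INTO 'webhook-https://replicator.example.com:30004/?client_cert={encoded cert}&client_key={encoded key}'
+  WITH updated, resolved = '1s';
+~~~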
+
+## Monitoring
+
+### Metrics
+
+MOLT Replicator can export [Prometheus](https://prometheus.io/) metrics by setting the `--metricsAddr` flag to a port (for example, `--metricsAddr :30005`). Metrics are not enabled by default. When enabled, metrics are available at the path `/_/varz`. For example: `http://localhost:30005/_/varz`.
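+
+For example, assuming metrics are enabled on port `30005`, you can verify the endpoint with a quick scrape:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Start Replicator with metrics enabled (other flags elided), then
+# fetch the Prometheus metrics.
+replicator pglogical ... --metricsAddr :30005
+curl http://localhost:30005/_/varz
+~~~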
+
+For a list of recommended metrics to monitor during replication, refer to:
+
+- [Forward replication metrics]({% link molt/migrate-load-replicate.md %}#replicator-metrics) (PostgreSQL, MySQL, and Oracle sources)
+- [Failback replication metrics]({% link molt/migrate-failback.md %}#replicator-metrics) (CockroachDB source)
+
+You can use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize the metrics. For Oracle-specific metrics, import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json).
+
+To check MOLT Replicator health when metrics are enabled, run `curl http://localhost:30005/_/healthz` (replacing the port with your `--metricsAddr` value). This returns a status code of `200` if Replicator is running.
+
+### Logging
+
+By default, MOLT Replicator writes two streams of logs: operational logs to `stdout` (including `warning`, `info`, `trace`, and some errors) and final errors to `stderr`.
+
+Redirect both streams to ensure all logs are captured for troubleshooting:
+
+{% include_cached copy-clipboard.html %}
+~~~shell
+# Merge both streams to console
+./replicator ... 2>&1
+
+# Redirect both streams to a file
+./replicator ... > output.log 2>&1
+
+# Merge streams to console while saving to file
+./replicator ... > >(tee replicator.log) 2>&1
+
+# Use logDestination flag to write all logs to a file
+./replicator --logDestination replicator.log ...
+~~~
+
+Enable debug logging with `-v`. For more granularity and system insights, enable trace logging with `-vv`. Pay close attention to warning- and error-level logs, as these indicate when Replicator is misbehaving.
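+
+For example, the verbosity flags are appended to whichever replication command you run; the sketches below elide the other required flags:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Debug logging.
+./replicator pglogical ... -v
+
+# Trace logging.
+./replicator pglogical ... -vv
+~~~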
+
+## Best practices
+
+### Test and validate
+
+To verify that your connections and configuration work properly, run MOLT Replicator in a test, development, or staging environment that closely resembles production before replicating any data in production.
+
+### Optimize performance
+
+{% include molt/optimize-replicator-performance.md %}
+
+## Troubleshooting
+
+{% include molt/molt-troubleshooting-replication.md %}
+
+{% include molt/molt-troubleshooting-failback.md %}
+
+## Examples
+
+For detailed examples of MOLT Replicator usage, refer to the migration workflow tutorials:
+
+- [Load and Replicate]({% link molt/migrate-load-replicate.md %}): Load data with MOLT Fetch and set up ongoing replication with MOLT Replicator.
+- [Resume Replication]({% link molt/migrate-resume-replication.md %}): Resume replication after an interruption.
+- [Migration failback]({% link molt/migrate-failback.md %}): Replicate changes from CockroachDB back to the initial source database.
+
+## See also
+
+- [Migration Overview]({% link molt/migration-overview.md %})
+- [Migration Strategy]({% link molt/migration-strategy.md %})
+- [MOLT Fetch]({% link molt/molt-fetch.md %})
\ No newline at end of file
diff --git a/src/current/releases/molt.md b/src/current/releases/molt.md
index a7766071166..734b47fa9c8 100644
--- a/src/current/releases/molt.md
+++ b/src/current/releases/molt.md
@@ -1,14 +1,14 @@
---
title: MOLT Releases
-summary: Changelog for MOLT Fetch and Verify
+summary: Changelog for MOLT Fetch, Verify, and Replicator
toc: true
docs_area: releases
---
This page has details about each release of the following [MOLT (Migrate Off Legacy Technology) tools]({% link molt/migration-overview.md %}):
-- [Fetch]({% link molt/molt-fetch.md %})
-- [Verify]({% link molt/molt-verify.md %})
+- `molt`: [MOLT Fetch]({% link molt/molt-fetch.md %}) and [MOLT Verify]({% link molt/molt-verify.md %})
+- `replicator`: [MOLT Replicator]({% link molt/molt-replicator.md %})
Cockroach Labs recommends using the latest available version of each tool. See [Installation](#installation).
@@ -16,6 +16,20 @@ Cockroach Labs recommends using the latest available version of each tool. See [
{% include molt/molt-install.md %}
+## October 24, 2025
+
+`molt` 1.3.2 is [available](#installation):
+
+- MOLT Fetch replication modes are deprecated in favor of a separate replication workflow using `replicator`. For details, refer to [MOLT Replicator]({% link molt/molt-replicator.md %}).
+- Added retry logic to the export phase for CockroachDB and PostgreSQL sources to handle transient errors while maintaining a consistent snapshot point. Not currently supported for Oracle or MySQL sources.
+- Added `--export-retry-max-attempts` and `--export-retry-max-duration` flags to control retry behavior for source query exports.
+
+`replicator` 1.1.3 is [available](#installation):
+
+- Added `commit_to_stage_lag_seconds` Prometheus histogram metric to track the distribution of source CockroachDB to staged data times.
+- Added `core_parallelism_utilization_percent` gauge to track parallelism utilization and identify when the system is nearing its parallelism limit and should be scaled up.
+- Added `core_flush_count` metric to track the number of flushed batches in the applier flow and the reason for each flush.
+
## September 25, 2025
MOLT Fetch/Verify 1.3.2 is [available](#installation).
@@ -25,7 +39,7 @@ MOLT Fetch/Verify 1.3.2 is [available](#installation).
- Fixed a bug in `escape-password` where passwords that start with a hyphen were not handled correctly. Users must now pass the `--password` flag when running `escape-password`. For example, `molt escape-password --password 'a$52&'`.
- Added support for assume role authentication during [data export]({% link molt/molt-fetch.md %}#data-export-phase) with MOLT Fetch.
- Added support to `replicator` for retrying unique constraint violations on the target database, which can be temporary in some cases.
-- Added exponential backoff to `replicator` for retryable errors when applying mutations to the target database. This reduces load on the target database and prevents exhausting retries prematurely. The new [replication flags]({% link molt/molt-fetch.md %}#replication-flags) `--retryInitialBackoff`, `--retryMaxBackoff`, and `--retryMultiplier` control backoff behavior. The new `--maxRetries` flag configures the maximum number of retries. To retain the previous "immediate retry" behavior, set `--retryMaxBackoff 1ns --retryInitialBackoff 1ns --retryMultiplier 1`.
+- Added exponential backoff to `replicator` for retryable errors when applying mutations to the target database. This reduces load on the target database and prevents exhausting retries prematurely. The new [replication flags]({% link molt/molt-replicator.md %}#flags) `--retryInitialBackoff`, `--retryMaxBackoff`, and `--retryMultiplier` control backoff behavior. The new `--maxRetries` flag configures the maximum number of retries. To retain the previous "immediate retry" behavior, set `--retryMaxBackoff 1ns --retryInitialBackoff 1ns --retryMultiplier 1`.
- Added support to `replicator` for the `source_lag_seconds`, `target_lag_seconds`, `apply_mutation_age_seconds`, and `source_commit_to_apply_lag_seconds` metrics for replication from PostgreSQL and MySQL, and introduced histogram metrics `source_lag_seconds_histogram` and `target_lag_seconds_histogram` for replication from CockroachDB.
`source_lag_seconds` measures the delay before data is ready to be processed by `replicator`, while `target_lag_seconds` measures the "end-to-end" delay until `replicator` has written data to the target. A steady increase in `source_lag_seconds` may indicate `replicator` cannot keep up with the source workload, while a steady increase in `target_lag_seconds` may indicate `replicator` cannot keep up with the source workload or that writes on the target database are bottlenecked.
@@ -35,8 +49,8 @@ MOLT Fetch/Verify 1.3.2 is [available](#installation).
MOLT Fetch/Verify 1.3.1 is [available](#installation).
- MOLT Fetch now supports [sharding]({% link molt/molt-fetch.md %}#table-sharding) of primary keys of any data type on PostgreSQL 11+ sources. This can be enabled with the [`--use-stats-based-sharding`]({% link molt/molt-fetch.md %}#global-flags) flag.
-- Added the [`--ignore-replication-check`]({% link molt/molt-fetch.md %}#global-flags) flag to allow data loads with planned downtime and no [replication setup]({% link molt/molt-fetch.md %}#replication-setup). The `--pglogical-ignore-wal-check` flag has been removed.
-- Added the `--enableParallelApplies` [replication flag]({% link molt/molt-fetch.md %}#replication-flags) to enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of increased target pool and memory usage.
+- Added the [`--ignore-replication-check`]({% link molt/molt-fetch.md %}#global-flags) flag to allow data loads with planned downtime and no replication setup. The `--pglogical-ignore-wal-check` flag has been removed.
+- Added the `--enableParallelApplies` [replication flag]({% link molt/molt-replicator.md %}#flags) to enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of increased target pool and memory usage.
- Improved cleanup logic for scheduled tasks to ensure progress reporting and prevent indefinite hangs.
- Added parallelism gating to ensure the parallelism setting is smaller than the `targetMaxPoolSize`. This helps prevent a potential indefinite hang.
- Added new metrics that track start and end times for progress reports (`core_progress_reports_started_count` and `core_progress_reports_ended_count`) and error reports (`core_error_reports_started_count` and `core_error_reports_ended_count`). These provide visibility into the core sequencer progress and help identify hangs in the applier and progress tracking pipeline.
@@ -101,10 +115,10 @@ MOLT Fetch/Verify 1.2.5 is [available](#installation).
MOLT Fetch/Verify 1.2.4 is [available](#installation).
- MOLT Fetch now supports PostgreSQL 11.
-- MOLT Fetch [failback]({% link molt/molt-fetch.md %}#failback) to CockroachDB is now disallowed.
+- MOLT Fetch failback to CockroachDB is now disallowed.
- MOLT Verify can now compare tables that are named differently on the source and target schemas.
- The `molt` logging date format is now period-delimited for Windows compatibility.
-- During replication, an index is now created on all tables by default, improving replication performance. Because index creation can cause the replication process to initialize more slowly, this behavior can be disabled using the `--stageDisableCreateTableReaderIndex` [replication flag]({% link molt/molt-fetch.md %}#replication-flags).
+- During replication, an index is now created on all tables by default, improving replication performance. Because index creation can cause the replication process to initialize more slowly, this behavior can be disabled using the `--stageDisableCreateTableReaderIndex` [replication flag]({% link molt/molt-replicator.md %}#flags).
- Added a failback metric that tracks the time to write a source commit to the staging schema for a given mutation.
- Added a failback metric that tracks the time to write a source commit to the target database for a given mutation.
@@ -122,25 +136,25 @@ MOLT Fetch/Verify 1.2.2 is [available](#installation).
- Added an [`--import-region`]({% link molt/molt-fetch.md %}#global-flags) flag that is used to set the `AWS_REGION` query parameter explicitly in the [`s3` URL]({% link molt/molt-fetch.md %}#bucket-path).
- Fixed the [`truncate-if-exists`]({% link molt/molt-fetch.md %}#target-table-handling) schema mode for cases where there are uppercase table or schema names.
- Fixed an issue with unsigned `BIGINT` values overflowing in replication.
-- Added a `--schemaRefresh` [replication flag]({% link molt/molt-fetch.md %}#replication-flags) that is used to configure the schema watcher refresh delay in the replication phase. Previously, the refresh delay was set to a constant value of 1 minute. Set the flag as follows: `--replicator-flags "--schemaRefresh {value}"`.
+- Added a `--schemaRefresh` [replication flag]({% link molt/molt-replicator.md %}#flags) that is used to configure the schema watcher refresh delay in the replication phase. Previously, the refresh delay was set to a constant value of 1 minute. Set the flag as follows: `--replicator-flags "--schemaRefresh {value}"`.
## December 13, 2024
MOLT Fetch/Verify 1.2.1 is [available](#installation).
- MOLT Fetch users now can use [`--assume-role`]({% link molt/molt-fetch.md %}#global-flags) to specify a service account for assume role authentication to cloud storage. `--assume-role` must be used with `--use-implicit-auth`, or the flag will be ignored.
-- MySQL 5.7 and later are now supported with MOLT Fetch replication modes. For details on setup, refer to the [MOLT Fetch documentation]({% link molt/molt-fetch.md %}#replication-setup).
+- MySQL 5.7 and later are now supported with MOLT Fetch replication modes.
- Fetch replication mode now defaults to a less verbose `INFO` logging level. To specify `DEBUG` logging, pass in the `--replicator-flags '-v'` setting, or `--replicator-flags '-vv'` for trace logging.
- MySQL columns of type `BIGINT UNSIGNED` or `SERIAL` are now auto-mapped to [`DECIMAL`]({% link {{ site.current_cloud_version }}/decimal.md %}) type in CockroachDB. MySQL regular `BIGINT` types are mapped to [`INT`]({% link {{ site.current_cloud_version }}/int.md %}) type in CockroachDB.
-- The `pglogical` replication workflow was modified in order to enforce safer and simpler defaults for the [`data-load`]({% link molt/molt-fetch.md %}#data-load), [`data-load-and-replication`]({% link molt/molt-fetch.md %}#data-load-and-replication), and [`replication-only`]({% link molt/molt-fetch.md %}#replication-only) workflows for PostgreSQL sources. Fetch now ensures that the publication is created before the slot, and that `replication-only` defaults to using publications and slots created either in previous Fetch runs or manually.
+- The `pglogical` replication workflow was modified in order to enforce safer and simpler defaults for the [`data-load`]({% link molt/molt-fetch.md %}#fetch-mode), `data-load-and-replication`, and `replication-only` workflows for PostgreSQL sources. Fetch now ensures that the publication is created before the slot, and that `replication-only` defaults to using publications and slots created either in previous Fetch runs or manually.
- Fixed scan iterator query ordering for `BINARY` and `TEXT` (of same collation) PKs so that they lead to the correct queries and ordering.
-- For a MySQL source in [`replication-only`]({% link molt/molt-fetch.md %}#replication-only) mode, the [`--stagingSchema` replicator flag]({% link molt/molt-fetch.md %}#replication-flags) can now be used to resume streaming replication after being interrupted. Otherwise, the [`--defaultGTIDSet` replicator flag]({% link molt/molt-fetch.md %}#mysql-replication-flags) is used to start initial replication after a previous Fetch run in [`data-load`]({% link molt/molt-fetch.md %}#data-load) mode, or as an override to the current replication stream.
+- For a MySQL source in `replication-only` mode, the [`--stagingSchema` replicator flag]({% link molt/molt-replicator.md %}#flags) can now be used to resume streaming replication after being interrupted. Otherwise, the [`--defaultGTIDSet` replicator flag]({% link molt/molt-replicator.md %}#mylogical-replication-flags) is used to start initial replication after a previous Fetch run in [`data-load`]({% link molt/molt-fetch.md %}#fetch-mode) mode, or as an override to the current replication stream.
## October 29, 2024
MOLT Fetch/Verify 1.2.0 is [available](#installation).
-- Added [`failback` mode]({% link molt/molt-fetch.md %}#failback) to MOLT Fetch, which allows the user to replicate changes on CockroachDB back to the initial source database. Failback is supported for MySQL and PostgreSQL databases.
+- Added `failback` mode to MOLT Fetch, which allows the user to replicate changes on CockroachDB back to the initial source database. Failback is supported for MySQL and PostgreSQL databases.
- The [`--pprof-list-addr` flag]({% link molt/molt-fetch.md %}#global-flags), which specifies the address of the `pprof` endpoint, is now configurable. The default value is `'127.0.0.1:3031'`.
- [Fetch modes]({% link molt/molt-fetch.md %}#fetch-mode) involving replication now state that MySQL 8.0 and later are supported for replication between MySQL and CockroachDB.
- [Partitioned tables]({% link molt/molt-fetch.md %}#transformations) can now be moved to CockroachDB using [`IMPORT INTO`]({% link {{ site.current_cloud_version }}/import-into.md %}).
@@ -166,9 +180,9 @@ MOLT Fetch/Verify 1.1.6 is [available](#installation).
MOLT Fetch/Verify 1.1.5 is [available](#installation).
-- **Deprecated** the `--ongoing-replication` flag in favor of [`--mode data-load-and-replication`]({% link molt/molt-fetch.md %}#data-load-and-replication), using the new `--mode` flag. Users should replace all instances of `--ongoing-replication` with `--mode data-load-and-replication`.
-- Fetch can now be run in an export-only mode by specifying [`--mode export-only`]({% link molt/molt-fetch.md %}#export-only-and-import-only). This will export all the data in `csv` or `csv.gz` format to the specified cloud or local store.
-- Fetch can now be run in an import-only mode by specifying [`--mode import-only`]({% link molt/molt-fetch.md %}#export-only-and-import-only). This will load all data in the specified cloud or local store into the target CockroachDB database, effectively skipping the export data phase.
+- **Deprecated** the `--ongoing-replication` flag in favor of `--mode data-load-and-replication`, using the new `--mode` flag. Users should replace all instances of `--ongoing-replication` with `--mode data-load-and-replication`.
+- Fetch can now be run in an export-only mode by specifying [`--mode export-only`]({% link molt/molt-fetch.md %}#fetch-mode). This will export all the data in `csv` or `csv.gz` format to the specified cloud or local store.
+- Fetch can now be run in an import-only mode by specifying [`--mode import-only`]({% link molt/molt-fetch.md %}#fetch-mode). This will load all data in the specified cloud or local store into the target CockroachDB database, effectively skipping the export data phase.
- Strings for the `--mode` flag are now word-separated by hyphens instead of underscores. For example, `replication-only` instead of `replication_only`.
## August 8, 2024