Merged
12 changes: 6 additions & 6 deletions README.md
@@ -8,18 +8,18 @@
The DataStax CDC for Apache Cassandra requires:

* DataStax Change Agent for Apache Cassandra, which is an event producer deployed as a JVM agent on each Cassandra data node.
* Datastax Source Connector for Apache Pulsar, which is source connector deployed in your streaming platform.
* DataStax Source Connector for Apache Pulsar, which is source connector deployed in your streaming platform.

![Cassandra-source-connector](docs/modules/ROOT/assets/images/cassandra-source-connector.png)
![Cassandra-source-connector](docs/docs-src/core/modules/ROOT/assets/images/cassandra-source-connector.png)

Supported streaming platform:
* Apache Pulsar 2.8.1+
* Datastax Luna Streaming 2.8.0.1.1.40+
* DataStax Luna Streaming 2.8.0.1.1.40+

Supported Cassandra version:
* Cassandra 3.11+
* Cassandra 4.0+
* Datastax Enterprise Server 6.8.16+
* [DataStax Enterprise (DSE)](https://www.datastax.com/products/datastax-enterprise) 6.8.16+

Note: Only Cassandra 4.0 and DSE 6.8.16+ support near-real-time CDC, which replicates data as soon as it is synced to disk.

@@ -49,7 +49,7 @@ You can collect Cassandra/DSE and Pulsar metrics into Prometheus, and build a Gr
* The mutation sent throughput from a Cassandra node
* The pulsar events and data topic rate in

![CDC Dashboard](docs/modules/ROOT/assets/images/cdc-dashboard.png)
![CDC Dashboard](docs/docs-src/core/modules/ROOT/assets/images/cdc-dashboard.png)

## Limitations

@@ -58,7 +58,7 @@ You can collect Cassandra/DSE and Pulsar metrics into Prometheus, and build a Gr
* Does not manage TTLs
* Does not support range deletes
* Does not sync data available before starting the CDC agent.
* CQL column names must not match a Pulsar primitive type name (ex: INT32)
* CQL column names must not match a [Pulsar primitive type](https://pulsar.apache.org/docs/next/schema-understand/#primitive-type) name (ex: INT32)
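
The column-name limitation above can be checked before enabling CDC on a table. The following is a minimal sketch, not part of the CDC tooling: `find_reserved_columns` is a hypothetical helper, the type list is taken from the linked Pulsar primitive-type documentation and should be verified against your Pulsar version, and the case-insensitive comparison is an assumption made to err on the safe side.

```shell
#!/usr/bin/env bash
# Pulsar primitive schema type names, per the linked Pulsar documentation.
# Assumption: verify this list against the Pulsar version you run.
PULSAR_PRIMITIVE_TYPES="BOOLEAN INT8 INT16 INT32 INT64 FLOAT DOUBLE BYTES STRING TIMESTAMP DATE TIME INSTANT LOCAL_DATE LOCAL_TIME LOCAL_DATE_TIME"

# Hypothetical helper: prints every argument that collides with a
# primitive type name. Assumption: the comparison ignores case.
find_reserved_columns() {
  local col upper t
  for col in "$@"; do
    # Normalize the column name to upper case before comparing.
    upper=$(printf '%s' "$col" | tr '[:lower:]' '[:upper:]')
    for t in $PULSAR_PRIMITIVE_TYPES; do
      if [ "$upper" = "$t" ]; then
        printf '%s\n' "$col"
      fi
    done
  done
}
```

Pass the CQL column names of a table as arguments, for example `find_reserved_columns id int32 name`; any name printed should be renamed before the table is replicated.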

## Supported data types

46 changes: 15 additions & 31 deletions docs/docs-src/core/modules/ROOT/pages/cdcExample.adoc
@@ -2,18 +2,18 @@

Capture schema changes in your C* tables and pass them to Apache Pulsar(R) with DataStax Change Data Capture (CDC). This doc will guide you through installing, configuring, and using CDC with C* or DSE in a VM-based deployment.

This installation requires:
This installation requires the following. Latest version artifacts are available here at [![GitHub release](https://img.shields.io/github/v/release/datastax/cdc-apache-cassandra.svg)](https://github.com/datastax/cdc-apache-cassandra/releases/latest):

* C* or DSE environment
** https://downloads.datastax.com/#enterprise[DSE 6.8.21]
** https://downloads.datastax.com/#enterprise[DSE 6.8.16+]
** https://cassandra.apache.org/_/download.html[OSS C*]
* CDC Agent
** https://github.com/datastax/cdc-apache-cassandra/releases/download/v1.0.5/agent-dse4-pulsar-1.0.5-all.jar[DSE]
** https://github.com/datastax/cdc-apache-cassandra/releases/download/v1.0.5/agent-c4-pulsar-1.0.5-all.jar[OSS C*]
** DSE - use `agent-dse4-<version>-all.jar`
** OSS C* - use `agent-c4-<version>-all.jar`
* Pulsar
** https://github.com/datastax/cdc-apache-cassandra/releases/download/v1.0.5/agent-dse4-pulsar-1.0.5-all.jar[DataStax Luna Streaming 2.8.3_1.0.7 Core]
** DataStax Luna Streaming - use `agent-dse4-<version>-all.jar`
* Pulsar C* source connector (CSC)
** https://github.com/datastax/cdc-apache-cassandra/releases/download/v1.0.5/pulsar-cassandra-source-1.0.5.nar[pulsar-cassandra-source-1.0.5.nar]
** Pulsar Cassandra Source NAR - use `pulsar-cassandra-source-<version>.nar`
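
The `<version>` placeholders in the list above can be expanded with a small helper. This is only a sketch: `cdc_artifact_names` is a hypothetical function, and the file-name patterns are copied from the list as written, so check them against the actual file names on the releases page before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the artifact file names for a given CDC
# version, using the patterns from the requirements list.
# Assumption: the patterns match the published release assets.
cdc_artifact_names() {
  local version="$1"
  if [ -z "$version" ]; then
    echo "usage: cdc_artifact_names <version>" >&2
    return 1
  fi
  printf 'agent-dse4-%s-all.jar\n' "$version"            # DSE agent
  printf 'agent-c4-%s-all.jar\n' "$version"              # OSS C* agent
  printf 'pulsar-cassandra-source-%s.nar\n' "$version"   # source connector
}
```

Calling `cdc_artifact_names` with the version you chose on the releases page prints one file name per line, in the order DSE agent, OSS C* agent, source connector NAR.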

== Installing and configuring

@@ -27,7 +27,7 @@ bin/pulsar standalone
+
[NOTE]
====
We recommend using the latest CDC agent version (at least version 1.04+) to support C* collection data types.
We recommend using the latest CDC agent version (at least version `1.0.4`+) to support C* collection data types.
====
. Install C*/DSE with your preferred https://docs.datastax.com/en/install/6.8/install/installWhichOne.html[installation method^].

@@ -45,16 +45,16 @@ export CDC_PULSAR_AUTH_PARAMS="file://</path/to/token/file>"
export CDC_TLS_TRUST_CERTS_FILE_PATH="</path/to/trusted/cert/file>"

# DSE CDC
JVM_OPTS="$JVM_OPTS -javaagent:/home/automaton/cdc104/agent-dse4-pulsar-1.0.5-all.jar"
JVM_OPTS="$JVM_OPTS -javaagent:/home/automaton/cdc104/agent-dse4-<version>-all.jar"
----
+
For CDC agent versions *after 1.03*, the CDC agent Pulsar connection parameters are provided as system environment variables (see *DSE CDC* in the example above).
For CDC agent versions *after 1.0.3*, the CDC agent Pulsar connection parameters are provided as system environment variables (see *DSE CDC* in the example above).
+
For CDC agent versions *before 1.03*, the CDC agent Pulsar connection parameters are also provided as extra JVM options, as below:
For CDC agent versions *before 1.0.3*, the CDC agent Pulsar connection parameters are also provided as extra JVM options, as below:
+
[source,bash]
----
export JVM_EXTRA_OPTS="-javaagent:/path/to/agent-c4-luna-<version>-all.jar=pulsarServiceUrl=pulsar://pulsar:6650"
export JVM_EXTRA_OPTS="-javaagent:/path/to/agent-c4-<version>-all.jar=pulsarServiceUrl=pulsar://pulsar:6650"
----

. Set the `cassandra.yaml` configuration:
@@ -95,7 +95,7 @@ Key-value-avro::
----
$ pulsar-admin source create \
--name <csc_connector_name> \
--archive /pathto/to/pulsar-cassandra-source-1.0.5.nar \
--archive /path/to/pulsar-cassandra-source-<version>.nar \
--tenant public \
--namespace default \
--destination-topic-name <keyspace>.<table> \
@@ -119,7 +119,7 @@ Key-value-json::
----
$ pulsar-admin source create \
--name <csc_connector_name> \
--archive /pathto/to/pulsar-cassandra-source-1.0.5.nar \
--archive /path/to/pulsar-cassandra-source-<version>.nar \
--tenant public \
--namespace default \
--destination-topic-name persistent://public/default/data-<keyspace>.<table> \
@@ -136,14 +136,14 @@ $ pulsar-admin source create \
----
--
+
Json::
JSON::
+
--
[source,bash]
----
$ pulsar-admin source create \
--name <csc_connector_name> \
--archive /pathto/to/pulsar-cassandra-source-1.0.5.nar \
--archive /path/to/pulsar-cassandra-source-<version>.nar \
--tenant public \
--namespace default \
--destination-topic-name persistent://public/default/data-<keyspace>.<table> \
@@ -185,19 +185,3 @@ pulsar-client consume -s mysub -st auto_consume -n 0 persistent://public/default

For more on monitoring your {cdc_cass} deployment, see xref:monitor.adoc[Monitor {cdc_cass}]. +
For using CDC with Astra DB, see https://docs.datastax.com/en/astra-streaming/docs/astream-cdc.html[CDC for Astra DB].