Merged
18 changes: 6 additions & 12 deletions docs/modules/ROOT/pages/backfill-cli.adoc
@@ -3,7 +3,7 @@
When CDC is enabled on a table, the data topic doesn't contain any data from before CDC was enabled.
The backfill CLI solves this problem by exporting the table's primary key to a Comma Separated Values (CSV) file, storing the CSV file on disk, and sending the primary key from the CSV file to the event topic.
The Cassandra Source Connector reads the primary key from the event topic and populates the data topic with historical data.
-The backfill CLI is powered by the https://docs.datastax.com/en/dsbulk/docs/reference/dsbulkCmd.html[DataStax Bulk Loader], a battle-tested data loader tool. This means the CLI takes full advantage of optimizations done in DSBulk when exporting data from table to disk.
+The backfill CLI is powered by the xref:dsbulk:overview:dsbulk-about.adoc[DataStax Bulk Loader], a battle-tested data loader tool. This means the CLI takes full advantage of optimizations done in DSBulk when exporting data from table to disk.

Developers can also use the backfill CLI to trigger change events for downstream applications without having to insert new data.
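
The flow described in this hunk (export primary keys to CSV, then send each key to the event topic) can be sketched as follows. This is a minimal illustration and not the CLI's implementation: the `backfill` and `read_primary_keys` helpers, the `publish` callback, and the sample `id,day` primary key are all hypothetical, and a real run would publish to Pulsar instead of appending to a list.

```python
import csv
import io
from typing import Callable, Dict, List


def read_primary_keys(csv_text: str) -> List[Dict[str, str]]:
    """Parse an exported CSV: a header row, then one row per primary key."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def backfill(csv_text: str, publish: Callable[[Dict[str, str]], None]) -> int:
    """Hand every exported primary key to `publish` (a stand-in for the
    events-topic producer) and return how many keys were sent."""
    count = 0
    for key in read_primary_keys(csv_text):
        publish(key)
        count += 1
    return count


# Hypothetical export of a table whose primary key is (id, day).
exported = "id,day\n1,2023-01-01\n2,2023-01-02\n"
sent: List[Dict[str, str]] = []
backfill(exported, sent.append)
```

Only primary keys travel through the CSV file and the event topic; per the text above, the Cassandra Source Connector then reads those keys and populates the data topic.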

@@ -64,7 +64,7 @@ The Pulsar-admin extension is packaged with the IBM Elite Support for Apache Pul

. Move the generated NAR archive to the /cliextensions folder of your Pulsar installation (e.g. /pulsar/cliextensions).
. Modify the client.conf file of your Pulsar installation to include: `customCommandFactories=cassandra-cdc`.
-. Run the following command (this assumes the https://docs.datastax.com/en/installing/docs/installTARdse.html[default installation] of DSE Cassandra):
+. Run the following command (this assumes the https://docs.datastax.com/en/dse/6.8/installing/tarball-dse.html[default tarball installation of DSE]):
+
[source,shell]
----
@@ -80,11 +80,11 @@ This test quickly confirms your CDC backfill is working correctly.

*Prerequisites:*

-* A running https://docs.datastax.com/en/installing/docs/installTARdse.html[DSE Cassandra cluster]
+* A running DSE cluster
* A running Pulsar cluster (https://pulsar.apache.org/docs/getting-started-standalone/[standalone] is fine)
* Backfill CLI built with Gradle (see <<install>>)

-. Start DSE Cassandra from the https://docs.datastax.com/en/installing/docs/installTARdse.html[installation directory].
+. Start DSE:
+
[source,bash]
----
@@ -299,7 +299,7 @@ value.
|An extra DSBulk option to use when exporting. Any valid DSBulk option
can be specified here, and it will be passed as-is to the DSBulk
process. DSBulk options, including driver options, must be passed as
-'--long.option.name=<value>'. Short options are not supported. For more DSBulk options, see https://docs.datastax.com/en/dsbulk/docs/reference/commonOptions.html[here].
+'--long.option.name=<value>'. Short options are not supported.

|--export-host=HOST[:PORT]
|The host name or IP and, optionally, the port of a node from the
@@ -380,10 +380,4 @@ These parameters should be passed as command line arguments in the standalone Ja
|The path to the trusted TLS certificate file.
|--pulsar-ssl-use-key-store-tls
|If TLS is enabled, specifies whether to use KeyStore type as TLS configuration parameter.
-|===
-
-== What's next?
-
-* xref:index.adoc[CDC Home]
-* https://docs.datastax.com/en/dsbulk/docs/reference/dsbulkCmd.html[DataStax Bulk Loader]
-* For more on using CDC with Apache Pulsar, including schema management and consumption patterns, see our https://docs.datastax.com/en/streaming/streaming-learning/use-cases-architectures/change-data-capture/index.html[Streaming learning page].
+|===
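
The option table in this file states that extra DSBulk options must be passed as `--long.option.name=<value>` and that short options are not supported. A minimal sketch of checking that shape before invoking the CLI is shown below. The `is_long_option` helper and its regular expression are assumptions made for illustration; they are not part of the backfill CLI or DSBulk, and DSBulk's own parser may accept or reject forms this pattern does not anticipate.

```python
import re

# Assumed shape: two dashes, a dotted option name, '=', then a non-empty value.
_LONG_OPTION = re.compile(
    r"^--[A-Za-z][A-Za-z0-9_-]*(\.[A-Za-z][A-Za-z0-9_-]*)*=.+$"
)


def is_long_option(arg: str) -> bool:
    """Return True if `arg` looks like '--long.option.name=<value>'."""
    return _LONG_OPTION.match(arg) is not None


# The placeholder form from the option table passes the check.
ok = is_long_option("--long.option.name=42")
# A short option, or a long option without a value, does not.
short_rejected = not is_long_option("-h=localhost")
missing_value_rejected = not is_long_option("--long.option.name")
```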
13 changes: 6 additions & 7 deletions docs/modules/ROOT/pages/cdcExample.adoc
@@ -4,9 +4,8 @@ Capture schema changes in your C* tables and pass them to Apache Pulsar(R) with

This installation requires the following. Latest version artifacts are available https://github.com/datastax/cdc-apache-cassandra/releases/latest[here]. Use image:https://img.shields.io/github/v/release/datastax/cdc-apache-cassandra?color=green&display_name=tag[link="https://github.com/datastax/cdc-apache-cassandra/releases/latest"] for the latest version.

-* C* or DSE environment
-** https://downloads.datastax.com/#enterprise[DSE 6.8.16+]
-** https://cassandra.apache.org/_/download.html[OSS C*]
+* DSE 6.8.16 or later
+* OSS Apache Cassandra(R)
* CDC Agent
** DSE - use `agent-dse4-<version>-all.jar`
** OSS C* - use `agent-c4-<version>-all.jar`
@@ -29,7 +28,8 @@ bin/pulsar standalone
====
We recommend using the latest CDC agent version (at least version `1.0.4`+) to support C* collection data types.
====
-. Install C*/DSE with your preferred https://docs.datastax.com/en/install/6.8/install/installWhichOne.html[installation method].
+
+. Install C*/DSE.

. After installing C*/DSE, but before starting the C*/DSE service, set the `cassandra-env.sh` configuration:
+
@@ -181,7 +181,6 @@ Any captured CDC events from the C* table will be reflected in the command line
pulsar-client consume -s mysub -st auto_consume -n 0 persistent://public/default/data-<keyspace>.<table>
----

-== What's next?
+== See also

-For more on monitoring your {cdc_cass} deployment, see xref:monitor.adoc[Monitor {cdc_cass}]. +
-For using CDC with Astra DB, see https://docs.datastax.com/en/astra-streaming/docs/astream-cdc.html[CDC for Astra DB].
+* xref:monitor.adoc[]
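
The `pulsar-client consume` command in this file reads from `persistent://public/default/data-<keyspace>.<table>`. A small helper that assembles that topic name is sketched below. The `data_topic` function is hypothetical, and the `public` tenant and `default` namespace defaults are taken only from the command shown in the diff; a deployment may use different values.

```python
def data_topic(keyspace: str, table: str,
               tenant: str = "public", namespace: str = "default") -> str:
    """Build the data topic name in the form used by the consume command:
    persistent://<tenant>/<namespace>/data-<keyspace>.<table>"""
    return f"persistent://{tenant}/{namespace}/data-{keyspace}.{table}"


# Hypothetical keyspace and table names.
topic = data_topic("ks1", "table1")
```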
7 changes: 1 addition & 6 deletions docs/modules/ROOT/pages/index.adoc
@@ -159,9 +159,4 @@ To ensure the data sent to all datacenters are delivered to the data topic, make
For example, given a Cassandra cluster with three datacenters (DC1, DC2, and DC3), you would enable CDC and install the change agent in only DC1.
To ensure all updates in DC2 and DC3 are propagated to the data topic, configure the table's keyspace to replicate data from DC2 and DC3 to DC1.
For example, `replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3, 'dc3': 3}`.
-The data replicated to DC1 will be processed by the change agent and eventually end up in the data topic.
-
-== What's next?
-
-* For more on using CDC with Apache Pulsar, including schema management and consumption patterns, see our https://docs.datastax.com/en/streaming/streaming-learning/use-cases-architectures/change-data-capture/index.html[Streaming learning page].
-* If you've got more questions about {cdc_cass_first}, see xref::faqs.adoc[].
+The data replicated to DC1 will be processed by the change agent and eventually end up in the data topic.
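
The replication setting quoted in this hunk can be generated from a mapping of datacenter names to replica counts. The sketch below is illustrative only: `replication_clause` and `alter_keyspace` are hypothetical helpers, the keyspace name `ks1` is made up, and the resulting statement is plain CQL text that would still need to be run with a tool such as `cqlsh`.

```python
from typing import Dict


def replication_clause(dc_replicas: Dict[str, int]) -> str:
    """Render a NetworkTopologyStrategy replication map as CQL text."""
    if not dc_replicas:
        raise ValueError("at least one datacenter is required")
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in dc_replicas.items()]
    return "{" + ", ".join(parts) + "}"


def alter_keyspace(keyspace: str, dc_replicas: Dict[str, int]) -> str:
    """Build an ALTER KEYSPACE statement that sets the replication map."""
    return (
        f"ALTER KEYSPACE {keyspace} "
        f"WITH replication = {replication_clause(dc_replicas)};"
    )


# Three datacenters, each holding three replicas, as in the example above.
statement = alter_keyspace("ks1", {"dc1": 3, "dc2": 3, "dc3": 3})
```

Listing the datacenter that runs the change agent (DC1 in the example) in the map is what lets writes made in the other datacenters reach it.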
22 changes: 4 additions & 18 deletions docs/modules/ROOT/pages/install.adoc
@@ -1,15 +1,6 @@
= Installing {cdc_cass} for VM deployment

-== Download the DataStax Change Data Capture (CDC) Agent for Apache Cassandra(R)
-
-[IMPORTANT]
-====
-By downloading this DataStax product, you agree to the terms of the open-source https://www.apache.org/licenses/LICENSE-2.0[Apache-2.0 license agreement].
-====
-
-Perform the following steps:
-
-. Download the change agent tar file from the https://downloads.datastax.com/#cassandra-change-agent[DataStax downloads page]. +
+. Download the `cassandra-source-agents` tar file from the https://github.com/datastax/cdc-apache-cassandra/releases[{cdc_cass} GitHub repository].
The following files are available in the tar file:
+
[cols="1,1"]
@@ -89,12 +80,7 @@ include::partial$agentParams.adoc[]

== Download {cdc_pulsar}

-IMPORTANT
-====
-By downloading this DataStax product, you agree to the terms of the open-source https://www.apache.org/licenses/LICENSE-2.0[Apache-2.0 license agreement].
-====
-
-Download the `cassandra-source-connectors-<version>.tar` file from the https://downloads.datastax.com/#cassandra-source-connector[DataStax downloads page].
+Download the `cassandra-source-connectors` tar file from the https://github.com/datastax/cdc-apache-cassandra/releases[{cdc_cass} GitHub repository].

For Apache Pulsar and IBM Elite Support for Apache Pulsar (formerly DataStax Luna Streaming) 2.8, the `pulsar-cassandra-source-<version>.nar` file is available.

Expand Down Expand Up @@ -198,7 +184,7 @@ The following table identifies functionally equivalent {cdc_pulsar} and DataStax
NOTE: If you define both in your configuration, the {cdc_pulsar} setting take precedence over the `datastax-java-driver.property-name`.
If you do not provide either in your configuration, {cdc_pulsar} defaults are in effect.

-For information about the Java properties, refer to the link:https://docs.datastax.com/en/developer/java-driver-dse/2.3/manual/core/configuration/[DataStax Java driver documentation].
+For information about the Java properties, refer to the https://docs.datastax.com/en/developer/java-driver/4.3/manual/core/configuration/reference/index.html[DataStax Java driver documentation].

|===
| {csc_pulsar_first} | Using datastax-java-driver prefix
Expand Down Expand Up @@ -238,7 +224,7 @@ datastax-java-driver.basic.contact-points = 127.0.0.1:9042, 127.0.0.2:9042

=== Java driver reference

-For more information, refer to the link:https://docs.datastax.com/en/developer/java-driver/4.3/manual/core/configuration/reference/[Java driver reference configuration] topic.
+For more information, refer to the https://docs.datastax.com/en/developer/java-driver/4.3/manual/core/configuration/reference/index.html[Java driver reference configuration] topic.

== Scaling up your configuration

4 changes: 2 additions & 2 deletions docs/modules/ROOT/pages/monitor.adoc
@@ -146,7 +146,7 @@ pulsar_source_user_metric__sum{tenant="public",namespace="public/default",name="

== Monitoring and Alerting resources

-* The change agent exposes metrics with https://docs.datastax.com/en/landing_page/doc/landing_page/metricsandalerts.html[JMX], a technology within Java that provides tools for managing and monitoring applications.
-* https://docs.datastax.com/en/opscenter/6.8/[DSE Ops Center] can collect these exposed metrics for visualization and alerts, and pass them on to https://docs.datastax.com/en/monitoring/doc/monitoring/opsUseMetricsCollector.html[DSE Metrics Collector] for additional integration with Prometheus and Grafana.
+* The change agent exposes metrics with https://docs.datastax.com/en/planning/dse/metrics-alerts.html[JMX], a technology within Java that provides tools for managing and monitoring applications.
+* https://docs.datastax.com/en/opscenter/6.8/overview/opscenter-about.html[DSE Ops Center] can collect these exposed metrics for visualization and alerts, and pass them on to https://docs.datastax.com/en/monitoring/ops-use-metrics-collector.html[DSE Metrics Collector] for additional integration with Prometheus and Grafana.
* The https://github.com/datastax/metric-collector-for-apache-cassandra[Metrics Collector for Apache Cassandra] with Prometheus and Grafana dashboards provides the same functionality as DSE Metrics Collector, built on the well-supported collectd agent.
* Other monitoring tools like https://github.com/prometheus/jmx_exporter[JMX Exporter] by Prometheus are available, but may require additional tuning.