No More Silos - CDC demo

The slides that accompany this demo can be found here: https://talks.rmoff.net/

Running the test rig

  1. Bring up the stack

    git clone https://github.com/confluentinc/demo-scene.git
    cd demo-scene/no-more-silos
    docker-compose up -d

    This brings up the stack ready for use.

  2. Wait for Kafka Connect to be started

    bash -c '
    echo "Waiting for Kafka Connect to start listening on localhost ⏳"
    while [ $(curl -s -o /dev/null -w %{http_code} http://localhost:8083/connectors) -ne 200 ] ; do
      echo -e $(date) " Kafka Connect listener HTTP state: " $(curl -s -o /dev/null -w %{http_code} http://localhost:8083/connectors) " (waiting for 200)"
      sleep 5
    done
    echo -e $(date) " Kafka Connect is ready! Listener HTTP state: " $(curl -s -o /dev/null -w %{http_code} http://localhost:8083/connectors)
    '
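
    Once the loop exits, you can also hit the worker's root REST endpoint (a standard
    Kafka Connect API), which reports the Connect version and Kafka cluster ID:

    curl -s http://localhost:8083/ | jq '.'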
  3. Make sure the connector plugins are available

    curl -s localhost:8083/connector-plugins|jq '.[].class'|egrep 'debezium.*mysql|JdbcSink'
    "io.confluent.connect.jdbc.JdbcSinkConnector"
    "io.debezium.connector.mysql.MySqlConnector"
  4. Launch MySQL CLI

    docker exec -it mysql bash -c 'mysql -u $MYSQL_USER -p$MYSQL_PASSWORD demo'
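
    For a one-off sanity check without an interactive session, pass the SQL with
    mysql's -e flag instead:

    docker exec mysql bash -c 'mysql -u $MYSQL_USER -p$MYSQL_PASSWORD demo -e "SELECT COUNT(*) FROM customers;"'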

Part 01 - Query-Based CDC

MySQL to Kafka using JDBC Source connector

  1. In MySQL, examine the data

    DESCRIBE customers;
    SELECT ID, FIRST_NAME, LAST_NAME, EMAIL, UPDATE_TS FROM customers;
    +----+------------+-----------+----------------------------+---------------------+
    | ID | FIRST_NAME | LAST_NAME | EMAIL                      | UPDATE_TS           |
    +----+------------+-----------+----------------------------+---------------------+
    |  1 | Bibby      | Argabrite | bargabrite0@google.com.hk  | 2019-04-01 16:51:18 |
    |  2 | Auberon    | Sulland   | asulland1@slideshare.net   | 2019-04-01 16:51:18 |
    |  3 | Marv       | Dalrymple | mdalrymple2@macromedia.com | 2019-04-01 16:51:18 |
    |  4 | Nolana     | Yeeles    | nyeeles3@drupal.org        | 2019-04-01 16:51:18 |
    |  5 | Modestia   | Coltart   | mcoltart4@scribd.com       | 2019-04-01 16:51:18 |
    +----+------------+-----------+----------------------------+---------------------+
    5 rows in set (0.00 sec)
  2. Create the connector

    curl -X PUT http://localhost:8083/connectors/source-jdbc-mysql-00/config \
             -H "Content-Type: application/json" -d '{
              "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
              "connection.url": "jdbc:mysql://mysql:3306/demo",
              "connection.user": "connect_user",
              "connection.password": "asgard",
              "topic.prefix": "mysql-00-",
              "poll.interval.ms": 1000,
              "tasks.max":1,
              "mode":"timestamp",
              "table.whitelist" : "demo.customers",
              "timestamp.column.name": "UPDATE_TS",
              "validate.non.null": false
              }'
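
    You can also query the status of this one connector directly via the standard
    Connect REST API:

    curl -s http://localhost:8083/connectors/source-jdbc-mysql-00/status | jq '.'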
  3. Check it’s running

    curl -s "http://localhost:8083/connectors?expand=info&expand=status" |
           jq '. | to_entries[] | [ .value.info.type, .key, .value.status.connector.state,.value.status.tasks[].state,.value.info.config."connector.class"]|join(":|:")' |
           column -s : -t| sed 's/\"//g'| sort
    source  |  source-datagen-item_details_01  |  RUNNING  |  RUNNING  |  io.confluent.kafka.connect.datagen.DatagenConnector
    source  |  source-jdbc-mysql-00            |  RUNNING  |  RUNNING  |  io.confluent.connect.jdbc.JdbcSourceConnector
  4. Examine the data

    docker exec kafkacat kafkacat                      \
            -b broker:29092                            \
            -r http://schema-registry:8081             \
            -s avro                                    \
            -t mysql-00-customers                      \
            -C -o beginning -u -q | jq  '.'
  5. Split the screen to show Kafka topic output along with MySQL.

  6. Make changes in MySQL and observe that the Kafka topic (as shown by kafkacat) updates automatically

    • Insert a new row in MySQL:

      INSERT INTO customers (ID, FIRST_NAME, LAST_NAME, EMAIL, GENDER, COMMENTS) VALUES (42, 'Rick', 'Astley', '', 'Male', '');
    • Update a row in MySQL:

      UPDATE customers SET EMAIL = 'Never.gonna.give.you@up.com' WHERE ID = 42;
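
    Note: in timestamp mode the JDBC connector simply polls for rows whose UPDATE_TS is
    greater than the highest value it has already seen. Conceptually the poll looks
    something like the query below (illustrative only; the actual SQL is generated by
    the connector from its stored offset):

    docker exec mysql bash -c 'mysql -u $MYSQL_USER -p$MYSQL_PASSWORD demo \
      -e "SELECT * FROM customers WHERE UPDATE_TS > NOW() - INTERVAL 1 MINUTE ORDER BY UPDATE_TS ASC"'

    One consequence: a hard delete never matches the predicate again, so query-based
    CDC cannot capture deletes. Log-based CDC (next) can.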

Part 02 - Log-Based CDC

MySQL to Kafka using Debezium connector

  1. In MySQL, examine the data

    SELECT ID, FIRST_NAME, LAST_NAME, EMAIL, UPDATE_TS FROM customers;
    +----+------------+-----------+-----------------------------+---------------------+
    | ID | FIRST_NAME | LAST_NAME | EMAIL                       | UPDATE_TS           |
    +----+------------+-----------+-----------------------------+---------------------+
    |  1 | Bibby      | Argabrite | bargabrite0@google.com.hk   | 2019-04-01 16:51:18 |
    |  2 | Auberon    | Sulland   | asulland1@slideshare.net    | 2019-04-01 16:51:18 |
    |  3 | Marv       | Dalrymple | mdalrymple2@macromedia.com  | 2019-04-01 16:51:18 |
    |  4 | Nolana     | Yeeles    | nyeeles3@drupal.org         | 2019-04-01 16:51:18 |
    |  5 | Modestia   | Coltart   | mcoltart4@scribd.com        | 2019-04-01 16:51:18 |
    | 42 | Rick       | Astley    | Never.gonna.give.you@up.com | 2019-04-01 17:59:43 |
    +----+------------+-----------+-----------------------------+---------------------+
    6 rows in set (0.00 sec)
  2. Create the connector

    curl -i -X PUT -H "Accept:application/json" \
        -H  "Content-Type:application/json" http://localhost:8083/connectors/source-debezium-mysql-00/config \
        -d '{
              "connector.class": "io.debezium.connector.mysql.MySqlConnector",
              "database.hostname": "mysql",
              "database.port": "3306",
              "database.user": "debezium",
              "database.password": "dbz",
              "database.server.id": "42",
              "database.allowPublicKeyRetrieval":"true",
              "database.server.name": "asgard",
              "table.whitelist": "demo.customers",
              "database.history.kafka.bootstrap.servers": "broker:29092",
              "database.history.kafka.topic": "asgard.dbhistory.demo" ,
              "include.schema.changes": "true"
        }'
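
    Debezium first snapshots the table and then tails the MySQL binlog. You can list
    the topics it has created using kafkacat's metadata mode:

    docker exec kafkacat kafkacat -b broker:29092 -L | grep asgard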
  3. Check it’s running

    curl -s "http://localhost:8083/connectors?expand=info&expand=status" |
           jq '. | to_entries[] | [ .value.info.type, .key, .value.status.connector.state,.value.status.tasks[].state,.value.info.config."connector.class"]|join(":|:")' |
           column -s : -t| sed 's/\"//g'| sort
    source  |  source-datagen-item_details_01  |  RUNNING  |  RUNNING  |  io.confluent.kafka.connect.datagen.DatagenConnector
    source  |  source-debezium-mysql-00        |  RUNNING  |  RUNNING  |  io.debezium.connector.mysql.MySqlConnector
    source  |  source-jdbc-mysql-00            |  RUNNING  |  RUNNING  |  io.confluent.connect.jdbc.JdbcSourceConnector
  4. Examine the data with kafkacat

    docker exec kafkacat kafkacat                      \
            -b broker:29092                            \
            -r http://schema-registry:8081             \
            -s avro                                    \
            -t asgard.demo.customers                   \
            -C -o beginning -u -q | jq '.'
    {
      "before": null,
      "after": {
        "Value": {
          "id": 42,
          "first_name": {
            "string": "Rick"
          },
          "last_name": {
            "string": "Astley"
          },
          "email": {
            "string": "Never.gonna.give.you@up.com"
          },
          "gender": {
            "string": "Male"
          },
          "comments": {
            "string": ""
          },
          "UPDATE_TS": {
            "string": "2019-10-23T16:29:53Z"
          }
        }
      },
      "source": {
        "version": "0.10.0.Final",
        "connector": "mysql",
        "name": "asgard",
        "ts_ms": 0,
        "snapshot": {
          "string": "last"
        },
        "db": "demo",
        "table": {
          "string": "customers"
        },
        "server_id": 0,
        "gtid": null,
        "file": "binlog.000002",
        "pos": 873,
        "row": 0,
        "thread": null,
        "query": null
      },
      "op": "c",
      "ts_ms": {
        "long": 1571848220368
      }
    }
  5. Split the screen to show Kafka topic output along with MySQL.

  6. Rerun kafkacat to show compact output

    docker exec kafkacat kafkacat                      \
            -b broker:29092                            \
            -r http://schema-registry:8081             \
            -s avro                                    \
            -t asgard.demo.customers                   \
            -C -o beginning -u -q | jq '.op, .before, .after'
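
    To tally the operation types captured so far (c=create/snapshot, u=update,
    d=delete), the same command can feed a quick aggregation; -e makes kafkacat exit
    once it reaches the end of the topic:

    docker exec kafkacat kafkacat                      \
            -b broker:29092                            \
            -r http://schema-registry:8081             \
            -s avro                                    \
            -t asgard.demo.customers                   \
            -C -o beginning -e -u -q | jq -r '.op' | sort | uniq -c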
  7. Make changes in MySQL and observe that the Kafka topic (as shown by kafkacat) updates automatically

    • Update a row in MySQL:

      UPDATE customers SET EMAIL = 'r.astley@example.com' WHERE ID = 42;
      UPDATE customers SET FIRST_NAME = 'BOB' WHERE ID = 42;
    • Delete a row in MySQL:

      DELETE FROM customers WHERE ID=2;
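
    When the row is deleted, Debezium emits an event with "op": "d" (the old values in
    before, after set to null), followed by a tombstone message with a null value so
    that the topic can be compacted. To see the tombstone too, drop the jq pipe and add
    kafkacat's -Z flag, which prints null values as NULL:

    docker exec kafkacat kafkacat                      \
            -b broker:29092                            \
            -r http://schema-registry:8081             \
            -s avro                                    \
            -t asgard.demo.customers                   \
            -C -o beginning -u -q -Z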

Optional - Stream/Table duality in ksqlDB

SET 'auto.offset.reset' = 'earliest';
CREATE STREAM CUSTOMERS_CDC_STREAM WITH (KAFKA_TOPIC='asgard.demo.customers', VALUE_FORMAT='AVRO');

CREATE TABLE CUSTOMERS AS
  SELECT AFTER->ID AS ID,
         LATEST_BY_OFFSET(AFTER->FIRST_NAME) AS FIRST_NAME,
         LATEST_BY_OFFSET(AFTER->LAST_NAME) AS LAST_NAME,
         LATEST_BY_OFFSET(AFTER->EMAIL) AS EMAIL,
         LATEST_BY_OFFSET(AFTER->GENDER) AS GENDER,
         LATEST_BY_OFFSET(AFTER->COMMENTS) AS COMMENTS
    FROM CUSTOMERS_CDC_STREAM
    GROUP BY AFTER->ID;
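
To verify what was created, DESCRIBE works on both objects (standard ksqlDB statements):

DESCRIBE CUSTOMERS_CDC_STREAM;
DESCRIBE CUSTOMERS;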
  • In MySQL, query the state:

    mysql> SELECT ID, FIRST_NAME, LAST_NAME, EMAIL FROM customers WHERE ID=42;
    +----+------------+-----------+----------------------+
    | ID | FIRST_NAME | LAST_NAME | EMAIL                |
    +----+------------+-----------+----------------------+
    | 42 | BOB        | Astley    | r.astley@example.com |
    +----+------------+-----------+----------------------+
    1 row in set (0.00 sec)
  • In ksqlDB query the table:

    SELECT * FROM CUSTOMERS WHERE ID=42;
    +-------------+-------------+-------------+-------------+-------------+-------------+
    |ID           |FIRST_NAME   |LAST_NAME    |EMAIL        |GENDER       |COMMENTS     |
    +-------------+-------------+-------------+-------------+-------------+-------------+
    |42           |BOB          |Astley       |r.astley@exam|Male         |             |
    |             |             |             |ple.com      |             |             |
  • In ksqlDB query the stream:

    SET 'auto.offset.reset' = 'earliest';
    
    SELECT OP, AFTER->ID, AFTER->FIRST_NAME, AFTER->LAST_NAME, AFTER->EMAIL
      FROM CUSTOMERS_CDC_STREAM
      WHERE AFTER->ID=42
      EMIT CHANGES;
    
    +----------------+----------------+----------------+----------------+----------------+
    |OP              |ID              |FIRST_NAME      |LAST_NAME       |EMAIL           |
    +----------------+----------------+----------------+----------------+----------------+
    |c               |42              |Rick            |Astley          |Never.gonna.give|
    |                |                |                |                |.you@up.com     |
    |u               |42              |Rick            |Astley          |r.astley@example|
    |                |                |                |                |.com            |
    |u               |42              |BOB             |Astley          |r.astley@example|
    |                |                |                |                |.com            |

Optional - Stream/table joins

  • Join to a stream of events

    CREATE STREAM RATINGS WITH (KAFKA_TOPIC='ratings',VALUE_FORMAT='AVRO');
    SELECT MESSAGE, STARS, USER_ID FROM RATINGS EMIT CHANGES;
    SELECT R.RATING_ID, R.MESSAGE, R.STARS,
          C.ID, C.FIRST_NAME + ' ' + C.LAST_NAME AS FULL_NAME, C.EMAIL AS EMAIL
          FROM RATINGS R
            INNER JOIN CUSTOMERS C
            ON R.USER_ID = C.ID
          EMIT CHANGES;
    CREATE STREAM RATINGS_ENRICHED AS
    SELECT R.RATING_ID, R.MESSAGE, R.STARS,
          C.ID, C.FIRST_NAME + ' ' + C.LAST_NAME AS FULL_NAME, C.EMAIL AS EMAIL
          FROM RATINGS R
            INNER JOIN CUSTOMERS C
            ON R.USER_ID = C.ID
          EMIT CHANGES;
    PRINT 'RATINGS_ENRICHED';
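
  • Optionally, push the enriched stream back to a database with the JDBC sink
    connector (verified as installed in the setup steps above). A minimal sketch,
    assuming connect_user has CREATE/INSERT privileges on the demo database (the
    connector name sink-jdbc-mysql-ratings-00 is just an example):

    curl -X PUT http://localhost:8083/connectors/sink-jdbc-mysql-ratings-00/config \
             -H "Content-Type: application/json" -d '{
              "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
              "connection.url": "jdbc:mysql://mysql:3306/demo",
              "connection.user": "connect_user",
              "connection.password": "asgard",
              "topics": "RATINGS_ENRICHED",
              "auto.create": "true",
              "insert.mode": "insert"
              }'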