# Preparing an Apache Kafka cluster for metrics analysis

## Create a topic

In [None]:
kafka-topics --bootstrap-server "$KAFKA_HOST":"$KAFKA_PORT" \
    --topic my-metrics-topic \
    --create \
    --partitions 2 \
    --replication-factor 3

## List topics

In [None]:
kafka-topics --bootstrap-server "$KAFKA_HOST":"$KAFKA_PORT" \
    --list

## Write data to the topic

In [None]:
kafka-console-producer --bootstrap-server "$KAFKA_HOST":"$KAFKA_PORT" \
    --topic my-metrics-topic \
    <<<"$(yes 'Hello, World!' 2>/dev/null | head -n 100)"

## Read data from the topic

In [None]:
kafka-console-consumer --bootstrap-server "$KAFKA_HOST":"$KAFKA_PORT" \
    --topic my-metrics-topic \
    --group my-metrics-topic-group-id \
    --consumer-property client.id=my-metrics-client-id \
    --from-beginning \
    --timeout-ms 10000

Once new data is added to the topic, the consumer group `my-metrics-topic-group-id` accumulates lag (it falls behind the producer):

In [None]:
kafka-console-producer --bootstrap-server "$KAFKA_HOST":"$KAFKA_PORT" \
    --topic my-metrics-topic \
    <<<"$(yes 'Привет, мир!' 2>/dev/null | head -n 100)"
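
The accumulated lag can be inspected with `kafka-consumer-groups --describe`, which prints one row per partition with a `LAG` column. The sketch below sums that column into a single number. `total_lag` is a hypothetical helper name, and the sketch assumes whitespace-separated output with a header row that contains the word `LAG`.

```shell
# Sum the LAG column of `kafka-consumer-groups --describe` output read from stdin.
# Assumption: the header row names the column "LAG" and columns are whitespace-separated.
total_lag() {
    awk '
        # Until the header row is seen, look for the column named LAG.
        lag_col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i
            next
        }
        # Data rows: add numeric values only; "-" marks an unknown lag.
        $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
        END { print sum + 0 }
    '
}
```

One possible use: `kafka-consumer-groups --bootstrap-server "$KAFKA_HOST":"$KAFKA_PORT" --describe --group my-metrics-topic-group-id | total_lag`.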

## Schema Registry

In [None]:
cat <<EOF > /tmp/orders-value.avsc
{
    "schema": "{ \
        \"type\": \"record\", \
        \"name\": \"Orders\", \
        \"namespace\": \"com.github.neshkeev.kafka.customer.avro\", \
        \"fields\":[ \
            { \
                \"name\":\"id\", \
                \"type\":\"string\" \
            }, \
            { \
                \"name\":\"name\", \
                \"type\":\"string\" \
            } \
        ] \
    }" \
}
EOF
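
The file above is the Schema Registry request body: the Avro schema itself is carried as an escaped string in the `schema` field. As an alternative to escaping by hand, the sketch below wraps a plain Avro schema read from stdin. `schema_payload` is a hypothetical helper name, and the sketch assumes the schema contains no control characters other than newlines.

```shell
# Wrap an Avro schema (stdin) into a {"schema": "..."} request body.
# Backslashes and double quotes are escaped; newlines become spaces.
schema_payload() {
    printf '{"schema": "%s"}\n' "$(
        sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' | tr '\n' ' ' | sed -e 's/ *$//'
    )"
}
```

With a plain schema stored in a file (the path `/tmp/orders.avsc` is hypothetical), the body could be piped straight into `curl`: `schema_payload < /tmp/orders.avsc | curl -s http://schema-registry:8081/subjects/orders-value/versions -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" -d @-`.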

In [None]:
curl -s http://schema-registry:8081/subjects/orders-value/versions \
    -X POST \
    -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    -d '@/tmp/orders-value.avsc'

In [None]:
cat <<EOF > /tmp/orders-value-v2.avsc
{
    "schema": "{ \
        \"type\": \"record\", \
        \"name\": \"Orders\", \
        \"namespace\": \"com.github.neshkeev.kafka.customer.avro\", \
        \"fields\":[ \
            { \
                \"name\":\"id\", \
                \"type\":\"string\" \
            }, \
            { \
                \"name\":\"name\", \
                \"type\":\"string\" \
            }, \
            { \
                \"name\":\"description\", \
                \"type\":\"string\", \
                \"default\":\"\" \
            } \
        ] \
    }" \
}
EOF

In [None]:
curl -s http://schema-registry:8081/subjects/orders-value/versions \
    -X POST \
    -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    -d '@/tmp/orders-value-v2.avsc'

## Kafka Connect

In [None]:
cat <<EOF > /tmp/my-heartbeat-connector.json
{
   "config" : {
      "connector.class" : "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
      "name" : "my-heartbeat-connector",
      "source.cluster.alias" : "source",
      "target.cluster.bootstrap.servers" : "kafka1:9092",
      "target.cluster.sasl.mechanism" : "PLAIN",
      "target.cluster.security.protocol" : "PLAINTEXT"
   },
   "name" : "my-heartbeat-connector"
}
EOF

In [None]:
curl -s -X POST http://connect:8083/connectors \
    -H 'Content-Type: application/json' \
    -d '@/tmp/my-heartbeat-connector.json' | json_pp
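
After the connector is created, its state can be checked through the Kafka Connect REST API (`GET /connectors/{name}/status`). The sketch below extracts every `state` field from that response with `grep` and `sed` instead of a JSON parser. `connector_states` is a hypothetical helper name, and the sketch assumes each state appears as a `"state": "..."` string field.

```shell
# Print each "state" value found in a Kafka Connect status response (stdin),
# one per line: the connector state first, then the task states.
connector_states() {
    grep -o '"state" *: *"[^"]*"' | sed -e 's/.*"\([^"]*\)"$/\1/'
}
```

One possible use: `curl -s http://connect:8083/connectors/my-heartbeat-connector/status | connector_states`.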

## KSQL

In [None]:
# Join all lines of stdin into one line so the text fits into a JSON string.
function single_line() {
    tr '\n' ' '
}

In [None]:
function prepare_ksql_query() {
    local script=${1}
    [ -f "$script" ] || {
        echo "File ${script} not found." >&2
        return 1
    }

    cat <<EOF
{
    "ksql": "$(single_line < "${script}")",
    "streamsProperties": {}
}
EOF
}

In [None]:
cat <<EOF > /tmp/create-customers-stream.cli
CREATE STREAM customers (
    id INT,
    name VARCHAR
) WITH (
    kafka_topic='customers',
    value_format='avro',
    partitions=2
);
EOF

In [None]:
curl -s "http://ksqldb-server:8088/ksql" \
    -X POST \
    -H 'Accept: application/vnd.ksql.v1+json' \
    -d "$(prepare_ksql_query /tmp/create-customers-stream.cli)" | json_pp

In [None]:
cat <<EOF > /tmp/insert-customers.cli
INSERT INTO customers VALUES($RANDOM, 'John Doe');
INSERT INTO customers VALUES($RANDOM, 'Maria Stewart');
EOF

In [None]:
curl -s "http://ksqldb-server:8088/ksql" \
    -X POST \
    -H 'Accept: application/vnd.ksql.v1+json' \
    -d "$(prepare_ksql_query /tmp/insert-customers.cli)" | json_pp

In [None]:
cat <<EOF > /tmp/select-customers.cli
SELECT *
  FROM customers EMIT CHANGES;
EOF

The query is started in the background:

In [None]:
curl -s "http://ksqldb-server:8088/query" \
    -X POST \
    -H 'Accept: application/vnd.ksql.v1+json' \
    -d "$(prepare_ksql_query /tmp/select-customers.cli)" &

Running any command makes the notebook print the output of the background command:

In [None]:
echo

In the output, find a fragment such as `"queryId":"transient_CUSTOMERS_3399213554579640620"`, copy the `queryId` value, and use it to stop the query:

In [None]:
QUERY_ID=''
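
Instead of copying the value by hand, the `queryId` could be extracted from captured output. This is a minimal sketch, assuming the output of the background `curl` was redirected to a file and that `queryId` appears there as a JSON string field. `extract_query_id` is a hypothetical helper name.

```shell
# Print the first queryId value found in the text read from stdin.
extract_query_id() {
    sed -n 's/.*"queryId" *: *"\([^"]*\)".*/\1/p' | head -n 1
}
```

If the output was saved to a file (the path `/tmp/select-customers.out` is hypothetical), the variable could be set with `QUERY_ID=$(extract_query_id < /tmp/select-customers.out)`.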

In [None]:
[ -n "$QUERY_ID" ] || {
    echo 'Please store the queryId value in the QUERY_ID variable' >&2
    false
}

In [None]:
curl -s "http://ksqldb-server:8088/close-query" \
    -X POST \
    -H 'Accept: application/vnd.ksql.v1+json' \
    -d "{\"queryId\": \"${QUERY_ID}\"}" | json_pp