Add Kafka rack awareness/Fetch from follower to Kafka source configuration #5926

@dpavlov-smartling

This feature would avoid cross-availability-zone data transfer charges for Kafka consumer traffic on cloud providers that bill for it. Those charges add up to a huge bill when you process a lot of logs and your Kafka infrastructure is spread across multiple availability zones.
librdkafka supports this, as described at https://docs.confluent.io/platform/current/clients/librdkafka/html/md_INTRODUCTION.html#fetch-from-follower , but we can't use it today: QW assigns Kafka pipelines to indexers on its own, so we can't simply hardcode the client.rack parameter in the Kafka source configuration like this:

version: 0.8
source_id: my-kafka-source
source_type: kafka
num_pipelines: 2
params:
  topic: my-topic
  client_params:
    bootstrap.servers: localhost:9092
    client.rack: us-east-1a

The Kafka source configuration should support reading client.rack from an environment variable on the QW indexer server.
For example, the source could be configured as follows:

version: 0.8
source_id: my-kafka-source
source_type: kafka
num_pipelines: 2
params:
  topic: my-topic
  client_params:
    bootstrap.servers: localhost:9092
    client.rack: ${QW_KAFKA_CLIENT_RACK}

Now, when a pipeline is scheduled on a QW indexer with the environment variable QW_KAFKA_CLIENT_RACK set to "us-east-1a", the Kafka source config reads the value from that variable and sets it as the client.rack parameter when connecting to the Kafka server.
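
To make the requested behavior concrete, here is a minimal sketch of the substitution step, written in Python purely for illustration. It is not how QW implements (or would implement) this. The function name `resolve_client_params` and the choice to fail on an unset variable are assumptions made for the example.

```python
import os
import re
from typing import Mapping

# Matches placeholders such as ${QW_KAFKA_CLIENT_RACK}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_client_params(client_params: Mapping[str, str],
                          env: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of client_params with ${VAR} placeholders expanded from env.

    Hypothetical helper: raises KeyError when a referenced variable is unset,
    so a misconfigured indexer fails loudly instead of connecting without a rack.
    """

    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return {key: _PLACEHOLDER.sub(lookup, str(value))
            for key, value in client_params.items()}


# The client_params from the source configuration above.
params = {
    "bootstrap.servers": "localhost:9092",
    "client.rack": "${QW_KAFKA_CLIENT_RACK}",
}

# Simulate an indexer running in us-east-1a.
resolved = resolve_client_params(params, env={"QW_KAFKA_CLIENT_RACK": "us-east-1a"})
print(resolved["client.rack"])  # prints "us-east-1a"
```

Because the lookup would happen on the indexer that runs the pipeline, the same source configuration could yield a different client.rack on each node, which is the point of the request.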
