JohnPreston/kafka-overwatch

kafka-overwatch

What started as a simple CLI/service to find Kafka cluster topics with no activity has grown into a fairly comprehensive way to monitor Kafka cluster activity.

It takes a configuration file as input, in which you list the one or more clusters you wish to monitor.

After a set period of time, it can produce a report (to local disk or AWS S3) listing the topics that have not seen any activity. It also exposes metrics via a Prometheus endpoint.

Usage

kafka-overwatch -c config.local.yaml

Features

  • Supports evaluating multiple Kafka clusters at once

  • Generates a report on topic usage based on topic watermark offsets (stored locally or in S3)

  • Generates a command script to re-create all the topics in case of disaster recovery (stored locally or in S3)

  • Exposes metrics via Prometheus

    • Topics count
    • Partitions count
    • Number of new messages (measured with topic offsets)
  • AWS Secret integration for client config values

  • Schema Registry integration

    • Scans schema registries and maps each one to one or more Kafka clusters
    • Backs up the schemas, and provides a CLI to restore schemas to an existing or new registry
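
To illustrate how topic activity can be derived from watermark offsets, here is a minimal sketch. It is not the project's actual implementation: the snapshot shape and the function names `new_messages_per_topic` and `idle_topics` are assumptions made for this example. It compares two snapshots of per-partition high watermarks taken some time apart and treats a topic as idle when none of its partitions advanced.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: {(topic, partition): high_watermark_offset}
Snapshot = Dict[Tuple[str, int], int]


def new_messages_per_topic(before: Snapshot, after: Snapshot) -> Dict[str, int]:
    """Sum, per topic, how far each partition's high watermark advanced."""
    totals: Dict[str, int] = {}
    for (topic, partition), high_after in after.items():
        # A partition missing from the earlier snapshot has no baseline,
        # so it contributes zero new messages for this interval.
        high_before = before.get((topic, partition), high_after)
        # Clamp at zero in case offsets went backwards (e.g. topic re-created).
        totals[topic] = totals.get(topic, 0) + max(high_after - high_before, 0)
    return totals


def idle_topics(before: Snapshot, after: Snapshot) -> List[str]:
    """Return the topics whose watermarks did not move between snapshots."""
    counts = new_messages_per_topic(before, after)
    return sorted(topic for topic, count in counts.items() if count == 0)


before = {("orders", 0): 100, ("orders", 1): 250, ("audit", 0): 42}
after = {("orders", 0): 130, ("orders", 1): 250, ("audit", 0): 42}

print(new_messages_per_topic(before, after))
print(idle_topics(before, after))
```

In this example only `orders` partition 0 advanced (by 30 offsets), so `audit` is the only topic reported as idle. In practice the snapshots would come from a Kafka client querying each partition's watermarks.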

Upcoming

  • Multi-nodes awareness (split the load with multiple nodes)
  • cfn-kafka-admin output format
  • topic message metadata analysis (e.g. are messages compressed?)
  • scripts to perform cleanup
  • Recommendations generated from/based on models
  • Conduktor Gateway vClusters auto-discovery

Configuration

While more comprehensive documentation is yet to be written, please refer to kafka_overwatch/specs/config.json, which is used with jsonschema to validate the input configuration.

Return codes

  • 0: all successful
  • 1: error during execution
  • 2: error importing configuration
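
When wrapping the CLI in automation, the return codes above can be mapped to readable messages. The following is a small sketch, not part of the project: `RETURN_CODES`, `describe_return_code` and `run_overwatch` are hypothetical names, and the only CLI flag used is the `-c` option shown under Usage.

```python
import subprocess

# Meanings taken from the return codes documented in this README.
RETURN_CODES = {
    0: "all successful",
    1: "error during execution",
    2: "error importing configuration",
}


def describe_return_code(code: int) -> str:
    """Translate a kafka-overwatch return code into a readable message."""
    return RETURN_CODES.get(code, f"unexpected return code {code}")


def run_overwatch(config_path: str) -> str:
    """Run the CLI with the given config file and describe how it exited."""
    # Blocks until the process exits; assumes kafka-overwatch is on PATH.
    result = subprocess.run(["kafka-overwatch", "-c", config_path])
    return describe_return_code(result.returncode)
```

For example, `run_overwatch("config.local.yaml")` would return "error importing configuration" if the process exited with code 2.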

Misc

Thanks

Thanks to the Apache Kafka open-source community for their continuous efforts in making the ecosystem great. Thanks to NASA for providing a public cluster to run tests with.

Note

Inspired by kafka-idle-topics, but completely rewritten to monitor topics continuously, similar to cruise-control.
