

# Kafka Puppet Module

A Puppet module for installing and managing Apache Kafka brokers.

This module is currently being maintained by The Wikimedia Foundation in Gerrit at operations/puppet/kafka and mirrored here on GitHub. It was originally developed for 0.7.2 at https://github.com/wikimedia/puppet-kafka-0.7.2.

## Requirements

## Usage

### Kafka (Clients)

```puppet
# Install the kafka package.
class { 'kafka': }
```

This will install the Kafka package, which includes `/usr/sbin/kafka`, useful for running client commands (console-consumer, console-producer, etc.).

### Kafka Broker Server

```puppet
# Include Kafka Broker Server.
class { 'kafka::server':
    log_dirs         => ['/var/spool/kafka/a', '/var/spool/kafka/b'],
    brokers          => {
        'kafka-node01.example.com' => { 'id' => 1, 'port' => 12345 },
        'kafka-node02.example.com' => { 'id' => 2 },
    },
    zookeeper_hosts  => ['zk-node01:2181', 'zk-node02:2181', 'zk-node03:2181'],
    zookeeper_chroot => '/kafka/cluster_name',
}
```

`log_dirs` defaults to a single `['/var/spool/kafka']`, but you may specify multiple Kafka log data directories here. This is useful for spreading your topic partitions across multiple disks.

The `brokers` parameter is a Hash keyed by `$::fqdn`. Each value is another Hash containing config settings for that Kafka host. `id` is required and must be unique for each Kafka Broker Server host. `port` is optional and defaults to 9092.
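As an illustration, the per-node lookup can be sketched as follows. This is a simplified, hypothetical sketch, not the module's actual code; `pick()` comes from the puppetlabs-stdlib module.

```puppet
# Hypothetical sketch: how per-node settings could be derived
# from the brokers Hash shown above.
$brokers = {
    'kafka-node01.example.com' => { 'id' => 1, 'port' => 12345 },
    'kafka-node02.example.com' => { 'id' => 2 },
}

# Look up this node's entry by its fully qualified domain name.
$broker_config = $brokers[$::fqdn]
$broker_id     = $broker_config['id']

# 'port' is optional; fall back to the 9092 default when unset.
# pick() is provided by puppetlabs-stdlib.
$broker_port   = pick($broker_config['port'], 9092)
```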

Each Kafka Broker Server's `broker_id` and `port` properties in server.properties will be set by looking up the node's `$::fqdn` in the `brokers` Hash passed into the `kafka::server` class.

`zookeeper_hosts` is an array of Zookeeper host:port pairs. `zookeeper_chroot` is optional, and allows you to specify a Znode under which Kafka will store its metadata in Zookeeper. This is useful if you want to use a single Zookeeper cluster to manage multiple Kafka clusters. See below for information on how to create this Znode in Zookeeper.

#### Custom Zookeeper Chroot

If Kafka will share a Zookeeper cluster with other users, you might want to create a Znode in Zookeeper in which to store your Kafka cluster's data. You can set the `zookeeper_chroot` parameter on the `kafka::server` class to do this.

First, you'll need to create the znode manually yourself. You can use the `zkCli.sh` that ships with Zookeeper, or you can use the built-in `kafka zookeeper-shell` command:

```
$ kafka zookeeper-shell :2182
Connecting to kraken-zookeeper
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: kraken-zookeeper(CONNECTED) 0] create /my_kafka kafka
Created /my_kafka
```

You can use whatever chroot znode path you like. The second argument (`data`) is arbitrary; 'kafka' is used here.

Then:
```puppet
class { 'kafka::server':
    brokers => {
        'kafka-node01.example.com' => { 'id' => 1, 'port' => 12345 },
        'kafka-node02.example.com' => { 'id' => 2 },
    },
    zookeeper_hosts => ['zk-node01:2181', 'zk-node02:2181', 'zk-node03:2181'],
    # set zookeeper_chroot on the kafka class.
    zookeeper_chroot => '/kafka/clusterA',
}
```

### Kafka Mirror

Kafka MirrorMaker is usually used for inter-datacenter Kafka cluster replication and aggregation. You can consume from any number of source Kafka clusters, and produce to a single destination Kafka cluster.

```puppet
# Configure kafka-mirror to produce to Kafka Brokers which are
# part of our kafka aggregator cluster.
class { 'kafka::mirror':
    destination_brokers => {
        'kafka-aggregator01.example.com' => { 'id' => 11 },
        'kafka-aggregator02.example.com' => { 'id' => 12 },
    },
    topic_whitelist => 'webrequest.*',
}

# Configure kafka-mirror to consume from both clusterA and clusterB.
kafka::mirror::consumer { 'clusterA':
    zookeeper_hosts  => ['zk-node01:2181', 'zk-node02:2181', 'zk-node03:2181'],
    zookeeper_chroot => ['/kafka/clusterA'],
}
kafka::mirror::consumer { 'clusterB':
    zookeeper_hosts  => ['zk-node01:2181', 'zk-node02:2181', 'zk-node03:2181'],
    zookeeper_chroot => ['/kafka/clusterB'],
}
```

### jmxtrans monitoring

This module contains a class called `kafka::server::jmxtrans`. It contains a useful jmxtrans JSON config object that can be used to tell jmxtrans to send to any output writer (Ganglia, Graphite, etc.). To use this, you will need the puppet-jmxtrans module.

```puppet
# Include this class on each of your Kafka Broker Servers.
class { '::kafka::server::jmxtrans':
    ganglia => 'ganglia.example.com:8649',
}
```

This will install jmxtrans and render JSON config files for sending JVM and Kafka Broker stats to Ganglia. See kafka-jmxtrans.json.md for a fully rendered jmxtrans Kafka JSON config file.
