Configuration

Brian L. Troutwine edited this page Dec 14, 2017 · 25 revisions

Cernan has many options to configure its runtime behaviour. While a few command line flags are available, the preferred method to configure cernan is via a TOML file. This page discusses the options available in that format.

Sources

The cernan server has options to control which ports and protocols are running ingestion interfaces. These are referred to as 'sources'. By default cernan will listen on UDP:8125 for statsd traffic and on TCP:2003 for graphite traffic. The following enables, with their default settings, all sources that can be enabled this way:

[sources]
  [sources.statsd.primary]
  [sources.graphite.primary]
  [sources.native.primary]

The full documentation for sources is here.

Filters

The cernan server provides a mechanism to transform in-flight telemetry and logs. This mechanism is the 'filter'.

The full documentation is here.

Sinks

The cernan server has many ways to ship data to external systems. These are called 'sinks'. By default no sinks are enabled, and in this mode cernan has nowhere to ship the data it receives. Sinks are configured individually, by name.

[sinks]
  [sinks.console]
  [sinks.wavefront]
  [sinks.null]
  [sinks.firehose.stream_one]

For sinks which support it cernan can report metadata about the metric. These are called "tags" by some aggregators and cernan uses this terminology. In AWS you might choose to include your instance ID with each metric point, as well as the service name. You may set tags like so:

[tags]
source = "cernan"
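
As a sketch of the AWS case mentioned above, the fragment below combines a literal tag with one read from the environment, using the environment form described under Global Tags on this page. The service name checkout and the variable EC2_INSTANCE_ID are hypothetical; cernan does not define them, and your deployment would need to export the variable itself.

```toml
[tags]
# Literal value: a hypothetical service name attached to every point.
service = "checkout"
# Value read from the environment. EC2_INSTANCE_ID is an assumed variable
# name that must be exported before cernan starts.
instance_id = { environment = true, value = "EC2_INSTANCE_ID" }
```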

Please see Sinks for the list of supported sinks and for sink-specific configuration details.

Forwards

A forward is a routing destination: it connects one source or filter to potentially many sinks or filters. Each source must set at least one forward. Forwards are configured through the forwards parameter on each source. Consider the following:

[sources]
  [sources.statsd.primary]
  enabled = true
  port = 8125
  forwards = ["sinks.console"]

  [sources.statsd.secondary]
  enabled = true
  port = 8126
  forwards = ["sinks.null"]

  [sources.graphite.primary]
  enabled = true
  port = 2004
  forwards = ["sinks.null", "sinks.console"]

[sinks]
  [sinks.console]
  bin_width = 1

  [sinks.null]
  bin_width = 1

This configures cernan with two statsd sources, named 'primary' and 'secondary', listening on ports 8125 and 8126 respectively. Additionally, a single graphite source is enabled on port 2004. The primary statsd source will forward all of its metrics to the console sink while the secondary statsd source will forward to the null sink. The graphite source will forward its metrics to both configured sinks, null and console.
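
Because a forward may also name a filter, a source can route through a filter before its data reaches a sink. The following is a minimal sketch only. The filters.programmable table, its script key, the script name keep_count.lua and the forward name filters.programmable.keep_count are assumptions made for illustration and are not confirmed by this page; consult the filter documentation for the actual names.

```toml
[sources]
  [sources.statsd.primary]
  enabled = true
  port = 8125
  # Assumed naming: route statsd traffic into the filter rather than a sink.
  forwards = ["filters.programmable.keep_count"]

[filters]
  # Hypothetical filter; the table name and the script key are assumptions.
  [filters.programmable.keep_count]
  script = "keep_count.lua"
  # The filter forwards in turn to one or more sinks or further filters.
  forwards = ["sinks.console"]

[sinks]
  [sinks.console]
  bin_width = 1
```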

Flush Interval

By default cernan's sinks will flush every sixty seconds. You may adjust this behaviour by modifying the flush-interval directive:

flush-interval=<INT> How frequently to flush metrics to the sinks in seconds. [default: 60].

The flush-interval does not affect aggregations. Cernan's aggregation model is discussed in full on this wiki's data model page. Sinks accept independent flush interval configuration, but at the sink level the option must be spelled flush_interval. Note the underscore.
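
The difference in spelling between the global and per-sink options is easy to miss, so here is a minimal sketch showing both together. The values 30 and 10 are arbitrary, and it is an assumption that a sink without its own flush_interval uses the global value.

```toml
# Global setting, spelled with a hyphen. Top-level keys must appear
# before any [table] header in a TOML file.
flush-interval = 30

[sinks]
  [sinks.console]
  # Per-sink setting, spelled with an underscore.
  flush_interval = 10

  [sinks.null]
  # No flush_interval here; assumed to use the global 30 seconds.
  bin_width = 1
```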

Global Tags

For sinks which support it cernan can report metadata about the metric. These are called "tags" by some aggregators and cernan uses this terminology. You may configure tags per-sink – see sink documentation – or you may specify global tags to be applied to all sinks. You may set global tags like so:

[tags]
source = "cernan"
hostname = { environment = true, value = "HOSTNAME" }

The first will set the tag source to have the value cernan; the second will set hostname to the value of the HOSTNAME environment variable. Each key / value pair will be converted to the appropriate tag, depending on the sink.

Hopper Index Size

Cernan separates its source / filter / sink threads by using a disk-backed mpsc (multi-producer, single-consumer) queue variant called hopper. In overload conditions hopper will buffer data to disk, keeping cernan's memory use low. Hopper works by writing index files to disk. This option controls the maximum size of these files.

max-hopper-queue-bytes = <INT> Soft-maximum size in bytes of hopper index files [default: 104857600]

The default size of the index files is 100 MiB (104857600 bytes). On disk-constrained systems with complex event routing you may wish to set this value lower.
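
For example, a disk-constrained host might lower the limit to a tenth of the default. The figure below is illustrative and is not a recommendation.

```toml
# 10 MiB = 10 * 1024 * 1024 bytes. This is a soft maximum per index file.
max-hopper-queue-bytes = 10485760
```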

Data Directory

By default cernan will put its on-disk queues into TMPDIR. While this is acceptable for testing and development this is not desirable for production deployments. You may adjust where cernan stores its on-disk queues with the data-directory option:

data-directory = "/var/lib/cernan/"

In the above, we are requiring that cernan store its files in /var/lib/cernan. The structure of this data is not defined. Cernan will not create the path data-directory points to if it does not exist.

Scripts Directory

By default cernan will read programmable filters from /tmp/cernan-scripts. While this is acceptable for testing and development, this is not desirable for production deployments. You may adjust where cernan searches for on-disk scripts with the scripts-directory option:

scripts-directory = "/etc/cernan/scripts"

Cernan will not create the path scripts-directory points to if it does not exist.
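
A production configuration would typically set both directory options together. The sketch below reuses the example paths from this page. Both directories must already exist when cernan starts, since cernan will not create either of them.

```toml
# Top-level keys; keep these above any [table] header in the file.
data-directory = "/var/lib/cernan/"        # on-disk queues
scripts-directory = "/etc/cernan/scripts"  # programmable filter scripts
```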

Eliding Points

In some cases it's not necessary for cernan to ship aggregates for every second in which it receives points to achieve a statistically accurate impression of your system. To that end, cernan allows the user to control the width of aggregation bins on a per-sink basis. For instance, the following will aggregate points into 1 second bins for the console sink and 10 second bins for the wavefront sink:

[sinks.console]
bin_width = 1

[sinks.wavefront]
bin_width = 10