Quickstart

Brian L. Troutwine edited this page Oct 5, 2017 · 9 revisions

Building Cernan

The ambition is for cernan to be easily installed and run on development machines. The only slight rub is that you will need to install Rust first, which should be as simple as:

> curl -sSf https://static.rust-lang.org/rustup.sh | sh

Ensure that ~/.cargo/bin is in your PATH. Then from the root of this project:

> cargo install
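If the shell cannot find cargo, ~/.cargo/bin is most likely missing from your PATH. The following is a minimal sketch of a check, assuming a POSIX-style shell; it only changes PATH for the current session.

```shell
# Directory where cargo and the binaries it installs are expected to live.
cargo_bin="$HOME/.cargo/bin"

# Wrap PATH in colons so the match works at the start, middle, or end.
case ":$PATH:" in
  *":$cargo_bin:"*)
    echo "$cargo_bin is already on PATH"
    ;;
  *)
    PATH="$cargo_bin:$PATH"
    export PATH
    echo "added $cargo_bin to PATH for this shell session"
    ;;
esac
```

To make the change permanent, add the equivalent export line to your shell's startup file.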

Recent versions of OSX may have some goofy OpenSSL issues, which can be resolved by issuing

> brew install openssl
> export OPENSSL_INCLUDE_DIR=`brew --prefix openssl`/include
> export OPENSSL_LIB_DIR=`brew --prefix openssl`/lib

and following the onscreen instructions. Run

> cargo run -- -vv --config examples/configs/quickstart.toml

and you're good to go. Cernan will now be listening for statsd packets on UDP port 8125.

Example Configuration

Now that you have a cernan instance built, let's get cernan accepting statsd traffic and printing the aggregation out to the console. The configuration file we'll be discussing is examples/configs/quickstart.toml in the cernan project. It looks like so:

flush-interval = 10

[sources]
  [sources.statsd.primary]
  port = 8125
  forwards = ["sinks.console"]

[sinks]
  [sinks.console]

This tells cernan to start one statsd listener named "primary" on port 8125. That's this part here:

[sources]
  [sources.statsd.primary]
  port = 8125
  forwards = ["sinks.console"]

You can have more than one source enabled at a time, including multiple sources of the same protocol, but we've only got one for now. Anything the "primary" statsd source receives will be forwarded to the console sink. That's what forwards = ["sinks.console"] means.
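To make the "multiple sources of the same protocol" point concrete, here is a sketch that writes a variant of the quickstart configuration with a second statsd listener. The name "secondary", port 8126 and the output path are hypothetical choices for illustration; the layout simply mirrors the "primary" table and has not been checked against a running cernan.

```shell
# Hypothetical output path; any writable location will do.
config_path="${TMPDIR:-/tmp}/cernan-two-sources.toml"

# Quoted 'EOF' keeps the shell from expanding anything inside the TOML.
cat > "$config_path" <<'EOF'
flush-interval = 10

[sources]
  [sources.statsd.primary]
  port = 8125
  forwards = ["sinks.console"]

  # Assumed second listener, named by analogy with "primary".
  [sources.statsd.secondary]
  port = 8126
  forwards = ["sinks.console"]

[sinks]
  [sinks.console]
EOF

echo "wrote $config_path"
```

Both sources forward to the same console sink, so traffic sent to either port should show up in the same flush output if the configuration is accepted.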

This part enables the console sink:

[sinks]
  [sinks.console]

There are options available for console that you can read about in Configuration. We'll gloss over them for now.

From the root of the cernan project run cernan with this configuration file:

> cargo run -- -vv --config examples/configs/quickstart.toml

Cernan will now be running and listening on UDP:8125. Go ahead and feed it some traffic.

> while true; do echo "foo.bar:$(gshuf -i1-1000 -n1)|g" | nc -c -u localhost 8125; done

If you don't know the statsd protocol, that's okay. What we're doing here is sending a gauge metric with the name "foo.bar" and random values between 1 and 1000 to cernan.

OSX NOTE: gshuf may not be available by default; install it with brew install coreutils. On Linux the same tool is usually named shuf.
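The loop above packs two things into one line: building a statsd line of the form name:value|type and picking a random value. A sketch that separates the two and does not depend on gshuf being installed might look like this. The function names statsd_line and rand_1_1000 are made up for this example, and the /dev/urandom fallback is an assumption about your system.

```shell
# Build one statsd line in the form <name>:<value>|<type>.
statsd_line() {
  printf '%s:%s|%s\n' "$1" "$2" "$3"
}

# Random integer in 1..1000: prefer gshuf or shuf, else fall back to /dev/urandom.
rand_1_1000() {
  if command -v gshuf >/dev/null 2>&1; then
    gshuf -i1-1000 -n1
  elif command -v shuf >/dev/null 2>&1; then
    shuf -i1-1000 -n1
  else
    # Read two random bytes as an unsigned integer and fold into 1..1000.
    echo $(( $(od -An -N2 -tu2 /dev/urandom) % 1000 + 1 ))
  fi
}

statsd_line foo.bar 42 g    # prints foo.bar:42|g
```

To send a random gauge value to cernan, pipe the result into nc with the same flags as the loop above: statsd_line foo.bar "$(rand_1_1000)" g | nc -c -u localhost 8125. Note that nc flags differ between platforms.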

Once the flush interval elapses you ought to see output from cernan that looks kind of like this:

Flushing metrics: 2017-01-11T23:31:32.904575+00:00
  sums:
    cernan.statsd.packet(1484177488): 42
    cernan.statsd.packet(1484177489): 43
    cernan.statsd.packet(1484177490): 42
    cernan.statsd.packet(1484177491): 43
    cernan.statsd.packet(1484177492): 19
  sets:
    foo.bar(1484177488): 83
    foo.bar(1484177489): 660
    foo.bar(1484177490): 4
    foo.bar(1484177491): 725
    foo.bar(1484177492): 508
  summaries:

Our while loop was able to send about 40 packets per second. That's what cernan.statsd.packet is telling us. This wiki covers cernan's data model elsewhere, so the nuances between 'sums', 'sets' and 'summaries' won't be totally apparent here, but hopefully you get the gist of things. If you start emitting random histograms into cernan

> while true; do echo "bing.baz:$(gshuf -i1-1000 -n1)|h" | nc -c -u localhost 8125; done

then you'll see 'summaries' start to fill out.
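The packet rate quoted for the gauge loop can be checked against the sample flush output shown earlier. This sketch averages the per-second cernan.statsd.packet counts with awk; the sample text is copied from that output and the function name packet_rate is made up for this example.

```shell
# Per-second packet counts copied from the sample flush output.
sample='    cernan.statsd.packet(1484177488): 42
    cernan.statsd.packet(1484177489): 43
    cernan.statsd.packet(1484177490): 42
    cernan.statsd.packet(1484177491): 43
    cernan.statsd.packet(1484177492): 19'

# Sum the value after ": " on each packet line and divide by the number of bins.
packet_rate() {
  awk -F': ' '/cernan\.statsd\.packet/ { total += $2; bins++ }
              END { if (bins) printf "%.1f\n", total / bins }'
}

printf '%s\n' "$sample" | packet_rate   # prints 37.8
```

The final one-second bin may be partial, which would pull the average slightly below the 42 or 43 packets seen in the full bins.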

In-flight Manipulation

The cernan filter system allows you to manipulate and create data in flight. Let's get that going now. As above, you'll find the configuration file for this section in the cernan project, at examples/configs/quickstart-filters.toml. Its content is:

scripts-directory = "examples/scripts/"
flush-interval = 10

[sources]
  [sources.statsd.primary]
  port = 8125
  forwards = ["filters.name_replace"]

[filters]
  [filters.name_replace]
  script = "frau_im_mond.lua"
  forwards = ["sinks.console"]

[sinks]
  [sinks.console]

This ought to be mostly familiar. This line

scripts-directory = "examples/scripts/"

is new and tells cernan the root directory in which to search for filter scripts. Also, you'll see that sources.statsd.primary now forwards to filters.name_replace, which is configured with this section:

[filters]
  [filters.name_replace]
  script = "frau_im_mond.lua"
  forwards = ["sinks.console"]
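Since the script named in a filter is looked up under scripts-directory, a quick pre-flight check can catch a mistyped path before cernan starts. This is a sketch only: the function name resolve_filter_script is made up, and it assumes that relative paths are resolved from the directory you launch cernan in.

```shell
# Join scripts-directory and script, then report whether the file exists.
resolve_filter_script() {
  scripts_dir=$1
  script=$2
  # Strip one trailing slash so "dir/" and "dir" produce the same path.
  path="${scripts_dir%/}/$script"
  if [ -f "$path" ]; then
    printf '%s\n' "$path"
  else
    printf 'missing filter script: %s\n' "$path" >&2
    return 1
  fi
}
```

From the root of the cernan project, resolve_filter_script examples/scripts/ frau_im_mond.lua should print examples/scripts/frau_im_mond.lua.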

You'll find the script at examples/scripts/frau_im_mond.lua and its contents are:

function process_metric(pyld)
   payload.set_metric_name(pyld, 1, "frau_im_mond")
end

function process_log(pyld)
end

function tick(pyld)
end

The programmable filter API is described elsewhere in this wiki. Suffice it to say for now that the function process_metric will be called for each metric sent through cernan, and the version in the above script replaces the name of each metric with frau_im_mond. A more complicated script, in use at Postmates, is examples/scripts/collectd_scrub.lua.

That's pretty much it for the basics!

What to read next?

If you want to learn more about all the options cernan has available to control its behaviour, head on over to Configuration. That page acts as a root for information about cernan's sources, sinks and filters.

If you'd like to know more about cernan's data model, read about it here.

This wiki also has a glossary of terms that have special meaning in the cernan project.