Add markup and links for the supersampling doc
ssm committed Jul 19, 2012
1 parent e9b8aac commit a992d9f
Showing 2 changed files with 85 additions and 32 deletions.
108 changes: 76 additions & 32 deletions doc/plugin/supersampling.rst
@@ -1,76 +1,120 @@
.. _plugin-supersampling:

===============
 Supersampling
===============

Every monitoring tool has a polling rate. It is usually 5 minutes,
a sweet spot that allows frequent updates while keeping the overhead
low.

Munin is no different in that respect: its data fetching routines
have to be launched every 5 minutes, otherwise you will face data loss.
This 5 minute period is deeply ingrained in the code, so changing it is
possible, but very tedious and error prone.

But sometimes we need a very fine sampling rate. Sampling every 10
seconds lets us track fast-changing metrics that would otherwise be
averaged out. Changing the whole polling process to cope with a 10
second period is very hard on hardware, since every update would then
have to finish within those 10 seconds.

This triggered an extension of the plugin protocol, commonly known as
"supersampling".


Overview
========

The basic idea is that fine precision should be reserved for selected
plugins. It also cannot be triggered from the master, since the
overhead would be far too big.

So we simply let the plugin sample the values itself, at a rate it
deems adequate. Then, on each polling round, the master fetches all the
samples taken since the last poll.

This enables various constructions, mostly around "streaming" plugins,
that achieve highly detailed sampling with a very small overhead.


Notes
-----

This protocol is currently completely transparent to :ref:`munin-node
<node-index>`, so it can be used even on older (1.x) nodes. Only a
2.0 :ref:`master <master-index>` is required.


Protocol details
================

The protocol itself is derived from the :ref:`spoolfetch` extension.

Config
------

A new plugin directive is used, :ref:`update_rate`. It enables the
master to create the RRD with an adequate step.

Omitting it would lead to the RRD averaging the supersampled values
onto the default 5 min rate. This means **data loss**.
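
As a rough illustration, a plugin's ``config`` output could announce its
sampling interval as sketched below. Only the ``update_rate`` directive
comes from the text above; the graph title, the field name ``load`` and
the helper ``render_config`` are hypothetical.

```python
def render_config(update_rate=10):
    """Return the lines a plugin would print for its `config` command."""
    # The plugin reference lists update_rate as an integer number of seconds.
    if not isinstance(update_rate, int) or update_rate < 1:
        raise ValueError("update_rate must be a positive integer (seconds)")
    return [
        "graph_title Load sampled every %d seconds" % update_rate,  # hypothetical
        "update_rate %d" % update_rate,  # lets the master pick the RRD step
        "load.label load",               # hypothetical field
    ]


if __name__ == "__main__":
    print("\n".join(render_config(10)))
```

Leaving the ``update_rate`` line out of this output is the case the
paragraph above warns about: the samples would be averaged onto the
default 5 min step.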


.. note:: Heartbeat

   The heartbeat always has a size of 2 steps, so failing to send all
   the samples will result in unknown values, as expected.

.. note:: Data size

   The RRD file size is always the same in the default config, as all
   the RRAs are configured proportionally to the :ref:`update_rate`.
   This means that, since you keep the same amount of data as with the
   default, you keep it for a shorter time.
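
The two notes above boil down to simple arithmetic. The sketch below
assumes the default step of 300 seconds mentioned earlier; the function
names and the one-day archive used in the example are illustrative only
and are not taken from Munin's code.

```python
DEFAULT_STEP = 300  # the default 5 min polling period, in seconds


def heartbeat(update_rate):
    """The heartbeat always spans 2 steps."""
    return 2 * update_rate


def scaled_retention(default_retention, update_rate):
    """Time span covered by an archive once it is scaled with update_rate.

    The number of stored rows is unchanged, so the covered time span
    changes by the factor update_rate / DEFAULT_STEP.
    """
    return default_retention * update_rate / DEFAULT_STEP


if __name__ == "__main__":
    print(heartbeat(10))                    # 20 seconds
    print(scaled_retention(24 * 3600, 10))  # 2880.0 seconds, i.e. 48 minutes
```

With a 10 second ``update_rate``, a hypothetical archive that covers one
day at the default step would cover 48 minutes, and a gap of more than
20 seconds between samples would be recorded as unknown.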


Fetch
-----

When spoolfetching, the epoch is also sent in front of the value.
Supersampling is then just a matter of sending multiple epoch/value
lines, with monotonically increasing epochs.

.. note::

   Since the epoch is an integer value for rrdtool_, the smallest
   granularity is 1 second. For the time being, the protocol itself
   also mandates integers. We can easily imagine that, with another
   database as backend, an extension could be hacked together.

.. _rrdtool: http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html
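
A fetch response carrying several samples could be produced along these
lines. This is a sketch, not Munin's own code: it assumes the
spoolfetch-style ``<field>.value <epoch>:<value>`` line format, and the
field name ``load`` and the helper ``render_fetch`` are hypothetical.

```python
def render_fetch(field, samples):
    """Format (epoch, value) samples as fetch lines, oldest first.

    Assumes the spoolfetch-style "<field>.value <epoch>:<value>" format.
    """
    lines = []
    previous = None
    for epoch, value in sorted(samples, key=lambda sample: sample[0]):
        epoch = int(epoch)  # epochs are integers, so 1 s is the finest step
        if epoch == previous:
            continue  # keep epochs strictly increasing: one sample per second
        lines.append("%s.value %d:%s" % (field, epoch, value))
        previous = epoch
    return lines


if __name__ == "__main__":
    samples = [(1342700010, 0.5), (1342700000, 0.4)]
    print("\n".join(render_fetch("load", samples)))
```

Sorting and de-duplicating on the plugin side is a design choice of this
sketch; it simply guarantees the increasing epochs the text asks for.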


Compatibility with 1.4
======================

On older 1.4 masters, only the last sampled value gets into the RRD.


Sample implementation
=====================

The canonical sample implementation is multicpu1sec_, a contrib plugin
on GitHub. It is also a so-called streaming plugin.

.. _multicpu1sec: https://github.com/munin-monitoring/contrib/tree/master/plugins/system/multicpu1sec


Streaming plugins
=================

When called, these plugins fork a background process that streams the
output of a system tool into a spool file. In multicpu1sec_, it is the
mpstat_ tool with a period of 1 second.

.. _mpstat: https://en.wikipedia.org/wiki/Mpstat
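
The spool handling of such a plugin might look like the sketch below.
It is not how multicpu1sec itself is written: the spool line layout
(``epoch value``), the ``.draining`` suffix and both function names are
assumptions made for illustration. The background sampler that would
call ``append_sample`` once per second, for instance while parsing
``mpstat`` output, is left out.

```python
import os
import time


def append_sample(spool_path, value, epoch=None):
    """Called by the background sampler: append one "epoch value" line."""
    if epoch is None:
        epoch = int(time.time())
    with open(spool_path, "a") as spool:
        spool.write("%d %s\n" % (epoch, value))


def drain_spool(spool_path, field):
    """Called on fetch: return all spooled samples, then empty the spool."""
    if not os.path.exists(spool_path):
        return []
    # Move the spool aside first, so that samples appended while we read
    # land in a fresh spool file and are picked up by the next poll.
    draining = spool_path + ".draining"
    os.replace(spool_path, draining)
    lines = []
    with open(draining) as spool:
        for row in spool:
            epoch, value = row.split()
            lines.append("%s.value %s:%s" % (field, epoch, value))
    os.remove(draining)
    return lines
```

Each poll then returns every sample taken since the previous one, which
is the behaviour described in the overview.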


Undersampling
=============

Some plugins are on the opposite side of the spectrum, as they only
need a lower precision.

It makes sense when:

* data should be kept for a *very* long time
* data is *very* expensive to generate and it varies only slowly.
9 changes: 9 additions & 0 deletions doc/reference/plugin.rst
@@ -48,6 +48,15 @@ following fields are used.
|                    |                  |          | which is the practice of taking          |                  |         |
|                    |                  |          | datapoints from other graphs.            |                  |         |
+--------------------+------------------+----------+------------------------------------------+------------------+---------+
| update_rate        | integer          | optional | Sets the update_rate used by the munin   |                  |         |
|                    | (seconds)        |          | master when it creates the RRD file.     |                  |         |
|                    |                  |          |                                          |                  |         |
|                    |                  |          | The update rate is the interval at which |                  |         |
|                    |                  |          | the RRD file expects to have data.       |                  |         |
|                    |                  |          |                                          |                  |         |
|                    |                  |          | This field requires a munin master       |                  |         |
|                    |                  |          | version of at least 2.0.0                |                  |         |
+--------------------+------------------+----------+------------------------------------------+------------------+---------+
| datapoint.label    | lower case       | required | The label used in the graph for this     |                  |         |
|                    | string, no       |          | field                                    |                  |         |
|                    | whitespace       |          |                                          |                  |         |