
doc: multimaster

contrib: munin-mergedb.pl (tool for multimaster)

Short documentation on how to run multiple update masters on shared nfs
while still having a single view of all your nodes.
1 parent 474b077 commit 1264077420dca66f46207d17336af837e0feaca3 @ze42 ze42 committed Aug 26, 2012
Showing with 339 additions and 0 deletions.
  1. +173 −0 contrib/munin-mergedb.pl
  2. +166 −0 doc/example/tips/multimaster.rst
@@ -0,0 +1,173 @@
+#!/usr/bin/perl
+# -*- cperl -*-
+#
+# Merge munin db (datafile{,.storable} / limits) for multi-update masters
+# environment
+#
+# (c) GPL - Adrien "ze" Urban
+
+use warnings;
+use strict;
+
+use Storable;
+use Munin::Master::Utils;
+
+# Example of config (munin-merge.conf):
+# # what to merge ?
+# merge_datafile yes
+# merge_limits yes
+# # destination is the directory having this file
+#
+# # source directories to merge from (option should be used multiple times)
+# merge_source_dbdir /nfs/munin/db/updatehost1
+# merge_source_dbdir /nfs/munin/db/updatehost2
+# merge_source_dbdir /nfs/munin/db/updatehost3
+# merge_source_dbdir /nfs/munin/db/updatehost4
+
+my $configfile_name = 'munin-merge.conf';
+my $config_type = {
+ 'merge_source_dbdir' => 'ARRAY',
+ 'merge_datafile' => 'BOOL',
+ 'merge_limits' => 'BOOL',
+};
+my $config = {
+ 'merge_dbdir' => undef,
+ 'merge_source_dbdir' => [],
+ 'merge_datafile' => 0,
+ 'merge_limits' => 0,
+};
+
+sub usage()
+{
+ print STDERR <<EOF;
+Usage:
+ $0 merge_dbdir
+
+merge_dbdir should include a config file named $configfile_name.
+This is also a safeguard to avoid accidentally breaking everything.
+EOF
+ exit 1;
+}
+
+sub load_config()
+{
+ my $dbdir = $config->{'merge_dbdir'};
+ my $file = $dbdir . "/" . $configfile_name;
+ open FILE, "<", $file or die "open: $!\n";
+ while (<FILE>) {
+ chomp;
+ next if (/^[[:space:]]*#/); # comment
+ next if (/^[[:space:]]*$/); # empty line
+ unless (/^[[:space:]]*([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]*$/) {
+ die "$.: Unrecognized line format\n";
+ }
+ my ($key, $value) = ($1, $2);
+ if (not defined $config_type->{$key}) {
+ die "$.: $key: unrecognized option\n";
+ }
+ if ('ARRAY' eq $config_type->{$key}) {
+ push @{$config->{$key}}, $value;
+ } elsif ('BOOL' eq $config_type->{$key}) {
+ if ($value =~ /^(yes|y|1|true)$/i) { # anchored: "10" or "many" must not match
+ $config->{$key} = 1;
+ } elsif ($value =~ /^(no?|0|false)$/i) {
+ $config->{$key} = 0;
+ } else {
+ die "$.: unrecognized boolean: $value\n";
+ }
+ } else {
+ die "INTERNAL ERROR: $config_type->{$key}: " .
+ "type not implemented\n";
+ }
+ }
+ close FILE;
+}
+sub check_sources()
+{
+ if (0 == scalar(@{$config->{'merge_source_dbdir'}})) {
+ die "No source dbdir. " .
+ "Should I produce a result from thin air?\n";
+ }
+ # no datafile, means it's not really a munin dbdir
+ for my $srcdir (@{$config->{'merge_source_dbdir'}}) {
+ unless (-f "$srcdir/datafile") {
+ die "$srcdir: datafile not found";
+ }
+ }
+}
+sub merge_plaintext($)
+{
+ my $name = shift;
+ my $data = [];
+ my $version = undef;
+ for my $srcdir (@{$config->{'merge_source_dbdir'}}) {
+ my $srcfile = "$srcdir/$name";
+ unless (-f $srcfile) {
+ die "$srcdir: $name not found";
+ }
+ open FILE, "<", $srcfile or
+ die "open: $srcfile: $!\n";
+ my $ver = <FILE>;
+ if (defined $version) {
+ die "$srcfile: versions differ: $version vs $ver\n"
+ if ($ver ne $version);
+ } else {
+ $version = $ver;
+ }
+ push @$data, <FILE>;
+ close FILE;
+ #print $name, " ", scalar(@$data), " ", $srcdir, "\n";
+ }
+ my $dstfile = $config->{'merge_dbdir'} . "/" . $name;
+ my $dsttmp = $dstfile . ".tmp.$$";
+ open FILE, ">", $dsttmp or
+ die "open: $dsttmp: $!\n";
+ print FILE $version, @$data;
+ close FILE;
+ rename $dsttmp, $dstfile or
+ die "mv $dsttmp $dstfile: $!\n";
+}
+sub merge_datafile_storable()
+{
+ my $name = 'datafile.storable';
+
+ my $data = undef;
+ for my $srcdir (@{$config->{'merge_source_dbdir'}}) {
+ my $srcfile = "$srcdir/$name";
+ unless (-f $srcfile) {
+ die "$srcdir: $name not found";
+ }
+ my $info = retrieve($srcfile);
+ if (defined $data) {
+ $data = munin_overwrite($data, $info);
+ } else {
+ $data = $info;
+ }
+ }
+ my $dstfile = $config->{'merge_dbdir'} . "/" . $name;
+ my $dsttmp = $dstfile . ".tmp.$$";
+ Storable::nstore($data, $dsttmp);
+ rename $dsttmp, $dstfile or
+ die "mv $dsttmp $dstfile: $!\n";
+}
+sub merge_datafile()
+{
+ merge_plaintext('datafile');
+ merge_datafile_storable();
+}
+sub merge_limits()
+{
+ merge_plaintext('limits');
+}
+
+usage unless (1 == scalar(@ARGV));
+$config->{'merge_dbdir'} = shift @ARGV;
+
+load_config;
+check_sources;
+merge_datafile if ($config->{'merge_datafile'});
+merge_limits if ($config->{'merge_limits'});
+
+exit 0;
+
+# vim: syntax=perl ts=8
@@ -0,0 +1,166 @@
+.. _example-tips-masteraggregation:
+
+==================================
+ multiple master data aggregation
+==================================
+
+This example describes a way to have multiple masters collecting
+different information, and to show all the data in a single presentation.
+
+When you reach some size (probably several hundred nodes, several
+thousand plugins), 5 minutes is not enough for your single master to
+connect and gather data from all hosts, and you end up having holes in
+your graphs.
+
+Requirements
+============
+
+This example requires a shared nfs space for the munin data between the
+nodes.
+
+Before going down that road, you should make sure to check other options
+first, like changing the number of update threads, and using rrdcached.
+
+Another option you might consider is using munin-async. It requires
+modifications on all nodes, so it might not be an option, but I felt
+compelled to mention it. If you can't easily have shared nfs, or if you
+might have connectivity issues between the master and some nodes, async
+would probably be a better approach.
+
+Because some merging of rrd paths is required, it is highly recommended
+to have **all** nodes in groups.
+
+Overview
+========
+
+Munin-Master runs different scripts via the cron script (munin-cron).
+
+``munin-update``
+ is the only part actually connecting to the nodes. It gathers
+ information and updates the rrd files (you'll probably need rrdcached,
+ especially over nfs).
+
+``munin-limits``
+ checks what was collected, compares it to the limits, and raises
+ warnings and criticals.
+
+``munin-html``
+ takes the information gathered by update and limits, and
+ generates the actual html files (if you don't have cgi-html).
+ It currently still generates some data needed by the cgi.
+
+``munin-graph``
+ generates the graphs. If you are thinking about running many
+ masters, you probably have a lot of graphs, and don't want to
+ generate them every 5 minutes, but would rather use
+ cgi-graph.
+
+The trick to having multiple masters running updates is:
+
+- run ``munin-update`` on different masters (called update-masters
+ hereafter), having ``dbdir`` on nfs
+- run ``munin-limits`` on either each of the update-masters, or the
+ html-master (see next line)
+- run ``munin-html`` on a single master (html-master), after merging
+ some data generated by the update processes
+- have graph (cgi) and html (from file or cgi) served by either
+ html-master, or specific presentation hosts.
+
+Of course, all hosts must have access to the shared nfs directory.
+
+The examples assume the shared folder is /nfs/munin.
+
+Running munin-update
+====================
+
+Change ``munin-cron`` to only run ``munin-update`` (and
+``munin-limits``, if you have alerts you want to be managed directly on
+those masters). The cron should NOT launch munin-html or munin-graph.
+
+Change your ``munin.conf`` to use a dbdir within the shared nfs (e.g.
+``/nfs/munin/db/<hostname>``).
+
+To make it easier to see the configuration, you can also update the
+configuration with an ``includedir`` on nfs, and declare all your nodes
+there (e.g. ``/nfs/munin/etc/<hostname>.d/``).
+
+Once at least one node is configured, ``/nfs/munin/db/<hostname>``
+should start getting populated with subdirectories (groups) and a few
+files, including ``datafile`` and ``datafile.storable`` (and ``limits``
+if you also have munin-limits running there).
+
+Merging data
+============
+
+All our update-masters update their dbdir, which includes:
+
+- ``datafile`` and ``datafile.storable`` which contain information about
+ the collected plugins, and graphs to generate.
+- directory tree with the rrd files
+
+In order for munin-html to run correctly, we need to merge those
+dbdirs into one.
+
+Merging files
+-------------
+
+``datafile`` is just plain text: a leading version line followed by
+lines of ``key value``. Concatenating all the files is enough, as long
+as the version line is identical in every source and kept only once.
+
+``datafile.storable`` is a binary representation of the data as loaded
+by munin. Merging it requires some knowledge of munin's internal
+structures.
+
+If you have ``munin-limits`` also running on update-masters, it generates
+``limits`` files; those are also plain text.
+
+In order to make that part easier, a ``munin-mergedb.pl`` script is
+provided in contrib.
+
+Merging rrd tree
+----------------
+
+The main trick is about rrd. As we are using a shared nfs, we can use
+symlinks to get them to point to one another, and not have to duplicate
+them. (Copies would be hell to keep in sync, which is why we really need
+shared nfs storage.)
+
+As we deal with groups, we could just link top level groups to a common
+rrd tree.
+
+For example, if you have two updaters (update1 and update2), and 4 groups
+(customer1, customer2, customer3, customer4), you could set up something
+like this::
+
+    /nfs/munin/db/shared-rrd/customer1/
+    /nfs/munin/db/shared-rrd/customer2/
+    /nfs/munin/db/shared-rrd/customer3/
+    /nfs/munin/db/shared-rrd/customer4/
+    /nfs/munin/db/update1/customer1 -> ../shared-rrd/customer1
+    /nfs/munin/db/update1/customer2 -> ../shared-rrd/customer2
+    /nfs/munin/db/update1/customer3 -> ../shared-rrd/customer3
+    /nfs/munin/db/update1/customer4 -> ../shared-rrd/customer4
+    /nfs/munin/db/update2/customer1 -> ../shared-rrd/customer1
+    /nfs/munin/db/update2/customer2 -> ../shared-rrd/customer2
+    /nfs/munin/db/update2/customer3 -> ../shared-rrd/customer3
+    /nfs/munin/db/update2/customer4 -> ../shared-rrd/customer4
+    /nfs/munin/db/html/customer1 -> ../shared-rrd/customer1
+    /nfs/munin/db/html/customer2 -> ../shared-rrd/customer2
+    /nfs/munin/db/html/customer3 -> ../shared-rrd/customer3
+    /nfs/munin/db/html/customer4 -> ../shared-rrd/customer4
+
+At some point, an option to separate the rrd tree from the dbdir may be
+added, which should avoid the need for such links.
+
+Running munin-html
+==================
+
+Once you have your update-masters running, and a merge ready to go, you
+should set up a cron job on an html-master to:
+
+- merge data as requested
+- launch ``munin-limits``, if not launched on update-masters and merged
+- launch ``munin-html`` (required, even if you use cgi)
+- launch ``munin-graph`` unless you use cgi-graph
+
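The symlink layout shown in doc/example/tips/multimaster.rst could be created with a small shell function along these lines. This is only a sketch: make_rrd_links is a hypothetical helper that is not part of munin or of this commit, and the directory names (shared-rrd, update1, update2, html) are simply the ones used in the example tree.

```shell
#!/bin/sh
# Sketch only: make_rrd_links is a hypothetical helper, not shipped with munin.
# usage: make_rrd_links DB_ROOT "DBDIR ..." "GROUP ..."
make_rrd_links() {
    db_root=$1
    dbdir_list=$2
    group_list=$3
    for group in $group_list; do
        # Real directory that holds the rrd files of this group.
        mkdir -p "$db_root/shared-rrd/$group" || return 1
        for dbdir in $dbdir_list; do
            mkdir -p "$db_root/$dbdir" || return 1
            link="$db_root/$dbdir/$group"
            # Leave existing entries alone so a rerun never clobbers data.
            if [ -e "$link" ] || [ -L "$link" ]; then
                continue
            fi
            # Relative target, as in the example tree of the document.
            ln -s "../shared-rrd/$group" "$link" || return 1
        done
    done
    return 0
}

# Example call matching the tree in the document (not run here):
# make_rrd_links /nfs/munin/db "update1 update2 html" \
#     "customer1 customer2 customer3 customer4"
```

The relative link targets keep the tree valid even if the nfs share is mounted at a different path on some hosts. Group names containing whitespace are not handled by this sketch.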

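The merge step on the html-master needs a munin-merge.conf in the destination dbdir, in the format shown in the header comment of contrib/munin-mergedb.pl. A minimal sketch of generating that file follows. write_merge_conf is a hypothetical helper, and the paths in the commented example call are assumptions taken from the document's /nfs/munin layout.

```shell
#!/bin/sh
# Sketch only: write_merge_conf is a hypothetical helper, not shipped with munin.
# usage: write_merge_conf MERGE_DBDIR SOURCE_DBDIR...
write_merge_conf() {
    merge_dbdir=$1
    shift
    conf="$merge_dbdir/munin-merge.conf"
    {
        echo "# what to merge ?"
        echo "merge_datafile yes"
        # Set this to "no" if munin-limits only runs on the html-master:
        # the merge script dies when a source dbdir has no limits file.
        echo "merge_limits yes"
        # One merge_source_dbdir line per update-master dbdir.
        for src in "$@"; do
            echo "merge_source_dbdir $src"
        done
    } > "$conf.tmp.$$" || return 1
    # Rename into place so readers never see a half-written file.
    mv "$conf.tmp.$$" "$conf"
}

# Example for the html-master cron, before munin-html (not run here):
# write_merge_conf /nfs/munin/db/html /nfs/munin/db/update1 /nfs/munin/db/update2
# perl contrib/munin-mergedb.pl /nfs/munin/db/html
```

The config parser in munin-mergedb.pl accepts exactly one key and one value per line, so source paths containing whitespace would be rejected.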