Ganglia GMond C Modules

Ng Zhi An edited this page Jul 27, 2014 · 1 revision

Creating Gmond Modules in C

Premise

Creating a Gmond module mostly involves creating a struct of type "mmodule". This struct is made up of three function pointers and another struct containing info about your metric. Because Ganglia relies on the Apache Portable Runtime Library, many of the data structures involved are APR types, and digging more than a couple layers deep will land you squarely in the APR headers, which you may find a bit vexing (I sure did). The module writers were sensitive to this, and have provided a few functions to insulate you from needing to know too much about the underlying APR data structures.

Static vs Dynamic modules

The problem is, to use these functions, you need to statically define the metric metadata up front in a "metric_info" struct. If this struct is declared inside the mmodule, GMOND will do the APR interaction for you. If it isn't declared and defined up front, you'll have to do the APR stuff yourself. So the primary difference between writing modules that I call "Dynamic" and those I call "Static" is how much you'll need to interact with APR. Most modules that you'll probably want to collect will fit nicely into the static category. You'll see what I mean as we continue.

The mmodule Struct

Let's take a look at the mmodule struct from mod_mem.c, which is the module gmond uses to collect memory-related metrics:

mmodule mem_module =
{
    STD_MMODULE_STUFF,
    mem_metric_init,
    mem_metric_cleanup,
    mem_metric_info,
    mem_metric_handler,
};

This struct is usually defined at the very bottom of the source file, but it's where the action begins. It breaks down like so:

  • STD_MMODULE_STUFF: Always the same for every module. If you need to change this then you don't need this wiki page to tell you what you're doing.
  • metric_init: A function you must define (static int metric_init ( apr_pool_t * )). The init function is the first function called by GMOND once it reads in your mmodule definition.
  • metric_cleanup: A function you must define (static void metric_cleanup ( void )). This function is called before your module is unloaded from GMOND. It's usually defined but left empty, but if you need to do something important before exit, then here's your hook.
  • metric_info: An array of structs you must define. I'll break this down in the next section. It contains metadata about every metric you want to measure, one element per metric. GMOND calls the handler function with the index of an element in this array.
  • metric_handler: A function you must define (static g_val_t metric_handler ( int )). GMOND calls this function with the index of the metric it wants collected; the index is the metric's position in the metric_info array.

Strategic Review

So your job is to define an init, cleanup and handler function, and an info array. GMOND will call your init function once, when it loads the module. After that, GMOND calls your handler function with the index of each element in the info array that it wants a value for. Your handler function will return a value of some description. GMOND will pair the value you returned with the corresponding element in the info array, and make all of that available to GMETAD.
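The shape of that contract can be sketched without Ganglia or APR at all. The block below is a stand-alone illustration, not Ganglia code: sketch_value, sketch_metric, sketch_module and host_collect are hypothetical stand-ins for g_val_t, Ganglia_25metric, mmodule and the host's own collection logic, and the numbers are made up. It shows init running once, followed by one handler call per entry in a NULL-terminated info array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types come from gm_metric.h. */
typedef union { float f; unsigned int uint32; } sketch_value;

typedef struct {
    const char *name;    /* NULL marks the end of the array */
    const char *units;
} sketch_metric;

typedef struct {
    int           (*init)(void);
    void          (*cleanup)(void);
    sketch_metric  *metrics_info;
    sketch_value  (*handler)(int metric_index);
} sketch_module;

sketch_metric demo_info[] = {
    {"demo_total", "KB"},
    {"demo_free",  "KB"},
    {NULL, NULL}
};

int demo_ready = 0;

int  demo_init(void)    { demo_ready = 1; return 0; }
void demo_cleanup(void) { demo_ready = 0; }

/* The index selects which metric to report, in info-array order. */
sketch_value demo_handler(int metric_index)
{
    sketch_value val;
    switch (metric_index) {
    case 0:  val.f = 2048.0f; break;   /* made-up numbers */
    case 1:  val.f = 512.0f;  break;
    default: val.f = 0.0f;    break;
    }
    return val;
}

sketch_module demo_module = {
    demo_init, demo_cleanup, demo_info, demo_handler
};

/* Plays the role of the host: init once, then one handler call per
   metric. Stores up to max values in out and returns how many were
   stored, or -1 if init fails. */
int host_collect(sketch_module *m, float *out, int max)
{
    int i;

    assert(m != NULL && out != NULL);
    if (m->init() != 0)
        return -1;
    for (i = 0; i < max && m->metrics_info[i].name != NULL; i++)
        out[i] = m->handler(i).f;
    m->cleanup();
    return i;
}
```

Calling host_collect(&demo_module, out, 4) with a four-element float array stores the two made-up values and returns 2. The real module interface differs in detail (init takes an apr_pool_t *, for one), so treat this only as a map of who owns which piece.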

The metric_info Struct

Look at a few lines of the metric_info struct from mod_mem:

static Ganglia_25metric mem_metric_info[] =
{
    {0, "mem_total",  1200, GANGLIA_VALUE_FLOAT, "KB", "zero", "%.0f", UDP_HEADER_SIZE+8, "Total amount of memory displayed in KBs"},
    {0, "mem_free",    180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of available memory"},
    {0, "mem_shared",  180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of shared memory"},
    <snip>

This should give you a rough idea of what type of metadata is tracked for each metric. Ganglia_25metric is defined in lib/gm_protocol.h in the source tarball. The fields from left to right are:

  • int key: I'm not sure what this is for, but setting it to zero seems safe (a comment in the mod_count_procs source further down says gmond assigns it automatically).
  • char *name: the name of the metric, for the RRD
  • int tmax: the maximum time in seconds between metric collection calls
  • Ganglia_value_types type: the data type of the metric value. It covers strings, unsigned ints, floats and doubles, and is written as one of the GANGLIA_VALUE_* constants, like GANGLIA_VALUE_FLOAT above.
  • char *units: the unit of your metric, for the RRD
  • char *slope: one of zero, positive, negative, or both
  • char *fmt: a printf-style format string for your metric, which MUST correspond to the type field
  • int msg_size: UDP_HEADER_SIZE+8 is a sane default
  • char *desc: a text description of the metric
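Two points in that list are easy to get wrong: the array ends with an entry whose name is NULL, and fmt has to agree with type. The following is a minimal, stand-alone sketch of both, assuming a cut-down hypothetical struct (sketch_metric) in place of Ganglia_25metric and a two-member union in place of g_val_t. The proc_count row is invented for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the Ganglia types. */
typedef enum { SKETCH_FLOAT, SKETCH_UINT } sketch_type;

typedef struct {
    const char *name;    /* NULL terminates the array */
    int         tmax;
    sketch_type type;
    const char *units;
    const char *fmt;     /* "%.0f" for SKETCH_FLOAT, "%u" for SKETCH_UINT */
} sketch_metric;

typedef union { float f; unsigned int uint32; } sketch_value;

sketch_metric sketch_info[] = {
    {"mem_total",  1200, SKETCH_FLOAT, "KB",    "%.0f"},
    {"mem_free",    180, SKETCH_FLOAT, "KB",    "%.0f"},
    {"proc_count",  512, SKETCH_UINT,  "count", "%u"},
    {NULL, 0, SKETCH_FLOAT, NULL, NULL}
};

/* Count entries by walking to the NULL-named terminator. */
int sketch_count(const sketch_metric *info)
{
    int n = 0;

    assert(info != NULL);
    while (info[n].name != NULL)
        n++;
    return n;
}

/* Render a value with the metric's own format string, reading the
   union member that the type field names. Returns snprintf's result. */
int sketch_format(const sketch_metric *m, sketch_value v,
                  char *buf, size_t len)
{
    assert(m != NULL && buf != NULL);
    switch (m->type) {
    case SKETCH_FLOAT: return snprintf(buf, len, m->fmt, (double)v.f);
    case SKETCH_UINT:  return snprintf(buf, len, m->fmt, v.uint32);
    }
    return -1;
}
```

If fmt and type disagree (say "%u" on a float metric), the wrong union member is read and the printed value is garbage, which is why the list above insists they match.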

metric_init

Let's take a look at the init function from mod_mem:

static int mem_metric_init ( apr_pool_t *p )
{
    int i;

    libmetrics_init();

    for (i = 0; mem_module.metrics_info[i].name != NULL; i++) {
        MMETRIC_INIT_METADATA(&(mem_module.metrics_info[i]),p);
        MMETRIC_ADD_METADATA(&(mem_module.metrics_info[i]),MGROUP,"memory");
    }

    return 0;
}

Simple enough: a for loop iterates across each element of the metrics_info array, calling MMETRIC_INIT_METADATA and MMETRIC_ADD_METADATA on each one. These are the APR-related helpers I referred to earlier. Since we've already defined our metric_info array, all we need to do is call MMETRIC_INIT_METADATA to create some APR storage for the metric's extra metadata, and then MMETRIC_ADD_METADATA to store a key/value pair in it (here, the group name "memory" under the MGROUP key). Neither call runs the handler; gmond calls the handler itself later, and pairs the result with the metadata we've provided in metric_info. Once it's all neatly packaged up, gmond will present it to gmetad whenever it's asked to do so.

metric_handler

Let's look at the handler function for mod_mem:

static g_val_t mem_metric_handler ( int metric_index )
{
    g_val_t val;

    /* The metric_index corresponds to the order in which
       the metrics appear in the metric_info array
    */
    switch (metric_index) {
    case 0:
        return mem_total_func();
    case 1:
        return mem_free_func();
    case 2:
        return mem_shared_func();
    /* <snip> ... remaining cases ... </snip> */
    }

    /* default case */
    val.f = 0;
    return val;
}

So we see here that the handler function gets passed the index number of the element in mem_metric_info whose value is wanted. Different modules use this in different ways. mod_mem is using the index number directly in a switch-case to call other (probably OS-dependent) functions to gather the actual metrics. If you look at the source for most of these functions on Linux, you'll find that most of them are just reading numbers directly out of /proc.
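To give a feel for what "reading numbers out of /proc" amounts to, here is a small stand-alone sketch. It is not the libmetrics implementation: sketch_meminfo_kb is a hypothetical helper that assumes the "Key:   value kB" line layout used by /proc/meminfo on Linux, and it works on a string already in memory so it can be tried anywhere.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libmetrics: find the line
   "<key>: <number> kB" in meminfo-style text and store the number.
   Returns 0 on success, -1 if the key is not present. */
int sketch_meminfo_kb(const char *text, const char *key, float *out)
{
    size_t klen;
    const char *line;

    assert(text != NULL && key != NULL && out != NULL);
    klen = strlen(key);

    for (line = text; line != NULL && *line != '\0'; ) {
        const char *next = strchr(line, '\n');
        unsigned long kb;

        /* The key must be followed directly by ':' so that "MemFree"
           does not match a longer key with the same prefix. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
            sscanf(line + klen + 1, "%lu", &kb) == 1) {
            *out = (float)kb;
            return 0;
        }
        line = next ? next + 1 : NULL;
    }
    return -1;
}
```

A handler-side function in this style would read the file into a buffer, call the parser, and copy the result into the f member of the value it returns.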

Full Source For mod_mem

Since I've been using it as an example so far, here's the full source for mod_mem (circa ganglia-3.1.7) for reference:

#include <gm_metric.h>
#include <libmetrics.h>

mmodule mem_module;


static int mem_metric_init ( apr_pool_t *p )
{
    int i;

    libmetrics_init();

    for (i = 0; mem_module.metrics_info[i].name != NULL; i++) {
        /* Initialize the metadata storage for each of the metrics and then
         *  store one or more key/value pairs.  The define MGROUPS defines
         *  the key for the grouping attribute. */
        MMETRIC_INIT_METADATA(&(mem_module.metrics_info[i]),p);
        MMETRIC_ADD_METADATA(&(mem_module.metrics_info[i]),MGROUP,"memory");
    }

    return 0;
}

static void mem_metric_cleanup ( void )
{
}

static g_val_t mem_metric_handler ( int metric_index )
{
    g_val_t val;

    /* The metric_index corresponds to the order in which
       the metrics appear in the metric_info array
    */
    switch (metric_index) {
    case 0:
        return mem_total_func();
    case 1:
        return mem_free_func();
    case 2:
        return mem_shared_func();
    case 3:
        return mem_buffers_func();
    case 4:
        return mem_cached_func();
    case 5:
        return swap_free_func();
    case 6:
        return swap_total_func();
#if HPUX
    case 7:
        return mem_arm_func();
    case 8:
        return mem_rm_func();
    case 9:
        return mem_avm_func();
    case 10:
        return mem_vm_func();
#endif
    }

    /* default case */
    val.f = 0;
    return val;
}

static Ganglia_25metric mem_metric_info[] = 
{
    {0, "mem_total",  1200, GANGLIA_VALUE_FLOAT, "KB", "zero", "%.0f", UDP_HEADER_SIZE+8, "Total amount of memory displayed in KBs"},
    {0, "mem_free",    180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of available memory"},
    {0, "mem_shared",  180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of shared memory"},
    {0, "mem_buffers", 180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of buffered memory"},
    {0, "mem_cached",  180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of cached memory"},
    {0, "swap_free",   180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "Amount of available swap memory"},
    {0, "swap_total", 1200, GANGLIA_VALUE_FLOAT, "KB", "zero", "%.0f", UDP_HEADER_SIZE+8, "Total amount of swap space displayed in KBs"},
#if HPUX
    {0, "mem_arm",     180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "mem_arm"},
    {0, "mem_rm",      180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "mem_rm"},
    {0, "mem_avm",     180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "mem_avm"},
    {0, "mem_vm",      180, GANGLIA_VALUE_FLOAT, "KB", "both", "%.0f", UDP_HEADER_SIZE+8, "mem_vm"},
#endif
    {0, NULL}

};

mmodule mem_module =
{
    STD_MMODULE_STUFF,
    mem_metric_init,
    mem_metric_cleanup,
    mem_metric_info,
    mem_metric_handler,
};

Dynamic Modules

Sometimes you can't predict what the metric_info struct is going to look like. Consider, for example, a process counter module that takes a space-separated list of process names (like "bash httpd gmond"), and then for each process name, counts the number of active instances of that process. In order to do that, you're going to need to take input from the user (via configuration in the gmond.conf file) and dynamically create the metric_info struct at run-time based on user input. I wrote that module. Here's how it works:

Declare metric_info NULL in the mmodule

Here's the mmodule struct for my mod_count_procs module:

mmodule cp_module =
{
    STD_MMODULE_STUFF,
    cp_metric_init,
    cp_metric_cleanup,
    NULL, /* Dynamically defined in cp_metric_init() */
    cp_metric_handler,
};

Globally Declare metric_info Manually

When you name a statically defined metric_info array in the mmodule struct, gmond already has the whole array. If you leave that slot NULL, then you have to build the array yourself at run-time. An APR dynamic array works well for that, declared like so:

static apr_array_header_t *metric_info = NULL;

That needs to be a global declaration. Did I say that needs to be a global declaration? That needs to be global.

Changes to init

The init function becomes more important now. Basically we need to read in the user-input, dynamically create the array, and then iterate across it with MMETRIC_INIT_METADATA and MMETRIC_ADD_METADATA, all inside of init. First, we need a few variables:

Ganglia_25metric *gmi;
metric_info = apr_array_make(p, 1, sizeof(Ganglia_25metric));

We need a Ganglia_25metric pointer inside the scope of the init function, so we can modify the contents of individual array elements inside metric_info, and metric_info itself needs to be initialized as a dynamic APR array.

Now we're ready to read in the user-input from gmond.conf:

    if (list_params) {
        params = (mmparam*) list_params->elts;
        for(i=0; i< list_params->nelts; i++) {
            if (!strcasecmp(params[i].name, "ProcessNames")) {
                processNames = params[i].value;
            }
        }
    }
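The same lookup can be written as a small function with a fallback, which avoids carrying a NULL string forward when the parameter is missing from gmond.conf. This is a stand-alone sketch: sketch_param is a hypothetical stand-in for mmparam (only the name and value fields used above), and sketch_param_get is not a Ganglia function.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical stand-in for mmparam: one name/value pair. */
typedef struct {
    const char *name;
    const char *value;
} sketch_param;

/* Return the value of the named parameter, or fallback if it is
   absent. Names compare case-insensitively, and the last match wins,
   as in the loop above. */
const char *sketch_param_get(const sketch_param *params, int nelts,
                             const char *name, const char *fallback)
{
    const char *value = fallback;
    int i;

    assert(name != NULL);
    for (i = 0; params != NULL && i < nelts; i++) {
        if (strcasecmp(params[i].name, name) == 0)
            value = params[i].value;
    }
    return value;
}
```

With a fallback of "" the tokenizing step that follows simply finds no process names, rather than dereferencing a NULL pointer.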

For reference, here's what the actual configuration in gmond.conf looks like:

  module {
    name = "cp_module"
    path = "modcprocs.so"
    Param ProcessNames {
      Value = "httpd bash"
    }
  }

So we pass a space separated list of values in gmond.conf, and then iterate across list_params in our module code to get the list out. Then I use strtok() to parse out the individual process names, but you could use whatever suits you. Once we have the list, we can populate gmi with the info. When we want to add a row to metric_info, we set gmi to the next available slot in the metric_info array by calling apr_array_push like this:

gmi = apr_array_push(metric_info);

and then we just operate directly on gmi like so:

    for(i=0; processName != NULL; i++) {
       gmi = apr_array_push(metric_info);

       gmi->name = apr_pstrdup (p, processName);
       gmi->tmax = 512;
       gmi->type = GANGLIA_VALUE_UNSIGNED_INT;
       gmi->units = apr_pstrdup(p, "count");
       gmi->slope = apr_pstrdup(p, "both");
       gmi->fmt = apr_pstrdup(p, "%u");
       gmi->msg_size = UDP_HEADER_SIZE+8;
       gmi->desc = apr_pstrdup(p, "process count");

Now that the element has been dynamically created, it can have MMETRIC_INIT_METADATA and MMETRIC_ADD_METADATA called against it, just like its static brethren:

       MMETRIC_INIT_METADATA(gmi,p);
       MMETRIC_ADD_METADATA(gmi,MGROUP,"cprocs");

With the metadata attached, gmond can call our handler function for each of these metrics and take care of data collection et al, just like with a static module. Now, to wrap up init, the array needs an empty terminator (the same {0, NULL} entry that ends a static array, which is what loops like the one in mem_metric_init look for), so outside of the loop we call apr_array_push one last time and zero the new element. Then we manually set our metric_info array to be the array used by the mmodule struct down at the bottom of the file (the one we'd initially declared as NULL) with this line:

cp_module.metrics_info = (Ganglia_25metric *)metric_info->elts;
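The tokenize, push, terminate sequence can be tried outside gmond with plain libc. The sketch below is an analogy rather than APR code: malloc and realloc stand in for apr_array_make and apr_array_push, sketch_metric is a hypothetical cut-down Ganglia_25metric, and the caller has to free the result because there is no pool. As a small variation on the steps above, it keeps the array terminated after every append instead of pushing the terminator once at the end, which keeps the error paths simple.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down stand-in for Ganglia_25metric. */
typedef struct {
    char       *name;    /* NULL marks the terminator */
    const char *units;
} sketch_metric;

/* Free an array built by sketch_build(), including the names. */
void sketch_free(sketch_metric *arr)
{
    int i;

    if (arr == NULL)
        return;
    for (i = 0; arr[i].name != NULL; i++)
        free(arr[i].name);
    free(arr);
}

/* Build a NULL-terminated metric array from a space-separated list.
   Returns NULL on allocation failure; *count gets the entry count. */
sketch_metric *sketch_build(const char *list, int *count)
{
    char *copy, *tok;
    sketch_metric *arr, *grown;
    int n = 0;

    assert(list != NULL && count != NULL);

    /* strtok() writes into its input, so tokenize a private copy. */
    copy = malloc(strlen(list) + 1);
    arr = malloc(sizeof(*arr));
    if (copy == NULL || arr == NULL) {
        free(copy);
        free(arr);
        return NULL;
    }
    strcpy(copy, list);
    arr[0].name = NULL;              /* starts as just a terminator */
    arr[0].units = NULL;

    for (tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " ")) {
        char *name = malloc(strlen(tok) + 1);

        /* Room for the new entry plus the terminator after it. */
        grown = name ? realloc(arr, (size_t)(n + 2) * sizeof(*arr)) : NULL;
        if (grown == NULL) {
            free(name);
            free(copy);
            sketch_free(arr);        /* still terminated at index n */
            return NULL;
        }
        arr = grown;
        strcpy(name, tok);
        arr[n].name = name;
        arr[n].units = "count";
        arr[n + 1].name = NULL;      /* keep the array terminated */
        arr[n + 1].units = NULL;
        n++;
    }
    free(copy);
    *count = n;
    return arr;
}
```

Inside a real module the pool takes care of all the freeing, so the cleanup code here has no counterpart; only the growth and termination logic carries over.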

Handler

The handler function is no different in a dynamic module. By the time handler gets called, metric_info has been created and everything is happy.
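What does change is how the index gets used: with no fixed list of metrics there is nothing to switch on, so the handler looks up the metric's name by index and measures whatever that name refers to. Here is a stand-alone sketch of that lookup, with a hard-coded table standing in for the process list. All names, the table contents and the 16-byte command length are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CMD_LEN 16    /* hypothetical fixed-size command buffer */

typedef struct { const char *name; } sketch_metric;        /* stand-in */
typedef union  { unsigned int uint32; float f; } sketch_value;

/* Stand-in for the process table a real module would walk. */
const char sketch_procs[][SKETCH_CMD_LEN] = {
    "bash", "httpd", "httpd", "gmond", "httpd"
};

/* Stand-in for the array built at init time. */
sketch_metric sketch_info[] = { {"httpd"}, {"bash"}, {"sshd"}, {NULL} };

/* Resolve the index to a metric name, then count matching commands. */
sketch_value sketch_handler(int metric_index)
{
    const int nmetrics =
        (int)(sizeof(sketch_info) / sizeof(sketch_info[0])) - 1;
    const size_t nprocs = sizeof(sketch_procs) / sizeof(sketch_procs[0]);
    const char *wanted;
    sketch_value val;
    size_t i;

    assert(metric_index >= 0 && metric_index < nmetrics);
    wanted = sketch_info[metric_index].name;

    val.uint32 = 0;
    for (i = 0; i < nprocs; i++) {
        if (strncmp(sketch_procs[i], wanted, SKETCH_CMD_LEN) == 0)
            val.uint32++;
    }
    return val;
}
```

The count goes into the unsigned member of the union because the metric is declared as an unsigned int with a "%u" format.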

Full Source for mod_count_procs

For reference, here's the full source to a process counter module, which dynamically creates its metric_info.

 /*******************************************************************************
* This software is public domain. No rights reserved. Do whatever you want 
* (and leave me alone)
*
* Author: Dave Josephsen (dave at skeptech.org)
******************************************************************************/

#include <gm_metric.h>
#include <ganglia_priv.h>
#include <apr_strings.h>

#include <stdlib.h>
#include <stdio.h>
#include <strings.h>
#include <string.h>
#include <time.h>
#include <readproc.h> // the readproc library from the linux procps project. most systems don't have this installed by default

extern mmodule cp_module;
static apr_array_header_t *metric_info = NULL;
char *processNames;

/* Count the running processes whose command name matches proc_name
   (like 'httpd' for example). */
g_val_t count_procs(char *proc_name) {
    unsigned int numprocs = 0;
    PROCTAB *proct;
    proc_t *proc_info;
    g_val_t returnThis;

    proct = openproc(PROC_FILLARG | PROC_FILLSTAT | PROC_FILLSTATUS);

    while ((proc_info = readproc(proct,NULL))) {
        if(!strncmp(proc_info->cmd,proc_name,sizeof(proc_info->cmd))){
            numprocs++;
        }
    }
    closeproc(proct);

    /* The metric is declared GANGLIA_VALUE_UNSIGNED_INT with a "%u"
       format, so fill the unsigned member of the union. */
    returnThis.uint32 = numprocs;
    return returnThis;
}

static int cp_metric_init ( apr_pool_t *p )
{
    apr_array_header_t *list_params = cp_module.module_params_list;
    mmparam *params;
    Ganglia_25metric *gmi;
    char *processNamesCp;
    char *processName;
    int i;

    /* Init the metric_info array */
    metric_info = apr_array_make(p, 1, sizeof(Ganglia_25metric));

    /* Read the parameters from the gmond.conf file. */
    if (list_params) {
        params = (mmparam*) list_params->elts;
        for(i=0; i< list_params->nelts; i++) {
            if (!strcasecmp(params[i].name, "ProcessNames")) {
                processNames = params[i].value;
                debug_msg("\tProcessNames:: %s",processNames);
            }
        }
    }

    /* Tokenize a pool-allocated copy so strtok() does not clobber the
       original ProcessNames string. (Sizing a stack buffer with
       sizeof(processNames) would only give the size of the pointer.)
       If the parameter is missing, the copy is empty and no metrics
       are defined. */
    processNamesCp = apr_pstrdup(p, processNames ? processNames : "");
    processName = strtok(processNamesCp, " ");

    for(i=0; processName != NULL; i++) {

        gmi = apr_array_push(metric_info);

        /* gmi->key will be automatically assigned by gmond */
        gmi->name = apr_pstrdup (p, processName);
        gmi->tmax = 512;
        gmi->type = GANGLIA_VALUE_UNSIGNED_INT;
        gmi->units = apr_pstrdup(p, "count");
        gmi->slope = apr_pstrdup(p, "both");
        gmi->fmt = apr_pstrdup(p, "%u");
        gmi->msg_size = UDP_HEADER_SIZE+8;
        gmi->desc = apr_pstrdup(p, "process count");

        MMETRIC_INIT_METADATA(gmi,p);
        MMETRIC_ADD_METADATA(gmi,MGROUP,"cprocs");

        processName = strtok(NULL, " ");
    }

    /* Add a terminator to the array and replace the empty static metric
       definition array with the dynamic array that we just created
    */
    gmi = apr_array_push(metric_info);
    debug_msg("\tarray push done");
    memset (gmi, 0, sizeof(*gmi));

    cp_module.metrics_info = (Ganglia_25metric *)metric_info->elts;

    return 0;
}

static void cp_metric_cleanup ( void )
{
}

static g_val_t cp_metric_handler ( int metric_index )
{
    Ganglia_25metric *gmi = &(cp_module.metrics_info[metric_index]);
    g_val_t val = count_procs(gmi->name);

    return val;
}

mmodule cp_module =
{
    STD_MMODULE_STUFF,
    cp_metric_init,
    cp_metric_cleanup,
    NULL, /* Dynamically defined in cp_metric_init() */
    cp_metric_handler,
};