
support standard deviation in reduce functions #808

Closed
ktsaou opened this issue Aug 21, 2016 · 11 comments
Labels
feature request New features

Comments

@ktsaou
Member

ktsaou commented Aug 21, 2016

https://en.wikipedia.org/wiki/Standard_deviation

This would be useful in health monitoring alarms, to get more robust values than a plain average when spikes need to be eliminated.
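
For example, a hypothetical health.d entry, assuming the lookup line gains a stddev method (chart, dimension, period and threshold below are made up):

 alarm: cpu_user_stddev
    on: system.cpu
lookup: stddev -5m unaligned of user
 every: 10s
  warn: $this > 20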

@paulfantom
Contributor

paulfantom commented Aug 25, 2016

Apart from standard deviation, I would be overjoyed if we could somehow include Holt-Winters anomaly detection. This would be the first monitoring stack I know of that includes prediction based on historical trends by default (I know it is doable in Bosun or Riemann, but not as a default function).

@ktsaou
Member Author

ktsaou commented Aug 26, 2016

@paulfantom this algorithm seems pretty simple, but I am not sure how to use it. Have you used it in the past?

@paulfantom
Contributor

Unfortunately no, but I am learning how to.

@ktsaou
Member Author

ktsaou commented Nov 5, 2016

I did some research on the Holt-Winters function. I also found a few issues (pierre/holt-winters#1) in the repo you mentioned, which I fixed with a PR (pierre/holt-winters#2).

So, holt-winters is like double exponential smoothing, but with seasonality (i.e. triple exponential smoothing). Since we don't know anything about the seasonality of our data, holt-winters won't help us further.

I collected / developed several statistical functions (including double exponential smoothing = holt-winters without seasonality).

Here they are:

#include <math.h>   // isnan, isinf, powl, sqrtl, NAN
#include <stdio.h>  // fprintf

// arithmetic mean of the series, ignoring NaN/Inf values
long double average(long double *series, size_t entries) {
    size_t i, count = 0;
    long double sum = 0;

    for(i = 0; i < entries ; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;
        count++;

        sum += value;
    }

    if(count == 0) return NAN;
    long double avg = sum / (long double)count;

    fprintf(stderr, "average % 12.7Lf\n", avg);
    return avg;
}

// simple moving average over a sliding window of 'period' samples,
// ignoring NaN/Inf values; returns the average of the last full window
long double moving_average(long double *series, size_t entries, size_t period) {
    if(period == 0) return NAN;

    size_t i, count = 0;
    long double sum = 0, avg = 0;
    long double p[period]; // circular buffer holding the last 'period' values

    for(i = 0; i < entries; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;

        if(count < period) {
            // window not full yet: accumulate until 'period' values are in
            sum += value;
            avg = (count == period - 1) ? sum / (long double)period : 0;
        }
        else {
            // slide the window: drop the oldest value, add the new one
            sum = sum - p[count % period] + value;
            avg = sum / (long double)period;
        }

        p[count % period] = value;

        count++;
        fprintf(stderr, " > i = % 3zu, current = % 12.7Lf, movavg = % 12.7Lf\n", i, value, avg);
    }

    fprintf(stderr, "moving average (period = %zu) % 12.7Lf\n", period, avg);
    return avg;
}

// population standard deviation in two passes, ignoring NaN/Inf values
long double standard_deviation(long double *series, size_t entries) {
    size_t i, count = 0;
    long double sum = 0;

    // first pass: the average
    for(i = 0; i < entries ; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;
        count++;

        sum += value;
    }
    if(count == 0) return NAN;
    long double average = sum / (long double)count;

    // second pass: the variance, as the mean squared deviation from the average
    for(i = 0, count = 0, sum = 0; i < entries ; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;
        count++;

        sum += powl(value - average, 2); // powl, to stay in long double
    }
    long double variance = sum / (long double)count;

    long double stddev = sqrtl(variance); // sqrtl, to stay in long double
    fprintf(stderr, "standard deviation % 12.7Lf\n", stddev);
    return stddev;
}
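
As an aside, the two passes above can be fused into a single pass with Welford's method, which also behaves better numerically on large values - a sketch:

// one-pass population standard deviation (Welford's method),
// ignoring NaN/Inf values like the functions above
long double standard_deviation_welford(long double *series, size_t entries) {
    size_t count = 0;
    long double mean = 0, m2 = 0;

    for(size_t i = 0; i < entries; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;
        count++;

        long double delta = value - mean;
        mean += delta / (long double)count;
        m2 += delta * (value - mean); // running sum of squared deviations
    }

    if(count == 0) return NAN;
    return sqrtl(m2 / (long double)count); // population stddev, same as above
}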

// exponentially weighted average: the weight of each value decays
// geometrically with age; alpha close to 1.0 favors recent values
long double single_exponential_smoothing(long double *series, size_t entries, long double alpha) {
    size_t i, count = 0;
    long double level = 0, sum = 0;

    if(entries == 0) return NAN;

    if(isnan(alpha))
        alpha = 2.0 / (long double)entries; // avoids the integer division of 1.0 / (entries / 2)

    for(i = 0; i < entries ; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;
        count++;

        sum += value;

        long double last_level = level;
        level = alpha * value + (1.0 - alpha) * last_level;
        fprintf(stderr, " > i = % 3zu, current = % 12.7Lf, 1expavg = % 12.7Lf, avg = % 12.7Lf\n", i, value, level, sum/count);
    }

    fprintf(stderr, "single exponential average (alpha = % 12.7Lf) % 12.7Lf\n", alpha, level);
    return level;
}

// double exponential smoothing (holt-winters without seasonality): tracks a
// smoothed level and a smoothed trend, and forecasts their sum
// http://grisha.org/blog/2016/02/16/triple-exponential-smoothing-forecasting-part-ii/
long double double_exponential_smoothing(long double *series, size_t entries, long double alpha, long double beta) {
    if(entries == 0) return NAN;

    size_t i, count = 0;
    long double level = series[0], trend, sum, forecast = series[0];

    if(isnan(alpha))
        alpha = 0.01;

    if(isnan(beta))
        beta = 0.9;

    // initial trend: the slope between the first two values
    if(entries > 1)
        trend = series[1] - series[0];
    else
        trend = 0;

    sum = series[0];

    for(i = 1; i < entries ; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;
        count++;

        sum += value;

        long double last_level = level;

        level = alpha * value + (1.0 - alpha) * (level + trend);
        trend = beta * (level - last_level) + (1.0 - beta) * trend;
        forecast = level + trend;

        fprintf(stderr, " > i = % 3zu, current = % 12.7Lf, forecast = (% 12.7Lf + % 12.7Lf) = % 12.7Lf, avg = % 12.7Lf\n", i, value, level, trend, forecast, sum/(long double)(count+1));
    }

    fprintf(stderr, "double exponential average (alpha = % 12.7Lf, beta = % 12.7Lf) % 12.7Lf, forecast % 12.7Lf\n", alpha, beta, level, forecast);
    return forecast;
}

In single exponential smoothing, double exponential smoothing and holt-winters, alpha is the importance of the recent values: 0.0 = not important at all (i.e. no exponential smoothing), to 1.0 = only the most recent value is important. It is used to favor recent values over older ones, and it is a decimal number between 0.0 and 1.0.

In double exponential smoothing and holt-winters, beta is the importance of the trend: 0.0 = not important at all (i.e. no double exponential smoothing), to 1.0 = the trend is most important. It is also a decimal number between 0.0 and 1.0.
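
For completeness, a sketch of the third step - triple exponential smoothing (holt-winters with additive seasonality), in the same style as the functions above. The season length slen must be known in advance, which is exactly the open problem here; gamma is the importance of the seasonal component, again between 0.0 and 1.0:

// triple exponential smoothing (holt-winters, additive seasonality) - a sketch
long double triple_exponential_smoothing(long double *series, size_t entries,
        size_t slen, long double alpha, long double beta, long double gamma) {
    if(slen == 0 || entries < 2 * slen) return NAN; // need at least 2 full seasons

    size_t i;
    long double seasonal[slen];

    // initial trend: the average slope between the first two seasons
    long double trend = 0;
    for(i = 0; i < slen; i++)
        trend += (series[i + slen] - series[i]) / (long double)slen;
    trend /= (long double)slen;

    // initial seasonal components: deviation from the first season's average
    // (a simplification - the grisha.org article averages over all seasons)
    long double season_avg = 0;
    for(i = 0; i < slen; i++) season_avg += series[i];
    season_avg /= (long double)slen;
    for(i = 0; i < slen; i++) seasonal[i] = series[i] - season_avg;

    long double level = series[0], forecast = series[0];

    for(i = 1; i < entries; i++) {
        long double value = series[i];
        if(isnan(value) || isinf(value)) continue;

        long double last_level = level;
        long double s = seasonal[i % slen];

        level = alpha * (value - s) + (1.0 - alpha) * (level + trend);
        trend = beta * (level - last_level) + (1.0 - beta) * trend;
        seasonal[i % slen] = gamma * (value - level) + (1.0 - gamma) * s;

        forecast = level + trend + seasonal[(i + 1) % slen];
    }

    return forecast;
}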

@ktsaou
Member Author

ktsaou commented Nov 6, 2016

I also found FANN, a neural network library.

I am still learning this stuff, so I opened an issue there to find out if we can use neural networks in netdata for alarms: libfann/fann#83

@ktsaou
Member Author

ktsaou commented Nov 6, 2016

hm... I also found this: https://github.com/rubygarage/holtwinters/blob/master/index.js
At first it seemed that it could detect the seasonality for holt-winters (although by brute force).

On closer look, it only detects proper values for alpha and beta - it does not detect seasonality.
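
That brute force is simple enough to replicate here. A sketch of fitting alpha and beta by minimizing the one-step-ahead squared forecast error (the 0.01 grid step is arbitrary, and it assumes entries >= 2 with no NaN gaps):

// grid-search the (alpha, beta) pair that minimizes the one-step-ahead
// squared error of double exponential smoothing - the same idea as the
// javascript above
void fit_alpha_beta(long double *series, size_t entries,
                    long double *best_alpha, long double *best_beta) {
    long double best_sse = INFINITY;

    for(long double a = 0.01; a < 1.0; a += 0.01) {
        for(long double b = 0.01; b < 1.0; b += 0.01) {
            long double level = series[0], trend = series[1] - series[0], sse = 0;

            for(size_t i = 1; i < entries; i++) {
                long double err = series[i] - (level + trend); // forecast error
                sse += err * err;

                long double last_level = level;
                level = a * series[i] + (1.0 - a) * (level + trend);
                trend = b * (level - last_level) + (1.0 - b) * trend;
            }

            if(sse < best_sse) {
                best_sse = sse;
                *best_alpha = a;
                *best_beta = b;
            }
        }
    }
}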

@ghost

ghost commented Mar 22, 2017

How about combining anomaly detection with taking a snapshot (#309) to a long-term store? This significantly reduces the long-term storage requirements without sacrificing the value of the detail around when 'something happened'. It also spreads the work of anomaly detection across the monitored pool, so there is no huge MI platform in the middle...

@ktsaou
Member Author

ktsaou commented Mar 23, 2017

@PhlashGBG I am not sure I get it. Could you please explain it a bit more?

@ghost

ghost commented Mar 23, 2017

Heh ok :) As a possible 'enterprise user' of a monitoring tool like netdata, I would likely want to centralise storage and apply machine intelligence (MI) to detect / trace issues across my estate of thousands of real/virtual machines, much as monitoring services like NewRelic and cloud providers such as Azure and AWS can already do. This sort of works (see below) when sampling at 5min+ intervals, but will fail horribly at 1sec sample intervals due to the volume of data involved.

Why would I want 1sec sampling fed to MI? Because many incidents are ephemeral, and 5min+ sampling completely misses what actually happened; more detail gives me much more ability to diagnose and fix.

My suggestion is thus to limit the amount of data being sent to storage / MI by filtering out the boring stuff using local anomaly detection (simple MI) on each server, which also scales much better than central MI. When anomalies are detected, netdata can send a snapshot of monitoring data around the anomaly (before, during and after) to storage, much like a hardware logic analyser or storage oscilloscope captures the events leading up to and around a trigger point. This gives the MI (or the humans) the detail required to diagnose and fix, without overloading the centralised storage / MI with yards of boring junk.
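
In code terms, the trigger idea might look something like this (a minimal sketch; all names are hypothetical, and it assumes the ring has been filled at least once):

// keep the last PRE samples in a ring buffer; when an anomaly triggers,
// ship the buffered history plus the next POST samples to the store -
// like a storage oscilloscope capturing around a trigger point
#define PRE  60
#define POST 60

void send_to_store(long double value); // hypothetical long-term store sink

typedef struct {
    long double ring[PRE];  // pre-trigger history
    size_t head;            // next write position in the ring
    size_t post_remaining;  // > 0 while still shipping post-trigger samples
} capture_t;

void capture_sample(capture_t *c, long double value, int anomaly) {
    if(anomaly && c->post_remaining == 0) {
        // trigger: flush the history leading up to the anomaly, oldest first
        for(size_t i = 0; i < PRE; i++)
            send_to_store(c->ring[(c->head + i) % PRE]);
        c->post_remaining = POST;
    }

    if(c->post_remaining > 0) {
        send_to_store(value); // still inside the capture window
        c->post_remaining--;
    }

    c->ring[c->head] = value;
    c->head = (c->head + 1) % PRE;
}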

@ktsaou
Member Author

ktsaou commented Mar 23, 2017

ok, I see. Nice idea. Although I am not sure how MI will work with non-regular data. MI is supposed to find issues in data that follow the same principles; if it suddenly gets so much more detail, I am not sure what the outcome will be. I am not a data scientist though, so I don't know.

Keep in mind, you can use multiple netdata instances (even running on the same host) to broadcast the same metrics to different backends at different detail. For example:

  1. netdata 1: receives all metrics from all hosts, maintains a small db, archives metrics to time-series database A with 5 second resolution (A has a retention of 1 month) and sends all metrics to netdata 2.

  2. netdata 2: receives all metrics from netdata 1, maintains a small db, archives all metrics to time-series database B with 1 minute resolution (B has a retention of 3 months) and sends all metrics to netdata 3.

  3. netdata 3: receives all metrics from netdata 2, maintains a large database (a few days) with memory mode map (swap like) and archives all metrics to time-series database C with 5 minute resolution (C has a retention of 1 year).

So, in this setup:

i. you have a large high resolution database maintained by netdata 3
ii. you have 3 time-series databases, each with a different resolution and retention policy.
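
As a sketch, each hop in this chain is just streaming plus archiving configuration - something like this (hostnames, the API key and the graphite backend below are made-up examples):

# netdata 1 - stream.conf: send all metrics to netdata 2
[stream]
    enabled = yes
    destination = netdata2.example.com:19999
    api key = 11111111-2222-3333-4444-555555555555

# netdata 1 - netdata.conf: archive to time-series database A at 5s resolution
[backend]
    enabled = yes
    type = graphite
    destination = tsdb-a.example.com
    update every = 5

netdata 2 and netdata 3 would repeat the same pattern with their own destination and update every values.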

@paulfantom
Contributor

I think this was already implemented. Closing.

@paulfantom paulfantom added feature request New features and removed module/core labels Nov 18, 2018