Struggling writing custom plugins #206

tferreira · 2016-04-10T12:32:07Z

Hi!

I'm really excited about monitoring all my microservices with netdata, but I am running into some issues when writing my own plugins.

Indeed, to get a chart that updates every second, I need to print data to stdout at least every 10ms, which causes high CPU usage. The more I increase this interval, the more blank gaps appear in the lines.

Also, if I don't add the CHART and DIMENSION lines to this output every time, along with the BEGIN/SET/END ones, the gaps get worse.

By the way, I am using the pseudo code found here, without any modifications and without any data collection that would make the script sleep:
https://github.com/firehol/netdata/wiki/External-Plugins#writing-plugins-properly

Any idea what I may be doing wrong?

ktsaou · 2016-04-10T14:55:24Z

nice you are trying it!
For sure we need better documentation.

So, here are a few rules:

1. CHART and DIMENSION are needed only once. Re-send them only if you need to add dimensions.
2. There is no point in collecting values more frequently than the update every interval. Of course you can do so occasionally, and you can also skip a few iterations under certain conditions. What I do (it is in the pseudo code) is decide upfront when the next collection should happen, and sleep exactly the amount of time needed until then. I never change the beat. I can skip a few beats, but the collection is always aligned to a beat.
3. The BEGIN statement also accepts an optional second parameter: the number of microseconds (1 sec = 1,000,000) since the last time you collected data. Netdata interpolates collected values to second boundaries, and under heavy system load there are latencies. This value improves accuracy. If you can't measure it, just omit the parameter; netdata will then use the time at which it received the values from the plugin.
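
Rules 2 and 3 can be sketched roughly as follows. This is only an illustration under assumptions, not netdata's own code: the helper names next_beat, begin_line and sleep_until and the chart id example.random are hypothetical, and aligning the beat to wall-clock multiples of the interval is one possible choice. Only the "BEGIN chart microseconds" form comes from the rules above.

```python
import time

def next_beat(now_ms, update_every_ms):
    """Return the first multiple of update_every_ms strictly after now_ms."""
    return (now_ms // update_every_ms + 1) * update_every_ms

def begin_line(chart_id, dt_ms=None):
    """Build a BEGIN line; dt_ms is the time since the previous collection."""
    if dt_ms is None:
        # First iteration, or dt could not be measured: omit the parameter.
        return 'BEGIN %s' % chart_id
    # The optional second parameter is expressed in microseconds.
    return 'BEGIN %s %d' % (chart_id, dt_ms * 1000)

def sleep_until(target_ms):
    """Sleep just long enough to reach target_ms, instead of polling."""
    remaining_ms = target_ms - int(time.time() * 1000)
    if remaining_ms > 0:
        time.sleep(remaining_ms / 1000.0)
```

A collection loop would call next_beat() with the current time, sleep_until() that target, collect the values, and then print begin_line() followed by the SET and END lines. If a collection overruns, next_beat() simply lands on a later beat, so the rhythm itself never changes.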

If you can share your plugin, I can have a look.
In which language did you write it?

tferreira · 2016-04-10T17:00:15Z

Here is the plugin I use for my tests:

#!/usr/bin/python3.5

import time

update_every = 1 * 1000
get_millis = lambda: int(round(time.time() * 1000))

count = 0
last_run = 0
next_run = get_millis()

print ('CHART test.test test "Test" nb test test line 1000 1')
print ('DIMENSION test test absolute 1 1')

while True:
    now = get_millis()
    if next_run <= now:
        count += 1

        while next_run < now:
            next_run += 1000

        dt_since = (now - last_run) * update_every
        last_run = now

        if count == 1:
            print ('BEGIN test.test')
        else:
            print ('BEGIN test.test %s' % (dt_since))
        print ('SET test = 5')
        print ('END')
    time.sleep(update_every/10000) #100ms sleep

Could it be that Python is not precise enough?

ktsaou · 2016-04-10T18:26:01Z

ok, it was a bit strange (I am not a python expert really - actually this was my first python program).
This is what I finally used:

#!/usr/bin/python3 -u
# use -u at shebang to disable buffering
# http://stackoverflow.com/questions/107705/disable-output-buffering

import sys, errno, time, random, argparse

sys.stdin.close()

parser = argparse.ArgumentParser(description='my super duper netdata module')
parser.add_argument('update_every', type=int, nargs='?', help='update frequency in seconds')
args = parser.parse_args()

# internal defaults for the command line arguments
update_every = 1

# evaluate the command line arguments
if args.update_every is not None:
    update_every = args.update_every

# various preparations
update_every *= 1000
get_millis = lambda: int(round(time.time() * 1000))

# generate the charts
try:
    sys.stdout.write('CHART example.tferreira tferreira "Netdata Issue 206" "my unit" "my family" "my category" line 100000 %s\n' % int(update_every / 1000))
    sys.stdout.write('DIMENSION value1 "random number 1" absolute 1 1\n')
    sys.stdout.flush()
except IOError as e:
    sys.stderr.write('Failed to send data to netdata\n')
    sys.exit(0)

# the main loop
count = 0
last_run = next_run = now = get_millis()
while True:
    if next_run <= now:
        count += 1

        # DO DATA COLLECTION HERE
        value1 = random.randint(0, 1000)

        # debugging to know it is working
        # stderr is going to /var/log/netdata/error.log
        # don't enable on production
        #sys.stderr.write('collecting data, iteration No %s\n' % count)
        #sys.stderr.flush()

        # get the current time again
        # data collection may be too slow
        now = get_millis()

        # find the time for the next run
        while next_run <= now:
            next_run += update_every

        # calculate dt = the time we took
        # since the last run
        dt = now - last_run
        last_run = now

        # on the first iteration, don't set dt
        # allowing netdata to align itself
        if count == 1:
            dt = 0

        # send the values to netdata
        try:
            sys.stdout.write('BEGIN example.tferreira %s\n' % (dt * 1000))
            sys.stdout.write('SET value1 = %s\n' % value1)
            sys.stdout.write('END\n')
            sys.stdout.flush()
        except IOError as e:
            sys.stderr.write('Failed to send data to netdata\n')
            sys.exit(0)

    # sleep 1/10 of update_every
    time.sleep(update_every / 1000 / 10)
    now = get_millis()

Pay attention to the -u option to python. By default, python buffers its output unless stdout is a console, so netdata saw nothing for a while and then suddenly received 1000 points at once.

While trying to find out what is happening, I replaced all print with sys.stdout.write and I added sys.stdout.flush(), but I guess these are not needed (the -u option to python solved the problem for good).
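
For anyone who cannot pass -u (for example, when the interpreter is started some other way), explicit flushing after each batch should have the same effect. The sketch below is a minimal illustration; the helper name send is hypothetical and not part of netdata.

```python
import sys

def send(lines, out=sys.stdout):
    """Write a batch of protocol lines, then flush so the reader sees them at once."""
    for line in lines:
        out.write(line + '\n')
    # Without this flush, a piped stdout may hold the data back in its buffer.
    out.flush()
```

Calling send(['BEGIN example.tferreira', 'SET value1 = 5', 'END']) once per iteration pushes the whole BEGIN/SET/END group out together. In Python 3, print(..., flush=True) does the same for a single line.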

[Screenshot of the plugin's chart running in netdata]

tferreira · 2016-04-10T19:06:03Z

Wow, this is awesome! Everything is working perfectly smoothly with the -u option.

I've been using Python for a few years now, and I had never heard about this kind of buffering when running outside of a console.

Thanks a lot for finding this.

spalfs · 2016-04-13T02:56:15Z

Hello,
How would you go about using this script?

Putting it into "/usr/libexec/netdata/plugins.d/pythonscript.chart.sh" ?
I've done just that but then I get the error:
"example.tferreria: chart not found on url '/api/v1/char?chart=example.tferreria'".

How can I go about enabling it?
Sorry, I've been looking through the documentation and cannot find a clear answer.

Thanks

EDIT:
Ah, I have found it.
Putting it into "/usr/libexec/netdata/plugins.d/pythonscript.plugin" is the fix I was looking for.
P.S.
I would like to say this software is great to use and thanks for making it!

Tom

shubh93 · 2017-03-29T11:11:49Z

sys.stdout.write('CHART example.tferreira tferreira "Netdata Issue 206" "my unit" "my family" "my category" line 100000 %s\n' % int(update_every / 1000))
sys.stdout.write('DIMENSION value1 "random number 1" absolute 1 1\n')
How can I add my web application path here?

ktsaou · 2017-03-29T19:50:47Z

> How can I add my web application path here?

You want to add an application path, where?
I might be able to help if I understand what you are trying to achieve.

JPRbrs · 2018-04-04T15:42:29Z

Hello,

Thanks for this explanation; it helped me better understand how plugins work.

However, I'm trying to use the plugin above by placing it at /usr/libexec/netdata/plugins.d/pythonscript.plugin, as @barretttom suggested, and adding example.tferreira to my custom index.html file, but I get the following error:
example.tferreira: chart not found on url "/api/v1/chart?chart=example.tferreira"

I've tried placing the example.chart.py from the example here and that works.

Any help?

JPRbrs · 2018-04-04T15:49:17Z

I got it working!!

I removed the shebang, renamed the file to ferreira.charts.py and moved it to /usr/libexec/netdata/python.d.

netdata-community-bot · 2021-02-03T18:51:41Z

This issue has been mentioned on the Netdata Community. There might be relevant details there:

https://community.netdata.cloud/t/writing-a-custom-python-plugin/869/1


ktsaou added the question label Apr 10, 2016

tferreira closed this as completed Apr 10, 2016

sanaumer mentioned this issue May 23, 2016

Data Collection in Python Custom Plug-ins #447

Closed

paulfantom mentioned this issue Jun 9, 2016

MySQL plugin question #520

Closed

ktsaou mentioned this issue Jun 16, 2016

Please consider changing hashbang in mysql.chart.py to "#!/usr/bin/env python3" #562

Closed

aj-fernando mentioned this issue Jan 2, 2017

Custom plugin consume too much CPU #1495

Closed

7hacker mentioned this issue Mar 8, 2018

How do I debug a python plugin that is not a "Service"? #3522

Closed

Andrepuel mentioned this issue Mar 30, 2020

Synchronization issue/gaps when writing a plugin using Rust #8532

Closed

vkalintiris pushed a commit to vkalintiris/netdata that referenced this issue Dec 13, 2023

apache cleanups (netdata#206)

4734f5a

* cleanups1 * cleanups2 * cleanups3 * minor
