Struggling writing custom plugins #206

Closed
tferreira opened this issue Apr 10, 2016 · 10 comments
@tferreira
Contributor

Hi!

I'm really excited about monitoring all my microservices with netdata, but I'm running into some issues while writing my own plugins.

To get a chart that updates every second, I need to print data to stdout at least every 10 ms, which causes high CPU usage. The more I increase this interval, the more blank gaps appear in the lines.

[image: test]

Also, if I don't include the CHART and DIMENSION lines in the output every time, along with the BEGIN/SET/END ones, the gaps increase as well.

By the way, I am using the pseudo code found here, without any modifications and without adding any data collection that would make the script sleep:
https://github.com/firehol/netdata/wiki/External-Plugins#writing-plugins-properly

Any idea what I may be doing wrong?

@ktsaou
Member

ktsaou commented Apr 10, 2016

Nice that you are trying it!
For sure we need better documentation.

So, here are a few rules:

  1. CHART and DIMENSION are needed only once. Re-send them only if you need to add dimensions.
  2. There is no point in collecting values more frequently than update_every. Of course you can do so occasionally, and you can also skip a few iterations under certain conditions. What I do (it is in the pseudo code) is decide upfront when the next collection should happen and sleep exactly until that time. I never change the beat. I can skip a few beats, but collection is always aligned to the beat.
  3. The BEGIN statement also accepts an optional second parameter: the number of microseconds (1 sec = 1,000,000 microseconds) since the last time you collected data. Netdata interpolates collected values to second boundaries, and under heavy system load there are latencies; this value improves accuracy. If you can't measure it, just omit the parameter. In that case, netdata will use the time at which it received the values from the plugin.
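
As a rough illustration of rules 2 and 3, here is a minimal Python sketch. The helper names `next_beat` and `begin_line` are made up for this example and are not part of netdata; times are assumed to be tracked in milliseconds, as in the plugins in this thread.

```python
def next_beat(now_ms, prev_beat_ms, update_every_ms):
    """Return the first beat strictly after now_ms.

    Beats stay aligned to prev_beat_ms: if collection was slow, whole
    beats are skipped instead of shifting the schedule.
    """
    if now_ms < prev_beat_ms:
        # The scheduled beat has not been reached yet; keep it.
        return prev_beat_ms
    missed = (now_ms - prev_beat_ms) // update_every_ms
    return prev_beat_ms + (missed + 1) * update_every_ms


def begin_line(chart_id, dt_ms=None):
    """Format a BEGIN line.

    dt_ms is the time since the last collection in milliseconds; it is
    sent as microseconds. Pass None to omit the optional parameter.
    """
    if dt_ms is None:
        return 'BEGIN %s' % chart_id
    return 'BEGIN %s %d' % (chart_id, dt_ms * 1000)
```

A collection loop would then sleep for roughly `(next_beat(...) - now_ms) / 1000` seconds, so it wakes once per beat rather than polling.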

If you can share your plugin, I can have a look.
In which language did you write it?

@tferreira
Contributor Author

Here is the plugin I use for my tests:

#!/usr/bin/python3.5

import time

update_every = 1 * 1000
get_millis = lambda: int(round(time.time() * 1000))

count = 0
last_run = 0
next_run = get_millis()

print ('CHART test.test test "Test" nb test test line 1000 1')
print ('DIMENSION test test absolute 1 1')

while True:
    now = get_millis()
    if next_run <= now:
        count += 1

        while next_run < now:
            next_run += 1000

        dt_since = (now - last_run) * update_every
        last_run = now

        if count == 1:
            print ('BEGIN test.test')
        else:
            print ('BEGIN test.test %s' % (dt_since))
        print ('SET test = 5')
        print ('END')
    time.sleep(update_every/10000) #100ms sleep

Could it be that Python is not precise enough?

@ktsaou
Member

ktsaou commented Apr 10, 2016

ok, it was a bit strange (I am not a python expert really - actually this was my first python program).
This is what I finally used:

#!/usr/bin/python3 -u
# use -u at shebang to disable buffering
# http://stackoverflow.com/questions/107705/disable-output-buffering

import sys, errno, time, random, argparse

sys.stdin.close()

parser = argparse.ArgumentParser(description='my super duper netdata module')
parser.add_argument('update_every', type=int, nargs='?', help='update frequency in seconds')
args = parser.parse_args()

# internal defaults for the command line arguments
update_every = 1

# evaluate the command line arguments
if args.update_every is not None:
    update_every = args.update_every

# various preparations
update_every *= 1000
get_millis = lambda: int(round(time.time() * 1000))

# generate the charts
try:
    sys.stdout.write('CHART example.tferreira tferreira "Netdata Issue 206" "my unit" "my family" "my category" line 100000 %s\n' % int(update_every / 1000))
    sys.stdout.write('DIMENSION value1 "random number 1" absolute 1 1\n')
    sys.stdout.flush()
except IOError as e:
    sys.stderr.write('Failed to send data to netdata\n')
    sys.exit(0)

# the main loop
count = 0
last_run = next_run = now = get_millis()
while True:
    if next_run <= now:
        count += 1

        # DO DATA COLLECTION HERE
        value1 = random.randint(0, 1000)

        # debugging to know it is working
        # stderr is going to /var/log/netdata/error.log
        # don't enable on production
        #sys.stderr.write('collecting data, iteration No %s\n' % count)
        #sys.stderr.flush()

        # get the current time again
        # data collection may be too slow
        now = get_millis()

        # find the time for the next run
        while next_run <= now:
            next_run += update_every

        # calculate dt = the time we took
        # since the last run
        dt = now - last_run
        last_run = now

        # on the first iteration, don't set dt
        # allowing netdata to align itself
        if count == 1:
            dt = 0

        # send the values to netdata
        try:
            sys.stdout.write('BEGIN example.tferreira %s\n' % (dt * 1000))
            sys.stdout.write('SET value1 = %s\n' % value1)
            sys.stdout.write('END\n')
            sys.stdout.flush()
        except IOError as e:
            sys.stderr.write('Failed to send data to netdata\n')
            sys.exit(0)

    # sleep 1/10 of update_every
    time.sleep(update_every / 1000 / 10)
    now = get_millis()

Pay attention to the -u option to python. By default, python buffers its output unless it is writing to a console, so netdata saw nothing for a while and then suddenly received 1000 points at once.

While trying to find out what was happening, I replaced all print calls with sys.stdout.write and added sys.stdout.flush(), but I guess these are not needed (the -u option to python solved the problem for good).
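
If passing -u in the shebang is not an option (for example with an `env`-style shebang, which on Linux typically cannot carry extra arguments), flushing after each block of lines should have the same effect. A minimal sketch, where `emit` is a hypothetical helper name and not part of netdata:

```python
import sys


def emit(lines, out=None):
    """Write protocol lines and flush, so the reader gets them at once."""
    if out is None:
        out = sys.stdout
    for line in lines:
        print(line, file=out)
    # Without this flush, output to a pipe sits in a buffer until it fills.
    out.flush()
```

For example, `emit(['BEGIN example.tferreira', 'SET value1 = 5', 'END'])` sends one complete update. Setting the `PYTHONUNBUFFERED` environment variable is another way to get the behaviour of -u.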

Here is a screenshot of it running:

[image: screenshot of the chart running]

@tferreira
Contributor Author

Wow, this is awesome! Everything is working perfectly smoothly with the -u option.

I've been using Python for a few years now, and I had never heard about this kind of buffering when running outside of a console.

Thanks a lot for finding this.

@spalfs

spalfs commented Apr 13, 2016

Hello,
How would you go about using this script?

Putting it into "/usr/libexec/netdata/plugins.d/pythonscript.chart.sh" ?
I've done just that but then I get the error:
"example.tferreria: chart not found on url '/api/v1/char?chart=example.tferreria'".

How can I go about enabling it?
Sorry, I've been looking through the documentation and cannot find a clear answer.

Thanks

_EDIT_
Ah, I have found it.
Putting it into "/usr/libexec/netdata/plugins.d/pythonscript.plugin" is the fix I was looking for.
P.S.
I would like to say this software is great to use and thanks for making it!

Tom

@shubh93

shubh93 commented Mar 29, 2017

sys.stdout.write('CHART example.tferreira tferreira "Netdata Issue 206" "my unit" "my family" "my category" line 100000 %s\n' % int(update_every / 1000))
sys.stdout.write('DIMENSION value1 "random number 1" absolute 1 1\n')
How can I add my web application path here?

@ktsaou
Member

ktsaou commented Mar 29, 2017

how can i add my web application path here?

You want to add an application path where?
I might be able to help if I understand what you are trying to achieve.

@JPRbrs

JPRbrs commented Apr 4, 2018

Hello,

Thanks for this explanation; it helped me better understand how plugins work.

However, I'm trying to use the plugin above by placing it at /usr/libexec/netdata/plugins.d/pythonscript.plugin, as @barretttom suggested, and adding example.tferreira to my custom index.html file, but I get the following error:
example.tferreira: chart not found on url "/api/v1/chart?chart=example.tferreira"

I've tried placing the example.chart.py from the example here and that works.

Any help?

@JPRbrs

JPRbrs commented Apr 4, 2018

I got it working!!

I've removed the shebang, renamed the file to ferreira.charts.py and moved it to /usr/libexec/netdata/python.d

@netdata-community-bot

This issue has been mentioned on the Netdata Community. There might be relevant details there:

https://community.netdata.cloud/t/writing-a-custom-python-plugin/869/1

vkalintiris pushed a commit to vkalintiris/netdata that referenced this issue Dec 13, 2023
* cleanups1

* cleanups2

* cleanups3

* minor