python.d plugins not storing data #8451

Closed
zdzichu opened this issue Mar 21, 2020 · 12 comments
Labels
area/collectors (everything related to data collection), bug, collectors/python.d, needs triage (issues which need to be manually labelled)

Comments

@zdzichu

zdzichu commented Mar 21, 2020

Bug report summary

Data collected by python.d plugins is not displayed in the web interface.

OS / Environment
Linux mother.pipebreaker.pl 5.5.9-200.fc31.x86_64 #1 SMP Thu Mar 12 13:55:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
/etc/fedora-release:Fedora release 31 (Thirty One)
/etc/os-release:NAME=Fedora
/etc/os-release:VERSION="31 (Workstation Edition)"
/etc/os-release:ID=fedora
/etc/os-release:VERSION_ID=31
/etc/os-release:VERSION_CODENAME=""
/etc/os-release:PLATFORM_ID="platform:f31"
/etc/os-release:PRETTY_NAME="Fedora 31 (Workstation Edition)"
/etc/os-release:ANSI_COLOR="0;34"
/etc/os-release:LOGO=fedora-logo-icon
/etc/os-release:CPE_NAME="cpe:/o:fedoraproject:fedora:31"
/etc/os-release:HOME_URL="https://fedoraproject.org/"
/etc/os-release:DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f31/system-administrators-guide/"
/etc/os-release:SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
/etc/os-release:BUG_REPORT_URL="https://bugzilla.redhat.com/"
/etc/os-release:REDHAT_BUGZILLA_PRODUCT="Fedora"
/etc/os-release:REDHAT_BUGZILLA_PRODUCT_VERSION=31
/etc/os-release:REDHAT_SUPPORT_PRODUCT="Fedora"
/etc/os-release:REDHAT_SUPPORT_PRODUCT_VERSION=31
/etc/os-release:PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
/etc/os-release:VARIANT="Workstation Edition"
/etc/os-release:VARIANT_ID=workstation
Netdata version

Using distribution packages: netdata-1.20.0-1.fc31.x86_64

Component Name

python.d

Steps To Reproduce

I can see that plugins are working when run standalone, for example:

# sudo -u netdata /usr/libexec/netdata/plugins.d/python.d.plugin sensors debug
2020-03-21 16:18:21: python.d INFO: plugin[main] : using python v3
2020-03-21 16:18:21: python.d DEBUG: plugin[main] : looking for 'python.d.conf' in ['/etc/netdata', '/usr/lib64/netdata/conf.d']
2020-03-21 16:18:21: python.d WARNING: plugin[main] : 'python.d.conf' was not found, using defaults                                                                       
2020-03-21 16:18:21: python.d DEBUG: plugin[main] : looking for 'pythond-jobs-statuses.json' in /var/lib/netdata
2020-03-21 16:18:21: python.d DEBUG: plugin[main] : loading '/var/lib/netdata/pythond-jobs-statuses.json'
2020-03-21 16:18:21: python.d DEBUG: plugin[main] : '/var/lib/netdata/pythond-jobs-statuses.json' is loaded                                                               
2020-03-21 16:18:21: python.d DEBUG: plugin[main] : [sensors] looking for 'sensors.conf' in ['/etc/netdata/python.d', '/usr/lib64/netdata/conf.d/python.d']               
2020-03-21 16:18:21: python.d WARNING: plugin[main] : [sensors] 'sensors.conf' was not found                                                                              
2020-03-21 16:18:21: python.d INFO: plugin[main] : [sensors] built 1 job(s) configs  
2020-03-21 16:18:21: python.d DEBUG: plugin[main] : sensors[sensors] was previously active, applying recovering settings                                                  
2020-03-21 16:18:21: python.d INFO: plugin[main] : sensors[sensors] : check success  
CHART netdata.runtime_sensors '' 'Execution time for sensors' 'ms' 'python.d' netdata.pythond_runtime line 145000 1                                                       
DIMENSION run_time 'run time' absolute 1 1                                           
                                                                                                                                                                          
2020-03-21 16:18:21: python.d DEBUG: sensors[sensors] : started, update frequency: 1                                                                                      
CHART sensors.drivetemp-scsi-2-0_temperature '' 'drivetemp-scsi-2-0 temperature' 'Celsius' 'temperature' 'sensors.temperature' line 60000 1 '' 'python.d.plugin' 'sensors'
DIMENSION 'drivetemp-scsi-2-0_temp1' 'temp1' absolute 1 1000 ' '                     
                                                                                     
BEGIN sensors.drivetemp-scsi-2-0_temperature 0                                       
SET 'drivetemp-scsi-2-0_temp1' = 35000    
END                                                                                                                                                                       
                                                                                                                                                                          
CHART sensors.drivetemp-scsi-0-0_temperature '' 'drivetemp-scsi-0-0 temperature' 'Celsius' 'temperature' 'sensors.temperature' line 60001 1 '' 'python.d.plugin' 'sensors'
DIMENSION 'drivetemp-scsi-0-0_temp1' 'temp1' absolute 1 1000 ' '                     
                                                                                     
BEGIN sensors.drivetemp-scsi-0-0_temperature 0                                       
SET 'drivetemp-scsi-0-0_temp1' = 31000                                               
END                                                                                  
                                                                                                                                                                          
CHART sensors.it8728-isa-0a30_temperature '' 'it8728-isa-0a30 temperature' 'Celsius' 'temperature' 'sensors.temperature' line 60002 1 '' 'python.d.plugin' 'sensors'      
DIMENSION 'it8728-isa-0a30_temp1' 'temp1' absolute 1 1000 ' '                        
DIMENSION 'it8728-isa-0a30_temp2' 'temp2' absolute 1 1000 ' '                        
DIMENSION 'it8728-isa-0a30_temp3' 'temp3' absolute 1 1000 ' '                        
                                                                                                                                                                          
BEGIN sensors.it8728-isa-0a30_temperature 0                       
SET 'it8728-isa-0a30_temp1' = 39000                                                                                                                                       
SET 'it8728-isa-0a30_temp3' = -54000                                                                                                                                      
END                                           
[…]

etc. But the web interface doesn't show Sensors, hddtemp, ceph, or any other python.d plugins.
I've built a simple custom dashboard (https://pipebreaker.pl/z/netdata-enviro.html); it stopped working with messages like hddtemp_local.disks_temp: chart not found on url "/api/v1/chart?chart=hddtemp_local.disks_temp"
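
To check from a script whether the agent knows a given chart, the same endpoint that the dashboard error mentions can be queried directly. This is a minimal sketch, assuming the agent listens on http://localhost:19999 (the usual default, not stated in this report) and answers an unknown chart with an HTTP error status; `chart_url` and `chart_exists` are hypothetical helper names.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: default netdata address; adjust for your installation.
DEFAULT_BASE = "http://localhost:19999"


def chart_url(chart_id, base=DEFAULT_BASE):
    """Build the /api/v1/chart URL quoted in the dashboard error message."""
    query = urllib.parse.urlencode({"chart": chart_id})
    return f"{base}/api/v1/chart?{query}"


def chart_exists(chart_id, base=DEFAULT_BASE, timeout=5):
    """Return True if the agent serves the chart, False on an HTTP error status.

    Connection problems (agent down, wrong port) are not swallowed; they
    propagate as urllib.error.URLError so they are not mistaken for a
    missing chart.
    """
    try:
        with urllib.request.urlopen(chart_url(chart_id, base), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False


# Example call (needs a running agent):
#   chart_exists("hddtemp_local.disks_temp")
```

If this returns False while the collector prints data in a standalone run, the data is not reaching the daemon, which points at the plugin process started by netdata rather than at the dashboard.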

error_log doesn't have anything:

# grep sensors *
error.log:2020-03-21 16:05:22: charts.d: INFO: sensors: is disabled. Add a line with sensors=force in '/etc/netdata/charts.d.conf' to enable it (or remove the line that disables it).
error.log:2020-03-21 16:05:24: python.d DEBUG: plugin[main] : [sensors] looking for 'sensors.conf' in ['/etc/netdata/python.d', '/etc/netdata/conf.d/python.d']
error.log:2020-03-21 16:05:24: python.d DEBUG: plugin[main] : [sensors] loading '/etc/netdata/conf.d/python.d/sensors.conf'
error.log:2020-03-21 16:05:24: python.d DEBUG: plugin[main] : [sensors] '/etc/netdata/conf.d/python.d/sensors.conf' is loaded
error.log:2020-03-21 16:05:24: python.d INFO: plugin[main] : [sensors] built 1 job(s) configs
error.log:2020-03-21 16:05:24: python.d DEBUG: plugin[main] : sensors[sensors] was previously active, applying recovering settings

Also reported downstream: https://bugzilla.redhat.com/show_bug.cgi?id=1815197

@zdzichu added the bug and needs triage labels Mar 21, 2020
@ilyam8
Member

ilyam8 commented Mar 21, 2020

@zdzichu 👋

Could you test it without the module name?

sudo -u netdata /usr/libexec/netdata/plugins.d/python.d.plugin debug

@zdzichu
Author

zdzichu commented Mar 21, 2020

Without the module name the collectors also work. I'm attaching the output as a file, as it is quite big.

netdata-python-debug.txt

@ilyam8
Member

ilyam8 commented Mar 21, 2020

It should work, then.

error_log doesn't have anything

Do the following:

  • cp /dev/null error.log
  • systemctl restart netdata.service
  • wait a bit, then grep python error.log

It should contain all the info about started jobs.

Refresh your browser and make sure the URL has no after/before parameters; the Sensors section should be on the dashboard.

@zdzichu
Author

zdzichu commented Mar 21, 2020

Sadly, it doesn't work. I waited about ten minutes, and there is still no Sensors section.
Any hints on debugging socket communication between python.d and the main netdata process?
netdata_python_error.log
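
On the communication question: as far as I understand, python.d.plugin talks to the netdata daemon by writing text lines (CHART, DIMENSION, BEGIN, SET, END) to its standard output, which is what the standalone run earlier in this thread prints. The following minimal sketch, assuming that line layout (DIMENSION id, name, algorithm, multiplier, divisor), decodes captured output into scaled values so it can be compared with what the dashboard shows; `decode_plugin_output` is a hypothetical helper, not part of netdata.

```python
import shlex

KEYWORDS = ("CHART", "DIMENSION", "BEGIN", "SET", "END")


def decode_plugin_output(lines):
    """Return {chart_id: {dimension_id: scaled_value}} from plugin stdout lines."""
    scale = {}       # (chart id, dimension id) -> (multiplier, divisor)
    values = {}      # chart id -> {dimension id: scaled value}
    defining = None  # chart whose DIMENSION lines are being read
    updating = None  # chart currently open between BEGIN and END
    for line in lines:
        stripped = line.strip()
        # Skip log lines and blanks; only protocol lines are tokenized.
        if not stripped.startswith(KEYWORDS):
            continue
        parts = shlex.split(stripped)  # honours the single-quoted fields
        keyword = parts[0]
        if keyword == "CHART" and len(parts) >= 2:
            defining = parts[1]
        elif keyword == "DIMENSION" and len(parts) >= 6:
            scale[(defining, parts[1])] = (int(parts[4]), int(parts[5]))
        elif keyword == "BEGIN" and len(parts) >= 2:
            updating = parts[1]
            values.setdefault(updating, {})
        elif keyword == "SET" and updating is not None and len(parts) >= 4:
            mult, div = scale.get((updating, parts[1]), (1, 1))
            values[updating][parts[1]] = int(parts[3]) * mult / div
        elif keyword == "END":
            updating = None
    return values
```

With the output quoted in the report, `SET 'drivetemp-scsi-2-0_temp1' = 35000` with a divisor of 1000 decodes to 35.0 (Celsius).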

@ilyam8
Member

ilyam8 commented Mar 21, 2020

2020-03-21 22:38:53: python.d ERROR: apache[localhost] : Url: http://localhost/server-status?auto. Error: HTTPSConnectionPool(host='localhost', port=443): Max retries exceeded with url: /server-status?auto (Caused by SSLError(SSLCertVerificationError("hostname 'localhost' doesn't match either of 'enotty.pipebreaker.pl', 'fotki.pipebreaker.pl', 'mother.pipebreaker.pl', 'pipebreaker.pl'")))
2020-03-21 22:38:53: python.d ERROR: apache[localhost] : Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 376, in _make_request
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
  File "/usr/lib/python3.7/site-packages/urllib3/connection.py", line 420, in connect
  File "/usr/lib/python3.7/site-packages/urllib3/connection.py", line 430, in _match_hostname
  File "/usr/lib64/python3.7/ssl.py", line 334, in match_hostname
  File "/usr/libexec/netdata/python.d/python_modules/bases/FrameworkServices/UrlService.py", line 123, in _get_raw_data
  File "/usr/libexec/netdata/python.d/python_modules/bases/FrameworkServices/UrlService.py", line 155, in _get_raw_data_with_status
  File "/usr/lib/python3.7/site-packages/urllib3/request.py", line 76, in request
  File "/usr/lib/python3.7/site-packages/urllib3/request.py", line 97, in request_encode_url
  File "/usr/lib/python3.7/site-packages/urllib3/poolmanager.py", line 369, in urlopen
  File "/usr/lib/python3.7/site-packages/urllib3/poolmanager.py", line 330, in urlopen
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
  File "/usr/lib/python3.7/site-packages/urllib3/util/retry.py", line 436, in increment
2020-03-21 22:38:53: python.d INFO: plugin[main] : apache[localipv4] : check failed

Does your server redirect HTTP to HTTPS?

I see the last line is

2020-03-21 22:38:53: python.d INFO: plugin[main] : boinc[boinc] : check failed

By the way, is that the whole output?

@zdzichu
Author

zdzichu commented Mar 22, 2020

Yes, it redirects, but I'm not concerned about the apache plugin right now. I'm using a different server.
This is the full output.

@ilyam8
Member

ilyam8 commented Mar 22, 2020

Let's try disabling the boinc module in python.d.conf.

Also, do the same steps but without grep, and share the full error.log.

@zdzichu
Author

zdzichu commented Mar 22, 2020

Here's error.log with boinc disabled (boinc=no).

netdata_error.log

@ilyam8
Member

ilyam8 commented Mar 22, 2020

I don't understand what the problem is; there is no relevant info in error.log.

It loads all jobs, starts init and check, checks a few modules, and that is it (according to the log).

...
2020-03-22 15:42:45: python.d ERROR: am2320[am2320] : Could not find the adafruit-circuitpython-am2320 package.
2020-03-22 15:42:45: python.d INFO: plugin[main] : am2320[am2320] : check failed
2020-03-22 15:42:45: python.d ERROR: beanstalk[beanstalk] : 'beanstalkc' module is needed to use beanstalk.chart.py
2020-03-22 15:42:45: python.d INFO: plugin[main] : beanstalk[beanstalk] : check failed
2020-03-22 15:42:45: python.d ERROR: bind_rndc[bind_rndc] : Can't locate "rndc" binary or binary is not executable by netdata
2020-03-22 15:42:45: python.d INFO: plugin[main] : bind_rndc[bind_rndc] : check failed

Nothing after.

At the same time

2020-03-22 15:42:45: python.d INFO: plugin[main] : [sensors] built 1 job(s) configs
2020-03-22 15:42:45: python.d DEBUG: plugin[main] : sensors[sensors] was previously active, applying recovering settings

which means that the sensors module job was previously active 🤷‍♂
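
The observation above (job configs were built for a module, but the log ends before any check result for it) can be automated. This is a minimal sketch, assuming only the two log line shapes quoted in this thread ("[module] built N job(s) configs" and "module[job] : check success/failed"); `modules_without_check_result` is a hypothetical helper.

```python
import re

# "... plugin[main] : [sensors] built 1 job(s) configs"
BUILT = re.compile(r"plugin\[main\] : \[(?P<module>[^\]]+)\] built \d+ job\(s\) configs")
# "... plugin[main] : am2320[am2320] : check failed"
CHECKED = re.compile(
    r"plugin\[main\] : (?P<module>[^\[\s]+)\[[^\]]+\] : check (?P<result>success|failed)"
)


def modules_without_check_result(lines):
    """List modules whose jobs were built but never reported a check result.

    The result keeps the order in which the modules first appear in the log.
    """
    built = []
    checked = set()
    for line in lines:
        match = BUILT.search(line)
        if match:
            if match.group("module") not in built:
                built.append(match.group("module"))
            continue
        match = CHECKED.search(line)
        if match:
            checked.add(match.group("module"))
    return [module for module in built if module not in checked]


# Example call:
#   with open("error.log") as log:
#       print(modules_without_check_result(log))
```

If checks run one after another, a module in this list is a candidate for where the plugin is stuck; the list alone does not prove which one is hanging.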

@zdzichu
Author

zdzichu commented Mar 23, 2020

Found it!
The problem was actually in the ceph module. My Ceph cluster is not fully healthy right now (the ceph-mgr daemons are failing). The ceph monitoring module basically runs the ceph osd pool stats --format json command.
Without the MGRs, that command just hangs waiting for an answer. The hanging command halted everything in python.d, so no job was making progress.
After disabling the ceph module, everything works as before.
Maybe there should be a timeout for each job within python.d?
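
For a collector that shells out to an external command, the kind of timeout suggested here can be sketched with the standard library alone. This is an illustration under assumptions, not the ceph module's actual code (it may well go through a client library instead of the CLI); `run_with_timeout` is a hypothetical helper.

```python
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd and return its stdout as text.

    Returns None if the command does not finish within `timeout` seconds;
    subprocess.run kills the child process in that case. A non-zero exit
    status raises subprocess.CalledProcessError.
    """
    try:
        done = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


# Example call with the command mentioned above:
#   run_with_timeout(["ceph", "osd", "pool", "stats", "--format", "json"], timeout=5)
```

Returning None lets the caller treat a hung command like any other failed data collection, so one unhealthy service would not stall the other jobs.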

@zdzichu zdzichu closed this as completed Mar 23, 2020
@ilyam8
Member

ilyam8 commented Mar 23, 2020

I think this should be handled by the module/library. I see there is an optional timeout parameter for mon_command; we need to use it.

https://docs.ceph.com/docs/master/rados/api/python/#cli-commands

@ilyam8
Member

ilyam8 commented Mar 23, 2020

Well, I see that the documentation says:

timeout – This parameter is ignored.

@PaulMez 👋 any ideas how to handle the hang problem?
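
When a library call cannot time out by itself, one generic option is to run it in a daemon thread and stop waiting after a deadline. This is a minimal sketch, not a proposal for how netdata handles it; `call_with_timeout` is a hypothetical helper. Note that the stuck thread is abandoned rather than killed, so it keeps holding whatever connections or locks it has.

```python
import queue
import threading


def call_with_timeout(func, timeout, *args, **kwargs):
    """Call func(*args, **kwargs) in a daemon thread.

    Returns the function's result, re-raises its exception, or raises
    TimeoutError if no result arrives within `timeout` seconds.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put((True, func(*args, **kwargs)))
        except Exception as exc:  # hand the error over to the caller
            results.put((False, exc))

    # daemon=True: a thread that never returns does not block interpreter exit.
    threading.Thread(target=worker, daemon=True).start()
    try:
        ok, value = results.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"no result after {timeout} seconds") from None
    if not ok:
        raise value
    return value
```

A job whose data collection raises TimeoutError could then be reported as failed for that iteration instead of blocking every other job. Running the call in a separate process would also allow the hung call to be terminated, at the cost of more overhead.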

@ilyam8 added the collectors/python.d and area/collectors labels and removed the area/external/python label Apr 14, 2022
2 participants