Multi Output Fields are not parsed correctly #2979

Closed
eschoeller opened this issue Sep 26, 2019 · 4 comments
Labels: bug (Undesired behaviour), resolved (A fixed issue)
Milestone: v1.2.7

Comments

eschoeller commented Sep 26, 2019

Here's another one. As you know, I have a main data collector and 3 'remote' data collectors, and all of our Linux servers are on one of the remote data collectors. When I ran into boost problems a few days ago, I immediately disabled all of them to bring down the load on the system. Now I have attempted to re-enable them. In doing so, the poller run time on the main data collector increases dramatically as more of the Linux servers are enabled on the remote data collector.

After turning on medium level verbosity I'm seeing the main data collector flooded with these messages:

2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'finwait2:0' [map finwait2->finwait2]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'synsent:0' [map synsent->synsent]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'synrecv:0' [map synrecv->synrecv]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'closewait:0' [map closewait->closewait]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'file_sz:2048' [map file_sz->file_sz]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'totsck:621' [map totsck->totsck]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'tcpsck:10' [map tcpsck->tcpsck]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'udpsck:6' [map udpsck->udpsck]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'rawsck:0' [map rawsck->rawsck]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'ip_frag:0' [map ip_frag->ip_frag]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'plist_sz:245' [map plist_sz->plist_sz]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'procs:1.00' [map procs->procs]
2019/09/25 19:55:52 - POLLER: Poller[1] Parsed MULTI output field 'cswchs:137.00' [map cswchs->cswchs]

I recognize this output as being related to the data sources associated with our Linux devices, and not with anything the main data collector should be currently polling.
So I am left wondering why the main data collector is taking on this multi output field parsing from the remote data collector.

I attempted to move all the Linux servers to a different remote data collector to see if it would resolve the problem. But I'm still seeing all this activity on the main data collector.
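
For context on the log lines above: each "Parsed MULTI output field" entry corresponds to one name:value pair taken from a script's space-separated output and mapped to a data source field. A minimal Python sketch of that splitting step follows. It is only an illustration; Cacti's own parser is written in PHP, and the function name `parse_multi_output` and the sample string are assumptions made for this example.

```python
def parse_multi_output(output):
    """Split space-separated 'name:value' pairs into a dict.

    Hypothetical helper for illustration only; not Cacti's implementation.
    """
    fields = {}
    for token in output.split():
        # partition() splits on the first ':' only, so the value may contain ':'
        name, sep, value = token.partition(":")
        if sep and name:  # ignore tokens without a 'name:' prefix
            fields[name] = value
    return fields


# Sample built from field names and values seen in the log excerpt above
sample = "finwait2:0 synsent:0 file_sz:2048 totsck:621 procs:1.00"
print(parse_multi_output(sample))
```

Values are kept as strings here; converting them to numbers is left to the caller.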

cigamit changed the title from "[1.2.6] Multi Output Field Parsing on Main Data Collector" to "Multi Output Field Parsing on Main Data Collector" on Sep 26, 2019
cigamit added a commit that referenced this issue on Sep 26, 2019: Multi Output Field Parsing on Main Data Collector
cigamit added the bug (Undesired behaviour) and resolved (A fixed issue) labels on Sep 26, 2019
cigamit added this to the v1.2.7 milestone on Sep 26, 2019

eschoeller commented Sep 26, 2019

Quick fix! I think it took me longer to apply the updates than it took you to write the code.
So yes, the messages have gone away, but I'm still left wondering why the main data collector runtime keeps increasing when I enable devices on a remote data collector.
Now I just have no logs for about 15 seconds of poller runtime :)

2019/09/25 20:09:29 - SPINE: Poller[1] PID[57339] Device[693] HT[1] Total Time: 23 Seconds
2019/09/25 20:09:29 - SPINE: Poller[1] PID[57339] POLLER: Active Threads is 0, Pending is 0
2019/09/25 20:09:29 - SPINE: Poller[1] PID[57339] SPINE: The Final Value of Threads is 0
2019/09/25 20:09:30 - SPINE: Poller[1] PID[57339] Time: 26.5788 s, Threads: 32, Devices: 53
2019/09/25 20:09:44 - SYSTEM STATS: Time:41.1043 Method:spine Processes:1 Threads:32 Hosts:53 HostsPerProcess:53 DataSources:2508 RRDsProcessed:0
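
The "about 15 seconds" figure can be read off the last two log lines: spine reports finishing after 26.5788 s, while the SYSTEM STATS line reports a total of 41.1043 s. A small Python sketch of that arithmetic is below. The regular expressions and the function name `unlogged_gap` are assumptions written against the log format shown here, not part of Cacti.

```python
import re

# Patterns written against the two log lines quoted above (assumed format)
SPINE_RE = re.compile(r"SPINE: Poller\[\d+\] PID\[\d+\] Time: ([\d.]+) s")
STATS_RE = re.compile(r"SYSTEM STATS: Time:([\d.]+)")


def unlogged_gap(lines):
    """Return SYSTEM STATS total time minus spine's reported run time, in seconds."""
    spine = stats = None
    for line in lines:
        m = SPINE_RE.search(line)
        if m:
            spine = float(m.group(1))
            continue
        m = STATS_RE.search(line)
        if m:
            stats = float(m.group(1))
    if spine is None or stats is None:
        raise ValueError("need both a spine Time line and a SYSTEM STATS line")
    return stats - spine


log = [
    "2019/09/25 20:09:30 - SPINE: Poller[1] PID[57339] Time: 26.5788 s, Threads: 32, Devices: 53",
    "2019/09/25 20:09:44 - SYSTEM STATS: Time:41.1043 Method:spine Processes:1 Threads:32 Hosts:53 HostsPerProcess:53 DataSources:2508 RRDsProcessed:0",
]
gap = unlogged_gap(log)
print(f"{gap:.4f}")  # 14.5255
```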


cigamit commented Sep 26, 2019

It has to wait on all the remotes to finish.


eschoeller commented Sep 26, 2019

Oh, that has not been my experience at all. Here, take a look:
[Graph: cacti_graph_68034]
In this case the two remote pollers, thorn-b and thorn-c, were consistently running longer than the main poller was.
I just kicked up the logging verbosity and found what it was doing in those additional 15 seconds: more of the MULTI output field parsing, but I think these are for items that run directly on the main poller.

2019/09/25 21:21:31 - SPINE: Poller[1] PID[31588] Device[693] HT[1] Total Time: 25 Seconds
2019/09/25 21:21:31 - SPINE: Poller[1] PID[31588] POLLER: Active Threads is 0, Pending is 0
2019/09/25 21:21:31 - SPINE: Poller[1] PID[31588] SPINE: The Final Value of Threads is 0
2019/09/25 21:21:32 - SPINE: Poller[1] PID[31588] Time: 29.1516 s, Threads: 32, Devices: 56
.....
2019/09/25 21:20:35 - POLLER: Poller[1] Parsed MULTI output field 'systemTotalPower:1834' [map systemTotalPower->systemTotalPower]
2019/09/25 21:20:36 - POLLER: Poller[1] Parsed MULTI output field 'systemTotalPower:4400' [map systemTotalPower->systemTotalPower]
2019/09/25 21:20:36 - POLLER: Poller[1] Parsed MULTI output field 'systemTotalPower:4111' [map systemTotalPower->systemTotalPower]
2019/09/25 21:20:36 - POLLER: Poller[1] Parsed MULTI output field 'systemTotalPower:5103' [map systemTotalPower->systemTotalPower]
2019/09/25 21:20:36 - POLLER: Poller[1] Parsed MULTI output field 'systemTotalPower:7101' [map systemTotalPower->systemTotalPower]
2019/09/25 21:20:36 - POLLER: Poller[1] Parsed MULTI output field 'systemTotalPower:3546' [map systemTotalPower->systemTotalPower]
2019/09/25 21:20:37 - SYSTEM STATS: Time:34.7067 Method:spine Processes:1 Threads:32 Hosts:56 HostsPerProcess:56 DataSources:2982 RRDsProcessed:0

(and yes, in this case it was not 15 seconds of MULTI output activity, more like 5)


cigamit commented Sep 26, 2019

Hmm, I'm going to have to give that some thought.

netniV changed the title from "Multi Output Field Parsing on Main Data Collector" to "Multi Output Fields are not parsed correctly" on Sep 28, 2019
cigamit closed this as completed on Sep 30, 2019
github-actions bot locked and limited conversation to collaborators on Jun 30, 2020