[1.2.19] - Under heavy load Primary poller process can go into a loop #4450

Closed
seanmancini opened this issue Nov 1, 2021 · 3 comments
Labels
bug Undesired behaviour duplicate Duplicate of another issue resolved A fixed issue
Comments

@seanmancini

While adding 500+ devices, poller.php continues to run long after spine has finished.
The poller.php processes continue to mount up and poller.php never exits.

Strace output shows the poller just flipping between two statements:

recvfrom(6, "\1\0\0\1\0106\0\0\2\3def\5cacti\2po\rpoller_ou"..., 32768, MSG_DONTWAIT, NULL, NULL) = 1837
sendto(6, "*\0\0\0\3SELECT * FROM data_debug WH"..., 46, MSG_DONTWAIT, NULL, 0) = 46
poll([{fd=6, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=6, revents=POLLIN}])
recvfrom(6, "\1\0\0\1\0073\0\0\2\3def\5cacti\ndata_debug\nd"..., 32768, MSG_DONTWAIT, NULL, NULL) = 452
sendto(6, "I\0\0\0\3REPLACE INTO settings SET n"..., 77, MSG_DONTWAIT, NULL, 0) = 77
poll([{fd=6, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=6, revents=POLLIN}])
recvfrom(6, "\7\0\0\1\0\1\0\2\0\0\0", 32768, MSG_DONTWAIT, NULL, NULL) = 11
sendto(6, "/\0\0\0\3SELECT COUNT(local_data_id)"..., 51, MSG_DONTWAIT, NULL, 0) = 51
poll([{fd=6, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=6, revents=POLLIN}])
recvfrom(6, "\1\0\0\1\1*\0\0\2\3def\0\0\0\24COUNT(local_dat"..., 32768, MSG_DONTWAIT, NULL, NULL) = 76
sendto(6, "/\0\0\0\3SELECT COUNT(local_data_id)"..., 51, MSG_DONTWAIT, NULL, 0) = 51
poll([{fd=6, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=6, revents=POLLIN}])
recvfrom(6, "\1\0\0\1\1*\0\0\2\3def\0\0\0\24COUNT(local_dat"..., 32768, MSG_DONTWAIT, NULL, NULL) = 76
sendto(6, "U\1\0\0\3SELECT po.output, po.time, "..., 345, MSG_DONTWAIT, NULL, 0) = 345
poll([{fd=6, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000^Cstrace: Process 841667 detached
 <detached ...>

When I comment out this block of code, everything works fine:

/* determine the number of active profiles to improve poller performance
 * under some circumstances.  Save this data for spine and cmd.php.
 */
/* $active_profiles = db_fetch_cell('SELECT COUNT(DISTINCT data_source_profile_id)
        FROM data_template_data
        WHERE local_data_id > 0');
set_config_option('active_profiles', $active_profiles);

*/

However, I don't think that's where the problem is. I believe it's somewhere around here:

/* assume a scheduled task of either 60 or 300 seconds */
if (!empty($poller_interval)) {
        $poller_runs = intval($cron_interval / $poller_interval);

        if ($active_profiles != 1) {
                $sql_where   = "WHERE rrd_next_step - $poller_interval <= 0 AND poller_id = $poller_id";
        } else {
                $sql_where   = "WHERE poller_id = $poller_id";
        }
}
Manually running

SELECT COUNT(DISTINCT data_source_profile_id)
FROM data_template_data
WHERE local_data_id > 0

returns the proper value of 1 on the primary poller.
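The shape of that query can be illustrated against a throwaway sqlite3 database with a mock table (table contents invented here purely for illustration; Cacti itself uses MySQL/MariaDB):

```shell
# Mock stand-in for Cacti's data_template_data table (illustrative only).
# Rows with local_data_id = 0 are template definitions, so the WHERE
# clause excludes them and only the profiles of real data sources count.
sqlite3 :memory: <<'SQL'
CREATE TABLE data_template_data (local_data_id INTEGER, data_source_profile_id INTEGER);
INSERT INTO data_template_data VALUES (1,1),(2,1),(3,1),(0,2);
SELECT COUNT(DISTINCT data_source_profile_id) FROM data_template_data WHERE local_data_id > 0;
SQL
```

If the cached `active_profiles` setting disagrees with what this query returns live, the poller takes the wrong `$sql_where` branch above.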

This only affects the primary poller; remote pollers do not see this issue, as far as I can tell.

@seanmancini seanmancini added bug Undesired behaviour unverified Some days we don't have a clue labels Nov 1, 2021
@netniV
Member

netniV commented Nov 1, 2021

As suggested, keep adding logging to various elements with the process ID and a different description so it can be tracked each time.

@TheWitness
Member

Stop cactid and run the poller by hand using the --force and --debug options.

@TheWitness TheWitness added duplicate Duplicate of another issue resolved A fixed issue and removed unverified Some days we don't have a clue labels Dec 10, 2021
@TheWitness TheWitness added this to the v1.2.20 milestone Dec 10, 2021
@TheWitness
Member

Closing as this is duplicate and resolved.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 11, 2022