When running automation, scan can fail when selecting remote pollers #3042

Closed
bmfmancini opened this issue Oct 18, 2019 · 122 comments
Labels
bug Undesired behaviour resolved A fixed issue
Comments

@bmfmancini
Member

Hey Guys

When running a network discovery and selecting a remote poller, the scan hangs after a few devices.
If you run the same scan on the main poller, everything is OK.

This seems to be new in 1.2.7

@bmfmancini
Member Author

Ran a new bunch of devices today. The issue is still ongoing, with no errors that I can see.
The automation just sits in a running state but nothing happens.

If I create the same network but put it on the main poller, it runs just fine.

@bmfmancini
Member Author

Hey Guys any thoughts on this one ?

@netniV
Member

netniV commented Nov 8, 2019

I've not had time to create a multiple poller setup unfortunately. Our multi poller expert has been away but is hopefully returning soon so will be in a better position to answer.

@bmfmancini
Member Author

bmfmancini commented Nov 8, 2019 via email

@cigamit
Member

cigamit commented Nov 9, 2019

I'm here, but I need time to setup.

@cigamit
Member

cigamit commented Nov 9, 2019

I'm going to mark this one as resolved. The other too. Found the issues.

@cigamit cigamit added bug Undesired behaviour resolved A fixed issue labels Nov 9, 2019
cigamit added a commit that referenced this issue Nov 9, 2019
* AUTOM8 discovery starts even if you click cancel
* AUTOM8 network scan continues to run infinitely even when cancelled
* AUTOM8 Scan hangs when selecting remote poller
@cigamit cigamit closed this as completed Nov 9, 2019
@bmfmancini
Member Author

Ok I have tested and the cancel button issue is resolved !
AUTOM8 runs on the remote pollers great as well from what I can tell !
Confirmed that if you cancel a running scan it stops

Great work @cigamit thanks a bunch man !
Would be great if I could send you a beer somehow lol

@bmfmancini
Member Author

hey Guys

Sorry, I tested this out on a larger subnet (/24) and, about 20 devices in, it hangs each time :(

@netniV
Member

netniV commented Nov 20, 2019

Are you able to trace where it is hanging at all? Exactly the same? Does the modification that @cigamit made have any effect? What do you see if you look at the automation_processes table?

@bmfmancini
Member Author

It seems to stop within the first 15 devices

@bmfmancini
Member Author

can this be re-opened ?

@TheWitness
Member

TheWitness commented Nov 26, 2019

Can you check that, when it runs, you continue to see the poller_automation.php scripts running? If they remain running, can you strace them using "strace -s 2500 -tt -p <pid>" to see what they are doing? If they end and no more scans start, can you check the pids in the automation_processes table and see whether rows remain for pids that have left the system? Let us know what you find. We can reopen after we get good feedback.
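
One way to answer the last question is to probe each pid with signal 0 on the data collector that owns it. The following is a minimal Python sketch for a POSIX host, assuming the pids have been copied by hand from the automation_processes table; `pid_alive` and `stale_pids` are hypothetical helper names and not part of Cacti.

```python
import os


def pid_alive(pid):
    """Return True if a process with this pid exists on the local host."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_pids(pids):
    """Return the pids that no longer correspond to a live process."""
    return [pid for pid in pids if not pid_alive(pid)]
```

Calling `stale_pids([...])` with the pids of one poller lists the rows whose process has already left the system. It has to run on the same host as the processes, since pids from another collector mean nothing locally.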

@bmfmancini
Member Author

Sure!

Here we go

ps -ef | grep auto

apache 10867 1 0 12:35 ? 00:00:00 /bin/php -q /var/www/html/cacti/poller_automation.php --poller=8 --network=72 --force
root 19416 14828 0 12:38 pts/0 00:00:00 grep --color=auto auto

12:41:24.636405 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
12:41:24.636471 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
12:41:24.636528 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
12:41:24.636580 nanosleep({5, 0}, 0x7ffc25d107c0) = 0
12:41:29.636781 sendto(4, "T\0\0\0\3SELECT command FROM automation_processes WHERE network_id = '72' AND task="tmaster"", 88, MSG_DONTWAIT, NULL, 0) = 88
12:41:29.636869 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:29.637110 recvfrom(4, "\1\0\0\1\1Q\0\0\2\3def\5cacti\24automation_processes\24aut", 44, MSG_DONTWAIT, NULL, NULL) = 44
12:41:29.637187 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:29.637252 recvfrom(4, "omation_processes\7command\7command\f!\0<\0\0\0\375\0\0\0\0\0\5\0\0\3\376\0\0"\0\1\0\0\4\373\5\0\0\5\376\0\0"\0", 193, MSG_DONTWAIT, NULL, NULL) = 69
12:41:29.637343 sendto(4, "k\0\0\0\3SELECT count(*) FROM automation_processes WHERE network_id = '72' AND task!="tmaster" AND status="running"", 111, MSG_DONTWAIT, NULL, 0) = 111
12:41:29.637400 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:29.637572 recvfrom(4, "\1\0\0\1\1\36\0\0\2\3def\0\0\0\10count(*)\0\f?\0\25\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0"\0\2\0\0\4\0012\5\0\0\5\376\0\0"\0", 124, MSG_DONTWAIT, NULL, NULL) = 63
12:41:29.637659 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
12:41:29.637720 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
12:41:29.637773 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
12:41:29.637822 nanosleep({5, 0}, 0x7ffc25d107c0) = 0
12:41:34.638137 sendto(4, "T\0\0\0\3SELECT command FROM automation_processes WHERE network_id = '72' AND task="tmaster"", 88, MSG_DONTWAIT, NULL, 0) = 88
12:41:34.638247 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:34.638543 recvfrom(4, "\1\0\0\1\1Q\0\0\2\3def\5cacti\24automation_processes\24automation_processes", 61, MSG_DONTWAIT, NULL, NULL) = 61
12:41:34.638627 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:34.638701 recvfrom(4, "\7command\7command\f!\0<\0\0\0\375\0\0\0\0\0\5\0\0\3\376\0\0"\0\1\0\0\4\373\5\0\0\5\376\0\0"\0", 193, MSG_DONTWAIT, NULL, NULL) = 52
12:41:34.638795 sendto(4, "k\0\0\0\3SELECT count(*) FROM automation_processes WHERE network_id = '72' AND task!="tmaster" AND status="running"", 111, MSG_DONTWAIT, NULL, 0) = 111
12:41:34.638868 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:34.639047 recvfrom(4, "\1\0\0\1\1\36\0\0\2\3def\0\0\0\10count(*)\0\f?\0\25\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0"\0\2\0\0\4\0012\5\0\0\5\376\0\0"\0", 141, MSG_DONTWAIT, NULL, NULL) = 63
12:41:34.639146 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
12:41:34.639213 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
12:41:34.639272 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
12:41:34.639331 nanosleep({5, 0}, 0x7ffc25d107c0) = 0
12:41:39.639593 sendto(4, "T\0\0\0\3SELECT command FROM automation_processes WHERE network_id = '72' AND task="tmaster"", 88, MSG_DONTWAIT, NULL, 0) = 88
12:41:39.639704 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:39.640088 recvfrom(4, "\1\0\0\1\1Q\0\0\2\3def\5cacti\24automation_processes\24automation_processes\7command\7command\f", 78, MSG_DONTWAIT, NULL, NULL) = 78
12:41:39.640218 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:39.640321 recvfrom(4, "!\0<\0\0\0\375\0\0\0\0\0\5\0\0\3\376\0\0"\0\1\0\0\4\373\5\0\0\5\376\0\0"\0", 193, MSG_DONTWAIT, NULL, NULL) = 35
12:41:39.640520 sendto(4, "k\0\0\0\3SELECT count(*) FROM automation_processes WHERE network_id = '72' AND task!="tmaster" AND status="running"", 111, MSG_DONTWAIT, NULL, 0) = 111
12:41:39.640640 poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=4, revents=POLLIN}])
12:41:39.640849 recvfrom(4, "\1\0\0\1\1\36\0\0\2\3def\0\0\0\10count(*)\0\f?\0\25\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0"\0\2\0\0\4\0012\5\0\0\5\376\0\0"\0", 158, MSG_DONTWAIT, NULL, NULL) = 63
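
Read as a whole, the trace suggests a wait loop: sleep 5 seconds, read the master's command, count the children still marked running, repeat. The reply to the count query appears to carry the value 2 each time, which would keep such a loop spinning. The sketch below imitates that shape with an in-memory SQLite table. It is an assumption about the behaviour and not code taken from poller_automation.php; the function names and the sample rows are hypothetical.

```python
import sqlite3


def running_children(db, network_id):
    """Count collector rows still marked 'running' for one network."""
    row = db.execute(
        "SELECT count(*) FROM automation_processes "
        "WHERE network_id = ? AND task != 'tmaster' AND status = 'running'",
        (network_id,),
    ).fetchone()
    return row[0]


def wait_for_children(db, network_id, max_polls, sleep=lambda seconds: None):
    """Poll until no child is 'running'; return polls used, or None on timeout."""
    for polls in range(1, max_polls + 1):
        if running_children(db, network_id) == 0:
            return polls
        sleep(5)  # stand-in for the 5 second nanosleep seen in the trace
    return None


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE automation_processes "
    "(pid INTEGER, network_id INTEGER, task TEXT, status TEXT)"
)
# Illustrative rows: one master, one child that never leaves 'running', one finished child.
db.executemany(
    "INSERT INTO automation_processes VALUES (?, ?, ?, ?)",
    [
        (10867, 72, "tmaster", "running"),
        (10897, 72, "collector", "running"),
        (10887, 72, "collector", "done"),
    ],
)
stuck = wait_for_children(db, 72, max_polls=3)
```

Here `stuck` is `None`: as long as one child row stays at running, the loop never sees a zero count. Pass `time.sleep` as the `sleep` argument to get real 5 second pauses.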

@bmfmancini
Member Author

When I cancel the discovery, here is the trace output, yet the status still shows running

12:44:44.702842 brk(NULL) = 0x555fdf3f7000
12:44:44.702914 brk(NULL) = 0x555fdf3f7000
12:44:44.702974 brk(0x555fdf3c3000) = 0x555fdf3c3000
12:44:44.703067 brk(NULL) = 0x555fdf3c3000
12:44:44.703689 munmap(0x7f6a4b600000, 2097152) = 0
12:44:44.703866 munmap(0x7f6a5d000000, 2097152) = 0
12:44:44.704228 munmap(0x7f6a6185c000, 299008) = 0
12:44:44.704357 munmap(0x7f6a618ca000, 323584) = 0
12:44:44.704656 exit_group(0) = ?
12:44:44.705306 +++ exited with 0 +++

@TheWitness
Member

Run the following after cancel:

SELECT * FROM automation_processes;

Post the output.

@bmfmancini
Member Author

MariaDB [cacti]> SELECT * FROM automation_processes;
+-------+-----------+------------+-----------+---------+---------+----------+------------+---------------------+
| pid | poller_id | network_id | task | status | command | up_hosts | snmp_hosts | heartbeat |
+-------+-----------+------------+-----------+---------+---------+----------+------------+---------------------+
| 32003 | 7 | 48 | collector | running | NULL | 1 | 1 | 2019-10-18 11:25:45 |
| 16849 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:44 |
| 16841 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:41 |
| 16847 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:42 |
| 16845 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:42 |
| 16843 | 8 | 45 | collector | running | NULL | 4 | 1 | 2019-10-18 10:38:54 |
| 16853 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:41 |
| 16855 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:42 |
| 16851 | 8 | 45 | collector | running | NULL | 5 | 1 | 2019-10-18 10:39:05 |
| 16861 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:42 |
| 16865 | 8 | 45 | collector | running | NULL | 3 | 1 | 2019-10-18 10:38:42 |
| 32001 | 7 | 48 | collector | running | NULL | 1 | 1 | 2019-10-18 11:25:45 |
| 16950 | 7 | 47 | collector | running | NULL | 13 | 1 | 2019-10-18 11:11:53 |
| 16948 | 7 | 47 | collector | running | NULL | 6 | 1 | 2019-10-18 11:09:32 |
| 16944 | 7 | 47 | collector | running | NULL | 8 | 1 | 2019-10-18 11:10:12 |
| 16946 | 7 | 47 | collector | running | NULL | 6 | 1 | 2019-10-18 11:09:31 |
| 16942 | 7 | 47 | collector | running | NULL | 24 | 1 | 2019-10-18 11:15:37 |
| 32009 | 7 | 48 | collector | running | NULL | 1 | 1 | 2019-10-18 11:25:45 |
| 32005 | 7 | 48 | collector | running | NULL | 1 | 1 | 2019-10-18 11:25:44 |
| 32007 | 7 | 48 | collector | running | NULL | 1 | 1 | 2019-10-18 11:25:45 |
| 30869 | 8 | 51 | collector | running | NULL | 1 | 1 | 2019-10-18 12:08:58 |
| 30867 | 8 | 51 | collector | running | NULL | 1 | 1 | 2019-10-18 12:08:57 |
| 30865 | 8 | 51 | collector | running | NULL | 1 | 1 | 2019-10-18 12:08:59 |
| 30863 | 8 | 51 | collector | running | NULL | 1 | 1 | 2019-10-18 12:08:57 |
| 30861 | 8 | 51 | collector | running | NULL | 2 | 1 | 2019-10-18 12:09:10 |
| 31535 | 7 | 56 | collector | done | NULL | 7 | 7 | 2019-11-05 10:36:44 |
| 31531 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:43 |
| 9423 | 7 | 78 | collector | running | NULL | 1 | 1 | 2019-11-19 13:41:26 |
| 9429 | 7 | 78 | collector | running | NULL | 1 | 1 | 2019-11-19 13:41:24 |
| 9432 | 7 | 78 | collector | running | NULL | 2 | 2 | 2019-11-19 13:41:26 |
| 9421 | 7 | 78 | collector | running | NULL | 1 | 1 | 2019-11-19 13:41:24 |
| 9413 | 7 | 78 | collector | running | NULL | 3 | 2 | 2019-11-19 13:41:44 |
| 31529 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:44 |
| 31525 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:44 |
| 9419 | 7 | 78 | collector | running | NULL | 2 | 2 | 2019-11-19 13:41:24 |
| 9417 | 7 | 78 | collector | running | NULL | 1 | 1 | 2019-11-19 13:41:24 |
| 9411 | 7 | 78 | collector | running | NULL | 2 | 2 | 2019-11-19 13:41:24 |
| 9415 | 7 | 78 | collector | running | NULL | 1 | 1 | 2019-11-19 13:41:25 |
| 9409 | 7 | 78 | collector | running | NULL | 1 | 1 | 2019-11-19 13:41:23 |
| 20335 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:51 |
| 20340 | 8 | 76 | collector | running | NULL | 2 | 2 | 2019-11-19 13:25:51 |
| 20328 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:51 |
| 31527 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:44 |
| 24530 | 8 | 44 | collector | done | NULL | 45 | 44 | 2019-10-29 13:54:44 |
| 24522 | 8 | 44 | collector | done | NULL | 44 | 43 | 2019-10-29 13:54:44 |
| 20331 | 8 | 76 | collector | running | NULL | 2 | 2 | 2019-11-19 13:25:51 |
| 20333 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:51 |
| 20337 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:51 |
| 20322 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:51 |
| 20324 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:51 |
| 20320 | 8 | 76 | collector | running | NULL | 2 | 2 | 2019-11-19 13:25:51 |
| 24524 | 8 | 44 | collector | done | NULL | 52 | 50 | 2019-10-29 13:54:42 |
| 20326 | 8 | 76 | collector | running | NULL | 1 | 1 | 2019-11-19 13:25:52 |
| 2891 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:18 |
| 2883 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:18 |
| 31521 | 7 | 56 | collector | running | NULL | 8 | 8 | 2019-11-05 10:36:42 |
| 24526 | 8 | 44 | collector | done | NULL | 60 | 59 | 2019-10-29 13:54:44 |
| 31515 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:44 |
| 31519 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:44 |
| 24528 | 8 | 44 | collector | running | NULL | 5 | 5 | 2019-10-29 13:52:07 |
| 31523 | 7 | 56 | collector | done | NULL | 8 | 8 | 2019-11-05 10:36:44 |
| 31517 | 7 | 56 | collector | done | NULL | 9 | 9 | 2019-11-05 10:36:43 |
| 2881 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:18 |
| 2879 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:18 |
| 2889 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:18 |
| 2877 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:17 |
| 2895 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:17 |
| 26418 | 8 | 61 | collector | running | NULL | 1 | 1 | 2019-11-05 09:40:04 |
| 2885 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:17 |
| 2893 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:17 |
| 2887 | 7 | 74 | collector | running | NULL | 1 | 1 | 2019-11-19 13:10:17 |
| 10897 | 8 | 72 | collector | running | NULL | 2 | 2 | 2019-11-26 12:35:26 |
| 10899 | 8 | 72 | collector | done | NULL | 11 | 10 | 2019-11-26 12:35:36 |
| 10887 | 8 | 72 | collector | done | NULL | 46 | 46 | 2019-11-26 12:35:36 |
| 10879 | 8 | 72 | collector | running | NULL | 2 | 2 | 2019-11-26 12:35:26 |
| 10893 | 8 | 72 | collector | done | NULL | 24 | 23 | 2019-11-26 12:35:36 |
| 10885 | 8 | 72 | collector | done | NULL | 20 | 20 | 2019-11-26 12:35:36 |
| 10889 | 8 | 72 | collector | done | NULL | 10 | 9 | 2019-11-26 12:35:36 |
| 10891 | 8 | 72 | collector | done | NULL | 14 | 14 | 2019-11-26 12:35:36 |
| 10883 | 8 | 72 | collector | done | NULL | 50 | 49 | 2019-11-26 12:35:44 |
| 10881 | 8 | 72 | collector | done | NULL | 22 | 21 | 2019-11-26 12:35:36 |
+-------+-----------+------------+-----------+---------+---------+----------+------------+---------------------+
81 rows in set (0.00 sec)

@bmfmancini
Member Author

Not sure why so many show as running. Right now only 1 is running; the others are idle and disabled.
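
A quick way to separate live rows from leftovers is to compare each running row's heartbeat against the current time. This is a minimal Python sketch using four rows copied from the output above. The 5 minute threshold and the `stale_running` helper are assumptions for illustration, and the heartbeat column is assumed to use the same clock as the host running the check.

```python
from datetime import datetime, timedelta

# (pid, poller_id, network_id, status, heartbeat) copied from the table above
ROWS = [
    (16849, 8, 45, "running", "2019-10-18 10:38:44"),
    (31535, 7, 56, "done", "2019-11-05 10:36:44"),
    (10897, 8, 72, "running", "2019-11-26 12:35:26"),
    (10899, 8, 72, "done", "2019-11-26 12:35:36"),
]


def stale_running(rows, now, max_age=timedelta(minutes=5)):
    """Return pids marked 'running' whose heartbeat is older than max_age."""
    stale = []
    for pid, _poller, _network, status, heartbeat in rows:
        seen = datetime.strptime(heartbeat, "%Y-%m-%d %H:%M:%S")
        if status == "running" and now - seen > max_age:
            stale.append(pid)
    return stale


# Time of the cancel trace, used here as "now".
now = datetime(2019, 11, 26, 12, 44, 44)
stale = stale_running(ROWS, now)
```

With these inputs `stale` holds 16849 and 10897: both are still marked running, but neither has updated its heartbeat for more than 5 minutes.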

@TheWitness
Member

Yea, was thinking the same thing. Do you have error_log set to something like /tmp/php_errors.log right now in your php.ini file on the data collectors? Can you do that and see if they are exiting prematurely due to some error? There definitely seems to be a trend, but not sure why it's happening. Maybe it has something to do with the shutdown logic.

@bmfmancini
Member Author

Let me check. I don't think I have that set up though.

@bmfmancini
Member Author

Yeah, as I thought, I don't have anything there.

@bmfmancini
Member Author

should I try to truncate the table ?

@bmfmancini
Member Author

Another funny thing is that the scan shows as running well past the run limit.
Something changed for this scan routine between 1.2.4 and 1.2.7. On 1.2.4 this was working awesome, now not so much.

@TheWitness
Member

Yea, truncate it. Run and cancel again, and let me know what comes back. Table should be empty in the end.

@bmfmancini
Member Author

MariaDB [cacti]> SELECT * FROM automation_processes;
+-------+-----------+------------+-----------+---------+---------+----------+------------+---------------------+
| pid | poller_id | network_id | task | status | command | up_hosts | snmp_hosts | heartbeat |
+-------+-----------+------------+-----------+---------+---------+----------+------------+---------------------+
| 11378 | 8 | 72 | tmaster | running | NULL | 0 | 0 | 2019-11-26 17:56:50 |
| 11389 | 8 | 72 | collector | done | NULL | 28 | 28 | 2019-11-26 17:57:06 |
| 11395 | 8 | 72 | collector | done | NULL | 15 | 14 | 2019-11-26 17:57:07 |
| 11381 | 8 | 72 | collector | done | NULL | 33 | 33 | 2019-11-26 17:57:06 |
| 11391 | 8 | 72 | collector | done | NULL | 24 | 23 | 2019-11-26 17:57:07 |
| 11385 | 8 | 72 | collector | done | NULL | 11 | 9 | 2019-11-26 17:57:14 |
| 11383 | 8 | 72 | collector | done | NULL | 10 | 9 | 2019-11-26 17:57:06 |
| 11387 | 8 | 72 | collector | running | NULL | 4 | 4 | 2019-11-26 17:56:57 |
| 11400 | 8 | 72 | collector | running | NULL | 3 | 3 | 2019-11-26 17:56:57 |
| 11393 | 8 | 72 | collector | done | NULL | 50 | 50 | 2019-11-26 17:57:06 |
| 11397 | 8 | 72 | collector | done | NULL | 23 | 23 | 2019-11-26 17:57:07 |
+-------+-----------+------------+-----------+---------+---------+----------+------------+---------------------+
11 rows in set (0.00 sec)

Same behaviour: multiple processes complete but a few keep running

cigamit added a commit that referenced this issue Nov 26, 2019
This should resolve the cancel functionality.
@cigamit
Member

cigamit commented Nov 26, 2019

Thanks guys. I've just updated poller_automation.php. Retry with the latest version of that file.

@cigamit
Member

cigamit commented Dec 6, 2019

No, run those two commands, in that order, the second of which is the upgrade script.

@bmfmancini
Member Author

The first command is a no go

php -q convert_tables.php -i -u --dynamic
ERROR: Invalid Parameter --dynamic

@bmfmancini
Member Author

There is actually no dynamic option

Required (one or more):
-i | --innodb - Convert any MyISAM tables to InnoDB
-u | --utf8 - Convert any non-UTF8 tables to utf8mb4_unicode_ci

Optional:
-t | --table=S - The name of a single table to change
-n | --skip-innodb="table1 table2 ..." - Skip converting tables to InnoDB
-s | --size=N - The largest table size in records to convert. Default is 1,000,000 rows.
-r | --rebuild - Will compress/optimize existing InnoDB tables if found
-f | --force - Proceed with conversion regardless of table size

-d | --debug - Display verbose output during execution

@bmfmancini
Member Author

bmfmancini commented Dec 6, 2019

I think it's the version of the convert_tables file I have. I see it in the 1.2.x branch with the dynamic option.

I will pull that down and try again
Also do I run this on the Main poller or the remote pollers ?

@bmfmancini
Member Author

Never mind on where to run it, I saw your original note.
I pulled down the new convert tables file and ran the convert
I am testing now

@bmfmancini
Member Author

No go :( still stuck at the last device

@bmfmancini
Member Author

Still seeing the PHP errors

[05-Dec-2019 21:06:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpbQLoBY on line 266
[05-Dec-2019 21:06:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccp65tqUT on line 110
[05-Dec-2019 21:07:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpC3Eax2 on line 98
[05-Dec-2019 21:07:02 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccphudQZO on line 266
[05-Dec-2019 21:07:02 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpwLstxB on line 110
[05-Dec-2019 21:08:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpqIoXOT on line 98
[05-Dec-2019 21:08:02 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpPEWd5w on line 266
[05-Dec-2019 21:08:02 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpwJBDqa on line 110
[05-Dec-2019 21:09:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccp6S0fjJ on line 98
[05-Dec-2019 21:09:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpbObEed on line 266
[05-Dec-2019 21:09:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpeJjjfH on line 110
[05-Dec-2019 21:10:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccptNTcXz on line 98
[05-Dec-2019 21:10:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccp974DHU on line 266
[05-Dec-2019 21:10:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpfrA0wf on line 110
[05-Dec-2019 21:11:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccp8NZ8cl on line 98
[05-Dec-2019 21:11:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccp1WszEw on line 266
[05-Dec-2019 21:11:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpYHagbI on line 110
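
The temp file names change on every entry, but the reported line numbers (98, 266, 110) repeat each minute, which suggests the same few source files are being re-checked on every cycle. The Python sketch below groups such entries by line number. It uses three entries copied from the log above; the regular expression and the `errors_by_line` helper are illustrative assumptions, not part of Cacti or PHP.

```python
import re
from collections import Counter

LOG = """\
[05-Dec-2019 21:06:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpbQLoBY on line 266
[05-Dec-2019 21:06:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccp65tqUT on line 110
[05-Dec-2019 21:07:01 America/Toronto] PHP Parse error: syntax error, unexpected 'new' (T_NEW) in /tmp/ccpC3Eax2 on line 98
"""

# Captures the temp file path and the line number of each parse error.
PATTERN = re.compile(r"PHP Parse error: .* in (?P<path>\S+) on line (?P<line>\d+)$")


def errors_by_line(log_text):
    """Count parse errors per reported source line number."""
    counts = Counter()
    for entry in log_text.splitlines():
        match = PATTERN.search(entry)
        if match:
            counts[int(match.group("line"))] += 1
    return counts


counts = errors_by_line(LOG)
```

Run against the full log, a small and stable set of line numbers points at a handful of files that fail the syntax check, rather than at many unrelated failures.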

@bmfmancini
Member Author

wait a sec....

I have CereusReporting installed from @thurban, which is also showing this in the logs:
ERROR: PHP Source File '/var/www/html/cacti/plugins/CereusReporting/ReportEngines/fpdf/fpdi_pdf_parser.php'': Errors parsing /tmp/ccpddIFIB

There isn't a chance that something funky is happening with the plugin, is there?
Just for testing, I am going to disable the plugin and see if it's somehow related. It's a long shot, but let's see.

@bmfmancini
Member Author

Well, I'll be damned ! it works now!
Same scan works like a charm after I uninstalled the reporting plugin
The php errors are still present though

@bmfmancini
Member Author

OK, I re-enabled the plugin and it's still working, so either it's a coincidence or the table re-install from the plugin re-install fixed it???

I have no clue how the two can be tied together.

@cigamit
Member

cigamit commented Dec 6, 2019

I know it's tough to ask, but when working with us on a maintenance branch you will have to do a full update to the 1.2.x branch. That way, the convert_tables.php --dynamic would not have been a problem. It also simplifies things for us in that we don't have to tell you to update this and that, we can just say sync to branch xxx.

@bmfmancini
Member Author

bmfmancini commented Dec 6, 2019 via email

@bmfmancini
Member Author

bmfmancini commented Dec 6, 2019 via email

@thurban
Contributor

thurban commented Dec 6, 2019

Hi. I guess the whole fpdp extensions isn't required anymore. Going to look into this and remove it.
Nevertheless, the Cereus plugin shouldn't be on a remote poller in the first place (that's the nosync setting in INFO).

cigamit, is there an option for plugins to explicitly state that they should not run/be enabled on remote pollers?

@cigamit
Member

cigamit commented Dec 6, 2019

Sean, we appreciate your persistence as well. And knowing it's production, yea, that does make things a bit tenuous. We are hoping that this release will be the last one of the 1.2 series. We've been working overtime to wring out the bugs. We have actually been delaying the 1.3 development now for literally 6 months.

Thomas, in the INFO file capabilities line:

  • remote_collect: 0 => no poller_top, no poller_bottom, 1 => do both
  • online_view: 1 => show the plugin tab remotely, 0 => don't show the plugin tab
  • online_mgmt: 1 => show the plugin settings remotely, 0 => no console access remotely
  • offline_view: 1 => show the plugin tab remotely, 0 => don't show the plugin tab
  • offline_mgmt: 1 => allow offline console access, 0 => don't allow offline console access

Then the following are definitions for online/offline:

  • online => the remote data collector is able to communicate to the core Cacti
  • offline => the remote data collector is not able to communicate to the core Cacti

Note: I have to check if 'Recovery' mode is online or offline.

I guess we could log a minor feature request for plugins that are zero across the board to have nosync = true instead of directories. What do you think?

@thurban
Contributor

thurban commented Dec 6, 2019

I will create that feature request. I have some plugins in mind which do not need to be on a remote poller (e.g. REST API) or rely purely on local data (weathermap -> RRD files only?).

I'm not too unhappy about a delay of 1.3 as the fast releases of new versions made lots of my customers stick with 0.8. An LTS version would be a really(really!) nice thing to have ...

@bmfmancini
Member Author

No worries Jimmy I am always glad to help out and make things better!
Thanks for your work as always

I checked the INFO file for CereusReporting

This is what shows up in the latest version @thurban

Just for reference

[info]
name = CereusReporting
version = 3.90.07
longname = CereusReporting Plugin
author = Urban-Software.de
email = support@urban-software.de
homepage = https://www.urban-software.com/products/cereusreporting-professional-pdf-reports-for-cacti/
compat = 1.0.0
nosync = true <<<<
capabilities = online_view:1, online_mgmt:1, offline_view:0, offline_mgmt:0, remote_collect:0
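
For anyone who wants to inspect these flags programmatically, the INFO file reads as plain INI. The Python sketch below parses a shortened copy of the file above (without the `<<<<` marker) and turns the capabilities line into a dictionary. It is only an illustration, not how Cacti itself loads INFO files, and `parse_capabilities` is a hypothetical helper.

```python
import configparser

INFO_TEXT = """\
[info]
name = CereusReporting
nosync = true
capabilities = online_view:1, online_mgmt:1, offline_view:0, offline_mgmt:0, remote_collect:0
"""


def parse_capabilities(value):
    """Turn 'key:0, key:1' into a dict of integer flags."""
    flags = {}
    for item in value.split(","):
        key, _, flag = item.strip().partition(":")
        flags[key] = int(flag)
    return flags


# interpolation=None keeps any '%' in values (e.g. in URLs) from being interpreted.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INFO_TEXT)
caps = parse_capabilities(parser["info"]["capabilities"])
```

Here `caps["remote_collect"]` is 0, matching the capabilities description earlier in the thread: no poller_top and no poller_bottom on remote collectors.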

@cigamit
Member

cigamit commented Dec 6, 2019

Sean, Thomas knows that nosync = true is not presently good syntax. It's supposed to be a list of directories within the plugin to completely ignore. So, my advice to you is that, short term, if CereusReporting creates a lot of temporary files, you do the following:

  1. For the directories within the plugin directory that include those files, add that sub-directory to the nosync list. (aka change from 'true' to the list of sub-directories)
  2. Go to Console > Utilities > System Utilities and Rebuild the Resource Cache
  3. Wait for a while to see that the problem goes away
  4. Purge the temporary directories on the remote data collectors
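
Before purging anything in step 4, it may help to list the candidate files first. The Python sketch below does a dry-run listing of files with a given name prefix that are older than a chosen age. The directory, prefix, and age are placeholders to be decided per site, and `old_temp_files` is a hypothetical helper, not a Cacti utility.

```python
import time
from pathlib import Path


def old_temp_files(directory, prefix, min_age_seconds=3600, now=None):
    """List files in directory whose names start with prefix and are at least min_age_seconds old."""
    now = time.time() if now is None else now
    matches = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.name.startswith(prefix):
            if now - path.stat().st_mtime >= min_age_seconds:
                matches.append(path)
    return sorted(matches)
```

Review the returned list first, and only then call `path.unlink()` on each entry, so nothing that is still in use gets removed.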

@bmfmancini
Member Author

Got it !

OK, but would that have caused AUTOM8 to hang?
I mean, as soon as I uninstalled and re-installed the plugin, everything seems to be working awesome now.

It's just so weird.

@cigamit
Member

cigamit commented Dec 6, 2019

Thomas, the 1.2.x branch is supposed to be LTS, but we continue to find bugs. If the team were larger, we could keep development fresh for those hobbyists to tinker and help the project. However, we, the core developers, have our day jobs too.

The good thing is that we think we have it nailed now with the 1.2.8 release. We had a delay last week with the release due to personal issues on the team, and that was a good thing for the project (not the team members) as there was a major QA sweep of the code and 41 commits since last Wednesday.

With that said, it should translate into the release of version 1.2.8 allowing us to get back to the business of addressing the over 160 feature requests that are outstanding.

Sean,

The reason it works now, is that the bugs were all fixed, and you updated all the changed files.

So, guys, we can continue the discussion, but I think it's safe to close this one out.

@bmfmancini
Member Author

Cool,

I didn't do a git checkout of the 1.2.x repo though, only of the convert_tables.php file.
The only thing I did beyond running the convert tables script and the upgrade database script was uninstalling and reinstalling Thomas's plugin.

But I agree Jimmy it seems to be working fine now so lets close this off

Thanks for everyone's help on this !!!!!
Glad we found a fix

How can I send you guys a beer @cigamit @netniV ??

@netniV
Member

netniV commented Dec 6, 2019

I have Paypal and GitHub Sponsorship

@bmfmancini
Member Author

same email for your paypal ?

@bmfmancini
Member Author

Beer incoming ! via github sponsor

@cigamit
Member

cigamit commented Dec 6, 2019

You can contribute to The Cacti Group through our donations link on the main website as well. I love beer, just can't drink too much due to the natural reactions between candida and sugars naturally found in the human diet.

@cigamit cigamit closed this as completed Dec 6, 2019
@netniV netniV changed the title AUTOM8 Scan hangs when selecting remote poller When running automation, scan can fail when selecting remote pollers Dec 7, 2019
@github-actions github-actions bot locked and limited conversation to collaborators Jun 30, 2020
5 participants