When remote poller is in offline mode, GUI can become inaccessible and poller can time out #4896
Comments
You are running out of connections. |
Let me check. I didn't see any "too many connections" logs. |
Connections are fine.
|
OK, interesting find: if you fail the MariaDB process on the main poller, this issue is not seen. The remote poller is accessible and polling completes without issues. It is when the primary server is not reachable, i.e. when the network connection is down, that the remote starts acting up. When the network is restored to the primary, recovery does kick in and valid records are passed to the main's boost table. |
You should reduce your timeout and retry number. The defaults are too high. |
Yeah, I thought the same. I'll try that on Monday. |
Tested the script server timeout. |
Brought down retry for spine from 5 to 3; same result. |
I used the same process as you with 1 remote and can not repeat your problem. This might be a plugin issue. What plugins are you running? |
I think it's only thold and syslog
I'll check in the morning
…On Tue., Aug. 23, 2022, 9:03 p.m. TheWitness, ***@***.***> wrote:
I used the same process as you with 1 remote and can not repeat your
problem. This might be a plugin issue. What plugins are you running?
|
@TheWitness and I spoke about this and agreed to create a setting for the main server timeout on the remote pollers.
The reason for this is that the MySQL/MariaDB client takes longer to time out when the server does not respond. |
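The difference reported earlier in this thread (a stopped MariaDB process is harmless, a dead network link is not) is consistent with how TCP connects behave: a reachable host refuses the connection right away, while an unreachable one leaves the client waiting until its timeout expires. Below is a minimal Python sketch of a short, explicit reachability check that could run before a database connection is attempted. The function name `main_server_reachable` and the 2-second default are assumptions for illustration only, and port 3306 is simply the MySQL/MariaDB default; this is not Cacti's code.

```python
import socket


def main_server_reachable(host: str, port: int = 3306, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    try:
        # create_connection applies `timeout` to the connect attempt itself,
        # so an unreachable network fails after `timeout` seconds at most.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, DNS failures,
        # and timeouts (socket.timeout is a subclass of OSError).
        return False


if __name__ == "__main__":
    # Example: probe a local database port with a 1-second budget.
    print(main_server_reachable("127.0.0.1", 3306, timeout=1.0))
```

With a small timeout, the "network down" case costs about as little as the "process down" case, which is the effect the proposed setting is after.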
Also found the following: when the primary server is offline, you will see this in the httpd error_log:
Function at line 548
|
I think we need to have a setting in the GUI to keep the poller offline from the GUI perspective until it comes back up. Something that we need to save in the session. I think this is the only good way to solve this issue. |
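One way to picture "keep the poller offline until it comes back up" is a small cached status that is only re-probed after a fixed interval, so page loads do not each wait on a dead connection. The Python sketch below is an illustration under stated assumptions: the class name `MainServerStatus`, the `recheck_after` interval, and the injected `probe` callable are all hypothetical, and where the state is stored (session or elsewhere) is left open.

```python
import time
from typing import Callable, Optional


class MainServerStatus:
    """Remembers the last probe result so callers do not re-probe on every request.

    Hypothetical sketch: names and the default interval are assumptions.
    """

    def __init__(
        self,
        probe: Callable[[], bool],
        recheck_after: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe                  # returns True if main is reachable
        self._recheck_after = recheck_after  # seconds to trust a cached result
        self._clock = clock                  # injectable for testing
        self._online: Optional[bool] = None  # None means "never probed"
        self._checked_at = 0.0

    def is_online(self) -> bool:
        now = self._clock()
        stale = now - self._checked_at >= self._recheck_after
        if self._online is None or stale:
            # Only here do we pay the cost of contacting the main server.
            self._online = self._probe()
            self._checked_at = now
        return self._online
```

The trade-off is that recovery is noticed up to `recheck_after` seconds late, in exchange for the GUI not blocking on every request while the main server is down.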
It's likely I'll be offline for a few days. We'll see. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Bump to keep the ticket open
…On Mon., Oct. 24, 2022, 8:05 p.m. github-actions[bot], < ***@***.***> wrote:
This issue has been automatically marked as stale because it has not had
recent activity. It will be closed if no further activity occurs. Thank you
for your contributions.
|
Porting these three fixes from the 1.2.x branch:
* Upgrading to 1.2.22 most of the plugins break in a multipoller setup
* When remote poller is in offline mode GUI inaccessible and poller times out
* When in Recovery Mode plugins that are designed to work remotely stop working
We have been performing testing on 1.2.22.
During failure testing we found that if you fail the master poller, the remote poller GUI is unusable and spine times out.
Spine spends a lot of time spawning script server processes.
This behaviour is only seen in offline mode.
I checked via packet capture for excessive retries to the primary poller but did not find any.
In the logs I am seeing the remote constantly trying to push data to main.