Monitor specific lines in game logs that show the server has crashed #1805

Closed
diamondburned opened this issue Feb 15, 2018 · 3 comments
Labels
command: monitor, outcome: wontfix, type: feature request

Comments

@diamondburned
Contributor

I'll make this short. There's monitor_gsquery.sh, but at best it can only check the server once a minute. I recently found this script (https://github.com/riemers/wrench/blob/master/wrench), which can detect Watchdog kills by checking for them every 10 seconds or so, and it's pretty fast too (watchdog function, line 3794). Having another method of checking for dead srcds processes would be pretty useful, since watchdog usually ends up killing my servers and monitor isn't that effective (I had to add -nowatchdog).

@dgibbs64
Member

Since I don't use watchdog I don't fully understand what would be required or what you want us to do.

Would it be something like LinuxGSM checking watchdog for dead servers and rebooting the server? If so, LinuxGSM already does this without the need for watchdog. It also queries the server, as processes can stay alive even if the server has locked up. LinuxGSM also relies on cron to run scheduled tasks. If you could provide more detail on how you think this can be achieved and why you think it would be useful, that would help.

@dgibbs64 added the "waiting response" label on Jul 21, 2018
@diamondburned
Contributor Author

diamondburned commented Aug 1, 2018

I have a sample of a watchdog crash:

one grumby birb: !resizemyarms
L 04/08/2018 - 20:30:08: "one grumby birb<5><[U:1:318343588]><Red>" say "!resizemyarms"
WatchDog! Server took too long to process (probably infinite loop).
Host_Error: WatchdogHandler called - server exiting.


L 04/08/2018 - 20:31:09: Engine error: Host_Error: WatchdogHandler called - server exiting.


Setting breakpad minidump AppID = 232251
Wrote minidump to: /home/x1mil/serverfiles/tf/addons/sourcemod/data/dumps/d08314d3-c197-4f46-4f489e9e-5f4edd78.dmp
Segmentation fault (core dumped)
Add "-debug" to the ./srcds_run command line to generate a debug.log to help with solving this problem
Sun Apr  8 20:31:14 EDT 2018: Server restart

If my memory isn't fuzzy (I haven't touched the logs in a really long time), the monitor script checks for this by querying the Source port, but the server process is still there; it just can't be connected to. Here's an old log of it: https://pastebin.com/294fmvgF

As you can see from the log, the server process is still alive. So my easiest solution would be to scan the log for this line:

Host_Error: WatchdogHandler called - server exiting.

Then kill the process if the line is found (SIGKILL is fine, since the process has already stopped).
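
The scan-and-kill idea above could be sketched roughly as follows. This is only an illustration, not LinuxGSM code: the function names, the log path, and the way the PID is obtained are all assumptions. Only the marker string comes from the crash sample.

```python
import os
import signal

# Marker copied from the crash sample in this thread.
WATCHDOG_MARKER = "Host_Error: WatchdogHandler called - server exiting."


def find_watchdog_lines(lines):
    """Return the 1-based numbers of the lines that contain the watchdog marker."""
    return [n for n, line in enumerate(lines, start=1) if WATCHDOG_MARKER in line]


def kill_if_watchdog_fired(log_path, pid):
    """Send SIGKILL to pid if the log at log_path contains the marker.

    log_path and pid are hypothetical inputs; how they are discovered
    depends on the setup. Returns True if a kill signal was sent.
    """
    with open(log_path, errors="replace") as log:
        hits = find_watchdog_lines(log)
    if not hits:
        return False
    os.kill(pid, signal.SIGKILL)
    return True
```

A real monitor would also need to remember how far into the log it has already read; otherwise an old crash line would trigger a kill again after every restart.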

@no-response bot removed the "waiting response" label on Aug 1, 2018
@dgibbs64 changed the title from "[Request] Adding watchdog monitor to the server?" to "Monitor specific lines in game logs that show the server has crashed" on Jun 24, 2019
@dgibbs64
Member

dgibbs64 commented Nov 25, 2019

At this point, I don't think this idea will be developed. Currently, monitor has four monitoring methods: session, gamedig, gsquery and TCP. Most game servers support gamedig and gsquery; the server has to respond to the query within 60 seconds or it will be rebooted. If the TCP monitor is being used, it's possible that the process crashes and the port stays open, but very few servers use this. Overall, monitor coverage is very good, and unless there are examples of monitor failing to pick up crashed servers, I believe monitor is currently sufficient.
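
To illustrate the TCP caveat mentioned above: a bare port check only shows that something still accepts connections, not that the game server is responsive. Here is a minimal sketch using Python's standard `socket` module. The function name and its parameters are hypothetical, and this is not how LinuxGSM implements its TCP monitor.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A check like this keeps returning True as long as the port accepts connections, which is why a query-based method that expects an actual reply is the stronger test.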

@dgibbs64 added the "outcome: wontfix" label on Nov 30, 2019
@github-actions bot locked this issue as resolved and limited conversation to collaborators on Dec 24, 2020