Linux (Zombie Process) - Various Miners - RBM : "failed to close within 10 seconds" / "Refused to die" #1296
Comments
Hmm... FYI, when running ps aux | grep t-rex above, there is a "defunct" after the t-rex process, but GitHub did not show it when I pasted it above.
Just going to work with the below for now, and closing this out. The problem was that when the ^C is inserted into the miner's screen, the miner at times doesn't stop, and RBM doesn't continue. Instead, doing the below in OCDaemon.psm1:
Will probably screw something up. Will see.
Thank you! I will improve the function. Let me know if quit and wipe manage to get rid of the defunct processes.
OK, it seems that the subsequent
- linux: remove kill -9 to avoid hang when trying to kill a zombie process (issue #1296)
Done. If you would like RainbowMiner to reboot the machine after such a zombie/defunct process has been created, just set
OK, I'm re-benchmarking with your change and keeping the quit/wipe. Once done, I'll let it sit for a few days to see if it keeps hanging. I know it's not a clean exit, but... Thanks!
Still "zombies". I'm good, I'll just switch manually and mine one coin at a time. I was reading that zombies occur when a child process exits and its parent doesn't keep track of it (it never collects the child's exit status). I also read that in almost all cases it's due to how the parent's code handles this. Anywho...
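For reference, here is a minimal Python sketch of how a "defunct" entry comes about on Linux: a child that has exited stays in state Z until its parent collects the exit status with wait(). This is illustrative only and is not RainbowMiner code. The helper names proc_state, wait_for_state and demo are made up for this example, and it assumes a Linux /proc filesystem.

```python
import os
import subprocess
import sys
import time


def proc_state(pid: int) -> str:
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so take the first field after the last ')'.
    return data[data.rindex(")") + 2:].split()[0]


def wait_for_state(pid: int, wanted: str, timeout: float = 5.0) -> bool:
    """Poll until the process reaches the wanted state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc_state(pid) == wanted:
            return True
        time.sleep(0.05)
    return False


def demo() -> tuple:
    # Start a child that exits immediately. Popen does not reap it until
    # wait() or poll() is called, so it lingers as a zombie in the meantime.
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    became_zombie = wait_for_state(child.pid, "Z")

    # Reaping: the parent collects the exit status and the entry disappears.
    child.wait()
    reaped = not os.path.exists(f"/proc/{child.pid}")
    return became_zombie, reaped
```

On a typical Linux system demo() should report that the child first showed up as Z and was gone after wait(). Only the parent (or init, once the parent itself exits) can clear such an entry, which is why sending more signals to a zombie does not help.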
Sure, zombies are to be expected. But does RainbowMiner still stop? The last fix was not meant to get rid of zombies, but to keep RainbowMiner from waiting forever after it failed to kill a miner process.
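The idea of "try to stop it, but never wait forever" can be sketched generically as below. This is not the actual RainbowMiner/OCDaemon code, which works through screen and start-stop-daemon. It is a Python sketch that assumes the miner is a direct child of the calling process, and the function name stop_with_timeout and the grace parameter are hypothetical.

```python
import signal
import subprocess


def stop_with_timeout(proc: subprocess.Popen, grace: float = 10.0) -> str:
    """Escalate SIGINT -> SIGTERM -> SIGKILL, waiting at most `grace` seconds per step.

    Returns the name of the signal after which the process exited,
    "already exited" if it was gone before a signal was sent, or
    "gave up" if it was still around after SIGKILL. The caller can then
    log the failure and carry on instead of blocking.
    """
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            return "already exited"
        proc.send_signal(sig)
        try:
            # Bounded wait: raises TimeoutExpired instead of hanging.
            proc.wait(timeout=grace)
            return sig.name
        except subprocess.TimeoutExpired:
            continue
    return "gave up"
```

The important part is the bounded wait on every step. Whether to include SIGKILL at all is a separate choice; the commit referenced in this thread removed kill -9 on Linux.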
OK, got it. I'm almost positive it continued, but the miner itself and its screen hang. I don't think any other screens spawn after that. I'm working from home today, so let me run through it again and clear the logs to be sure I'm telling you accurately. Sorry, I'll let you know.
RainbowMiner does not continue. The two screenshots show at least 8 minutes in between, but I waited much longer. Note the W10 times. My problem might be something else. A GPU randomly stops, I disable it in RBM and reboot, another GPU fails, I disable it in RBM and reboot, 15 or so benchmarks go through fine, then the rest all fail one after the other saying OpenCL not found. That is unrelated to the original issue. I'll strip it down to 1 GPU and work my way up.
It really looks like an intensity problem with these GPUs. Try setting lower intensities or reducing the overclocking.
When auto switching from one miner to another, I've seen RBM reporting that it could not close the miner, and RBM doesn't continue. It looks like RBM is sending a ^C to terminate, but not doing anything else to attempt to close it.
Top shows the defunct process still using 100% cpu.
Logs:
[2021-01-08 21:25:09] INFO: Send ^C to Miner Trex-GPU#00-GPU#01-GPU#02-GPU#03-GPU#04-GPU#05-GPU#06-GPU#07-GPU#08's screen sakkisminer01_gpu00_gpu01_gpu02_gpu03_gpu04_gpu05_gpu06_gpu07_gpu08
[2021-01-08 21:25:21] WARNING: Miner Trex-GPU#00-GPU#01-GPU#02-GPU#03-GPU#04-GPU#05-GPU#06-GPU#07-GPU#08 failed to close within 10 seconds
[2021-01-08 21:25:32] INFO: OCDaemon for start-stop-daemon --stop --name t-rex --pidfile /home/saki2fifty/RainbowMiner/Data/pid/sakkisminer01_gpu00_gpu01_gpu02_gpu03_gpu04_gpu05_gpu06_gpu07_gpu08_pid.txt --retry 5 reports: Program t-rex, 1 process(es), refused to die.
saki2fifty@sakkisminer01:~/RainbowMiner$ sudo ps aux | grep t-rex
[sudo] password for saki2fifty:
root 2154 99.1 0.0 0 0 pts/0 Zl+ Jan08 635:02 [t-rex]
saki2fi+ 29291 0.0 0.0 14436 1116 pts/2 S+ 08:00 0:00 grep --color=auto t-rex
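In the output above, the STAT column for t-rex reads Zl+. Per the ps man page, Z marks a defunct (zombie) process, l means multi-threaded, and + means it is in the foreground process group. One possible reading of the 99.1% CPU is that the main thread has exited while other threads are still running, though that is a guess. A small Python sketch for spotting such entries in ps aux output follows; the function name defunct_pids is made up for this example, and it assumes the default 11-column ps aux layout.

```python
def defunct_pids(ps_aux_output: str, name: str) -> list:
    """Return PIDs of zombie processes whose command contains `name`.

    Assumed column layout (ps aux):
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    pids = []
    for line in ps_aux_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header and malformed lines
        stat, command = fields[7], fields[10]
        if stat.startswith("Z") and name in command:
            pids.append(int(fields[1]))
    return pids
```

Fed the two lines pasted above, this picks out PID 2154 and ignores the grep process, whose STAT is S+.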