Feature request - Provide warning indicator when the miner is dying #630
Comments
I'm not sure there's a way to reliably detect failure of the U3. Maybe we can request clock speed info when no nonces have been found in a while...?
Hi Luke,

Yes, perhaps that could work. The question is how long BFGMiner should go without nonces from the miner before declaring it dead. As I mentioned in my previous issue report, I am not a programmer, so I have no clue how to implement this feature. However, below is what I have observed so far, which I hope will give you ideas for implementing it if you wish.

Yesterday, I started using a single BFGMiner process to manage two of my Antminer U3s. I used voltage=x750, clock=x0782 and timing=0.0175, and got total hash rate figures of around 101/101/100 GH/s.

This morning at around 06:00 UTC (according to the Eligius hash rate graph below), the first Antminer U3 (AMU 0) crashed. BFGMiner did not notice. When I checked at around 11:00 UTC, the hashing LED of the first miner was no longer lit. As the screenshot below shows, the all-time average hash rates of all AMU 0 processors increased to above 12.63 GH/s, and their all-time average effective hash rates decreased to below 9 GH/s.

I initially thought that I could set a trigger to automatically power cycle the miners based on the difference between the all-time average hash rate and the all-time average effective hash rate, like below:

a = all-time average hash rate

The assumption for the above algorithm to work is that a > b. But I found that b is not reliable: with some sets of voltage, clock and timing parameters, a > b, but with other sets, a < b. I have no clue which parameters affect b, although I have a clear idea of how to control a. I think I will raise another issue ticket for this one.

I think it would be great if we could just query the status of the miners via the RPC API instead of trying to work it out myself as above.

Cheers, Anto
I'm working on a solution for this issue, with bfgminer detecting dead ASIC devices. The only reliable way to do it is to watch for nonce responses from the ASIC, most likely with a default timeout of 120 seconds. (If you have an ASIC set to some abnormally high diff value for whatever reason, I could add a command-line option to set "ASIC is SICK after x seconds".)
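For illustration, the nonce-timeout idea could be sketched roughly as below. This is not bfgminer code: the `NonceWatchdog` class, its method names and the device labels are all hypothetical, and only the 120-second default comes from the comment above. The clock is injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, List


class NonceWatchdog:
    """Hypothetical helper: flags a device as sick when it has
    returned no nonce for longer than timeout_s seconds."""

    def __init__(self, timeout_s: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        # Device name -> time of the most recent nonce (or of registration).
        self._last_nonce: Dict[str, float] = {}

    def register(self, device: str) -> None:
        # Start the timer when the device is first seen.
        self._last_nonce[device] = self._clock()

    def nonce_found(self, device: str) -> None:
        # Call whenever the device returns a nonce; this resets its timer.
        self._last_nonce[device] = self._clock()

    def sick_devices(self) -> List[str]:
        # Devices whose last nonce is older than the timeout, sorted by name.
        now = self._clock()
        return sorted(name for name, seen in self._last_nonce.items()
                      if now - seen > self.timeout_s)
```

A polling loop would call `sick_devices()` every few seconds and raise the warning (or power cycle the miner) for anything it returns. With a high diff setting, nonces arrive less often, so `timeout_s` would need to be raised to avoid false alarms.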
This feature request is currently relevant only to the Antminer U3. However, it could possibly also be used for other miners that cannot report their own status.
As of BFGMiner 5.2.0, BFGMiner does not actually know the status of the Antminer U3. It keeps running and expects the Antminer U3 to keep hashing even after it has crashed. From what I have observed, when an Antminer U3 is dying, the all-time effective average hash rate keeps falling. The other hash rate figures usually increase a little, but they cannot reliably be used as an indicator.
However, from what I have observed today, the all-time effective average hash rate figure is not reliable either. Still, I believe the algorithm used to calculate it could be used to generate a warning indicator that the Antminer U3 is dying, so that users do not have to physically look at the hashing LED on the Antminer U3 to see whether it is still hashing.
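One possible reason an all-time figure is a slow indicator is that hours of healthy hashing dilute a recent crash. A minimal sketch of the same kind of calculation over a sliding window follows. It assumes the effective rate is derived from the difficulty of the results the device returns, using the standard expectation of 2^32 hashes per difficulty-1 share. The class, the `is_dying` helper, the 10-minute window and the 50% threshold are all hypothetical and not taken from BFGMiner.

```python
from collections import deque
from typing import Deque, Tuple

# On average, 2**32 hashes are needed to find one difficulty-1 share.
HASHES_PER_DIFF1_SHARE = 2 ** 32


class WindowedEffectiveRate:
    """Hypothetical sketch: effective hash rate over a sliding window."""

    def __init__(self, window_s: float = 600.0) -> None:
        self.window_s = window_s
        self._shares: Deque[Tuple[float, float]] = deque()  # (time, difficulty)

    def add_share(self, timestamp: float, difficulty: float) -> None:
        # Record one returned result and the difficulty it satisfied.
        self._shares.append((timestamp, difficulty))

    def rate_hs(self, now: float) -> float:
        # Drop shares older than the window, then convert the remaining
        # difficulty total into hashes per second.
        cutoff = now - self.window_s
        while self._shares and self._shares[0][0] < cutoff:
            self._shares.popleft()
        total_diff = sum(diff for _, diff in self._shares)
        return total_diff * HASHES_PER_DIFF1_SHARE / self.window_s


def is_dying(rate_hs: float, nominal_hs: float, fraction: float = 0.5) -> bool:
    # Warn when the windowed rate falls below a fraction of the nominal rate.
    return rate_hs < fraction * nominal_hs
```

The window has to be long enough that a healthy device returns many shares within it; otherwise normal variance in share timing would trigger the warning.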