miner crash/bus hangs when mining for dev #1351

gurkburk76 · 2021-01-23T10:28:43Z

So I'm currently evaluating a minebox12, and I've installed rbm "fresh" quite a few times. I've noticed that some miners sometimes hang the PCIe bus, which then usually requires a restart of the rig.
Once the evaluation of the miners is done it's not much of an issue, since rbm will simply use whatever miners ACTUALLY work. But when it starts dev mining, it seems like it doesn't care which miners are "good" and just uses whatever miners there are, and that can cause a bus hang.
So I was thinking, would it not be sane to use the information about known good miners that rbm has already evaluated, so that you, the dev, can actually get paid for this great software? :)
I love the pushover integration btw, it's saved me a lot of time and revenue :)

RainbowMiner · 2021-01-23T11:23:32Z

Thank you. Sometimes the dev mining uses a miner that has not been benchmarked yet. Most probably that's the reason for the problems, because it might dig up a miner that overloads a rig.
So the best solution will be to simply block all non-benchmarked miners during that time. Then only miners that have already been benchmarked successfully will be used. I'll add some code for that.
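
To illustrate the idea of blocking non-benchmarked miners during a donation run, here is a minimal Python sketch. It is not RainbowMiner's actual code: the `Miner` class, the `hashrate` field, and the `select_miners` function are hypothetical names, and the sketch assumes that a missing hashrate means "not benchmarked yet" while a hashrate of 0 means "benchmark failed".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Miner:
    """Hypothetical miner record, for illustration only."""
    name: str
    # Assumption: None = not benchmarked yet, 0 = benchmark ran but failed.
    hashrate: Optional[float]


def select_miners(miners: List[Miner], donation_run: bool) -> List[Miner]:
    """Return the miners that may be started in the current run."""
    if not donation_run:
        # Normal run: unbenchmarked miners stay eligible so they can be benchmarked.
        return list(miners)
    # Donation run: keep only miners with a successful benchmark result.
    return [m for m in miners if m.hashrate is not None and m.hashrate > 0]
```

With a filter like this, a donation run could never trigger a first-time benchmark, so a miner that hangs the bus would only ever be tried during the regular evaluation phase.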

- don't benchmark during donation run (issue #1351)

gurkburk76 · 2021-01-25T07:12:50Z

This might have nothing to do with the way it uses miners, but I saw that it crashed when mining for you last night. This is what I got from log.sh, plus a proper debug file.

log.sh:
i 02:03:18 ethminer Job: e050ace6… daggerhashimoto.eu.nicehash.com [172.65.200.133:3353]
cu 02:03:18 cuda-0 Generating DAG + Light (reusing buffers): 4.11 GB
SIGSEGV encountered ...
stack trace:
backtrace() returned 19 addresses
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x8fc88) [0x560bcfaf4c88]
/lib/x86_64-linux-gnu/libc.so.6(+0x3f040) [0x7f8a94f63040]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1d07b0) [0x7f8a932c97b0]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x2d03ef) [0x7f8a933c93ef]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x3c71bd) [0x7f8a934c01bd]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x175a0b) [0x7f8a9326ea0b]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x3f0885) [0x7f8a934e9885]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x16f337) [0x7f8a93268337]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f8a932e4ed6]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x39f649) [0x560bcfe04649]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x3794b5) [0x560bcfdde4b5]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x3b94a9) [0x560bcfe1e4a9]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x3605c0) [0x560bcfdc55c0]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0xd7a1d) [0x560bcfb3ca1d]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x363d14) [0x560bcfdc8d14]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x121e06) [0x560bcfb86e06]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x4344af) [0x560bcfe994af]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f8a958d26db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f8a9504571f]

attaching debug file.
debug_2021-01-25.zip

gurkburk76 · 2021-01-25T07:14:51Z

also from dmesg:

[125563.771412] NVRM: GPU at PCI:0000:02:00: GPU-1a9c6802-87b7-ace4-7094-b42142a79fc1
[125563.771415] NVRM: GPU Board Serial Number:
[125563.771417] NVRM: Xid (PCI:0000:02:00): 79, pid=0, GPU has fallen off the bus.
[125563.771421] NVRM: GPU 0000:02:00.0: GPU has fallen off the bus.
[125563.771421] NVRM: GPU 0000:02:00.0: GPU is on Board .
[125563.771435] NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
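
For anyone who wants to spot this condition automatically, the following is a rough Python sketch that pulls NVRM Xid events out of dmesg text. The regular expression is an assumption based only on the line format shown above, and `find_xid_events` is a hypothetical helper name, not part of any NVIDIA or RainbowMiner tool.

```python
import re

# Assumed line format, taken from the dmesg excerpt above:
#   NVRM: Xid (PCI:0000:02:00): 79, pid=0, GPU has fallen off the bus.
XID_RE = re.compile(r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+),")


def find_xid_events(dmesg_text):
    """Return a list of (pci_address, xid_code) tuples found in dmesg output."""
    return [(m.group("pci"), int(m.group("xid"))) for m in XID_RE.finditer(dmesg_text)]
```

Feeding the excerpt above into `find_xid_events` would report Xid 79 for the GPU at PCI address `0000:02:00`, which is the "GPU has fallen off the bus" event logged here.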

gurkburk76 · 2021-01-27T23:21:25Z

I just upgraded from 4.6.7.7 to 4.6.7.8 and I noticed that it wants to re-benchmark basically all miners. Is this supposed to happen? It seems like a bit of overkill, and it also puts the rig at risk of crashing if you happen to leave auto-updates on :)

RainbowMiner self-assigned this Jan 23, 2021

RainbowMiner added a commit that referenced this issue Jan 24, 2021

Update core

aeff322

- don't benchmark during donation run (issue #1351)

RainbowMiner closed this as completed Apr 19, 2022
