miner crash/bus hangs when mining for dev #1351

Closed
gurkburk76 opened this issue Jan 23, 2021 · 4 comments

@gurkburk76

So I'm currently evaluating a minebox12 and I've been installing rbm "fresh" quite a few times, and I've noticed that sometimes some miners seem to hang the PCIe bus, which then usually requires a restart of the rig.
Once the miner evaluation part is done it's not much of an issue, since rbm will simply use whatever miners ACTUALLY work, but when it starts dev mining it seems like it doesn't care about which "good" miners there are and just uses whatever miners there are, and that can cause a bus hang.
So I was thinking, would it not be a sane thing to use the information about known good miners that rbm has evaluated, so you, the dev, can actually get paid for this great software? :)
I love the pushover integration btw, it's saved me a lot of time and revenue :)

@RainbowMiner
Owner

Thank you. Sometimes the dev mining uses a miner that has not been benchmarked yet. Most probably that's the reason for the problems, because it might dig up a miner that overloads a rig.
So the best solution will be to simply block all non-benchmarked miners during that time. Then only miners that have already been benchmarked successfully will be used. I'll add some code for that.
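
A minimal sketch of the idea described above, for illustration only. It is not RainbowMiner's actual code (the project itself is not written in Python); the `Miner` record, its field names, the miner names, and the `select_miners` helper are all hypothetical. The assumption is that a miner counts as "successfully benchmarked" when it has a stored hashrate greater than zero.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Miner:
    # Hypothetical record; these field names are not taken from RainbowMiner.
    name: str
    algorithm: str
    hashrate: Optional[float]  # None means "never benchmarked"


def select_miners(miners, donation_run):
    """Return the miners that may be started in the current run."""
    if not donation_run:
        # Normal runs may still pick unbenchmarked miners so they get measured.
        return list(miners)
    # During a donation run, keep only miners with a positive benchmark result.
    return [m for m in miners if m.hashrate is not None and m.hashrate > 0]


miners = [
    Miner("miner-a", "Ethash", 30.5e6),  # benchmarked and working
    Miner("miner-b", "Ethash", None),    # never benchmarked
    Miner("miner-c", "Ethash", 0.0),     # benchmark failed
]
print([m.name for m in select_miners(miners, donation_run=True)])
```

With this filter, an untested miner can only be started during a normal run, where a hang at least does not happen on the dev's time.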

@RainbowMiner RainbowMiner self-assigned this Jan 23, 2021
RainbowMiner added a commit that referenced this issue Jan 24, 2021
- don't benchmark during donation run (issue #1351)
@gurkburk76
Author

This might have nothing to do with the way it uses miners, but I saw that it crashed when mining for you last night. This is what I got from log.sh, plus a proper debug file.

log.sh:
i 02:03:18 ethminer Job: e050ace6… daggerhashimoto.eu.nicehash.com [172.65.200.133:3353]
cu 02:03:18 cuda-0 Generating DAG + Light (reusing buffers): 4.11 GB
SIGSEGV encountered ...
stack trace:
backtrace() returned 19 addresses
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x8fc88) [0x560bcfaf4c88]
/lib/x86_64-linux-gnu/libc.so.6(+0x3f040) [0x7f8a94f63040]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1d07b0) [0x7f8a932c97b0]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x2d03ef) [0x7f8a933c93ef]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x3c71bd) [0x7f8a934c01bd]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x175a0b) [0x7f8a9326ea0b]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x3f0885) [0x7f8a934e9885]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x16f337) [0x7f8a93268337]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f8a932e4ed6]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x39f649) [0x560bcfe04649]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x3794b5) [0x560bcfdde4b5]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x3b94a9) [0x560bcfe1e4a9]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x3605c0) [0x560bcfdc55c0]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0xd7a1d) [0x560bcfb3ca1d]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x363d14) [0x560bcfdc8d14]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x121e06) [0x560bcfb86e06]
/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x4344af) [0x560bcfe994af]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f8a958d26db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f8a9504571f]

attaching debug file.
debug_2021-01-25.zip
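
To make a backtrace like the one above easier to read, the frames can be grouped by module and the few named symbols pulled out. The sketch below assumes the `module(symbol+0xOFFSET) [0xADDRESS]` frame layout seen in this log; `FRAME_RE` and `summarize` are hypothetical helper names, and the sample frames are copied from the trace.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Matches frames such as:
# /usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f8a932e4ed6]
FRAME_RE = re.compile(
    r"^(?P<module>/\S+?)\((?P<symbol>[^+)]*)\+0x[0-9a-fA-F]+\) \[0x[0-9a-fA-F]+\]$"
)


def summarize(frames):
    """Return (frame count per module, named symbols in stack order)."""
    per_module = Counter()
    symbols = []
    for frame in frames:
        match = FRAME_RE.match(frame.strip())
        if not match:
            continue  # skip lines that are not stack frames
        per_module[PurePosixPath(match["module"]).name] += 1
        if match["symbol"]:
            symbols.append(match["symbol"])
    return per_module, symbols


frames = [
    "/root/RainbowMiner/Bin/Ethash-Ethminer/ethminer(+0x8fc88) [0x560bcfaf4c88]",
    "/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x16f337) [0x7f8a93268337]",
    "/usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f8a932e4ed6]",
    "/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f8a9504571f]",
]
per_module, symbols = summarize(frames)
print(dict(per_module), symbols)
```

In the full trace, the only named libcuda frame is `cuMemcpyHtoD_v2`, and the log line just before the crash reports DAG generation on cuda-0.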

@gurkburk76
Author

also from dmesg:

[125563.771412] NVRM: GPU at PCI:0000:02:00: GPU-1a9c6802-87b7-ace4-7094-b42142a79fc1
[125563.771415] NVRM: GPU Board Serial Number:
[125563.771417] NVRM: Xid (PCI:0000:02:00): 79, pid=0, GPU has fallen off the bus.
[125563.771421] NVRM: GPU 0000:02:00.0: GPU has fallen off the bus.
[125563.771421] NVRM: GPU 0000:02:00.0: GPU is on Board .
[125563.771435] NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
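
For anyone who wants to catch this kind of event automatically, a small sketch of pulling Xid reports out of kernel log lines follows. It assumes the `NVRM: Xid (PCI:...): <code>, <detail>` line format shown in the dmesg output above; `XID_RE` and `find_xid_events` are hypothetical names, not part of RainbowMiner or any NVIDIA tool.

```python
import re

# Matches lines such as:
# [125563.771417] NVRM: Xid (PCI:0000:02:00): 79, pid=0, GPU has fallen off the bus.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), (?P<detail>.*)"
)


def find_xid_events(lines):
    """Yield (pci_address, xid_code, detail) for every Xid line found."""
    for line in lines:
        match = XID_RE.search(line)
        if match:
            yield match["pci"], int(match["xid"]), match["detail"].strip()


sample = [
    "[125563.771412] NVRM: GPU at PCI:0000:02:00: GPU-1a9c6802-87b7-ace4-7094-b42142a79fc1",
    "[125563.771417] NVRM: Xid (PCI:0000:02:00): 79, pid=0, GPU has fallen off the bus.",
]
events = list(find_xid_events(sample))
print(events)
```

The same function could be fed the lines of real `dmesg` output, which would identify the affected card by PCI address (here `0000:02:00`).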

@gurkburk76
Author

gurkburk76 commented Jan 27, 2021

I just upgraded from 4.6.7.7 to 4.6.7.8 and I noticed that it wants to re-benchmark basically all miners. Is this supposed to happen? Seems like a bit of overkill, and it also puts the rig at risk of crashing if you happen to leave auto-updates on :)
