Does anyone have an ETA on when this would be merged in? I would love to run ethminer on a bunch of "old" cards that I have that have the required amount of memory but cannot allocate it all at once. See my comments in issue #2761 in cpp-ethereum. |
Hey @ry60003333, |
This seems to have changes in a lot of places, including the OpenCL kernel code. Apart from having all tests green, before merging please try to test on as many different GPUs as possible. We have no automated tests for the OpenCL mining code, so any merge is a risk. |
Yes, @LefterisJP. The range of "touch points" scares me too. CC @chriseth. @ry60003333 If this code is working for you, then perhaps you are best just keeping it running as a private fork for your own benefit for the time being. Many miners are already using the Genoil fork, rather than the "official" ethminer, so we're not even in a position where we have something in a particularly healthy state. |
The error is in ethereum/mix. There actually aren't many changes in the OpenCL kernel code. The main ones are my unfolding the ternary operator used to decide which chunk to sample, and changing the offsets to use a define passed from ethash_cl_miner. In ethash_cl_miner it adds another define with the chunk size, then allocates and uploads those chunks. The behavior on cards that can allocate a single chunk has not changed (beyond adding the extra define) |
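To illustrate the chunked lookup described above, here is a minimal sketch in Python rather than OpenCL. It assumes fixed-size chunks, with the chunk size playing the role of the define passed from ethash_cl_miner. All function names are hypothetical and this is not the PR's kernel code.

```python
# Hypothetical illustration of chunked DAG addressing; not the PR's kernel code.

def split_into_chunks(dag, chunk_size):
    """Split a flat DAG (here just a list) into consecutive fixed-size chunks.

    The last chunk may be shorter than chunk_size.
    """
    return [dag[i:i + chunk_size] for i in range(0, len(dag), chunk_size)]


def chunked_lookup(chunks, chunk_size, index):
    """Fetch the element at a flat DAG index from the chunked layout.

    chunk_size stands in for the compile-time define mentioned above:
    the quotient selects the chunk, the remainder is the offset inside it.
    """
    chunk_id, offset = divmod(index, chunk_size)
    return chunks[chunk_id][offset]
```

With a single chunk (chunk_size at least as large as the DAG), the quotient is always 0 and the lookup reduces to plain indexing, which matches the statement that behavior on cards that can allocate one chunk is unchanged.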
@bobsummerwill Thanks for the reply! It looks like @Equinox- is correct about ethereum/mix causing the builds to fail. I'm not sure what the process would be to get that fixed so the tests would pass, but it would be a start. I'll attempt to build from source with this code and test the resulting miner on both the R9 270s that require chunking and some R9 280X cards that can fit the DAG in one chunk to ensure that the OpenCL kernel code still works. Sadly it seems like the Genoil fork doesn't support chunking the DAG either, so it would be nice to have this in the "official" version. |
I refreshed the webthree-umbrella, so if you use "develop" it should have the latest and greatest everything now. Yes - please do test away! I would also recommend that you start a dialog with @Genoil about this DAG chunking functionality too. I am hoping that we can upstream the Genoil changes and "heal the breach", but that may or may not be possible. At the time of writing the Genoil branch is the best miner to use, and we may or may not ever get back to an official miner which is worth people's while. I hope we will, but don't bet everything on it! |
I recently removed the chunking parts because I was refactoring the kernel in an attempt to squeeze a bit of extra performance out of it. It didn't work anyway and it didn't look very pretty either. Nice fix though. Most cards that are used for mining (including Pitcairn (78x0/270/370)) work fine without chunking, by setting the right ENV vars. So I'm not really considering bringing it back. A while ago I tried a different chunking method that only had host-side chunking and no chunks kernel-side. Unfortunately it didn't work out well on AMD hardware. |
@bobsummerwill I'll give it a try from that branch! It would be nice to merge the changes back and "heal" the branch, but I agree that the need for it will likely determine if that happens. @Genoil Thanks for explaining the situation; do you happen to know what ENV variables will work on Pitcairn MSI R9 270 cards? I'm running Ubuntu 14.04 with the fglrx-updates drivers, and unfortunately haven't had any success with ENV variables. I really do appreciate your input though! |
@ry60003333 I don't own any Pitcairns, but apparently the R7 370 is now seen as one of the most efficient cards for Ethereum. With ENV vars I meant environment variables. Recently the list has grown to 5 of these to satisfy most modern AMD cards (export == setx): export GPU_FORCE_64BIT_PTR 0 |
@Genoil Thank you for the reply! It looks like the last environment variable was the one that I was missing. I had tried all the others, but GPU_SINGLE_ALLOC_PERCENT looks like it was the one that did the trick on Ubuntu. I really appreciate the assistance! |
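For anyone scripting this, a minimal sketch of setting the variables for a single miner process from Python. It only includes the three variable names and values that users report elsewhere in this thread. The helper names are hypothetical, and whether the variables help depends on the card and driver.

```python
import os
import subprocess

# Names and values as reported by users in this thread; not a complete list.
AMD_ALLOC_VARS = {
    "GPU_FORCE_64BIT_PTR": "0",
    "GPU_MAX_ALLOC_PERCENT": "100",
    "GPU_SINGLE_ALLOC_PERCENT": "100",
}


def miner_env(base=None):
    """Return a copy of the environment with the AMD allocation variables set."""
    env = dict(os.environ if base is None else base)
    env.update(AMD_ALLOC_VARS)
    return env


def run_miner(command):
    """Launch the miner with the adjusted environment.

    command is whatever you normally run, e.g. ["ethminer", "--list-devices"].
    """
    return subprocess.run(command, env=miner_env(), check=False)
```

Passing the environment per process avoids changing the shell or system settings, so it is easy to compare runs with and without the variables.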
@ry60003333 you're welcome. It's actually quite a recent requirement for some AMD cards, since the DAG has grown to about 1.4GB. |
Hi @ry60003333 and @Genoil |
Could you rebase this, please? There are a lot of unrelated commits in this PR. |
@otaku160 but can you mine without the setting? |
@Genoil I downloaded your latest miner (1.0.6) and tried this, but it is still failing to mine. I tried setting all the ENV vars and still cannot allocate the DAG in a single chunk ... I am using ethminer, is there a different miner I should use? [0] Pitcairn my card is a 2GB R9 270 and it shows 1.4 GB as the max memory when doing list-devices. |
Yesterday I could use export GPU_SINGLE_ALLOC_PERCENT=100 and it worked with ethminer and stratum proxy. Today my miner on Linux Mint was doing nothing when I woke up. fuss@fussy ~/Downloads $ export GPU_MAX_ALLOC_PERCENT=100 |
Hi guys. Yesterday I added one new ASUS R9 380 4GB to my rig of ASUS R9 280X 3GB cards. After installing the drivers I am getting an issue like this when I start the eth-proxy.py file: |
had the same problem today too on windows. |
Yesterday on Windows 10, Windows Defender reported a virus; it was eth proxy.exe. Windows Defender deleted eth proxy.exe, so I downloaded it again, and Windows Defender tagged and deleted the file at download. I found a copy in a zip file and checked it; it was not infected, so I am using it now. |
I only wanted to see if windows also did not work. u use windows without the bullshit defender and wall |
@cgladue |
Thanks for the suggestion, I tried it and it didn't help (I use resistors in a dummy plug anyway). It just appears that Sapphire R9 270 cards have CL_DEVICE_MAX_MEM_ALLOC_SIZE: 1408867653, which (as of last night) is just below the memory needed to load the full DAG. I don't think this is going to be fixed unless there is a way to not load the complete DAG in one huge file, perhaps chunk the DAG into 2 smaller files or something? Or is there a way to increase the value of CL_DEVICE_MAX_MEM_ALLOC_SIZE? |
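As a rough check of the arithmetic in the comment above, a minimal sketch that computes the smallest number of equal-sized chunks that would fit under a device's CL_DEVICE_MAX_MEM_ALLOC_SIZE. The function name is hypothetical, the DAG size in the usage note is a made-up illustrative value, and a real miner may round the chunk count differently.

```python
def min_chunks(dag_bytes, max_alloc_bytes):
    """Smallest chunk count such that each chunk fits in one allocation.

    Uses integer ceiling division; returns 1 when the DAG fits as-is.
    """
    if dag_bytes <= 0 or max_alloc_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-dag_bytes // max_alloc_bytes)
```

For example, with the limit quoted above (1408867653 bytes) and a hypothetical DAG of 1410000000 bytes, `min_chunks(1410000000, 1408867653)` is 2, so splitting the DAG in two would be enough by size alone.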
i see at the top you said you:
but seems like still having the issue where my card has 2GB of RAM, but can only alloc 1.4GB max at once, wasnt your fix supposed to allow me to keep mining ? perhaps i dont have the right binaries, where can i download the fixed binaries ? |
@cgladue and they both MEM_ALLOC_SIZE: 1408867653 like yours. |
What version of ethminer are you using, i am using 0.9.41-genoil-1.0.6 |
(MSI R9 380 2GB) Driver 15.7.1, Win7 x64. My (easy) string that does not work: gives error messages (-61) and (-38). My NEW string: does NOT give error messages (-61) and (-38), BUT it produces nothing at a speed of 13 MH. WTF????!!!! HELP!!! |
My statistics on " http://dwarfpool.com " equal to 0!!!!! This race - the result of work of the laptop. |
I'm honestly lost here, since there are now multiple AMD cards this appears to not work on. |
setx GPU_FORCE_64BIT_PTR 0 solved it for me on an R9 270X 2GB |
Equinox Would this be helpful? |
I've built some binaries that force chunking here, both the debugging and the release binary. If you're having troubles with this code feel free to try these. |
I just tried the new binary and it's still not committing new work. Ran the debug binary, and the CPU hash doesn't end with GPU hash. GPU lid=44, nonce = ecaec71b49aa6679, hash = 4565752f What does this mean? I'm running an AMD 7570 with 2GB RAM |
Not sure. Both those binaries work on my 7770s, so I'm unsure why they don't work on your card. What version of the AMD drivers do you have, what arguments are you using to launch, and could I get some more info on your exact card? |
Interesting. I just used the |
Driver version is 15.200.1045.0, and launch args are "--cl-local-work 64 --cl-global-work 4096". This is my card, except mine is the 2GB version. https://www.techpowerup.com/gpudb/b692/pegatron-hd-7570.html |
What does |
The risk that some GPUs are burning power while working for nothing is too high. I think we should concentrate on creating a unit test that verifies the GPU algorithm is working correctly before GPU mining starts, for both the full DAG and the chunked DAG. |
ethminer --list-devices returns: Listing OpenCL devices. Here's the first bit of output: Found suitable OpenCL device [Turks] with 2147483648 bytes of GPU memory Failed to allocate 1 big chunk. Max allocateable memory is 536870912. Trying to allocate 4 chunks. |
Mind trying it with the following environmental variables (setx name val or export name=val)
|
Don't mind at all. I ran the setx commands and fired up ethminer, but it seems to be doing the same thing. Found suitable OpenCL device [Turks] with 2147483648 bytes of GPU memory Failed to allocate 1 big chunk. Max allocateable memory is 536870912. Trying to allocate 4 chunks. |
I'm assuming that means the ethminer_debug outputs are invalid again (CPU hash doesn't end with GPU hash) |
Correct; debug's CPU hash doesn't end with GPU hash. |
If you are having trouble with chunked mining you can try running the chunked DAG debugger. This won't actually mine anything; it uploads the DAG, runs through it to ensure integrity, then outputs CPU/GPU hash pairs. If it fails before the CPU/GPU hash pairs get printed (DAG verification fails) post the log. |
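The pass/fail criterion used in this thread ("CPU hash ends with GPU hash") can be sketched as a tiny check. This assumes both hashes are available as hex strings, with the GPU value being a shorter tail such as the 8-digit one quoted earlier. The function name and the sample values in the usage note are hypothetical, and this is not code from the debugger itself.

```python
def hash_pair_matches(cpu_hash_hex, gpu_hash_hex):
    """Return True when the CPU hash ends with the (shorter) GPU hash.

    Comparison is case-insensitive and ignores surrounding whitespace.
    An empty GPU hash never counts as a match.
    """
    cpu = cpu_hash_hex.strip().lower()
    gpu = gpu_hash_hex.strip().lower()
    return bool(gpu) and cpu.endswith(gpu)
```

For instance, `hash_pair_matches("00ab4565752f", "4565752F")` is True, while any pair where the tails differ is a sign that the GPU is reading the DAG incorrectly.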
BTW i looked into my broken chunks implementation and fixed it. It does work, but it doesn't seem to be very useful since for the majority of cards it's more a matter of setting the right environment variables to fix the allocation issues. It's also slower on AMD cards and really doesn't do anything useful on the Nvidia platform. I also managed to get the VGPRS usage down to 56, but I got 108 scratch registers back in return, which totally kills the added value of an extra wavefront |
How low did you get it before scratch registers started appearing? I've
|
It was either 78 or 80 (the 80 one works real nice on CUDA-CL with the -cl-nv-maxrregcount compiler option), or 56. You use this for Theta:
The dynamic indexing of a (the 1600-bit keccak state) forces the compiler to move it out of the registers. Same happens for t (the temporary 1600-bit keccak state). 25 * 2 * 2 = 100 is about 108 scratch regs. Why it then saves just 24 VGPRS is still a bit of a mystery, but it really doesn't matter. |
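For context, a minimal Python sketch of the standard Keccak-f[1600] theta step over a flat 25-lane state. This is the textbook definition, not the snippet from the PR (which is not reproduced here), and Python says nothing about GPU register allocation. It only shows where loop-computed indices into the state arise. The register arithmetic at the end is one reading of the "25 * 2 * 2" figure: 25 lanes, two 32-bit registers per 64-bit lane, two arrays (a and t).

```python
MASK64 = (1 << 64) - 1


def rotl64(value, shift):
    """Rotate a 64-bit value left by shift bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def theta(a):
    """Standard Keccak theta step; a holds 25 64-bit lanes as a[x + 5 * y]."""
    # Column parities.
    c = [a[x] ^ a[x + 5] ^ a[x + 10] ^ a[x + 15] ^ a[x + 20] for x in range(5)]
    # Mix each column with its neighbours; note the computed indices.
    d = [c[(x - 1) % 5] ^ rotl64(c[(x + 1) % 5], 1) for x in range(5)]
    return [a[i] ^ d[i % 5] for i in range(25)]


# One reading of the scratch-register estimate in the comment above.
LANES, REGS_PER_LANE, STATE_ARRAYS = 25, 2, 2
SCRATCH_ESTIMATE = LANES * REGS_PER_LANE * STATE_ARRAYS  # 100
```

Kernels typically unroll these loops by hand so that every index is a compile-time constant, which is what lets a compiler keep the state in registers.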
Ah finally getting a bit of grip on that dreaded GCN compiler. Down to 23 VGPRS with an occupancy of 100%. Dramatic hashrate though. Good example why occupancy isn't everything :) |
I'm going to close this. At this point the DAG size has increased even further and the few edge cases that the environmental variables don't solve don't seem to work with this either. |
I'm not sure why this patch was closed. Recently I've been unable to mine with stock ethminer using either a 7970 3 GB or an R9 270 2 GB card due to the DAG alloc issue. (Note: for some reason one 7970 works fine and another doesn't.) I've tried all env variable hacks in this thread and elsewhere to no avail. Applying this patch fixes the problem for both cards, and all is well. |
For anyone interested, I created a fork of Genoil's ethminer that includes this patch. The chunking works great with R9 270 and HD 7970 and is automatic if allocating a full DAG fails. |
Fixed support for chunking the DAG on GPUs unable to allocate a contiguous block of memory large enough for the entire DAG.
The problem was in the offsets calculated in the CL kernel.
Also removed DAG duplication on chunked buffer mapping.