This repository has been archived by the owner on Apr 24, 2022. It is now read-only.

AMD 8Gb GPUs will not mine past 4GB DAG #1966

Closed
ghost opened this issue Mar 1, 2020 · 21 comments

Comments

@ghost

ghost commented Mar 1, 2020

For some reason, on 8 GB AMD GPUs the OpenCL maximum single-allocation size (CL_DEVICE_MAX_MEM_ALLOC_SIZE) is a bit less than 4 GB.

The following variables:

GPU_MAX_HEAP_SIZE 100
GPU_MAX_ALLOC_PERCENT 100
GPU_SINGLE_ALLOC_PERCENT 100

...only raise the single-allocation limit to about 4 GB (roughly 50% of VRAM) and no further.
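For anyone launching the miner from a script, here is a minimal Python sketch of passing the three variables listed above through the process environment. The helper name `miner_env` is made up for illustration, and the launch command in the comment is just the one used elsewhere in this thread. As noted above, on 8 GB cards these settings did not lift the limit past about 4 GB.

```python
import os

# The three variables from the list above, as strings for the environment.
AMD_ALLOC_VARS = {
    "GPU_MAX_HEAP_SIZE": "100",
    "GPU_MAX_ALLOC_PERCENT": "100",
    "GPU_SINGLE_ALLOC_PERCENT": "100",
}

def miner_env(base=None):
    """Return a copy of the environment with the AMD allocation variables set."""
    env = dict(os.environ if base is None else base)
    env.update(AMD_ALLOC_VARS)
    return env

# Hypothetical usage (not executed here):
#   import subprocess
#   subprocess.run(["./ethminer", "-G", "-M", "300"], env=miner_env())
```

Building a copy rather than mutating `os.environ` keeps the change scoped to the miner process.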

With a 4.05 GB DAG, mining fails. Claymore and Phoenix fail too. Is there anything we can do about this?

C:\Users\user\Downloads>ethminer.exe --benchmark 11519999


ethminer 0.18.0
Build: windows/release/msvc

 i 19:59:00 <unknown> Selected pool localhost:0
 i 19:59:00 <unknown> Established connection to localhost:0
 i 19:59:00 <unknown> Spinning up miners...
cl 19:59:00 cl-0     Using PciId : 01:00.0 Ellesmere OpenCL 2.0 AMD-APP (2580.6) Memory : 8.00 GB
cu 19:59:00 cuda-1   Using Pci Id : 02:00.0 GeForce RTX 2060 (Compute 7.5) Memory : 6.00 GB
cl 19:59:00 cl-2     Using PciId : 03:00.0 Ellesmere OpenCL 2.0 AMD-APP (2580.6) Memory : 8.00 GB
 i 19:59:00 cl-2     Adjusting CL work multiplier for 32 CUs. Adjusted work multiplier: 58255
cu 19:59:00 cuda-3   Using Pci Id : 08:00.0 GeForce GTX 1080 (Compute 6.1) Memory : 8.00 GB
cl 19:59:00 cl-4     Using PciId : 0d:00.0 gfx900 OpenCL 2.0 AMD-APP (2580.6) Memory : 7.98 GB
 i 19:59:00 cl-4     Adjusting CL work multiplier for 56 CUs. Adjusted work multiplier: 101945
 i 19:59:00 sim      Epoch : 383 Difficulty : 4.29 Gh
 i 19:59:00 sim      Job: 7de876e8... block 11519999 localhost:0
cl 19:59:02 cl-0     Generating DAG + Light : 4.05 GB
cl 19:59:02 cl-2     Generating DAG + Light : 4.05 GB
cl 19:59:02 cl-4     Generating DAG + Light : 4.05 GB
cu 19:59:02 cuda-1   Generating DAG + Light : 4.05 GB
cu 19:59:02 cuda-3   Generating DAG + Light : 4.05 GB
cl 19:59:03 cl-0     OpenCL kernel
cl 19:59:03 cl-2     OpenCL kernel
cl 19:59:04 cl-0     Loading binary kernel C:\Users\o2gen\Downloads/kernels/ethash_ellesmere_lws128_exit.bin
 m 19:59:05 <unknown> 0:00 A0 0.00 h - cl0 0.00, cu1 0.00, cl2 0.00, cu3 0.00, cl4 0.00
cl 19:59:05 cl-2     Loading binary kernel C:\Users\o2gen\Downloads/kernels/ethash_ellesmere_lws128_exit.bin
cl 19:59:05 cl-0     Build info success:
cl 19:59:05 cl-0     Creating light cache buffer, size: 63.87 MB
cl 19:59:05 cl-0     Creating DAG buffer, size: 3.99 GB, free: 3.95 GB
 X 19:59:05 cl-0     Creating DAG buffer failed: clCreateBuffer: CL_INVALID_BUFFER_SIZE (-61)
cl 19:59:05 cl-2     Build info success:
cl 19:59:05 cl-2     Creating light cache buffer, size: 63.87 MB
cl 19:59:05 cl-2     Creating DAG buffer, size: 3.99 GB, free: 3.95 GB
 X 19:59:05 cl-2     Creating DAG buffer failed: clCreateBuffer: CL_INVALID_BUFFER_SIZE (-61)

And this is why:

PS C:\Users\user> clinfo | Select-String -Pattern "Global memory|Max memory allocation|Name|^$"
  Platform Name:                                 AMD Accelerated Parallel Processing

  Board name:                                    Radeon RX 580 Series
  Max memory allocation:                         4244635648
  Global memory size:                            8589934592
  Name:                                          Ellesmere

  Board name:                                    ASUS Radeon RX 470 Series
  Max memory allocation:                         4244635648
  Global memory size:                            8589934592
  Name:                                          Ellesmere

  Board name:                                    Radeon RX Vega
  Max memory allocation:                         4244635648
  Global memory size:                            8573157376
  Name:                                          gfx900

This limitation applies both to Windows and Linux.
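To see why the benchmark block trips the limit, here is a minimal Python sketch of the DAG size calculation, using the constants from the public Ethash specification. The names `dag_size_bytes`, `is_prime` and `MAX_ALLOC` are illustrative and not taken from ethminer; `MAX_ALLOC` is the "Max memory allocation" value from the clinfo output above.

```python
# Constants from the public Ethash specification.
DATASET_BYTES_INIT = 2**30    # DAG size bound at epoch 0 (1 GiB)
DATASET_BYTES_GROWTH = 2**23  # growth per epoch (8 MiB)
MIX_BYTES = 128               # width of one DAG row
EPOCH_LENGTH = 30000          # blocks per epoch

def is_prime(n):
    """Trial division; fast enough for the row counts involved here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def dag_size_bytes(block_number):
    """DAG size in bytes for the epoch containing block_number."""
    epoch = block_number // EPOCH_LENGTH
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    # Shrink until the number of 128-byte rows is prime.
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size

MAX_ALLOC = 4244635648  # bytes, from clinfo above
block = 11519999        # the --benchmark block from the log above
size = dag_size_bytes(block)
print(block // EPOCH_LENGTH, round(size / 2**30, 2), size > MAX_ALLOC)
# prints: 383 3.99 True
```

For block 11519999 this gives epoch 383 and a DAG of about 3.99 GiB, which matches the "Creating DAG buffer, size: 3.99 GB" log line and exceeds the 4244635648-byte (about 3.95 GiB) single-allocation limit.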

@MariusVanDerWijden
Collaborator

Yes OpenCL does not allow for bigger single memory allocations than (1/4 - 1/2) of global memory. What we need to do is allocate two/four chunks of smaller memory and put them together in the kernel.
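The chunking idea can be sketched in plain Python as index arithmetic: split the DAG into the fewest chunks that each fit under the allocation limit, then map a global item index to a (chunk, offset) pair. This is only an illustration of the approach, not the code from any ethminer PR; `plan_chunks`, `locate` and the 128-byte item width are assumptions made for the example.

```python
ITEM_BYTES = 128  # assumed DAG item width for this illustration

def plan_chunks(total_bytes, max_alloc_bytes, item_bytes=ITEM_BYTES):
    """Return (items_per_chunk, chunk_count) so every chunk fits max_alloc_bytes."""
    total_items = total_bytes // item_bytes
    max_items = max_alloc_bytes // item_bytes
    if max_items == 0:
        raise ValueError("allocation limit is smaller than one item")
    chunk_count = -(-total_items // max_items)        # ceiling division
    items_per_chunk = -(-total_items // chunk_count)  # spread items evenly
    return items_per_chunk, chunk_count

def locate(index, items_per_chunk):
    """Map a global item index to (chunk_number, offset_inside_chunk)."""
    return divmod(index, items_per_chunk)

# Toy example: 10 items, at most 4 items per allocation.
flat = list(range(10))
per_chunk, count = plan_chunks(10 * ITEM_BYTES, 4 * ITEM_BYTES)
chunks = [flat[i * per_chunk:(i + 1) * per_chunk] for i in range(count)]
chunk_no, offset = locate(9, per_chunk)
print(per_chunk, count, chunk_no, offset, chunks[chunk_no][offset])
# prints: 4 3 2 1 9
```

In a real kernel the same lookup would pick one of several buffer arguments instead of a Python list; with the sizes from this issue (about 3.99 GiB of DAG against a limit of about 3.95 GiB) the plan comes out as two chunks.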

@ghost
Author

ghost commented Mar 2, 2020

@MariusVanDerWijden This is a misinterpretation of the OpenCL standard on the vendor's part, actually: https://gist.github.com/roycewilliams/5ac28350023613c614034c7fb6ba715d

The well-known GPU_MAX_ALLOC_PERCENT 100 environment variable already raises the memory available on 4 GB GPUs from 25% to almost 100%. On 8 GB GPUs, it raises the single-allocation limit no further than 4 GB (50%).

The limit is clearly artificial: on the AMD ROCm platform, for example, allocations of almost 8 GB (100%) work.

@maxmalysh

maxmalysh commented Mar 12, 2020

Related PR from 2016:

ethereum/libethereum#203 ("Fixed DAG chunking")

@MariusVanDerWijden
Collaborator

Can you test #1969 please?

@ghost
Author

ghost commented Mar 14, 2020

@MariusVanDerWijden This gives an OpenCL compilation error, then segfaults.

./ethminer -G -M 300


ethminer 0.19.0-alpha.0-6+commit.645deb73
Build: linux/release/gnu

 i 19:42:17 ethminer Selected pool localhost:0
 i 19:42:17 ethminer Established connection to localhost:0
 i 19:42:17 ethminer Spinning up miners...
cl 19:42:17 cl-0     Using PciId : 01:00.0 Ellesmere OpenCL 1.2 AMD-APP (3004.6) Memory : 7.98 GB
cl 19:42:17 cl-1     Using PciId : 02:00.0 GeForce RTX 2060 (Compute 7.5) Memory : 5.80 GB
cl 19:42:17 cl-2     Using PciId : 03:00.0 Ellesmere OpenCL 1.2 AMD-APP (3004.6) Memory : 7.98 GB
 i 19:42:17 cl-2     Adjusting CL work multiplier for 32 CUs. Adjusted work multiplier: 58,255
cl 19:42:17 cl-3     Using PciId : 08:00.0 GeForce GTX 1080 (Compute 6.1) Memory : 7.93 GB
 i 19:42:17 sim      Epoch : 0 Difficulty : 4.29 Gh
 i 19:42:17 sim      Job: d98f8c4a… block 300 localhost:0
cl 19:42:18 cl-3     Generating DAG + Light : 1.02 GB
cl 19:42:18 cl-1     Generating DAG + Light : 1.02 GB
cl 19:42:18 cl-0     Generating DAG + Light : 1.02 GB
cl 19:42:18 cl-2     Generating DAG + Light : 1.02 GB
cl 19:42:18 cl-3     OpenCL init failed: clCreateContext: CL_OUT_OF_HOST_MEMORY (-6)
cl 19:42:18 cl-0     OpenCL kernel
cl 19:42:18 cl-1     OpenCL kernel
 X 19:42:18 cl-1     OpenCL kernel build log:
<kernel>:282:25: error: global variables in function scope must have static storage class
    __global uint const dag_size = _dag_size/2;
                        ^

 X 19:42:18 cl-1     OpenCL kernel build error (-11):
clBuildProgram
SIGSEGV encountered ...
stack trace:
backtrace() returned 7 addresses
./ethminer(+0x8bb88) [0x55feede0ab88]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20) [0x7f737a944f20]
./ethminer(+0x352448) [0x55feee0d1448]
./ethminer(+0x11d956) [0x55feede9c956]
./ethminer(+0x3d812f) [0x55feee15712f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f737b2b46db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f737aa2788f]
pure virtual method called
terminate called without an active exception
Aborted (core dumped)

@ghost
Author

ghost commented Mar 14, 2020

If I replace the buggy line with const uint dag_size = _dag_size/2;, it still emits other errors:

./ethminer -G -M 300


ethminer 0.19.0-alpha.0-6+commit.645deb73.dirty
Build: linux/release/gnu

 i 20:10:21 ethminer Selected pool localhost:0
 i 20:10:21 ethminer Established connection to localhost:0
 i 20:10:21 ethminer Spinning up miners...
cl 20:10:21 cl-0     Using PciId : 01:00.0 Ellesmere OpenCL 1.2 AMD-APP (3004.6) Memory : 7.98 GB
cl 20:10:21 cl-1     Using PciId : 02:00.0 GeForce RTX 2060 (Compute 7.5) Memory : 5.80 GB
cl 20:10:21 cl-2     Using PciId : 03:00.0 Ellesmere OpenCL 1.2 AMD-APP (3004.6) Memory : 7.98 GB
 i 20:10:21 cl-2     Adjusting CL work multiplier for 32 CUs. Adjusted work multiplier: 58,255
cl 20:10:21 cl-3     Using PciId : 08:00.0 GeForce GTX 1080 (Compute 6.1) Memory : 7.93 GB
 i 20:10:21 sim      Epoch : 0 Difficulty : 4.29 Gh
 i 20:10:21 sim      Job: 83786c00… block 300 localhost:0
cl 20:10:22 cl-3     Generating DAG + Light : 1.02 GB
cl 20:10:22 cl-2     Generating DAG + Light : 1.02 GB
cl 20:10:22 cl-0     Generating DAG + Light : 1.02 GB
cl 20:10:22 cl-1     Generating DAG + Light : 1.02 GB
cl 20:10:22 cl-0     OpenCL kernel
cl 20:10:22 cl-2     OpenCL kernel
cl 20:10:22 cl-1     OpenCL kernel
cl 20:10:22 cl-3     OpenCL kernel
cl 20:10:23 cl-0     Loading binary kernel /home/o2genum/Downloads/ethminer/build/ethminer/kernels/ethash_ellesmere_lws128_exit.bin
 X 20:10:23 cl-0     Failed to load binary kernel: /home/o2genum/Downloads/ethminer/build/ethminer/kernels/ethash_ellesmere_lws128_exit.bin
 X 20:10:23 cl-0     Falling back to OpenCL kernel...
cl 20:10:23 cl-0     Creating light cache buffer, size: 16.00 MB
cl 20:10:23 cl-0     Creating DAG buffer, size: 1024.00 MB, free: 6.97 GB
cl 20:10:23 cl-0     Loading kernels
cl 20:10:23 cl-0     Writing light cache buffer
cl 20:10:23 cl-1     Creating light cache buffer, size: 16.00 MB
cl 20:10:23 cl-1     Creating DAG buffer, size: 1024.00 MB, free: 4.78 GB
cl 20:10:23 cl-1     Loading kernels
cl 20:10:23 cl-1     Writing light cache buffer
cl 20:10:23 cl-3     Creating light cache buffer, size: 16.00 MB
cl 20:10:23 cl-3     Creating DAG buffer, size: 1024.00 MB, free: 6.91 GB
cl 20:10:23 cl-3     Loading kernels
cl 20:10:23 cl-3     Writing light cache buffer
cl 20:10:23 cl-0     Creating buffer for header.
cl 20:10:23 cl-0     Creating mining buffer
cl 20:10:23 cl-0     OpenCL init failed: clEnqueueNDRangeKernel: CL_INVALID_KERNEL_ARGS (-52)
cl 20:10:23 cl-1     Creating buffer for header.
cl 20:10:23 cl-1     Creating mining buffer
cl 20:10:23 cl-1     OpenCL init failed: clEnqueueNDRangeKernel: CL_INVALID_KERNEL_ARGS (-52)
cl 20:10:23 cl-3     Creating buffer for header.
cl 20:10:23 cl-3     Creating mining buffer
cl 20:10:23 cl-3     OpenCL init failed: clEnqueueNDRangeKernel: CL_INVALID_KERNEL_ARGS (-52)
cl 20:10:23 cl-2     Loading binary kernel /home/o2genum/Downloads/ethminer/build/ethminer/kernels/ethash_ellesmere_lws128_exit.bin
 X 20:10:23 cl-2     Failed to load binary kernel: /home/o2genum/Downloads/ethminer/build/ethminer/kernels/ethash_ellesmere_lws128_exit.bin
 X 20:10:23 cl-2     Falling back to OpenCL kernel...
cl 20:10:23 cl-2     Creating light cache buffer, size: 16.00 MB
cl 20:10:23 cl-2     Creating DAG buffer, size: 1024.00 MB, free: 6.97 GB
cl 20:10:23 cl-2     Loading kernels
cl 20:10:23 cl-2     Writing light cache buffer
cl 20:10:23 cl-2     Creating buffer for header.
cl 20:10:23 cl-2     Creating mining buffer
cl 20:10:23 cl-2     OpenCL init failed: clEnqueueNDRangeKernel: CL_INVALID_KERNEL_ARGS (-52)

@MariusVanDerWijden
Collaborator

Are you using the binary kernel? Binary kernels don't work with the fix

@ghost
Author

ghost commented Mar 14, 2020

Just built the miner from source and I'm using whatever is the default. How do I make sure it uses the right kernel?

@MariusVanDerWijden
Collaborator

MariusVanDerWijden commented Mar 15, 2020

Ah damn I tested it with cuda set as default... I'll fix it tomorrow

Edit: I've fixed the problem now, however it produces no valid shares :D
I cannot find my error, maybe you can take a look @kr-deps

ghost closed this as completed Mar 15, 2020

@mirh

mirh commented May 20, 2020

>4GB allocations should be controlled by GPU_ENABLE_LARGE_ALLOCATION
https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime/blob/master/runtime/utils/flags.hpp

Not sure whether you might also need an updated driver.

@raveos-foundation

> Yes OpenCL does not allow for bigger single memory allocations than (1/4 - 1/2) of global memory. What we need to do is allocate two/four chunks of smaller memory and put them together in the kernel.

The latest RaveOS release fixes this: OpenCL can allocate more than 4 GB of VRAM.

@BitwiseMaster

> > Yes OpenCL does not allow for bigger single memory allocations than (1/4 - 1/2) of global memory. What we need to do is allocate two/four chunks of smaller memory and put them together in the kernel.
>
> The latest RaveOS release fixes this: OpenCL can allocate more than 4 GB of VRAM.

Do tell what driver you're using, because with every version of amdgpu-pro I try, clinfo says max memory allocation = 3.95GB

@mauser98k

mauser98k commented Jul 25, 2020

I successfully ran a 4 GB benchmark using Claymore V15.0 on a 7x Gigabyte RX580 rig... I used the official AMD RX580 Adrenalin drivers that were released on 7/14/2020 for Windows 10... clinfo shows that I now have 8 GB of memory available...

Now I have to wait until Claymore is updated to mine past DAG #384...

@nexgen-gadgets

> I successfully ran a 4 GB benchmark using Claymore V15.0 on a 7x Gigabyte RX580 rig... I used the official AMD RX580 Adrenalin drivers that were released on 7/14/2020 for Windows 10... clinfo shows that I now have 8 GB of memory available...
>
> Now I have to wait until Claymore is updated to mine past DAG #384...

I confirm the Radeon driver released on 7/14/2020 works perfectly for mining with a DAG larger than 4 GB. I tried it on Phoenix successfully; Claymore crashes as soon as you apply the -strap parameter.

@GetoXs

GetoXs commented Nov 25, 2020

Updating the driver is the solution, but I lost 5% of speed.
I also found a workaround: adding the -eres 0 param to Claymore with the older AMD Blockchain drivers.

@Monolite68

> Updating the driver is the solution, but I lost 5% of speed.
> I also found a workaround: adding the -eres 0 param to Claymore with the older AMD Blockchain drivers.

Me too. I added "-eres 0" to the Claymore command line and all was good for some days.
But this morning Claymore again says "GPU0 - not enough GPU memory to place DAG, you cannot mine this coin with this GPU. GPU0 - OpenCL error -61 - cannot allocate big buffer for DAG. Check readme.txt for possible solutions."
I am using the blockchain driver and will try installing the updated Radeon driver.

@GetoXs

GetoXs commented Dec 2, 2020

Same here this morning.
Param eres=0 means that the miner does not reserve additional memory for moving into a higher epoch. Given the current ETH epoch #379 (3.961 GB) and the 4 GB memory limit with the old blockchain drivers, I guess this is the limit.

The other way is to upgrade the drivers, but this means a loss of 5% speed.
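A rough Python sketch of the arithmetic behind this, under two assumptions that are not confirmed in this thread: that -eres N makes the miner size its DAG buffer for N epochs ahead, and that the old drivers cap a single allocation at the 4244635648 bytes clinfo reported at the top of this issue. The function names are made up for the example, and the DAG size used is the Ethash upper bound for each epoch (the real size is slightly smaller).

```python
DATASET_BYTES_INIT = 2**30    # Ethash spec: DAG size bound at epoch 0
DATASET_BYTES_GROWTH = 2**23  # Ethash spec: growth per epoch
MIX_BYTES = 128

def dag_upper_bound(epoch):
    """Upper bound on the DAG size in bytes (ignores the small prime adjustment)."""
    return DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES

def last_epoch_that_fits(max_alloc, reserve_epochs=0):
    """Last epoch whose buffer, sized reserve_epochs ahead, fits in max_alloc.

    Returns -1 if even epoch 0 does not fit.
    """
    epoch = -1
    while dag_upper_bound(epoch + 1 + reserve_epochs) <= max_alloc:
        epoch += 1
    return epoch

MAX_ALLOC = 4244635648  # assumed limit, taken from the clinfo output earlier

print(last_epoch_that_fits(MAX_ALLOC, reserve_epochs=0))  # prints: 378
print(last_epoch_that_fits(MAX_ALLOC, reserve_epochs=2))  # prints: 376
```

Under these assumptions, dropping the reserve to 0 only buys a couple of epochs: the buffer for epoch 379 is over the limit either way, which is consistent with the failures reported here once epoch 379 started.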

@halsafar

halsafar commented Dec 2, 2020

Same here as soon as epoch 379 hit. So the only way forward is a driver update?

@bmcd77

bmcd77 commented Dec 4, 2020

I realize this is closed, but I thought I'd ask: I've got this exact issue.
Win 10, multiple 8GB RX580s, latest ethminer (18.x) or 19.0 alpha, AMD drivers updated to ver 20.11.2.
Running fine until 4 days ago. I've grabbed the latest ethminer. I noticed the DAG chunking work is in a separate branch which I don't think was merged in.

I'm seeing:
Creating DAG buffer, size 3.96 GB, free: 3.98 GB
Creating DAG buffer failed: clCreateBuffer: CL_INVALID_BUFFER_SIZE (-61)

Is there a version of ethminer I can grab, or do I switch to another ethash miner?
Or do I grab the DAG chunking work and build that branch locally?

Appreciate any comments. Thx.

@AndreaLanfranchi
Collaborator

You can continue using ethminer if you can build the binary on your own
See instructions here https://github.com/ethereum-mining/ethminer#build

@bmcd77

bmcd77 commented Dec 4, 2020

Thanks for the quick reply. Yes, I can do that with a bit of help from the build README. Should I build the latest master branch?

Been using ethminer for over 2 yrs. I'd like to stick with it.

This issue was closed.