No Nvidia support: errors out with clEnqueueReadBuffer (-5) with both CUDA 7.5 libs and system libs #6

Motoma · 2016-10-29T18:12:04Z

Ubuntu Linux 4.4.0 x86_64

CUDA Driver = CUDART
CUDA Driver Version = 8.0
CUDA Runtime Version = 7.5
Device0 = GeForce GTX 970
Device1 = GeForce GTX 970

Compiled silentarmy against both the libraries provided by apt-get as well as those included in the CUDA SDK.

Regardless of the command I run, I receive:

Building program
Hash tables will use 1744.8 MB
Running...
clEnqueueReadBuffer (-5)

mbevand · 2016-10-30T00:36:48Z

Thanks for your report. As per cl.h, error -5 is:
#define CL_OUT_OF_RESOURCES -5

Could you report the full output of "clinfo"? It sounds like your cards don't support silentarmy allocating 1744.8 MB of GPU memory.
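For readers decoding similar messages, here is a minimal sketch of turning a "function (code)" line into a symbolic name. The numeric values are the ones defined in the Khronos cl.h header; the helper names `cl_error_name` and `describe_failure` are hypothetical and not part of silentarmy.

```python
import re

# Subset of OpenCL error codes, with values as defined in cl.h.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def cl_error_name(code):
    """Return the symbolic name for an OpenCL error code, if known."""
    return CL_ERROR_NAMES.get(code, "unknown OpenCL error (%d)" % code)


def describe_failure(message):
    """Parse a message shaped like 'clEnqueueReadBuffer (-5)'.

    Returns a readable description, or None if the line has another shape.
    """
    match = re.match(r"^\s*(\w+) \((-?\d+)\)\s*$", message)
    if match is None:
        return None
    function, code = match.group(1), int(match.group(2))
    return "%s failed with %s" % (function, cl_error_name(code))


if __name__ == "__main__":
    print(describe_failure("clEnqueueReadBuffer (-5)"))
```

The name only says which code was returned. It does not say why the driver returned it.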

Motoma · 2016-10-30T01:56:31Z

You may be right:
Max memory allocation 1072873472 (1023MiB)

Here is the output of clinfo: clinfo.txt
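The comparison being made here can be sketched as a few lines of arithmetic. Two assumptions are worth flagging: that "MB" in the silentarmy output means 10^6 bytes, and that the hash tables are requested as one buffer. The thread does not confirm either, and if the tables are split across several buffers the per-allocation limit applies to each buffer separately.

```python
MIB = 1024 * 1024


def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


def fits_in_one_buffer(requested_bytes, max_alloc_bytes):
    """True if a single buffer of this size is within the device's
    maximum single-allocation limit (CL_DEVICE_MAX_MEM_ALLOC_SIZE)."""
    return requested_bytes <= max_alloc_bytes


if __name__ == "__main__":
    max_alloc = 1072873472       # "Max memory allocation" reported by clinfo
    requested = int(1744.8e6)    # "Hash tables will use 1744.8 MB" (assumed decimal MB)
    print("limit:     %.1f MiB" % to_mib(max_alloc))
    print("requested: %.1f MiB" % to_mib(requested))
    print("fits in one buffer:", fits_in_one_buffer(requested, max_alloc))
```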

mbevand · 2016-10-30T02:54:50Z

Could you try editing param.h and defining NR_ROWS_LOG to 16, then recompile and try again? This will reduce memory usage (at the cost of performance) and may work on your cards.

Motoma · 2016-10-30T12:06:15Z

Still no luck:

$ ./silentarmy --nonces 100 -v -v
Solving default all-zero 140-byte header
Found 1 OpenCL platform(s)
Building program
Hash tables will use 402.7 MB
Running...

Solving nonce 0000000000000000000000000000000000000000000000000000000000000000
Round 0
Dropped: 0 (coll) 0 (stor)
Round 1
Dropped: 0 (coll) 0 (stor)
Round 2
clEnqueueReadBuffer (-5)

Thanks for the quick response and for looking into this.

mbevand · 2016-10-30T22:02:38Z

Weird. I'll put it on my todo list to look into this issue. solardiz had reported the same problem when trying to run silentarmy on his Nvidia cards.

mbevand · 2016-10-31T02:28:06Z

Could you try checking out the latest revision? I fixed unaligned memory accesses, and I think that may fix your issue on Nvidia cards (5f68344).

GibsT · 2016-10-31T03:18:09Z

I'm using your latest version on Linux with CUDA 8 and a GTX 960, and I changed the rows setting to 16.
The latest build still gets the error:

Solving default all-zero 140-byte header
Found 1 OpenCL platform(s)
Building program
Hash tables will use 402.7 MB
Running...

Solving nonce 0000000000000000000000000000000000000000000000000000000000000000
clEnqueueReadBuffer (-5)

Some info about the card
[0] GeForce GTX 960
CL_DEVICE_TYPE: GPU
CL_DEVICE_GLOBAL_MEM_SIZE: 4236312576
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 1059078144
CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024

Motoma · 2016-10-31T13:54:54Z

I can back @GibsT's assessment that this problem still persists on master.

mbevand · 2016-10-31T14:44:01Z

Ok guys, thanks for testing. I'll try to get my hands on an Nvidia GPU to fix this bug.

GibsT · 2016-10-31T19:19:56Z

I can debug for you if you want. If there is anything you need me to do
over here, let me know. I'm a C++ programmer but I don't really have any
experience with crypto algorithms or kernels.

ivanmladek · 2016-11-01T00:17:30Z

Same here, I can help with GPU debugging:
[OPENCL]:Found suitable OpenCL device [GeForce GTX 1070] with 8507555840 bytes of GPU memory
[OPENCL]:Using platform: NVIDIA CUDA
[OPENCL]:Using device: GeForce GTX 1070(OpenCL 1.2 CUDA)

mbevand · 2016-11-02T21:13:14Z

I am told that on Nvidia, CL_OUT_OF_RESOURCES can be a very generic error (e.g. the kernel is accessing memory outside the bounds of a buffer). So try running with "-v -v -v" to see at which Equihash step the clEnqueueReadBuffer() call fails (please attach the output to this bug). Try commenting out big chunks of the OpenCL kernel to see if the error disappears, etc. Maybe Nvidia has a debugger? I don't know; I have only worked with AMD GPUs in the past. In a few days I should find the time to try debugging this.

GibsT · 2016-11-04T02:44:47Z

[356702.608922] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 0): Misaligned Address
[356702.608928] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 0, TPC 0): Physical Multiple Warp Errors
[356702.608932] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x504648=0x5000f 0x504650=0x4 0x504644=0xd3eff2 0x50464c=0x7f
[356702.608945] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 1): Misaligned Address
[356702.608949] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 0, TPC 1): Physical Multiple Warp Errors
[356702.608952] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x504e48=0x14000f 0x504e50=0x4 0x504e44=0xd3eff2 0x504e4c=0x7f
[356702.608965] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 2): Misaligned Address
[356702.608968] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 0, TPC 2): Physical Multiple Warp Errors
[356702.608971] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x505648=0x7000f 0x505650=0x4 0x505644=0xd3eff2 0x50564c=0x7f
[356702.608984] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 3): Misaligned Address
[356702.608988] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 0, TPC 3): Physical Multiple Warp Errors
[356702.608991] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x505e48=0xa000f 0x505e50=0x4 0x505e44=0xd3eff2 0x505e4c=0x7f
[356702.609005] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 1, TPC 0): Misaligned Address
[356702.609009] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 1, TPC 0): Physical Multiple Warp Errors
[356702.609012] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x50c648=0xf 0x50c650=0x4 0x50c644=0xd3eff2 0x50c64c=0x7f
[356702.609026] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 1, TPC 1): Misaligned Address
[356702.609030] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 1, TPC 1): Physical Multiple Warp Errors
[356702.609033] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x50ce48=0x2000f 0x50ce50=0x4 0x50ce44=0xd3eff2 0x50ce4c=0x7f
[356702.609046] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 1, TPC 2): Misaligned Address
[356702.609049] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 1, TPC 2): Physical Multiple Warp Errors
[356702.609052] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x50d648=0x5000f 0x50d650=0x4 0x50d644=0xd3eff2 0x50d64c=0x7f
[356702.609065] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp Exception on (GPC 1, TPC 3): Misaligned Address
[356702.609068] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Global Exception on (GPC 1, TPC 3): Physical Multiple Warp Errors
[356702.609071] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x50de48=0xf 0x50de50=0x4 0x50de44=0xd3eff2 0x50de4c=0x7f
[356702.609082] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0041, Class 0000b1c0, Offset 00001b0c, Data 00000000

and

Solving default all-zero 140-byte header
Found 1 OpenCL platform(s)
Using GPU device ID 0
Building program
Hash tables will use 1208.0 MB
Running...

Solving nonce 0000000000000000000000000000000000000000000000000000000000000000
Round 0
Dropped: 0 (coll) 0 (stor)
Round 1
Dropped: 0 (coll) 0 (stor)
Round 2
clEnqueueReadBuffer (-5)
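A log like the one above is easier to read once it is grouped by fault reason. The following is a rough sketch of such a summary. It assumes the "NVRM: Xid ... Graphics SM Warp Exception on (GPC n, TPC m): reason" line shape seen in this paste, and the function name `warp_faults` is made up for illustration.

```python
import re

# Matches the "Graphics SM Warp/Global Exception" entries shown above.
# The lookahead ends the reason at the next "[timestamp]" or at end of text,
# so it works whether entries are one per line or run together on one line.
XID_SM_RE = re.compile(
    r"NVRM: Xid \((?P<pci>[^)]+)\): (?P<xid>\d+), "
    r"Graphics SM (?P<kind>Warp|Global) Exception on "
    r"\(GPC (?P<gpc>\d+), TPC (?P<tpc>\d+)\): "
    r"(?P<reason>.+?)(?=\s*\[\d+\.\d+\]|\s*$)"
)


def warp_faults(log_text):
    """Group per-warp exceptions by reason.

    Returns a dict mapping each reason string to the list of (GPC, TPC)
    units that reported it, in order of appearance.
    """
    faults = {}
    for match in XID_SM_RE.finditer(log_text):
        if match.group("kind") != "Warp":
            continue  # "Global" entries only summarise the warp errors
        unit = (int(match.group("gpc")), int(match.group("tpc")))
        faults.setdefault(match.group("reason"), []).append(unit)
    return faults


if __name__ == "__main__":
    sample = (
        "[356702.608922] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp "
        "Exception on (GPC 0, TPC 0): Misaligned Address\n"
        "[356702.608945] NVRM: Xid (PCI:0000:01:00): 13, Graphics SM Warp "
        "Exception on (GPC 0, TPC 1): Misaligned Address\n"
    )
    for reason, units in warp_faults(sample).items():
        print(reason, units)
```

On the paste above, every per-warp entry carries the same reason, "Misaligned Address".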

mbevand · 2016-11-04T07:54:38Z

It's very interesting to me that it fails at Round 2 and not earlier. I should have time to start investigating this bug in the next 2-3 days.

GibsT · 2016-11-04T20:26:08Z

Not sure if this helps, but I remember reading somewhere that AMD uses 64 threads per wave and Nvidia likes to use 32 per warp. 512 bits bandwidth. 64 KB shared memory cache. Up to 64 warps per multiprocessor.

mbevand · 2016-11-06T06:10:31Z

So I have good news. The CL_OUT_OF_RESOURCES error is caused by unaligned memory accesses (which happen at round 2 and above.) The fix is relatively straightforward. This means Nvidia will be supported soon.
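To illustrate what an unaligned access means in general, here is a small sketch. The 30-byte slot size is purely hypothetical and is not silentarmy's actual memory layout, and the thread does not show what the real fix looked like. An N-byte access is naturally aligned when its byte offset is a multiple of N, and one generic workaround is to split a wide access into smaller aligned pieces.

```python
def is_aligned(offset, access_size):
    """True if an access of access_size bytes at this byte offset is
    naturally aligned (the offset is a multiple of the access size)."""
    return offset % access_size == 0


def misaligned_slots(slot_size, access_size, n_slots):
    """Indices of fixed-size slots whose first bytes cannot be read with
    one aligned access of access_size bytes."""
    return [i for i in range(n_slots)
            if not is_aligned(i * slot_size, access_size)]


def split_access(offset, length):
    """Split the byte range [offset, offset + length) into naturally
    aligned chunks of 8, 4, 2 or 1 bytes, returned as (offset, size)."""
    chunks = []
    while length > 0:
        size = 8
        # Shrink the chunk until it is aligned and fits the remaining bytes.
        while size > 1 and (offset % size != 0 or size > length):
            size //= 2
        chunks.append((offset, size))
        offset += size
        length -= size
    return chunks


if __name__ == "__main__":
    # Hypothetical layout: 30-byte slots packed back to back, 8-byte reads.
    print(misaligned_slots(slot_size=30, access_size=8, n_slots=5))
    print(split_access(offset=30, length=8))
```

Splitting trades one wide load for several narrow ones, so it costs some bandwidth. Padding slots to a multiple of the access size is the other common option, at the cost of memory.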

dacox · 2016-11-07T00:23:19Z

@Motoma how are you using NVIDIA cards? Are you installing any of the AMD stuff from the README?

mbevand · 2016-11-07T00:42:07Z

I will update the README with instructions for Nvidia.

Motoma · 2016-11-07T15:09:50Z

@dacox No, I'm not following the AMD instructions in the readme. I have used both the OpenCL libraries included in the apt repository as well as the ones included in the Nvidia CUDA SDK.

mbevand · 2016-11-08T07:39:40Z

Nvidia is now supported in SILENTARMY v4: a03e308

README.md has been updated with instructions on how to install the Nvidia packages on Ubuntu 16.04

mbevand changed the title ~~clEnqueueReadBuffer (-5) when running Nvidia (both CUDA 7.5 libs and system libs)~~ No Nvidia support: errors out with clEnqueueReadBuffer (-5) with both CUDA 7.5 libs and system libs Nov 5, 2016

mbevand mentioned this issue Nov 5, 2016

Nvidia support Cuda 8.0 #26

Closed

mbevand closed this as completed Nov 8, 2016
