Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel) #246

platinum4 · 2014-06-09T02:26:30Z

When building a bin file for scrypt kernel [zuikkis and/or alexkarnew] on a Hawaii chipset R9 290/X architecture, sgminer5 throws this error.

[00:30:50] Probing for an alive pool
[00:30:51] Switching to NiceHash_Scrypt_backup - first alive pool
[00:30:52] Initialising kernel alexkarnew.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:30:52] Initialising kernel alexkarnew.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:30:52] Initialising kernel alexkarnew.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:30:52] Initialising kernel alexkarnew.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:30:52] Initialising kernel alexkarnew.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:30:52] Initialising kernel alexkarnew.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:30:54] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:30:54] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:30:54] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:30:54] GPU 0 failure, disabling!
[00:30:54] GPU 2 failure, disabling!
[00:30:54] GPU 1 failure, disabling!

This issue does not occur when running scrypt bins under kalroth's cgminer.

The settings for the scrypt bin are as follows, which are the preferred/max settings for a Hawaii R9 290.

worksize 256, TC 32765 - bins initialize just fine on cgminer and provide 995Kh/s
Under this new sgminer there is nothing but Error -4. Nothing changes in memory or architecture, as I can close sgminer and go directly over to cgminer.

This is the current working alex scrypt bin file that I have scrypt130511_alexeyHawaiiglg2tc32765w256l4

I can get sgminer to build alexkarnewHawaiiglg2tc32765w128l4, but it never respects the pool-worksize setting.

And even if you force it to 256, it still throws Error -4.

mrbrdo · 2014-06-09T03:10:18Z

There is no pool-worksize setting.

Can we try a few things:

Make cgminer_kalroth build the bin file with the appropriate settings, then copy it into sgminer and make the filename match (it might be different from the one in cgminer_kalroth). Is there any difference?
Please use the ckolivas kernel when comparing to kalroth, because this is what kalroth uses.
Try a lower thread-concurrency (but don't immediately go very low, and make it a multiple of shaders - e.g. for R9 290 it should be X * 2560).
Show me the output from when it's building (and then loading) the kernel (I only see loading in what you've shown).
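
To make the thread-concurrency suggestion concrete, here is a minimal sketch that lists the multiples of the shader count below a given ceiling, so they can be tried from the top down. The shader count of 2560 and the ceiling of 32765 come from this thread; the helper name `tc_candidates` is hypothetical and not part of sgminer.

```python
def tc_candidates(shaders, upper_bound):
    """Return multiples of `shaders` that do not exceed `upper_bound`, largest first."""
    if shaders <= 0:
        raise ValueError("shaders must be positive")
    # upper_bound // shaders is the largest whole multiplier that still fits.
    return [shaders * k for k in range(upper_bound // shaders, 0, -1)]


# 2560 is the R9 290 figure quoted above; 32765 is the reporter's current setting.
for tc in tc_candidates(2560, 32765):
    print(tc)
```

Starting from the first value and stepping down one multiple at a time follows the advice not to go very low immediately.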

Also, it seems people had this error before: https://www.weminecryptos.com/forum/topic/2299-getting-error-4-enqueueing-kernel-onto-command-queue/ - it seems to be related to low system RAM. I think you might just be getting this because alexkarnew might need different settings, so you need to try ckolivas first.

platinum4 · 2014-06-09T03:30:22Z

Ok this has been mitigated by replacing alexkarnew with ckolivas as algorithm; however, this now predictably crashes every 60s from this error.

[22:28:03] Started sgminer 4.2.1
[22:28:03] Loaded configuration file C:\sgminer_v5_0_06062014\sgminer-nicehash.conf
[22:28:03] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:03] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:03] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:03] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:03] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:03] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:03] Probing for an alive pool
[22:28:04] Switching to Marucoin suprnova - first alive pool
[22:28:04] Network diff set to 19
[22:28:04] NiceHash_X13_multi alive, testing stability
[22:28:04] Switching to NiceHash_X13_multi
[22:28:04] Network diff set to 1.66K
[22:28:04] New block detected on network before pool notification
[22:28:06] Network diff set to 22
[22:28:06] Stratum from Marucoin suprnova detected new block
[22:28:06] Switching mrr x13 platinum4.2 to stratum+tcp://us-east01.miningrigrentals.com:50100
[22:28:07] Network diff set to 1.3K
[22:28:07] Stratum from NiceHash_X13_multi detected new block
[22:28:14] Switching to NiceHash_Scrypt
[22:28:15] NiceHash_Scrypt difficulty changed to 512
[22:28:19] Network diff set to 438K
[22:28:19] New block detected on network before pool notification
[22:28:19] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:19] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:19] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:28:19] Network diff set to 481K
[22:28:19] Stratum from NiceHash_Scrypt detected new block
[22:28:22] Network diff set to 510K
[22:28:22] Stratum from NiceHash_Scrypt detected new block
[22:28:26] Network diff set to 520
[22:28:26] Stratum from tmb x13 multiport detected new block
[22:28:36] Network diff set to 1.39K
[22:28:36] Stratum from NiceHash_X13_multi detected new block
[22:28:37] NiceHash_Scrypt stale share detected, submitting as user requested
[22:28:37] Accepted Coin 510491 Diff 1.1K/512 GPU 1 at NiceHash_Scrypt
[22:29:00] Accepted Coin 510491 Diff 3.38K/512 GPU 0 at NiceHash_Scrypt
[22:29:00] Accepted Coin 510491 Diff 3.23K/512 GPU 2 at NiceHash_Scrypt
[22:29:04] Network diff set to 562K
[22:29:04] Stratum from NiceHash_Scrypt detected new block
[22:29:11] Accepted Coin 562426 Diff 661/512 GPU 0 at NiceHash_Scrypt
[22:29:19] thread was not cancelled in 60 seconds after restart_mining_threads
[22:29:19]
Summary of runtime statistics:

[22:29:19] Started at [2014-06-08 22:28:04]
[22:29:19] Runtime: 0 hrs : 1 mins : 0 secs
[22:29:19] Average hashrate: 2.7 Megahash/s
[22:29:19] Solved blocks: 0
[22:29:19] Best share difficulty: 3.38K
[22:29:19] Share submissions: 4
[22:29:19] Accepted shares: 4
[22:29:19] Rejected shares: 0
[22:29:19] Accepted difficulty shares: 2048
[22:29:19] Rejected difficulty shares: 0
[22:29:19] Reject ratio: 0.0%
[22:29:19] Hardware errors: 0
[22:29:19] Utility (accepted shares / min): 4.00/min
[22:29:19] Work Utility (diff1 shares solved / min): 2494.52/min

[22:29:19] Stale submissions discarded due to new blocks: 0
[22:29:19] Unable to get work from server occasions: 0
[22:29:19] Work items generated locally: 136
[22:29:19] Submitting work remotely delay occasions: 0
[22:29:19] New blocks detected on network: 10

[22:29:19] Summary of per device statistics:

Then it crashes out.

This is with pool-gpu-threads : 1 - I continue to get -4 Errors if I do not set pool-gpu-threads to 1, since my main gpu-threads is 2. gpu-threads 2 will always lead to a -4 Error.

platinum4 · 2014-06-09T03:35:16Z

I am trying with "no-restart" : true right now.

platinum4 · 2014-06-09T03:36:15Z

[22:33:41] Started sgminer 4.2.1
[22:33:41] Loaded configuration file C:\sgminer_v5_0_06062014\sgminer-nicehash.conf
[22:33:42] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:33:42] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:33:42] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:33:42] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:33:42] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:33:42] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:33:42] Probing for an alive pool
[22:33:43] Switching to Marucoin suprnova - first alive pool
[22:33:43] Network diff set to 20
[22:33:44] NiceHash_X13_multi alive, testing stability
[22:33:44] Switching to NiceHash_X13_multi
[22:33:44] Network diff set to 1.48K
[22:33:44] New block detected on network before pool notification
[22:33:44] Marucoin suprnova stale share detected, submitting as user requested
[22:33:44] Accepted Coin 21 Diff 0.039/0.004 GPU 0 at Marucoin suprnova
[22:33:45] Switching mrr x13 platinum4.2 to stratum+tcp://us-east01.miningrigrentals.com:50100
[22:33:53] Stratum from NiceHash_X13_multi requested work restart
[22:33:54] Accepted Coin 1484 Diff 0.019/0.008 GPU 0 at NiceHash_X13_multi
[22:33:57] Accepted Coin 1484 Diff 0.098/0.008 GPU 0 at NiceHash_X13_multi
[22:33:57] Accepted Coin 1484 Diff 0.010/0.008 GPU 0 at NiceHash_X13_multi
[22:33:58] Accepted Coin 1484 Diff 0.046/0.008 GPU 1 at NiceHash_X13_multi
[22:34:07] Switching to NiceHash_Scrypt
[22:34:08] NiceHash_Scrypt difficulty changed to 512
[22:34:11] Network diff set to 162K
[22:34:11] Stratum from NiceHash_Scrypt detected new block
[22:34:12] Network diff set to 162K
[22:34:12] Stratum from NiceHash_Scrypt detected new block
[22:34:12] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:34:12] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:34:12] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[22:34:16] Network diff set to 162K
[22:34:16] Stratum from NiceHash_Scrypt detected new block
[22:34:19] Accepted Coin 162013 Diff 848/512 GPU 1 at NiceHash_Scrypt
[22:34:21] Network diff set to 2.54M
[22:34:21] Stratum from NiceHash_Scrypt detected new block
[22:34:40] Accepted Coin 2539405 Diff 1.05K/512 GPU 0 at NiceHash_Scrypt
[22:34:40] Accepted Coin 2539405 Diff 2.85K/512 GPU 2 at NiceHash_Scrypt
[22:34:45] NiceHash_Scrypt extranonce change requested
[22:34:45] Network diff set to 309K
[22:34:45] Stratum from NiceHash_Scrypt detected new block
[22:34:47] Accepted Coin 308808 Diff 2.93K/512 GPU 0 at NiceHash_Scrypt
[22:34:56] Accepted Coin 308808 Diff 2.4K/512 GPU 0 at NiceHash_Scrypt
[22:34:58] Accepted Coin 308808 Diff 752/512 GPU 2 at NiceHash_Scrypt
[22:34:59] Stratum from NiceHash_Scrypt requested work restart
[22:35:06] Accepted Coin 308808 Diff 1.01K/512 GPU 2 at NiceHash_Scrypt
[22:35:12] thread was not cancelled in 60 seconds after restart_mining_threads
[22:35:12]

mrbrdo · 2014-06-09T03:49:04Z

Just for a start, can you try this: around line 6260 in sgminer.c, just above quit(1, "thread was not cancelled in 60 seconds after restart_mining_threads");, add a new line with pthread_testcancel(); and then run make again to recompile. I don't think it will help, but let's try it just in case.

Edit: Oh right you are on Windows. Are you able to recompile?

platinum4 · 2014-06-09T03:52:51Z

tbh I wait on the Windows compiles provided by Elun on bitcointalk. I have already asked him to make a new set of binaries based on your recent commits. Can you add that line into sgminer.c as you don't think it will be detrimental, or do you want me to add into my repo and have Elun build it? No matter what I do, how I try, that stupid win-build guide does not work for me, and I have followed it from the beginning now at least 5x, enough to get frustrated.

mrbrdo · 2014-06-09T03:56:38Z

Yes I also had problems with Windows build. Here is some discussion about it, it seems they figured it out: #229

I wouldn't like to add it to the branch, because I don't want to add code that has no effect.
As a side note, if you remove all "pool-gpu-threads" from your config, this should not happen.

mrbrdo · 2014-06-09T05:08:52Z

I think I figured out how to reproduce this. Seems it is indeed a bug. Only happens when "pool-gpu-threads" is configured.

Bllacky · 2014-06-09T05:10:32Z

@mrbrdo Have you managed to build SGminer with the latest commits? I tried last night and I had some major issues. But I don't know if it's my fault or it's something from the source.

platinum4 · 2014-06-09T05:34:55Z

"I think I figured out how to reproduce this. Seems it is indeed a bug. Only happens when "pool-gpu-threads" is configured."

Ditto - any thoughts on how to side-step this bug?

Edit: hardcoding gpu-threads 1 in the bottom portion of .conf file & removing all instances of pool-gpu-threads appears to have allowed me to build ckolivas24000nf11 and start hashing on scrypt-N for more than a minute; however, if you switch pools down to scrypt with an nf10 it immediately fails.

[00:42:46] Started sgminer 4.2.1
[00:42:46] Loaded configuration file C:\sgminer_v5_0_06062014\sgminer-nicehash.conf
[00:42:46] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:42:46] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:42:46] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:42:46] Probing for an alive pool
[00:42:47] Switching to Marucoin suprnova - first alive pool
[00:42:47] Network diff set to 16
[00:42:47] NiceHash_X13_multi alive, testing stability
[00:42:47] Switching to NiceHash_X13_multi
[00:42:47] Network diff set to 1.07K
[00:42:47] New block detected on network before pool notification
[00:42:49] Switching mrr x13 platinum4.2 to stratum+tcp://us-east01.miningrigrentals.com:50100
[00:42:49] Accepted Coin 1071 Diff 0.008/0.005 GPU 2 at NiceHash_X13_multi
[00:42:49] Stratum from NiceHash_X13_multi requested work restart
[00:42:52] Accepted Coin 1071 Diff 0.075/0.005 GPU 0 at NiceHash_X13_multi
[00:42:53] Network diff set to 1.09K
[00:42:53] Stratum from NiceHash_X13_multi detected new block
[00:42:53] Accepted Coin 1089 Diff 0.015/0.005 GPU 1 at NiceHash_X13_multi
[00:42:55] Accepted Coin 1089 Diff 0.022/0.005 GPU 2 at NiceHash_X13_multi
[00:42:55] Accepted Coin 1089 Diff 0.005/0.005 GPU 0 at NiceHash_X13_multi
[00:42:56] Accepted Coin 1089 Diff 0.012/0.005 GPU 1 at NiceHash_X13_multi
[00:42:57] Accepted Coin 1089 Diff 0.009/0.005 GPU 2 at NiceHash_X13_multi
[00:42:58] Accepted Coin 1089 Diff 0.006/0.005 GPU 1 at NiceHash_X13_multi
[00:43:00] Accepted Coin 1089 Diff 0.006/0.005 GPU 2 at NiceHash_X13_multi
[00:43:01] Accepted Coin 1089 Diff 0.009/0.005 GPU 2 at NiceHash_X13_multi
[00:43:05] Accepted Coin 1089 Diff 0.284/0.005 GPU 1 at NiceHash_X13_multi
[00:43:08] Switching to NiceHash_Scrypt-N
[00:43:08] NiceHash_Scrypt-N difficulty changed to 512
[00:43:12] Network diff set to 654M
[00:43:12] Stratum from NiceHash_Scrypt detected new block
[00:43:13] Network diff set to 15.6M
[00:43:13] New block detected on network before pool notification
[00:43:13] Building binary ckolivasHawaiiglg2tc24000nf11w128l4.bin
[00:43:20] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[00:43:20] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[00:43:20] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[00:43:34] Network diff set to 1.12K
[00:43:34] Stratum from tmb x13 multiport detected new block
[00:43:50] Accepted Coin 15580881 Diff 1.08K/512 GPU 0 at NiceHash_Scrypt-N
[00:43:50] Stratum from NiceHash_Scrypt-N requested work restart
[00:43:58] Network diff set to 15.6M
[00:43:58] Stratum from NiceHash_Scrypt-N detected new block
[00:44:06] Stratum from NiceHash_Scrypt-N requested work restart
[00:44:12] Accepted Coin 15574040 Diff 705/512 GPU 2 at NiceHash_Scrypt-N
[00:44:42] Switching to NiceHash_Scrypt
[00:44:43] NiceHash_Scrypt difficulty changed to 512
[00:44:47] Network diff set to 654M
[00:44:47] New block detected on network before pool notification
[00:44:47] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:44:47] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:44:48] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:44:48] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:44:48] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:44:48] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:44:48] GPU 1 failure, disabling!
[00:44:48] GPU 2 failure, disabling!
[00:44:48] GPU 0 failure, disabling!

platinum4 · 2014-06-09T05:50:37Z

Here is an attempt to trick the miner into using 2 different kernel.cl files as the algorithm

[00:47:33] Started sgminer 4.2.1
[00:47:33] Loaded configuration file C:\sgminer_v5_0_06062014\sgminer-nicehash.conf
[00:47:33] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:47:33] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:47:33] Initialising kernel marucoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:47:33] Probing for an alive pool
[00:47:34] Switching to Marucoin suprnova - first alive pool
[00:47:34] Network diff set to 16
[00:47:34] NiceHash_X13_multi alive, testing stability
[00:47:34] Switching to NiceHash_X13_multi
[00:47:35] Switching mrr x13 platinum4.2 to stratum+tcp://us-east01.miningrigrentals.com:50100
[00:47:35] Network diff set to 1.11K
[00:47:35] New block detected on network before pool notification
[00:47:35] Marucoin suprnova stale share detected, submitting as user requested
[00:47:35] Accepted Coin 16 Diff 0.008/0.004 GPU 2 at Marucoin suprnova
[00:47:41] Accepted Coin 1107 Diff 0.016/0.008 GPU 0 at NiceHash_X13_multi
[00:47:41] Stratum from NiceHash_X13_multi requested work restart
[00:47:44] Switching to NiceHash_Scrypt
[00:47:45] NiceHash_Scrypt difficulty changed to 512
[00:47:48] Network diff set to 1.16K
[00:47:48] Stratum from tmb x13 multiport east2 detected new block
[00:47:49] Network diff set to 51.9M
[00:47:49] New block detected on network before pool notification
[00:47:50] Building binary zuikkisHawaiiglg2tc32765nf10w128l4.bin
[00:47:54] NiceHash_Scrypt extranonce change requested
[00:47:54] Stratum from NiceHash_Scrypt requested work restart
[00:47:55] Stratum from NiceHash_Scrypt requested work restart
[00:47:57] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:47:57] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:47:57] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[00:48:00] Accepted Coin 51916139 Diff 2.83K/512 GPU 0 at NiceHash_Scrypt
[00:48:02] Network diff set to 654M
[00:48:02] Stratum from NiceHash_Scrypt detected new block
[00:48:14] Switching to NiceHash_Scrypt-N
[00:48:15] Network diff set to 15.6M
[00:48:15] New block detected on network before pool notification
[00:48:15] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[00:48:15] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[00:48:15] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[00:48:15] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:48:15] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:48:15] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[00:48:15] GPU 1 failure, disabling!
[00:48:15] GPU 0 failure, disabling!
[00:48:15] GPU 2 failure, disabling!
[00:48:16] Network diff set to 654M
[00:48:16] Stratum from NiceHash_Scrypt detected new block

Again, trying to go from nf10 -> nf11, or nf11 -> nf10, throws failures. For now, I can add scrypt as a backup without an issue, since ckolivas works with nf10.
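
To pull the failing switches out of long logs like the ones above, a small scanner can pair each Error -4 with the kernel that was just initialised and the one that ran before it. This is only a sketch: the regular expressions are written against the log lines quoted in this thread, and the function name `failing_transitions` is hypothetical.

```python
import re

# Patterns modelled on the sgminer log lines quoted in this thread.
INIT_RE = re.compile(r"Initialising kernel (\S+)\.cl .*nfactor (\d+)")
ERR_RE = re.compile(r"Error -4: Enqueueing kernel onto command queue")


def failing_transitions(lines):
    """Return (previous, current) kernel/nfactor pairs where Error -4 followed a switch."""
    previous = None  # (kernel, nfactor) in use before the latest switch
    current = None   # most recently initialised (kernel, nfactor)
    failures = []
    for line in lines:
        match = INIT_RE.search(line)
        if match:
            seen = (match.group(1), int(match.group(2)))
            # Repeated lines for the same kernel (one per thread) are not a switch.
            if seen != current:
                previous, current = current, seen
            continue
        if ERR_RE.search(line) and current is not None:
            pair = (previous, current)
            if pair not in failures:
                failures.append(pair)
    return failures


sample = [
    "[00:47:57] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 10, n 1024",
    "[00:48:15] Initialising kernel ckolivas.cl with bitalign, unpatched BFI, nfactor 11, n 2048",
    "[00:48:15] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)",
]
print(failing_transitions(sample))
```

Running it over a full log file (for example `failing_transitions(open(path))`) lists each distinct switch that ended in Error -4 once.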

platinum4 · 2014-06-09T06:04:09Z

As it is right now, this miner can effectively algo flip between scrypt-N, keccak, x11, x13 [no scrypt nf10]

Which, for right now, is all of the feasible mining algorithms for GPUs. It looks like ASICs have increased scrypt difficulty to a phenomenal level already.

mrbrdo · 2014-06-09T06:06:35Z

If you don't use "pool-gpu-threads" in your config (at all) then it should not happen. Does it?

Also, I experience this problem with Scrypt-N too.

platinum4 · 2014-06-09T06:09:17Z

"If you don't use "pool-gpu-threads" in your config (at all) then it should not happen. Does it?"

Confirmed - we are past the 60-second shut-off. However, I have removed all instances of pool-gpu-threads and hardcoded gpu-threads to 1 in the .conf file. It allows the miner to run, but when you flip from scrypt to scrypt-N it throws the -4 Error shown above.

platinum4 · 2014-06-09T06:13:32Z

It's not the "pool-nfactor" setting, because that one works effectively. However, I have NOT tried to flip back to an X13 after that. Experimenting now.

platinum4 · 2014-06-09T06:16:52Z

Yeah, it algo flips amongst nfactors OK. The problem lies in how it flips amongst nfactors within the same algo (i.e. scrypt). It fails consistently when switching from scrypt10 to scrypt11 or back.

troky · 2014-06-09T06:47:50Z

@platinum4 How much RAM do you have installed? What is configured pagefile size? 3x R9 290, right?

mrbrdo · 2014-06-09T07:48:41Z

Well, it seems that somehow it cannot allocate enough memory for the kernel/buffers. This seems to work fine when pool-gpu-threads is not set (which means a soft restart: mining threads are not completely stopped and restarted). But when it is set (hard restart: mining threads are completely stopped and then restarted), it seems like some memory is left reserved on the devices. I cannot figure out what it could be or why it does not happen with a soft restart. It definitely happens because it is out of memory: for example, if I set gpu-threads to 1 (which means less VRAM use), then it works fine.
For me it happens when switching from darkcoin, darkcoin-mod or maxcoin kernel to scrypt (I didn't try others).
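
A rough way to see why gpu-threads and nfactor both matter for memory is to estimate the scrypt scratchpad size. The sketch below assumes the sizing used in cgminer-derived miners, 128 bytes x ceil(N / lookup-gap) x thread-concurrency per GPU thread with N = 2^nfactor. That formula is an assumption and is not stated anywhere in this thread, and the function name is hypothetical.

```python
def scrypt_buffer_bytes(thread_concurrency, nfactor, lookup_gap=2):
    """Estimated scratchpad size in bytes for one GPU thread (assumed formula)."""
    n = 1 << nfactor                    # nfactor 10 -> n 1024, nfactor 11 -> n 2048
    items_per_thread = -(-n // lookup_gap)  # ceiling division
    return 128 * items_per_thread * thread_concurrency


# Settings reported earlier in the thread: TC 32765, lookup-gap 2.
for nfactor in (10, 11):
    per_thread = scrypt_buffer_bytes(32765, nfactor)
    for gpu_threads in (1, 2):
        total_gib = per_thread * gpu_threads / 2**30
        print(f"nfactor {nfactor}, gpu-threads {gpu_threads}: {total_gib:.2f} GiB")
```

If the estimate holds, going from nfactor 10 to 11 doubles the scratchpad, and a second GPU thread doubles it again, which would be consistent with gpu-threads 1 working where gpu-threads 2 fails.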

(also http://www.popekim.com/2012/07/opencl-getting-outofresource-or.html this is the error we get - -4 CL_MEM_OBJECT_ALLOCATION_FAILURE)
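
For reference, the first few OpenCL status codes can be mapped to their names as below. The values are the ones defined in the Khronos `cl.h` header; the lookup helper itself is only an illustrative sketch and is not part of sgminer.

```python
# Status codes as defined in the Khronos OpenCL header (cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_cl_error(code):
    """Return the symbolic name for an OpenCL status code, if it is in the table."""
    return OPENCL_ERRORS.get(code, f"unknown OpenCL error {code}")


print(describe_cl_error(-4))
```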

platinum4 · 2014-06-09T08:16:37Z

@troky 8GB RAM, no problem mining any algos with it. Pagefile is set to automatic, I assume. I'll try a 12GB pagefile size and report back.

Pagefile shouldn't matter. The same error occurs on these rigs - 2x 290X, 3x 290X, 3x 390X, so pagefiles should ideally be 8GB, 12GB, 12GB.

Expanded all page files to 12228MB; still -4 Errors when switching amongst scrypt and scrypt-N.

platinum4 · 2014-06-09T08:18:32Z

And still getting SICK -> DEAD errors on a few cards (mainly only R9 290X Tri-X OC 1040/1300)

mrbrdo · 2014-06-09T10:04:14Z

That's probably unrelated, would probably happen with sph-sgminer-x11mod too. It's probably just misconfiguration.

platinum4 · 2014-06-09T11:43:04Z

Well, not when Elun had extended that SICK timer... ;D

But yeah, we'll go with a 'continuous configuration error' after all this time if it's easier to stomach. ;)

mrbrdo · 2014-06-09T15:36:47Z

Extending SICK timer is not a solution, it's a workaround... And if we start with those, we will get nowhere again. But I might just throw that restarting SICK GPUs out some day because it doesn't seem to ever help anything.

One guy was complaining about SICK/DEAD on X13-mod, but then he found some different settings on some forum and no more SICK, and he even got better hashrate/WU out of it. I'm not saying the X13-mod kernel doesn't have problems, but we use the same that everyone does (from girino), so the problems should be the same no matter which miner you use. Also for example someone told me that he got better hashrate with intensity 18 instead of 20. So it's not necessary to always put everything to absolute max. Similarly keccak seems to work better with gpu-threads 1 instead of 2 on R9 280X (without changing any other settings).

But back to the point, I spent some time looking at what could be causing this weird bug with scrypt kernels, but I have no idea at all yet :/

platinum4 · 2014-06-16T07:54:41Z

This issue stays open until more reports come in of a successful flip to nscrypt and back.

mrbrdo · 2014-06-30T19:15:46Z

@platinum4 so are you saying you yourself do not experience the problem anymore?

platinum4 · 2014-06-30T19:44:26Z

I still see it with nf11, and I think others do as well; there are still reports of it from a few members in sgminer-dev IRC and on bitcointalk.

mrbrdo · 2014-06-30T22:51:38Z

Hm, well the new v5_0 does not do a hard reset unless gpu-threads is really different. Does this still happen for you when switching from scrypt to scrypt-n or vice versa (assuming that you use the same gpu-threads for both)? Or does it only happen when switching from some algo where you use a different gpu-threads (e.g. Keccak)?
I tested it a few minutes ago without hard restart (so gpu-threads was same), and it worked fine (as before, I only experienced this on hard-restart).

platinum4 · 2014-06-30T23:06:44Z

I'll have to test on the rigs later tonight; I will get back to you.

platinum4 · 2014-07-01T07:34:28Z

@mrbrdo I cannot replicate this issue anymore [Windows binaries, possibly fixed by the slew of commits in June 2014]; closing this issue for now unless others can repeat it.

platinum4 · 2014-07-01T14:25:13Z

Re-opened this issue as it is being experienced by others, such as @evolvia31 #308

evolvia31 · 2014-07-03T05:58:40Z

Hi all, I tried the latest commit on branch v5_0 this morning and the issue persists :(
My sgminer version is 4.2.2-240-gec8b.
If you need more logs or debug output, tell me what to post to help solve this issue.
(My full config is posted in my issue #308.)
I tried all the kernels that support scrypt-N (zuikkis, alexkar, ckolivas) and I get the same bug when I switch from x11, x13 or Keccak to scrypt-N.
This is my output log:
[07:50:34] Initialising kernel darkcoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[07:50:34] Initialising kernel darkcoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[07:50:34] Initialising kernel darkcoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[07:50:34] Initialising kernel darkcoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[07:50:34] Initialising kernel darkcoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[07:50:34] Initialising kernel darkcoin-mod.cl with bitalign, unpatched BFI, nfactor 10, n 1024
[07:51:01] Accepted 12b76333 Diff 0.053/0.040 GPU 1 at NiceHash_X11
[07:51:14] Accepted 115262c9 Diff 0.058/0.040 GPU 0 at NiceHash_X11
[07:51:44] Accepted 0b3411a8 Diff 0.089/0.040 GPU 1 at NiceHash_X11
[07:52:04] Stratum connection to Waffle_X11 interrupted
[07:52:12] Accepted 10394a79 Diff 0.062/0.040 GPU 0 at NiceHash_X11
[07:52:15] Accepted 17755b6a Diff 0.043/0.040 GPU 0 at NiceHash_X11
[07:53:26] Accepted 081c27b0 Diff 0.123/0.040 GPU 0 at NiceHash_X11
[07:53:42] Accepted 037b5162 Diff 0.287/0.040 GPU 1 at NiceHash_X11
[07:54:02] Accepted 5415e379 Diff 3.044/0.040 GPU 0 at NiceHash_X11
[07:54:08] Accepted 0a602be4 Diff 0.096/0.040 GPU 1 at NiceHash_X11
[07:54:20] Switching to NiceHash_N
[07:54:20] NiceHash_N difficulty changed to 128
[07:54:25] Applying pool settings for NiceHash_N...
[07:54:25] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[07:54:25] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[07:54:25] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[07:54:25] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[07:54:25] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[07:54:25] Initialising kernel zuikkis.cl with bitalign, unpatched BFI, nfactor 11, n 2048
[07:54:25] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[07:54:25] Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[07:54:25] GPU 1 failure, disabling!
[07:54:25] GPU 2 failure, disabling!
[07:54:27] Accepted 8bdaabb6 Diff 468/128 GPU 0 at NiceHash_N
[07:54:29] Accepted 01109c02 Diff 240/128 GPU 0 at NiceHash_N
[07:54:30] Accepted 40e8602c Diff 1.01K/128 GPU 0 at NiceHash_N

mrbrdo · 2014-07-28T15:17:13Z

@evolvia31 I have time to look into it this week, can you confirm it is still a problem, or has it been fixed?

ystarnaud · 2014-07-28T23:24:49Z

@evolvia31 can you paste your full config not just the profiles section please?

ystarnaud · 2014-07-28T23:26:13Z

Also your specs might be helpful. What model GPUs are you using? How much system memory?

evolvia31 · 2014-07-29T09:06:18Z

Hi, yes, I tried again this morning and the bug is still there, with a crash.
My config is:
Ubuntu 13.10 (GNU/Linux 3.11.0-15-generic x86_64)
Driver ATI 14.6 beta
sgminer version: sgminer 4.2.2-255-gb6aef
graphic card: R9 280X MSI Gaming
RAM: 2 GB
GPU infos:
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tahiti XT [Radeon HD 7970/R9 280X] (prog-if 00 [VGA controller])
Subsystem: Micro-Star International Co., Ltd. Device 2775
Flags: bus master, fast devsel, latency 0, IRQ 52
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at f7b00000 (64-bit, non-prefetchable) [size=256K]
I/O ports at b000 [size=256]
Expansion ROM at f7b40000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] #19
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] #13
Capabilities: [2d0] #1b
Kernel driver in use: fglrx_pci

My sgminer config file:
"profiles":[
{
"name":"x11",
"algorithm":"darkcoin-mod",
"nfactor" : "10"
},{
"name":"x13",
"algorithm":"marucoin-mod",
"nfactor" : "10"
},{
"name":"Scrypt",
"algorithm":"zuikkis",
"nfactor" : "10"
},{
"name":"ScryptN",
"algorithm":"zuikkis",
"nfactor" : "11"
},{
"name":"keccak",
"algorithm":"maxcoin",
"nfactor" : "10"
},{
"name":"x15",
"algorithm":"bitblock",
"nfactor" : "10",
"intensity" : "19",
"worksize" : "64",
"gpu-memclock" : "1250",
"gpu-engine" : "1100"
},{
"name":"nist5",
"algorithm":"talkcoin-mod",
"nfactor" : "10"
}
],
"intensity" : "13",
"vectors" : "1",
"worksize" : "256",
"kernel" : "zuikkis",
"lookup-gap" : "2",
"thread-concurrency" : "8192",
"shaders" : "2048",
"gpu-engine" : "1050",
"gpu-fan" : "80",
"gpu-memclock" : "1500",
"gpu-memdiff" : "0",
"gpu-powertune" : "5",
"gpu-vddc" : "0.000",
"temp-cutoff" : "88",
"temp-overheat" : "83",
"temp-target" : "75",
"api-mcast-port" : "4028",
"api-port" : "4028",
"api-listen" : true,
"api-allow" : "W:192.168.0.45",
"expiry" : "30",
"gpu-dyninterval" : "7",
"gpu-platform" : "0",
"gpu-threads" : "2",
"log" : "60",
"no-pool-disable" : true,
"queue" : "0",
"scan-time" : "5",
"scrypt" : true,
"temp-hysteresis" : "3",
"shares" : "0",
"kernel-path" : "/home/sgminer/v50/kernel"
}

ystarnaud · 2014-07-29T09:38:52Z

Where are your pools in the config? Do you only have 1 GPU? I could have sworn the enqueue error was with GPU 1 and 2 while 0 was ok.

Also using gpu threads 2 across the board seems a bit dangerous. I haven't dealt with non Xn algorithms in a while but I'm pretty sure some of the more intense algorithms needed only 1 thread.

I would recommend you test each algorithm individually with only 1 pool and no switching to make sure you have the correct settings before putting them all in 1 file.
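One way to set that up without hand-editing is to split the combined config into one file per profile. The following is only a sketch: it assumes the config is valid JSON with a top-level "profiles" list like the one posted above, the function name split_profiles is made up for illustration, and pool entries are left out because none were posted.

```python
import copy
import json


def split_profiles(config):
    """Return {profile name: copy of config holding only that profile}.

    Assumes a top-level "profiles" list of objects with a "name" key,
    as in the config posted above. Pools are not touched here.
    """
    singles = {}
    for profile in config.get("profiles", []):
        single = copy.deepcopy(config)           # keep all global settings
        single["profiles"] = [copy.deepcopy(profile)]
        singles[profile["name"]] = single
    return singles


# Trimmed-down stand-in for the posted config (two profiles only).
combined = {
    "profiles": [
        {"name": "x11", "algorithm": "darkcoin-mod", "nfactor": "10"},
        {"name": "ScryptN", "algorithm": "zuikkis", "nfactor": "11"},
    ],
    "thread-concurrency": "8192",
    "lookup-gap": "2",
}

for name, cfg in split_profiles(combined).items():
    # Each result could be written to its own file and run with one pool.
    print(name, json.dumps(cfg["profiles"]))
```

Each resulting config still needs a single pool added by hand before it can be run.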

One last thing... Do you run as root? I notice you have /home/sgminer/

mrbrdo · 2014-07-29T10:27:52Z

@ystarnaud it's an older issue.. I was able to reproduce it too. It seems to happen when switching from Scrypt to Scrypt-N.

Bllacky · 2014-07-29T10:32:35Z

Scrypt-N is working in a very strange way.
For me it never starts all cards/threads at the same speed. For instance, I will have one card at 300 KH/s, one at 330 KH/s, one at 350 KH/s, and one at 370 KH/s. All my cards top out at 372 KH/s, and to reach speeds close to 365-372 I have to restart sgminer several times, as well as the rig.

Scrypt-N or its kernel are very capricious.

evolvia31 · 2014-07-29T11:19:17Z

Hi, I have 3 or 4 GPU cards per server. I use more than 20 different pools, so I am not publishing them, but I built my config with one pool and each algorithm's config has worked fine for a few months. The problem began in May, but I don't know exactly which sgminer version introduced it.

Everything works fine for weeks on end, except when I try to switch from X11, X13 or X15 to Scrypt-N.

I have no problem when I switch from Scrypt-N to any other algorithm.
I have no problem when I switch from Scrypt to Scrypt-N.
I have no problem when I switch from X11, X13 or X15 to Scrypt.

The only problem is switching from X11, X13 or X15 to Scrypt-N.
To reproduce the bug, I use a single X11 pool and wait until each card has solved 2 shares; when I then try to switch, I get the error message and 1 or 2 of the 3 GPUs fail.

Restarting sgminer from within does not re-enable the GPUs; I need to quit sgminer, wait 30 seconds, and relaunch it.

Yes, all instances run as root.

platinum4 · 2014-07-30T06:24:31Z

I've given up on nscrypt until at least the winter time; I can build bins fine now, but with the 14.7RC drivers the TC was sliced in half, down to a maximum of 8192. Even then, hardware errors were experienced and the hashrate sucked. If I ever go back to nscrypt it will be with the 13.12 drivers on a dedicated rig, unless AMD comes out with better ones in the meantime.

The only error I notice now when building nf11 is Error -61 (memory size), which is the "decrease TC or increase lookup gap" error, so it's unrelated.
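For reference, in the OpenCL headers -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE and -61 is CL_INVALID_BUFFER_SIZE, so both errors point at buffer memory. A rough way to see how TC, lookup gap and nfactor interact is to estimate the scrypt scratchpad size. This is a sketch, not sgminer's code: it assumes the buffer is sized as 128 bytes per stored entry, ceil(N / lookup gap) entries per work item, times thread concurrency, with N = 2^nfactor (matching the "nfactor 10, n 1024" line in the log). The function name scrypt_pad_bytes is made up for illustration.

```python
def scrypt_pad_bytes(nfactor, lookup_gap, thread_concurrency):
    """Estimate the scrypt scratchpad size in bytes (assumed formula)."""
    n = 1 << nfactor  # nfactor 10 -> N = 1024
    # Entries kept per work item: ceil(N / lookup_gap).
    entries = n // lookup_gap + (1 if n % lookup_gap else 0)
    # 128 bytes per stored entry, one slot per concurrent thread.
    return 128 * entries * thread_concurrency


# Settings taken from the config posted earlier: lookup-gap 2, TC 8192.
for nfactor in (10, 11):
    size = scrypt_pad_bytes(nfactor, lookup_gap=2, thread_concurrency=8192)
    print(f"nfactor {nfactor}: {size / 2**20:.0f} MiB")
```

Under these assumptions the same TC and lookup gap need twice the memory at nfactor 11 as at nfactor 10, which is why halving TC or doubling the lookup gap is the usual remedy. If each GPU thread allocates its own buffer (also an assumption), gpu-threads 2 doubles the total again.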

platinum4 · 2014-07-30T06:26:13Z

Also, as an aside: I've found that this particular kernel, https://github.com/exeminer/exeminer/blob/master/scrypt140202.cl, does not cause any HW errors, and it's different from bufius, ckolivas, alexkarnew and alexkarold.

Bllacky · 2014-07-30T10:56:43Z

If you intend to merge this kernel into SGminer, please try to keep some consistency. We have different commits in master, 5.0 and developer.

platinum4 · 2014-12-15T09:31:24Z

This is largely dependent on scrypt n-factor 11, which sucks balls anyway; for the time being, I shall close this.

platinum4 mentioned this issue Jul 1, 2014

switching from X11 to scrypt-N #264

Closed

platinum4 closed this as completed Jul 1, 2014

platinum4 mentioned this issue Jul 1, 2014

Bug switching kernel "clEnqueueNDRangeKernel" #308

Closed

platinum4 reopened this Jul 1, 2014

mrbrdo added bug labels Jul 2, 2014

mrbrdo added this to the 5.0 milestone Jul 2, 2014

platinum4 closed this as completed Dec 15, 2014
