OpenCL error in JCE that is fixed in XMRSTAK - on HD 6990, 7850 with 14.4 drivers #26
Comments
Hello. The current code for v8 is a partially optimized version of the reference code, the same as in xmrstak, as you guessed. I normally write all my code from scratch, but I had to rush the v8 GPU version, so I made an exception and just took the reference code provided by the Monero team. You can read the Bitcointalk topic where I give details on the causes of the delay. Depending on which is faster, I will either rewrite my implementation, so that the broken line of code simply disappears, or apply the reference fix. |
Fixed the proper way: I rewrote everything and the problem is gone. I tested on driver 14.12 with an HD 6950 (I bet it behaves the same as with your 14.4). |
Awesome! That's great news! Were you able to get more than ~220 h/s on that 6950? The maximum I've been getting on a 6990 (which is 2x 6970s) is around 220 h/s with 14.4 drivers on xmr-stak. The intensity for those cards was 874, I believe, and a worksize of 18, 19 or 22 produced the best results. A worksize of 8 just gave poor performance, which was also the case with the v7 algo before the upgrade. I will try your new version later today, but I was just wondering for now; that might be some sort of benchmark for those old HD 69xx series cards :) |
My card is in a rig with a weak PSU that cannot let it mine at full power. I can only test that the mining works; if I configure it at max, I know the PSU will burn. That HD 6000 rig has the lowest priority, so I'm continuously taking spare parts from it and replacing them with undersized ones. |
Please upgrade to 0.33b4, there's a nice boost for v8 on HD6900 |
Awesome, I'll check it out in the next few hours! :) |
Win 10 64, 14.4 drivers. Used default GPU clocks: 830 core and 1250 memory. No memory timing mods or anything. So that is 2x 270 h/s = ~540 h/s total for the 6990 (which has two 6970 GPU cores in it). This is a very good improvement over current xmr-stak, which gives ~220 h/s per core, or 440 total. However, I think these cards should be able to handle up to 888 intensity before hitting a CL buffer error (they work with 880 intensity and 20 worksize, for example, because 880 is also a multiple of 16). Is there a possibility to implement some mode that would disable this automatic error that requires intensity to always be a multiple of 16, even when we set a completely different worksize not related to 16? All props for the speed improvement so far though, great job! @jceminer I think you should be able to replicate 270 h/s on 6950s with 32 worksize and 864 intensity :) |
This is not an error, it is a technical requirement in JCE. The value is 16 because that's 4x4, which is how some data is grouped inside the miner's memory. |
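To illustrate the constraint, here is a minimal sketch that rounds a requested intensity down to the nearest multiple of 16. It assumes the multiple-of-16 rule is the only restriction; the function name is hypothetical and is not part of JCE.

```python
GROUP = 16  # 4x4 grouping mentioned in the reply above


def round_down_intensity(requested: int, group: int = GROUP) -> int:
    """Return the largest multiple of `group` that does not exceed `requested`.

    Hypothetical helper for illustration only; JCE itself rejects
    values that are not a multiple of 16 rather than rounding them.
    """
    if requested < group:
        raise ValueError(f"intensity must be at least {group}")
    return (requested // group) * group


if __name__ == "__main__":
    # Intensity values mentioned in this thread
    for requested in (874, 880, 888):
        print(requested, "->", round_down_intensity(requested))
```

With the values from this thread, 874 rounds down to 864 and 888 rounds down to 880, while 880 is already valid.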
Okay, thanks for letting me know! Do you think further optimization is possible for these gpus, to go beyond 270h per core? |
I can confirm that JCE works quite nicely: 282 h/s at 930 MHz. Stability is far better than xmr-stak, but with 8 GPUs (4x HD 6990) the 14.4 drivers have problems. And Win 8.1+ only. |
Windows 7 with 14.4 on a 6970 gets 320 h/s; Windows 10 with the same driver gets 220 h/s. Also, if you update past 14.4, it permanently messes up OpenCL, even if you uninstall with DDU... Not sure why. You need to reinstall Windows and not go past 14.4, otherwise you only get 220 h/s... Something gets messed up. |
Wow really? I have to test this at home, if that's the case, and I've been
able to get 270 hash on win 10 with 14.4, then on win 7 it should be far
more than 320 even!
On Sun, 18 Nov 2018 at 02:07, Patrickdurbin <notifications@github.com> wrote:
… Windows 7 with 14.4 on a 6970 gets 320 hash, on Windows 10 with the same driver 220 hash
--
Sava.
|
What settings did you use on Win7 to get 320 h/s? |
Take a look at the standalone .exe; it's an experimental version that should be faster on the HD 6000 series. |
This issue also occurred in xmr-stak (fireice-uk/xmr-stak#1922), but it was fixed in fireice-uk/xmr-stak#1945 and fireice-uk/xmr-stak#1951. Please incorporate the fix into your miner as well.
The 14.4 drivers are the best for HD 6990s; other drivers don't detect the cards' memory properly, so intensity cannot be set higher than a value that gives maybe 10% of the card's maximum performance.
Let me know if any other details are needed!
Here is the error pasted:
For Windows 64-bits
Analyzing Processors topology...
AMD Athlon(tm) II X2 250 Processor
Assembly codename: generic
SSE2 : Yes
SSE3 : Yes
SSE4 : No
AES : No
AVX : No
AVX2 : No
Found CPU 0, with:
L1 Cache: 64 KB
L2 Cache: 1024 KB
Found CPU 1, with:
L1 Cache: 64 KB
L2 Cache: 1024 KB
Detecting OpenCL-capable GPUs...
Found GPU 0, with:
Vendor: AMD
Processor: Cayman
Device: 0b:00
Compute-Units: 24
Cache Memory: 0 KB
Local Memory: 32 KB
Global Memory: 2048 MB
Addressing: 32-bits
Found GPU 1, with:
Vendor: AMD
Processor: Cayman
Device: 0b:00
Compute-Units: 24
Cache Memory: 0 KB
Local Memory: 32 KB
Global Memory: 2048 MB
Addressing: 32-bits
Found GPU 2, with:
Vendor: AMD
Processor: Cayman
Device: 0b:00
Compute-Units: 24
Cache Memory: 0 KB
Local Memory: 32 KB
Global Memory: 2048 MB
Addressing: 32-bits
Found GPU 3, with:
Vendor: AMD
Processor: Cayman
Device: 0b:00
Compute-Units: 24
Cache Memory: 0 KB
Local Memory: 32 KB
Global Memory: 2048 MB
Addressing: 32-bits
Found GPU 4, with:
Vendor: AMD
Processor: Pitcairn
Device: 01:00
Compute-Units: 16
Cache Memory: 16 KB
Local Memory: 32 KB
Global Memory: 2048 MB
Addressing: 64-bits
Preparing 1 Mining Threads...
+-- Thread 0 config ------------------------+
| Run on GPU: 0 |
| Multi-hash: 224 |
| Worksize: 8 |
| Factor Alpha 64 |
| Factor Beta 8 |
+-------------------------------------------+
Cryptonight Variation: Cryptonight V8 fork of Oct-2018
Low intensity.
Starting GPU Thread 0, on GPU 0
Created OpenCL Context for GPU 0 at 000001739fb71920
Created OpenCL Thread 0 Command-Queue for GPU 0 at 00000173a03f1930
Scratchpad Allocation success for OpenCL Thread 0
Allocating big 448MB scratchpad for OpenCL Thread 0...
Compiling kernels of OpenCL Thread 0...
LLVM ERROR: Cannot select: 0x1739f7c0c60: i32 = setcc 0x173a1bcaf10, 0x173a1bcad10, 0x173a0443ed0 [ORD=4762] [ID=3015]
0x173a1bcaf10: i64 = add 0x173a1bcad10, 0x173a1bcec30 [ORD=4756] [ID=3013]
0x173a1bcad10: i64 = add 0x173a1bcac10, 0x173a1bca810 [ORD=4755] [ID=3012]
0x173a1bcac10: i64 = mul 0x173a1bca910, 0x173a1bcab10 [ORD=4754] [ID=3010]
0x173a1bca910: i64 = zero_extend 0x173a1bca410 [ORD=4751] [ID=3008]
0x173a1bca410: i32 = add 0x173a1bcf130, 0x173a1bca310 [ORD=4746] [ID=3007]
0x173a1bcf130: i32 = AMDILISD::VEXTRACT 0x173a1bca210, 0x173a1bc35d0 [ORD=4744] [ID=3006]
0x173a1bca210: v2i32 = AMDILISD::BITCONV 0x173a1bca010 [ORD=4743] [ID=3005]
0x173a1bca010: i64 = add 0x173a1bc9f10, 0x173a1bc9d00 [ORD=4742] [ID=3004]
0x173a1bc9f10: i64 = add 0x173a19bf310, 0x173a1bc9a00 [ORD=4741] [ID=3003]
0x173a1bcad10: i64 = add 0x173a1bcac10, 0x173a1bca810 [ORD=4755] [ID=3012]
0x173a1bcac10: i64 = mul 0x173a1bca910, 0x173a1bcab10 [ORD=4754] [ID=3010]
0x173a1bca910: i64 = zero_extend 0x173a1bca410 [ORD=4751] [ID=3008]
0x173a1bca410: i32 = add 0x173a1bcf130, 0x173a1bca310 [ORD=4746] [ID=3007]
0x173a1bcf130: i32 = AMDILISD::VEXTRACT 0x173a1bca210, 0x173a1bc35d0 [ORD=4744] [ID=3006]
0x173a1bca210: v2i32 = AMDILISD::BITCONV 0x173a1bca010 [ORD=4743] [ID=3005]
0x173a1bca010: i64 = add 0x173a1bc9f10, 0x173a1bc9d00 [ORD=4742] [ID=3004]
0x173a1bc9f10: i64 = add 0x173a19bf310, 0x173a1bc9a00 [ORD=4741] [ID=3003]
0x173a19bf310: i64 = AMDILISD::LCREATE 0x173a19bf820, 0x173a19c6890 [ORD=4731] [ID=2986]
In function: __OpenCL_scratchrounds_kernel
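The log above reports a Multi-hash of 224 and a 448MB scratchpad, which is consistent with CryptoNight's 2 MB scratchpad per hash. A rough sketch of that arithmetic follows; it assumes the big scratchpad is simply multi-hash times 2 MB, and the function name is hypothetical rather than anything JCE exposes.

```python
SCRATCHPAD_MB_PER_HASH = 2  # CryptoNight uses a 2 MB scratchpad per hash


def scratchpad_total_mb(multi_hash: int) -> int:
    """Estimate the total scratchpad size in MB for a given multi-hash value.

    Assumption for illustration: total = multi_hash * 2 MB, with no
    extra buffers counted.
    """
    return multi_hash * SCRATCHPAD_MB_PER_HASH


if __name__ == "__main__":
    # 224 is the Multi-hash value from the log; 880 is an intensity from the thread
    for multi_hash in (224, 880):
        print(multi_hash, "hashes ->", scratchpad_total_mb(multi_hash), "MB")
```

Under this assumption, an intensity of 880 would need about 1760 MB of scratchpad, which still fits within the 2048 MB of global memory reported for each Cayman GPU.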