Os x port #30

Open
justvanbloom opened this issue Nov 5, 2016 · 29 comments

Comments

@justvanbloom

Hi,
I'm working on the OS X port.
I changed the compiler structs and the OpenCL headers but get a bunch of errors. Any clue?

@mbevand
Owner

mbevand commented Nov 5, 2016

I could help you if you showed me the errors you see.

@justvanbloom
Author

well.. successful build but sa-solver fails now.

./sa-solver -v
Solving default all-zero 140-byte header
Found 1 OpenCL platform(s)
Using GPU device ID 0
Building program
OpenCL build failed (-11). Build log follows:
input.cl:80:6: warning: no previous prototype for function 'ht_store'
uint ht_store(uint round, __global char *ht, uint i,
^
input.cl:415:6: warning: no previous prototype for function 'xor_and_store'
uint xor_and_store(uint round, __global char *ht_dst, uint row,
^
input.cl:496:6: warning: no previous prototype for function 'equihash_round'
void equihash_round(uint round, __global char *ht_src, __global char *ht_dst,
^
input.cl:500:11: warning: unused variable 'tlid'
uint tlid = get_local_id(0);
^
input.cl:592:63: error: global variables must have a constant address space qualifier
__kernel __attribute__((reqd_work_group_size(64, 1, 1))) void kernel_round ## 1(__global char *ht_src, __global char *ht_dst, __global uint *debug) { equihash_round(1, ht_src, ht_dst, debug); }
^
input.cl:592:75: error: expected ';' after top level declarator
__kernel __attribute__((reqd_work_group_size(64, 1, 1))) void kernel_round ## 1(__global char *ht_src, __global char *ht_dst, __global uint *debug) { equihash_round(1, ht_src, ht_dst, debug); }
^
;

@justvanbloom
Author

see my changes on https://github.com/justvanbloom/silentarmy/tree/mac

@mbevand
Owner

mbevand commented Nov 7, 2016

Weird. Your system's OpenCL compiler requires defining prototypes? Try adding them and see what happens...

@justvanbloom
Author

justvanbloom commented Nov 8, 2016

ok, i will try. but i think there is another problem as well: "global variables must have a constant address space qualifier". when i add __constant, nothing changes. do you have a hint?

@mbevand
Owner

mbevand commented Nov 8, 2016

Sorry I don't know. Maybe your OpenCL compiler is confused by the syntax? Try removing the whole "__attribute((reqd_work_group_size(64, 1, 1)))"

@justvanbloom
Author

ok, great steps forward now

make runs with only a few warnings now.

i have included ocldump on my fork, perhaps this helps

echo 'const char *ocl_code = R"_mrb_(' >_kernel.h
cpp input.cl >>_kernel.h
echo ')_mrb_";' >>_kernel.h
gcc-6 -O2 -std=gnu99 -pedantic -Wextra -Wall -ggdb -Wno-deprecated-declarations -Wno-overlength-strings -I"/System/Library/Frameworks/OpenCL.framework/Headers/" -c -o main.o main.c
main.c: In function 'store_encoded_sol':
main.c:573:34: warning: left shift of negative value [-Wshift-negative-value]
uint32_t mask = ~(-1 << (8 - x_bits_used));
^~
gcc-6 -O2 -std=gnu99 -pedantic -Wextra -Wall -ggdb -Wno-deprecated-declarations -Wno-overlength-strings -I"/System/Library/Frameworks/OpenCL.framework/Headers/" -c -o blake.o blake.c
blake.c:6:25: warning: 'blake2b_block_len' defined but not used [-Wunused-const-variable=]
static const uint32_t blake2b_block_len = 128;
^~~~~~~~~~~~~~~~~
gcc-6 -O2 -std=gnu99 -pedantic -Wextra -Wall -ggdb -Wno-deprecated-declarations -Wno-overlength-strings -I"/System/Library/Frameworks/OpenCL.framework/Headers/" -c -o sha256.o sha256.c
gcc-6 -o sa-solver main.o blake.o sha256.o -rdynamic -L"/System/Library/Frameworks/OpenCL.framework/Versions/Current/Libraries" -framework OpenCL
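
A side note on the -Wshift-negative-value warning in the log above: left-shifting a negative signed value such as -1 is undefined behaviour in C99, which is why gcc-6 flags it. The following is only a sketch of one way to build the same mask from an unsigned value. The function name low_bits_mask is hypothetical and is not taken from main.c, and it assumes x_bits_used stays in the range 0..8.

```c
#include <stdint.h>

/*
 * Returns a mask with the low (8 - x_bits_used) bits set.
 * Same value as ~(-1 << (8 - x_bits_used)), but the shift is applied to an
 * unsigned all-ones value, so it is well defined.
 * Assumption: 0 <= x_bits_used <= 8.
 */
uint32_t low_bits_mask(uint32_t x_bits_used)
{
    return ~(UINT32_MAX << (8u - x_bits_used));
}
```

For example, low_bits_mask(3) gives 0x1f, the same value the original expression is meant to produce.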

so now when i run

./sa-solver --nonces 1000

i get

Solving default all-zero 140-byte header
Building program
clCreateKernel (-46)
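
For readers decoding the two numeric codes in this thread: in the standard OpenCL headers, -11 is CL_BUILD_PROGRAM_FAILURE and -46 is CL_INVALID_KERNEL_NAME, so the clCreateKernel failure means the requested kernel name was not found in the built program. Below is a small lookup sketch. The helper name cl_error_name is hypothetical (it is not part of sa-solver or OpenCL), and the values are hard-coded so it builds without the OpenCL headers.

```c
/*
 * Maps a few OpenCL status codes to their symbolic names.
 * The numeric values match the standard cl.h definitions; they are
 * written out literally here so no OpenCL header is required.
 * Only the codes relevant to this thread (plus a few neighbours) are listed.
 */
const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case -46: return "CL_INVALID_KERNEL_NAME";
    case -48: return "CL_INVALID_KERNEL";
    default:  return "unknown OpenCL error";
    }
}
```

Printing cl_error_name(status) next to the raw number makes a log line like "clCreateKernel (-46)" easier to act on.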

@mposch

mposch commented Nov 9, 2016

Hi!

I got it working on OSX (hackintosh), at least the solver runs but i am not sure if i have correct results. And i am still seeing the "pipe closed by peer" issue. To get rid of the constant address space error, I removed the define for the KERNEL_ROUND(N) completely and just wrote the 7 method declarations by hand instead.
Justvanbloom, you effectively removed the equihash_round calls so that might be the issue.

./sa-solver --list
Devices on platform "Apple":
ID 0: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
ID 1: GeForce GTX 970

./sa-solver --nonces 10 --use 1

Solving default all-zero 140-byte header
Building program
Hash tables will use 1208.0 MB
Running...
Nonce 0000000000000000000000000000000000000000000000000000000000000000: 2 sols
265 (probably invalid) solutions were dropped!
Nonce 0100000000000000000000000000000000000000000000000000000000000000: 3 sols
Nonce 0200000000000000000000000000000000000000000000000000000000000000: 1 sol
Nonce 0300000000000000000000000000000000000000000000000000000000000000: 0 sols
Nonce 0400000000000000000000000000000000000000000000000000000000000000: 3 sols
Nonce 0500000000000000000000000000000000000000000000000000000000000000: 1 sol
Nonce 0600000000000000000000000000000000000000000000000000000000000000: 4 sols
Nonce 0700000000000000000000000000000000000000000000000000000000000000: 2 sols
Nonce 0800000000000000000000000000000000000000000000000000000000000000: 2 sols
Nonce 0900000000000000000000000000000000000000000000000000000000000000: 2 sols
Total 20 solutions in 1060.5 ms (18.9 Sol/s)

@justvanbloom
Author

Wohoo! Jo let us work on this.
Can you make a pull request on my repo (mac branch?)

@justvanbloom
Author

Or post the funcs you mean here...

@mposch

mposch commented Nov 10, 2016

Well it seems that the CPU works, but the GPU (tested on a GTX970 and ATI Radeon 6750) does not produce correct solutions at the moment.

@justvanbloom
Author

at first i thought it was ok, but then i also saw "Share above target" failures.
but when i run it with --instances=1 and --debug -v, everything seems ok.

@mposch

mposch commented Nov 10, 2016

When testing, should cpu and gpu devices produce the same result?
GPU:
hackpro:silentarmy matthiasposch$ ./sa-solver --nonce 1 -v -v --use 1 2>&1 | grep Soln:
Soln: 0x0: 635 33ddc....
CPU:
hackpro:silentarmy matthiasposch$ ./sa-solver --nonce 1 -v -v --use 0 2>&1 | grep Soln:
Soln: 0x0: 35c 12d31f a2216....

So at the moment it seems the cpu is okay and the gpu is wrong.

@justvanbloom
Author

justvanbloom commented Nov 10, 2016

jvb-MacBook-Pro:silentarmy oliverfolz$ ./sa-solver --nonce 1 -v -v --use 1 2>&1 | grep Soln:
Soln: 0x0: 35c 12d31f a2216 cbc99 ... 16fe42 1ae61b

jvb-MacBook-Pro:silentarmy oliverfolz$ ./sa-solver --nonce 1 -v -v --use 0 2>&1 | grep Soln:
Soln: 0x0: 35c 12d31f a2216 cbc99 ... 16fe42 1ae61b

working?

@justvanbloom
Author

From one tester:
So, now no interesting info:

Devices on platform "Apple":
ID 0: Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz - Max rate - 7.78 H/s (or 15.5 sol/s)
ID 1: GeForce GTX 970 - Max rate - 1.13 H/s (or 5.2 sol/s)

Total 4.6 sol/s [dev1 5.2] 9 shares
Total 4.4 sol/s [dev1 4.9] 9 shares
Total 4.4 sol/s [dev1 4.5] 9 shares
Total 4.4 sol/s [dev1 4.1] 9 shares
Total 4.3 sol/s [dev1 4.3] 10 shares

It is very strange; a gtx 970 should provide around 50 sol/s

@mposch

mposch commented Nov 10, 2016

no, using the gpu i get a different Solution:

hackpro:silentarmy matthiasposch$ ./sa-solver --nonce 1 -v -v --use 1 2>&1 | grep Soln:
Soln: 0x0: 635 33ddc ...... 1411ec

./sa-solver --nonce 20 --use 1 (GTX 970)
Total 44 solutions in 2133.6 ms (20.6 Sol/s)
./sa-solver --nonce 20 --use 0
Total 42 solutions in 36185.6 ms (1.2 Sol/s)

so if 2 instances were running on the gpu, 40 Sol/s could be possible. But still my version does not calculate correctly. I'll have to look into it. It seems that your version provides correct results on the gpu?
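
The Sol/s figures printed by sa-solver in this thread are consistent with simply dividing the solution count by the elapsed time in seconds. A minimal sketch of that arithmetic follows; the function name sols_per_second is hypothetical and this is not claimed to be how main.c computes it.

```c
/*
 * Solutions per second from a solution count and an elapsed time in
 * milliseconds. Returns 0 for a non-positive elapsed time to avoid
 * dividing by zero.
 */
double sols_per_second(unsigned solutions, double elapsed_ms)
{
    if (elapsed_ms <= 0.0)
        return 0.0;
    return (double)solutions / (elapsed_ms / 1000.0);
}
```

With the numbers above, 44 solutions in 2133.6 ms comes out at roughly 20.6 Sol/s, and 42 solutions in 36185.6 ms at roughly 1.2 Sol/s.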

@justvanbloom
Author

justvanbloom commented Nov 10, 2016

Jep. Mine is fine.
Made also implementation for xn-sub.
Have uploaded my bins to beta1mac branch of my fork.

@justvanbloom
Author

Just clone this repo and do tests plz.

https://github.com/justvanbloom/silentarmy/tree/beta1mac?files=1

@mbevand
Owner

mbevand commented Nov 10, 2016

I see you guys are making progress, nice :) As long as you produce the same solutions as testing/sols-100 then you know your port is valid.

@justvanbloom
Author

Thx! Yes. Verified and working. Mined over 1000 sol/s on different pools so far. All shares accepted. But i won't stop here. I have the feeling there is more room.

@mposch

mposch commented Nov 10, 2016

No i am still having the same issue that my gpu is not delivering correct results.

@justvanbloom
Author

Maybe i can port the opencl kernel to metal.

@mposch

mposch commented Nov 10, 2016

I am posting my -v -v -v -v output; maybe one of you can spot why the results are different (the numbers are a mystery to me for now :). Should this output be exactly the same on gpu and cpu?

CPU:
Solving nonce 0000000000000000000000000000000000000000000000000000000000000000
Round 0
row 0xab6b7:
00 06000000 5c030000 | aec71a52 3c273545 244a056d 4c34495a 16b68910 c61d1514
01 ________ d3160400 | ae8710c1 b413ee00 7b026443 39948ed7 d17f099b d8368900
02 ________ b8ba0600 | a2fb67db 8daed24d de6a7633 00ea0a98 635e1932 f38a472a
03 ________ f4530700 | a44c80c4 011d7b69 6940fc5c cf394783 3c1c93c1 47274a8a
04 ________ 54a90900 | adb8f7eb 63c99657 ddd5b038 333a034e 533db3a3 e21aea8d
*05 ________ 1fd31200 | a1249924 afd702c6 71bf5485 ca719a5e 590c5274 ab4c6600
Dropped: 0 (coll) 0 (stor)
Round 1

GPU:
Round 0
row 0xa9ea1:
00 04000000 0d3b0100 | a2c2ea03 74b5505d dd3db55c 7ba995ac 0ecb1580 4ab04600
01 ________ 9fd91100 | a6b2b69e adad167d e3d18a86 098d4a2f e8e7ec31 ed7afe00
02 ________ 92181600 | a11fe04e b3cd5611 6c5dc0cd fccb8120 859ef672 9c370e49
*03 ________ 1fd31200 | a174df2b 42e96062 13a82593 bc6ca4ad bcf2b3ef 9b981300
row 0xb63d1:

@justvanbloom
Author

justvanbloom commented Nov 12, 2016

Cpu in mac version (latest v5 silentarmy) is broken. Gpu works fine.
Eg
GeForce GTX 970
Total 53.2 sol/s [dev1 54.1] 12 shares
Total 52.7 sol/s [dev1 52.8] 12 shares
Amd radeon r9 390
Up to 125 sol/s
:D
Thx for being part of this. I can make a pull request with my latest source @mbevand

@mposch

mposch commented Nov 12, 2016

What type of system do you use to mine, a real mac? When i find time i will test the kernel under linux.

@justvanbloom
Author

macbook pro, imac and hackintosh.

@mposch

mposch commented Nov 13, 2016

Using linux everything works fine and i get 45 Sols/s on my gtx 970

It seems that cpp (clang) does not like the ## concatenation within a #define. If you make sure that cpp-6 (the brew version) of the preprocessor is used during build, you could revert to the original input.cl kernel.
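
To illustrate the point about ##: the build log earlier in this thread still shows a literal "kernel_round ## 1(" in the compiled source, which is what you would see if the preprocessor left the paste operator unexpanded. The sketch below shows, in plain C, what the macro is supposed to produce and the hand-written alternative that avoids ## entirely. The names ROUND_FN, round_N and run_round are hypothetical stand-ins, not the identifiers used in input.cl.

```c
/* Hypothetical stand-in for the real per-round work. */
unsigned run_round(unsigned round)
{
    return round * 10u;
}

/*
 * Token pasting: ROUND_FN(1) is meant to expand to a function named
 * round_1. A preprocessor that does not implement ## leaves the text
 * "round_ ## 1" in the output, which the compiler then rejects.
 */
#define ROUND_FN(N) \
    unsigned round_ ## N(void) { return run_round(N); }

ROUND_FN(0)
ROUND_FN(1)

/* Hand-written equivalent of ROUND_FN(2); no ## needed. */
unsigned round_2(void)
{
    return run_round(2);
}
```

Writing the seven kernels out by hand, as described earlier in the thread, corresponds to the round_2 style; switching to a preprocessor that handles ## keeps the macro form.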

see https://github.com/mposch/silentarmy/blob/master/Makefile

@CoderYgs

I see @justvanbloom said he could maybe port the OpenCL kernel to Metal. Is that feasible? thx

@taomanwai

@justvanbloom How did you solve the "make" error on Mac? And how do I find the path for LIBOPENCL? There is no libOpenCL.so anywhere on my Mac.
