Mac support #17

Merged
fireice-uk merged 3 commits into fireice-uk:dev from the topic-appleOpenCLHeader branch on Jun 25, 2017

psychocrypt
Collaborator

@psychocrypt psychocrypt commented Apr 8, 2017

This pull request integrates the changes from #10.

Changes

  • include OpenCL/cl.h on Mac
  • add option to emulate amd_bfe and amd_bitalign

My changes on top of kth5's pull request

  • remove the nonexistent pragma #pragma "Emulation *"
  • optimize amd_bfe (remove check for the edge case)
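
For readers unfamiliar with the intrinsic being emulated, here is a minimal host-side C sketch of an unsigned bit-field extract. It follows the semantics the Khronos cl_amd_media_ops2 extension text gives for amd_bfe, as far as that text can be read. The name bfe_emulated is hypothetical, and this is not the code from this PR. In particular, the sketch keeps the zero-width edge case that the PR removes.

```c
#include <stdint.h>

/* Hypothetical name; an illustrative emulation, not the PR's implementation.
 * Extracts `width` bits from `src`, starting at bit `offset`. */
uint32_t bfe_emulated(uint32_t src, uint32_t offset, uint32_t width)
{
    offset &= 31u;   /* only the low 5 bits of each operand are used */
    width  &= 31u;

    if (width == 0u)
        return 0u;   /* edge case: empty field */

    if (offset + width < 32u)
        /* move the field to the top of the word, then shift it back down
         * with a logical (unsigned) right shift */
        return (src << (32u - offset - width)) >> (32u - width);

    return src >> offset;   /* field runs up to bit 31 */
}
```

All operands are unsigned, so every right shift is a logical shift, and each shift amount stays in the range 0 to 31.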

Tests

  • runtime test on Linux (native AMD intrinsics)
  • runtime test on Linux (emulated amd_bfe/amd_bitalign)
  • @kth5 Mac test
  • please do not merge before #39 (merge master back to dev) is merged; this branch needs a rebase

@psychocrypt
Collaborator Author

psychocrypt commented Apr 8, 2017

@fireice-uk I still need to check my changes on Linux using the emulated functions instead of the AMD intrinsics.

@kth5 Could you please run this branch on your Mac? By the way, the fix for the worksize is not included in this branch and is not needed for a validation test.

@fireice-uk
Owner

@psychocrypt Let me know when the code is ready to be merged (I don't think those tick widgets will ping me).

@kth5

kth5 commented Apr 9, 2017

I'll run it for about an hour later today and report back. So far the miner performs the same as with my PR alone. I'm still waiting for shares to get an idea of the average.

@kth5

kth5 commented Apr 10, 2017

@psychocrypt I've had it running for around an hour now. Because the hashrate is still slow, just 60 H/s on an R9 M395X 4GB in my iMac, it only found a few shares, but it seems OK:

image

There are still GPU compute errors, but it seems to me these may be expected? Someone with more in-depth knowledge of the algorithms involved may have to look at the emulated amd_bfe and amd_bitalign to make sure, I guess.

@psychocrypt
Collaborator Author

I can reproduce a low rate of wrong shares under Linux if I use the emulated functions. I will invest time to find the issue as soon as the NVIDIA miner is released.

@kth5

kth5 commented Apr 10, 2017

If it helps, I'll extend my offer for an account on a recent Mac with AMD graphics. Just let me know so we can schedule.

Apart from that, even if the emulated functions aren't accurate enough, under 100 H/s seems very low to me, since I get around the same with a much lower-end R5 240 on Ubuntu.

@psychocrypt
Collaborator Author

After I find the bug I will rebase against #13; this should fix your low hashrate. You need to increase the worksize to something other than 1.

@fireice-uk fireice-uk mentioned this pull request May 26, 2017
@richard-underwood

I'm seeing up to 50% rejected submissions - could the lack of casts in the amd_bitalign mean it's doing a signed shift when it shouldn't be?

@psychocrypt psychocrypt mentioned this pull request Jun 2, 2017
@psychocrypt
Collaborator Author

@richard-underwood You pointed to the correct place. It was not a signedness issue: the function needs a cast to 64 bits and then back to 32 bits. The parentheses were also misplaced.

I fixed it in the last commit, and it is running on my system with 100% valid hashes.
Could you please test the latest changes?
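
To make the cast issue concrete, here is a minimal host-side C sketch of a bit-align emulation. It follows the amd_bitalign semantics described in the Khronos cl_amd_media_ops extension text: concatenate the two 32-bit inputs into one 64-bit value, shift right by the low five bits of the third input, and keep the low 32 bits. The function name bitalign_emulated is hypothetical, and the sketch is not the code committed in this PR.

```c
#include <stdint.h>

/* Hypothetical name; illustrative only, not the code committed in this PR.
 * Treats hi:lo as one 64-bit value, shifts it right by (shift mod 32)
 * and returns the low 32 bits of the result. */
uint32_t bitalign_emulated(uint32_t hi, uint32_t lo, uint32_t shift)
{
    /* widen BEFORE shifting, so that hi really lands in bits 32..63 */
    uint64_t wide = ((uint64_t)hi << 32) | (uint64_t)lo;

    /* shift the full 64-bit value, then narrow back to 32 bits */
    return (uint32_t)(wide >> (shift & 31u));
}
```

With hi = 0x12345678, lo = 0x9ABCDEF0 and shift = 8, this returns 0x789ABCDE. Both the cast and the parentheses matter: the widening has to happen before the left shift (shifting a 32-bit value by 32 is undefined in C), and the right shift has to apply to the whole 64-bit expression rather than to lo alone.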

@psychocrypt
Collaborator Author

@kth5 Could you please try the latest version of the PR?

@richard-underwood

Interestingly, I still get GPU compute errors.

@psychocrypt
Collaborator Author

@richard-underwood Could you please check by hand that you are using the latest changes?

@richard-underwood

I've just fetched it again with:

git clone https://github.com/fireice-uk/xmr-stak-amd.git
cd xmr-stak-amd
git checkout dev
wget 'https://github.com/fireice-uk/xmr-stak-amd/pull/17.diff'
patch -p1 < 17.diff
cmake -DOpenSSL_ENABLE=OFF -DMICROHTTPD_ENABLE=OFF .
make

This is still throwing errors:

RESULT REPORT
Difficulty       : 1000
Good results     : 14 / 17 (82.4 %)
Avg result time  : 3.1 sec
Pool-side hashes : 14000

Error details:
| Count | Error text                       | Last seen           |
|     3 | [GPU COMPUTE ERROR]              | 2017-06-05 15:15:17 |

@snazzybunny

My MacBook Pro with an AMD 460 GPU seems to freeze when I follow richard-underwood's instructions, and I have to do a hard reboot. Also, it only shows a platform index of 0. Using platform index = 1 does not allow the miner to start. I think it only sees one OpenCL device instead of two.

@richard-underwood

I'm running on an iMac with one GPU - I have no idea what happens on a laptop which switches between integrated and discrete GPU, but I can take a look later.

@richard-underwood

Is there documentation of the protocol/algorithms used anywhere? I intentionally broke SKEIN_ROT() and the result was pretty much the same percentage of failed hashes.

@kth5

kth5 commented Jun 13, 2017

@psychocrypt Did a run of the PR a while ago, here are the results:

image

This is on a MacBook Pro mid-2015 with a R9 M370X 2GB on Sierra.

@gregordoltar

I am experiencing the same problem as snazzybunny, but on my iMac running Sierra with an AMD Radeon R9 M390 2GB.

The binary was compiled with:

git clone https://github.com/fireice-uk/xmr-stak-amd.git
cd xmr-stak-amd
git checkout dev
wget 'https://github.com/fireice-uk/xmr-stak-amd/pull/17.diff'
patch -p1 < 17.diff
cmake -DOpenSSL_ENABLE=OFF -DMICROHTTPD_ENABLE=OFF .
make

I had to modify my config.txt because some devices were not found:


-"gpu_thread_num" : 6,
+"gpu_thread_num" : 1,
 "gpu_threads_conf" : [ 
        { "index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false },
-       { "index" : 1, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false },
-       { "index" : 2, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false },
-       { "index" : 3, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false },
-       { "index" : 4, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false },
-       { "index" : 5, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false },
 ],

I run the binary as a normal user (not root). The system does not crash. I can still move the mouse, but the UI is not responsive.

@ArtSabintsev

Mac user here. Are installation instructions available, or do I simply run make?

@ArtSabintsev

Alright, got CMake to work using:

cmake -DOPENSSL_ROOT_DIR=/usr/local/opt/openssl .

However, running make throws the following error on @psychocrypt's branch:

Arthur: ~/Documents/xmr-stak-amd 
(master)$ make
Scanning dependencies of target xmr-stak-amd
[  5%] Building C object CMakeFiles/xmr-stak-amd.dir/crypto/c_blake256.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 11%] Building C object CMakeFiles/xmr-stak-amd.dir/crypto/c_groestl.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 16%] Building C object CMakeFiles/xmr-stak-amd.dir/crypto/c_jh.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 22%] Building C object CMakeFiles/xmr-stak-amd.dir/crypto/c_keccak.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 27%] Building C object CMakeFiles/xmr-stak-amd.dir/crypto/c_skein.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 33%] Building C object CMakeFiles/xmr-stak-amd.dir/crypto/soft_aes.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 38%] Building CXX object CMakeFiles/xmr-stak-amd.dir/crypto/cryptonight_common.cpp.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
[ 44%] Building C object CMakeFiles/xmr-stak-amd.dir/amd_gpu/gpu.c.o
clang: warning: argument unused during compilation: '-s' [-Wunused-command-line-argument]
In file included from /Users/Arthur/Documents/xmr-stak-amd/amd_gpu/gpu.c:42:
/Users/Arthur/Documents/xmr-stak-amd/amd_gpu/gpu.h:3:10: fatal error: 'CL/cl.h' file not found
#include <CL/cl.h>
         ^
1 error generated.
make[2]: *** [CMakeFiles/xmr-stak-amd.dir/amd_gpu/gpu.c.o] Error 1
make[1]: *** [CMakeFiles/xmr-stak-amd.dir/all] Error 2
make: *** [all] Error 2

(I'm here because I saw a post on Reddit saying that Mac testers were needed. As a potential Mac user, I think a README across all xmr-stak-* repos would be nice as well.)
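
The 'CL/cl.h' file not found error in the log above is what the first item of this PR addresses: Apple ships the OpenCL headers under OpenCL/ rather than CL/. A minimal sketch of the usual platform switch follows. The macro name XMR_OPENCL_HEADER and the helper opencl_header_path are made up for illustration, and the real include is left as a comment so the snippet compiles without an OpenCL SDK installed.

```c
#include <stdio.h>

/* Hypothetical macro name, used only in this sketch. */
#if defined(__APPLE__)
#  define XMR_OPENCL_HEADER "OpenCL/cl.h"   /* macOS framework layout */
#else
#  define XMR_OPENCL_HEADER "CL/cl.h"       /* Khronos layout used elsewhere */
#endif

/* In real source the macro would feed a computed include:
 *     #include XMR_OPENCL_HEADER
 * It stays commented out here so the sketch builds without OpenCL. */

/* Hypothetical helper: reports which header path was selected. */
const char *opencl_header_path(void)
{
    return XMR_OPENCL_HEADER;
}
```

Building on the unpatched master branch skips a switch like this, which would explain why the build only works once the dev patches are applied.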

@snazzybunny

snazzybunny commented Jun 16, 2017

I would try to add the dev patches as shown in the above instructions and then make.

@ArtSabintsev

Thanks. Got it working, though I only have one GPU, so it took a while to get these results and unfreeze my computer:

Arthur: ~/Documents/xmr-stak-amd 
(dev)$ ./xmr-stak-amd 
[2017-06-15 21:08:32] : Compiling code and initializing GPUs. This will take a while...
[2017-06-15 21:08:32] : Device 0 work size 8 / 256.
-------------------------------------------------------------------
xmr-stak-amd 1.1.0-1.4.0-dev mining software, AMD Version.
AMD mining code was written by wolf9466.
Brought to you by fireice_uk under GPLv3.

Configurable dev donation level is set to 1.0 %

You can use following keys to display reports:
'h' - hashrate
'r' - results
'c' - connection
-------------------------------------------------------------------
[2017-06-15 21:08:36] : Starting GPU thread, no affinity.
[2017-06-15 21:08:36] : Connecting to pool monerohash.com:3333 ...
[2017-06-15 21:08:36] : Connected. Logging in...
[2017-06-15 21:08:36] : Difficulty changed. Now: 5000.
[2017-06-15 21:08:36] : New block detected.
[2017-06-15 21:09:51] : New block detected.
[2017-06-15 21:11:25] : Result accepted by the pool.
[2017-06-15 21:12:26] : New block detected.
[2017-06-15 21:13:49] : New block detected.
[2017-06-15 21:14:13] : New block detected.
[2017-06-15 21:16:27] : New block detected.
[2017-06-15 21:17:13] : New block detected.
[2017-06-15 21:17:57] : Result accepted by the pool.
[2017-06-15 21:17:58] : New block detected.
[2017-06-15 21:21:03] : New block detected.
[2017-06-15 21:21:46] : New block detected.
[2017-06-15 21:22:40] : Result accepted by the pool.
[2017-06-15 21:22:40] : New block detected.
[2017-06-15 21:24:03] : Result accepted by the pool.
[2017-06-15 21:25:28] : Result accepted by the pool.
[2017-06-15 21:25:56] : Result accepted by the pool.
[2017-06-15 21:26:24] : Result accepted by the pool.
[2017-06-15 21:27:38] : New block detected.
[2017-06-15 21:28:37] : New block detected.
[2017-06-15 21:29:12] : New block detected.
[2017-06-15 21:31:44] : New block detected.
[2017-06-15 21:36:25] : New block detected.
[2017-06-15 21:36:50] : New block detected.
[2017-06-15 21:39:00] : Result accepted by the pool.

Also adding visual proof.
screen shot 2017-06-15 at 9 41 05 pm

@snazzybunny

@ArtSabintsev Does it crash for you?

@ArtSabintsev

@snazzybunny No, but again, I have one GPU on this machine, and it was pegged the entire time, so I only let it run for 20-25 minutes - worked without fail.

@Panzerfather

Panzerfather commented Jun 17, 2017

@psychocrypt Please correct me if I'm wrong, but now that the AMD-specific functions are emulated in code, shouldn't this also work with Intel OpenCL drivers? If so, it simply doesn't. When I tested it with Intel 42xx Haswell CPUs running Windows, it produced 100% GPU compute errors. Maybe there is a rare compute condition on AMD Macs and a permanent condition on Intel GPUs?

The program was built with Intel's latest OpenCL SDK.

Edit: Just a side note which may help you isolate the problem: I have also noticed rare GPU compute errors with AMD Radeon 270/280 and AMD Radeon 480 cards on Windows and Linux. They amount to around 1-2%, but not in every run of the miner. For example, the miner ran 7 days without a compute error, and right now, after running the same build for ~2 days following a reboot, it is at 971/988 (98.3%) with 17 GPU compute errors. It seems these problems are triggered by some other (rare) condition rather than by the new functions added in this PR.

@psychocrypt
Collaborator Author

@Panzerfather I have not tested it on Intel, but I can do so. If you get compute errors, please post the output so that we can see the kind of error.

@kth5 The error in the post is a network error and is independent of this PR.

kth5 and others added 3 commits June 21, 2017 09:35
- include `OpenCL/cl.h` on Mac
- add option to emulate `amd_bfe` and `amd_bitalign`
- remove edge case check in emulated `amd_bfe` function
- remove unknown `#pragma "Emulation *"`
- add clean inline functions for `amd_bitalign`
- change `amd_bfe` implementation
- add links to the original documentation from Khronos for
  `amd_bitalign` and `amd_bfe`
@psychocrypt
Collaborator Author

@fireice-uk Could you please merge this PR? If something does not work 100% on macOS, it is not an issue because it does not affect Linux. This means there are no side effects on the currently supported environments.

@Panzerfather

@psychocrypt Of course, it looks like this:

RESULT REPORT
Difficulty       : 1000
Good results     : 0 / 17 (0 %)
Avg result time  : 4.2 sec
Pool-side hashes : 0

Error details:
| Count | Error text                       | Last seen           |
|    17 | [GPU COMPUTE ERROR]              | 2017-06-22 22:34:15 |

I've run other tests with more than 100 miscalculated hashes and 0 good results [all GPU compute errors] on 2 different Intel GPUs.

Is there an example input with its correct output hash to work with for debugging purposes, or do I have to set up a testing environment myself with a working card to locate the miscalculation on Intel cards? The testing network isn't an option because you have to know what the correct internal and result values should be.

@fireice-uk
Owner

@psychocrypt ok - merging without much review, this one is on you =)

@fireice-uk fireice-uk merged commit 3848669 into fireice-uk:dev Jun 25, 2017
@psychocrypt psychocrypt deleted the topic-appleOpenCLHeader branch June 25, 2017 16:56
@psychocrypt
Collaborator Author

@Panzerfather Which OS and which GPU did you use?

@Panzerfather

@psychocrypt Just take a look at #17 (comment) 😉

The 100% GPU compute errors are on Intel NUC systems with the latest drivers. All systems run at standard voltage and are not overclocked.

@psychocrypt psychocrypt mentioned this pull request Jun 26, 2017
@psychocrypt
Collaborator Author

@Panzerfather Please switch to #56 for the discussion about Intel GPU support.

@mihaiile

I tried the latest dev branch as of today on an iMac 2017 (which has the Radeon Pro 580), but I get a LOT of GPU compute errors, around 40-60%. :(

@fireice-uk
Owner

Unfortunately, those are most often due to the hardware: the card is either too hot or too old.


9 participants