hashcat on iMac Pro with Vega 56 #1497

Open
guru431 opened this Issue Jan 23, 2018 · 18 comments

@guru431

guru431 commented Jan 23, 2018

I have a new iMac Pro with an AMD Radeon Vega 56 video card.
The iMac has two operating systems: macOS High Sierra and Windows 10.
In both of them I tried to run hashcat benchmark.
And under macOS, the speed of hashcat is almost two times lower.
Why is this happening?

Windows:

hashcat64 -m 2500 -w 3 -b
hashcat (v4.0.1) starting in benchmark mode...

OpenCL Platform #1: Advanced Micro Devices, Inc.

  • Device #1: gfx901, 6732/8176 MB allocatable, 56MCU
  • Device #2: Intel(R) Xeon(R) W-2140B CPU @ 3.20GHz, skipped.

Benchmark relevant options:

  • --workload-profile=3

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#1.....: 352.0 kH/s (77.70ms)

macOS:

hashcat -m 2500 -w 3 -b
hashcat (v4.0.1) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Xeon(R) W-2140B CPU @ 3.20GHz, skipped.
  • Device #2: AMD Radeon Pro Vega 56 Compute Engine, 2044/8176 MB allocatable, 56MCU

Benchmark relevant options:

  • --workload-profile=3

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#2.....: 199.8 kH/s (68.96ms)
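For reference, the gap between the two runs above can be put into numbers with a quick calculation. This is only a minimal sketch: the variable names are illustrative, and the two figures are copied from the Windows and macOS benchmark outputs in this post.

```python
# Speeds copied from the two benchmark runs above (both reported in kH/s).
windows_khs = 352.0  # Windows 10, "Advanced Micro Devices, Inc." OpenCL platform
macos_khs = 199.8    # macOS High Sierra, "Apple" OpenCL platform

# Fraction of the Windows speed that the macOS run reaches.
ratio = macos_khs / windows_khs

# How many times faster the Windows run is.
slowdown = windows_khs / macos_khs

print(f"macOS reaches {ratio:.1%} of the Windows speed")
print(f"Windows is {slowdown:.2f}x faster")
```

With these inputs the macOS run comes out at a little under 57% of the Windows speed, which matches the "almost two times lower" description.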

@neheb

Contributor

neheb commented Jan 23, 2018

Because Apple's drivers suck. Simple as that.

@guru431

guru431 commented Jan 23, 2018

Is using Windows the only way out?
I basically have to work in macOS. Is there still some way to raise the speed? Is there something similar to the -w 3 option, or are there unofficial patched drivers for Vega?

@neheb

Contributor

neheb commented Jan 23, 2018

Let's see. You could virtualize the GPU. Maybe?

ssh into a system which is properly set up (Windows or Linux).

I wouldn't hold my breath for Apple to make proper drivers. One research area would be to see if Apple's OpenCL runtime allows inline assembly so that some of the more advanced features of Vega can be used. Don't hold your breath though.

@emwinkler

emwinkler commented Jan 31, 2018

Have you tried version 3.5.0? I too am running macOS High Sierra 10.13.3, but with an AMD RX 580 in an eGPU on a MacBook Air. I see terrible benchmarks with v4.0.1, as shown below.

hashcat-3.5.0 results:
./hashcat -m 2500 -w 3 -b -d 3
hashcat () starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, skipped.
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Hashtype: WPA/WPA2

Speed.Dev.#3.....: 214.1 kH/s (83.32ms)

hashcat-4.0.1 results:
./hashcat -m 2500 -w 3 -b -d 3
hashcat (v4.0.1) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, skipped.
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Benchmark relevant options:

  • --opencl-devices=3
  • --workload-profile=3

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#3.....: 59432 H/s (72.00ms)
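Note that the two runs above report their speeds in different units (kH/s for 3.5.0, H/s for 4.0.1), so they have to be normalized before comparing. The sketch below does that with a small parser; `parse_speed`, `SPEED_RE` and `UNIT_FACTORS` are hypothetical helper names written for this illustration and are not part of hashcat, and the regular expression only assumes the line layout visible in the outputs quoted in this thread.

```python
import re

# Multipliers for the unit prefixes seen in hashcat speed lines.
UNIT_FACTORS = {"H/s": 1, "kH/s": 1e3, "MH/s": 1e6, "GH/s": 1e9}

# Matches e.g. "Speed.Dev.#3.....: 214.1 kH/s (83.32ms)".
SPEED_RE = re.compile(r"Speed\.Dev\.#(\d+)\.*:\s*([\d.]+)\s*([kMG]?H/s)")


def parse_speed(line):
    """Return (device_id, hashes_per_second), or None if the line does not match."""
    m = SPEED_RE.search(line)
    if m is None:
        return None
    device, value, unit = m.groups()
    return int(device), float(value) * UNIT_FACTORS[unit]


# The two speed lines quoted above for the RX 580 on macOS.
old = parse_speed("Speed.Dev.#3.....: 214.1 kH/s (83.32ms)")  # hashcat 3.5.0
new = parse_speed("Speed.Dev.#3.....: 59432 H/s (72.00ms)")   # hashcat 4.0.1

factor = old[1] / new[1]
print(f"v4.0.1 is {factor:.1f}x slower than v3.5.0 on device #{new[0]}")
```

Normalized this way, the 4.0.1 result is roughly 3.6 times slower than the 3.5.0 result on the same device.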

@jsteube

Member

jsteube commented Feb 1, 2018

I can't reproduce with almost the same GPU (RX480).

v4.1.0 (it's the RC1 of v4.0.1 - no kernel changes):

root@sf:~/hashcat# ./hashcat -b -m 2500
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Advanced Micro Devices, Inc.
================================================
* Device #1: Ellesmere, 3254/4077 MB allocatable, 36MCU

Benchmark relevant options:
===========================
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#1.....:   173.8 kH/s (52.28ms)

Started: Thu Feb  1 09:19:45 2018
Stopped: Thu Feb  1 09:20:08 2018
root@sf:~/hashcat# ./hashcat -b -m 2500 -w 3
hashcat (v4.1.0) starting in benchmark mode...

OpenCL Platform #1: Advanced Micro Devices, Inc.
================================================
* Device #1: Ellesmere, 3254/4077 MB allocatable, 36MCU

Benchmark relevant options:
===========================
* --workload-profile=3

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#1.....:   174.0 kH/s (102.26ms)

Started: Thu Feb  1 09:20:15 2018
Stopped: Thu Feb  1 09:20:28 2018

v3.5.0:

root@sf:~/hashcat# ./hashcat -b -m 2500 -w 3
hashcat (v3.5.0) starting in benchmark mode...

OpenCL Platform #1: Advanced Micro Devices, Inc.
================================================
* Device #1: Ellesmere, 3829/4077 MB allocatable, 36MCU

Hashtype: WPA/WPA2

Speed.Dev.#1.....:   173.1 kH/s (105.89ms)

Started: Thu Feb  1 09:21:21 2018
Stopped: Thu Feb  1 09:21:32 2018

Same speed. Without being able to reproduce I can't fix it.
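To make "same speed" concrete, the relative spread between the three Linux runs can be computed as follows. This is an illustrative sketch only; the dictionary keys are labels chosen here, and the values are the kH/s figures from the outputs above.

```python
# Speeds (kH/s) from the three Linux runs above on the Ellesmere device.
speeds_khs = {
    "v4.1.0 (default)": 173.8,
    "v4.1.0 -w 3": 174.0,
    "v3.5.0 -w 3": 173.1,
}

lowest = min(speeds_khs.values())
highest = max(speeds_khs.values())

# Relative difference between the slowest and the fastest run.
spread = (highest - lowest) / lowest

print(f"spread between slowest and fastest run: {spread:.2%}")
```

The spread is well under 1%, so none of these runs shows the regression reported on macOS.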

@jsteube

Member

jsteube commented Feb 1, 2018

I've added some code to disable code caching, a shot in the dark. Can you please pull master, recompile and retry?

@emwinkler

emwinkler commented Feb 1, 2018

I will test later and let you know. I do know that it only occurs with hashmode 2500. All others benchmark equally or better with 4.0.1.

@emwinkler

emwinkler commented Feb 2, 2018

Same result with the latest version 4.1.0:
./hashcat -b -m 2500 -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, skipped.
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Benchmark relevant options:

  • --opencl-devices=3
  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....: 58793 H/s (72.83ms)

Started: Fri Feb 2 14:52:50 2018
Stopped: Fri Feb 2 14:53:07 2018

I went back and tested the old versions of hashcat I had, and the issue is present in all the 4.0 versions I have. The 3.6 versions have no issue, as shown below:

./hashcat -b -m 2500 -d 3
hashcat (v3.6.0-3-ge87fb31d) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, skipped.
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Hashtype: WPA/WPA2

Speed.Dev.#3.....: 201.4 kH/s (83.16ms)

Started: Fri Feb 2 15:05:56 2018
Stopped: Fri Feb 2 15:06:03 2018

@jsteube

Member

jsteube commented Feb 3, 2018

I've added some more code to change certain parameters, another shot in the dark, since I do not have access to that system and cannot reproduce locally. Can you please pull master, recompile and retry again?

@emwinkler

emwinkler commented Feb 3, 2018

Same result.

./hashcat -b -m 2500 -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, skipped.
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Benchmark relevant options:

  • --opencl-devices=3
  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....: 76867 H/s (54.31ms) @ Accel:1024 Loops:8 Thr:64 Vec:1

Started: Sat Feb 3 11:05:38 2018
Stopped: Sat Feb 3 11:06:03 2018

@philsmd

Member

philsmd commented Feb 3, 2018

Are we actually sure that this is not the same problem reported here: #1290 ?

@emwinkler can you please run the test mentioned here: #1290 (comment)

@jsteube

Member

jsteube commented Feb 13, 2018

Please retry with latest github master version. It's important to run make clean this time.

@emwinkler

emwinkler commented Feb 13, 2018

Whatever you changed fixed the benchmark speed for my AMD Radeon RX 580:

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....: 213.6 kH/s (85.01ms) @ Accel:128 Loops:64 Thr:256 Vec:1

@jsteube

Member

jsteube commented Feb 14, 2018

Please redo the test with the latest github master. I had to make some changes because this workaround broke other hash modes. But with a bit of luck the latest changes lead to the same result.

@emwinkler

emwinkler commented Feb 14, 2018

The change you made breaks the benchmark speed for my AMD Radeon RX 580. It is slow again:

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....: 62451 H/s (72.57ms) @ Accel:64 Loops:32 Thr:256 Vec:1

@jsteube

Member

jsteube commented Feb 14, 2018

Pull again and retry please

@emwinkler

emwinkler commented Feb 14, 2018

No change. Still the slow benchmark.

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....: 62638 H/s (72.42ms) @ Accel:64 Loops:32 Thr:256 Vec:1
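The speed lines from the newer builds carry a trailing "@ Accel:… Loops:… Thr:… Vec:…" section, and those values differ between the fast and slow runs reported in this thread. The sketch below pulls them out so the runs can be compared side by side; `parse_tuning` and `PARAM_RE` are hypothetical names for this illustration, the run labels are chosen here, and the input lines are copied verbatim from the comments above.

```python
import re

# Matches the "Name:value" pairs printed after "@" in a speed line.
PARAM_RE = re.compile(r"(Accel|Loops|Thr|Vec):(\d+)")


def parse_tuning(line):
    """Return the values after '@' as a dict, or an empty dict if there are none."""
    _, _, tail = line.partition("@")
    return {name: int(value) for name, value in PARAM_RE.findall(tail)}


# Speed lines quoted in this thread for the RX 580 on macOS.
runs = {
    "Feb 3 (slow)": "Speed.Dev.#3.....: 76867 H/s (54.31ms) @ Accel:1024 Loops:8 Thr:64 Vec:1",
    "Feb 13 (fast)": "Speed.Dev.#3.....: 213.6 kH/s (85.01ms) @ Accel:128 Loops:64 Thr:256 Vec:1",
    "Feb 14 (slow)": "Speed.Dev.#3.....: 62451 H/s (72.57ms) @ Accel:64 Loops:32 Thr:256 Vec:1",
}

for label, line in runs.items():
    print(f"{label:14} {parse_tuning(line)}")
```

This only tabulates what the logs already show; it does not by itself explain which of the values is responsible for the slowdown.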

@emwinkler

emwinkler commented Feb 26, 2018

Running on the same hardware after booting into Ubuntu 16.04, I get normal benchmarks with the RX 580. Looks to be a macOS driver issue as you suggest.

OpenCL Platform #1: Intel(R) Corporation

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.

OpenCL Platform #2: Advanced Micro Devices, Inc.

  • Device #2: Ellesmere, 4048/6472 MB allocatable, 36MCU

Benchmark relevant options:

  • --opencl-devices=2
  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#2.....: 212.3 kH/s (85.51ms) @ Accel:128 Loops:64 Thr:256 Vec:1
