v. 3.6 - Apple. AMD Radeon R9 M395 @ 1/3 speed #1290

Open
diegodieguex opened this Issue Jul 8, 2017 · 41 comments

7 participants

Member

jsteube commented Jul 11, 2017

Please retry with the new GitHub master version, and make sure to remove the cached kernels in kernels/ first.
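A minimal sketch of that cleanup step, assuming the cached kernels sit in a `kernels/` directory inside the hashcat checkout, as the comment implies. The helper name `clear_kernel_cache` and its argument are hypothetical; adjust the path if your build keeps the cache elsewhere.

```shell
# Hypothetical helper: remove the cached OpenCL kernels so hashcat rebuilds them.
# Assumption: the cache is the kernels/ directory inside the hashcat checkout.
clear_kernel_cache() {
    hashcat_dir="${1:-.}"              # defaults to the current directory
    cache_dir="$hashcat_dir/kernels"
    if [ -d "$cache_dir" ]; then
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "no cache at $cache_dir"
    fi
}
```

Calling `clear_kernel_cache /path/to/hashcat` before rebuilding should leave no stale kernels behind, provided the assumption about the cache location holds.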

diegodieguex commented Jul 11, 2017

Member

jsteube commented Aug 26, 2017

Please retry with the new GitHub master version again, and make sure to remove the cached kernels in kernels/ first.

It's really hard to debug this, since the error only shows up on your system (I can't reproduce it locally).

diegodieguex commented Aug 29, 2017

ok but no success

http://textuploader.com/d6ma3

diegodieguex commented Aug 29, 2017

hashcat (v3.6.0-454-g12295dcd) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, skipped.
  • Device #2: AMD Radeon R9 M395 Compute Engine, 512/2048 MB allocatable, 28MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....: 29942 H/s (117.17ms)

Started: Tue Aug 29 15:44:37 2017
Stopped: Tue Aug 29 15:44:51 2017

Contributor

neheb commented Aug 29, 2017

Honestly, since you're the only one seeing this issue, I recommend bisecting.

soxrok2212 commented Sep 13, 2017

Speed on my R9 370X has also been halved, so @diegodieguex is not the only one. It's not my main rig so I don't really care for it but it would be nice to find a solution. IIRC, it was fine with hashcat 3.5.0.

Member

jsteube commented Sep 15, 2017

If you're on GitHub master, is it possible that you forgot to use the new -O parameter and therefore see the speed drop?

This could also solve the OP's problem, because the problem here is not hashcat; it's the OpenCL driver/runtime for this particular device. Different vendors have different runtimes, so it works on some devices and not on others. We also have no influence, from the outside, on how the watchdog that causes this problem reacts.

diegodieguex commented Sep 16, 2017

the same problem with MacOS High Sierra 10.13 GM

(from source) no problem. v3.30 and v3.6.0 working
https://i.imgur.com/ijKWa4p.jpg

but fail with all repos
https://i.imgur.com/0dw8qm9.jpg

Member

jsteube commented Sep 17, 2017

It's an Apple-only problem that I cannot reproduce, since I do not have such a device. Someone else needs to find the root of the problem.

diegodieguex commented Sep 17, 2017

Sorry again, but I don't think so. Maybe it is a branch problem: all sources work OK, and all repos are really slow.

https://i.imgur.com/Sx2L93S.jpg
https://i.imgur.com/lslMUst.jpg

Member

jsteube commented Oct 20, 2017

The issue should hopefully be fixed with commit bf11287.

Please test, and close the issue if it is fixed.

diegodieguex commented Oct 20, 2017

same problem nothing changed. thanks

https://i.imgur.com/wBs6Yxg.jpg

Member

jsteube commented Oct 20, 2017

Can you please post -I output?

diegodieguex commented Oct 20, 2017

hashcat (4.0.0-rc6) starting...

OpenCL Info:

Platform ID #1
Vendor : Apple
Name : Apple
Version : OpenCL 1.2 (Aug 23 2017 16:35:41)

Device ID #1
Type : CPU
Vendor ID : 4
Vendor : Intel
Name : Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Version : OpenCL 1.2
Processor(s) : 8
Clock : 4000
Memory : 2048/8192 MB allocatable
OpenCL Version : OpenCL C 1.2
Driver Version : 1.1

Device ID #2
Type : GPU
Vendor ID : 1
Vendor : AMD
Name : AMD Radeon R9 M395 Compute Engine
Version : OpenCL 1.2
Processor(s) : 28
Clock : 834
Memory : 512/2048 MB allocatable
OpenCL Version : OpenCL C 1.2
Driver Version : 1.2 (Aug 24 2017 22:09:54)

Member

jsteube commented Oct 20, 2017

Looks like the Vendor ID is fixed. It was 214... before, see here: https://imgur.com/0dw8qm9

Now it is "1", as it should be.

In theory, the speed should be fixed now. Make sure your installation is clean, all cached OpenCL kernels are removed, you're using the right version, etc.

If that doesn't help, we're back to square one. You may want to come to IRC so we can do some interactive debugging.
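To make the Vendor ID check repeatable, here is a small sketch that lists each device's name and Vendor ID from `hashcat -I` output. It assumes the `Key : value` layout shown in this thread (dotted padding such as `Key....: value` is tolerated); the function name `list_vendor_ids` is hypothetical.

```shell
# Print "<device name>: vendor id <n>" for each device in `hashcat -I` output.
# Assumption: fields look like "Key : value" (possibly "Key....: value").
list_vendor_ids() {
    awk -F':' '
        {
            key = $1
            gsub(/^[ ]+/, "", key)        # trim leading spaces
            gsub(/[ .]+$/, "", key)       # trim trailing spaces and dots
            val = $2
            gsub(/^[ ]+|[ ]+$/, "", val)  # trim the value
        }
        key == "Vendor ID" { id = val }
        key == "Name" && id != "" { print val ": vendor id " id; id = "" }
    '
}
```

Used as `./hashcat -I | list_vendor_ids`, the AMD GPU in the output above would be reported with vendor id 1.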

diegodieguex commented Oct 20, 2017

hashcat (4.0.0-rc6) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 2048/8192 MB allocatable, 8MCU
  • Device #2: AMD Radeon R9 M395 Compute Engine, 512/2048 MB allocatable, 28MCU

Benchmark relevant options:

  • --opencl-device-types=1,2
  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#1.....: 7583 H/s (67.70ms)
Speed.Dev.#2.....: 30054 H/s (117.51ms)
Speed.Dev.#*.....: 37637 H/s

Started: Fri Oct 20 10:11:32 2017
Stopped: Fri Oct 20 10:11:48 2017

Member

philsmd commented Oct 20, 2017

Could you please perform a further test for us (just to double-check some of our most recent discoveries):

make clean
git checkout 52c1e15f3f5b0f50c8ccce3b8b9c604941fe8a57
make
hashcat -m 2500 -b
# note down the speed1

and:

make clean
git checkout 56dc8ae3598b059b9a7fe11859ebbec087d6cfab
make
hashcat -m 2500 -b
# note down the speed2

and finally:

make clean
git checkout master
make

please let us know about the speed1 and speed2 results
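The procedure above can be scripted so that speed1 and speed2 come out as plain numbers. This is only a sketch: `speed_to_hs` and `bench_commit` are hypothetical helper names, `./hashcat` assumes the binary is run from the checkout, and the parser assumes `Speed.Dev.#N` lines in the format shown in this thread.

```shell
# Convert per-device "Speed.Dev.#N" lines to plain H/s, one number per line.
# Handles "94570 H/s" as well as prefixed units such as "213.9 kH/s".
speed_to_hs() {
    awk '/^Speed\.Dev\.#[0-9]+/ {
        split($0, part, ":")
        split(part[2], f, " ")            # f[1] = number, f[2] = unit
        mult = 1
        if (f[2] ~ /^k/) mult = 1000
        else if (f[2] ~ /^M/) mult = 1000000
        else if (f[2] ~ /^G/) mult = 1000000000
        printf "%.0f\n", f[1] * mult
    }'
}

# Build one commit and print its WPA/WPA2 benchmark speed(s) in H/s.
# Assumption: called from inside a hashcat git checkout.
bench_commit() {
    make clean >/dev/null &&
    git checkout --quiet "$1" &&
    make >/dev/null &&
    ./hashcat -m 2500 -b | speed_to_hs
}
```

For example, `bench_commit 52c1e15f3f5b0f50c8ccce3b8b9c604941fe8a57` followed by `bench_commit 56dc8ae3598b059b9a7fe11859ebbec087d6cfab` would print the two speeds to compare.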

Contributor

roycewilliams commented Oct 20, 2017

As a control, the two speeds are essentially identical on my 2012 MacBook Air:

$ ./hashcat -m 2500 -b
hashcat (v3.6.0-48-g52c1e15f) starting in benchmark mode...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz, skipped.
* Device #2: HD Graphics 4000, 384/1536 MB allocatable, 16MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....:     2954 H/s (85.57ms)
$ ./hashcat -m 2500 -b
hashcat (v3.6.0-51-g56dc8ae3) starting in benchmark mode...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz, skipped.
* Device #2: HD Graphics 4000, 384/1536 MB allocatable, 16MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....:     2959 H/s (85.57ms)

soxrok2212 commented Oct 20, 2017

Part of the problem seems to come from Apple's drivers. On Sierra, I had ~33kH/s on hashcat 3.40. Now, on High Sierra, I also tested hashcat 3.40 and maxed out at ~13kH/s. Unfortunately, 3.40 was the newest release for which I had saved a whole benchmark.

I wouldn't say it's completely a Hashcat issue just yet.

This was on the 2015 Retina model with an R9 M370X.

diegodieguex commented Oct 21, 2017

hashcat (v3.6.0-48-g52c1e15f) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, skipped.
  • Device #2: AMD Radeon R9 M395 Compute Engine, 512/2048 MB allocatable, 28MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....: 94570 H/s (75.32ms)

Started: Sat Oct 21 10:39:21 2017
Stopped: Sat Oct 21 10:39:33 2017


hashcat (v3.6.0-51-g56dc8ae3) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, skipped.
  • Device #2: AMD Radeon R9 M395 Compute Engine, 512/2048 MB allocatable, 28MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....: 30461 H/s (115.25ms)

Started: Sat Oct 21 10:40:58 2017
Stopped: Sat Oct 21 10:41:12 2017


hashcat (4.0.0-rc6) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, skipped.
  • Device #2: AMD Radeon R9 M395 Compute Engine, 512/2048 MB allocatable, 28MCU

Benchmark relevant options:

  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#2.....: 30101 H/s (117.51ms)

Started: Sat Oct 21 13:53:15 2017
Stopped: Sat Oct 21 13:53:35 2017
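For reference, the size of the drop between the first two builds can be worked out directly from the numbers above. A minimal sketch (the helper name `speed_ratio` is hypothetical):

```shell
# Ratio of the slower build to the faster build, rounded to two decimals.
speed_ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

# R9 M395: 30461 H/s at 56dc8ae3 versus 94570 H/s at 52c1e15f
speed_ratio 30461 94570    # prints 0.32
```

A ratio of about 0.32 matches the "1/3 speed" in the issue title.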

Member

jsteube commented Feb 3, 2018

I've added some more code to change certain parameters. It's another shot in the dark, since I do not have access to that system and cannot reproduce the problem locally. Can you please pull master, recompile and retry?

emwinkler commented Feb 3, 2018

Speed1:
./hashcat -m 2500 -b
hashcat (v3.6.0-48-g52c1e15f) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, 384/1536 MB allocatable, 40MCU
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....: 5934 H/s (50.05ms)
Speed.Dev.#3.....: 213.9 kH/s (83.32ms)
Speed.Dev.#*.....: 219.9 kH/s

Started: Sat Feb 3 14:01:23 2018
Stopped: Sat Feb 3 14:01:51 2018

Speed2:
./hashcat -m 2500 -b
hashcat (v3.6.0-51-g56dc8ae3) starting in benchmark mode...

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, 384/1536 MB allocatable, 40MCU
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Hashtype: WPA/WPA2

Speed.Dev.#2.....: 5927 H/s (50.05ms)
Speed.Dev.#3.....: 62928 H/s (71.27ms)
Speed.Dev.#*.....: 68855 H/s

Started: Sat Feb 3 14:03:46 2018
Stopped: Sat Feb 3 14:04:32 2018

Master:
./hashcat -m 2500 -b
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, 384/1536 MB allocatable, 40MCU
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Benchmark relevant options:

  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#2.....: 5924 H/s (53.25ms) @ Accel:128 Loops:32 Thr:8 Vec:1
Speed.Dev.#3.....: 59442 H/s (73.20ms) @ Accel:128 Loops:64 Thr:64 Vec:1
Speed.Dev.#*.....: 65366 H/s

Started: Sat Feb 3 14:07:09 2018
Stopped: Sat Feb 3 14:08:08 2018

Contributor

neheb commented Feb 3, 2018

Instead of randomly running tests, can you bisect the issue?
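One way such a bisect could be automated is with `git bisect run`, sketched below under several assumptions: the script name `bisect-step.sh` is hypothetical, the 60000 H/s cut-off is an illustrative value chosen between the ~94 kH/s and ~30 kH/s results reported for the R9 M395 (an RX 580 would need a higher one), and only the first benchmarked device is considered.

```shell
#!/bin/sh
# bisect-step.sh (hypothetical name): test script for `git bisect run`.
# Possible invocation, using the two commits quoted in this thread:
#   git bisect start 56dc8ae3598b059b9a7fe11859ebbec087d6cfab \
#                    52c1e15f3f5b0f50c8ccce3b8b9c604941fe8a57   # <bad> <good>
#   git bisect run sh ./bisect-step.sh run

# Assumption: 60000 H/s separates the fast and slow results for the R9 M395;
# choose a different cut-off for other GPUs.
THRESHOLD_HS="${THRESHOLD_HS:-60000}"

# Return 0 (good) if the speed in H/s reaches the threshold, else 1 (bad).
classify_speed() {
    if [ "$1" -ge "$THRESHOLD_HS" ]; then return 0; fi
    return 1
}

# Build the current commit, benchmark WPA/WPA2, and classify the result.
bisect_step() {
    make clean >/dev/null 2>&1
    make >/dev/null 2>&1 || return 125   # 125 tells bisect to skip this commit
    speed=$(./hashcat -m 2500 -b | awk '
        /^Speed\.Dev\.#[0-9]+/ {          # first per-device line only
            v = $2
            if ($3 ~ /^k/) v *= 1000      # "213.9 kH/s" becomes 213900
            if ($3 ~ /^M/) v *= 1000000
            printf "%.0f\n", v
            exit
        }')
    [ -n "$speed" ] || return 125        # no speed line: treat as untestable
    classify_speed "$speed"
}

# Only run the build when called with "run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    bisect_step
    exit $?
fi
```

With exit codes 0, 1 and 125, git can mark each commit as good, bad or skipped without manual input, and it stops at the first commit it classifies as slow.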

Member

philsmd commented Feb 4, 2018

I fully agree with neheb. This is very annoying/confusing.
Fortunately, I remember the GitHub user name "emwinkler" from the other GitHub issue, #1497, where I recently posted a comment.
The problem is that you should always explicitly state what you are doing and why you are posting these benchmarks (my guess is that it's because I suspected that the other issue was a duplicate of this one, see #1497 (comment)).
The context/explanation/description of why you post a "few benchmark numbers" is very important. It's not always easy to keep track of which user is affected by a (speed drop) problem and to guess why they are commenting/posting across different tickets. Even I, who asked for this "test" in the other issue, was confused at the beginning and had to double-check whether the GitHub user is the same across those 2 issues (and why s/he is posting this here).
Furthermore, GitHub allows you to use quoting/code markdown formatting. This would make all your posts much more readable and understandable (otherwise it's difficult to see where the benchmark results start and where your description text is; it's just annoying extra work to see what is going on without the correct markdown formatting).

It seems that, at least for @emwinkler, the 2 issues have the same root (unfortunately the original poster of #1497, @guru431, is not the same GitHub user, and therefore it is just a guess that s/he will have the same outcome/problem). I currently suspect that #1497 is a duplicate of this issue and that the speed drop was introduced by commit 165380c, but only on macOS AMD hardware (which is kind of strange and should be investigated in detail).

Maybe we can find someone (@emwinkler, you?) who can help us debug this problem interactively, for instance by running some tests together and chatting on the #hashcat IRC channel on freenode, because currently, as far as I know, not a single dev has a similar setup (macOS AMD hardware etc.). That would probably help a lot.

emwinkler commented Feb 4, 2018

I would be glad to assist with this issue. I agree that it looks like the same issue as #1497, as I was able to reproduce the benchmark speed drop discussed above. Let me know how you want to proceed.

soxrok2212 commented Feb 4, 2018

Some decent improvements on 10.13.3 with the AMD R9 M370X, which was previously between 6 and 13 kH/s, IIRC.

Plain benchmarks:

$ hashcat -m 2500 -b -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Benchmark relevant options:
===========================
* --opencl-devices=3
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....:    18322 H/s (68.22ms) @ Accel:128 Loops:64 Thr:64 Vec:1

Started: Sun Feb  4 12:57:31 2018
Stopped: Sun Feb  4 12:57:42 2018

With -w 4:

$ hashcat -m 2500 -b -d 3 -w 4
hashcat (v4.1.0) starting in benchmark mode...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Benchmark relevant options:
===========================
* --opencl-devices=3
* --workload-profile=4

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....:    19438 H/s (259.88ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Started: Sun Feb  4 12:57:55 2018
Stopped: Sun Feb  4 12:58:11 2018

Anything else you need, just ask. Also, plugging in the charger does nothing to improve performance, so they don't seem to be throttling on battery power.

soxrok2212 Feb 4, 2018

@philsmd I was under the impression that @jsteube wanted us to test the new version...

"I've added some more code to change certain parameters, another shot in the dark, since I do not have access to that system and cannot reproduce locally. Can you please pull master, recompile and retry again?"

emwinkler Feb 4, 2018

Test completed, but same poor benchmark as shown below.

Last login: Sun Feb 4 13:53:15 2018 from ./hashcat -b -m 2500 -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple

  • Device #1: Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz, skipped.
  • Device #2: HD Graphics 5000, skipped.
  • Device #3: AMD Radeon RX 580 Compute Engine, 2048/8192 MB allocatable, 36MCU

Benchmark relevant options:

  • --opencl-devices=3
  • --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....: 59540 H/s (72.54ms) @ Accel:128 Loops:64 Thr:64 Vec:1

Started: Sun Feb 4 13:54:03 2018
Stopped: Sun Feb 4 13:54:19 2018
CCUS-ADM-EWINKLER:hashcat ewinkler$

philsmd Feb 5, 2018

Member

With enormous help from @soxrok2212, we found out a couple of hours ago that -m 12000 = PBKDF2-HMAC-SHA1 is affected by the same speed drop as -m 2500 = WPA/WPA2 (only with these AMD drivers/cards on macOS).

Maybe someone here can confirm this too and test whether the speed changed a lot around commit 729c5f0. It would also help a lot to find out whether the commit that introduced the problem was exactly 729c5f0 or some commit slightly before or after it.

Maybe we should close the duplicate issues and rename this issue to make it PBKDF2-HMAC-SHA1 specific?
Could it help to extend our search to even more algorithms by comparing a larger set of hash types (maybe 2 full benchmarks, before and after those WPA/PBKDF2 commits)?
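A search for the first slow commit, as asked for here, could in principle be automated with `git bisect run`. The following is only a minimal sketch, assuming the benchmark prints a `Speed.Dev.#N` line in plain H/s like the ones in this thread. The helper names `parse_speed` and `speed_verdict` are hypothetical and not part of hashcat, and the threshold is a placeholder that has to sit between the slow and fast speeds of the device under test.

```shell
#!/bin/sh
# Hypothetical helpers for `git bisect run`; not part of hashcat.

# Print the H/s figure from hashcat benchmark output on stdin.
# Assumes a line such as "Speed.Dev.#3.....:    18322 H/s (68.22ms)".
parse_speed() {
    awk '/^Speed\.Dev\.#[0-9]+/ {
        for (i = 2; i <= NF; i++)
            if ($i == "H/s") { print $(i - 1); exit }
    }'
}

# Map the measured speed to a bisect verdict via the exit status:
# 0 = good (at or above the threshold), 1 = bad (below it),
# 125 = skip this commit (no speed line found).
speed_verdict() {
    threshold=$1
    speed=$(parse_speed)
    [ -n "$speed" ] || return 125
    [ "$speed" -ge "$threshold" ]
}
```

Saved in a hypothetical `check_speed.sh` that ends with `speed_verdict 30000`, this could be driven by something like `git bisect run sh -c 'rm -rf kernels; make clean && make && ./hashcat -m 2500 -b -d 3 | sh check_speed.sh'` after marking one slow and one fast commit. Note that a failed build would be reported as "bad" in this form, so the result still needs a manual sanity check.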

soxrok2212 Feb 5, 2018

At commit 729c5f0:

$ ./hashcat -m 2500 -b -d 3
hashcat (v3.6.0-131-g729c5f09) starting in benchmark mode...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Hashtype: WPA/WPA2

Speed.Dev.#3.....:    18438 H/s (67.41ms)

Started: Mon Feb  5 09:21:19 2018
Stopped: Mon Feb  5 09:21:31 2018

I hopped, skipped, and jumped around the commits near that one. The speed returns to normal in commits before 165380c.
Here is a benchmark at 165380c:

$ ./hashcat -m 2500 -b -d 3
hashcat (v3.6.0-49-g165380c4) starting in benchmark mode...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Hashtype: WPA/WPA2

Speed.Dev.#3.....:    18451 H/s (67.41ms)

Started: Mon Feb  5 09:24:41 2018
Stopped: Mon Feb  5 09:24:53 2018

Here is a benchmark from 52c1e15 (one commit earlier, going by the version strings):

$ ./hashcat -m 2500 -b -d 3
hashcat (v3.6.0-48-g52c1e15f) starting in benchmark mode...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Hashtype: WPA/WPA2

Speed.Dev.#3.....:    34629 H/s (73.12ms)

Started: Mon Feb  5 09:23:10 2018
Stopped: Mon Feb  5 09:23:23 2018

If you need -m 12000 benchmarks too or anything else, let me know.
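To put the two neighbouring commits side by side, the relative speed can be computed directly from the figures above. This is just an arithmetic sketch; `percent_of` is a hypothetical helper name.

```shell
#!/bin/sh
# Print how fast `current` ($1) is relative to `baseline` ($2),
# as a percentage with one decimal place.
percent_of() {
    LC_ALL=C awk -v cur="$1" -v base="$2" 'BEGIN { printf "%.1f\n", cur / base * 100 }'
}

# 165380c (18451 H/s) relative to 52c1e15 (34629 H/s)
percent_of 18451 34629   # prints 53.3
```

So on this R9 M370X the benchmark at 165380c runs at roughly half the speed of the commit before it.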

philsmd Feb 6, 2018

Member

Update:
We identified and fixed the -m 12000 problem with this fix: 5391edc (the change that first introduced the drop for -m 12000 was 9de1e55).
This "sha1 problem" was also affecting -m 2500, but -m 2500 also seems to have dropped heavily in speed after this commit was merged: 165380c.
We currently suspect that the problem is that the macOS AMD driver (or OpenCL implementation) does not implement amd_bytealign (), and that functions such as switch_buffer_by_offset_carry_be_S () are therefore very slow. The driver does not report the device as an "AMD" device: the vendor id / platform is "Apple", so hashcat treats it as a generic OpenCL device. That seems to be correct, because if we marked it as an AMD device, the amd_bytealign () function (among others) would be called, which is not available/implemented on macOS AMD systems.

I think we still need to run some further tests to see whether we can somehow improve or work around the switch_buffer_by_offset_carry_be_S () problem.
Note: switch_buffer_by_offset_carry_be_S () was not used for -m 2500 by older versions of hashcat, since older versions did not support very long passwords.

Thank you again, @soxrok2212 and @emwinkler, for helping us identify these problems. I think there is still a chance that we can find a workaround, but it might require several more trial-and-error-style tests.
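For readers unfamiliar with the intrinsic: as described in AMD's `cl_amd_media_ops` OpenCL extension, `amd_bytealign (a, b, c)` joins two 32-bit words into one 64-bit value (with `a` as the high word), shifts it right by `c & 3` bytes and returns the low 32 bits. The sketch below only illustrates those semantics in shell arithmetic; it is not hashcat's code, `bytealign` is a hypothetical name, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Emulate amd_bytealign (a, b, c): treat a:b as one 64-bit value
# (a is the high word), shift right by (c & 3) bytes, keep the low 32 bits.
bytealign() {
    hi=$(( $1 ))
    lo=$(( $2 ))
    shift_bits=$(( ($3 & 3) * 8 ))
    printf '%08x\n' $(( (((hi << 32) | lo) >> shift_bits) & 0xffffffff ))
}

bytealign 0x11223344 0x55667788 1   # prints 44556677
```

If the suspicion above is right, a device that lacks the single-instruction version has to do this kind of shifting and masking for every word that a buffer-shifting function moves, which would be consistent with the slowdown.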

jsteube Feb 13, 2018

Member

Please retry with the latest GitHub master version. It's important to run make clean this time.

soxrok2212 Feb 13, 2018

The latest commits fixed the issue here, and the speed is actually a little faster than in my v3.40 benchmark!

$ ./hashcat -m 2500 -b -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Benchmark relevant options:
===========================
* --opencl-devices=3
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....:    34452 H/s (73.02ms) @ Accel:128 Loops:32 Thr:256 Vec:1

Started: Tue Feb 13 11:54:01 2018
Stopped: Tue Feb 13 11:54:25 2018

jsteube Feb 14, 2018

Member

Please redo the test with the latest GitHub master. I had to make some changes because this workaround broke other hash modes. With a bit of luck, the latest changes lead to the same result.

soxrok2212 Feb 14, 2018

Speeds dropped back to what they were before.

$ ./hashcat -b -m 2500 -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Benchmark relevant options:
===========================
* --opencl-devices=3
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....:    17759 H/s (68.76ms) @ Accel:64 Loops:32 Thr:256 Vec:1

Started: Wed Feb 14 10:35:38 2018
Stopped: Wed Feb 14 10:35:50 2018

jsteube Feb 14, 2018

Member

Pull again and retry, please. Don't forget to rm -rf kernels.

soxrok2212 Feb 14, 2018

Same result:

$ ./hashcat -m 2500 -b -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Benchmark relevant options:
===========================
* --opencl-devices=3
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....:    17773 H/s (68.76ms) @ Accel:64 Loops:32 Thr:256 Vec:1

Started: Wed Feb 14 15:27:05 2018
Stopped: Wed Feb 14 15:27:17 2018

philsmd Feb 21, 2018

Member

Hmm, does a patch like the following fix the problem?

diff --git a/OpenCL/inc_hash_sha1.cl b/OpenCL/inc_hash_sha1.cl
index 8caa33d..80b430e 100644
--- a/OpenCL/inc_hash_sha1.cl
+++ b/OpenCL/inc_hash_sha1.cl
@@ -208,7 +208,9 @@ DECLSPEC void sha1_update_64 (sha1_ctx_t *ctx, u32 w0[4], u32 w1[4], u32 w2[4],
     u32 c2[4] = { 0 };
     u32 c3[4] = { 0 };
 
+    #ifndef SKIP_SWITCH_BUFFER_WITH_CARRY
     switch_buffer_by_offset_carry_be_S (w0, w1, w2, w3, c0, c1, c2, c3, pos);
+    #endif
 
     ctx->w0[0] |= w0[0];
     ctx->w0[1] |= w0[1];
diff --git a/OpenCL/inc_hash_sha256.cl b/OpenCL/inc_hash_sha256.cl
index 49e1070..5375b9d 100644
--- a/OpenCL/inc_hash_sha256.cl
+++ b/OpenCL/inc_hash_sha256.cl
@@ -193,7 +193,9 @@ DECLSPEC void sha256_update_64 (sha256_ctx_t *ctx, u32 w0[4], u32 w1[4], u32 w2[
     u32 c2[4] = { 0 };
     u32 c3[4] = { 0 };
 
+    #ifndef SKIP_SWITCH_BUFFER_WITH_CARRY
     switch_buffer_by_offset_carry_be_S (w0, w1, w2, w3, c0, c1, c2, c3, pos);
+    #endif
 
     ctx->w0[0] |= w0[0];
     ctx->w0[1] |= w0[1];
diff --git a/OpenCL/m02500.cl b/OpenCL/m02500.cl
index 468da9b..ee71042 100644
--- a/OpenCL/m02500.cl
+++ b/OpenCL/m02500.cl
@@ -5,6 +5,8 @@
 
 #define NEW_SIMD_CODE
 
+#define SKIP_SWITCH_BUFFER_WITH_CARRY
+
 #include "inc_vendor.cl"
 #include "inc_hash_constants.h"
 #include "inc_hash_functions.cl"

To test this:

  1. make sure that you have a clean/unmodified version of hashcat
  2. check out the current master (git checkout 72fc708)
  3. store the above code into a file called wpa.diff
  4. run: "make clean" (just to make sure there are no cached kernel files)
  5. run: "git apply wpa.diff"
  6. run: "make"
  7. test speed: hashcat -m 2500 -b
  8. test if it still cracks: hashcat -m 2500 -a 3 hashcat.hccapx hashcat?s

(hashcat.hccapx can be downloaded from the example hashes page: https://hashcat.net/wiki/example_hashes)
Thx
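The test steps above could be wrapped in a small script. This is a sketch only: `run` and `test_wpa_patch` are hypothetical helper names, it assumes that wpa.diff and hashcat.hccapx are in the hashcat source directory and that the freshly built binary is started as `./hashcat`, and step 1 (a clean, unmodified checkout) still has to be verified by hand.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Steps 2 and 4-8 from the list above, stopping at the first failure.
# The mask is quoted so that the shell never expands the "?" as a glob.
test_wpa_patch() {
    run git checkout 72fc708 &&
    run make clean &&
    run git apply wpa.diff &&
    run make &&
    run ./hashcat -m 2500 -b &&
    run ./hashcat -m 2500 -a 3 hashcat.hccapx 'hashcat?s'
}

# Preview the commands without executing them:
#   DRY_RUN=1 test_wpa_patch
```

Add `-d 3` (or whichever device number applies) to the two hashcat calls to restrict the run to the AMD GPU, as in the benchmarks posted in this thread.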

soxrok2212 Feb 21, 2018

Got one new compiler warning in the self test:

src/selftest.c:60:20: warning: passing 'char *' to parameter of type 'u8 *' (aka 'unsigned char *')
      converts between pointers to integer types with different sign [-Wpointer-sign]
        uppercase (pw_ptr, pw.pw_len);
                   ^~~~~~
include/convert.h:51:21: note: passing argument to parameter 'buf' here
void uppercase (u8 *buf, const size_t len);
                    ^
1 warning generated.

Speed is back to normal:

$ ./hashcat -m 2500 -b -d 3
hashcat (v4.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Benchmark relevant options:
===========================
* --opencl-devices=3
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2 (Iterations: 4096)

Speed.Dev.#3.....:    34508 H/s (73.02ms) @ Accel:128 Loops:32 Thr:256 Vec:1

Started: Wed Feb 21 10:11:13 2018
Stopped: Wed Feb 21 10:11:29 2018

Cracking works:

$ ./hashcat -m 2500 -d 3 -a 3 hashcat.hccapx hashcat?s
hashcat (v4.1.0) starting...

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz, skipped.
* Device #2: Iris Pro, skipped.
* Device #3: AMD Radeon R9 M370X Compute Engine, 512/2048 MB allocatable, 10MCU

Hashes: 1 digests; 1 unique digests, 1 unique salts
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates

Applicable optimizers:
* Zero-Byte
* Single-Hash
* Single-Salt
* Brute-Force
* Slow-Hash-SIMD-LOOP

Minimum password length supported by kernel: 8
Maximum password length supported by kernel: 63

Watchdog: Temperature abort trigger disabled.

The wordlist or mask that you are using is too small.
This means that hashcat cannot use the full parallel power of your device(s).
Unless you supply more work, your cracking speed will drop.
For tips on supplying more work, see: https://hashcat.net/faq/morework

Approaching final keyspace - workload adjusted.  

a895f7d62ccc3e892fa9e9f9146232c1:aef50f22801c:987bdcf9f950:8381533406003807685881523:hashcat!
                                                 
Session..........: hashcat
Status...........: Cracked
Hash.Type........: WPA/WPA2
Hash.Target......: 8381533406003807685881523 (AP:ae:f5:0f:22:80:1c STA:98:7b:dc:f9:f9:50)
Time.Started.....: Wed Feb 21 10:13:36 2018 (0 secs)
Time.Estimated...: Wed Feb 21 10:13:36 2018 (0 secs)
Guess.Mask.......: hashcat?s [8]
Guess.Queue......: 1/1 (100.00%)
Speed.Dev.#3.....:      162 H/s (0.46ms) @ Accel:32 Loops:16 Thr:256 Vec:1
Recovered........: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.........: 33/33 (100.00%)
Rejected.........: 0/33 (0.00%)
Restore.Point....: 0/33 (0.00%)
Candidates.#3....: hashcat! -> hashcat 

Started: Wed Feb 21 10:13:33 2018
Stopped: Wed Feb 21 10:13:37 2018
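The "does it still crack" check amounts to finding the result line in the output and comparing its last field with the expected password. A minimal sketch, assuming the result line has the five colon-separated fields shown above (a 32-hex-digit value, two MAC addresses without separators, the ESSID, the password) and an ESSID that contains no colon; `cracked_password` is a hypothetical helper.

```shell
#!/bin/sh
# Print the password from the first WPA result line found on stdin.
# Everything after the fourth colon is treated as the password, so a
# password containing ":" survives; an ESSID containing ":" would not.
cracked_password() {
    grep -E '^[0-9a-f]{32}:[0-9a-f]{12}:[0-9a-f]{12}:' | head -n 1 | cut -d: -f5-
}

# Example with the example capture used in this thread:
#   ./hashcat -m 2500 -d 3 -a 3 hashcat.hccapx 'hashcat?s' | cracked_password
```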
