Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to NVIDIA 465.24.02 broke wpapsk-opencl #4667

Closed
ZerBea opened this issue Apr 16, 2021 · 28 comments · Fixed by #4669
Closed

Update to NVIDIA 465.24.02 broke wpapsk-opencl #4667

ZerBea opened this issue Apr 16, 2021 · 28 comments · Fixed by #4669
Assignees
Labels

Comments

@ZerBea
Copy link

ZerBea commented Apr 16, 2021

Update to NVIDA 465.24.02 broke wpapsk-opencl:

$ pacman -Q | grep nvidia
nvidia 465.24.02-1
nvidia-settings 465.24.02-1
nvidia-utils 465.24.02-1
opencl-nvidia 465.24.02-1

Affected:
latest release
https://archlinux.org/packages/community/x86_64/john/
as well as latest git head:
$ john
John the Ripper 1.9.0-jumbo-1+bleeding-3088ea62d 2021-04-14 16:10:23 +0200 MPI + OMP [linux-gnu 64-bit x86_64 AVX AC]

$ john --format=wpapsk-opencl --test
Device 1@tux1: NVIDIA GeForce GTX 1080 Ti
Benchmarking: wpapsk-opencl, WPA/WPA2/PMF/PMKID PSK [PBKDF2-SHA1 OpenCL]... Options used: -I opencl -cl-mad-enable -DSM_MAJOR=6 -DSM_MINOR=1 -D__GPU__ -DDEVICE_INFO=524306 -D__SIZEOF_HOST_SIZE_T__=8 -DDEV_VER_MAJOR=465 -DDEV_VER_MINOR=24 -D_OPENCL_COMPILER -DHASH_LOOPS=105 -DITERATIONS=4095 -DPLAINTEXT_LENGTH=63 -DV_WIDTH=1 /usr/share/john/opencl/wpapsk_kernel.cl
Build log: In file included from :13:
opencl/opencl_cmac.h:62:22: error: implicit conversion from address space "generic" to address space "private" is not supported when passing to parameter of destination type
AES_set_encrypt_key(key, 128, &ctx->aesctx);
^~~
opencl/opencl_aes_bitslice.h:1001:61: note: expanded from macro 'AES_set_encrypt_key'
#define AES_set_encrypt_key(key, bits, ctx) AES_Setkey(ctx, key, (bits) / 8)
^
opencl/opencl_aes_bitslice.h:690:45: note: passing argument to parameter 'key' here
AES_Setkey(AES_CTX *ctx, AES_KEY_TYPE void *key, int len)
^

Error building kernel /usr/share/john/opencl/wpapsk_kernel.cl. DEVICE_INFO=524306
0: OpenCL CL_BUILD_PROGRAM_FAILURE (-11) error in opencl_common.c:1292 - clBuildProgram

Cheers
Mike

@magnumripper magnumripper self-assigned this Apr 16, 2021
@magnumripper
Copy link
Member

magnumripper commented Apr 16, 2021

Thank you for reporting! Does this fix the problem?

diff --git a/run/opencl/opencl_aes_bitslice.h b/run/opencl/opencl_aes_bitslice.h
index b3f0ffec0..b561a433e 100644
--- a/run/opencl/opencl_aes_bitslice.h
+++ b/run/opencl/opencl_aes_bitslice.h
@@ -26,6 +26,16 @@
 #ifndef _AES_BITSLICE
 #define _AES_BITSLICE
 
+#ifndef AES_KEY_TYPE
+#define AES_KEY_TYPE __private const
+#endif
+#ifndef AES_SRC_TYPE
+#define AES_SRC_TYPE __private const
+#endif
+#ifndef AES_DST_TYPE
+#define AES_DST_TYPE __private
+#endif
+
 typedef struct aes_ctx {
        uint32_t sk[60];
        uint32_t sk_exp[120];

Note: You may also have to rm -fr ~/.nv/ComputeCache to drop cached kernels. Edit: Scrap that, there shouldn't be any cached file since it failed building

@magnumripper
Copy link
Member

magnumripper commented Apr 16, 2021

Hmm no, those same defines are already in opencl_aes.h which is sourced before. This must be a driver bug.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Hi magnum.
Thanks for your quick response. ~/.nv/ is removed after every update as well as ~/.john/
Arch makepkg is running with out ERRORs, when building an Arch packet from john-git.

BTW:
It looks like NIVIDA changed something within the driver, but I haven't figured out, what exactly changed, yet.

@magnumripper
Copy link
Member

Try this

diff --git a/run/opencl/opencl_cmac.h b/run/opencl/opencl_cmac.h
index a07c37b43..3ba1aced5 100644
--- a/run/opencl/opencl_cmac.h
+++ b/run/opencl/opencl_cmac.h
@@ -57,7 +57,7 @@ AES_CMAC_Init(AES_CMAC_CTX *ctx)
 }
 
 inline void
-AES_CMAC_SetKey(AES_CMAC_CTX *ctx, const uint8_t *key)
+AES_CMAC_SetKey(AES_CMAC_CTX *ctx, AES_KEY_TYPE uint8_t *key)
 {
        AES_set_encrypt_key(key, 128, &ctx->aesctx);
 }

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

I'll check it. Looks like we have to cast between address spaces and address space semantics.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Nope, AES_KEY_TYPE uint8_t *key doesn't work.

@magnumripper
Copy link
Member

Very annoying. Try this then,

diff --git a/run/opencl/opencl_aes.h b/run/opencl/opencl_aes.h
index 57c3a13d3..1558cbad3 100644
--- a/run/opencl/opencl_aes.h
+++ b/run/opencl/opencl_aes.h
@@ -48,7 +48,7 @@
  * CPU's seem to generally perform worse with it. Nvidia GPU's love it.
  * macOS may crash just trying to build it.
  */
-#if defined(AES_NO_BITSLICE) || cpu(DEVICE_INFO) || (__OS_X__ && gpu_amd(DEVICE_INFO))
+#if 1 //defined(AES_NO_BITSLICE) || cpu(DEVICE_INFO) || (__OS_X__ && gpu_amd(DEVICE_INFO))
 #include "opencl_aes_plain.h"
 #else
 #include "opencl_aes_bitslice.h"

That will force non-bitslice AES which is somewhat slower but maybe it at least works around the problem? Or if it doesn't, maybe the error message changes in some way that gives us some clue.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Now applying the second patch. The first patch (AES_KEY_TYPE uint8_t *key) is not removed.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

As expected, the second patch does the trick:
$ john --format=wpapsk-opencl --test
Device 1@tux1: NVIDIA GeForce GTX 1080 Ti
Benchmarking: wpapsk-opencl, WPA/WPA2/PMF/PMKID PSK [PBKDF2-SHA1 OpenCL]... LWS=32 GWS=458752 (14336 blocks) DONE
Raw: 533432 c/s real, 531372 c/s virtual, Dev#1 util: 100%

@magnumripper
Copy link
Member

Great! But I need to install this driver and try to work around the problem without resorting to the slower AES code.

@magnumripper magnumripper changed the title Update to NVIDA 465.24.02 broke wpapsk-opencl Update to NVIDIA 465.24.02 broke wpapsk-opencl Apr 16, 2021
@solardiz
Copy link
Member

@ZerBea Did only this one format break, or did some others break too?

@magnumripper I guess AES performance isn't very important for this format, but is of more importance in some others, right?

@solardiz solardiz added this to the Potentially 1.9.0-jumbo-2 milestone Apr 16, 2021
@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Maybe it is a simple casting issue.
@solardiz I only tested wpapsk-opencl, but I think more AES formats are affected.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Looks like there are more changes in the driver code than expected. NVIDA just released CUDA 11.3!

@magnumripper
Copy link
Member

@magnumripper I guess AES performance isn't very important for this format, but is of more importance in some others, right?

Yes, using 460.56 (which doesn't have this problem) the speed only drops 0.5% when I force the simpler AES.

@solardiz
Copy link
Member

@ZerBea Can you run a full --test --format=opencl using unmodified code from this repo? We'd like to be aware of all issues, not only with this format. Thank you! (If you'd like to speed this up, you can use e.g. --test=0 --gws=64 --lws=64 --format=opencl, which will test basic correctness, but not the speeds.)

@magnumripper
Copy link
Member

I installed CUDA 11.3 which included 465.19.01, I can reproduce the problem with it. And for me, this fix does help:

diff --git a/run/opencl/opencl_cmac.h b/run/opencl/opencl_cmac.h
index a07c37b43..3ba1aced5 100644
--- a/run/opencl/opencl_cmac.h
+++ b/run/opencl/opencl_cmac.h
@@ -57,7 +57,7 @@ AES_CMAC_Init(AES_CMAC_CTX *ctx)
 }
 
 inline void
-AES_CMAC_SetKey(AES_CMAC_CTX *ctx, const uint8_t *key)
+AES_CMAC_SetKey(AES_CMAC_CTX *ctx, AES_KEY_TYPE uint8_t *key)
 {
        AES_set_encrypt_key(key, 128, &ctx->aesctx);
 }

@magnumripper
Copy link
Member

@ZerBea Can you run a full --test --format=opencl using unmodified code from this repo? We'd like to be aware of all issues, not only with this format. Thank you! (If you'd like to speed this up, you can use e.g. --test=0 --gws=64 --lws=64 --format=opencl, which will test basic correctness, but not the speeds.)

I'm on it. First failing was truecrypt, very similar

Testing: TrueCrypt-opencl [RIPEMD160 AES256_XTS OpenCL]... Options used: -I opencl -cl-mad-enable -DSM_MAJOR=7 -DSM_MINOR=5 -D__GPU__ -DDEVICE_INFO=1048594 -D__SIZEOF_HOST_SIZE_T__=8 -DDEV_VER_MAJOR=465 -DDEV_VER_MINOR=19 -D_OPENCL_COMPILER -DKEYLEN=64 -DSALTLEN=64 -DOUTLEN=64 ../run/opencl/pbkdf2_ripemd160_kernel.cl
Build log: <kernel>:157:56: error: implicit conversion from address space "generic" to address space "private" is not supported when passing to parameter of destination type 
        AES_256_XTS_first_sector(salt->bin, outbuffer[idx].v, (uchar*)key);
                                                              ^~~~~~~~~~~
opencl/opencl_aes.h:284:58: note: passing argument to parameter 'double_key' here
                                     AES_KEY_TYPE uchar *double_key)
                                                         ^

Error building kernel ../run/opencl/pbkdf2_ripemd160_kernel.cl. DEVICE_INFO=1048594
0: OpenCL CL_BUILD_PROGRAM_FAILURE (-11) error in opencl_common.c:1292 - clBuildProgram

@magnumripper
Copy link
Member

you can use e.g. --test=0 --gws=64 --lws=64 --format=opencl

BTW --test=0 implies defaulting to even lower LWS/GWS than that.

@magnumripper
Copy link
Member

magnumripper commented Apr 16, 2021

Trivial fix for Truecrypt:

diff --git a/run/opencl/pbkdf2_ripemd160_kernel.cl b/run/opencl/pbkdf2_ripemd160_kernel.cl
index a4a2b1a93..6b6b0ec6c 100644
--- a/run/opencl/pbkdf2_ripemd160_kernel.cl
+++ b/run/opencl/pbkdf2_ripemd160_kernel.cl
@@ -154,5 +154,5 @@ __kernel void tc_ripemd_aesxts(__global const pbkdf2_password *inbuffer,
 
 	pbkdf2(inbuffer[idx].v, inbuffer[idx].length, salt->salt, key);
 
-	AES_256_XTS_first_sector(salt->bin, outbuffer[idx].v, (uchar*)key);
+	AES_256_XTS_first_sector(salt->bin, outbuffer[idx].v, (AES_KEY_TYPE uchar*)key);
 }

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

@magnumripper installed CUDA package and can confirm the the first patch is working. We don't need the second one.
@solardiz full test is running.....

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

working:
AES_256_XTS_first_sector(salt->bin, outbuffer[idx].v, (AES_KEY_TYPE uchar*)key);

@magnumripper
Copy link
Member

I'm working on DMG now. Same sort of problem.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Mostly simple casting problem.

@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

This was my fault:
#4667 (comment)
I used the wrong (git head) version instead of the patched one.
Looks like casting to AES_KEY_TYPE is enough to fix the issues.

@magnumripper
Copy link
Member

OK here is the exact problem: Our code is written for OpenCL 1.2. In OpenCL 2.0, the default address space is no longer
__private but instead __generic. Once we can start actually making use of that, it's wonderful: A very good example is we'll no longer need separate versions of memcpy for every needed combination of address types!

A trick I've been using in shared code such as opencl_aes.h is using macros for what address space we need for certain buffers when used for a certain kernel (such as in AES_KEY_TYPE). They were defaulting to __private because that was the same thing as "unspecified"... until now. The problem now is if we have an unspecified char buf[16] and then cast it to (AES_KEY_TYPE) char*, we're no longer casting between __private's but between __generic and __private and in some situations (driver bug or not, I'm not sure) this fails. Ironically the functionality that will eventually remove the need for my trick sabotaged the very trick 🙄

So, instead of the patches mentioned above, we can just define AES_KEY_TYPE to be empty instead of __private and all is good!

I have a working patch that does just that and with it, every format passes. I'll make a PR soon.

magnumripper added a commit to magnumripper/john that referenced this issue Apr 16, 2021
Avoid declaring anything __private (it is/was the default anyway)
because once we do, a can of worms is opened due to OpenCL 2.0's
notion of generic address space.  Problems seen with nvidia 465.xx.

Closes openwall#4667
@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

Good investigation and excellent work. Thanks.
Now JtR is prepared to be used in combination with latest NVIDIA driver.

@ZerBea ZerBea closed this as completed Apr 16, 2021
@solardiz
Copy link
Member

I'll reopen until we've merged the PR.

@solardiz solardiz reopened this Apr 16, 2021
magnumripper added a commit to magnumripper/john that referenced this issue Apr 16, 2021
Avoid declaring anything __private (it is/was the default anyway)
because once we do, a can of worms is opened due to OpenCL 2.0's
notion of generic address space.  Problems seen with nvidia 465.xx.

Closes openwall#4667
magnumripper added a commit to magnumripper/john that referenced this issue Apr 16, 2021
Avoid declaring anything __private (it is/was the default anyway)
because once we do, a can of worms is opened due to OpenCL 2.0's
notion of generic address space.  Problems seen with nvidia 465.xx.

Closes openwall#4667
magnumripper added a commit to magnumripper/john that referenced this issue Apr 16, 2021
Avoid declaring anything __private (it is/was the default anyway)
because once we do, a can of worms is opened due to OpenCL 2.0's
notion of generic address space.  Problems seen with nvidia 465.xx.

Closes openwall#4667
magnumripper added a commit to magnumripper/john that referenced this issue Apr 16, 2021
Avoid declaring anything __private (it is/was the default anyway)
because once we do, a can of worms is opened due to OpenCL 2.0's
notion of generic address space.  Problems seen with nvidia 465.xx.

Closes openwall#4667
@ZerBea
Copy link
Author

ZerBea commented Apr 16, 2021

@solardiz ok. Looks I was a way too fast, closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants