Improve packaged Windows builds #3960

solardiz · 2019-05-16T20:56:18Z

I'll start recording related sub-issues in here.

doc/README[.txt] turns from a symlink into a text file that contains only the symlink's target filename. This isn't helpful, so the windows-package target should remove the file. Use an equivalent of these commands in the windows-package target:

rm ../doc/README
find ../doc ../run/rules -type f -exec sed -i -e 's/\r*$/\r/' {} ';'
sed -i -e 's/\r*$/\r/' ../README.md ../run/*.conf ../run/password.lst
find ../doc -type f -not -name '*.txt' -not -name '*.md' -exec mv -v '{}' '{}'.txt \;
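For anyone checking the line-ending rewrite: a minimal sketch of what the sed expression does, assuming GNU sed (the `\r` escape in the pattern and replacement is a GNU extension). The helper name `to_crlf` is made up for this illustration. Because `\r*$` also swallows any carriage returns already present, running it on a file that is already CRLF should leave it unchanged.

```shell
# Hypothetical helper wrapping the same expression used above.
# Assumes GNU sed, which understands \r in patterns and replacements.
to_crlf() {
    sed -e 's/\r*$/\r/'
}

# One LF-only line and one line that already ends in CRLF;
# od -c makes the resulting line endings visible.
printf 'unix line\ndos line\r\n' | to_crlf | od -c
```

Piping the result through `to_crlf` a second time is a quick way to confirm that no doubled carriage returns appear.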

We should move all of the fallback executables into a subdirectory called e.g. fallback (but make sure this doesn't exclude them from the peflags). This will be a closer match to how I intended this functionality to be used, where on Linux systemwide installs /usr/libexec/john is used. This is why I didn't worry about those files confusing the user.

Otherwise someone might run a suboptimal build thinking it's the best suitable for their CPU, by looking at the many filenames.

Of course, this change also requires changing where the executables expect their next fallbacks.
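A rough sketch of the move itself, under the assumption that the fallback binaries are the files matching john-*.exe in the run directory and that john.exe stays where it is. The function name `move_fallbacks` is hypothetical and not part of the windows-package target.

```shell
# Hypothetical helper: move every john-*.exe fallback binary from a run
# directory into run/fallback. john.exe does not match the glob, so it
# stays in place.
move_fallbacks() {
    run_dir=$1
    mkdir -p "$run_dir/fallback"
    for exe in "$run_dir"/john-*.exe; do
        [ -e "$exe" ] || continue   # the glob matched nothing
        mv "$exe" "$run_dir/fallback/"
    done
}
```

It would be called as `move_fallbacks ../run`. Any step that later walks the executables (such as the peflags one) would then need to look in both directories.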

We should add AVX-512 builds, either as AVX512BW->AVX2->... or as AVX512BW->AVX512F->AVX2->...
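To make the difference between the two chains concrete, here is a small model of the selection, not the real mechanism (in the actual builds each binary starts its own next fallback at run time). The function name `first_supported` and the flag spellings are assumptions for this sketch.

```shell
# Print the first level of a preference-ordered chain that appears in a
# space-separated list of supported CPU features; fail if none matches.
first_supported() {
    chain=$1
    supported=" $2 "
    for level in $chain; do
        case $supported in
            *" $level "*) printf '%s\n' "$level"; return 0 ;;
        esac
    done
    return 1
}
```

For a CPU that reports avx512f but not avx512bw, `first_supported 'avx512bw avx2' 'sse2 avx2 avx512f'` ends up at avx2, whereas the longer chain `'avx512bw avx512f avx2'` stops at avx512f.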


claudioandre-br · 2019-05-16T21:29:16Z

[...]

Fixed in #3959

[...]

# Move all of the fallback executables under a subdirectory
mkdir ../run/libexec

# CPU (OMP and extensions fallback)
shell "./configure [...] && mv ../run/john ../run/libexec/john-sse2-non-omp"
[...]
shell "./configure [...] && make -sj2  && make -s strip"

solardiz · 2019-05-16T21:33:03Z

I don't mind using libexec for the directory name, but fallback is probably better in this case - no need to match the Unix'ish directory name there, and fallback is descriptive. It is reasonable for someone to knowingly use one of the fallback binaries in some special case.

claudioandre-br · 2019-05-16T21:33:44Z

Ok, fallback

solardiz · 2019-05-16T21:34:33Z

Don't forget you need to also specify the paths to fallback binaries during build of binaries that will invoke the fallbacks.
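As an illustration of what that involves, a sketch that only builds the flag string. The macro names OMP_FALLBACK / OMP_FALLBACK_BINARY (and the CPU_ counterparts) are an assumption here, since they are not spelled out in this thread; check them against the tree being built. The helper name `fallback_cppflags` is made up.

```shell
# Hypothetical helper: print the CPPFLAGS fragment that would point one
# binary at its next fallback. The macro must expand to a C string
# literal, so the value carries its own escaped double quotes.
fallback_cppflags() {
    macro=$1   # e.g. OMP_FALLBACK or CPU_FALLBACK (assumed names)
    next=$2    # e.g. fallback/john-sse2-non-omp
    printf '%s\n' "-D$macro -D${macro}_BINARY=\"\\\"$next\\\"\""
}
```

The resulting string would go into the CPPFLAGS passed to ./configure. If it is wrapped in a further quoted command string, one more level of escaping is needed. The "OMP fallback binary" line of `john --list=build-info` shows which path actually got compiled in.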

claudioandre-br · 2019-05-16T21:40:33Z

I know (and that makes me a sad person). The madness of escaping.

solardiz · 2019-05-16T21:43:18Z

This change shouldn't require additional escaping, or does it?

claudioandre-br · 2019-05-16T21:56:25Z

Testing now. Let's see.

claudioandre-br · 2019-05-17T03:51:01Z

It is basically done and works in https://ci.appveyor.com/project/claudioandre-br/johntheripper/builds/24608734.

C:\Temp\JohnTheRipper\run>john --list=build-info
Version: 1.9.0-jumbo-1
Build: cygwin 32-bit i686 SSE2 AC OMP
SIMD: SSE2, interleaving: MD4:3 MD5:3 SHA1:2 SHA256:1 SHA512:1
CPU tests: SSE2
OMP fallback binary: fallback/john-sse2-non-omp
$JOHN is
Format interface version: 14
Max. number of reported tunable costs: 4
Rec file version: REC4
Charset file version: CHR3
CHARSET_MIN: 1 (0x01)
CHARSET_MAX: 255 (0xff)
CHARSET_LENGTH: 24
SALT_HASH_SIZE: 1048576
SINGLE_IDX_MAX: 2147483648
SINGLE_BUF_MAX: 4294967295
Effective limit: Number of salts vs. SingleMaxBufferSize
Max. Markov mode level: 400
Max. Markov mode password length: 30
gcc version: 7.4.0
OpenCL headers version: 2.2
Crypto library: OpenSSL
OpenSSL library version: 01010102f
OpenSSL 1.1.1b  26 Feb 2019
GMP library version: 6.1.2
File locking: fcntl()
fseek(): fseeko
ftell(): ftello
fopen(): _fopen64
memmem(): System's

But there is a bad side effect, and a Windows symlink did not solve it (remember Win7 32-bit).

I need the libraries in two directories.
I had to copy and paste them.

17/05/2019  03:29    <DIR>          .
17/05/2019  03:29    <DIR>          ..
17/05/2019  03:26    <SYMLINK>      cygbz2-1.dll [..\cygbz2-1.dll]
17/05/2019  03:29    <SYMLINK>      cygcrypt-0.dll [..\cygcrypt-0.dll]
17/05/2019  03:29    <SYMLINK>      cygcrypt-2.dll [..\cygcrypt-2.dll]
17/05/2019  03:29    <SYMLINK>      cygcrypto-1.0.0.dll [..\cygcrypto-1.0.0.d

17/05/2019  03:29    <SYMLINK>      cygcrypto-1.1.dll [..\cygcrypto-1.1.dll]
17/05/2019  03:29    <SYMLINK>      cyggcc_s-1.dll [..\cyggcc_s-1.dll]
17/05/2019  03:29    <SYMLINK>      cyggmp-10.dll [..\cyggmp-10.dll]
17/05/2019  03:29    <SYMLINK>      cyggomp-1.dll [..\cyggomp-1.dll]
17/05/2019  03:29    <SYMLINK>      cygOpenCL-1.dll [..\cygOpenCL-1.dll]
17/05/2019  03:29    <SYMLINK>      cygssl-1.0.0.dll [..\cygssl-1.0.0.dll]
17/05/2019  03:29    <SYMLINK>      cygssl-1.1.dll [..\cygssl-1.1.dll]
17/05/2019  03:29    <SYMLINK>      cygwin1.dll [..\cygwin1.dll]
17/05/2019  03:29    <SYMLINK>      cygz.dll [..\cygz.dl]
16/05/2019  23:48         7.139.342 john-avx-non-omp.exe
16/05/2019  23:52         7.186.446 john-avx.exe
17/05/2019  00:04         7.143.438 john-avx2-non-omp.exe
16/05/2019  23:33         7.141.390 john-sse2-non-omp.exe
16/05/2019  23:37         7.189.006 john-sse2.exe
16/05/2019  23:40         7.160.334 john-sse41-non-omp.exe
16/05/2019  23:44         7.207.438 john-sse41.exe
16/05/2019  23:56         7.114.766 john-xop-non-omp.exe
17/05/2019  00:00         7.161.870 john-xop.exe
              22 file(s)     64.444.030 bytes
               2 folder(s)   37.389.836.288 bytes free
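For reference, the copy workaround could be scripted roughly as below. This is a sketch under the assumption that the runtime libraries are the cyg*.dll files in the run directory; `copy_runtime_dlls` is a hypothetical name. Windows searches the directory of the executable first when resolving DLLs, which is why the binaries under fallback need their own copies.

```shell
# Hypothetical helper: duplicate the Cygwin runtime DLLs from the run
# directory into run/fallback so binaries started from there find them.
copy_runtime_dlls() {
    run_dir=$1
    mkdir -p "$run_dir/fallback"
    for dll in "$run_dir"/cyg*.dll; do
        [ -e "$dll" ] || continue   # the glob matched nothing
        cp -p "$dll" "$run_dir/fallback/"
    done
}
```

The cost is a second copy of every DLL in the package.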

solardiz · 2019-05-17T12:04:51Z

* I need the libraries in two directories.

Oh, I missed that problem, which I now realize was to be expected. This may be a reason to revert to the other approach I mentioned on the 1.9.0-jumbo-1 meta-issue:

"maybe we should include the best SIMD+OMP not only as john.exe, but also with its full name consistent with the rest. [...] We could then also use our symlink.c to produce a tiny john.exe that merely executes the best one (letting it start the fallback chain if necessary)."
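The idea can be modeled in a few lines of shell. This is only a stand-in for illustration (the real thing would be a small native executable built from symlink.c), and the function name `launch_best` and its parameters are hypothetical.

```shell
# Stand-in for the tiny john.exe: replace the current process with the
# best build from the same directory, passing all arguments through.
# The chosen build is then responsible for starting its own fallbacks.
launch_best() {
    dir=$1    # directory holding all builds and the DLLs
    best=$2   # file name of the best SIMD+OMP build
    shift 2
    exec "$dir/$best" "$@"
}
```

With this layout every binary lives in one directory, so the libraries are needed only once.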

claudioandre-br · 2019-05-17T21:43:31Z

Done!

The Windows release "reloaded" is available at: https://ci.appveyor.com/project/claudioandre-br/johntheripper/builds/24630991

64bits only;
it contains the avx512bw binaries (OMP and non-OMP);
it contains a lightweight john.exe that falls back;

The fallback is tested by CI itself (avx512bw->avx2).
But the avx512bw binary deserves real testing: I have no idea how the avx512bw Windows binary behaves in the real world.

I added only avx512bw because we are very close to the CI limits (I shouldn't, or rather can't, add more stuff).

solardiz · 2019-05-18T14:26:48Z

* 64bits only

I guess this is temporary, just for the test build? We also need Win32 builds for our releases.

Regarding AVX-512 support in 32-bit builds, on one hand I doubt there's a 32-bit Windows that supports AVX-512 on context switches, but on the other hand someone might mistakenly install a 32-bit build that we release on 64-bit Windows on AVX-512 capable hardware. Do we care about having such installs use AVX-512?

claudioandre-br · 2019-05-18T20:32:43Z

I guess this is temporary, just for the test build? We also need Win32 builds for our releases.

I removed all 32-bit testing, but I will still be able to release for 32-bit.

Do we care about having such installs use AVX-512?

I won't build AVX512BW for 32-bit (see #3962). I will only build 512BW for 64-bit.

Note to self: AVX512F versus AVX512BW (from a Linux machine).

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 SSE2 2x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	1026 c/s real, 523 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 SSE2 2x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	607 c/s real, 608 c/s virtual

Will run 2 OpenMP threads
Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 SSSE3 2x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	1189 c/s real, 596 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 SSSE3 2x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	617 c/s real, 617 c/s virtual

Will run 2 OpenMP threads
Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 SSE4.1 2x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	1188 c/s real, 595 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 SSE4.1 2x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	611 c/s real, 611 c/s virtual

Will run 2 OpenMP threads
Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 AVX 2x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	1612 c/s real, 807 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 128/128 AVX 2x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	770 c/s real, 771 c/s virtual

Will run 2 OpenMP threads
Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 256/256 AVX2 4x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	3130 c/s real, 1567 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 256/256 AVX2 4x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	1639 c/s real, 1641 c/s virtual

Will run 2 OpenMP threads
Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 512/512 AVX512F 8x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	7445 c/s real, 3730 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 512/512 AVX512F 8x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	3635 c/s real, 3635 c/s virtual

Will run 2 OpenMP threads
Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 512/512 AVX512BW 8x]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw:	7782 c/s real, 3891 c/s virtual

Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 512/512 AVX512BW 8x]... DONE
Speed for cost 1 (iteration count) of 5000
Raw:	3755 c/s real, 3755 c/s virtual

solardiz · 2019-05-18T20:36:23Z

I guess you meant to test 512BW in 2xOMP there, but didn't. Anyway, we first need to know which of our formats are expected to benefit from BW, then test those. @magnumripper Perhaps you can suggest specific formats to use for 512F vs. 512BW tests.

claudioandre-br · 2019-05-18T20:46:24Z

I guess you meant to test 512BW in 2xOMP there

No, I did something wrong (fixed now).

magnumripper · 2019-05-18T23:53:12Z

(...) someone might mistakenly install a 32-bit build that we release on 64-bit Windows on AVX-512 capable hardware. Do we care about having such installs use AVX-512?

I don't think it's worth it.

@magnumripper Perhaps you can suggest specific formats to use for 512F vs. 512BW tests.

I believe the only difference between them (currently) is in swap32/swap64, so -BW probably doesn't gain very much (likely only noticeable for raw formats - unless it's hidden there too for other reasons).

solardiz · 2019-05-19T00:13:32Z

likely only noticeable for raw formats

Claudio shows a difference for sha512crypt, and I'm not too surprised it too might involve byte swaps - but we need to take a look at the code, or just run more benchmarks first to confirm there's a difference for that format. Anyway, I think we've decided on going 512BW for 64-bit Windows.
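To put a number on that difference, the ratios can be computed from the c/s real figures posted above. The helper name `speedup` is made up for this sketch. Note that these are single runs, so some of the gap may be noise.

```shell
# Ratio of two c/s figures, printed with three decimals.
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.3f\n", new / old }'
}

speedup 7782 7445   # AVX512BW vs AVX512F, 2xOMP
speedup 3755 3635   # AVX512BW vs AVX512F, single thread
```

That works out to roughly 4.5% with two OpenMP threads and 3.3% single-threaded in favor of 512BW.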

magnumripper · 2019-05-19T17:08:42Z

The basic SHA512 function in simd-intrinsics.c only has swaps if we use it with "flat in" and/or "flat out" flags, that is, we feed it scalar buffers (it will also use scatter/gather instructions of course but they don't differ between -F and -BW).

The sha512crypt format indeed uses SSEi_FLAT_IN and also does some byte swaps on its own - but the latter is (ultimately) using __builtin_bswap64((x)) on scalars so will be same speed on AVX-512F. I'm a bit surprised the gain is that big for BW, but that's a good thing of course. Now, I wonder if it would be possible to avoid those scalar swaps by using SSEi_FLAT_OUT in the very last SHA512 call; we should look into that. Apparently Jim wrote the SIMD support.

claudioandre-br · 2019-05-20T20:04:02Z

Closing for now. Everything is done.

solardiz assigned claudioandre-br May 16, 2019

solardiz added the enhancement label May 16, 2019

magnumripper added this to the Planned release (1.9.0-jumbo-2) milestone May 16, 2019

claudioandre-br added the fixed - pending verify label May 17, 2019

claudioandre-br closed this as completed May 20, 2019
