OpenSSL vs LibreSSL performance AES-NI #3551

Closed
h-2 opened this issue Jun 23, 2019 · 13 comments
Labels
support Community support


@h-2

h-2 commented Jun 23, 2019

Describe the bug

This is a follow-up to #2343.

When I choose the LibreSSL flavour, OpenVPN reports no hardware crypto. With OpenSSL flavour it does.

LibreSSL:
OpenVPN config, Hardware Crypto: No Hardware Crypto Acceleration

OpenSSL:
OpenVPN config, Hardware Crypto: Intel RDRAND engine - RAND.

The actual speed of libressl suggests that it does have hardware accel:

LibreSSL /usr/local/bin/openssl speed aes-128-cbc yields 125 MB/s
LibreSSL /usr/local/bin/openssl speed -evp aes-128-cbc yields 572 MB/s

On the other hand OpenSSL seems to have regressed:

OpenSSL /usr/bin/openssl speed aes-128-cbc yields 125 MB/s
OpenSSL /usr/bin/openssl speed -evp aes-128-cbc yields 206 MB/s

(OpenSSL with -evp gave ~400 MB/s in #2343)

I am not too worried about the latter if we can fix the former, i.e. make OpenVPN use LibreSSL + hardware crypto.

Environment

OPNsense 19.1.9-amd64
FreeBSD 11.2-RELEASE-p10-HBSD
LibreSSL 2.8.3
Intel(R) Celeron(R) CPU J3455 @ 1.50GHz (4 cores)

@fichtner fichtner added the support Community support label Jun 24, 2019
@fichtner
Member

Would you mind rerunning the test with the correct OpenSSL binary at /usr/local/bin/openssl ?

@Keltere

Keltere commented Jun 25, 2019

/usr/bin/openssl and /usr/local/bin/openssl have identical performance; there is a small difference of 4KB, but I think it's within the margin of error.

On my device with Intel(R) Celeron(R) CPU 3865U @ 1.80GHz (2 cores)
/usr/bin/openssl speed aes-128-cbc results in 80 MB/s
/usr/bin/openssl speed -evp aes-128-cbc results in 640 MB/s

@fichtner
Member

That's what I would expect as well, but then again the tests by @h-2 seem to be off. LibreSSL with -evp should be a bit slower than OpenSSL, but without -evp they should perform the same (much slower than with -evp).

@h-2
Author

h-2 commented Jun 25, 2019

OK, here are more complete logs, maybe that's helpful.

Summary: performance seems to be equal everywhere. I don't know where the 200 MB/s figure above came from. But it's still the problem that OpenVPN doesn't detect hardware accel when LibreSSL is chosen.

OpenSSL flavour, base, no evp

# /usr/bin/openssl speed aes-128-cbc
Doing aes-128 cbc for 3s on 16 size blocks: 9072264 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 64 size blocks: 2536191 aes-128 cbc's in 3.02s
Doing aes-128 cbc for 3s on 256 size blocks: 660585 aes-128 cbc's in 3.05s
Doing aes-128 cbc for 3s on 1024 size blocks: 362704 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 8192 size blocks: 46080 aes-128 cbc's in 3.00s
OpenSSL 1.0.2o-freebsd  27 Mar 2018
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      48385.41k    53685.99k    55502.69k   123802.97k   125829.12k

OpenSSL flavour, base, evp

# /usr/bin/openssl speed -evp aes-128-cbc 
Doing aes-128-cbc for 3s on 16 size blocks: 77225335 aes-128-cbc's in 3.07s
Doing aes-128-cbc for 3s on 64 size blocks: 24376799 aes-128-cbc's in 3.02s
Doing aes-128-cbc for 3s on 256 size blocks: 6371965 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 1658513 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 209535 aes-128-cbc's in 3.00s
OpenSSL 1.0.2o-freebsd  27 Mar 2018
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     402436.35k   516007.07k   542328.70k   566105.77k   572170.24k
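
For readers who want to check the summary row against the "Doing ..." lines: each figure appears to be block size times operation count, divided by the elapsed time, in units of 1000 bytes per second. The sketch below reproduces the 8192-byte columns of the two runs above. The function name and the regular expression are illustrative only and are not part of OpenSSL.

```python
import re

# Matches lines such as:
# "Doing aes-128-cbc for 3s on 8192 size blocks: 209535 aes-128-cbc's in 3.00s"
LINE = re.compile(r"on (\d+) size blocks: (\d+) .* in ([\d.]+)s")


def throughput_k(line: str) -> float:
    """Return throughput in 1000s of bytes per second for one 'Doing' line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    block_size = int(m.group(1))   # bytes per operation
    count = int(m.group(2))        # operations completed
    seconds = float(m.group(3))    # elapsed time as printed
    return block_size * count / seconds / 1000.0


evp = "Doing aes-128-cbc for 3s on 8192 size blocks: 209535 aes-128-cbc's in 3.00s"
no_evp = "Doing aes-128 cbc for 3s on 8192 size blocks: 46080 aes-128 cbc's in 3.00s"

print(f"{throughput_k(evp):.2f}k")     # 572170.24k
print(f"{throughput_k(no_evp):.2f}k")  # 125829.12k
```

Rows whose elapsed time is not exactly 3.00s may differ slightly from the printed summary, presumably because the time is shown rounded to two decimals.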

OpenSSL flavour, pkg, no-evp

# /usr/local/bin/openssl speed aes-128-cbc
Doing aes-128 cbc for 3s on 16 size blocks: 8788676 aes-128 cbc's in 3.03s
Doing aes-128 cbc for 3s on 64 size blocks: 2567406 aes-128 cbc's in 3.06s
Doing aes-128 cbc for 3s on 256 size blocks: 650507 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 367025 aes-128 cbc's in 3.02s
Doing aes-128 cbc for 3s on 8192 size blocks: 46315 aes-128 cbc's in 3.01s
OpenSSL 1.0.2s  28 May 2019
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: cc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -pthread -D_THREAD_SAFE -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -O3 -Wall -O2 -pipe  -DHARDENEDBSD -fPIE -fPIC -Werror -Qunused-arguments -fstack-protector-all -fno-strict-aliasing -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      46389.71k    53653.55k    55509.93k   124306.72k   126142.33k

OpenSSL flavour, pkg, evp

# /usr/local/bin/openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 63046145 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 64 size blocks: 23199839 aes-128-cbc's in 3.04s
Doing aes-128-cbc for 3s on 256 size blocks: 6245402 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 1648657 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 209396 aes-128-cbc's in 3.00s
OpenSSL 1.0.2s  28 May 2019
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: cc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -pthread -D_THREAD_SAFE -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -O3 -Wall -O2 -pipe  -DHARDENEDBSD -fPIE -fPIC -Werror -Qunused-arguments -fstack-protector-all -fno-strict-aliasing -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     335372.74k   488568.33k   532940.97k   562741.59k   571790.68k

LibreSSL flavour, base, no evp

# /usr/bin/openssl speed aes-128-cbc
Doing aes-128 cbc for 3s on 16 size blocks: 9032105 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 64 size blocks: 2507120 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 256 size blocks: 669309 aes-128 cbc's in 3.09s
Doing aes-128 cbc for 3s on 1024 size blocks: 366071 aes-128 cbc's in 3.02s
Doing aes-128 cbc for 3s on 8192 size blocks: 46791 aes-128 cbc's in 3.05s
OpenSSL 1.0.2o-freebsd  27 Mar 2018
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      48297.00k    53624.87k    55383.63k   123983.61k   125804.92k

LibreSSL flavour, base, evp

# /usr/bin/openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 72654480 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 24148044 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 6371409 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 1672847 aes-128-cbc's in 3.02s
Doing aes-128-cbc for 3s on 8192 size blocks: 215272 aes-128-cbc's in 3.09s
OpenSSL 1.0.2o-freebsd  27 Mar 2018
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     387490.56k   515158.27k   543693.57k   566572.10k   571465.96k

LibreSSL flavour, pkg, no-evp

# /usr/local/bin/openssl speed aes-128-cbc
Doing aes-128 cbc for 3s on 16 size blocks: 8788676 aes-128 cbc's in 3.03s
Doing aes-128 cbc for 3s on 64 size blocks: 2567406 aes-128 cbc's in 3.06s
Doing aes-128 cbc for 3s on 256 size blocks: 650507 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 367025 aes-128 cbc's in 3.02s
Doing aes-128 cbc for 3s on 8192 size blocks: 46315 aes-128 cbc's in 3.01s
OpenSSL 1.0.2s  28 May 2019
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: cc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -pthread -D_THREAD_SAFE -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -O3 -Wall -O2 -pipe  -DHARDENEDBSD -fPIE -fPIC -Werror -Qunused-arguments -fstack-protector-all -fno-strict-aliasing -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      46389.71k    53653.55k    55509.93k   124306.72k   126142.33k

LibreSSL flavour, pkg, evp

# /usr/local/bin/openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 65602902 aes-128-cbc's in 3.04s
Doing aes-128-cbc for 3s on 64 size blocks: 23701471 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 256 size blocks: 6353264 aes-128-cbc's in 3.02s
Doing aes-128-cbc for 3s on 1024 size blocks: 1685153 aes-128-cbc's in 3.05s
Doing aes-128-cbc for 3s on 8192 size blocks: 211773 aes-128-cbc's in 3.03s
LibreSSL 2.8.3
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     345819.46k   503601.53k   539108.75k   565208.08k   572077.48k
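
To back up the "equal everywhere" summary, here is a small sketch that compares the 8192-byte column of all eight runs. The numbers are copied from the logs above; the dictionary layout and helper name are just one way to organise them.

```python
# 8192-byte column (1000s of bytes per second), copied from the eight runs above.
results = {
    ("OpenSSL", "base", "no evp"): 125829.12,
    ("OpenSSL", "base", "evp"): 572170.24,
    ("OpenSSL", "pkg", "no evp"): 126142.33,
    ("OpenSSL", "pkg", "evp"): 571790.68,
    ("LibreSSL", "base", "no evp"): 125804.92,
    ("LibreSSL", "base", "evp"): 571465.96,
    ("LibreSSL", "pkg", "no evp"): 126142.33,
    ("LibreSSL", "pkg", "evp"): 572077.48,
}


def evp_speedup(flavour: str, location: str) -> float:
    """Ratio of -evp throughput to non-evp throughput for one binary."""
    return results[(flavour, location, "evp")] / results[(flavour, location, "no evp")]


speedups = {}
for flavour in ("OpenSSL", "LibreSSL"):
    for location in ("base", "pkg"):
        speedups[(flavour, location)] = evp_speedup(flavour, location)
        print(f"{flavour} flavour, {location}: {speedups[(flavour, location)]:.2f}x with -evp")

# Relative spread between the fastest and slowest -evp run.
evp_values = [v for (_, _, mode), v in results.items() if mode == "evp"]
spread = (max(evp_values) - min(evp_values)) / min(evp_values)
print(f"spread across -evp runs: {spread:.2%}")
```

With these numbers every binary is roughly 4.5 times faster with -evp, and the four -evp results differ from each other by well under 1%.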

@fichtner
Member

In base there is no LibreSSL :)

But it's still the problem that OpenVPN doesn't detect hardware accel when LibreSSL is chosen.

LibreSSL does not have engine support since --err-- version 2.3 I think?

But if you choose an AESNI cipher envelope acceleration (-evp) will automatically be used. The code is assembler code inside the crypto library, so there is no need for external engine support in either case.

It might be that OpenVPN on OpenSSL with crypto engine is slower than without it....

Cheers,
Franco
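
Since the accelerated path lives inside the crypto library, what matters is whether the CPU advertises AES-NI at all, not whether an engine is listed. A minimal sketch for checking that on FreeBSD, assuming the boot messages contain a `Features2=0x...<...>` line that lists CPU flags; the sample line below is made up for illustration and is not from the hardware in this thread.

```python
import re


def cpu_features2(dmesg_text: str) -> set:
    """Collect flag names from FreeBSD-style 'Features2=0x...<A,B,C>' lines."""
    flags = set()
    for m in re.finditer(r"Features2=0x[0-9a-fA-F]+<([^>]*)>", dmesg_text):
        flags.update(name.strip() for name in m.group(1).split(","))
    return flags


def has_aesni(dmesg_text: str) -> bool:
    """True if the AESNI flag is present in the boot messages."""
    return "AESNI" in cpu_features2(dmesg_text)


# Hypothetical sample line, shortened for the example.
sample = "  Features2=0x4ff8ebbf<SSE3,PCLMULQDQ,SSSE3,SSE4.1,SSE4.2,POPCNT,AESNI,RDRAND>\n"
print(has_aesni(sample))
```

On a real system the text would typically come from the saved boot log (on FreeBSD usually /var/run/dmesg.boot, which is an assumption about the setup here).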

@h-2
Author

h-2 commented Jun 25, 2019

In base there is no LibreSSL :)

Yeah, I know. But because you suggested my results were fishy, I wanted to make sure everything was OK.

LibreSSL does not have engine support since --err-- version 2.3 I think?

So engine support is just about random numbers and not about ciphers?

But if you choose an AESNI cipher envelope acceleration (-evp) will automatically be used.

Well, I don't really know what OpenVPN is doing, do you? I would need to set up a local benchmark because my outgoing connection isn't fast enough to measure a difference... Or are there local OpenVPN benchmarks one can run?

@fichtner
Member

fichtner commented Jun 25, 2019

So engine support is just about random numbers and not about ciphers?

Engine hardware acceleration accelerates whatever the engine supports. I'm not sure if it (always) speeds up between engine and envelope or if there are drawbacks (i.e. the old cryptodev was slower in some scenarios).

The best way forward is to measure OpenVPN performance directly, but that means testing between two endpoints, so probably a test lab. It also requires a more powerful peer, so that the benchmark is capped by the hardware under test rather than by the other end.

Maybe @mimugmail has some test results at hand from earlier or general lab setup tips.

@mimugmail
Member

I only tested OpenVPN one time with OpenSSL compared to WireGuard, no tests with Libre at all.
Rebuilding the lab would take some hours and I'm not sure what new results to gain as OpenVPN is known to be much slower than IPSEC.

@mimugmail
Member

@Keltere

Keltere commented Jun 27, 2019

https://www.routerperformance.net/comparing-opnsense-vpn-performance/

Thanks for that benchmark, but did you try plain ipsec or ipsec/l2tp?

@fichtner
Member

pretty sure this wasn't ipsec/l2tp

@Keltere

Keltere commented Jun 27, 2019

pretty sure this wasn't ipsec/l2tp

I've always found OpenVPN to be the better choice even for raw performance alone, leaving the security aspect aside.
This is the first time I've seen ipsec win.

@mimugmail
Member

We are talking about site-to-site, not client VPN. Plain IKEv2 road warrior should be faster than OpenVPN.


4 participants