ramips: add support for mtk eip93 crypto engine #10042

Merged (1 commit) on Sep 12, 2023

skbeh
Contributor

@skbeh skbeh commented Jun 12, 2022

Mediatek EIP93 Crypto engine is a crypto accelerator which
is available in the Mediatek MT7621 SoC.
Although it was submitted to the upstream kernel, it is far from being accepted there. To make it available in openwrt as soon as possible, I ported the package from immortalwrt with some adjustments so that it builds with the 5.10 kernel.

Thanks:
mtk-eip93
immortalwrt

@skbeh skbeh changed the title add support for mtk eip93 crypto engine mt7621: add support for mtk eip93 crypto engine Jun 12, 2022
@skbeh skbeh changed the title mt7621: add support for mtk eip93 crypto engine ramips: add support for mtk eip93 crypto engine Jun 12, 2022
@musashino205
Contributor

musashino205 commented Jun 12, 2022

Please use your real email address for Signed-off-by line in the commit message instead of GitHub's address.

Guideline: https://openwrt.org/submitting-patches#submission_guidelines

@Ansuel Ansuel added the target/ramips pull request/issue for ramips target label Jun 12, 2022
@neheb
Contributor

neheb commented Jun 12, 2022

I remember running this a long time ago with cryptsetup's benchmark. While there was a speedup with AES and such, it wasn't that dramatic. I also don't remember if openssl was faster. Should be slower for lower block sizes.

ping @cotequeiroz since he deals with this stuff.

@aiamadeus
Contributor

No offense, but this looks like a copy of immortalwrt/immortalwrt@7fc2fd7 with author info stripped.

@skbeh
Contributor Author

skbeh commented Jun 13, 2022

@AmadeusGhost I had not seen it before the notification, so I reinvented the wheel 🌚
Also please tell me if there is any good way to keep author information within Makefile.

@cotequeiroz
Member

I remember running this a long time ago with cryptsetup's benchmark. While there was a speedup with AES and such, it wasn't that dramatic. I also don't remember if openssl was faster. Should be slower for lower block sizes.

ping @cotequeiroz since he deals with this stuff.

I've had some disappointing experiences with hw crypto, so my comments are probably discouraging. There are several drivers already included in openwrt, so I'm not going to be picky about adding another one. However, this driver does not appear to be ready for inclusion in the kernel. Just skimming through the code, I found some places needing changes. For example, the AES skcipher algos are registering themselves with CRYPTO_ALG_NEED_FALLBACK, which indicates that they need another driver to be used as a fallback for cases this one cannot handle. Nowhere in the code is there a call to initialize a skcipher fallback driver, so that flag should not be there.

I would strongly suggest that the driver be tested with the kernel self-test, and also speed tested. A friend has a device that I could try to install the module to do speed tests, but it is used in production, so I probably won't be able to install an image there. I'll post the results if able.

Running the kernel self-tests

Openwrt disables the crypto self-tests to save space. To be able to run them, one must enable the self-tests in the kernel config: make kernel_menuconfig, under Cryptographic API, deselect Disable run-time self tests. It affects the crypto manager module. Even though there is a package for it, the crypto manager ends up being built into the kernel, so installing a custom image will probably be necessary.

The self-tests will run upon loading the tcrypt module: insmod tcrypt mode=<test>, where mode defaults to 0. The list of tests can be seen in crypto/tcrypt.c in the kernel. Note that the speed tests are also listed there, and run with the same command, but using different modes. Here's the list of supported self-tests for this driver:

  • mode=0 (default) run all self-tests
  • mode=3 DES self-tests
  • mode=4 3DES self-tests
  • mode=10 AES self-tests
  • mode=155 AES-128-SHA1 HMAC self-test
  • mode=181 DES-SHA1 HMAC self-test
  • mode=182 3DES-SHA1 HMAC self-test
  • mode=183 DES-SHA224 HMAC self-test
  • mode=184 3DES-SHA224 HMAC self-test
  • mode=185 DES-SHA256 HMAC self-test
  • mode=186 3DES-SHA256 HMAC self-test

Running the kernel speed tests

Speed testing can be done without the above change, by installing the kmod-crypto-test package, then running insmod tcrypt mode=<test> sec=<seconds>. I'll list some interesting ones to test. If you omit sec or run it with sec=0, it will do one operation and count CPU cycles instead. Sync tests will not use eip93, while async ones will--you may use the results of an async test (eip93) to compare the performance against the equivalent sync test without having to remove the eip93 module:

  • mode=200 sync AES speed test
  • mode=500 async AES speed test
  • mode=201 sync 3DES speed test
  • mode=501 async 3DES speed test
  • mode=204 sync DES speed test
  • mode=504 async DES speed test
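To run each sync test next to its async counterpart, the command lines can be generated rather than typed by hand. This is only a convenience sketch: the mode numbers are the ones listed above, while the helper name and the default of one second are assumptions made for illustration.

```python
# Sync/async tcrypt speed-test modes as listed above.
SPEED_MODES = {"AES": (200, 500), "3DES": (201, 501), "DES": (204, 504)}


def tcrypt_command(mode: int, seconds: int = 1) -> str:
    """Build the insmod command line for one tcrypt speed test."""
    return f"insmod tcrypt mode={mode} sec={seconds}"


for cipher, (sync_mode, async_mode) in SPEED_MODES.items():
    print(f"# {cipher}: sync (software) first, then async (eip93)")
    print(tcrypt_command(sync_mode))
    print(tcrypt_command(async_mode))
```

The printed lines are meant to be pasted into a shell on the router; the script itself does not load any module.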

At least on arm, using hw-crypto to perform AES on network packets is not worth it. There is a large latency to set up the hw crypto operation. For networking with MTU=1500, take a look at the speed of block size=1472. On mvebu, the CESA beats the arm-asm driver (regular arm, not the neon version), but not by much (128-bits AES: 23560 vs 26607 CPU cycles). Then the acks will kill most/all of the gain: for CESA, using 128-bit AES on a 1472B packet plus a 64B ack: 23560 + 9469 = 33029 CPU cycles; arm-asm: 26607 + 1355 = 27962 cycles, a 15% penalty.
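The packet-plus-ack arithmetic can be checked with a few lines. This is a minimal sketch that uses only the cycle counts quoted above; the variable names are illustrative. It suggests that the 15% figure is the saving of arm-asm measured against the CESA total, which is the same gap as CESA needing roughly 18% more cycles than arm-asm.

```python
# Cycle counts quoted above (mvebu, 128-bit AES); names are illustrative only.
cesa = {"data_1472": 23560, "ack_64": 9469}
arm_asm = {"data_1472": 26607, "ack_64": 1355}

cesa_total = sum(cesa.values())      # 1472-byte packet plus 64-byte ack
arm_total = sum(arm_asm.values())

# Fraction of cycles the software path saves, relative to the CESA total.
saving_vs_cesa = (cesa_total - arm_total) / cesa_total
# Extra cycles CESA needs, relative to the software total.
extra_vs_arm = (cesa_total - arm_total) / arm_total

print(cesa_total, arm_total)
print(f"{saving_vs_cesa:.1%} saved by arm-asm, {extra_vs_arm:.1%} extra for CESA")
```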

I hope things on mips are different, and my expectations are just wrong.

@neheb
Contributor

neheb commented Jun 13, 2022

@cotequeiroz IIRC the mvebu hardware stuff was created when the core clock was closer to 1.2 GHz for the original WRT1900. It makes a little more sense there.

@cotequeiroz
Member

It makes sense. With CBC pretty much out of the picture, and CESA not supporting CTR, I had 0 f1090000.crypto interrupts with over 40 days of uptime before running the tests above.

@skbeh
Contributor Author

skbeh commented Jun 14, 2022

Here are the test logs.
test.log
speed.log

@cotequeiroz
Member

Maybe you didn't enable openwrt cryptodev engine. Mine have 277000 in a day.

I was just illustrating how it was not being used in the kernel. I don't run or recommend running openssl or wolfssl with cryptodev or afalg, even if it works faster. Most user space programs are not aware of how openssl processes its requests--internal or using an engine. Cryptodev file descriptors must not be shared across processes, which happens when you call fork() after opening an openssl context. Openssh is known to fail with either engine because of this.


Thanks for the tests:

The speed tests indicate that it may speedup regular network encryption, which is good. The bad news is that the self tests showed one error that raised concern:

[  622.862190] alg: aead: ccm_base(ctr(aes-eip93),cbcmac(aes-generic)) decryption failed on test vector 7; expected_error=0, actual_error=-77, cfg="in-place"

-77 is -EBADMSG on mips, indicating a message authentication failure.
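Error numbers above the basic POSIX range are not the same on every Linux architecture, which is why the value has to be read against the mips headers. A small sketch of that lookup follows; the two values are assumptions taken from the kernel's asm-generic and mips uapi errno headers, and the helper name is made up.

```python
# EBADMSG per architecture family (assumed from asm-generic/errno.h and
# arch/mips/include/uapi/asm/errno.h in the Linux kernel sources).
EBADMSG = {"generic": 74, "mips": 77}


def is_auth_failure(actual_error: int, arch: str) -> bool:
    """True if a self-test's actual_error equals -EBADMSG on the given arch."""
    return actual_error == -EBADMSG[arch]


print(is_auth_failure(-77, "mips"))     # value from the log line above
print(is_auth_failure(-77, "generic"))  # the same number is another error here
```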

The fact that it happened only in test vector 7 indicates it worked for vectors 0 through 6. 7 is the first vector without associated data--i.e. data not to be encrypted/decrypted, but that must be authenticated.

ccm(aes) is used in WPA. Since it passed most of the tests, a failure will not be easily detected during usage.

This should be investigated further. I would not trade reliability for speed. At the moment, this is a NAK for me.

@skbeh
Contributor Author

skbeh commented Jun 15, 2022

I think it is relevant to https://github.com/vschagen/mtk-eip93/blob/ca08387bf8352652129019bb19e2423ab313d5cb/crypto/mtk-eip93/eip93-main.c#L165 .
I wonder if the problem is that the on-chip engine cannot handle the authentication part of GCM (GHASH), according to https://forum.openwrt.org/t/mediatek-eip-93-crypto-driver-mt7621-finally-pushed-upstream-in-progress/111510 .
I think we can just fall back to software authentication in this situation.

@cotequeiroz
Member

I would use an async crypto fallback for requests smaller than, say 512 bytes. I've done it with qce in torvalds/linux@ce163ba, although later with torvalds/linux@25b71d6, it was inadvertently? limited to XTS only.

After you do this, the self tests should pass, but beware that they will fall under the 512-byte threshold, so see if you can test it with other data.
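The idea of a size-based fallback can be sketched in a few lines. This is a hypothetical illustration of the dispatch decision only, not the driver's code: the function name, the backend labels and the constant are assumptions, with 512 taken from the suggestion above.

```python
# Hypothetical sketch of size-based dispatch; not the driver's actual code.
FALLBACK_THRESHOLD = 512  # bytes, the value suggested above


def choose_backend(request_len: int, threshold: int = FALLBACK_THRESHOLD) -> str:
    """Return which path would handle a request of the given length."""
    if request_len < threshold:
        return "software-fallback"  # engine setup latency would dominate
    return "eip93"                  # large enough to amortize the setup cost


for size in (16, 64, 256, 1024, 1472, 8192):
    print(size, choose_backend(size))
```

With such a rule, short self-test vectors would exercise only the software path, which is why testing with larger data is suggested above.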

@skbeh
Contributor Author

skbeh commented Jun 21, 2022

I would use an async crypto fallback for requests smaller than, say 512 bytes. I've done it with qce in torvalds/linux@ce163ba, although later with torvalds/linux@25b71d6, it was inadvertently? limited to XTS only.

I applied similar changes in my stage tree. It works like a charm.
I will push the commit later.
Here are the logs:
log.tar.gz

@jickding

jickding commented Jun 22, 2022

please post your TEST method (test command or other)

this is my test log :
openssl speed -elapsed -evp aes-256-cbc -engine devcrypto

engine "devcrypto" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 72548 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 64 size blocks: 70679 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 256 size blocks: 64339 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 47812 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 8192 size blocks: 13709 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 16384 size blocks: 7565 aes-256-cbc's in 3.01s
OpenSSL 1.1.1o 3 May 2022
built on: Mon Jun 20 15:07:21 2022 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: mipsel-openwrt-linux-musl-gcc -fPIC -pthread -mabi=32 -Wa,--noexecstack -Wall -O3 -Os -pipe -mno-branch-likely -mips32r2 -mtune=24kc -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -msoft-float -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -DPIC -fpic -ffunction-sections -fdata-sections -znow -zrelro -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DNDEBUG -DOPENSSL_PREFER_CHACHA_OVER_GCM -DOPENSSL_SMALL_FOOTPRINT
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc        385.64k     1502.81k     5472.02k    16265.61k    37310.34k    41177.73k
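The summary row follows directly from the "Doing ..." lines: block size times operations, divided by the elapsed seconds, in units of 1000 bytes. A short sketch that recomputes it from the figures in this log (the function name is illustrative):

```python
# (block size in bytes, operations completed, elapsed seconds) from the log above
runs = [
    (16, 72548, 3.01),
    (64, 70679, 3.01),
    (256, 64339, 3.01),
    (1024, 47812, 3.01),
    (8192, 13709, 3.01),
    (16384, 7565, 3.01),
]


def kbytes_per_second(block_size: int, operations: int, seconds: float) -> float:
    """Throughput in 1000s of bytes per second, the unit openssl speed prints."""
    return block_size * operations / seconds / 1000


for block_size, operations, seconds in runs:
    rate = kbytes_per_second(block_size, operations, seconds)
    print(f"{block_size:>6} bytes: {rate:.2f}k")
```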

@skbeh
Contributor Author

skbeh commented Jun 22, 2022

insmod tcrypt mode=0 as mentioned above.

@jickding

insmod tcrypt mode=0 as mentioned above.

I have tested it. Very good: EIP-93 works fine.


@jickding jickding left a comment

Builds successfully and works fine on the openwrt-22.03 branch.

@skbeh
Contributor Author

skbeh commented Jun 22, 2022

@cotequeiroz I implemented fallback in the driver. With it, I saw a great speedup with small packets, and the self-tests passed.

@mgkiller7

openssl speed -elapsed -evp aes-256-cbc -engine devcrypto

I see the same test result as yours, but I note that the CPU load is very high:
Tasks: 92 total, 2 running, 90 sleeping, 0 stopped, 0 zombie
%Cpu(s): 99.0 us, 1.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 488.2 total, 292.5 free, 63.8 used, 131.9 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 395.7 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2431 root 20 0 4320 3416 3040 R 95.2 0.7 0:06.12 openssl
1 root 20 0 30856 7180 5396 S 1.0 1.4 0:37.29 systemd
2214 root 20 0 0 0 0 I 1.0 0.0 0:00.24 kworker/0+
2433 root 20 0 9300 2600 2216 R 1.0 0.5 0:00.14 top
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu+
9 root 20 0 0 0 0 S 0.0 0.0 0:02.37 ksoftirqd+
10 root 20 0 0 0 0 I 0.0 0.0 0:02.58 rcu_preem+
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration+
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
13 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kdevtmpfs
14 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks+
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 oom_reaper
18 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 writeback

Is it ok?

@skbeh
Contributor Author

skbeh commented Jun 22, 2022

@mgkiller7

I see the same test result as yours, but I note that the CPU load is very high

It is the expected behaviour because there is a high overhead submitting data to the engine and getting it back.

@cotequeiroz
Member

@cotequeiroz I implemented fallback in the driver. With it, I saw a great speedup with small packets, and the self-tests passed.

Nice job! You've managed to get the best of both worlds with this:

Cipher      block size    generic    eip-93 eip-93+fb fb/generic fb/pure
AES-128-CTR         16       1313     14915    1723.5       -24%    765%
AES-128-CTR         64     2576.5     12889      2648        -3%    387%
AES-128-CTR        256   10025.75   17183.5   12947.5       -23%     33%
AES-128-CTR       1024    36891.5     19410   20203.5        83%     -4%
AES-128-CTR       1472   52833.75   24109.5     22868       131%      5%
AES-128-CTR       8192   292113.5   68983.5     68115       329%      1%
AES-192-CTR         16      942.5   13605.5    1843.5       -49%    638%
AES-192-CTR         64       4169     13128      3057        36%    329%
AES-192-CTR        256    10834.5     15618   10891.5        -1%     43%
AES-192-CTR       1024      42622     21249   21556.5        98%     -1%
AES-192-CTR       1472   61164.75   24198.5   25188.5       143%     -4%
AES-192-CTR       8192  344112.75    113832  111401.5       209%      2%
AES-256-CTR         16       1035     12765    1599.5       -35%    698%
AES-256-CTR         64    4906.75     13906      5165        -5%    169%
AES-256-CTR        256      12329   14868.5     18852       -35%    -21%
AES-256-CTR       1024      48605   21865.5   21553.5       126%      1%
AES-256-CTR       1472    71290.5   25994.5     25996       174%      0%
AES-256-CTR       8192     390817     92178     88552       341%      4%

In the table above, I've averaged your test results, considering both encryption and decryption numbers (in counter-mode, they are the same operation).
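The two percentage columns appear to be cycle ratios minus one: fb/generic compares the generic driver with eip-93+fb, and fb/pure compares plain eip-93 with eip-93+fb. That formula is inferred from the numbers, not stated in the comment. A sketch that reproduces three of the AES-128-CTR rows:

```python
def percent_gain(reference_cycles: float, candidate_cycles: float) -> int:
    """How much faster the candidate is than the reference, in percent.

    Positive means the candidate needs fewer cycles than the reference.
    """
    return round((reference_cycles / candidate_cycles - 1) * 100)


# AES-128-CTR rows from the table: (block size, generic, eip-93, eip-93+fb)
rows = [
    (16, 1313, 14915, 1723.5),
    (1024, 36891.5, 19410, 20203.5),
    (1472, 52833.75, 24109.5, 22868),
]

for block, generic, eip93, eip93_fb in rows:
    fb_vs_generic = percent_gain(generic, eip93_fb)
    fb_vs_pure = percent_gain(eip93, eip93_fb)
    print(block, f"{fb_vs_generic}%", f"{fb_vs_pure}%")
```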


TLDR: The following is just a nitpick. If you're happy with what you've got, leave it alone. If you want to tweak the results for extra efficiency, read on.

The mean absolute deviation of the block sizes <= 256 was quite large (34% of the avg for AES-256). Try to test this with sec=1, to get a better sample (you'll get thousands of operations in one second).
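For reference, the statistic mentioned here can be computed as below. This is a minimal sketch; the sample values are made up for illustration and are not taken from the attached logs.

```python
from statistics import mean


def mad_percent(samples):
    """Mean absolute deviation of the samples, as a percentage of their mean."""
    average = mean(samples)
    deviation = mean(abs(value - average) for value in samples)
    return 100 * deviation / average


# Hypothetical cycle counts from repeated runs of one block size.
samples = [1000, 1500, 2000, 2500]
print(f"{mad_percent(samples):.1f}% of the average")
```

A large percentage for the small block sizes would suggest that a single-operation measurement is too noisy to pick a threshold from.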

If these numbers are indeed consistent with what you get with sec=1, I would, perhaps, reduce the transition point from 512 to 256 bytes. AES-128 would suffer a little, while AES-256 would benefit. Considering that using eip-93 reduces CPU usage, I would opt for the smaller value. The good thing is that it can be configured. You can add it as a config option in the package, and people can always override it manually, if they don't use aes-256, for example. Before doing this, I strongly advise gathering better data.

I used this to get a table from the logs that I could just copy and paste to a spreadsheet:

sed -n -e '/ ctr(aes) /,+18{s/.*] .*.*(\([0-9]\{3\}\).* \([0-9]\+\) byte.*in \(.*\) cycles.*/AES-\1-CTR\t\2\t\3/;s/.* \(a\?sync\).*\(ctr(aes-\)/\1 \2/;p}'

Of course, using sec=1 will require some sed tweaking.
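For anyone who prefers a script over sed, a rough Python equivalent of the extraction step is sketched below. The log line shape is an assumption based on what the sed expression matches, and the sample lines are made up. Unlike the sed command, this sketch does not restrict itself to the ctr(aes) section, so it should be fed only that part of the log.

```python
import re

# Assumed shape of tcrypt cycle-count lines (sec=0); adjust if your log differs.
LINE_RE = re.compile(
    r"\((\d{3}) bit key, (\d+) byte blocks\): 1 operation in (\d+) cycles"
)


def parse_tcrypt_cycles(log_text: str):
    """Yield (label, block_size, cycles) for each matching log line."""
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            key_bits, block_size, cycles = match.groups()
            yield f"AES-{key_bits}-CTR", int(block_size), int(cycles)


# Made-up sample lines for illustration, not taken from the attached logs.
sample = """\
[  100.000001] test 0 (128 bit key, 16 byte blocks): 1 operation in 1300 cycles (16 bytes)
[  100.000002] test 1 (128 bit key, 64 byte blocks): 1 operation in 2600 cycles (64 bytes)
"""

for label, block_size, cycles in parse_tcrypt_cycles(sample):
    print(f"{label}\t{block_size}\t{cycles}")
```

The tab-separated output can be pasted into a spreadsheet in the same way as the sed result.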

@csharper2005
Contributor

csharper2005 commented Aug 17, 2023

Hi guys! Sorry in advance if this is a dumb question or I missed something. Why is there no performance gain in the openssl benchmark - https://openwrt.org/docs/guide-user/perf_and_log/benchmark.openssl?

kmod-cryptodev, libopenssl and libopenssl-devcrypto are installed.

The first result is with eip93, the second one is without:
https://pastebin.com/Z056c41e

Proof that eip93 is enabled:
https://pastebin.com/Gd9gzh6M

@lukasz1992

Hi guys! Sorry in advance if this is a dumb question or I missed something. Why is there no performance gain in the openssl benchmark - https://openwrt.org/docs/guide-user/perf_and_log/benchmark.openssl?

kmod-cryptodev, libopenssl and libopenssl-devcrypto are installed.

The first result is with eip93, the second one is without: https://pastebin.com/Z056c41e

Proof that eip93 is enabled: https://pastebin.com/Gd9gzh6M

Please run openssl speed aes. Notice the lower CPU usage.

@csharper2005
Contributor

Please run openssl speed aes. Notice the lower CPU usage.

I noticed 100% load on one core in both cases. What does it look like on your device with eip93 enabled?

@skbeh
Contributor Author

skbeh commented Aug 18, 2023

@csharper2005 You need to enable CONFIG_OPENSSL_ENGINE_BUILTIN and CONFIG_OPENSSL_ENGINE_BUILTIN_DEVCRYPTO to make it work by default.

@csharper2005
Contributor

@csharper2005 You need to enable CONFIG_OPENSSL_ENGINE_BUILTIN and CONFIG_OPENSSL_ENGINE_BUILTIN_DEVCRYPTO to make it work by default.

Thanks. I installed necessary packages manually before. Everything turned out to be easier. The test from https://openwrt.org/docs/guide-user/perf_and_log/benchmark.openssl isn't representative for eip93.

I tried another one:
time openssl speed -evp aes-256-ctr

The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-ctr       6821.40k     7737.54k     8034.30k     8097.79k     8099.47k     8094.02k
real    0m 18.08s
user    0m 18.05s
sys     0m 0.03s

eip93:

The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-ctr      14714.21k    71627.89k   327614.17k   237514.11k  5585305.60k  4141875.20k
real    0m 18.32s
user    0m 1.34s
sys     0m 10.15s 

@eduardo010174
Contributor

eduardo010174 commented Aug 18, 2023

@csharper2005 https://openwrt.org/docs/techref/hardware/cryptographic.hardware.accelerators

File you need to change is /etc/ssl/modules.cnf.d/devcrypto.cnf

@rany2
Contributor

rany2 commented Aug 18, 2023

The authorship information was lost when converting this into a kernel patch. Personally I prefer it as a module to make changes to eip93 easier but I guess you think it's mature enough to be a kernel patch which is fair.

Also the previous patches (001-fix-building-with-kernel-5.10.patch and 002-crypto-use-AES-fallback-for-small-requests.patch) were simply folded into that one patch instead of being split to preserve authorship and the commit message.

@eduardo010174
Contributor

Router: Xiaomi R3G v1
Kernel: 5.15.126

root@OpenWrt:~# openssl engine -t -c -vv -pre DUMP_INFO devcrypto
(devcrypto) /dev/crypto engine
Information about ciphers supported by the /dev/crypto engine:
Cipher DES-CBC, NID=31, /dev/crypto info: id=1, driver=cbc(des-eip93) (hw accelerated)
Cipher DES-EDE3-CBC, NID=44, /dev/crypto info: id=2, driver=cbc(des3_ede-eip93) (hw accelerated)
Cipher BF-CBC, NID=91, /dev/crypto info: id=3, CIOCGSESSION (session open call) failed
Cipher CAST5-CBC, NID=108, /dev/crypto info: id=4, CIOCGSESSION (session open call) failed
Cipher AES-128-CBC, NID=419, /dev/crypto info: id=11, driver=cbc(aes-eip93) (hw accelerated)
Cipher AES-192-CBC, NID=423, /dev/crypto info: id=11, driver=cbc(aes-eip93) (hw accelerated)
Cipher AES-256-CBC, NID=427, /dev/crypto info: id=11, driver=cbc(aes-eip93) (hw accelerated)
Cipher RC4, NID=5, /dev/crypto info: id=12, CIOCGSESSION (session open call) failed
Cipher AES-128-CTR, NID=904, /dev/crypto info: id=21, driver=ctr(aes-eip93) (hw accelerated)
Cipher AES-192-CTR, NID=905, /dev/crypto info: id=21, driver=ctr(aes-eip93) (hw accelerated)
Cipher AES-256-CTR, NID=906, /dev/crypto info: id=21, driver=ctr(aes-eip93) (hw accelerated)
Cipher AES-128-ECB, NID=418, /dev/crypto info: id=23, driver=ecb(aes-eip93) (hw accelerated)
Cipher AES-192-ECB, NID=422, /dev/crypto info: id=23, driver=ecb(aes-eip93) (hw accelerated)
Cipher AES-256-ECB, NID=426, /dev/crypto info: id=23, driver=ecb(aes-eip93) (hw accelerated)

Information about digests supported by the /dev/crypto engine:
Digest MD5, NID=4, /dev/crypto info: id=13, driver=md5-generic (software), CIOCCPHASH capable
Digest SHA1, NID=64, /dev/crypto info: id=14, driver=sha1-generic (software), CIOCCPHASH capable
Digest RIPEMD160, NID=117, /dev/crypto info: id=102, driver=unknown. CIOCGSESSION (session open) failed
Digest SHA224, NID=675, /dev/crypto info: id=103, driver=sha224-generic (software), CIOCCPHASH capable
Digest SHA256, NID=672, /dev/crypto info: id=104, driver=sha256-generic (software), CIOCCPHASH capable
Digest SHA384, NID=673, /dev/crypto info: id=105, driver=sha384-generic (software), CIOCCPHASH capable
Digest SHA512, NID=674, /dev/crypto info: id=106, driver=sha512-generic (software), CIOCCPHASH capable

[Success]: DUMP_INFO
 [DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB]
     [ available ]
     USE_SOFTDRIVERS: specifies whether to use software (not accelerated) drivers (0=use only accelerated drivers, 1=allow all drivers, 2=use if acceleration can't be determined) [default=2]
     CIPHERS: either ALL, NONE, or a comma-separated list of ciphers to enable [default=ALL]
     DIGESTS: either ALL, NONE, or a comma-separated list of digests to enable [default=NONE]
     DUMP_INFO: dump info about each algorithm to stderr; use 'openssl engine -pre DUMP_INFO devcrypto'
#SOFTWARE
openssl speed -evp AES-128-CBC
openssl speed -evp AES-192-CBC
openssl speed -evp AES-256-CBC
openssl speed -evp AES-128-CTR
openssl speed -evp AES-192-CTR
openssl speed -evp AES-256-CTR

#MTK EIP
openssl speed -engine devcrypto -evp AES-128-CBC
openssl speed -engine devcrypto -evp AES-192-CBC
openssl speed -engine devcrypto -evp AES-256-CBC
openssl speed -engine devcrypto -evp AES-128-CTR
openssl speed -engine devcrypto -evp AES-192-CTR
openssl speed -engine devcrypto -evp AES-256-CTR
#SOFTWARE
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-CBC      11160.72k    14251.78k    15287.30k    15494.83k    15734.10k    15739.56k
AES-192-CBC       9922.53k    12296.77k    13097.30k    13314.39k    13377.54k    13380.27k
AES-256-CBC       8936.96k    10801.90k    11415.30k    11579.73k    11627.18k    11627.18k
AES-128-CTR      11204.31k    13612.74k    14423.30k    14636.03k    14740.12k    14685.53k
AES-192-CTR       9939.79k    11801.54k    12411.22k    12559.70k    12593.83k    12630.47k
AES-256-CTR       8971.88k    10426.03k    10885.89k    11011.41k    11026.43k    11037.35k

#MTK EIP
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-CBC      22852.32k   103936.75k   488260.27k   471263.42k  3485286.40k 15961292.80k
AES-192-CBC      16668.76k    91910.78k   507156.48k   281657.60k  4088354.13k  6846054.40k
AES-256-CBC      31263.75k   104266.97k   456140.80k   402432.00k  2793881.60k 12381388.80k
AES-128-CTR      17025.28k    96576.00k   289082.88k   317659.43k  1962218.06k  5372859.73k
AES-192-CTR      14954.95k    52799.31k   243704.32k   332618.83k  2026018.13k  2308505.60k
AES-256-CTR      16023.91k    52159.02k   442977.28k   271899.31k  2180382.72k  6122700.80k

@skbeh
Contributor Author

skbeh commented Aug 19, 2023

@eduardo010174 You need to add -elapsed to the parameters to make the test accurate.
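Without -elapsed, openssl speed divides by user CPU time, and a process that hands its work to the engine accumulates very little of it, so the reported throughput is inflated. A rough illustration follows, using the `time` figures from the eip93 run posted earlier in this thread; the byte count is a made-up placeholder and cancels out of the ratio.

```python
def throughput_k(total_bytes: float, seconds: float) -> float:
    """Throughput in 1000s of bytes per second."""
    return total_bytes / seconds / 1000


# Timings reported by `time` for the eip93 run earlier in this thread.
real_seconds = 18.32  # wall-clock time
user_seconds = 1.34   # CPU time openssl spent in user space

# Hypothetical amount of data processed during the run (illustration only).
total_bytes = 100_000_000

by_user_time = throughput_k(total_bytes, user_seconds)
by_wall_clock = throughput_k(total_bytes, real_seconds)

# The overstatement factor does not depend on the assumed byte count.
print(f"overstated by a factor of {by_user_time / by_wall_clock:.1f}")
```

This is a simplification, since openssl times each block size separately, but it shows why the offloaded numbers look implausibly high without -elapsed.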

@Ansuel
Member

Ansuel commented Aug 19, 2023

@rany2 do you have more info on the authorship? An idea might be to add all of them in the tag in this commit. Or even better create the patch from a git format run instead of a diff and add all the credits there.

Aside from this formal stuff, I think this is ready, and I will merge soon if we can't sort this last small thing.

@lukasz1992

@Ansuel I assume the author is: Richard van Schagen vschagen@icloud.com
https://github.com/vschagen/mtk-eip93 - each file in crypto/ contains this header:

// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (C) 2019 - 2021
 *
 * Richard van Schagen <vschagen@icloud.com>
 */

@lukasz1992

lukasz1992 commented Aug 19, 2023

It would be perfect if this change were ported back to openwrt-22.03.
If not the whole driver, then at least the dts patches, which would allow installing it manually later.

EDIT: Tested it on 22.03-snapshot and rc2 - it works correctly and stably, as on snapshot

@eduardo010174
Contributor

eduardo010174 commented Aug 19, 2023

Sorry for the mistake.

Router: Xiaomi R3G v1
Kernel: 5.15.126

#SOFTWARE
openssl speed -elapsed -evp AES-128-CBC
openssl speed -elapsed -evp AES-192-CBC
openssl speed -elapsed -evp AES-256-CBC
openssl speed -elapsed -evp AES-128-CTR
openssl speed -elapsed -evp AES-192-CTR
openssl speed -elapsed -evp AES-256-CTR

#MTK EIP
openssl speed -engine devcrypto -elapsed -evp AES-128-CBC
openssl speed -engine devcrypto -elapsed -evp AES-192-CBC
openssl speed -engine devcrypto -elapsed -evp AES-256-CBC
openssl speed -engine devcrypto -elapsed -evp AES-128-CTR
openssl speed -engine devcrypto -elapsed -evp AES-192-CTR
openssl speed -engine devcrypto -elapsed -evp AES-256-CTR
#SOFTWARE
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-CBC      11119.23k    14259.86k    15345.83k    15646.38k    15704.06k    15734.10k
AES-192-CBC       9884.20k    12289.07k    13088.51k    13305.86k    13300.43k    13363.88k
AES-256-CBC       8896.22k    10800.58k    11392.68k    11573.93k    11621.72k    11621.72k
AES-128-CTR      11195.61k    13627.93k    14427.99k    14640.81k    14699.18k    14696.45k
AES-192-CTR       9961.54k    11812.03k    12402.69k    12564.48k    12610.22k    12615.68k
AES-256-CTR       8922.91k    10425.62k    10886.57k    11015.17k    11029.16k    11048.28k

#MTK EIP
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-CBC       2213.72k     5790.83k     9705.30k    17153.37k    46303.91k    53351.77k
AES-192-CBC       2112.26k     5266.71k     8468.91k    16140.29k    40990.04k    46071.81k
AES-256-CBC       1969.17k     4779.86k     7570.86k    15508.14k    36828.50k    40905.39k
AES-128-CTR       4462.18k     5826.13k     9634.05k    17236.65k    47090.35k    54018.05k
AES-192-CTR       4185.27k     4904.92k     8176.90k    16325.63k    41314.99k    46809.09k
AES-256-CTR       3822.21k     4644.63k     7403.35k    15494.83k    37090.65k    41353.22k
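To see where the engine starts to pay off, the two -elapsed tables can be divided row by row. The sketch below does this for the AES-128-CBC rows only, with the numbers copied from the tables above; the function and variable names are illustrative.

```python
sizes = [16, 64, 256, 1024, 8192, 16384]

# AES-128-CBC rows (1000s of bytes per second) from the -elapsed tables above.
software = [11119.23, 14259.86, 15345.83, 15646.38, 15704.06, 15734.10]
eip93 = [2213.72, 5790.83, 9705.30, 17153.37, 46303.91, 53351.77]


def speedup(hw, sw):
    """Ratio of engine throughput to software throughput per block size."""
    return [h / s for h, s in zip(hw, sw)]


ratios = speedup(eip93, software)
for size, ratio in zip(sizes, ratios):
    print(f"{size:>6} bytes: {ratio:.2f}x")

# First block size at which the engine is at least as fast as software.
break_even = next(size for size, ratio in zip(sizes, ratios) if ratio >= 1)
print("break-even at", break_even, "bytes")
```

These are user-space figures through /dev/crypto, so the crossover point need not match what the kernel sees in the tcrypt tests.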

@rany2
Contributor

rany2 commented Aug 19, 2023

@Ansuel The original authorship information is in this commit: 53bdd30#diff-42f66007baa5f76166fcc61d3742a890a1f29e5b6e5f0de0e9c50912e9caa2ca

However when it was converted into a kernel patch, it dropped this metadata.

@lukasz1992

@Ansuel The original authorship information is in this commit: 53bdd30#diff-42f66007baa5f76166fcc61d3742a890a1f29e5b6e5f0de0e9c50912e9caa2ca

However when it was converted into a kernel patch, it dropped this metadata.

vschagen wrote the code, and Aviana Cruz improved it (so that the engine is not used when the CPU is faster).

Mediatek EIP93 Crypto engine is a crypto accelerator which
is available in the Mediatek MT7621 SoC.

Signed-off-by: Aviana Cruz <gwencroft@proton.me>
Co-authored-by: Richard van Schagen <vschagen@icloud.com>
Co-authored-by: Chukun Pan <amadeus@jmu.edu.cn>
@openwrt-bot openwrt-bot merged commit 46d6730 into openwrt:main Sep 12, 2023
6 checks passed
@Ansuel
Member

Ansuel commented Sep 12, 2023

Question: are we sure this is OK to backport to 23.05? (A backport to 22.03 is too much.)

Actually, on second thought, maybe it would be problematic to backport to 23.05, since it would not be a fix but really a new feature... Hope you guys are not sad about 23.05 not having it.

@rany2
Contributor

rany2 commented Sep 12, 2023

@Ansuel It would have been easier to backport to 23.05 if it was a separate package like it was in this iteration of this PR: 53bdd30

Don't get why it was turned into a kernel patch, just adds to overhead IMO

@Djfe
Contributor

Djfe commented Sep 12, 2023

I'm glad this got merged 👍
I get your concerns and I would be OK with it not making the next release. But I won't be the biggest user of this new feature so you should wait for replies of others 😅

Let's think about the implications of backporting this now:
If this commit were to cause issues, then what would be the implication? A fix could be as easy as reverting the commit in release, right?
This feature could mostly break some (rarer) VPNs on just one target. But the tests all ran just fine. So it could be some devices where the acceleration chip is broken? If there are any issues with this feature, then I feel like we might not even catch it in master/snapshots.
You could discuss this with other committers. But only if you feel like it.

@Ansuel
Member

Ansuel commented Sep 12, 2023 via email

@csharper2005
Contributor

csharper2005 commented Sep 12, 2023

Question: are we sure this is OK to backport to 23.05? (A backport to 22.03 is too much.)

Yeah. It would be nice to have this in 23.05 and even 22.03. Fix for the insufficient ikev2 performance. :)

@Ansuel
Member

Ansuel commented Sep 12, 2023

I'm glad, this got merged 👍 I get your concerns and I would be OK with it not making the next release. But I won't be the biggest user of this new feature so you should wait for replies of others 😅

Let's think about the implications of backporting this now: If this commit were to cause issues, then what would be the implication? A fix could be as easy as reverting the commit in release, right? This feature could mostly break some (rarer) VPN's on just one target. But the tests all ran just fine. So it could be some devices where the acceleration chip is broken? If there are any issues with this feature, then I feel like we might not even catch it in master/snapshots. You could discuss this with other committers. But only if you feel like it.

We would have to wait for a point release for the revert to be applied; that is the main problem. It's OK for an RC, but if we don't catch any problem at the RC stage, it will be problematic in the future.

fengmushu pushed a commit to fengmushu/openwrt that referenced this pull request Nov 18, 2023
1. Fixes CPU and regulator boot problems
2. Enable CPUFREQ, I2C, RTC and THERMAL support
3. Disabled annoying debug logs, refresh kconfig
4. Fix compilation failure due to wrong kconfig

Fixes: openwrt#10042

Signed-off-by: AmadeusGhost <amadeus@openjmu.xyz>
@orangepizza
Contributor

I don't think libustream-mbedtls uses this engine even when included (at least for GCM in TLS). Is there something more that needs to be done?

@neheb
Contributor

neheb commented Jan 20, 2024

mbedtls does not support hardware offload

@Headcrabed

Headcrabed commented Jan 20, 2024

I don't think libustream-mbedtls uses this engine even when included (at least for GCM in TLS). Is there something more that needs to be done?

You need to remove luci-ssl, wpad-basic-mbedtls and libustream-mbedtls, then install luci-ssl-openssl, wpad-(basic)-openssl and libustream-openssl to switch to the openssl-based implementation, and install libopenssl-devcrypto to enable hw crypto usage. Don't forget to reboot your router after installing.
[image]

@orangepizza
Contributor

https://mbed-tls.readthedocs.io/en/latest/kb/development/hw_acc_guidelines/
It looks like mbedtls has its own extension to allow hw acceleration, but that looks like a massive can of worms.

Labels: core packages, kernel, target/ramips