-
-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ramips: add support for mtk eip93 crypto engine #10042
Conversation
Please use your real email address for Signed-off-by line in the commit message instead of GitHub's address. Guideline: https://openwrt.org/submitting-patches#submission_guidelines |
package/kernel/mtk-eip93/patches/001-Allow-build-on-kernel-5.10.patch
Outdated
Show resolved
Hide resolved
I remember running this a long time ago with cryptsetup's benchmark. While there was a speedup with AES and such, it wasn't that dramatic. I also don't remember if openssl was faster. Should be slower for lower block sizes. ping @cotequeiroz since he deals with this stuff. |
No offense, but this looks like a copy of immortalwrt/immortalwrt@7fc2fd7 with author info stripped. |
@AmadeusGhost I have not seen it before the notification therefore I reinvented the wheel🌚 |
I've had some disappointing experiences with hw crypto, so my comments are probably discouraging. There are several drivers already included in openwrt, so I'm not going to be picky about adding another one. However, this driver does not appear to be ready for inclusion in the kernel. Just skimming through the code, I found some places needing changes. For example, the AES skcipher algos are registering themselves with I would strongly suggest that the driver be tested with the kernel self-test, and also speed tested. A friend has a device that I could try to install the module to do speed tests, but it is used in production, so I probably won't be able to install an image there. I'll post the results if able. Running the kernel self-testsOpenwrt disables the crypto self-tests to save space. To be able to run them, one must enable the self-tests in the kernel config: The self-tests will run upon loading the
Running the kernel speed testsSpeed testing can be done without the above change, by installing the
At least on arm, using hw-crypto to perform AES on network packets is not worth it. There is a large latency to set up the hw crypto operation. For networking with MTU=1500, take a look at the speed of block size=1472. On mvebu, the CESA beats the arm-asm driver (regular arm, not the neon version), but not by much (128-bits AES: 23560 vs 26607 CPU cycles). Then the acks will kill most/all of the gain: for CESA, using 128-bit AES on a 1472B packet plus a 64B ack: 23560 + 9469 = 33029 CPU cycles; arm-asm: 26607 + 1355 = 27962 cycles, a 15% penalty. I hope things on mips are different, and my expectations are just wrong. |
@cotequeiroz IIRC the mvebu hardware stuff was created when the core clock was closer to 1.2ghz for the original WRT1900. It makes a little more sense there. |
It makes sense. With CBC pretty much out of the picture, and CESA not supporting CTR, I had 0 |
I was just illustrating how it was not being used in the kernel. I don't run or recommend running openssl or wolfssl with cryptodev or afalg, even if it works faster. Most user space programs are not aware of how openssl processes its requests--internal or using an engine. Cryptodev file descriptors must not be shared across processes, which happens when you call Thanks for the tests: The speed tests indicate that it may speedup regular network encryption, which is good. The bad news is that the self tests showed one error that raised concern:
77 is The fact that it happened only in test vector 7 indicates it worked for vectors 0 through 6. 7 is the first vector without associated data--i.e. data not to be encrypted/decrypted, but that must be authenticated.
This should be investigated further. I would not trade reliability for speed. At the moment, this is a NAK for me. |
I think it is relevant to https://github.com/vschagen/mtk-eip93/blob/ca08387bf8352652129019bb19e2423ab313d5cb/crypto/mtk-eip93/eip93-main.c#L165 . |
I would use an async crypto fallback for requests smaller than, say 512 bytes. I've done it with qce in torvalds/linux@ce163ba, although later with torvalds/linux@25b71d6, it was inadvertently? limited to XTS only. After you do this, the self tests should pass, but beware that they will fall under the 512-byte threshold, so see if you can test it with other data. |
I applied similar changes in my stage tree. It works like a charm. |
please post your TEST method (test command or other) this is my test log : engine "devcrypto" set. |
|
i have tested . very great. EIP-93 work fine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
building success and work fine at openwrt-22.03 branch
@cotequeiroz I implemented fallback in the driver. Therefore I saw a great speedup with small packets and the self tests passed. |
I did see the same test result as yours, but i note the CPU load is very high: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND Is it ok? |
It is the expected behaviour because there is a high overhead submitting data to the engine and getting it back. |
Nice job! You've managed to get the best of both worlds with this:
In the table above, I've averaged your test results, considering both encryption and decryption numbers (in counter-mode, they are the same operation). TLDR: The following is just a nitpick. If you're happy with what you've got, leave it alone. If you want to tweak the results for extra efficiency, read on. The mean absolute deviation of the block sizes <= 256 was quite large (34% of the avg for AES-256). Try to test this with If these numbers are indeed consistent with what you get with I used this to get a table from the logs that I could just copy and paste to a spreadsheet:
Of course, using |
Hi guys! Sorry In advance if this is a dumb question or I missed something. Why is there no performance growth in openssl benchmark - https://openwrt.org/docs/guide-user/perf_and_log/benchmark.openssl?
The first result is with eip93, the second one is without: Proof that eip93 is enabled: |
please run |
I noticed 100% load of the one core in both cases. How does it look like on your device with eip93 enabled? |
@csharper2005 You need to enable |
Thanks. I installed necessary packages manually before. Everything turned out to be easier. The test from https://openwrt.org/docs/guide-user/perf_and_log/benchmark.openssl isn't representative for eip93. I tried the another one:
eip93:
|
@csharper2005 https://openwrt.org/docs/techref/hardware/cryptographic.hardware.accelerators File you need to change is /etc/ssl/modules.cnf.d/devcrypto.cnf |
The authorship information was lost when converting this into a kernel patch. Personally I prefer it as a module to make changes to eip93 easier but I guess you think it's mature enough to be a kernel patch which is fair. Also the previous patches (001-fix-building-with-kernel-5.10.patch and 002-crypto-use-AES-fallback-for-small-requests.patch) were simply folded into that one patch instead of being split to preserve authorship and the commit message. |
Router: Xiaomi R3G v1 root@OpenWrt:~# openssl engine -t -c -vv -pre DUMP_INFO devcrypto
(devcrypto) /dev/crypto engine
Information about ciphers supported by the /dev/crypto engine:
Cipher DES-CBC, NID=31, /dev/crypto info: id=1, driver=cbc(des-eip93) (hw accelerated)
Cipher DES-EDE3-CBC, NID=44, /dev/crypto info: id=2, driver=cbc(des3_ede-eip93) (hw accelerated)
Cipher BF-CBC, NID=91, /dev/crypto info: id=3, CIOCGSESSION (session open call) failed
Cipher CAST5-CBC, NID=108, /dev/crypto info: id=4, CIOCGSESSION (session open call) failed
Cipher AES-128-CBC, NID=419, /dev/crypto info: id=11, driver=cbc(aes-eip93) (hw accelerated)
Cipher AES-192-CBC, NID=423, /dev/crypto info: id=11, driver=cbc(aes-eip93) (hw accelerated)
Cipher AES-256-CBC, NID=427, /dev/crypto info: id=11, driver=cbc(aes-eip93) (hw accelerated)
Cipher RC4, NID=5, /dev/crypto info: id=12, CIOCGSESSION (session open call) failed
Cipher AES-128-CTR, NID=904, /dev/crypto info: id=21, driver=ctr(aes-eip93) (hw accelerated)
Cipher AES-192-CTR, NID=905, /dev/crypto info: id=21, driver=ctr(aes-eip93) (hw accelerated)
Cipher AES-256-CTR, NID=906, /dev/crypto info: id=21, driver=ctr(aes-eip93) (hw accelerated)
Cipher AES-128-ECB, NID=418, /dev/crypto info: id=23, driver=ecb(aes-eip93) (hw accelerated)
Cipher AES-192-ECB, NID=422, /dev/crypto info: id=23, driver=ecb(aes-eip93) (hw accelerated)
Cipher AES-256-ECB, NID=426, /dev/crypto info: id=23, driver=ecb(aes-eip93) (hw accelerated)
Information about digests supported by the /dev/crypto engine:
Digest MD5, NID=4, /dev/crypto info: id=13, driver=md5-generic (software), CIOCCPHASH capable
Digest SHA1, NID=64, /dev/crypto info: id=14, driver=sha1-generic (software), CIOCCPHASH capable
Digest RIPEMD160, NID=117, /dev/crypto info: id=102, driver=unknown. CIOCGSESSION (session open) failed
Digest SHA224, NID=675, /dev/crypto info: id=103, driver=sha224-generic (software), CIOCCPHASH capable
Digest SHA256, NID=672, /dev/crypto info: id=104, driver=sha256-generic (software), CIOCCPHASH capable
Digest SHA384, NID=673, /dev/crypto info: id=105, driver=sha384-generic (software), CIOCCPHASH capable
Digest SHA512, NID=674, /dev/crypto info: id=106, driver=sha512-generic (software), CIOCCPHASH capable
[Success]: DUMP_INFO
[DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB]
[ available ]
USE_SOFTDRIVERS: specifies whether to use software (not accelerated) drivers (0=use only accelerated drivers, 1=allow all drivers, 2=use if acceleration can't be determined) [default=2]
CIPHERS: either ALL, NONE, or a comma-separated list of ciphers to enable [default=ALL]
DIGESTS: either ALL, NONE, or a comma-separated list of digests to enable [default=NONE]
DUMP_INFO: dump info about each algorithm to stderr; use 'openssl engine -pre DUMP_INFO devcrypto' #SOFTWARE
openssl speed -evp AES-128-CBC
openssl speed -evp AES-192-CBC
openssl speed -evp AES-256-CBC
openssl speed -evp AES-128-CTR
openssl speed -evp AES-192-CTR
openssl speed -evp AES-256-CTR
#MTK EIP
openssl speed -engine devcrypto -evp AES-128-CBC
openssl speed -engine devcrypto -evp AES-192-CBC
openssl speed -engine devcrypto -evp AES-256-CBC
openssl speed -engine devcrypto -evp AES-128-CTR
openssl speed -engine devcrypto -evp AES-192-CTR
openssl speed -engine devcrypto -evp AES-256-CTR #SOFTWARE
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-CBC 11160.72k 14251.78k 15287.30k 15494.83k 15734.10k 15739.56k
AES-192-CBC 9922.53k 12296.77k 13097.30k 13314.39k 13377.54k 13380.27k
AES-256-CBC 8936.96k 10801.90k 11415.30k 11579.73k 11627.18k 11627.18k
AES-128-CTR 11204.31k 13612.74k 14423.30k 14636.03k 14740.12k 14685.53k
AES-192-CTR 9939.79k 11801.54k 12411.22k 12559.70k 12593.83k 12630.47k
AES-256-CTR 8971.88k 10426.03k 10885.89k 11011.41k 11026.43k 11037.35k
#MTK EIP
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-CBC 22852.32k 103936.75k 488260.27k 471263.42k 3485286.40k 15961292.80k
AES-192-CBC 16668.76k 91910.78k 507156.48k 281657.60k 4088354.13k 6846054.40k
AES-256-CBC 31263.75k 104266.97k 456140.80k 402432.00k 2793881.60k 12381388.80k
AES-128-CTR 17025.28k 96576.00k 289082.88k 317659.43k 1962218.06k 5372859.73k
AES-192-CTR 14954.95k 52799.31k 243704.32k 332618.83k 2026018.13k 2308505.60k
AES-256-CTR 16023.91k 52159.02k 442977.28k 271899.31k 2180382.72k 6122700.80k |
@eduardo010174 You need to add |
@rany2 do you have more info on the authorship? An idea might be to add all of them in the tag in this commit. Or even better create the patch from a git format run instead of a diff and add all the credits there. Aside from these formal stuff I think this is ready and I will merge soon if we can't sort this last small thing. |
@Ansuel I assume the author is: Richard van Schagen vschagen@icloud.com
|
It would be perfect if this change is ported back to openwrt-22.03. EDIT: Tested it on 22.03-snapshot and rc2 - it works correctly and stable as for snapshot |
Router: Xiaomi R3G v1 #SOFTWARE
openssl speed -elapsed -evp AES-128-CBC
openssl speed -elapsed -evp AES-192-CBC
openssl speed -elapsed -evp AES-256-CBC
openssl speed -elapsed -evp AES-128-CTR
openssl speed -elapsed -evp AES-192-CTR
openssl speed -elapsed -evp AES-256-CTR
#MTK EIP
openssl speed -engine devcrypto -elapsed -evp AES-128-CBC
openssl speed -engine devcrypto -elapsed -evp AES-192-CBC
openssl speed -engine devcrypto -elapsed -evp AES-256-CBC
openssl speed -engine devcrypto -elapsed -evp AES-128-CTR
openssl speed -engine devcrypto -elapsed -evp AES-192-CTR
openssl speed -engine devcrypto -elapsed -evp AES-256-CTR #SOFTWARE
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-CBC 11119.23k 14259.86k 15345.83k 15646.38k 15704.06k 15734.10k
AES-192-CBC 9884.20k 12289.07k 13088.51k 13305.86k 13300.43k 13363.88k
AES-256-CBC 8896.22k 10800.58k 11392.68k 11573.93k 11621.72k 11621.72k
AES-128-CTR 11195.61k 13627.93k 14427.99k 14640.81k 14699.18k 14696.45k
AES-192-CTR 9961.54k 11812.03k 12402.69k 12564.48k 12610.22k 12615.68k
AES-256-CTR 8922.91k 10425.62k 10886.57k 11015.17k 11029.16k 11048.28k
#MTK EIP
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-CBC 2213.72k 5790.83k 9705.30k 17153.37k 46303.91k 53351.77k
AES-192-CBC 2112.26k 5266.71k 8468.91k 16140.29k 40990.04k 46071.81k
AES-256-CBC 1969.17k 4779.86k 7570.86k 15508.14k 36828.50k 40905.39k
AES-128-CTR 4462.18k 5826.13k 9634.05k 17236.65k 47090.35k 54018.05k
AES-192-CTR 4185.27k 4904.92k 8176.90k 16325.63k 41314.99k 46809.09k
AES-256-CTR 3822.21k 4644.63k 7403.35k 15494.83k 37090.65k 41353.22k |
@Ansuel The original authorship information is in this commit: 53bdd30#diff-42f66007baa5f76166fcc61d3742a890a1f29e5b6e5f0de0e9c50912e9caa2ca However when it was converted into a kernel patch, it dropped this metadata. |
vschagen wrote the code, Aviana Cruz improved it (to be not executed when cpu is faster) |
Mediatek EIP93 Crypto engine is a crypto accelerator which is available in the Mediatek MT7621 SoC. Signed-off-by: Aviana Cruz <gwencroft@proton.me> Co-authored-by: Richard van Schagen <vschagen@icloud.com> Co-authored-by: Chukun Pan <amadeus@jmu.edu.cn>
Question are we sure this is ok to backport to 23.05? (backport to 22.03 is too much) Actually on second tought maybe it would be problematic to backport to 23.05 since it would not be a fix but really a new feature... Hope you guys are not sad about 23.05 not having it. |
I'm glad, this got merged 👍 Let's think about the implications of backporting this now: |
Well thing is that this is a whole new feature that would be added after
the branching and in theory backported should be max fixes or trivial new
device.
About having this as a patch, the idea was to use the upstream series with
a few fixup instead of raw files to make it easier to upstream, but it was
just created a diff patch. But this has been here for a long time and I
didn't want to delay this further so I preferred to just merge and see what
happen.
Il Mar 12 Set 2023, 15:59 rany2 ***@***.***> ha scritto:
… @Ansuel <https://github.com/Ansuel> It would have been easier to backport
to 23.05 if it was a separate package like it was in this iteration of this
PR: 53bdd30
<53bdd30>
Don't get why it was turned into a kernel patch, just adds to overhead IMO
—
Reply to this email directly, view it on GitHub
<#10042 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AE2ZMQU3YLUNTUALVXIY3OLX2BTCRANCNFSM5YRV4WKA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yeah. It would be nice to have this in 23.05 and even 22.03. Fix for the insufficient ikev2 performance. :) |
We would wait a point release for the revert to be applied that is the main problem. It's ok for RC but if we don't catch any problem in RC stage then it will be problematic in the future |
1. Fixes CPU and regulator boot problems 2. Enable CPUFREQ, I2C, RTC and THERMAL support 3. Disabled annoying debug logs, refresh kconfig 4. Fix compilation failure due to wrong kconfig Fixes: openwrt#10042 Signed-off-by: AmadeusGhost <amadeus@openjmu.xyz>
don't think libustream-mbedtls uses this engine even when included: (at least for gcm in tls) is there something more need to be done? |
mbedtls does not support hardware offload |
https://mbed-tls.readthedocs.io/en/latest/kb/development/hw_acc_guidelines/ |
Mediatek EIP93 Crypto engine is a crypto accelerator which
is available in the Mediatek MT7621 SoC.
Althrough it was submitted to upstream kernel, it is far away from being accepted by the upstream. To make it available in openwrt as soon as possible, I ported the package from immortalwrt with some adjustment to make it possible to be built with 5.10 kernel.
Thanks:
mtk-eip93
immortalwrt