8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions #10582

vpaprotsk · 2022-10-05T21:28:26Z

Handcrafted x86_64 asm for Poly1305. The main optimization is to process 16 message blocks at a time. For more details, see the extensive comments in macroAssembler_x86_poly.cpp.
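
For readers unfamiliar with how several Poly1305 blocks can be processed at once, here is a minimal sketch of the usual algebraic rearrangement, using BigInteger rather than vector registers. It assumes the textbook Horner form of Poly1305 and k independent accumulators that are each multiplied by R^k; whether the stub arranges its lanes exactly this way is not stated in this description, and the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.Arrays;

/** Sketch only: serial Poly1305 accumulation vs. a k-lane rearrangement. */
final class Poly1305LaneSketch {
    // The Poly1305 prime 2^130 - 5.
    static final BigInteger P =
            BigInteger.ONE.shiftLeft(130).subtract(BigInteger.valueOf(5));

    /** Serial Horner form: a = (a + m) * r mod p for every block. */
    static BigInteger serial(BigInteger[] blocks, BigInteger r) {
        BigInteger a = BigInteger.ZERO;
        for (BigInteger m : blocks) {
            a = a.add(m).multiply(r).mod(P);
        }
        return a;
    }

    /** Lane form: k accumulators stepped by r^k; blocks.length must be a multiple of k. */
    static BigInteger lanes(BigInteger[] blocks, BigInteger r, int k) {
        if (blocks.length % k != 0) {
            throw new IllegalArgumentException("block count must be a multiple of k");
        }
        BigInteger rk = r.modPow(BigInteger.valueOf(k), P);
        BigInteger[] acc = new BigInteger[k];
        Arrays.fill(acc, BigInteger.ZERO);
        for (int i = 0; i < blocks.length; i += k) {
            for (int j = 0; j < k; j++) {
                // Each lane only ever sees every k-th block.
                acc[j] = acc[j].multiply(rk).add(blocks[i + j]).mod(P);
            }
        }
        // Final combine: lane j is weighted by r^(k - j).
        BigInteger a = BigInteger.ZERO;
        for (int j = 0; j < k; j++) {
            BigInteger weight = r.modPow(BigInteger.valueOf(k - j), P);
            a = a.add(acc[j].multiply(weight)).mod(P);
        }
        return a;
    }

    public static void main(String[] args) {
        // Arbitrary illustrative values, not a test vector.
        BigInteger r = new BigInteger("0123456789abcdef0123456789abcdef", 16);
        BigInteger[] blocks = new BigInteger[64];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = BigInteger.ONE.shiftLeft(128)
                    .add(BigInteger.valueOf(1000003L * (i + 1)));
        }
        System.out.println(serial(blocks, r).equals(lanes(blocks, r, 16)));
    }
}
```

Both forms compute the same polynomial in R, so the lanes can run independently until the final combine; that independence is what makes a vector implementation possible.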

Added a new KAT test for Poly1305 and a fuzz test to compare the intrinsic and Java implementations.
- Would like to add an InvalidKeyException in Poly1305.java (see the commented-out block in that file), but that conflicts with the KAT. I do think we should detect (R == 0 || S == 0), so I would like advice please.
Added a JMH perf test.
- JMH test had to use reflection (instead of existing MacBench.java), since Poly1305 is not 'properly' registered with the provider.
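
To make the (R == 0 || S == 0) question concrete, here is a minimal sketch of what such a check could look like. It is not the commented-out block from Poly1305.java; the class and method names are hypothetical, and it assumes the standard 32-byte key layout (R in the first 16 bytes, S in the last 16) with R clamped as in RFC 8439 before the comparison.

```java
import java.security.InvalidKeyException;
import java.util.Arrays;

/** Sketch only: hypothetical helper, not part of Poly1305.java. */
final class Poly1305KeyCheckSketch {

    static void checkKey(byte[] key) throws InvalidKeyException {
        if (key == null || key.length != 32) {
            throw new InvalidKeyException("Poly1305 key must be 32 bytes");
        }
        // R is the first half of the key; clamp it as RFC 8439 describes.
        byte[] r = Arrays.copyOfRange(key, 0, 16);
        r[3] &= 15;
        r[7] &= 15;
        r[11] &= 15;
        r[15] &= 15;
        r[4] &= 252;
        r[8] &= 252;
        r[12] &= 252;

        if (allZero(r, 0, 16)) {
            throw new InvalidKeyException("R is zero after clamping");
        }
        // S is the second half of the key and is used unclamped.
        if (allZero(key, 16, 32)) {
            throw new InvalidKeyException("S is zero");
        }
    }

    private static boolean allZero(byte[] b, int from, int to) {
        int acc = 0;
        for (int i = from; i < to; i++) {
            acc |= b[i];
        }
        return acc == 0;
    }
}
```

A check like this rejects any key whose R or S half is all zero, so a KAT that uses such a key on purpose would fail at key setup; that is presumably the kind of conflict with the KAT mentioned above.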

Perf before:

Benchmark                   (dataSize)  (provider)   Mode  Cnt        Score        Error  Units
Poly1305DigestBench.digest          64              thrpt    8  2961300.661 ± 110554.162  ops/s
Poly1305DigestBench.digest         256              thrpt    8  1791912.962 ±  86696.037  ops/s
Poly1305DigestBench.digest        1024              thrpt    8   637413.054 ±  14074.655  ops/s
Poly1305DigestBench.digest       16384              thrpt    8    48762.991 ±    390.921  ops/s
Poly1305DigestBench.digest     1048576              thrpt    8      769.872 ±      1.402  ops/s

and after:

Benchmark                   (dataSize)  (provider)   Mode  Cnt        Score        Error  Units
Poly1305DigestBench.digest          64              thrpt    8  2841243.668 ± 154528.057  ops/s
Poly1305DigestBench.digest         256              thrpt    8  1662003.873 ±  95253.445  ops/s
Poly1305DigestBench.digest        1024              thrpt    8  1770028.718 ± 100847.766  ops/s
Poly1305DigestBench.digest       16384              thrpt    8   765547.287 ±  25883.825  ops/s
Poly1305DigestBench.digest     1048576              thrpt    8    14508.458 ±     56.147  ops/s
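
The two tables are easier to compare as ratios. The following sketch just divides the "after" score by the "before" score for each data size; the numbers are copied from the tables above, and the class name is hypothetical.

```java
/** Sketch only: after/before throughput ratios from the two JMH tables. */
final class Poly1305SpeedupSketch {
    static final int[] SIZES = {64, 256, 1024, 16384, 1048576};
    static final double[] BEFORE = {
        2961300.661, 1791912.962, 637413.054, 48762.991, 769.872
    };
    static final double[] AFTER = {
        2841243.668, 1662003.873, 1770028.718, 765547.287, 14508.458
    };

    /** Ratio of throughput after the change to throughput before it. */
    static double speedup(int i) {
        return AFTER[i] / BEFORE[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("%8d bytes: %.2fx%n", SIZES[i], speedup(i));
        }
    }
}
```

The ratios come out at roughly 2.8x for 1024 bytes, 15.7x for 16384 bytes and 18.8x for 1048576 bytes. The 64-byte and 256-byte scores are slightly lower after the change, with overlapping error intervals.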

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions

Reviewers

Sandhya Viswanathan (@sviswa7 - Reviewer) ⚠️ Review applies to 835fbe3a
Vladimir Ivanov (@iwanowww - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10582/head:pull/10582
$ git checkout pull/10582

Update a local copy of the PR:
$ git checkout pull/10582
$ git pull https://git.openjdk.org/jdk pull/10582/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10582

View PR using the GUI difftool:
$ git pr show -t 10582

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10582.diff

bridgekeeper · 2022-10-05T21:30:24Z

Hi @vpaprotsk, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user vpaprotsk" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

vpaprotsk · 2022-10-05T21:31:07Z

/covered

vpaprotsk · 2022-10-05T21:31:26Z

I am part of Intel Java Team

openjdk · 2022-10-05T21:32:28Z

@vpaprotsk The following labels will be automatically applied to this pull request:

hotspot
security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

bridgekeeper · 2022-10-05T21:32:45Z

Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

vpaprotsk · 2022-10-05T21:35:15Z

/label hotspot-compiler

openjdk · 2022-10-05T21:35:29Z

@vpaprotsk
The hotspot-compiler label was successfully added.

mlbridge · 2022-10-14T21:38:21Z

Webrevs

sviswa7 · 2022-10-18T22:51:51Z

src/java.base/share/classes/com/sun/crypto/provider/Poly1305.java

    /**
     * Partition the authentication key into the R and S components, clamp
     * the R value, and instantiate IntegerModuloP objects to R and S's
     * numeric values.
     */
-    private void setRSVals() {
+    private void setRSVals() { //throws InvalidKeyException {


The R and S check for invalid key (all bytes zero) could be submitted as a separate PR.
It is not related to the Poly1305 acceleration.

done, added a flag

sviswa7 · 2022-10-18T23:03:55Z

src/java.base/share/classes/com/sun/crypto/provider/Poly1305.java

+    @IntrinsicCandidate
+    private static void processMultipleBlocks(byte[] input, int offset, int length, byte[] aBytes, byte[] rBytes) {
+        MutableIntegerModuloP A = ipl1305.getElement(aBytes).mutable();
+        MutableIntegerModuloP R = ipl1305.getElement(rBytes).mutable();


R doesn't need to be mutable.

sviswa7 · 2022-10-18T23:13:24Z

...er/Cipher/ChaCha20/unittest/java.base/com/sun/crypto/provider/Poly1305IntrinsicFuzzTest.java

+public class Poly1305IntrinsicFuzzTest {
+        public static void main(String[] args) throws Exception {
+                //Note: it might be useful to increase this number during development of new Poly1305 intrinsics
+                final int repeat = 100;


Should we increase this repeat count for the c2 compiler to kick in for compiling engineUpdate() and have the call to stub in place from there?

did it with @run main/othervm -Xcomp -XX:-TieredCompilation com.sun.crypto.provider.Cipher.ChaCha20.Poly1305UnitTestDriver

sviswa7 · 2022-10-18T23:20:15Z

.../crypto/provider/Cipher/ChaCha20/unittest/java.base/com/sun/crypto/provider/Poly1305KAT.java

+        for (TestData test : testList) {
+            System.out.println("*** Test " + ++testNumber + ": " +
+                    test.testName);
+            if (runSingleTest(test)) {


runSingleTest may need to be called enough times for engineUpdate to be compiled by C2.

added a second copy with @run main/othervm -Xcomp -XX:-TieredCompilation com.sun.crypto.provider.Cipher.ChaCha20.Poly1305UnitTestDriver

jatin-bhateja

Some initial assembler level comments.

jatin-bhateja · 2022-10-17T15:33:57Z

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp

@@ -1955,6 +1955,90 @@ address StubGenerator::generate_base64_encodeBlock()
  return start;
 }

+address StubGenerator::generate_poly1305_masksCP() {
+  StubCodeMark mark(this, "StubRoutines", "generate_poly1305_masksCP");
+  address start = __ pc();


You may use align64 here, like

jatin-bhateja · 2022-10-18T06:26:38Z

src/hotspot/cpu/x86/assembler_x86.cpp

+}
+
+void Assembler::evpunpckhqdq(XMMRegister dst, KRegister mask, XMMRegister src1, XMMRegister src2, bool merge, int vector_len) {
+  assert(UseAVX > 2, "requires AVX512F");


Please replace flag with feature EVEX check.

jatin-bhateja · 2022-10-18T06:40:25Z

src/hotspot/cpu/x86/assembler_x86.cpp

@@ -7747,6 +7827,16 @@ void Assembler::vpandq(XMMRegister dst, XMMRegister nds, XMMRegister src, int ve
  emit_int16((unsigned char)0xDB, (0xC0 | encode));
 }

+void Assembler::vpandq(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+  assert(VM_Version::supports_evex(), "");


Assertion should check existence of AVX512VL for non-512-bit vectors.

jatin-bhateja · 2022-10-18T06:40:39Z

src/hotspot/cpu/x86/assembler_x86.cpp

@@ -7864,6 +7954,15 @@ void Assembler::vporq(XMMRegister dst, XMMRegister nds, XMMRegister src, int vec
  emit_int16((unsigned char)0xEB, (0xC0 | encode));
 }

+void Assembler::vporq(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+  assert(VM_Version::supports_evex(), "");


Same as above

TobiHartmann · 2022-10-21T09:57:14Z

I executed some quick testing and this fails with:

[2022-10-21T09:54:28,696Z] # A fatal error has been detected by the Java Runtime Environment:
[2022-10-21T09:54:28,696Z] #
[2022-10-21T09:54:28,696Z] #  Internal Error (/opt/mach5/mesos/work_dir/slaves/0c72054a-24ab-4dbb-944f-97f9341a1b96-S8380/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5903b026-cdbd-4aa4-8433-6a45fb7ee593/runs/f75b29aa-40ef-46a5-b323-3a80aaa9aa6b/workspace/open/src/hotspot/cpu/x86/assembler_x86.cpp:5358), pid=2385300, tid=2385302
[2022-10-21T09:54:28,696Z] #  Error: assert(vector_len == AVX_128bit ? VM_Version::supports_avx() : vector_len == AVX_256bit ? VM_Version::supports_avx2() : vector_len == AVX_512bit ? VM_Version::supports_avx512bw() : 0) failed
[2022-10-21T09:54:28,696Z] #
[2022-10-21T09:54:28,696Z] # JRE version:  (20.0) (fastdebug build )
[2022-10-21T09:54:28,696Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 20-internal-2022-10-21-0733397.tobias.hartmann.jdk2, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
[2022-10-21T09:54:28,696Z] # Problematic frame:
[2022-10-21T09:54:28,696Z] # V  [libjvm.so+0x6e3bf0]  Assembler::vpslldq(XMMRegister, XMMRegister, int, int)+0x190

vpaprotsk · 2022-10-21T18:06:00Z

Hi @TobiHartmann , thanks for looking. Could you share CPU Model and flags from hs_err please?

vnkozlov · 2022-10-21T18:20:10Z

Test: jdk/incubator/vector/VectorMaxConversionTests.java#id1
Flags: -ea -esa -XX:UseAVX=3 -XX:-TieredCompilation -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting -XX:+UseZGC
CPU: Intel 8358 (all AVX512 features).

I think the problem is that this subtest runs with -XX:+UseKNLSetting (VectorMaxConversionTests.java#L50), which limits AVX512 features.

Call stack:

V  [libjvm.so+0x6e3bf0]  Assembler::vpslldq(XMMRegister, XMMRegister, int, int)+0x190  (assembler_x86.cpp:5358)
V  [libjvm.so+0x152a23b]  MacroAssembler::poly1305_process_blocks_avx512(Register, Register, Register, Register, Register, Register, Register, Register)+0xc7b  (macroAssembler_x86_poly.cpp:590)
V  [libjvm.so+0x152c23d]  MacroAssembler::poly1305_process_blocks(Register, Register, Register, Register)+0x3ad  (macroAssembler_x86_poly.cpp:849)
V  [libjvm.so+0x192dc00]  StubGenerator::generate_poly1305_processBlocks()+0x170  (stubGenerator_x86_64.cpp:2069)
V  [libjvm.so+0x1936a89]  StubGenerator::generate_initial()+0x419  (stubGenerator_x86_64.cpp:3798)
V  [libjvm.so+0x1937b78]  StubGenerator_generate(CodeBuffer*, int)+0xf8  (stubGenerator_x86_64.hpp:526)
V  [libjvm.so+0x198e695]  StubRoutines::initialize1() [clone .part.0]+0x155  (stubRoutines.cpp:229)
V  [libjvm.so+0xfc4342]  init_globals()+0x32  (init.cpp:123)
V  [libjvm.so+0x1a7268f]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x37f

vpaprotsk · 2022-10-21T20:11:11Z

(Apologies, ignore the Stash: fetch limbs directly commit.. got git commit command mixed up.. will force-push a fix to the crash in a sec)

vpaprotsk · 2022-11-16T22:37:05Z

@iwanowww Answered your review comments, please take a look again? Thanks again!

sviswa7 · 2022-11-16T22:47:37Z

src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp

+  __ vpxorq(xmm0, xmm0, xmm0, Assembler::AVX_512bit);
+  __ vpxorq(xmm1, xmm1, xmm1, Assembler::AVX_512bit);


You could use T0, T1 in place of xmm0, xmm1 here.

Or simply switch to vzeroall for xmm0 - xmm15.

ah.. I remember thinking about doing that.. vzeroall isn't encoded yet and I figured since I already have to do the xmm16-29, might as well do them all.. should I add that instruction too?

Yes, please. And for the upper half of register file, just code it as a loop over register range:

    for (int rxmm_num = 16; rxmm_num < 30; rxmm_num++) {
      XMMRegister rxmm = as_XMMRegister(rxmm_num);
      __ vpxorq(rxmm, rxmm, rxmm, Assembler::AVX_512bit);
    }

or even

    // Zeroes zmm16-zmm31.
    for (XMMRegister rxmm = xmm16; rxmm->is_valid(); rxmm = rxmm->successor()) {
      __ vpxorq(rxmm, rxmm, rxmm, Assembler::AVX_512bit);
    }

Will do.. ("loop" erm.. wow.. "duh, this isn't assembler!") Thanks!!

done
(Note: disassembler proof for vzeroall encoding

    0x7fffed0022f8: vzeroall
    0x7fffed0022fb: vpxorq zmm16,zmm16,zmm16
    0x7fffed002301: vpxorq zmm17,zmm17,zmm17
    0x7fffed002307: vpxorq zmm18,zmm18,zmm18
    0x7fffed00230d: vpxorq zmm19,zmm19,zmm19
    0x7fffed002313: vpxorq zmm20,zmm20,zmm20
    0x7fffed002319: vpxorq zmm21,zmm21,zmm21
    0x7fffed00231f: vpxorq zmm22,zmm22,zmm22
    0x7fffed002325: vpxorq zmm23,zmm23,zmm23
    0x7fffed00232b: vpxorq zmm24,zmm24,zmm24
    0x7fffed002331: vpxorq zmm25,zmm25,zmm25
    0x7fffed002337: vpxorq zmm26,zmm26,zmm26
    0x7fffed00233d: vpxorq zmm27,zmm27,zmm27
    0x7fffed002343: vpxorq zmm28,zmm28,zmm28
    0x7fffed002349: vpxorq zmm29,zmm29,zmm29
    0x7fffed00234f: vpxorq zmm30,zmm30,zmm30
    0x7fffed002355: vpxorq zmm31,zmm31,zmm31
    0x7fffed00235b: cmp ebx,0x10
    0x7fffed00235e: jl 0x7fffed0023e6

)

iwanowww · 2022-11-16T23:12:28Z

src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp

+  __ vpsllq(R2P, R2P, 2, Assembler::AVX_512bit);
+
+  // Store R^8-R for later use
+  __ evmovdquq(Address(rsp, 64*0), B0, Assembler::AVX_512bit);


Could these vector spills be eliminated? I counted 8 spare zmm registers available across the vector loop (xmm7-xmm12, xmm30, xmm31).

And here's what is explicitly used in process256Loop:

    D0 D1 = xmm2-xmm3
    B0 B1 B2 B3 B4 B5 = xmm19-xmm24
    TMP = xmm6
    A0 A1 A2 A3 A4 A5 = xmm13-xmm18
    R0 R1 R2 R1P R2P = xmm25-xmm29
    T0 T1 T2 T3 T4 T5 = xmm0-xmm5

Interesting!! Let me try that!

Done!

PS: This find really was great!
PPS: I also reordered the map alphabetically and counted in-order... it was just really bugging me!!
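
The count of spare registers in this thread can be checked mechanically. The sketch below marks the inclusive register ranges quoted for process256Loop (as they stood before the re-map) and lists whichever of xmm0-xmm31 remain unused; the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Sketch only: finds the xmm/zmm registers not named in the quoted map. */
final class SpareZmmSketch {
    /** Inclusive register ranges, copied from the review comment. */
    static final int[][] USED = {
        {2, 3},    // D0 D1
        {19, 24},  // B0..B5
        {6, 6},    // TMP
        {13, 18},  // A0..A5
        {25, 29},  // R0 R1 R2 R1P R2P
        {0, 5},    // T0..T5
    };

    static List<Integer> spare() {
        BitSet used = new BitSet(32);
        for (int[] range : USED) {
            used.set(range[0], range[1] + 1);   // upper bound is exclusive
        }
        List<Integer> free = new ArrayList<>();
        for (int i = used.nextClearBit(0); i < 32; i = used.nextClearBit(i + 1)) {
            free.add(i);
        }
        return free;
    }

    public static void main(String[] args) {
        System.out.println(spare());   // prints [7, 8, 9, 10, 11, 12, 30, 31]
    }
}
```

That is eight registers, matching the xmm7-xmm12, xmm30, xmm31 list in the review comment.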

vpaprotsk

@iwanowww Another round ready your way :)

iwanowww

Overall, looks good. Just one minor cleanup suggestion.

I've submitted the latest patch for testing (hs-tier1 - hs-tier4).

iwanowww · 2022-11-17T19:30:14Z

src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp

+  }
+  __ shlq(t0, 40);
+  __ addq(a1, t0);
+  if (a2 == noreg) {


Please, get rid of early return and turn the check into if (a2 != noreg) { ... } which guards the following code.

done (some golang-ism slipped in.. rewiring habits again)
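
The requested shape (a guarded block instead of an early return) can be illustrated in plain Java. This sketch assumes the accumulator is held as five normalized 26-bit limbs and is packed into 64-bit words, which is consistent with the shift by 40 in the snippet above but is otherwise an assumption; the real stub may also need to handle carries from unnormalized limbs. The hypothetical `wantTop` flag stands in for `a2 != noreg`.

```java
import java.math.BigInteger;

/** Sketch only: packs five 26-bit limbs into two or three 64-bit words. */
final class LimbPackSketch {

    /** Limbs must each be in [0, 2^26). Returns {a0, a1} or {a0, a1, a2}. */
    static long[] pack(long[] limbs, boolean wantTop) {
        long[] out = new long[wantTop ? 3 : 2];
        // Bits 0..63: limb0, limb1 and the low 12 bits of limb2.
        out[0] = limbs[0] | (limbs[1] << 26) | (limbs[2] << 52);
        // Bits 64..127: high 14 bits of limb2, limb3 and the low 24 bits of limb4.
        out[1] = (limbs[2] >>> 12) | (limbs[3] << 14) | (limbs[4] << 40);
        // Guarded block rather than "if (!wantTop) return;".
        if (wantTop) {
            // Bits 128..129: the top 2 bits of limb4.
            out[2] = limbs[4] >>> 24;
        }
        return out;
    }

    /** Reference value: sum of limbs[i] * 2^(26 * i). */
    static BigInteger value(long[] limbs) {
        BigInteger v = BigInteger.ZERO;
        for (int i = 0; i < limbs.length; i++) {
            v = v.add(BigInteger.valueOf(limbs[i]).shiftLeft(26 * i));
        }
        return v;
    }

    public static void main(String[] args) {
        long[] limbs = {0x1234567, 0x2abcdef, 0x3ffffff, 0x0000001, 0x3000000};
        long[] w = pack(limbs, true);
        BigInteger packed = new BigInteger(Long.toUnsignedString(w[0]))
                .add(new BigInteger(Long.toUnsignedString(w[1])).shiftLeft(64))
                .add(BigInteger.valueOf(w[2]).shiftLeft(128));
        System.out.println(packed.equals(value(limbs)));
    }
}
```

With the guard, the two-word and three-word cases share a single exit, which is the structure the review asked for.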

vpaprotsk · 2022-11-21T17:42:28Z

@iwanowww Hope the extra tests passed? (Or do you have to re-run them on the latest patch again?)

iwanowww · 2022-11-21T18:56:16Z

The test results look good. (Had to wait until testing is complete.)

iwanowww

JVM part looks good.

vpaprotsk · 2022-11-21T19:16:05Z

/integrate

openjdk · 2022-11-21T19:18:19Z

@vpaprotsk
Your change (at version 08ea45e) is now ready to be sponsored by a Committer.

sviswa7 · 2022-11-21T20:59:26Z

/sponsor

openjdk · 2022-11-21T21:01:43Z

Going to push as commit f12710e.
Since your change was applied there have been 119 commits pushed to the master branch:

cd6a203: 8297348: make CONF=xxx should match if xxx is an exact match
817e039: 8297352: configure should check pandoc version
15e2e28: 8297353: Regenerated checked-in html files with new pandoc
b366d17: 8294073: Performance improvement for message digest implementations
57f5cfd: 8296399: crlNumExtVal might be null inside X509CRLSelector::match
0b04a99: 8297347: Problem list compiler/debug/TestStress*.java
0ac0148: 8297342: make LOG=debug is too verbose
d0a7938: 8286575: Document how properties in java.security are parsed
5c33454: 8296472: Remove ObjectLocker around appendToClassPathForInstrumentation call
0800813: 8293584: CodeCache::old_nmethods_do incorrectly filters is_unloading nmethods
... and 109 more: https://git.openjdk.org/jdk/compare/7357a1a379ed79c6754a8093eb108cd82062880a...master

Your commit was automatically rebased without conflicts.

openjdk · 2022-11-21T21:02:09Z

@sviswa7 @vpaprotsk Pushed as commit f12710e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

dholmes-ora · 2022-11-22T00:43:34Z

Testing is broken:

test/jdk/sun/security/util/math/BigIntegerModuloP.java:160: error: BigIntegerModuloP.ImmutableElement is not abstract and does not override abstract method getLimbs() in IntegerModuloP
private class ImmutableElement extends Element

Did you forget to commit a test file?

I will file a new bug for this.
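
The compile error above is the usual consequence of adding an abstract method to an interface: every implementer, including test-only ones, has to define it. The sketch below reproduces that situation with hypothetical stand-in types; it is not the fix that was eventually applied, and the five-limb, 26-bit layout is an assumption made for illustration.

```java
import java.math.BigInteger;

/** Sketch only: stand-in for an interface that gained a new abstract method. */
interface ElementSketch {
    BigInteger asBigInteger();

    // Newly added: any class implementing ElementSketch without this method
    // fails to compile with "is not abstract and does not override ...".
    long[] getLimbs();
}

/** Sketch only: a BigInteger-backed implementer that supplies the new method. */
final class BigIntegerElementSketch implements ElementSketch {
    private static final int BITS_PER_LIMB = 26;   // assumed layout
    private static final int NUM_LIMBS = 5;        // assumed layout

    private final BigInteger v;

    BigIntegerElementSketch(BigInteger v) {
        this.v = v;
    }

    @Override
    public BigInteger asBigInteger() {
        return v;
    }

    @Override
    public long[] getLimbs() {
        // Split a non-negative value below 2^130 into little-endian limbs.
        long[] limbs = new long[NUM_LIMBS];
        BigInteger mask = BigInteger.valueOf((1L << BITS_PER_LIMB) - 1);
        for (int i = 0; i < NUM_LIMBS; i++) {
            limbs[i] = v.shiftRight(BITS_PER_LIMB * i).and(mask).longValue();
        }
        return limbs;
    }
}
```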

TobiHartmann · 2022-11-22T15:21:44Z

I fixed the test issue with JDK-8297382 but this also caused a regression with one of the crypto tests: JDK-8297417. @vpaprotsk, @sviswa7 could you please have a look at this?

vpaprotsk · 2022-11-22T15:28:01Z

@TobiHartmann @dholmes-ora Sorry about that, looking

vpaprotsk · 2022-11-22T17:18:38Z

@robcasloz Update to JDK-8297417 (since I don't have an account on the bugtracker yet to update there)

Not able to reproduce it on Linux yet. The seed should make it deterministic, but nothing so far. Resurrecting my Windows sandbox to see if I can reproduce it there. (The only difference on Windows is the intrinsic function register linkage. However, a problem there would make the failure very deterministic, I think.)

asgibbons · 2023-06-06T21:41:29Z

/backport jdk17u-dev

openjdk · 2023-06-06T21:43:02Z

@asgibbons Could not automatically backport f12710e9 to openjdk/jdk17u-dev due to conflicts in the following files:

src/hotspot/cpu/x86/assembler_x86.cpp
src/hotspot/cpu/x86/assembler_x86.hpp
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
src/hotspot/cpu/x86/macroAssembler_x86.cpp
src/hotspot/cpu/x86/macroAssembler_x86.hpp
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64.hpp
src/hotspot/cpu/x86/stubRoutines_x86.hpp
src/hotspot/cpu/x86/vm_version_x86.cpp
src/hotspot/cpu/x86/vm_version_x86.hpp
src/hotspot/share/opto/escape.cpp
src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/amd64/AMD64.java
test/lib-test/jdk/test/whitebox/CPUInfoTest.java

Please fetch the appropriate branch/commit and manually resolve these conflicts by using the following commands in your personal fork of openjdk/jdk17u-dev. Note: these commands are just some suggestions and you can use other equivalent commands you know.

# Fetch the up-to-date version of the target branch
$ git fetch --no-tags https://git.openjdk.org/jdk17u-dev.git master:master

# Check out the target branch and create your own branch to backport
$ git checkout master
$ git checkout -b asgibbons-backport-f12710e9

# Fetch the commit you want to backport
$ git fetch --no-tags https://git.openjdk.org/jdk.git f12710e938b36594623e9c82961d8aa0c0ef29c2

# Backport the commit
$ git cherry-pick --no-commit f12710e938b36594623e9c82961d8aa0c0ef29c2
# Resolve conflicts now

# Commit the files you have modified
$ git add files/with/resolved/conflicts
$ git commit -m 'Backport f12710e938b36594623e9c82961d8aa0c0ef29c2'

Once you have resolved the conflicts as explained above continue with creating a pull request towards the openjdk/jdk17u-dev with the title Backport f12710e938b36594623e9c82961d8aa0c0ef29c2.

Poly1305 AVX512 intrinsic for x86_64

e3cfc74

bridgekeeper bot added the oca Needs verification of OCA signatory status label Oct 5, 2022

openjdk bot added security security-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Oct 5, 2022

bridgekeeper bot added the oca-verify Needs verification of OCA signatory status label Oct 5, 2022

openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Oct 5, 2022

vpaprotsk added 3 commits October 13, 2022 12:36

Merge remote-tracking branch 'vpaprotsk/master' into avx512-poly

ec6c807

- Fix whitespace and copyright statements

507d6bf

- Add benchmark

missed white-space fix

7e070d9

vpaprotsk marked this pull request as ready for review October 13, 2022 21:06

bridgekeeper bot removed oca Needs verification of OCA signatory status oca-verify Needs verification of OCA signatory status labels Oct 14, 2022

openjdk bot added the rfr Pull request is ready for review label Oct 14, 2022

sviswa7 reviewed Oct 18, 2022

jatin-bhateja reviewed Oct 19, 2022

openjdk bot removed the rfr Pull request is ready for review label Oct 21, 2022

further restrict UsePolyIntrinsics with supports_avx512vlbw

f048f93

vpaprotsk force-pushed the avx512-poly branch from 6a60c12 to f048f93 Compare October 21, 2022 20:17

sviswa7 reviewed Nov 16, 2022

iwanowww reviewed Nov 16, 2022

vzeroall, no spill, reg re-map

56aed9b

vpaprotsk commented Nov 17, 2022

iwanowww reviewed Nov 17, 2022

remove early return

08ea45e

iwanowww approved these changes Nov 21, 2022

openjdk bot added the sponsor Pull request is ready to be sponsored label Nov 21, 2022

openjdk bot added the integrated Pull request has been integrated label Nov 21, 2022

openjdk bot closed this Nov 21, 2022

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Nov 21, 2022

This was referenced Nov 23, 2022

8297417: Poly1305IntrinsicFuzzTest fails with tag mismatch exception #11308

Closed

8297379: Enable the ByteBuffer path of Poly1305 optimizations #11338

Closed

jnimeh mentioned this pull request Dec 28, 2022

8298592: Add java man page documentation for ChaCha20 and Poly1305 intrinsics openjdk/jdk20#78

Closed
