8353500: [s390x] Intrinsify Unsafe::setMemory #24480

offamitkumar · 2025-04-07T08:44:07Z

Unsafe::setMemory intrinsic implementation for s390x.

Stub Code:

StubRoutines::unsafe_setmemory [0x000003ffb04b63c0, 0x000003ffb04b64d0] (272 bytes)
--------------------------------------------------------------------------------
  0x000003ffb04b63c0:   ogrk	%r1,%r2,%r3
  0x000003ffb04b63c4:   nill	%r1,7
  0x000003ffb04b63c8:   je	0x000003ffb04b6410
  0x000003ffb04b63cc:   nill	%r1,3
  0x000003ffb04b63d0:   je	0x000003ffb04b6460
  0x000003ffb04b63d4:   nill	%r1,1
  0x000003ffb04b63d8:   jlh	0x000003ffb04b64a0
  0x000003ffb04b63dc:   risbg	%r4,%r4,48,55,8
  0x000003ffb04b63e2:   risbgz	%r1,%r3,32,63,62
  0x000003ffb04b63e8:   je	0x000003ffb04b6402
  0x000003ffb04b63ec:   nopr
  0x000003ffb04b63ee:   nopr
  0x000003ffb04b63f0:   sth	%r4,0(%r2)
  0x000003ffb04b63f4:   sth	%r4,2(%r2)
  0x000003ffb04b63f8:   agfi	%r2,4
  0x000003ffb04b63fe:   brct	%r1,0x000003ffb04b63f0
  0x000003ffb04b6402:   nilf	%r3,2
  0x000003ffb04b6408:   ber	%r14
  0x000003ffb04b640a:   sth	%r4,0(%r2)
  0x000003ffb04b640e:   br	%r14
  0x000003ffb04b6410:   risbg	%r4,%r4,48,55,8
  0x000003ffb04b6416:   risbg	%r4,%r4,32,47,16
  0x000003ffb04b641c:   risbg	%r4,%r4,0,31,32
  0x000003ffb04b6422:   risbgz	%r1,%r3,32,63,60
  0x000003ffb04b6428:   je	0x000003ffb04b6446
  0x000003ffb04b642c:   nopr
  0x000003ffb04b642e:   nopr
  0x000003ffb04b6430:   stg	%r4,0(%r2)
  0x000003ffb04b6436:   stg	%r4,8(%r2)
  0x000003ffb04b643c:   agfi	%r2,16
  0x000003ffb04b6442:   brct	%r1,0x000003ffb04b6430
  0x000003ffb04b6446:   nilf	%r3,8
  0x000003ffb04b644c:   ber	%r14
  0x000003ffb04b644e:   stg	%r4,0(%r2)
  0x000003ffb04b6454:   br	%r14
  0x000003ffb04b6456:   nopr
  0x000003ffb04b6458:   nopr
  0x000003ffb04b645a:   nopr
  0x000003ffb04b645c:   nopr
  0x000003ffb04b645e:   nopr
  0x000003ffb04b6460:   risbg	%r4,%r4,48,55,8
  0x000003ffb04b6466:   risbg	%r4,%r4,32,47,16
  0x000003ffb04b646c:   risbgz	%r1,%r3,32,63,61
  0x000003ffb04b6472:   je	0x000003ffb04b6492
  0x000003ffb04b6476:   nopr
  0x000003ffb04b6478:   nopr
  0x000003ffb04b647a:   nopr
  0x000003ffb04b647c:   nopr
  0x000003ffb04b647e:   nopr
  0x000003ffb04b6480:   st	%r4,0(%r2)
  0x000003ffb04b6484:   st	%r4,4(%r2)
  0x000003ffb04b6488:   agfi	%r2,8
  0x000003ffb04b648e:   brct	%r1,0x000003ffb04b6480
  0x000003ffb04b6492:   nilf	%r3,4
  0x000003ffb04b6498:   ber	%r14
  0x000003ffb04b649a:   st	%r4,0(%r2)
  0x000003ffb04b649e:   br	%r14
  0x000003ffb04b64a0:   risbgz	%r1,%r3,32,63,63
  0x000003ffb04b64a6:   je	0x000003ffb04b64c2
  0x000003ffb04b64aa:   nopr
  0x000003ffb04b64ac:   nopr
  0x000003ffb04b64ae:   nopr
  0x000003ffb04b64b0:   stc	%r4,0(%r2)
  0x000003ffb04b64b4:   stc	%r4,1(%r2)
  0x000003ffb04b64b8:   agfi	%r2,2
  0x000003ffb04b64be:   brct	%r1,0x000003ffb04b64b0
  0x000003ffb04b64c2:   nilf	%r3,1
  0x000003ffb04b64c8:   ber	%r14
  0x000003ffb04b64ca:   stc	%r4,0(%r2)
  0x000003ffb04b64ce:   br	%r14
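
Each of the four paths in the listing appears to follow the same shape: replicate the fill byte up to the store width (the `risbg` steps), run a loop unrolled twice (`brct`), then store at most one leftover unit (the `nilf`/`ber` tail). The C++ sketch below mirrors that shape for one fixed unit size. The names `replicate_byte` and `fill_units` are illustrative and do not come from the patch, and `std::memcpy` of a fixed size only stands in for `sth`/`st`/`stg`; it does not by itself guarantee a single atomic store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Replicate one byte across a 64-bit word (hypothetical helper).
// The stub only replicates as far as the chosen unit needs; this sketch
// always builds 8 copies and truncates afterwards.
inline std::uint64_t replicate_byte(std::uint8_t b) {
    std::uint64_t v = b;
    v |= v << 8;   // 2 copies
    v |= v << 16;  // 4 copies
    v |= v << 32;  // 8 copies
    return v;
}

// Fill `size` bytes at `dest` with `value`, using stores of sizeof(T) bytes.
// Precondition (checked): size is a multiple of sizeof(T).
template <typename T>
void fill_units(unsigned char* dest, std::size_t size, std::uint8_t value) {
    assert(size % sizeof(T) == 0);
    const T pattern = static_cast<T>(replicate_byte(value));

    // Main loop: two units per iteration, like the unrolled brct loop.
    for (std::size_t n = size / (2 * sizeof(T)); n != 0; --n) {
        std::memcpy(dest, &pattern, sizeof(T));
        std::memcpy(dest + sizeof(T), &pattern, sizeof(T));
        dest += 2 * sizeof(T);
    }

    // Tail: at most one unit is left, indicated by the sizeof(T) bit of size.
    if (size & sizeof(T)) {
        std::memcpy(dest, &pattern, sizeof(T));
    }
}
```

For example, `fill_units<std::uint64_t>(p, 24, 0x5A)` runs the loop once (16 bytes) and then stores one trailing 8-byte unit.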

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8353500: [s390x] Intrinsify Unsafe::setMemory (Enhancement - P4)

Reviewers

Lutz Schmidt (@RealLucy - Reviewer)
Martin Doerr (@TheRealMDoerr - Reviewer) Review applies to f75209f5

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24480/head:pull/24480
$ git checkout pull/24480

Update a local copy of the PR:
$ git checkout pull/24480
$ git pull https://git.openjdk.org/jdk.git pull/24480/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24480

View PR using the GUI difftool:
$ git pr show -t 24480

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24480.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-04-07T08:45:08Z

👋 Welcome back amitkumar! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-04-07T08:45:14Z

@offamitkumar This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8353500: [s390x] Intrinsify Unsafe::setMemory

Reviewed-by: lucy, mdoerr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 939 new commits pushed to the master branch:

a37e826: 8357649: IGV: add block index to the supplemental node properties
3dbd2d3: 8347570: Configure fails on macOS if directory name do not have correct case
99f33b4: 8357568: IGV: Show NULL and numbers up to 4 characters in "Condense graph" filter
... and 936 more: https://git.openjdk.org/jdk/compare/15d36ee4a5dc3a143faccd59ecc3f9b0b45ed5d3...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk · 2025-04-07T08:45:55Z

@offamitkumar The following label will be automatically applied to this pull request:

hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2025-04-07T08:49:00Z

Webrevs

offamitkumar · 2025-04-07T11:53:11Z

with patch:

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30   2.351 ± 0.015  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30   2.655 ± 0.020  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30   2.614 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30   2.783 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30   2.760 ± 0.014  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30   2.891 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30   2.697 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30   2.769 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30   3.689 ± 0.016  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30   3.127 ± 0.009  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  15.900 ± 0.046  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30   4.140 ± 0.057  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  53.748 ± 0.872  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30   9.245 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30   2.346 ± 0.020  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30   2.647 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30   2.617 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30   2.786 ± 0.008  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30   2.755 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30   2.892 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30   2.699 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30   2.765 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30   3.691 ± 0.015  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30   3.175 ± 0.053  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  15.892 ± 0.028  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  15.122 ± 0.347  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  53.588 ± 0.315  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  52.775 ± 0.169  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30   2.333 ± 0.216  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30   1.878 ± 0.092  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30   2.301 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30   2.400 ± 0.201  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30   2.666 ± 0.052  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30   2.209 ± 0.084  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30   3.086 ± 0.009  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30   2.294 ± 0.217  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30   4.631 ± 0.013  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30   2.164 ± 0.124  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  13.959 ± 0.042  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30   3.078 ± 0.211  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  51.435 ± 0.712  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30   7.879 ± 0.140  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30   2.486 ± 0.169  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30   2.163 ± 0.065  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30   2.307 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30   2.489 ± 0.121  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30   2.653 ± 0.025  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30   2.830 ± 0.161  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30   3.086 ± 0.008  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30   3.124 ± 0.189  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30   4.634 ± 0.015  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30   4.552 ± 0.194  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  13.977 ± 0.031  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  14.310 ± 0.177  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  52.244 ± 1.414  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  53.824 ± 0.580  ns/op
Finished running test 'micro:java.lang.foreign.MemorySegmentZeroUnsafe'

without patch:

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30   2.368 ± 0.029  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30   2.647 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30   2.615 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30   2.782 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30   2.760 ± 0.014  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30   2.889 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30   2.702 ± 0.017  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30   2.766 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30   3.748 ± 0.045  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30   3.122 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  24.901 ± 0.106  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30  20.841 ± 0.154  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  24.498 ± 0.233  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30  24.290 ± 0.050  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30   2.345 ± 0.012  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30   2.648 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30   2.619 ± 0.008  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30   2.784 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30   2.756 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30   2.892 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30   2.702 ± 0.011  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30   2.765 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30   3.702 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30   3.121 ± 0.010  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  25.130 ± 0.058  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  24.891 ± 0.128  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  24.385 ± 0.061  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  24.444 ± 0.076  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30  19.611 ± 0.495  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30  18.797 ± 0.126  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30  22.808 ± 0.075  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30  18.797 ± 0.047  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30  22.934 ± 0.114  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30  19.580 ± 0.061  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30  22.798 ± 0.063  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30  18.029 ± 0.689  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30  22.736 ± 0.034  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30  17.799 ± 0.276  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  22.777 ± 0.033  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30  19.271 ± 0.017  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  22.758 ± 0.068  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30  22.752 ± 0.057  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30  19.115 ± 0.069  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30  22.795 ± 0.067  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30  22.754 ± 0.057  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30  22.797 ± 0.064  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30  22.803 ± 0.078  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30  22.738 ± 0.044  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30  22.815 ± 0.074  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30  22.732 ± 0.026  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30  22.754 ± 0.063  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30  22.743 ± 0.042  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  23.250 ± 1.193  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  22.838 ± 0.182  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  22.748 ± 0.033  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  22.740 ± 0.039  ns/op
Finished running test 'micro:java.lang.foreign.MemorySegmentZeroUnsafe'
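
To read the two tables side by side, it can help to compute the ratio of the "without patch" score to the "with patch" score for a few rows. The sketch below does that for four `MemorySegmentZeroUnsafe.unsafe` rows copied from the tables above. The `Row` struct and the helper names are illustrative only. A ratio above 1 means the patched run was faster, and a ratio below 1 means it was slower.

```cpp
#include <cstdio>

// One benchmark row: average time in ns/op without and with the patch.
struct Row {
    const char* label;
    double without_patch;
    double with_patch;
};

// Ratio > 1: patched run is faster. Ratio < 1: patched run is slower.
inline double speedup(const Row& r) {
    return r.without_patch / r.with_patch;
}

// Scores copied from the MemorySegmentZeroUnsafe.unsafe rows above.
static const Row kRows[] = {
    {"unsafe aligned=true  size=1",   19.611,  2.333},
    {"unsafe aligned=true  size=255", 22.758, 51.435},
    {"unsafe aligned=true  size=256", 22.752,  7.879},
    {"unsafe aligned=false size=256", 22.740, 53.824},
};

inline void print_speedups() {
    for (const Row& r : kRows) {
        std::printf("%-32s %.2fx\n", r.label, speedup(r));
    }
}
```

With these inputs the size=1 and aligned size=256 rows come out well above 1, while the size=255 row and the unaligned size=256 row come out below 1.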

RealLucy · 2025-04-07T13:40:54Z

src/hotspot/cpu/s390/stubGenerator_s390.cpp

+    __ z_risbg(tmp, size, 32, 128/* risbgz */ + 63, 64 - exact_log2(2 * elem_size), 0); // just do the right shift and set cc
+    __ z_bre(L_Tail);
+
+    __ align(16); // loop alignment


align(32) would be more helpful:

The instruction engine fetches octoword (32-byte) bundles.

The tight loop is < 32 bytes -> all in one bundle, does not cross a cache line boundary.
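
The arithmetic behind this suggestion can be checked with a small sketch. It assumes the 32-byte fetch bundle described above and uses the loop start addresses and lengths from the stub listing in the PR description (only the low address bits matter for alignment). The helper names `padding_for` and `fits_in_one_bundle` are illustrative and not HotSpot APIs.

```cpp
#include <cstdint>

// Padding bytes needed to advance `addr` to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::uint64_t padding_for(std::uint64_t addr, std::uint64_t alignment) {
    return (alignment - (addr & (alignment - 1))) & (alignment - 1);
}

// True if the code range [start, start + length) lies inside a single
// aligned block of `bundle` bytes. `bundle` must be a power of two.
constexpr bool fits_in_one_bundle(std::uint64_t start, std::uint64_t length,
                                  std::uint64_t bundle = 32) {
    return (start & (bundle - 1)) + length <= bundle;
}

// Loops from the listing (low 16 bits of the start address, length in bytes).
static_assert(!fits_in_one_bundle(0x63f0, 18), "sth loop straddles two bundles");
static_assert(!fits_in_one_bundle(0x6430, 22), "stg loop straddles two bundles");
static_assert( fits_in_one_bundle(0x6480, 18), "st loop happens to be 32-byte aligned");
static_assert(!fits_in_one_bundle(0x64b0, 18), "stc loop straddles two bundles");
```

Under these assumptions, three of the four 16-byte-aligned loops in the listing span two 32-byte bundles, whereas any loop shorter than 32 bytes that starts on a 32-byte boundary fits in one.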

RealLucy · 2025-04-07T13:42:03Z

src/hotspot/cpu/s390/stubGenerator_s390.cpp

+      // multiple of 2
+      do_setmemory_atomic_loop(2, dest, size, byteVal, _masm);
+
+      __ align(16);


What is this alignment good for?

Branch target alignment. There is no fallthrough path from before this point. Should it be 32?

RealLucy · 2025-04-07T13:43:26Z

src/hotspot/cpu/s390/stubGenerator_s390.cpp

+      __ z_ogrk(rScratch1, dest, size);
+
+      __ z_nill(rScratch1, 7);
+      __ z_bre(L_fill8Bytes); // branch if 0


Please use z_braz() to reflect the semantics of the check.

TheRealMDoerr · 2025-04-07T20:13:12Z

Since this is taken from #24254: Maybe you can review that one, too?

src/hotspot/cpu/s390/stubGenerator_s390.cpp

This reverts commit a6af6da.

TheRealMDoerr · 2025-04-09T21:28:26Z

This looks good to me. I suggest measuring performance with the latest version.

offamitkumar · 2025-04-10T10:14:27Z

The results look almost the same:

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30   2.349 ± 0.012  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30   2.647 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30   2.614 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30   2.779 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30   2.759 ± 0.016  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30   2.887 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30   2.697 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30   2.771 ± 0.034  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30   3.700 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30   3.165 ± 0.042  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  17.266 ± 0.830  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30   4.479 ± 0.019  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  54.563 ± 1.222  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30   9.141 ± 0.069  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30   2.338 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30   2.647 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30   2.618 ± 0.009  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30   2.780 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30   2.752 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30   2.889 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30   2.695 ± 0.002  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30   2.763 ± 0.009  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30   3.684 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30   3.115 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  16.376 ± 0.018  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  15.394 ± 0.080  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  55.838 ± 1.325  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  52.927 ± 0.874  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30   2.281 ± 0.206  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30   2.076 ± 0.147  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30   2.562 ± 0.004  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30   2.020 ± 0.105  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30   2.938 ± 0.052  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30   2.412 ± 0.007  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30   3.349 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30   2.304 ± 0.220  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30   5.005 ± 0.005  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30   2.113 ± 0.110  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  14.160 ± 0.401  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30   3.200 ± 0.170  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  55.619 ± 0.672  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30   7.613 ± 0.186  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30   2.324 ± 0.224  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30   2.483 ± 0.004  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30   2.565 ± 0.005  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30   2.669 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30   2.916 ± 0.031  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30   3.042 ± 0.029  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30   3.360 ± 0.037  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30   3.401 ± 0.074  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30   5.012 ± 0.014  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30   4.592 ± 0.156  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  13.981 ± 0.392  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  14.876 ± 0.894  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  55.273 ± 0.546  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  53.228 ± 1.325  ns/op
Finished running test 'micro:java.lang.foreign.MemorySegmentZeroUnsafe'

offamitkumar · 2025-04-16T11:36:52Z

This result is from a shared machine, but it looks like the regression is fixed.

The regression occurred because, in the unaligned case, only 1-byte store instructions (stc) were emitted. Alignment depends on two factors, the size and the address being stored to, so we can't always tell whether a given benchmark case is actually aligned or unaligned.

I will do further testing to see whether more optimization is possible, and then mark this PR ready for review.

Benchmark                       (aligned)  (size)  Mode  Cnt  Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30  2.893 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30  3.122 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30  3.286 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30  3.401 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30  3.291 ± 0.021  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30  3.455 ± 0.015  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30  3.471 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30  3.215 ± 0.033  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30  4.632 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30  3.815 ± 0.014  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  9.695 ± 0.036  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30  5.296 ± 0.008  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  9.682 ± 0.011  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30  9.508 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30  2.887 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30  3.134 ± 0.024  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30  3.285 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30  3.397 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30  3.297 ± 0.049  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30  3.445 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30  3.471 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30  3.204 ± 0.023  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30  4.630 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30  3.811 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  9.676 ± 0.012  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  9.690 ± 0.031  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  9.678 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  4.180 ± 0.010  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30  2.636 ± 0.060  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30  2.379 ± 0.006  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30  7.743 ± 0.009  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30  2.531 ± 0.113  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30  7.746 ± 0.012  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30  3.183 ± 0.006  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30  7.742 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30  2.580 ± 0.095  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30  7.870 ± 0.184  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30  2.523 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  7.757 ± 0.033  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30  3.580 ± 0.005  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  7.744 ± 0.009  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30  8.090 ± 0.110  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30  2.683 ± 0.025  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30  7.747 ± 0.009  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30  7.738 ± 0.009  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30  7.745 ± 0.009  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30  7.773 ± 0.064  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30  7.736 ± 0.008  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30  7.747 ± 0.010  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30  7.748 ± 0.030  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30  7.735 ± 0.008  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30  7.747 ± 0.020  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  7.746 ± 0.013  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  7.743 ± 0.012  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  7.741 ± 0.011  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  2.739 ± 0.005  ns/op
Finished running test 'micro:java.lang.foreign.MemorySegmentZeroUnsafe'

offamitkumar · 2025-05-12T03:49:55Z

Thanks! That sounds like mvc should better not be used for Unsafe operations. Seeing no failures in some tests doesn't prove that it's safe.

@TheRealMDoerr But in this case MVC will only be used iff the store is unaligned. If the stores are unaligned, we don't care about atomicity. Otherwise, we use sth, st, or stg according to the alignment. The current C++ implementation also emits an mvc instruction for the unaligned case, which is the behaviour this stub replicates.

If we don't go ahead with mvc, then we are seeing regression, as you have noticed in the previous result.

TheRealMDoerr · 2025-05-12T08:59:29Z

As I said, mvc usage may be a bug. It was probably not intended that gcc generates it for Unsafe operations. Atomicity is never a problem when filling memory with Bytes. The code is designed to have a defined behavior when hitting signals. That's why UnsafeMemoryAccessMark is used.

TheRealMDoerr · 2025-05-12T09:07:29Z

If we don't go ahead with mvc, then we are seeing regression, as you have noticed in the previous result.

Are these corner cases relevant at all?

offamitkumar · 2025-05-13T09:36:32Z

If we don't go ahead with mvc, then we are seeing regression, as you have noticed in the previous result.

Are these corner cases relevant at all?

I am not sure about that. But the hit was significant in the 255- and 256-byte cases.

TheRealMDoerr · 2025-05-13T09:59:49Z

The invariant on other platforms is that all Bytes before the non-writable address have been written when hitting a signal. I don't know if that is really required on s390. It may be a risk to use a different behavior. The code can be used to write memory mapped files or other stuff.
If this behavior is not required, why not use mvc always?

uweigand · 2025-05-13T11:00:32Z

The invariant on other platforms is that all Bytes before the non-writable address have been written when hitting a signal. I don't know if that is really required on s390. It may be a risk to use a different behavior. The code can be used to write memory mapped files or other stuff. If this behavior is not required, why not use mvc always?

I thought the reason for not using mvc always is atomicity within array elements? That is, if you're writing an array of 4- or 8-byte values, then a change to each of those array elements should be atomic w.r.t. other CPUs. If that is true, you cannot use mvc. (However, that requirement would not be relevant for arrays of 1-byte values.)

TheRealMDoerr · 2025-05-13T12:30:47Z

However, that requirement would not be relevant for arrays of 1-byte values.

Correct. Unsafe::setMemory fills a memory region with 1-byte values. So, atomicity can't be a problem.

theRealAph · 2025-05-22T09:47:59Z

There's a lot of confusion about this. There is no requirement that all bytes before the non-writable address have been written when hitting a signal. Behaving nicely when writing beyond allocated memory is "best effort" only: we're trying to be nice, that's all.

The atomicity requirement is here, in the specification of Unsafe::setMemory:

     * <p>The stores are in coherent (atomic) units of a size determined
     * by the address and length parameters.  If the effective address and
     * length are all even modulo 8, the stores take place in 'long' units.
     * If the effective address and length are (resp.) even modulo 4 or 2,
     * the stores take place in units of 'int' or 'short'.
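
Read literally, the rule quoted above picks the widest unit of which both the effective address and the length are multiples. A minimal C++ sketch of that selection follows. The function name `store_unit` is illustrative and is not part of the Unsafe API or of the patch. ORing address and length before masking is the same trick the stub uses (`z_ogrk` followed by `z_nill`).

```cpp
#include <cstddef>
#include <cstdint>

// Widest store unit, in bytes, of which both `addr` and `length` are
// multiples: 8 ('long'), 4 ('int'), 2 ('short'), otherwise 1 (byte).
constexpr std::size_t store_unit(std::uint64_t addr, std::uint64_t length) {
    return ((addr | length) & 7) == 0 ? 8
         : ((addr | length) & 3) == 0 ? 4
         : ((addr | length) & 1) == 0 ? 2
         : 1;
}

// An 8-byte-aligned address with length 64 uses 'long' units, while the
// same address with length 63 falls back to single bytes.
static_assert(store_unit(0x1000, 64) == 8, "long units");
static_assert(store_unit(0x1000, 63) == 1, "byte units");
```

This also illustrates why a benchmark's "aligned" flag alone does not determine the path taken: an odd length such as 63 or 255 forces byte units even at an aligned address.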

TheRealMDoerr · 2025-05-22T09:52:22Z

Ah, thanks! I was not aware of that. That means the current implementation is probably wrong in some cases (mvc generated by gcc). Or is mvc only used in the single Byte aligned case?

TheRealMDoerr

The new proposal is probably ok, then.

theRealAph · 2025-05-23T08:18:12Z

Ah, thanks! I was not aware of that. That means the current implementation is probably wrong in some cases (mvc generated by gcc). Or is mvc only used in the single Byte aligned case?

Yes, that's right, just for the byte-aligned case.

offamitkumar · 2025-05-26T04:08:14Z

Tier-1 tests are clean with the fastdebug VM.

These are the performance numbers on my z16 zVM:

Benchmark                       (aligned)  (size)  Mode  Cnt  Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30  2.889 ± 0.020  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30  3.115 ± 0.014  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30  3.271 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30  3.382 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30  3.295 ± 0.062  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30  3.428 ± 0.008  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30  3.482 ± 0.049  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30  3.188 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30  4.612 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30  3.795 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  5.376 ± 0.037  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30  4.846 ± 0.033  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  7.723 ± 0.263  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30  7.299 ± 0.017  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30  2.883 ± 0.017  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30  3.110 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30  3.271 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30  3.385 ± 0.009  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30  3.268 ± 0.024  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30  3.431 ± 0.010  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30  3.459 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30  3.186 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30  4.614 ± 0.015  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30  3.799 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  5.282 ± 0.020  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  4.891 ± 0.012  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  8.038 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  7.890 ± 0.108  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30  3.785 ± 0.062  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30  3.772 ± 0.075  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30  3.433 ± 0.052  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30  3.727 ± 0.172  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30  3.414 ± 0.062  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30  3.313 ± 0.117  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30  3.198 ± 0.015  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30  2.843 ± 0.158  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30  3.278 ± 0.004  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30  2.925 ± 0.113  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  3.800 ± 0.006  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30  3.400 ± 0.050  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  7.032 ± 0.120  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30  6.423 ± 0.013  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30  3.645 ± 0.148  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30  3.638 ± 0.152  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30  3.377 ± 0.068  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30  3.692 ± 0.119  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30  3.436 ± 0.027  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30  3.427 ± 0.038  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30  3.192 ± 0.014  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30  3.035 ± 0.046  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30  3.294 ± 0.049  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30  3.042 ± 0.061  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  3.579 ± 0.006  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  3.449 ± 0.035  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  8.633 ± 0.317  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  7.003 ± 0.085  ns/op

RealLucy · 2025-05-26T07:58:03Z

The atomicity spec cited by @theRealAph severely limits the optimisation options. Depending on the data alignment, you have to use 8-, 4-, or 2-byte stores. Only in the unaligned case are there no hard restrictions, just the soft "let's be nice" conventions.

With that said, the vector implementation should be ok. It is just not as nice as a byte store loop. There could be as many as 15 uninitialised bytes if just the last byte of a vector store is not writable. I would take that risk.

RealLucy · 2025-05-26T07:36:14Z

src/hotspot/cpu/s390/stubGenerator_s390.cpp

+    UnsafeMemoryAccessMark umam(this, true, false);
+
+    __ z_vlvgb(Z_V0, byteVal, 0);
+    __ z_vrepb(Z_V0, Z_V0, 0);


You could also use z_vzero(Vreg) to preload the vector register with all zeroes. Saves an instruction.

I am not loading 0 here. My intention is this: z_vlvgb puts the value of byteVal into element 0 of Z_V0, and z_vrepb then replicates that 1-byte element across the whole register.

z_vzero would make sense if we were always zeroing out the memory, but that is not always the case. In most cases we fill with some non-zero 1-byte value.
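
As a rough scalar analogy (an illustration, not code from the PR), the effect of the z_vlvgb/z_vrepb pair can be modelled on a 64-bit lane in C++. The function name `broadcast_byte` is hypothetical, and the multiplication trick is a common software idiom rather than what the vector instructions do internally.

```cpp
#include <cstdint>

// Hypothetical model of "insert one byte, then replicate it everywhere",
// shown on a 64-bit value instead of a 128-bit vector register.
inline std::uint64_t broadcast_byte(std::uint8_t byte_val) {
    // Multiplying by 0x0101010101010101 copies the byte into each of the
    // eight byte positions, because the partial products never overlap.
    return static_cast<std::uint64_t>(byte_val) * 0x0101010101010101ULL;
}
```

A zero fill value is just the special case where the result is all zeroes, which is why z_vzero alone would only cover that one value.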

TheRealMDoerr · 2025-05-26T08:42:06Z

The large number of conditional branches may cause a regression in real life scenarios with a large variance of sizes and alignments.

offamitkumar · 2025-05-26T09:23:20Z

> The large number of conditional branches may cause a regression in real life scenarios with a large variance of sizes and alignments.

I can try to run the same benchmark with larger sizes, but again that wouldn't replicate a real-life scenario. Could you suggest some other benchmark?

offamitkumar · 2025-05-30T03:49:14Z

As of now I am not seeing any regression in the benchmark, and vector store + mvc is not performing better than the vector-store-only solution. So I am moving ahead with the integration.

offamitkumar · 2025-05-30T03:50:09Z

Thanks to all for the help, reviews and suggestions you provided.

/integrate

openjdk · 2025-05-30T03:50:43Z

Going to push as commit 2000551.
Since your change was applied there have been 1027 commits pushed to the master branch:

fd51b03: 8351369: [macos] Use --install-dir option with DMG packaging
64503c7: 8357299: Graphics copyArea doesn't copy any pixels when there is overflow
a05f9de: 8358017: Various enhancements of jpackage test helpers
... and 1024 more: https://git.openjdk.org/jdk/compare/15d36ee4a5dc3a143faccd59ecc3f9b0b45ed5d3...master

Your commit was automatically rebased without conflicts.

openjdk · 2025-05-30T03:50:50Z

@offamitkumar Pushed as commit 2000551.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

theRealAph · 2025-05-30T08:32:30Z

What are all those noprs for?

offamitkumar · 2025-06-02T03:35:27Z

> What are all those noprs for?

Sorry, that is old code; the nops were inserted for loop alignment. This is the newer stub code:

- - - [BEGIN] - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
StubRoutines::unsafe_setmemory [0x000003ffa84b63c0, 0x000003ffa84b644c] (140 bytes)
--------------------------------------------------------------------------------
BFD: unknown S/390 disassembler option: s390
.long	0x00000000
  0x000003ffa84b63c0:   vlvgb	%v0,%r4,0
  0x000003ffa84b63c6:   vrepb	%v0,%v0,0
  0x000003ffa84b63cc:   aghi	%r3,-32
  0x000003ffa84b63d0:   jl	0x000003ffa84b63ec
  0x000003ffa84b63d4:   vst	%v0,0(%r2)
  0x000003ffa84b63da:   vst	%v0,16(%r2)
  0x000003ffa84b63e0:   aghi	%r2,32
  0x000003ffa84b63e4:   aghi	%r3,-32
  0x000003ffa84b63e8:   jhe	0x000003ffa84b63d4
  0x000003ffa84b63ec:   tmll	%r3,16
  0x000003ffa84b63f0:   je	0x000003ffa84b63fe
  0x000003ffa84b63f4:   vst	%v0,0(%r2)
  0x000003ffa84b63fa:   aghi	%r2,16
  0x000003ffa84b63fe:   tmll	%r3,8
  0x000003ffa84b6402:   je	0x000003ffa84b6410
  0x000003ffa84b6406:   vsteg	%v0,0(%r2),0
  0x000003ffa84b640c:   aghi	%r2,8
  0x000003ffa84b6410:   tmll	%r3,7
  0x000003ffa84b6414:   je	0x000003ffa84b644a
  0x000003ffa84b6418:   tmll	%r3,4
  0x000003ffa84b641c:   je	0x000003ffa84b642a
  0x000003ffa84b6420:   vstef	%v0,0(%r2),0
  0x000003ffa84b6426:   aghi	%r2,4
  0x000003ffa84b642a:   tmll	%r3,2
  0x000003ffa84b642e:   je	0x000003ffa84b643c
  0x000003ffa84b6432:   vsteh	%v0,0(%r2),0
  0x000003ffa84b6438:   aghi	%r2,2
  0x000003ffa84b643c:   tmll	%r3,1
  0x000003ffa84b6440:   je	0x000003ffa84b644a
  0x000003ffa84b6444:   vsteb	%v0,0(%r2),0
  0x000003ffa84b644a:   br	%r14
--------------------------------------------------------------------------------
- - - [END] - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
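
For readers who do not read s390 assembly, the control flow of the listing above can be approximated by the following C++ model. This is a sketch, not the HotSpot source: the function name `set_memory_model` is hypothetical, and `memcpy` merely stands in for the vst/vsteg/vstef/vsteh/vsteb stores without modelling their atomicity or the UnsafeMemoryAccessMark fault handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of the stub's structure.
inline void set_memory_model(unsigned char* dest, std::size_t size, std::uint8_t value) {
    unsigned char pattern[16];
    std::memset(pattern, value, sizeof pattern);  // models vlvgb + vrepb

    // Main loop: two 16-byte stores per iteration while 32 bytes remain.
    while (size >= 32) {
        std::memcpy(dest, pattern, 16);
        std::memcpy(dest + 16, pattern, 16);
        dest += 32;
        size -= 32;
    }

    // Tail: fewer than 32 bytes remain, so each set bit of the remaining
    // size selects exactly one store of 16, 8, 4, 2 or 1 bytes.
    for (std::size_t chunk = 16; chunk >= 1; chunk >>= 1) {
        if (size & chunk) {
            std::memcpy(dest, pattern, chunk);
            dest += chunk;
        }
    }
}
```

The model makes the branch structure visible: at most five conditional tail stores follow the loop, and none of them writes past the requested length.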

s390: unsafe::setMemory Port

e7cf3a8

openjdk bot changed the title ~~8353500~~ 8353500: [s390x] Intrinsify Unsafe::setMemory Apr 7, 2025

openjdk bot added the rfr Pull request is ready for review label Apr 7, 2025

openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Apr 7, 2025

RealLucy suggested changes Apr 7, 2025

TheRealMDoerr reviewed Apr 7, 2025

TheRealMDoerr reviewed Apr 7, 2025

reviews from Lutz and Martin

bfbf99b

offamitkumar marked this pull request as draft April 8, 2025 10:05

openjdk bot removed the rfr Pull request is ready for review label Apr 8, 2025

TheRealMDoerr reviewed Apr 8, 2025

TheRealMDoerr reviewed Apr 8, 2025

offamitkumar added 3 commits April 9, 2025 07:26

minor improvement

a6af6da

Revert "minor improvement"

7e3bb5e

This reverts commit a6af6da.

reviews for Martin

1b8ea8b

offamitkumar marked this pull request as ready for review April 9, 2025 08:52

openjdk bot added the rfr Pull request is ready for review label Apr 9, 2025

offamitkumar marked this pull request as draft April 16, 2025 04:50

openjdk bot removed the rfr Pull request is ready for review label Apr 16, 2025

offamitkumar added 2 commits April 16, 2025 06:07

[wip] initial mvc template solution

f1d075a

wip: fixed the regression

36ef2e4

TheRealMDoerr approved these changes May 22, 2025

switch to vector stores

d79a841

openjdk bot removed the ready Pull request is ready to be integrated label May 26, 2025

RealLucy approved these changes May 26, 2025

openjdk bot added the ready Pull request is ready to be integrated label May 26, 2025

openjdk bot added the integrated Pull request has been integrated label May 30, 2025

openjdk bot closed this May 30, 2025

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 30, 2025

offamitkumar deleted the not_safe_intrinsic branch June 2, 2025 03:35
