Write barrier without any RWX pages #114982

davidwrighton · 2025-04-23T23:50:57Z

Add a new config switch, DOTNET_UseGCWriteBarrierCopy, to control whether the runtime uses a copy of the write barrier code or the assembly code embedded in the coreclr binary. We typically use the copy to improve performance, but this change enables a new path where the barrier is just assembly code in the coreclr binary and does not require any RWX behavior. With this change, the single copy of the code used for this scenario is also shared between NativeAOT and CoreCLR.

As a result, in some situations CoreCLR now uses the same write barrier as NativeAOT.

In addition, we now have a copy of the NativeAOT variant of the write barrier for Linux X86, although the rest of the NativeAOT support for that platform is not present.

Finally, the Linux Arm32 version of the NativeAOT write barrier could now be enabled to use software write watch, like CoreCLR does. This is not enabled in this change, although the code has been written into the write barrier (as it was needed for CoreCLR).
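
As a rough illustration of the switch described above, the sketch below chooses between the patched copy and the read-only barrier embedded in the binary. Only the name DOTNET_UseGCWriteBarrierCopy comes from this PR. The BarrierKind enum, the helper names, the hexadecimal parsing, and the handling of unset or malformed values are all assumptions made for the example, not the runtime's actual config code.

```cpp
#include <cstdlib>

// Hypothetical: which flavor of write barrier gets installed.
enum class BarrierKind {
    PatchedCopy,    // barrier copied into separately mapped memory and patched
    SharedReadOnly  // barrier embedded in the binary, shared with NativeAOT
};

// Parse a config string as a hexadecimal integer (assumed format).
// Returns defaultValue when the string is null, empty, or not a number.
bool ParseSwitch(const char* text, bool defaultValue) {
    if (text == nullptr || *text == '\0') {
        return defaultValue;
    }
    char* end = nullptr;
    const unsigned long value = std::strtoul(text, &end, 16);
    if (end == text || *end != '\0') {
        return defaultValue;
    }
    return value != 0;
}

// Non-zero selects the copied barrier; zero selects the read-only one.
BarrierKind SelectBarrier(const char* configValue, bool defaultUseCopy) {
    return ParseSwitch(configValue, defaultUseCopy)
        ? BarrierKind::PatchedCopy
        : BarrierKind::SharedReadOnly;
}

// Reads the environment variable; the default is supplied by the caller
// because the production default is not stated in this description.
BarrierKind SelectBarrierFromEnvironment(bool defaultUseCopy) {
    return SelectBarrier(std::getenv("DOTNET_UseGCWriteBarrierCopy"), defaultUseCopy);
}
```

Under these assumptions, setting DOTNET_UseGCWriteBarrierCopy=0 would select the path that needs no RWX pages.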

dotnet-policy-service · 2025-04-23T23:51:59Z

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

filipnavara · 2025-04-29T21:20:00Z

src/coreclr/runtime/arm/WriteBarriers.S

@@ -179,6 +179,10 @@ GLOBAL_LABEL RhpCheckedAssignRefAVLocation
 LEAF_END RhpCheckedAssignRef\EXPORT_REG_NAME, _TEXT
 .endm

+LEAF_ENTRY RhpWriteBarriers, _TEXT


FWIW this would not work anyway... we compile all code on Apple platforms as .subsections_via_symbols and the Apple linker is free to reorder the functions (or discard unused ones if told to). Compiling without subsections_via_symbols is not an option due to various bugs in different linker versions. The only other option to keep the desired order is to mark the successive symbols as .alt_entry, which also happens to run into linker bugs if not done extremely carefully...

filipnavara · 2025-05-02T06:45:37Z

src/coreclr/runtime/i386/WriteBarriers.S

+.intel_syntax noprefix
+#include "AsmMacros_Shared.h"
+
+// TODO! This is implemented, but not tested.


I'll test it since I already have the setup.

It passed some basic smoke tests. 👍🏿

filipnavara · 2025-05-02T06:54:02Z

src/coreclr/runtime/arm64/WriteBarriers.S

+#if defined(__APPLE__)
+    // Currently the build is failing without this due to an issue if the first method in the assembly file has an alternate entry at the start of the file.
+    // Fix, but adding an empty, unused method
+    LEAF_ENTRY RhpWriteBarriersDoNotFailToBuild, _TEXT
+       ret
+    LEAF_END RhpWriteBarriersDoNotFailToBuild, _TEXT
+#endif


For posterity, this was reported as FB14743667 to Apple. It should be fixed in Xcode 16.3. We should still keep the workaround until we bump the minimum Xcode requirement past this version.

The general issue is that the new linker (ld-prime) didn't correctly handle the combination of symbols at overlapping addresses and alt-entry symbols. In some cases section symbols are implicitly generated at the beginning of each section in an object file, so it's easy to run into this. The old linker (ld64) doesn't handle alt-entry symbols at the start of a section, so it's problematic there too.

davidwrighton · 2025-05-05T18:32:51Z

/azp list

davidwrighton · 2025-05-05T18:34:12Z

/azp run runtime-nativeaot-outerloop

azure-pipelines · 2025-05-05T18:34:21Z

Azure Pipelines successfully started running 1 pipeline(s).

janvorli · 2025-05-06T14:19:40Z

src/coreclr/vm/excep.cpp

+#endif
+
+#ifdef TARGET_ARM
+        if ((writeBarrierAVLocations[i] | THUMB_CODE) == (uControlPc | THUMB_CODE))


Is one of these missing the THUMB bit? I would assume that both would have it set.
Anyways, we have PCODEToPINSTR(addr) to strip the thumb bit if it is necessary.

There are a bunch of unnecessary casts in this space. I'll get rid of them, use PCODEToPINSTR, and see if it all works. If not, the fix will be simple.

src/coreclr/vm/threads.cpp

janvorli · 2025-05-06T21:11:24Z

src/coreclr/vm/excep.cpp

+        ASSERT(*(uint8_t*)writeBarrierAVLocations[i] != 0xE9); // jmp XXXXXXXX
+#endif
+
+        if (writeBarrierAVLocations[i] == PCODEToPINSTR(uControlPc))


It is still strange to me that the writeBarrierAVLocations[i] would not have the THUMB bit set when it is a pointer to thumb code. Are you sure it is really the case?

They really don't have it:

nm ./artifacts/obj/coreclr/linux.arm.Debug/vm/wks/CMakeFiles/cee_wks_core.dir/__/__/runtime/arm/WriteBarriers.S.o
00000000 t $t
00000001 T RhpAssignRef
00000004 T RhpAssignRefAVLocation
00000004 T RhpAssignRefAvLocationr1
00000001 T RhpAssignRefr1
00000111 T RhpByRefAssignRef
00000114 T RhpByRefAssignRefAVLocation1
00000116 T RhpByRefAssignRefAVLocation2
00000077 T RhpCheckedAssignRef
0000007a T RhpCheckedAssignRefAVLocation
0000007a T RhpCheckedAssignRefAvLocationr1
00000077 T RhpCheckedAssignRefr1
         U __aeabi_unwind_cpp_pr0
         U g_card_table
         U g_ephemeral_high
         U g_ephemeral_low
         U g_highest_address
         U g_lowest_address
         U g_write_watch_table

I think it boils down to how ALTERNATE_ENTRY is defined:

.macro PATCH_LABEL Name
        .thumb_func
        .global C_FUNC(\Name)
C_FUNC(\Name):
.endm

.macro ALTERNATE_ENTRY Name
        .global C_FUNC(\Name)
        .type \Name, %function
C_FUNC(\Name):
.endm

I think the assembler treats it as an ARM function label.

Ah, the macro doesn't have the .thumb_func, that's why. I wonder if that's intentional.

It's somewhat intentional. The definition is used in NativeAOT, and there we compare it to an already stripped instruction pointer. I don't have any strong opinion on unifying it one way or the other, but when I was working on the linux-arm NativeAOT port I thought it made sense to represent CODE_LOCATION as the actual memory location (i.e. abstract out the Thumb weirdness early on)...
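
To make the comparison discussed in this thread concrete, here is a minimal sketch of matching a faulting PC against AV locations that are stored as plain instruction addresses. StripThumbBit is a hypothetical stand-in for PCODEToPINSTR, and IsWriteBarrierAVLocation and its table parameter are invented for the example. The sample addresses in the usage note are taken from the nm output above.

```cpp
#include <cstddef>
#include <cstdint>

// On arm32 the low bit of a code pointer selects Thumb mode; it is not part
// of the instruction's memory address.
constexpr std::uintptr_t kThumbBit = 1;

// Hypothetical stand-in for PCODEToPINSTR: drop the Thumb bit.
constexpr std::uintptr_t StripThumbBit(std::uintptr_t pcode) {
    return pcode & ~kThumbBit;
}

// Returns true if faultingPc matches one of the recorded AV locations.
// avLocations are assumed to be plain instruction addresses (no Thumb bit),
// which is how ALTERNATE_ENTRY labels appear in the symbol table above.
bool IsWriteBarrierAVLocation(std::uintptr_t faultingPc,
                              const std::uintptr_t* avLocations,
                              std::size_t count) {
    const std::uintptr_t instr = StripThumbBit(faultingPc);
    for (std::size_t i = 0; i < count; ++i) {
        if (avLocations[i] == instr) {
            return true;
        }
    }
    return false;
}
```

With RhpAssignRefAVLocation at 0x4, a faulting PC reported as 0x5 (Thumb bit set) and one reported as 0x4 both match. Stripping only one side keeps the table free of the Thumb bit, which is the convention described for NativeAOT.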

am11

LGTM, thanks! 🎉

Fixes #64253
Unblocks #115339 (the last HMF dependency)

cc @jkotas, @janvorli

jkotas · 2025-05-08T21:50:11Z

src/coreclr/runtime/amd64/StubDispatch.S

@@ -15,14 +24,14 @@

 LEAF_ENTRY RhpInterfaceDispatch\entries, _TEXT

-        // r11 currently contains the indirection cell address.


My guess is that this was intentionally done first so that the data dependency below gets satisfied first. Is this really needed with the workaround above?

I don't see the value of having a specific memory ordering between loading the MethodTable pointer and loading the cache block. I think we can safely load those in any order.

It is functionally correct to load them in any order.

I meant that the existing order might have been chosen intentionally as a performance micro-optimization, so that the [r11 + OFFSETOF__InterfaceDispatchCell__m_pCache] indirection is handled a bit sooner or is more likely to be parallelized with the [rdi] indirection.

I reverted the order and it still produced correct symbols and unwind info on Xcode 16.3 on ARM64, at least in the object file: https://gist.github.com/filipnavara/bdd6dc77727fa68782bdb314a117f6d3.

That's why I asked for specific Xcode version so I could get to the bottom of it.

I think I see the issue now in the linked .dylib:

[593] funcOffset=0x0065504C, encoding[ 6]=0x02000000 (no frame, no saved registers ) _RhpInterfaceDispatch2
[594] funcOffset=0x00655050, encoding[ 8]=0x00000000 (no frame, no saved registers ) _RhpInterfaceDispatchAVLocation2
[595] funcOffset=0x00655080, encoding[ 6]=0x02000000 (no frame, no saved registers ) _RhpInterfaceDispatch4
[596] funcOffset=0x00655084, encoding[ 8]=0x00000000 (no frame, no saved registers ) _RhpInterfaceDispatchAVLocation4

This is weird. Both ld64 and ld-prime produce this. unwinddump treats the null opcode as continuation of the previous entry (in this case it prints "no frame, no saved registers", but if I force some DWARF sequence it still shows the DWARF pointer for both entries) has a bug where it's missing a case for the null opcode on ARM64 (the bug is not present for other archs). llvm-libunwind doesn't seem to have any special treatment for the null opcode.

In summary, this really does look like a linker bug. I'll do a bit more poking and file Apple feedback if necessary. One possible workaround in this specific case is to start a new unwind info at the alt_entry location by issuing .cfi_endproc in front of it and .cfi_startproc behind it. Unfortunately that doesn't work for the general case.

Filed with Apple as FB17568654
Repro code: Archive.zip
Let's see if they can provide some clarification or workarounds.

jkotas · 2025-05-08T22:15:04Z

src/coreclr/runtime/amd64/WriteBarriers.S

+#include "AsmMacros_Shared.h"
+
+#if defined(__APPLE__)
+    // Currently the build is failing without this due to an issue if the first method in the assembly file has an alternate entry at the start of the file.


Why is this not failing in NativeAOT build today?

Now that I look at it, this probably isn't necessary in the amd64 write barrier since it wasn't complaining before.

This leads me to wonder whether the issue on macOS arm64 with having an alternate entry as the first symbol in the object file arose because our alternate entry macro doesn't set the thumb bit, which would put the alternate entry's offset before that of the normal function symbol for the first symbol in the object file. Time for an experiment. Sigh.

Thumb bit is only relevant for arm32 so that should not matter here.

Also, the issue with the first symbol in the object file is a known ld64 bug, see #114982 (comment).

jkotas · 2025-05-08T22:21:07Z

src/coreclr/runtime/amd64/WriteBarriers.S

@@ -261,6 +269,14 @@ LEAF_END RhpCheckedXchg, _TEXT
 LEAF_ENTRY RhpByRefAssignRef, _TEXT
 ALTERNATE_ENTRY RhpByRefAssignRefAVLocation1
    mov     rcx, [rsi]
+#ifdef TARGET_APPLE
+// Apple's linker has issues which break unwind info if


What's the manifestation of the bad unwind info?

If it is just when single stepping in lldb, do we care?

The manifestation is that if we AV on this location, we are unable to unwind to the managed frame, so we failfast the process instead of producing a NullReferenceException which can be caught.

Would it be better to use the same manual unwind routine in regular CoreCLR as we use in NAOT to avoid this problem?

There is another approach where we don't use libunwind for unwinding these entries, instead we use our own unwinder which is special cased to know what to do. (This is what NativeAot does.)

(Unlike NativeAOT, CoreCLR implements the Apple compact unwinding through a modified copy of the llvm-libunwind code; libunwind should not even come into play for this since it's covered by the compact codes.)

When I was fixing the osx-arm64 NativeAOT port I found a bug in llvm-libunwind's boundary comparison between IP addresses and the function start/end (cb6fb60#diff-fa7a2ccbf98ccd844bfc21e48f2bc700137c2903cf97d487fa011e16b1f9b3c0).

I looked at the custom unwinding code in CoreCLR and there are similar comparisons that look sketchy:

runtime/src/coreclr/pal/src/exception/remote-unwind.cpp
Lines 1034 to 1037 in c201ad1

if (ip < funcStart || ip > funcEnd) {
    ERROR("ip %p not in regular second level\n", (void*)ip);
    return false;
}

runtime/src/coreclr/pal/src/exception/remote-unwind.cpp
Lines 1064 to 1067 in c201ad1

if (ip < funcStart || ip > funcEnd) {
    ERROR("ip %p not in compressed second level\n", (void*)ip);
    return false;
}

(in both cases the condition should presumably be ip >= funcEnd, not ip > funcEnd; shouldn't really cause an error for alt_entry though..)
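
The boundary point can be shown with a small sketch. It assumes funcEnd is the first address past the function (typically the next entry's start), so a function covers the half-open range [funcStart, funcEnd). Both helper names are hypothetical, and the sample offsets in the usage note come from the unwind dump quoted earlier in this thread.

```cpp
#include <cstdint>

// Half-open check: ip belongs to the function only if
// funcStart <= ip < funcEnd. This is the suggested form, where the
// rejection condition is (ip < funcStart || ip >= funcEnd).
constexpr bool IpInFunction(std::uintptr_t ip,
                            std::uintptr_t funcStart,
                            std::uintptr_t funcEnd) {
    return ip >= funcStart && ip < funcEnd;
}

// The check as quoted above, which also accepts ip == funcEnd and can
// therefore attribute the first instruction of the next function to
// this one.
constexpr bool IpInFunctionInclusive(std::uintptr_t ip,
                                     std::uintptr_t funcStart,
                                     std::uintptr_t funcEnd) {
    return !(ip < funcStart || ip > funcEnd);
}
```

For the entry starting at 0x65504C with the next entry at 0x655050, an ip of 0x655050 is rejected by the half-open check but accepted by the inclusive one.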

a74nh · 2025-05-16T09:00:09Z

I've been working in a similar area in #111636. Essentially I'm making Arm64 use write barriers in the same way as AMD64. I've just been rebasing off the top of your work.

AIUI, with this PR, when IsWriteBarrierCopyEnabled() returns true it means that none of the write barrier code in src/vm/ is used (specifically, I care about JIT_WriteBarrier in src/vm/arm64/patchedcode.S). Instead it uses a read-only write barrier in runtime/<target>/WriteBarriers.S.

If so, then I think we're good and I don't need to worry about my PR :)

davidwrighton added 6 commits April 22, 2025 15:46

Builds with write barrier moved

bfd57cd

More adjustment

02b9ea6

Merge branch 'main' of https://github.com/dotnet/runtime into WriteBarrier_WithoutCodegen

273b640

In theory all mucked up to make this work on Windows Amd64... Or at least, somewhat close.

17ad508

Fix config settings for write barrier

4669265

Get the Arm64 build to work on Windows

a24a67e

ghost added the area-VM-coreclr label Apr 23, 2025

dotnet-policy-service bot assigned davidwrighton Apr 23, 2025

This was referenced Apr 24, 2025

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

[linux-x64] [mono-aot] Test Runtime_101731.TestConvertToInt64NativeSingle(3.4028235E+38) returns exit code 22 #112557

Open

davidwrighton added 4 commits April 25, 2025 21:21

Fix arm64 build

a858993

Fix arm build

1c4c79a

Fix RiscV build

999b466

Fix Loongarch64 build

547fb96

build-analysis bot mentioned this pull request Apr 26, 2025

LibraryImportGenerator.Unit.Tests crashing on linux-x64 mono interpreter #100800

Open

davidwrighton added 7 commits April 25, 2025 21:26

Fix Windows X86 build

9fd2921

Attempt to fix Windows Arm64 and Windows X86 issues

9c2e247

Attempt to fix Linux Arm build

4601ff8

Fix Linux Arm32

ff8a695

Alternative approach to fixing the EH issue that may fix OSX as well. If it does... I'll likely do this logic for all architectures

a882fbf

Revert "Alternative approach to fixing the EH issue that may fix OSX as well. If it does... I'll likely do this logic for all architectures" This reverts commit a882fbf.

f5a07a7

Fix the Linux Arm32 Rhp write barrier to support FEATURE_USE_SOFTWARE_WRITE_WATCH_FOR_GC_HEAP

45a866b

build-analysis bot mentioned this pull request Apr 29, 2025

System.Security.Cryptography.X509Certificates.Tests.PfxTests.ReadMLKem512PrivateKey_NotSupported failing with CryptographicException #115156

Closed

filipnavara reviewed Apr 29, 2025

davidwrighton added 4 commits April 30, 2025 14:44

Remove ALTERNATE_ENTRY AVLocations

516656e

Reapply "Alternative approach to fixing the EH issue that may fix OSX as well. If it does... I'll likely do this logic for all architectures" This reverts commit f5a07a7.

2b791c6

Use the different style of lookup only on apple platforms on CoreCLR

ecc06a7

Fix build breaks

e8c9a34

am11 mentioned this pull request May 1, 2025

Convert JIT_Box* to C# #115134

Merged

filipnavara reviewed May 2, 2025

davidwrighton added 2 commits May 2, 2025 20:26

Put the write watch table update in the right spot

9467e25

Set UseGCWriteBarrierCopy to its production value

4e1ee53

davidwrighton changed the title ~~[DRAFT] Write barrier without any RWX pages~~ Write barrier without any RWX pages May 5, 2025

dotnet deleted a comment from azure-pipelines bot May 5, 2025

Merge branch 'main' of github.com:dotnet/runtime into WriteBarrier_WithoutCodegen

3383a68

build-analysis bot mentioned this pull request May 5, 2025

ReadSlhDsa_Pfx_Ietf_NotSupported: CryptographicException : The specified network password is not correct. #115270

Closed

Fix Windows X86 issue where the we failed to stack walk out of a write barrier when using the old write barriers

6ca2479

davidwrighton marked this pull request as ready for review May 5, 2025 22:56

davidwrighton requested a review from MichalStrehovsky as a code owner May 5, 2025 22:56

davidwrighton requested a review from janvorli May 5, 2025 23:56

janvorli reviewed May 6, 2025

filipnavara mentioned this pull request May 6, 2025

Share allocation helpers between CoreCLR and NativeAOT #115339

Open

Code review feedback

435c220

janvorli reviewed May 6, 2025

davidwrighton requested a review from jkotas May 8, 2025 01:33

am11 approved these changes May 8, 2025

jkotas reviewed May 8, 2025

davidwrighton added 2 commits May 13, 2025 09:59

Try the cfi_startproc/cfi_endproc wrappers for ALTERNATE_ENTRY to see if the Apple linker bug can be fixed that way

2950b1f

Merge branch 'main' of https://github.com/dotnet/runtime into WriteBarrier_WithoutCodegen

247ae5f

build-analysis bot mentioned this pull request May 13, 2025

Test failure: baseservices/exceptions/stackoverflow/stackoverflowtester/stackoverflowtester.cmd #110173

Open

davidwrighton requested a review from jkotas May 14, 2025 22:17

jkotas approved these changes May 14, 2025

davidwrighton merged commit 3aa00d7 into dotnet:main May 15, 2025
108 checks passed
