Commit 7db0b74

Abolish PSPSym from ABI (#114630)
NativeAOT already used an ABI without PSPSym; this change aligns CoreCLR with it. The aim is to improve code quality and to get better testing of code generation: unifying it across NativeAOT and CoreCLR gives better GC stress coverage. Eventually, this opens a path to sharing more code between the two runtimes. This change removes every use of PSPSym from the JIT, both emission and consumption. The only use of PSPSym in the VM was in GetExactGenericsToken. The encoding of the generics instance context stack slot is changed to be SP/FP relative. The profiler code is adjusted accordingly.
1 parent c83f92a commit 7db0b74
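
The commit message says the generics instance context stack slot is now encoded SP/FP relative, and the diff changes `GetExactGenericsToken` to take `sp` and `fp` instead of a single `baseStackSlot`. A minimal sketch of what such a lookup could look like follows. It assumes the slot is FP-relative when the method has a stack base register and SP-relative otherwise; this page does not confirm that rule. `GenericsContextSlot` and `ResolveGenericsContextSlot` are hypothetical names, not runtime APIs, and `TADDR` here is only a stand-in alias.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's TADDR (a target address).
using TADDR = std::uintptr_t;

// Hypothetical, simplified view of what a GC info decoder might expose.
struct GenericsContextSlot
{
    bool         hasStackBaseRegister; // true when the method uses a frame register
    std::int32_t offset;               // signed offset from the chosen base
};

// Assumption (not confirmed by the diff): the slot is FP-relative when a
// stack base register exists and SP-relative otherwise.
inline TADDR ResolveGenericsContextSlot(TADDR sp, TADDR fp, const GenericsContextSlot& slot)
{
    const TADDR base = slot.hasStackBaseRegister ? fp : sp;
    assert(base != 0);
    // Sign-extend the offset first so that negative offsets wrap correctly
    // in unsigned pointer-sized arithmetic.
    return base + static_cast<TADDR>(static_cast<std::intptr_t>(slot.offset));
}
```

With both registers passed in, the caller no longer needs a PSPSym value to locate the parent frame: the address comes directly from the current SP or FP.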

47 files changed, +288 -1377 lines changed

docs/design/coreclr/botr/clr-abi.md

Lines changed: 5 additions & 35 deletions
Original file line number | Diff line number | Diff line change
@@ -322,33 +322,9 @@ Finally1:
322322

323323
Note that JIT64 does not implement this properly. The C# compiler used to always insert all necessary "step" blocks. The Roslyn C# compiler at one point did not, but then was changed to once again insert them.
324324

325-
## The PSPSym and funclet parameters
325+
## Funclet parameters
326326

327-
The *PSPSym* (which stands for Previous Stack Pointer Symbol) is a pointer-sized local variable used to access locals from the main function body.
328-
329-
NativeAOT does not use PSPSym. For filter funclets the VM sets the frame register to be the same as the parent function. For second pass funclets the VM restores all non-volatile registers. The same convention is used across all platforms.
330-
331-
CoreCLR uses PSPSym for all platforms except x86: the frame pointer on x86 is always preserved when the handlers are invoked.
332-
333-
First, two definitions.
334-
335-
*Caller-SP* is the value of the stack pointer in a function's caller before the call instruction is executed. That is, when function A calls function B, Caller-SP for B is the value of the stack pointer immediately before the call instruction in A (calling B) was executed. Note that this definition holds for both AMD64, which pushes the return value when a call instruction is executed, and for ARM, which doesn't. For AMD64, Caller-SP is the address above the call return address.
336-
337-
*Initial-SP* is the initial value of the stack pointer after the fixed-size portion of the frame has been allocated. That is, before any "alloca"-type allocations.
338-
339-
The value stored in PSPSym is the value of Initial-SP for AMD64 or Caller-SP for other platforms, for the main function. The stack offset of the PSPSym is reported to the VM in the GC information header. The value reported in the GC information is the offset of the PSPSym from Initial-SP for AMD64 or Caller-SP for other platforms. (Note that both the value stored, and the way the value is reported to the VM, differs between architectures. In particular, note that most things in the GC information header are reported as offsets relative to Caller-SP, but PSPSym on AMD64 is one exception, and maybe the only exception.)
340-
341-
The VM uses the PSPSym to find other locals it cares about (such as the generics context in a funclet frame). The JIT uses it to re-establish the frame pointer register, so that the frame pointer is the same value in a funclet as it is in the main function body.
342-
343-
When a funclet is called, it is passed the *Establisher Frame Pointer*. For AMD64 this is true for all funclets and it is passed as the first argument in RCX, but for ARM and ARM64 this is only true for first pass funclets (currently just filters) and it is passed as the second argument in R1. The Establisher Frame Pointer is a stack pointer of an interesting "parent" frame in the exception processing system. For the CLR, it points either to the main function frame or a dynamically enclosing funclet frame from the same function, for the funclet being invoked. The value of the Establisher Frame Pointer is Initial-SP on AMD64, Caller-SP on x86, ARM, and ARM64.
344-
345-
Using the establisher frame, the funclet wants to load the value of the PSPSym. Since we don't know if the Establisher Frame is from the main function or a funclet, we design the main function and funclet frame layouts to place the PSPSym at an identical, small, constant offset from the Establisher Frame in each case. (This is also required because we only report a single offset to the PSPSym in the GC information, and that offset must be valid for the main function and all of its funclets). Then, the funclet uses this known offset to compute the PSPSym address and read its value. From this, it can compute the value of the frame pointer (which is a constant offset from the PSPSym value) and set the frame register to be the same as the parent function. Also, the funclet writes the value of the PSPSym to its own frame's PSPSym. This "copying" of the PSPSym happens for every funclet invocation, in particular, for every nested funclet invocation.
346-
347-
On ARM and ARM64, for all second pass funclets (finally, fault, catch, and filter-handler) the VM restores all non-volatile registers to their values within the parent frame. This includes the frame register (`R11`). Thus, the PSPSym is not used to recompute the frame pointer register in this case, though the PSPSym is copied to the funclet's frame, as for all funclets.
348-
349-
Catch, Filter, and Filter-handlers also get an Exception object (GC ref) as an argument (`REG_EXCEPTION_OBJECT`). On AMD64 it is the second argument and thus passed in RDX. On ARM and ARM64 this is the first argument and passed in R0.
350-
351-
(Note that the JIT64 source code contains a comment that says, "The current CLR doesn't always pass the correct establisher frame to the funclet. Funclet may receive establisher frame of funclet when expecting that of original routine." It indicates this is the reason that a PSPSym is required in all funclets as well as the main function, whereas if the establisher frame was correctly reported, the PSPSym could be omitted in some cases.)
327+
Catch, Filter, and Filter-handlers get an Exception object (GC ref) as an argument (`REG_EXCEPTION_OBJECT`). On AMD64 it is passed in RCX (Windows ABI) or RSI (Unix ABI). On ARM and ARM64 this is the first argument and passed in R0.
352328

353329
## Funclet Return Values
354330

@@ -374,11 +350,11 @@ Some definitions:
374350

375351
When an exception occurs, the VM is invoked to do some processing. If the exception is within a "try" region, it eventually calls a corresponding handler (which also includes calling filters). The exception location within a function might be where a "throw" instruction executes, the point of a processor exception like null pointer dereference or divide by zero, or the point of a call where the callee threw an exception but did not catch it.
376352

377-
On AMD64, all register values that existed at the exception point in the corresponding "try" region are trashed on entry to the funclet. That is, the only registers that have known values are those of the funclet parameters.
353+
The VM sets the frame register to be the same as the parent function. This allows the funclets to access local variables using frame-relative addresses.
378354

379-
On ARM and ARM64, all registers are restored to their values at the exception point.
355+
For filter funclets and on CoreCLR/AMD64 for all funclets, all other register values that existed at the exception point in the corresponding "try" region are trashed on entry to the funclet. That is, the only registers that have known values are those of the funclet parameters and the frame register.
380356

381-
On x86: TBD.
357+
For other funclets on all platforms except CoreCLR/AMD64, all non-volatile registers are restored to their values at the exception point. The JIT codegen [does not take advantage of it currently](https://github.com/dotnet/runtime/pull/114630#issuecomment-2810210759).
382358

383359
### Registers on return from a funclet
384360

@@ -696,12 +672,6 @@ x64 currently saves RBP, RSI and RDI while ARM64 saves just FP and LR.
696672

697673
However, EnC remap is not supported inside funclets. The stack layout of funclets does not matter for EnC.
698674

699-
## Considerations with regards to PSPSym
700-
701-
As explained previously in this document, on x64 we have Initial RSP == PSPSym. For EnC methods, as we disallow remappings after localloc (see below), we furthermore have RBP == PSPSym.
702-
For ARM64 we have Caller SP == PSPSym and the FP points to the previously saved FP/LR pair. For EnC the JIT always sets up the stack frame so that the FP/LR pair is at Caller SP - 16 and does not save any additional callee saves.
703-
These invariants allow the VM to compute new value of the frame pointer and PSPSym after the edit without any additional information. Note that the frame pointer and PSPSym do not change values or location on ARM64. However, EH may be added to a function in which case a new PSPSym needs to be materialized, even on ARM64. Location of PSPSym is found via GC info.
704-
705675
## Localloc
706676

707677
Localloc is allowed in EnC code, but remap is disallowed after the method has executed a localloc instruction. VM uses the invariants above (`RSP == RBP` on x64, `FP + 16 == SP + stack size` on ARM64) to detect whether localloc was executed by the method.

docs/design/coreclr/botr/guide-for-porting.md

Lines changed: 2 additions & 4 deletions
@@ -386,12 +386,10 @@ Here is an annotated list of the stubs implemented for Unix on Arm64.
386386
application
387387

388388
11. `CallEHFunclet` – Used to call catch, finally and fault funclets. Behavior
389-
is specific to exactly how funclets are implemented. Only used if
390-
USE_FUNCLET_CALL_HELPER is set
389+
is specific to exactly how funclets are implemented.
391390

392391
12. `CallEHFilterFunclet` – Used to call filter funclets. Behavior is specific
393-
to exactly how funclets are implemented. Only used if
394-
USE_FUNCLET_CALL_HELPER is set
392+
to exactly how funclets are implemented.
395393

396394
13. `ResolveWorkerChainLookupAsmStub`/ `ResolveWorkerAsmStub` Used for virtual
397395
stub dispatch (virtual call support for interface, and some virtual

docs/design/features/OsrDetailsAndDebugging.md

Lines changed: 0 additions & 2 deletions
@@ -307,8 +307,6 @@ On Arm64 we have epilog unwind codes and the second SP adjust does not appear to
307307

308308
OSR funclets are more or less normal funclets.
309309

310-
On Arm64, to satisfy PSPSym reporting constraints, the funclet frame must be padded to include the Tier0 frame size. This is conceptually similar to the way the funclet frames also pad for homed varargs arguments -- in both cases the padded space is never used, it is just there to ensure the PSPSym ends up at the same caller-SP relative offset for the main function and any funclet.
311-
312310
#### OSR Unwind Info
313311

314312
On x64 the prolog unwind includes a phantom SP adjustment at offset 0 for the Tier0 frame.

src/coreclr/gcinfo/gcinfoencoder.cpp

Lines changed: 2 additions & 24 deletions
@@ -357,7 +357,6 @@ GcInfoSize& GcInfoSize::operator+=(const GcInfoSize& other)
357357
SecObjSize += other.SecObjSize;
358358
GsCookieSize += other.GsCookieSize;
359359
GenericsCtxSize += other.GenericsCtxSize;
360-
PspSymSize += other.PspSymSize;
361360
StackBaseSize += other.StackBaseSize;
362361
ReversePInvokeFrameSize += other.ReversePInvokeFrameSize;
363362
FixedAreaSize += other.FixedAreaSize;
@@ -406,7 +405,6 @@ void GcInfoSize::Log(DWORD level, const char * header)
406405
LogSpew(LF_GCINFO, level, "Prolog/Epilog: %zu\n", ProEpilogSize);
407406
LogSpew(LF_GCINFO, level, "SecObj: %zu\n", SecObjSize);
408407
LogSpew(LF_GCINFO, level, "GsCookie: %zu\n", GsCookieSize);
409-
LogSpew(LF_GCINFO, level, "PspSym: %zu\n", PspSymSize);
410408
LogSpew(LF_GCINFO, level, "GenericsCtx: %zu\n", GenericsCtxSize);
411409
LogSpew(LF_GCINFO, level, "StackBase: %zu\n", StackBaseSize);
412410
LogSpew(LF_GCINFO, level, "FixedArea: %zu\n", FixedAreaSize);
@@ -471,7 +469,6 @@ template <typename GcInfoEncoding> TGcInfoEncoder<GcInfoEncoding>::TGcInfoEncode
471469
m_GSCookieValidRangeStart = 0;
472470
_ASSERTE(sizeof(m_GSCookieValidRangeEnd) == sizeof(UINT32));
473471
m_GSCookieValidRangeEnd = (UINT32) (-1); // == UINT32.MaxValue
474-
m_PSPSymStackSlot = NO_PSP_SYM;
475472
m_GenericsInstContextStackSlot = NO_GENERICS_INST_CONTEXT;
476473
m_contextParamType = GENERIC_CONTEXTPARAM_NONE;
477474

@@ -702,14 +699,6 @@ template <typename GcInfoEncoding> void TGcInfoEncoder<GcInfoEncoding>::SetGSCoo
702699
m_GSCookieValidRangeEnd = validRangeEnd;
703700
}
704701

705-
template <typename GcInfoEncoding> void TGcInfoEncoder<GcInfoEncoding>::SetPSPSymStackSlot( INT32 spOffsetPSPSym )
706-
{
707-
_ASSERTE( spOffsetPSPSym != NO_PSP_SYM );
708-
_ASSERTE( m_PSPSymStackSlot == NO_PSP_SYM || m_PSPSymStackSlot == spOffsetPSPSym );
709-
710-
m_PSPSymStackSlot = spOffsetPSPSym;
711-
}
712-
713702
template <typename GcInfoEncoding> void TGcInfoEncoder<GcInfoEncoding>::SetGenericsInstContextStackSlot( INT32 spOffsetGenericsContext, GENERIC_CONTEXTPARAM_TYPE type)
714703
{
715704
_ASSERTE( spOffsetGenericsContext != NO_GENERICS_INST_CONTEXT);
@@ -941,7 +930,7 @@ template <typename GcInfoEncoding> void TGcInfoEncoder<GcInfoEncoding>::Build()
941930
UINT32 hasContextParamType = (m_GenericsInstContextStackSlot != NO_GENERICS_INST_CONTEXT);
942931
UINT32 hasReversePInvokeFrame = (m_ReversePInvokeFrameSlot != NO_REVERSE_PINVOKE_FRAME);
943932

944-
BOOL slimHeader = (!m_IsVarArg && !hasGSCookie && (m_PSPSymStackSlot == NO_PSP_SYM) &&
933+
BOOL slimHeader = (!m_IsVarArg && !hasGSCookie &&
945934
!hasContextParamType && (m_InterruptibleRanges.Count() == 0) && !hasReversePInvokeFrame &&
946935
((m_StackBaseRegister == NO_STACK_BASE_REGISTER) || (GcInfoEncoding::NORMALIZE_STACK_BASE_REGISTER(m_StackBaseRegister) == 0))) &&
947936
#ifdef TARGET_AMD64
@@ -970,7 +959,7 @@ template <typename GcInfoEncoding> void TGcInfoEncoder<GcInfoEncoding>::Build()
970959
GCINFO_WRITE(m_Info1, (m_IsVarArg ? 1 : 0), 1, FlagsSize);
971960
GCINFO_WRITE(m_Info1, 0 /* unused - was hasSecurityObject */, 1, FlagsSize);
972961
GCINFO_WRITE(m_Info1, (hasGSCookie ? 1 : 0), 1, FlagsSize);
973-
GCINFO_WRITE(m_Info1, ((m_PSPSymStackSlot != NO_PSP_SYM) ? 1 : 0), 1, FlagsSize);
962+
GCINFO_WRITE(m_Info1, 0 /* unused - was hasPSPSymStackSlot */, 1, FlagsSize);
974963
GCINFO_WRITE(m_Info1, m_contextParamType, 2, FlagsSize);
975964
#if defined(TARGET_LOONGARCH64)
976965
assert(m_StackBaseRegister == 22 || 3 == m_StackBaseRegister);
@@ -1037,17 +1026,6 @@ template <typename GcInfoEncoding> void TGcInfoEncoder<GcInfoEncoding>::Build()
10371026

10381027
}
10391028

1040-
// Encode the offset to the PSPSym.
1041-
// The PSPSym is relative to the caller SP on IA64 and the initial stack pointer before stack allocations on X64.
1042-
if(m_PSPSymStackSlot != NO_PSP_SYM)
1043-
{
1044-
_ASSERTE(!slimHeader);
1045-
#ifdef _DEBUG
1046-
LOG((LF_GCINFO, LL_INFO1000, "Parent PSP at " FMT_STK "\n", DBG_STK(m_PSPSymStackSlot)));
1047-
#endif
1048-
GCINFO_WRITE_VARL_S(m_Info1, GcInfoEncoding::NORMALIZE_STACK_SLOT(m_PSPSymStackSlot), GcInfoEncoding::PSP_SYM_STACK_SLOT_ENCBASE, PspSymSize);
1049-
}
1050-
10511029
// Encode the offset to the generics type context.
10521030
if(m_GenericsInstContextStackSlot != NO_GENERICS_INST_CONTEXT)
10531031
{

src/coreclr/inc/eetwain.h

Lines changed: 2 additions & 2 deletions
@@ -483,10 +483,10 @@ PTR_VOID GetExactGenericsToken(PREGDISPLAY pContext,
483483
EECodeInfo * pCodeInfo);
484484

485485
static
486-
PTR_VOID GetExactGenericsToken(SIZE_T baseStackSlot,
486+
PTR_VOID GetExactGenericsToken(TADDR sp,
487+
TADDR fp,
487488
EECodeInfo * pCodeInfo);
488489

489-
490490
#endif // FEATURE_EH_FUNCLETS && USE_GC_INFO_DECODER
491491

492492
/*

src/coreclr/inc/gcinfodecoder.h

Lines changed: 3 additions & 2 deletions
@@ -218,7 +218,7 @@ enum GcInfoDecoderFlags
218218
DECODE_INTERRUPTIBILITY = 0x08,
219219
DECODE_GC_LIFETIMES = 0x10,
220220
DECODE_NO_VALIDATION = 0x20,
221-
DECODE_PSP_SYM = 0x40,
221+
DECODE_PSP_SYM = 0x40, // Unused starting with v4 format
222222
DECODE_GENERICS_INST_CONTEXT = 0x80, // stack location of instantiation context for generics
223223
// (this may be either the 'this' ptr or the instantiation secret param)
224224
DECODE_GS_COOKIE = 0x100, // stack location of the GS cookie
@@ -237,7 +237,7 @@ enum GcInfoHeaderFlags
237237
GC_INFO_IS_VARARG = 0x1,
238238
// unused = 0x2, // was GC_INFO_HAS_SECURITY_OBJECT
239239
GC_INFO_HAS_GS_COOKIE = 0x4,
240-
GC_INFO_HAS_PSP_SYM = 0x8,
240+
GC_INFO_HAS_PSP_SYM = 0x8, // Unused starting with v4 format
241241
GC_INFO_HAS_GENERICS_INST_CONTEXT_MASK = 0x30,
242242
GC_INFO_HAS_GENERICS_INST_CONTEXT_NONE = 0x00,
243243
GC_INFO_HAS_GENERICS_INST_CONTEXT_MT = 0x10,
@@ -583,6 +583,7 @@ class TGcInfoDecoder
583583
INT32 GetReversePInvokeFrameStackSlot();
584584
bool HasMethodDescGenericsInstContext();
585585
bool HasMethodTableGenericsInstContext();
586+
bool HasStackBaseRegister();
586587
bool GetIsVarArg();
587588
bool WantsReportOnlyLeaf();
588589
#if defined(TARGET_ARM) || defined(TARGET_ARM64) || defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
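
The diff keeps `GC_INFO_HAS_PSP_SYM = 0x8` as an unused bit instead of deleting it, and the encoder now writes a constant 0 in that position, so the remaining header flags keep their bit values. The sketch below illustrates the idea on a simplified flags word. It is not the real decoder: the actual header has slim and fat forms that are ignored here, the `Sketch*` names are hypothetical, and only the bit values are copied from the lines shown above.

```cpp
#include <cassert>
#include <cstdint>

// Bit values copied from the GcInfoHeaderFlags lines shown in this diff.
// The Sketch* names are hypothetical and exist only for this example.
constexpr std::uint32_t SketchIsVarArg            = 0x1;
constexpr std::uint32_t SketchHasGsCookie         = 0x4;
constexpr std::uint32_t SketchReservedWasPspSym   = 0x8;  // unused starting with v4 format
constexpr std::uint32_t SketchGenericsContextMask = 0x30;

struct SketchHeaderFlags
{
    bool          isVarArg;
    bool          hasGsCookie;
    std::uint32_t genericsContextKind; // 0x00 = none, 0x10 = method table, per the diff
};

inline SketchHeaderFlags DecodeSketchHeaderFlags(std::uint32_t flags)
{
    // Assumption: the input comes from an encoder that always writes 0 for
    // the retired bit, as the updated GCINFO_WRITE line in this commit does.
    assert((flags & SketchReservedWasPspSym) == 0);
    return SketchHeaderFlags{
        (flags & SketchIsVarArg) != 0,
        (flags & SketchHasGsCookie) != 0,
        flags & SketchGenericsContextMask,
    };
}
```

Reserving the bit rather than compacting the layout means the flags that follow it do not need to be renumbered.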

src/coreclr/inc/gcinfoencoder.h

Lines changed: 3 additions & 6 deletions
@@ -23,7 +23,7 @@
2323
- Flag: isVarArg,
2424
unused (was hasSecurityObject),
2525
hasGSCookie,
26-
hasPSPSymStackSlot,
26+
unused (was hasPSPSymStackSlot),
2727
hasGenericsInstContextStackSlot,
2828
hasStackBaseregister,
2929
wantsReportOnlyLeaf (AMD64 use only),
@@ -34,9 +34,9 @@
3434
- CodeLength
3535
- Prolog (if hasGenericsInstContextStackSlot || hasGSCookie)
3636
- Epilog (if hasGSCookie)
37-
- SecurityObjectStackSlot (if any)
37+
- SecurityObjectStackSlot (if any; no longer used)
3838
- GSCookieStackSlot (if any)
39-
- PSPSymStackSlot (if any)
39+
- PSPSymStackSlot (if any; no longer used)
4040
- GenericsInstContextStackSlot (if any)
4141
- StackBaseRegister (if any)
4242
- SizeOfEditAndContinuePreservedArea (if any)
@@ -128,7 +128,6 @@ struct GcInfoSize
128128
size_t ProEpilogSize;
129129
size_t SecObjSize;
130130
size_t GsCookieSize;
131-
size_t PspSymSize;
132131
size_t GenericsCtxSize;
133132
size_t StackBaseSize;
134133
size_t ReversePInvokeFrameSize;
@@ -408,7 +407,6 @@ class TGcInfoEncoder
408407

409408
void SetPrologSize( UINT32 prologSize );
410409
void SetGSCookieStackSlot( INT32 spOffsetGSCookie, UINT32 validRangeStart, UINT32 validRangeEnd );
411-
void SetPSPSymStackSlot( INT32 spOffsetPSPSym );
412410
void SetGenericsInstContextStackSlot( INT32 spOffsetGenericsContext, GENERIC_CONTEXTPARAM_TYPE type);
413411
void SetReversePInvokeFrameSlot(INT32 spOffset);
414412
void SetIsVarArg();
@@ -492,7 +490,6 @@ class TGcInfoEncoder
492490
INT32 m_GSCookieStackSlot;
493491
UINT32 m_GSCookieValidRangeStart;
494492
UINT32 m_GSCookieValidRangeEnd;
495-
INT32 m_PSPSymStackSlot;
496493
INT32 m_GenericsInstContextStackSlot;
497494
GENERIC_CONTEXTPARAM_TYPE m_contextParamType;
498495
UINT32 m_CodeLength;

src/coreclr/inc/readytorun.h

Lines changed: 5 additions & 2 deletions
@@ -19,10 +19,10 @@
1919
// src/coreclr/nativeaot/Runtime/inc/ModuleHeaders.h
2020
// If you update this, ensure you run `git grep MINIMUM_READYTORUN_MAJOR_VERSION`
2121
// and handle pending work.
22-
#define READYTORUN_MAJOR_VERSION 12
22+
#define READYTORUN_MAJOR_VERSION 13
2323
#define READYTORUN_MINOR_VERSION 0x0000
2424

25-
#define MINIMUM_READYTORUN_MAJOR_VERSION 12
25+
#define MINIMUM_READYTORUN_MAJOR_VERSION 13
2626

2727
// R2R Version 2.1 adds the InliningInfo section
2828
// R2R Version 2.2 adds the ProfileDataInfo section
@@ -40,6 +40,9 @@
4040
// R2R Version 10.1 adds Unbox_TypeTest helper
4141
// R2R Version 11 uses GCInfo v4, which encodes safe points without -1 offset and does not track return kinds in GCInfo
4242
// R2R Version 12 requires all return buffers to be always on the stack
43+
// R2R Version 13 removes usage of PSPSym, changes ABI for funclets to match NativeAOT, changes register for
44+
// exception parameter on AMD64, and redefines generics instance context stack slot in GCInfo v4
45+
// to be SP/FP relative
4346

4447
struct READYTORUN_CORE_HEADER
4548
{
