Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve static field access for NativeAOT #63620

Closed
wants to merge 1 commit into from

Conversation

MichalStrehovsky
Copy link
Member

NativeAOT currently goes through a helper call whenever we need to access static fields. For this simple program:

class Program
{
    static int s1, s2;
    static int Main() => s1 + s2;
}

We generate:

repro_Program__Main:
sub         rsp,28h
call        __GetNonGCStaticBase_repro_Program
mov         edx,dword ptr [rax]
add         edx,dword ptr [rax+4]
mov         eax,edx
add         rsp,28h
ret

__GetNonGCStaticBase_repro_Program:
lea         rax,[?__NONGCSTATICS@repro_Program@@]
ret

It's not terrible, but also could be better. As far as I can see, JitInterface cannot express the exact way static fields are accessed in NativeAOT. I had a previous attempt to use the existing JitInterface facilities in dotnet/corert#5131 (misusing the facilities that exist to support RVA static fields), but it didn't generate "nice" addressing modes and couldn't support GC statics (see the disassembly there).

I'm adding a way to do that with "nice" addressing modes. After this change, the above program compiles into:

repro_Program__Main:
mov         eax,dword ptr [?__NONGCSTATICS@repro_Program@@]
lea         rdx,[?__NONGCSTATICS@repro_Program@@]
add         eax,dword ptr [rdx+4]
ret

There's room for improvement:

NativeAOT currently goes through a helper call whenever we need to access static fields. For this simple program:

```csharp
class Program
{
    static int s1, s2;
    static int Main() => s1 + s2;
}
```

We generate:

```nasm
repro_Program__Main:
sub         rsp,28h
call        __GetNonGCStaticBase_repro_Program
mov         edx,dword ptr [rax]
add         edx,dword ptr [rax+4]
mov         eax,edx
add         rsp,28h
ret

__GetNonGCStaticBase_repro_Program:
lea         rax,[?__NONGCSTATICS@repro_Program@@]
ret
```

It's not terrible, but also could be better. As far as I can see, JitInterface cannot express the exact way static fields are accessed in NativeAOT. I had a previous attempt to use the existing JitInterface facilities in dotnet/corert#5131 (misusing the facilities that exist to support RVA static fields), but it didn't generate "nice" addressing modes and couldn't support GC statics (see the disassembly there).

I'm adding a way to do that with "nice" addressing modes. After this change, the above program compiles into:

```nasm
repro_Program__Main:
mov         eax,dword ptr [?__NONGCSTATICS@repro_Program@@]
lea         rdx,[?__NONGCSTATICS@repro_Program@@]
add         eax,dword ptr [rdx+4]
ret
```

There's room for improvement:
* It would be nice if we could similarly inline these lookups when we're in shared generic code
* I think the addressing mode could be more compact if RyuJIT generated a reloc with a delta on x64.
* It would be nice if we could do this if there's a static constructor. .NET Native could inline the "did static constructor already run?" checks. There was a previous attempt at that in dotnet/coreclr#12420 (and corresponding dotnet/corert#3962).
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 11, 2022
@ghost
Copy link

ghost commented Jan 11, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

NativeAOT currently goes through a helper call whenever we need to access static fields. For this simple program:

class Program
{
    static int s1, s2;
    static int Main() => s1 + s2;
}

We generate:

repro_Program__Main:
sub         rsp,28h
call        __GetNonGCStaticBase_repro_Program
mov         edx,dword ptr [rax]
add         edx,dword ptr [rax+4]
mov         eax,edx
add         rsp,28h
ret

__GetNonGCStaticBase_repro_Program:
lea         rax,[?__NONGCSTATICS@repro_Program@@]
ret

It's not terrible, but also could be better. As far as I can see, JitInterface cannot express the exact way static fields are accessed in NativeAOT. I had a previous attempt to use the existing JitInterface facilities in dotnet/corert#5131 (misusing the facilities that exist to support RVA static fields), but it didn't generate "nice" addressing modes and couldn't support GC statics (see the disassembly there).

I'm adding a way to do that with "nice" addressing modes. After this change, the above program compiles into:

repro_Program__Main:
mov         eax,dword ptr [?__NONGCSTATICS@repro_Program@@]
lea         rdx,[?__NONGCSTATICS@repro_Program@@]
add         eax,dword ptr [rdx+4]
ret

There's room for improvement:

Author: MichalStrehovsky
Assignees: MichalStrehovsky
Labels:

area-CodeGen-coreclr

Milestone: -

src/coreclr/jit/importer.cpp Show resolved Hide resolved
Comment on lines +8255 to +8256
// Prevent CSE from adding the offset to the base
op1->gtFlags |= GTF_DONT_CSE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why prevent CSE unconditionally? At least for the IAT_PVALUE case it seems it could be valuable, for the IAT_VALUE less so I suppose, except on ARM/64.

Is the issue here that we cannot fold Icon + Offset and end up CSEing it?
(If so would be worth a comment to that effect; seems like a bug in gtSetEvalOrder's logic for setting ADDRMODE_NO_CSE -- perhaps that's what the existing comment meant to say?)

Copy link
Member Author

@MichalStrehovsky MichalStrehovsky Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was seeing RyuJIT fold the constant handle with the constant and that created an invalid handle (the handle is not an address when we're AOT compiling. It's an actual handle that should not be done constant propagation math on). Could be that this is a bug in CSE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I think I know what the issue is (it should be a VN + assertion propagation bug). Could you open a bug on this and add a TODO-Bug: workaround for <link> here?

@@ -1649,6 +1649,7 @@ enum CORINFO_FIELD_ACCESSOR
CORINFO_FIELD_STATIC_ADDR_HELPER, // static field accessed using address-of helper (argument is FieldDesc *)
CORINFO_FIELD_STATIC_TLS, // unmanaged TLS access
CORINFO_FIELD_STATIC_READYTORUN_HELPER, // static field access using a runtime lookup helper
CORINFO_FIELD_STATIC_DATASEGMENT, // static field access from the data segment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have documentation above on what the various accessors mean, it would be nice to update it to cover this new value.

(I realize some of it is rather outdated)

@@ -8231,6 +8231,35 @@ GenTree* Compiler::impImportStaticFieldAccess(CORINFO_RESOLVED_TOKEN* pResolvedT
}
break;

case CORINFO_FIELD_STATIC_DATASEGMENT:
Copy link
Contributor

@SingleAccretion SingleAccretion Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does look a lot like getFieldAddress with the second argument not set to a dummy value.

I realize it won't work as-is because the Jit assumes getFieldAddress will always return a direct address today (it should not be hard to fix that), it just does not feel great to have the duplicated functionality.

Edit: on second thought, plumbing this through getFieldAddress is not that great as it will require getFieldAddress to become more like getBaseFieldAddress. So I agree a new accessor is the better way to have things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference from getFieldAddress is that this gives RyuJIT a base+offset view instead of giving a single symbol that RyuJIT can't do much about (getFieldAddress returns the address with the base factored in).

It was the feedback from dotnet/corert#5131 (comment) (my earlier attempt) that ended up going the getFieldAddress path.

else
{
assert(pFieldInfo->fieldLookup.accessType == InfoAccessType::IAT_PVALUE);
op1 = gtNewIndOfIconHandleNode(TYP_BYREF, (size_t)pFieldInfo->fieldLookup.addr, GTF_ICON_GLOBAL_PTR, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GTF_ICON_GLOBAL_PTR, isInvariant: false

What is the scenario where the address of the static can change?

Note that some code in the Jit depends on the addresses of statics being invariant. It will not matter for this change now, I believe, but I have changes in progress that will make this dependency more visible.

(To be clear: mutable addresses can absolutely be made to work, I just need to understand the scenarios I will need to add as tests to account for them)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gtNewIndOfIconHandleNode asserts if GTF_ICON_GLOBAL_PTR is used with isInvariant = true. This was just me hammering on it until it worked. Maybe we need a new GTF_ value for this? I'm not particularly experienced in the RyuJIT code base; I know just enough to CTRL-C/CTRL-V from various places.

Copy link
Contributor

@SingleAccretion SingleAccretion Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a new handle kind would be the best solution (GTF_STATIC_ADDR_PTR say...). Using GTF_ICON_CONST_PTR will work too but is less desirable. When adding a new handle kind, please also add printing for it to cTreeFlags and gtDispConst.

Why a new handle kind?

A "raw" GTF_ICON_CONST_PTR will work as an invariant indirection, but a dedicated handle kind will allow VN to treat it specially (and more precisely):

// We will have staticPtr = IND(STATIC_ADDR_PTR) and give "staticPtr" a "VNF_PtrToStatic" VN.
ref var staticPtr = ref staticField;

staticPtr = { ... }; // The compiler will be able to see that this store is just "staticField = { ... }"

It will be very similar to GTF_ICON_STATIC_BOX_PTR.

@MichalStrehovsky
Copy link
Member Author

@jkotas Would there be any advantage for CoreCLR to also give RyuJIT visibility into base+offset form of static fields? I'm wondering whether we should just do something with getFieldAddress and then use the same codepath for both JIT and AOT.

@jkotas
Copy link
Member

jkotas commented Jan 13, 2022

It would not hurt to have the JIT/EE interface setup for this. It is done today using heuristic for JIT-case only - see #39096.

I agree with @SingleAccretion that the new CORINFO_FIELD_STATIC_... value looks very similar to the existing one, and that it would be nice to just extend the existing one.

@MichalStrehovsky
Copy link
Member Author

I think taking the same codepaths as in the JITted case will be a lot more feasible once we get rid of GT_VAR as outlined in #63845. At that point we can construct the right tree using pFieldInfo->fieldLookup.addr + pFieldInfo->offset.

Right now we kind of lose the addr+offset information when we make a GT_CLS_VAR (and then we need to query it with getFieldAddress`). So I'll just wait with this for that.

@MichalStrehovsky
Copy link
Member Author

Don't have time for this now. Tracked in #64242.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 17, 2022
@MichalStrehovsky MichalStrehovsky deleted the staticFields branch January 29, 2024 07:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants