[WIP] Fast codegen-free field access #23783

GrabYourPitchforks · 2019-04-06T01:53:18Z

Contributes to https://github.com/dotnet/corefx/issues/24390. This is only a proof of concept to solicit discussion.

This is an experimental FieldAccessor type which provides read+write access to a class's instance fields in a manner almost as high performance as raw codegen.

static void Main()
{
    FieldInfo field = typeof(Person).GetField("LastName");
    var accessor = new FieldAccessor<Person, string>(field);

    Person p = new Person
    {
        FirstName = "John",
        LastName = "Smith"
    };

    while (true)
    {
        GetLastNameFromFieldAccessor(accessor, p);
        GetLastNameFromPersonDirectly(accessor, p);
    }
}

[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.NoInlining)]
static string GetLastNameFromFieldAccessor(FieldAccessor<Person, string> accessor, Person p)
{
    return accessor.GetRef(p);
}

[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.NoInlining)]
static string GetLastNameFromPersonDirectly(object unused, Person p)
{
    return p.LastName;
}

public class Person
{
    public string FirstName;
    public string LastName;
}

GetLastNameFromFieldAccessor(System.Reflection.FieldAccessor`2<Person,System.String>, Person):
00007ffc`74526ce0 0f1f440000      nop     dword ptr [rax+rax]
00007ffc`74526ce5 488b4108        mov     rax,qword ptr [rcx+8]   ; rax := offset of Person.LastName field
00007ffc`74526ce9 488b0a          mov     rcx,qword ptr [rdx]   ; null check on Person before we perform the byref operation
00007ffc`74526cec 488b0402        mov     rax,qword ptr [rdx+rax]   ; deref Person.LastName field and return it
00007ffc`74526cf0 c3              ret

GetLastNameFromPersonDirectly(System.Object, Person):
00007ffc`74526d10 0f1f440000      nop     dword ptr [rax+rax]
00007ffc`74526d15 488b4210        mov     rax,qword ptr [rdx+10h]   ; return Person.LastName directly
00007ffc`74526d19 c3              ret

Lots of open issues here, especially regarding how readonly fields should be treated, whether the type checks in the FieldAccessor ctor are correct, and so on. But it serves as a proof of concept that we can pull this off to enable codegen-free fast serialization scenarios.

(This also slightly optimizes the Memory<T>.Span property getter.)

/cc @jkotas @steveharter @stephentoub @dotnet/jit-contrib as FYI

jkotas · 2019-04-06T02:03:16Z

src/jit/importer.cpp

@@ -4269,6 +4283,13 @@ NamedIntrinsic Compiler::lookupNamedIntrinsic(CORINFO_METHOD_HANDLE method)
            result = NI_System_Collections_Generic_EqualityComparer_get_Default;
        }
    }
+    else if (strcmp(namespaceName, "System.Runtime.CompilerServices") == 0)
+    {


Why do you need all this - why is not object.GetRawData() + Unsafe.Add enough?

It was resulting in weird unoptimized codegen. (The Memory.Span property getter today shows some of this weird codegen since it uses a very similar trick to what you propose.) This seemed like the safest / most straightforward way to address it.

My usual question for this is how hard is it to just fix it in the JIT. You can always introduce special intrinsics, etc., but it does not scale.

Somebody more familiar with JIT would have to chime in, but a cursory inspection makes it seem like modifying the optimizer would be more work than adding this intrinsic. It seems JIT doesn't quite know how to optimize the two-step process of getting a field ref from an object (convert GCREF to BYREF), then further manipulating the BYREF by performing an addition operation. I suspect that there was simply never a scenario for this previously so it never made sense to invest in this seemingly bizarre optimization.

Are you talking about generated code such as:

3912 cmp dword ptr [rdx], edx 4883C208 add rdx, 8 8B4204 mov eax, dword ptr [rdx+4]

where the null check could be omitted and the addition could be folded into the address mode?

ObjectHasComponentSize

I am not really happy that the current implementation blurs the boundaries between objectrefs and byrefs. E.g. it would prevent us from experimenting with removing syncblocks. The proper implementation of this method would avoid ever creating a byref pointing to offset 0. If we want to make this better, I would rather spend time on that; not on trying to fix the JIT to produce perfect code for access to negative field offsets.

I feel you that it's a bit of a hack and has definitely got some code smell to it. And I'd prefer a much cleaner way of doing it. :)

But before it was checked in there were discussions with people familiar with the runtime, object layout, the JIT, and the GC. And it was determined that accessing fields at offset 0 (pMethodTable) would be safe.

You're right that accessing fields at truly negative offsets would be dangerous, and it would interfere both with the GC's ability to ref track and the runtime team's ability to experiment with things like sync block removal. But nobody is proposing accessing fields at negative offsets, let alone optimizing it.

BTW I'm experimenting with whether it'd be possible to do this using IL replacement rather than GenTree replacement. Would that alleviate some of the concerns people have?

Edit: Interestingly, using IL replacement (through method rewriting a la what we do with Unsafe and other RuntimeHelpers routines) not only resulted in worse codegen, it resulted in incorrect codegen.

mdToken tokRawData = MscorlibBinder::GetField(FIELD__RAW_DATA__DATA)->GetMemberDef(); static BYTE ilcode[] = { CEE_LDARG_0, CEE_LDFLDA,0,0,0,0, #ifdef _TARGET_64BIT_ CEE_LDC_I4_8, #else // _TARGET_64BIT_ CEE_LDC_I4_4, #endif // _TARGET_64BIT_ CEE_SUB, CEE_LDARG_1, CEE_ADD, CEE_RET }; ilcode[2] = (BYTE)(tokRawData); ilcode[3] = (BYTE)(tokRawData >> 8); ilcode[4] = (BYTE)(tokRawData >> 16); ilcode[5] = (BYTE)(tokRawData >> 24);

GetLastNameFromFieldAccessor(System.Reflection.FieldAccessor`2<Person,System.String>, Person): 00007ffc`74536ed0 0f1f440000 nop dword ptr [rax+rax] 00007ffc`74536ed5 488b4108 mov rax,qword ptr [rcx+8] ; rax := this._fieldOffset (also a null check on 'this') 00007ffc`74536ed9 3912 cmp dword ptr [rdx],edx ; null check on 'obj' 00007ffc`74536edb 4883c208 add rdx,8 ; rdx := ref to ((RawData)obj).Data 00007ffc`74536edf 4803c2 add rax,rdx ; ** INCORRECT ** adding _fieldOffset to ref data before subtracting 8 00007ffc`74536ee2 4883c0f8 add rax,0FFFFFFFFFFFFFFF8h 00007ffc`74536ee6 488b00 mov rax,qword ptr [rax] ; deref LastName field 00007ffc`74536ee9 c3 ret

Even though in IL I subtracted 8 first before adding fieldOffset, something in the JIT decided to reorder these operations, resulting in a scenario where rax might point well off the end of the object and no longer be GC-trackable.

Why can't you just do => ref Unsafe.As<byte, TField>(ref Unsafe.AddByteOffset(ref RuntimeHelpers.GetRawData(o), _offset));? The codegen for that may have one extra instruction than the ideal codegen; but I do not think that it warrants workarounds like IL replacement.

it resulted in incorrect codegen

Yep, byref arithmetic is very slippery slope. Many opportunities for hard to trace down GC holes.

The codegen for that may have one extra instruction than the ideal codegen; but I do not think that it warrants workarounds like IL replacement.

Sure, I'll cede that point. Bonus: using the existing code allows the obj null check to avoid an extra register allocation. :)

jkotas · 2019-04-06T02:49:10Z

It would be useful to have some examples that shows how serializers would be changed to use this.

omariom · 2019-04-06T03:23:27Z

👍
Any chance for structs?
public ref TField GetRef(in TObject obj)

hughbe · 2019-04-06T05:31:15Z

I’ve noticed that recently we’ve been substantially affecting the readability and expressibility of the System.Private.CoreLib code smattering it with Unsafe reads, JIT intrinsics and other memory “hacks”

I’m not discounting the benefits - performance that permeates through the stack, but it strikes me that some of these cases would be better done by improving the JIT to optimise more cases or do better.

The reasons to me is that it’s “easier” to do this in c# because open source contributors have more knowledge of c# and the JIT Is hard to approach

mikedn · 2019-04-06T07:16:12Z

I’ve noticed that recently we’ve been substantially affecting the readability and expressibility of the System.Private.CoreLib code smattering it with Unsafe reads, JIT intrinsics and other memory “hacks”

And when the JIT does improve, attempting to remove such hacks can require a ton of work. It's not like you can just remove them, you need to check the generated code, perhaps run some benchmarks, deal with unexpected interactions between multiple hacks and so on.

The reasons to me is that it’s “easier” to do this in c# because open source contributors have more knowledge of c# and the JIT Is hard to approach

Perhaps the JIT would be easier to approach if the code was better written. Sadly, a lot was written just with short term gains in mind and targeting today's "real world" use cases with 0 consideration for the fact that tomorrow's "real world" might be different.

GrabYourPitchforks · 2019-04-06T16:52:57Z

Let me open a new issue to track the ldflda + add JIT optimization work. If that comes in, this PR would be able to take advantage of it, and the Memory.Span property getter (the only other similar usage I know of today) would immediately see the benefit of the simplified codegen.

tannergooding · 2019-04-06T18:33:10Z

I’ve noticed that recently we’ve been substantially affecting the readability and expressibility of the System.Private.CoreLib code smattering it with Unsafe reads, JIT intrinsics and other memory “hacks”

For the most part, many of these are things (like HWIntrinsics) are things that even native compilers expose because they are non-trivial things to handle in the compiler or (like Unsafe) are places that are limitations in the C# language today.

However, I do agree that we shouldn't just be adding new intrinsics for any old case and that we should look at optimizing the JIT instead where possible or trying to provide an appropriate abstractions so that low level details don't bleed into all of our APIs.

omariom · 2019-04-06T22:16:55Z

src/System.Private.CoreLib/src/System/Reflection/FieldAccessor.cs

+
+namespace System.Reflection
+{
+    public sealed class FieldAccessor<TObject, TField> where TObject : class


Why is FieldAccessor not a struct?

Would need to add an initalized field to ensure it had sensible values for default(FieldAccessor)

private readonly uint _fieldOffset; private readonly bool _initalized;

and additionally need a check in the GetRef

public ref TField GetRef(TObject obj) { if (!_initalized) ThrowHelper.ThrowNotInitalized(); // ... }

Question is does it add value in terms of the extra cost vs size saved in number of instances and one less indirection.

Yeah, Ben and I had a brief conversation on this the other day. The type wouldn't be safe to use if it were a struct, as not only would uninitialized usage be a problem, the struct would also be subject to tearing concerns.

There's no real perf hit to making this a class instead of a struct. The intent is that the caller will cache the instance and reuse it. Since the accessor itself is inlined (it's a non-virtual method) and usage doesn't incur a null check on the accessor itself (see the comment at the top of GetRef), this should take care of any perf concerns.

yahorsi · 2019-04-12T07:02:13Z

This might be usefull not just in serializers. Here are our case. In our project we have a set of POCO classes that needs to be copied/cloned. It is very easy to forget to copy the field if you write such code manually and to be fair, writing such code is not pleasant. So, we build our own "cloner" that generate copy/clone code on the fly by emiting IL.

GrabYourPitchforks · 2019-09-05T21:11:37Z

I updated the PR based on previous feedback. It's still WIP since I've not tested it to see what codegen is emitted into the callers.

Updates since last iteration:

Now includes support for value types in addition to reference types.
There's a read-only version and a mutable version. The mutable version is built atop the read-only version, but this should probably be changed. We might also need different classes for value types vs. reference types since they have different mutability constraints.
Removed the the custom IL generation step that reinterpreted the method table as a ref byte, instead opting for existing methods on the Unsafe and RuntimeHelpers class. I had to add one method to determine whether TObject was a value type or a reference type, but if we split the accessor into different types depending on valuetype vs. class then we might be able to get rid of this new method.

benaadams · 2019-09-07T05:00:32Z

Example use case? https://github.com/aspnet/AspNetCore/blob/master/src/Shared/PropertyHelper/PropertyHelper.cs

jkotas · 2019-09-07T05:03:50Z

https://github.com/aspnet/AspNetCore/blob/master/src/Shared/PropertyHelper/PropertyHelper.cs

All these are methods and properties. This feature as it stands is for fields only.

YohDeadfall · 2019-09-07T07:47:06Z

@benadams A good example is dotnet/corefx#39254. I will be happy if @GrabYourPitchforks will do the same for properties because it's the case you pointed, but also serializers and composite type handling in Npgsql would benefit from this.

maryamariyan · 2019-11-06T21:05:14Z

Thank you for your contribution. As announced in dotnet/coreclr#27549 this repository will be moving to dotnet/runtime on November 13. If you would like to continue working on this PR after this date, the easiest way to move the change to dotnet/runtime is:

In your coreclr repository clone, create patch by running git format-patch origin
In your runtime repository clone, apply the patch by running git apply --directory src/coreclr <path to the patch created in step 1>

GrabYourPitchforks · 2019-11-07T01:16:40Z

This isn't going to be finished before the coming repo merge. I've migrated it to a temporary holding area GrabYourPitchforks#1 so that it doesn't get lost in the shuffle. Will reopen at some point in the future once the dust settles.

jkotas reviewed Apr 6, 2019

View reviewed changes

omariom reviewed Apr 6, 2019

View reviewed changes

ReubenBond mentioned this pull request Apr 8, 2019

Support readonly/private fields & properties ReubenBond/Hagar#46

Merged

Provide implementations of ReadOnlyFieldAccessor / FieldAccessor

920be54

GrabYourPitchforks force-pushed the fast_fieldaccess branch from ae98c02 to 920be54 Compare September 5, 2019 21:06

jkotas added the area-System.Reflection label Nov 2, 2019

GrabYourPitchforks mentioned this pull request Nov 7, 2019

[WIP] Fast codegen-free field access GrabYourPitchforks/coreclr#1

Open

GrabYourPitchforks closed this Nov 7, 2019

This was referenced Jan 31, 2020

JIT: ldflda + add not properly being optimized in certain cases dotnet/runtime#12436

Open

JIT optimizations can produce incorrect IL when dealing with byrefs dotnet/runtime#12438

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Fast codegen-free field access #23783

[WIP] Fast codegen-free field access #23783

GrabYourPitchforks commented Apr 6, 2019

jkotas Apr 6, 2019

GrabYourPitchforks Apr 6, 2019

jkotas Apr 6, 2019

GrabYourPitchforks Apr 6, 2019

mikedn Apr 6, 2019

jkotas Apr 6, 2019 •

edited

GrabYourPitchforks Apr 6, 2019

GrabYourPitchforks Apr 6, 2019 •

edited

jkotas Apr 6, 2019 •

edited

GrabYourPitchforks Apr 6, 2019 •

edited

jkotas commented Apr 6, 2019

omariom commented Apr 6, 2019

hughbe commented Apr 6, 2019

mikedn commented Apr 6, 2019

GrabYourPitchforks commented Apr 6, 2019

tannergooding commented Apr 6, 2019

omariom Apr 6, 2019

benaadams Apr 7, 2019

GrabYourPitchforks Apr 10, 2019 •

edited

yahorsi commented Apr 12, 2019

GrabYourPitchforks commented Sep 5, 2019

benaadams commented Sep 7, 2019

jkotas commented Sep 7, 2019 •

edited

YohDeadfall commented Sep 7, 2019

maryamariyan commented Nov 6, 2019

GrabYourPitchforks commented Nov 7, 2019

[WIP] Fast codegen-free field access #23783

[WIP] Fast codegen-free field access #23783

Conversation

GrabYourPitchforks commented Apr 6, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkotas Apr 6, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

GrabYourPitchforks Apr 6, 2019 • edited

Choose a reason for hiding this comment

jkotas Apr 6, 2019 • edited

Choose a reason for hiding this comment

GrabYourPitchforks Apr 6, 2019 • edited

Choose a reason for hiding this comment

jkotas commented Apr 6, 2019

omariom commented Apr 6, 2019

hughbe commented Apr 6, 2019

mikedn commented Apr 6, 2019

GrabYourPitchforks commented Apr 6, 2019

tannergooding commented Apr 6, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

GrabYourPitchforks Apr 10, 2019 • edited

Choose a reason for hiding this comment

yahorsi commented Apr 12, 2019

GrabYourPitchforks commented Sep 5, 2019

benaadams commented Sep 7, 2019

jkotas commented Sep 7, 2019 • edited

YohDeadfall commented Sep 7, 2019

maryamariyan commented Nov 6, 2019

GrabYourPitchforks commented Nov 7, 2019

jkotas Apr 6, 2019 •

edited

GrabYourPitchforks Apr 6, 2019 •

edited

jkotas Apr 6, 2019 •

edited

GrabYourPitchforks Apr 6, 2019 •

edited

GrabYourPitchforks Apr 10, 2019 •

edited

jkotas commented Sep 7, 2019 •

edited