Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

[WIP] Fast codegen-free field access #23783

Closed

Conversation

GrabYourPitchforks
Copy link
Member

Contributes to https://github.com/dotnet/corefx/issues/24390. This is only a proof of concept to solicit discussion.

This is an experimental FieldAccessor type which provides read+write access to a class's instance fields in a manner almost as high performance as raw codegen.

static void Main()
{
    FieldInfo field = typeof(Person).GetField("LastName");
    var accessor = new FieldAccessor<Person, string>(field);

    Person p = new Person
    {
        FirstName = "John",
        LastName = "Smith"
    };

    while (true)
    {
        GetLastNameFromFieldAccessor(accessor, p);
        GetLastNameFromPersonDirectly(accessor, p);
    }
}

[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.NoInlining)]
static string GetLastNameFromFieldAccessor(FieldAccessor<Person, string> accessor, Person p)
{
    return accessor.GetRef(p);
}

[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.NoInlining)]
static string GetLastNameFromPersonDirectly(object unused, Person p)
{
    return p.LastName;
}

public class Person
{
    public string FirstName;
    public string LastName;
}
GetLastNameFromFieldAccessor(System.Reflection.FieldAccessor`2<Person,System.String>, Person):
00007ffc`74526ce0 0f1f440000      nop     dword ptr [rax+rax]
00007ffc`74526ce5 488b4108        mov     rax,qword ptr [rcx+8]   ; rax := offset of Person.LastName field
00007ffc`74526ce9 488b0a          mov     rcx,qword ptr [rdx]   ; null check on Person before we perform the byref operation
00007ffc`74526cec 488b0402        mov     rax,qword ptr [rdx+rax]   ; deref Person.LastName field and return it
00007ffc`74526cf0 c3              ret

GetLastNameFromPersonDirectly(System.Object, Person):
00007ffc`74526d10 0f1f440000      nop     dword ptr [rax+rax]
00007ffc`74526d15 488b4210        mov     rax,qword ptr [rdx+10h]   ; return Person.LastName directly
00007ffc`74526d19 c3              ret

Lots of open issues here, especially regarding how readonly fields should be treated, whether the type checks in the FieldAccessor ctor are correct, and so on. But it serves as a proof of concept that we can pull this off to enable codegen-free fast serialization scenarios.

(This also slightly optimizes the Memory<T>.Span property getter.)

/cc @jkotas @steveharter @stephentoub @dotnet/jit-contrib as FYI

@@ -4269,6 +4283,13 @@ NamedIntrinsic Compiler::lookupNamedIntrinsic(CORINFO_METHOD_HANDLE method)
result = NI_System_Collections_Generic_EqualityComparer_get_Default;
}
}
else if (strcmp(namespaceName, "System.Runtime.CompilerServices") == 0)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need all this - why is not object.GetRawData() + Unsafe.Add enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was resulting in weird unoptimized codegen. (The Memory.Span property getter today shows some of this weird codegen since it uses a very similar trick to what you propose.) This seemed like the safest / most straightforward way to address it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My usual question for this is how hard is it to just fix it in the JIT. You can always introduce special intrinsics, etc., but it does not scale.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somebody more familiar with JIT would have to chime in, but a cursory inspection makes it seem like modifying the optimizer would be more work than adding this intrinsic. It seems JIT doesn't quite know how to optimize the two-step process of getting a field ref from an object (convert GCREF to BYREF), then further manipulating the BYREF by performing an addition operation. I suspect that there was simply never a scenario for this previously so it never made sense to invest in this seemingly bizarre optimization.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about generated code such as:

       3912                 cmp      dword ptr [rdx], edx
       4883C208             add      rdx, 8
       8B4204               mov      eax, dword ptr [rdx+4]

where the null check could be omitted and the addition could be folded into the address mode?

Copy link
Member

@jkotas jkotas Apr 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObjectHasComponentSize

I am not really happy that the current implementation blurs the boundaries between objectrefs and byrefs. E.g. it would prevent us from experimenting with removing syncblocks. The proper implementation of this method would avoid ever creating a byref pointing to offset 0. If we want to make this better, I would rather spend time on that; not on trying to fix the JIT to produce perfect code for access to negative field offsets.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel you that it's a bit of a hack and has definitely got some code smell to it. And I'd prefer a much cleaner way of doing it. :)

But before it was checked in there were discussions with people familiar with the runtime, object layout, the JIT, and the GC. And it was determined that accessing fields at offset 0 (pMethodTable) would be safe.

You're right that accessing fields at truly negative offsets would be dangerous, and it would interfere both with the GC's ability to ref track and the runtime team's ability to experiment with things like sync block removal. But nobody is proposing accessing fields at negative offsets, let alone optimizing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I'm experimenting with whether it'd be possible to do this using IL replacement rather than GenTree replacement. Would that alleviate some of the concerns people have?

Edit: Interestingly, using IL replacement (through method rewriting a la what we do with Unsafe and other RuntimeHelpers routines) not only resulted in worse codegen, it resulted in incorrect codegen.

        mdToken tokRawData = MscorlibBinder::GetField(FIELD__RAW_DATA__DATA)->GetMemberDef();

        static BYTE ilcode[] = { CEE_LDARG_0,
                                 CEE_LDFLDA,0,0,0,0,
#ifdef _TARGET_64BIT_
                                 CEE_LDC_I4_8,
#else // _TARGET_64BIT_
                                 CEE_LDC_I4_4,
#endif // _TARGET_64BIT_
                                 CEE_SUB,
                                 CEE_LDARG_1,
                                 CEE_ADD,
                                 CEE_RET };

        ilcode[2] = (BYTE)(tokRawData);
        ilcode[3] = (BYTE)(tokRawData >> 8);
        ilcode[4] = (BYTE)(tokRawData >> 16);
        ilcode[5] = (BYTE)(tokRawData >> 24);
GetLastNameFromFieldAccessor(System.Reflection.FieldAccessor`2<Person,System.String>, Person):
00007ffc`74536ed0 0f1f440000      nop     dword ptr [rax+rax]
00007ffc`74536ed5 488b4108        mov     rax,qword ptr [rcx+8]  ; rax := this._fieldOffset (also a null check on 'this')
00007ffc`74536ed9 3912            cmp     dword ptr [rdx],edx  ; null check on 'obj'
00007ffc`74536edb 4883c208        add     rdx,8  ; rdx := ref to ((RawData)obj).Data
00007ffc`74536edf 4803c2          add     rax,rdx  ; ** INCORRECT ** adding _fieldOffset to ref data before subtracting 8
00007ffc`74536ee2 4883c0f8        add     rax,0FFFFFFFFFFFFFFF8h
00007ffc`74536ee6 488b00          mov     rax,qword ptr [rax]  ; deref LastName field
00007ffc`74536ee9 c3              ret

Even though in IL I subtracted 8 first before adding fieldOffset, something in the JIT decided to reorder these operations, resulting in a scenario where rax might point well off the end of the object and no longer be GC-trackable.

Copy link
Member

@jkotas jkotas Apr 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you just do => ref Unsafe.As<byte, TField>(ref Unsafe.AddByteOffset(ref RuntimeHelpers.GetRawData(o), _offset));? The codegen for that may have one extra instruction than the ideal codegen; but I do not think that it warrants workarounds like IL replacement.

it resulted in incorrect codegen

Yep, byref arithmetic is very slippery slope. Many opportunities for hard to trace down GC holes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The codegen for that may have one extra instruction than the ideal codegen; but I do not think that it warrants workarounds like IL replacement.

Sure, I'll cede that point. Bonus: using the existing code allows the obj null check to avoid an extra register allocation. :)

@jkotas
Copy link
Member

jkotas commented Apr 6, 2019

It would be useful to have some examples that shows how serializers would be changed to use this.

@omariom
Copy link

omariom commented Apr 6, 2019

👍
Any chance for structs?
public ref TField GetRef(in TObject obj)

@hughbe
Copy link

hughbe commented Apr 6, 2019

I’ve noticed that recently we’ve been substantially affecting the readability and expressibility of the System.Private.CoreLib code smattering it with Unsafe reads, JIT intrinsics and other memory “hacks”

I’m not discounting the benefits - performance that permeates through the stack, but it strikes me that some of these cases would be better done by improving the JIT to optimise more cases or do better.

The reasons to me is that it’s “easier” to do this in c# because open source contributors have more knowledge of c# and the JIT Is hard to approach

@mikedn
Copy link

mikedn commented Apr 6, 2019

I’ve noticed that recently we’ve been substantially affecting the readability and expressibility of the System.Private.CoreLib code smattering it with Unsafe reads, JIT intrinsics and other memory “hacks”

And when the JIT does improve, attempting to remove such hacks can require a ton of work. It's not like you can just remove them, you need to check the generated code, perhaps run some benchmarks, deal with unexpected interactions between multiple hacks and so on.

The reasons to me is that it’s “easier” to do this in c# because open source contributors have more knowledge of c# and the JIT Is hard to approach

Perhaps the JIT would be easier to approach if the code was better written. Sadly, a lot was written just with short term gains in mind and targeting today's "real world" use cases with 0 consideration for the fact that tomorrow's "real world" might be different.

@GrabYourPitchforks
Copy link
Member Author

Let me open a new issue to track the ldflda + add JIT optimization work. If that comes in, this PR would be able to take advantage of it, and the Memory.Span property getter (the only other similar usage I know of today) would immediately see the benefit of the simplified codegen.

@tannergooding
Copy link
Member

I’ve noticed that recently we’ve been substantially affecting the readability and expressibility of the System.Private.CoreLib code smattering it with Unsafe reads, JIT intrinsics and other memory “hacks”

For the most part, many of these are things (like HWIntrinsics) are things that even native compilers expose because they are non-trivial things to handle in the compiler or (like Unsafe) are places that are limitations in the C# language today.

However, I do agree that we shouldn't just be adding new intrinsics for any old case and that we should look at optimizing the JIT instead where possible or trying to provide an appropriate abstractions so that low level details don't bleed into all of our APIs.


namespace System.Reflection
{
public sealed class FieldAccessor<TObject, TField> where TObject : class
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is FieldAccessor not a struct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would need to add an initalized field to ensure it had sensible values for default(FieldAccessor)

private readonly uint _fieldOffset;
private readonly bool _initalized;

and additionally need a check in the GetRef

public ref TField GetRef(TObject obj)
{
    if (!_initalized) ThrowHelper.ThrowNotInitalized();
    // ...
}

Question is does it add value in terms of the extra cost vs size saved in number of instances and one less indirection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, Ben and I had a brief conversation on this the other day. The type wouldn't be safe to use if it were a struct, as not only would uninitialized usage be a problem, the struct would also be subject to tearing concerns.

There's no real perf hit to making this a class instead of a struct. The intent is that the caller will cache the instance and reuse it. Since the accessor itself is inlined (it's a non-virtual method) and usage doesn't incur a null check on the accessor itself (see the comment at the top of GetRef), this should take care of any perf concerns.

@yahorsi
Copy link

yahorsi commented Apr 12, 2019

This might be usefull not just in serializers. Here are our case. In our project we have a set of POCO classes that needs to be copied/cloned. It is very easy to forget to copy the field if you write such code manually and to be fair, writing such code is not pleasant. So, we build our own "cloner" that generate copy/clone code on the fly by emiting IL.

@GrabYourPitchforks
Copy link
Member Author

I updated the PR based on previous feedback. It's still WIP since I've not tested it to see what codegen is emitted into the callers.

Updates since last iteration:

  • Now includes support for value types in addition to reference types.

  • There's a read-only version and a mutable version. The mutable version is built atop the read-only version, but this should probably be changed. We might also need different classes for value types vs. reference types since they have different mutability constraints.

  • Removed the the custom IL generation step that reinterpreted the method table as a ref byte, instead opting for existing methods on the Unsafe and RuntimeHelpers class. I had to add one method to determine whether TObject was a value type or a reference type, but if we split the accessor into different types depending on valuetype vs. class then we might be able to get rid of this new method.

@benaadams
Copy link
Member

@jkotas
Copy link
Member

jkotas commented Sep 7, 2019

https://github.com/aspnet/AspNetCore/blob/master/src/Shared/PropertyHelper/PropertyHelper.cs

All these are methods and properties. This feature as it stands is for fields only.

@YohDeadfall
Copy link

@benadams A good example is dotnet/corefx#39254. I will be happy if @GrabYourPitchforks will do the same for properties because it's the case you pointed, but also serializers and composite type handling in Npgsql would benefit from this.

@maryamariyan
Copy link
Member

Thank you for your contribution. As announced in dotnet/coreclr#27549 this repository will be moving to dotnet/runtime on November 13. If you would like to continue working on this PR after this date, the easiest way to move the change to dotnet/runtime is:

  1. In your coreclr repository clone, create patch by running git format-patch origin
  2. In your runtime repository clone, apply the patch by running git apply --directory src/coreclr <path to the patch created in step 1>

@GrabYourPitchforks
Copy link
Member Author

This isn't going to be finished before the coming repo merge. I've migrated it to a temporary holding area GrabYourPitchforks#1 so that it doesn't get lost in the shuffle. Will reopen at some point in the future once the dust settles.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
10 participants