Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mixing C# and IL locals #23

Closed
buybackoff opened this issue Nov 28, 2020 · 13 comments
Closed

Mixing C# and IL locals #23

buybackoff opened this issue Nov 28, 2020 · 13 comments

Comments

@buybackoff
Copy link

buybackoff commented Nov 28, 2020

Hello Lucas!

I tried to use InlineIL for a painful case of working with ref locals, and found some limitations. A simplified sample below. I want to mix locals defined normally and via InlineIL. For ref locals, I really miss reference to reference similar to pointer to pointer and C# does not allow to initialize ref local in efficient way in my case. But after initialization, I would rather continue to use C#, not IL. The type under question if a struct with objects, so using Unsafe.AsPointer is dangerous and questionable.

Here we define baz in C# and then add two additional locals via IL. The output locals will be merged as expected. I want to have access to baz from IL.

        public int UseLocalVariables(int value)
        {
#pragma warning disable 219
            var baz = 10;
#pragma warning restore 219
            IL.DeclareLocals(
                new LocalVar("foo", typeof(int)),
                new LocalVar("bar", typeof(int))
            );

            IL.Push(value);
            Stloc("foo");

            Ldc_I4(42);
            Stloc("bar");

            IncrBaz();

            Ldloc("foo");

            // We want to be able to load baz, which is defined normally in C#
            // Ldloc("baz"); -- will not work, and local names are not available in Mono.Cecil: https://groups.google.com/g/mono-cecil/c/GwDG0bQ8TXk/m/mIr6QtXTQNsJ
            // But we could use an index. There is one auto local for local function IncrBaz,
            // so the baz has index 1. Or we could look the index in actual IL after weaving.
            Ldloc_1();
            Add();
            return IL.Return<int>();

            void IncrBaz()
            {
                baz++;
            }
        }

Looking into MethodLocals implementation I think the fix could be simple. Even though Mono.Cecil does not support names of locals (or technically there are no names in IL), we could use indices.

A potential fix is to add this:
image

This will create some complexity with indexing, because the required index will be the one from combined locals, not from IL-defined locals. And in the presence of local functions, like in the sample, there could be autogenerated locals. To get the right indices one will have to look into generated IL. Thankfully, in Rider, IL View is one click away and it shows IL after weaving. So this indexing behavior could be left as "by design".

But since it's the first time I look into this source code, I don't know if something else could break after the proposed change.


The real use case looks like this:
image

This is one of the hottest methods, and GetLoadRef2 is not trivial. But it's still faster to have it as AggressivelyInlined. However, this generates twice as much of the assembly code. So the idea is to initialize ref locals in a loop to have the GetLoadRef2 inlined once.

@ltrzesniewski
Copy link
Owner

ltrzesniewski commented Nov 28, 2020

Hi Victor!

InlineIL doesn't let you access variables declared in C# that way primarily because that would produce very brittle code, as the compiler can decide to manage locals in unpredictable ways. For instance, a local can be present in Debug builds, but it could be entirely eliminated in Release builds if the compiler can use the stack instead. You couldn't also rely on the index of your local to be stable.

Because of this, InlineIL lets you treat locals defined by DeclareLocals as always starting at index 0, and then remaps the indices of locals in emitted IL instructions to the real index (it actually appends the locals you define to the ones that already exist, and Ldloc_1 in your example would load the value of bar). Changing this would be a major breaking change.

Note that in your example, baz is not an int32 local in IL, but is wrapped in a "DisplayClass" struct because of the IncrBaz method.

Still, you can access C# variables in InlineIL by using methods like IL.Push. In your example, if you replace Ldloc_1(); with IL.Push(baz);, you'll get baz on the stack. 😉

Similarly, IL.Push(ref baz); would push a ref to baz on the stack, and you could use IL.Pop(out baz); to assign the value on top of the stack to baz.

I'm not sure I understood you correctly though (I don't really understand the link between the example and your real use case), so let me know if something's still missing in that toolbox.

@buybackoff
Copy link
Author

buybackoff commented Nov 28, 2020

With simple baz as int I understood how to do things, but what is baz was ref local, e.g. ref KeyValuePair<long,object> baz? I think there are not enough overloads to store a value to such baz, but maybe I missed something.

@buybackoff
Copy link
Author

you could use IL.Pop(out baz); to assign the value on top of the stack to baz.

Probably this is what I've missed.

@ltrzesniewski
Copy link
Owner

I think with a ref local you can do anything you like on the target, but there's no method that would let you reassign the ref...

I initially tried to write IL.Pop as T Pop<T>() but I had to change it to void Pop<T>(out T value) since the other one didn't play very well with the compiler (it would emit code that did not lend itself to weaving very well). I guess I'd need to write a method like void Pop<T>(out ref T value) but that's not a valid signature 😢

I can see if I can add a ref T PopRef<T>() method, but I suppose if will prove difficult for the same reason as T Pop<T>() didn't work.

@buybackoff
Copy link
Author

buybackoff commented Nov 28, 2020

Do you think this creates a GC hole? DynValue is the same as KeyValuePair<long,object> from GC perspective.

image

This doesn't work as on the picture. But if I was sure it's safe from GC point of view, I will work out pointer arithmetics, maybe even without IL.

@buybackoff
Copy link
Author

So this works:

image

But it doesn't with the loop:

image

The stack should be balanced

@buybackoff
Copy link
Author

buybackoff commented Nov 28, 2020

I was able to measure the impact of double inlining and it appears that the cost of for loop - additional local, int arithmetic, branch - is higher. Measured not exactly the same thing, but close, and will just leave two inlined methods. The main interest was to understand the impact. Sometimes very tiny obscure change improves (or kills) performance dramatically.

Thanks for your help! I really like how easy it is to use this lib 😄

@ltrzesniewski
Copy link
Owner

Yeah, the for loop won't work, as you need to have a well-defined evaluation stack state for each basic block.

But I think you probably have a GC hole here since you're going through a void* local... I guess it depends on the way this is going to get JITted. 😕

I think I'll have to provide some kind of PopRef method or add something similar which would let you reassign a ref.

@buybackoff
Copy link
Author

I do not understand why the second picture from here doesn't work #23 (comment)

The stack should be balanced, we put two references on it in a loop, then pop outside the loop. The code is identical other than the loop...

If it worked, I would really like a method:

public ref T Ldlocal<T>(index or name) 

// usage

if(Ldlocal<DynValue>(0).IsNumber && Ldlocal<DynValue>(1).IsNumber)
    ...


// same as:

ref DynValue B = ref ...
ref DynValue C = ref ...
if(B.IsNumber && C.IsNumber)
   ...

So that there is no struct copying when accessing it's content for ref local.

@ltrzesniewski
Copy link
Owner

The evaluation stack doesn't work that way, its state needs to be statically defined at each instruction:

See ECMA-335, §I.12.3.2.1:

The evaluation stack is made up of slots that can hold any data type, including an unboxed
instance of a value type. The type state of the stack (the stack depth and types of each element on
the stack) at any given point in a program shall be identical for all possible control flow paths.
For example, a program that loops an unknown number of times and pushes a new element on
the stack at each iteration would be prohibited.

@buybackoff
Copy link
Author

It's somewhat 🤯 to deal with dotnet value stack and locals while working on implementing value stack and locals. Those code sample are from Lua interpreter, trying to move values between the stack and variables in the most efficient branchless way 😀

@ltrzesniewski
Copy link
Owner

Haha yeah I can imagine 😅 Good luck with that! 😄

@buybackoff
Copy link
Author

buybackoff commented Nov 28, 2020

Good luck with that! 😄

Thanks! It's already faster than C on SciMark benchmark. But absolutely meaningless timing of the benchmark is 1095, and you could guess what my obsession is and what a round number is the goal 😂

I have implemented kind of JIT, or an optimization phase, that rewrites instructions from stack-based to virtual-registers based ones. Could eliminate more than 25% of all instructions. Without that the timing is at 1460. I should probably continue with this direction, and not IL rewrite.

Very fun experience overall.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants