This repository has been archived by the owner on Nov 1, 2020. It is now read-only.

Support for IL interpreter in CoreRT #5011

Open
tonerdo opened this issue Nov 23, 2017 · 55 comments

Comments

@tonerdo
Contributor

tonerdo commented Nov 23, 2017

We should consider porting Mono's .NET interpreter (http://www.mono-project.com/news/2017/11/13/mono-interpreter/) over to CoreRT to support runtime IL generation and execution within native code.

@jkotas
Member

jkotas commented Nov 23, 2017

The interpreter implementation is always tightly coupled with the underlying runtime implementation details. It cannot be easily ported from runtime to runtime. This would be more like a rewrite, maybe using some of the ideas from the other implementations. We would want to rewrite it in C# for CoreRT. Implementing it in C/C++ for CoreRT would be much harder than implementing it in C#.

BTW: We have an interpreter in CoreCLR as well: https://github.com/dotnet/coreclr/blob/master/src/vm/interpreter.cpp . It does not ship in .NET Core because .NET Core would not benefit from it. It has been used for new platform bring-ups and various experiments.

There are two distinct pieces:

  • IL interpreter: The interpreter should not be tightly coupled with Reflection.Emit. It should be able to interpret any IL, e.g. even IL coming from regular IL .dlls. The idea is that we would have a pluggable execution strategy for the IL. Some of the early plumbing for this has been done in our System.Private.Jit prototype - look for MethodExecutionStrategy. The interpreter can live in, say, System.Private.Interpreter and be plugged in as an execution strategy. A good first step would be to put all the scaffolding in place to interpret, end-to-end, a super simple method that just returns "42". Let's use this issue to track further discussion on this.

  • Reflection.Emit: Reflection.Emit should be an independent part, as described in: https://github.com/dotnet/corefx/issues/4491#issuecomment-189756092 . Let's use this corefx issue for further discussion on Reflection.Emit. The "Run" flavor of Reflection.Emit should communicate with the runtime by passing PE files down. In the case of DynamicMethods, it can pass down just the baked IL method bodies as a performance optimization, since producing a PE file per dynamic method would be too expensive.
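
As an illustration of the "first step" described in the first bullet, here is a minimal, language-neutral sketch in Python of a pluggable execution strategy whose interpreter runs a method that just returns 42. The class names `ExecutionStrategy` and `InterpreterStrategy` are hypothetical and do not correspond to CoreRT types; only the two opcode values are taken from the CIL instruction set (ECMA-335).

```python
from abc import ABC, abstractmethod

# CIL opcode values from ECMA-335: ldc.i4.s = 0x1F, ret = 0x2A.
LDC_I4_S = 0x1F
RET = 0x2A


class ExecutionStrategy(ABC):
    """Hypothetical stand-in for a pluggable way of executing IL."""

    @abstractmethod
    def execute(self, il: bytes) -> int:
        ...


class InterpreterStrategy(ExecutionStrategy):
    """Interprets a tiny subset of IL using an evaluation stack."""

    def execute(self, il: bytes) -> int:
        stack = []
        pc = 0
        while pc < len(il):
            opcode = il[pc]
            pc += 1
            if opcode == LDC_I4_S:
                # The operand is one signed byte following the opcode.
                operand = int.from_bytes(il[pc:pc + 1], "little", signed=True)
                stack.append(operand)
                pc += 1
            elif opcode == RET:
                return stack.pop()
            else:
                raise NotImplementedError(f"opcode 0x{opcode:02X}")
        raise ValueError("method body ended without ret")


# IL bytes for a method body equivalent to `return 42;`
RETURN_42 = bytes([LDC_I4_S, 42, RET])
result = InterpreterStrategy().execute(RETURN_42)
```

Because the caller only depends on the abstract strategy, a JIT-backed strategy could in principle be swapped in without changing the call site, which is the point of keeping the interpreter decoupled from Reflection.Emit.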

jkotas changed the title from "Support for Reflection.Emit in CoreRT" to "Support for IL interpreter in CoreRT" on Nov 23, 2017
@tonerdo
Contributor Author

tonerdo commented Apr 22, 2018

@jkotas looking to take a first pass at this. However, I have a couple of questions:

  • The OnEntryPoint method returns an IntPtr which I'm guessing points to the native compiled code. How will this work in the context of the interpreter?
  • Considering CoreRT ships without a JIT, at what point is the RyuJitExecutionStrategy used? Where in the code can I find it?

@jkotas
Member

jkotas commented Apr 22, 2018

The OnEntryPoint method returns an IntPtr

CoreRT has the capability to manufacture entrypoints. Look for CallConverterThunk.MakeThunk for an example. The call converter is sort of a mini-interpreter that just changes one calling convention to a different one.

RyuJitExecutionStrategy

Code under src\System.Private.Jit was a prototype for JIT support in CoreRT. It was able to run a few methods, nothing extensive. It does not ship, it is not used in the default CoreRT config, and it has likely bit-rotted and does not work anymore.

@tonerdo
Contributor Author

tonerdo commented Apr 22, 2018

Thanks for the quick response. I have more stuff I'm curious about

What happens when MethodInfo.Invoke(object, args) is called? I'm guessing there's a transition from managed code to unmanaged code at some point but I can't seem to find it (I've looked through coreclr, corefx and corert)

I ask because I'm trying my best to avoid using reflection and instead leverage the underlying runtime capabilities. Specifically, how is info like the values of static and instance fields retrieved from a handle to a managed object (*Object?) that already exists in memory?

@mattwarren
Contributor

What happens when MethodInfo.Invoke(object, args) is called? I'm guessing there's a transition from managed code to unmanaged code at some point but I can't seem to find it (I've looked through coreclr, corefx and corert)

I've written about this scenario, see the section on 'How does Reflection work?' (referring to the CoreCLR)

@tonerdo
Contributor Author

tonerdo commented Apr 23, 2018

Sweet! thanks Matt! Will take a look.

@tonerdo
Contributor Author

tonerdo commented Apr 23, 2018

Yup, that definitely answers my question, knew there had to be an FCALL somewhere

@mattwarren
Contributor

mattwarren commented Apr 23, 2018

I've looked into the equivalent flow for CoreRT (mostly just for my own interest) and unless I'm missing something it's all in C#, which is pretty cool (I know that's one of the goals of CoreRT, it's just interesting to see it in action)!

At this point there are no C# method bodies, but after a bit of digging, it seems that the [McgIntrinsics] attribute is a placeholder and the method bodies are wired up via ILProvider - CreateMethodIL(MethodDesc method) and finally CalliIntrinsic - EmitIL(MethodDesc target) (using OpCodes.Calli)

@tonerdo
Contributor Author

tonerdo commented Apr 23, 2018

You should write a blog post on this 😃

@tonerdo
Contributor Author

tonerdo commented Apr 23, 2018

Here's also for my second question on getting field info from an object reference:

@morganbr
Contributor

It looks like you're on the right track with the implementation details in System.Private.Reflection.Execution (no FCALLs in CoreRT). @MichalStrehovsky or @davidwrighton, do you have any details to fill in?

@mattwarren
Contributor

Just out of interest, from reading this post there seem to be two possible approaches to allowing runtime code generation in CoreRT:

  1. Implement an IL Interpreter
  2. Use RyuJIT (i.e. System.Private.Jit)

I was wondering what's the preferred approach? Or in an ideal world would you have both? (leaving aside the cost to implement, engineering time, other priorities, etc)

Is 2) less preferable because it's no longer 'AOT compiled' or is that not really a concern? I assume that 1) is slower, but does that matter?

@jkotas
Member

jkotas commented Apr 25, 2018

An interpreter is good to have for places where (Ryu)JIT is not available, not allowed (e.g. Apple devices), or not desirable (e.g. environments locked down for security).

It may also be used as part of a tiered code generation strategy.

@mattwarren
Contributor

@jkotas Thanks for the info

@tonerdo BTW you may find these 2 blog posts help you out and/or give you something to think about when working on this issue:

@tonerdo
Contributor Author

tonerdo commented Apr 29, 2018

Hi @jkotas,

I've taken a thorough look through the code, especially the parts involved in building thunks for CallConverterThunk.MakeThunk and here's what I'm still trying to wrap my head around:

  1. At this point where the thunk data is set we're simply passing in an integer that represents the index of a CallConversionInfo object in a lowlevel dictionary, at no point is this dictionary or the object itself passed in. So how does the runtime figure out the connection? How do SetThunkData and related methods actually work? (The native code seems to be some external dependency, can't find any)

  2. The OnEntryPoint method also takes a second callerArgs argument that is an IntPtr which I'm guessing represents a pointer to the method arguments. The only thing is I'm not sure exactly how to use it. Does it point to the starting memory address the arguments are stored in?

  3. The EntrypointThunk in JitSupport.MethodEntrypointStubs doesn't seem to be used anywhere, and the GlobalExecutionStrategy in ReflectionExecution is understandably not in use either. That raises the question of how to wire things up so that I can adequately test the interpreter.

Cheers

@jkotas
Member

jkotas commented Apr 29, 2018

So how does the runtime figure out the connection? How do SetThunkData and related methods actually work?

SetThunkData sets what the low-level code should call back when the thunk is actually called.

The only thing is I'm not sure exactly how to use it. Does it point to the starting memory address the arguments are stored in?

Yes, it points to the memory block where the arguments are stored.

I have found that we actually have this wrapped in an easier-to-use type: abstract class CallInterceptor. Can you try the following?

  • Inherit from CallInterceptor and implement the abstract methods. For a quick experiment, just make them return some fixed arrays.
  • Create an instance of the CallInterceptor and call GetThunkAddress() on it. It will give you the address of the method.
  • Either wrap the ThunkAddress in a delegate and call the delegate, or call it using CalliIntrinsics directly.
  • You should see your overridden ThunkExecute method getting called, and the CallInterceptorArgs should give you access to the method arguments.
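
The shape of that experiment can be sketched in a language-neutral way. The Python below is only an analogy under stated assumptions: `CallInterceptorSketch`, `thunk_execute` and `get_thunk_address` are hypothetical names mirroring the steps above, a closure stands in for the native thunk address, and none of it reflects the real signatures of CoreRT's CallInterceptor.

```python
from abc import ABC, abstractmethod


class CallInterceptorSketch(ABC):
    """Hypothetical analogue of an interceptor; not the CoreRT API."""

    @abstractmethod
    def thunk_execute(self, args: list) -> object:
        """Runs whenever the manufactured entry point is called."""

    def get_thunk_address(self):
        # In CoreRT this would be a native code address. Here a closure
        # plays the role of the callable entry point.
        def thunk(*call_args):
            return self.thunk_execute(list(call_args))

        return thunk


class RecordingInterceptor(CallInterceptorSketch):
    """Records every intercepted argument list and returns their sum."""

    def __init__(self):
        self.seen = []

    def thunk_execute(self, args):
        self.seen.append(args)
        return sum(args)


interceptor = RecordingInterceptor()
entry_point = interceptor.get_thunk_address()  # "wrap it in a delegate"
total = entry_point(1, 2, 3)                   # triggers thunk_execute
```

The caller only ever sees an ordinary callable, while the interceptor observes the arguments of each call, which is the property an interpreter needs in order to stand in for compiled code.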

@tonerdo
Contributor Author

tonerdo commented May 12, 2018

Thanks for the info, I've been a bit busy with work. Will take a first pass over the weekend.

tonerdo added a commit to tonerdo/corert that referenced this issue May 20, 2018
@tonerdo
Contributor Author

tonerdo commented May 20, 2018

Hi @jkotas, quick questions about the CallInterceptor class

  • How do I get the information (e.g. MethodDesc) of the method being intercepted?
  • How will the IL of the method be passed to the CallInterceptor?

@jkotas
Member

jkotas commented May 20, 2018

How do I get the information (e.g. MethodDesc) of the method being intercepted?

Store it as a field in your type inherited from CallInterceptor.

How will the IL of the method be passed to the CallInterceptor?

It should be computed from the MethodDesc. I would just make something simple to make the prototype work.

Eventually, it may need to handle IL coming from all sorts of different places - something like private MethodIL CreateMethodIL(MethodDesc method).

@tonerdo
Contributor Author

tonerdo commented May 20, 2018

I'd like to run my thought process by you @jkotas, kindly correct me anywhere I might have it wrong:

  1. ReflectionExecution is initialized as part of LibraryInitializers

  2. An ExecutionStrategy is the entrypoint for runtime code execution. I basically inherit from MethodExecutionStrategy and override the OnEntryPoint method.

  3. I implement the abstract CallInterceptor class (which is a convenient wrapper to wire up Thunks). I initialize my implementation of the CallInterceptor class (passing in all the required info) and return the value of GetThunkAddress as the return value of the OnEntryPoint method.

  4. The interpreter logic starts from my implementation of the ThunkExecute method

  5. CoreRT currently doesn't have a way to pass baked IL methods coming from DynamicMethod to the Reflection Execution engine. I'll need to add this to be able to test preliminary support for Reflection Emit
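
Steps 2 to 4 of this plan can be sketched end-to-end. The Python below is a rough illustration only: `MethodDescSketch`, `InterpreterCallInterceptor` and `InterpreterExecutionStrategy` are hypothetical names, the method description is stored as a field on the interceptor as suggested earlier in the thread, and the IL is read straight from that description. The opcode values (ldarg.0 = 0x02, ldarg.1 = 0x03, add = 0x58, ret = 0x2A) are from the CIL instruction set.

```python
from dataclasses import dataclass

# Single-byte CIL opcodes without operands (ECMA-335).
LDARG_0, LDARG_1, ADD, RET = 0x02, 0x03, 0x58, 0x2A


@dataclass(frozen=True)
class MethodDescSketch:
    """Hypothetical method description carrying its own IL body."""
    name: str
    il: bytes


class InterpreterCallInterceptor:
    """Step 3: the interceptor keeps the method it stands in for."""

    def __init__(self, method: MethodDescSketch):
        self.method = method

    def thunk_execute(self, args):
        # Step 4: the interpreter loop starts here.
        stack = []
        for opcode in self.method.il:
            if opcode == LDARG_0:
                stack.append(args[0])
            elif opcode == LDARG_1:
                stack.append(args[1])
            elif opcode == ADD:
                right, left = stack.pop(), stack.pop()
                stack.append(left + right)
            elif opcode == RET:
                return stack.pop()
            else:
                raise NotImplementedError(f"opcode 0x{opcode:02X}")
        raise ValueError("method body ended without ret")

    def get_thunk_address(self):
        # A bound callable stands in for a native entry point address.
        return lambda *call_args: self.thunk_execute(list(call_args))


class InterpreterExecutionStrategy:
    """Step 2: the strategy hands out an entry point for a method."""

    def on_entry_point(self, method: MethodDescSketch):
        return InterpreterCallInterceptor(method).get_thunk_address()


add_method = MethodDescSketch("Add", bytes([LDARG_0, LDARG_1, ADD, RET]))
entry = InterpreterExecutionStrategy().on_entry_point(add_method)
answer = entry(40, 2)
```

The sketch ignores 32-bit overflow, calling conventions and type information, all of which a real interpreter would have to take from the runtime.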

@jkotas
Member

jkotas commented May 21, 2018

@tonerdo Yes, you got it. I do not see anything wrong in your description.

tonerdo added a commit to tonerdo/corert that referenced this issue Feb 12, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Feb 12, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Feb 12, 2019
jkotas pushed a commit that referenced this issue Feb 12, 2019
* add support for newarr opcode (#5011)

* add support for ldlen opcode (#5011)

* add support for stelem.* opcodes (#5011)

* add support for ldelem.* opcodes (#5011)

* use the right arguement type metadata (#5011)

* handle when index values are native int (#5011)

* use generic Unsafe class to read/write array elements (#5011)

* add array bounds check (#5011)

* add assignability checks for stelem.ref opcode (#5011)

* add IntPtr to int conversion bounds check (#5011)

* simplify IntPtr<->int bounds check expression (#5011)
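
The checks named in this commit list (array bounds, and range-checking a native-int index before using it as a 32-bit int) can be illustrated with a small sketch. This is not the code from the commit: the helper names are hypothetical, a Python list stands in for a managed array, and Python exceptions stand in for the runtime's exception types.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def native_int_to_int32(value: int) -> int:
    """Rejects native-int values that do not fit in a 32-bit int."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in int32")
    return value


def newarr(length: int) -> list:
    """Sketch of newarr: allocates a zero-initialized array."""
    length = native_int_to_int32(length)
    if length < 0:
        raise OverflowError("negative array length")
    return [0] * length


def checked_index(array: list, index: int) -> int:
    """Explicit bounds check; Python would otherwise wrap negatives."""
    index = native_int_to_int32(index)
    if not 0 <= index < len(array):
        raise IndexError(f"index {index} out of range")
    return index


def stelem(array: list, index: int, value) -> None:
    array[checked_index(array, index)] = value


def ldelem(array: list, index: int):
    return array[checked_index(array, index)]


def ldlen(array: list) -> int:
    return len(array)


numbers = newarr(3)
stelem(numbers, 1, 7)
loaded = ldelem(numbers, 1)
```

The explicit check matters in this sketch because a Python list would silently accept an index of -1, whereas a managed array access must fail.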
tonerdo added a commit to tonerdo/corert that referenced this issue May 27, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Jun 1, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Jun 6, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Jun 6, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Jul 21, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue Jul 21, 2019
tonerdo added a commit to tonerdo/corert that referenced this issue May 3, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 4, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 5, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 12, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 12, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 12, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 12, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue May 13, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue Jun 9, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue Sep 27, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue Oct 17, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue Oct 17, 2020
tonerdo added a commit to tonerdo/corert that referenced this issue Oct 18, 2020
jkotas pushed a commit that referenced this issue Oct 30, 2020
* Add support for ldfld and stfld (#5011)

* Make interpreter methods take an concrete *Desc types

* Update load and store field methods to handle specific types (5011)

* Add support for static fields belonging to dynamically loaded types

* Add support for loading and storing fields in native metadata

* Use byte arrays to represent statics bases

* Add support for static constructors

* Include module name in key identity check

* Ensure static class constructor is run for native format types

* Retrieve non gc statics from typeloader environment

* Allocate memory for non gc statics (#5011)

* Add support for dynamic static constructors in runtime (#5011)

* Ensure static constructors of compiled in types are run (#5011)

* Use RuntimeTypeHandle when type is in native binary (#5011)

* Simplify static field load/store methods (#5011)

* Add support for gc statics of dynamic types (#5011)

* Fix static field get/set for pre-compiled in types

* Add TryGetThreadStaticFieldDataDirect method

* Remove StaticsRegion class

* Use state.GcDataSize to calculate eetype base size

* Improve support for gc statics

* s/GetHasCode/GetHashCode

* Improve code comments

* Clean up thread static support code