This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

Bring project up to date with latest .NET and Python #237

Open
wants to merge 287 commits into
base: master
from

Conversation

tonybaloney
Contributor

@tonybaloney tonybaloney commented Sep 28, 2020

  • Convert build process to CMake
  • Link .NET 5 stdlib libraries directly instead of copying dll files
  • Update API to implement the merged PEP for FrameEval
  • Remove the Python submodule, as the build now uses the Python 3 libraries directly
  • Validate C++11/Clang support and compilation errors to ensure the project builds correctly on Linux and macOS
  • Remove MSBuild related files and scripts
  • Remove IExecutionEngine/CExecutionEngine shim classes as they're removed from .NET and no longer needed for init
  • Remove legacy Opcodes
  • Replace Windows TLS API with the "new" PEP539 TLS API
  • Add new opcodes
  • Test it?

Opcodes to implement

  • JUMP_IF_NOT_EXC_MATCH
  • DICT_MERGE
  • LOAD_ASSERTION_ERROR
  • RERAISE
  • SETUP_ANNOTATIONS
  • IS_OP
  • DICT_UPDATE
  • CONTAINS_OP
  • SET_UPDATE
  • LIST_EXTEND
  • LIST_TO_TUPLE
  • ROT_FOUR

Unsupported

  • WITH_EXCEPT_START
  • END_ASYNC_FOR
  • GET_AITER
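
The opcode names in the two lists above can be checked against whichever CPython is running the build. The following is only a sketch: `opcode_numbers`, `TO_IMPLEMENT` and `UNSUPPORTED` are names made up for this illustration and are not part of the project. It relies on the standard library's `dis.opmap`, and some of these names may be absent on interpreters newer or older than the one this PR targets, in which case the lookup yields `None`.

```python
import dis

# Names copied from the "Opcodes to implement" list above.
TO_IMPLEMENT = [
    "JUMP_IF_NOT_EXC_MATCH", "DICT_MERGE", "LOAD_ASSERTION_ERROR", "RERAISE",
    "SETUP_ANNOTATIONS", "IS_OP", "DICT_UPDATE", "CONTAINS_OP", "SET_UPDATE",
    "LIST_EXTEND", "LIST_TO_TUPLE", "ROT_FOUR",
]
# Names copied from the "Unsupported" list above.
UNSUPPORTED = ["WITH_EXCEPT_START", "END_ASYNC_FOR", "GET_AITER"]


def opcode_numbers(names):
    """Map each opcode name to its number, or None if this interpreter lacks it."""
    return {name: dis.opmap.get(name) for name in names}


if __name__ == "__main__":
    for group, names in (("to implement", TO_IMPLEMENT), ("unsupported", UNSUPPORTED)):
        for name, number in opcode_numbers(names).items():
            status = number if number is not None else "not in this interpreter"
            print(f"{group:>12}  {name:<24} {status}")
```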

@brettcannon brettcannon marked this pull request as draft September 28, 2020 22:57
@brettcannon
Member

@tonybaloney I have made this a draft PR for now. Once it's no longer a WIP and you would like someone to have a look just let us know.

@tonybaloney
Contributor Author

@brettcannon The frame eval function is hardcoded to the default implementation in the interpreter state. Am I correct in thinking the only way to change it to an external library is to modify the source and recompile CPython?

interp->eval_frame = _PyEval_EvalFrameDefault;

@brettcannon
Member

Yeah. We did that because we weren't sure what another library that set the frame eval function would do, so to minimize "this doesn't work with coverage.py" we hard-coded it.

@tonybaloney
Contributor Author

This change to CPython 3.8 broke the way that the PoC worked
https://bugs.python.org/issue35886

There are some new APIs instead, _PyInterpreterState_SetEvalFrameFunc being one.

@tonybaloney
Contributor Author

Added

  • LOAD_METHOD and CALL_METHOD, but they need checking because of the special logic for conditional pop'ing of the NULL value
  • IS_OP, ROT_FOUR
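
The "conditional pop'ing of the NULL value" mentioned above can be pictured with a toy stack model. This is a sketch of my understanding of the CPython 3.8/3.9-era semantics, not the project's JIT code: `NULL`, `load_method` and `call_method` are hypothetical helpers, and the method lookup is deliberately simplified. LOAD_METHOD leaves either `method, self` or `NULL, callable` on the stack, and CALL_METHOD has to inspect that lower slot to decide how to call.

```python
import types

NULL = object()  # stand-in for the C-level NULL pushed by LOAD_METHOD


def load_method(stack, name):
    """Toy LOAD_METHOD: replace TOS with (method, self) or (NULL, callable)."""
    obj = stack.pop()
    cls_attr = None
    for klass in type(obj).__mro__:
        if name in vars(klass):
            cls_attr = vars(klass)[name]
            break
    instance_dict = getattr(obj, "__dict__", {})
    if isinstance(cls_attr, types.FunctionType) and name not in instance_dict:
        # Fast path: plain function on the class, no bound-method object needed.
        stack.append(cls_attr)
        stack.append(obj)
    else:
        # Not a plain method: push NULL so the call site knows to skip `self`.
        stack.append(NULL)
        stack.append(getattr(obj, name))


def call_method(stack, argc):
    """Toy CALL_METHOD: pop args, then call with or without `self`."""
    args = [stack.pop() for _ in range(argc)][::-1]
    self_or_callable = stack.pop()
    method_or_null = stack.pop()
    if method_or_null is NULL:
        result = self_or_callable(*args)
    else:
        result = method_or_null(self_or_callable, *args)
    stack.append(result)
```

Both slots are always popped at the call site; the only thing that changes is whether the lower one is treated as the function or discarded as the NULL marker.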

There are two roadblocks I've hit

  1. The POP_BLOCK logic causes a crash. There were some comments in there about changes coming in 3.6, and some other code was commented out. If you JIT any code the operation after a2a72ab#diff-bf4ab7594f080dd1192b4861366b404bR93-R101
  2. The JIT crashes with an internal assertion during the Unwind phase. I've got a full debug up and running and it looks related to a code offset address. It's trying to read an invalid memory address, but I can't figure out where it's coming from. It might be related to point 1.

I can't progress any further, which seems a shame because I got it so far!

@brettcannon could you have a look at this and share any ideas about those points?

There are a few more opcodes left to go, but without solving the POP_BLOCK issue it's very hard to reproduce them.

@tonybaloney
Contributor Author

[Screenshot: Screen Shot 2020-09-30 at 6 15 09 pm]

@brettcannon
Member

@DinoV up for helping out?

@tonybaloney
Contributor Author

Raised issue on the runtime project dotnet/runtime#42925

@tonybaloney
Contributor Author

@brettcannon the .NET team kindly helped reproduce issue (2) and gave a pointer as to the cause. That has now been fixed, so the JIT runs, compiles, unwinds and almost executes, but it fails with some assertions around the frame block positions.

I'm convinced it's the same issue as (1). I think this is pretty close to working now!

@tonybaloney tonybaloney marked this pull request as ready for review November 11, 2020 03:08
@tonybaloney tonybaloney changed the title from "[WIP] Bring project up to date with latest .NET and Python" to "Bring project up to date with latest .NET and Python" Nov 11, 2020
@tonybaloney
Contributor Author

@brettcannon ready for review. The test suite is passing on all platforms now.

pyjion/jitinfo.h (outdated, resolved review thread)
@tonybaloney
Contributor Author

@AndyAyersMS do you know much about the binary structure of the CIL?

I've written a disassembler in Python so that you can print out the CIL instructions inside Python. For small functions it seems really simple and follows the ECMA specification, but for larger functions the binary data is totally different. I think it's either padded or in another format. I read about the fat format but can't find any good examples; I think the larger functions use that format.

def print_il(il):

    @unittest.expectedFailure  # not implemented yet
    def test_fat(self):
        test_method = bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\xa68\xd6\x11\xa5\x7f\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\xa68\xd6\x11\xa5\x7f\x00\x00\x0e8\xd6\x11\xa5\x7f\x00\x00\n\x00\x00'
b'\x00J\x17XT\x90\xdd\xdb\x99\xff\x7f\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00 h\x01\x00\x00\xd3X\x11\n\xdf(\x10\x00\x00\x00\x06 '
b'\x04\x00\x00\x00\xd3T!P\x19\xd2\x11\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 \x06\x00\x00\x00\xd3T\x13\n\x03 '
b'p\x01\x00\x00\xd3XM\x03 p\x01\x00\x00\xd3X\x11\n\xdf(\x10\x00\x00\x00\x06 '
b'\x08\x00\x00\x00\xd3T\x03 '
b'h\x01\x00\x00\xd3XM%\x0c\x16\xd3@\x1a\x00\x00\x00!0\x1f\xee\x11\xa5\x7f\x00\x00'
b'\xd3(:\x00\x00\x00\x03(8\x00\x00\x008\n\x04\x00\x00\x08% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 \n\x00\x00\x00\xd3T\x03 '
b'p\x01\x00\x00\xd3XM%\x0c\x16\xd3@\x1f\x00\x00\x00!p\xbe\xe1\x11\xa5\x7f\x00\x00'
b'\xd3(:\x00\x00\x00\x03(8\x00\x00\x00(\x10\x00\x00\x008\xc3\x03\x00\x00\x08% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 \x0c\x00\x00\x00\xd3T\x18('
b'\t\x00\x00\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008\x93\x03\x00\x00\x08\x06 '
b'\x0e\x00\x00\x00\xd3T%!\x80\x9a\x91\t\x01\x00\x00\x00\xd3;=\x00\x00\x00%!`\x9a\x91'
b'\t\x01\x00\x00\x00\xd3;#\x00\x00\x00%(\x05\x00\x02\x00%\x15@\x11\x00\x00\x00&\x03('
b'8\x00\x00\x00(\x10\x00\x00\x008L\x03\x00\x00:\n\x00\x00\x00('
b'\x10\x00\x00\x008T\x00\x00\x00(\x10\x00\x00\x00\x06 '
b'\x10\x00\x00\x00\xd3T!p\x19\xd2\x11\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 \x12\x00\x00\x00\xd3T\x13\n\x03 '
b'x\x01\x00\x00\xd3XM\x03 '
b'x\x01\x00\x00\x00\x00\x00\x00\x00\xe0\xb2\x83\x13\xa5\x7f\x00\x00\xe0\xb2\x83\x13'
b'\xa5\x7f\x00\x00\x01\x00\x00\x00\x16\x00\x00\x00\x00\x00\x00\x00('
b'\xd2\x11\xa5\x7f\x00\x00\xd3% \x00\x00\x00\x00\xd3X%J\x17XT\x06 '
b'\x18\x00\x00\x00\xd3T\x13\n\x03 x\x01\x00\x00\xd3XM\x03 '
b'x\x01\x00\x00\xd3X\x11\n\xdf(\x10\x00\x00\x00\x06 \x1a\x00\x00\x00\xd3T\x03 '
b'x\x01\x00\x00\xd3XM%\x0c\x16\xd3@\x1a\x00\x00\x00!0\x8c\xdd\x11\xa5\x7f\x00\x00'
b'\xd3(:\x00\x00\x00\x03(8\x00\x00\x008s\x02\x00\x00\x08% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 '
b'\x1c\x00\x00\x00\xd3T!\x90\x19\xd2\x11\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 \x1e\x00\x00\x00\xd3T\x18('
b'\t\x00\x00\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008$\x02\x00\x00\x08\x06 '
b'\x00\x00\x00\xd3T%!\x80\x9a\x01\x00\x00\x00\x00\x00\xd3;=\x00\x00\x00%!`\x9a\x91\t'
b'\x01\x00\x00\x00\xd3;#\x00\x00\x00%(\x05\x00\x02\x00%\x15@\x11\x00\x00\x00&\x03('
b'8\x00\x00\x00(\x10\x00\x00\x008\xdd\x01\x00\x00:\n\x00\x00\x00('
b'\x10\x00\x00\x008\x8d\x00\x00\x00(\x10\x00\x00\x00\x06 '
b'"\x00\x00\x00\xd3T\x03!\xb0\xc6\xd6\x11\xa5\x7f\x00\x00\xd3('
b'\x00\x00\x03\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008\x9d\x01\x00\x00\x08\x06 '
b'$\x00\x00\x00\xd3T!\xf0\xb2\x83\x13\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 &\x00\x00\x00\xd3T('
b'\x01\x00\x01\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008\\\x01\x00\x00\x08\x06 (\x00\x00\x00\xd3T(\x10\x00\x00\x00\x06 '
b'*\x00\x00\x00\xd3T8{\x00\x00\x00\x06 ,'
b'\x00\x00\x00\xd3T\x03!\xb0\xc6\xd6\x11\xa5\x7f\x00\x00\xd3('
b'\x00\x00\x03\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008\x15\x01\x00\x00\x08\x06 '
b'.\x00\x00\x00\xd3T!\xb0\xb5\x83\x13\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 0\x00\x00\x00\xd3T('
b'\x01\x00\x01\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008\xd4\x00\x00\x00\x08\x06 2\x00\x00\x00\xd3T(\x10\x00\x00\x00\x06 '
b'4\x00\x00\x00\xd3T!0\x19\xd2\x11\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 6\x00\x00\x00\xd3T\x13\n\x03 '
b'\x80\x01\x00\x00\xd3XM\x03 \x80\x01\x00\x00\xd3X\x11\n\xdf(\x10\x00\x00\x00\x06 '
b'8\x00\x00\x00\xd3T!P\x19\xd2\x11\xa5\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 :\x00\x00\x00\xd3T\x13\n\x03 '
b'\x88\x01\x00\x00\xd3XM\x03 \x88\x01\x00\x00\xd3X\x11\n\xdf(\x10\x00\x00\x00\x06 '
b'<\x00\x00\x00\xd3T!\xe0\xce\x92\t\x01\x00\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 '
b'>\x00\x00\x00\xd3T\x0b\xdd\x1c\x00\x00\x00\t\x16>\t\x00\x00\x00&&&\x19\tY\r+\xf08'
b'\x00\x00\x00\x00\x16\xd38\x01\x00\x00\x00\x07\x03(B\x00\x00\x00*')
        f = io.StringIO()
        with contextlib.redirect_stdout(f):
            print_il(test_method)
        self.assertIn("ldarg.1", f.getvalue())

    def test_thin(self):
        test_method = bytearray(b'\x03 h\x00\x00\x00\xd3X\n\x03(A\x00\x00\x00\x16\r\x06 '
b'\x00\x00\x00\x00\xd3T\x03!\xb0\xc6V)\x91\x7f\x00\x00\xd3('
b'\x00\x00\x03\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008\x91\x00\x00\x00\x08\x06 '
b'\x02\x00\x00\x00\xd3T!\xf0\xc3\x13*\x91\x7f\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 \x04\x00\x00\x00\xd3T('
b'\x01\x00\x01\x00%\x0c\x16\xd3@\x0b\x00\x00\x00\x03('
b'8\x00\x00\x008P\x00\x00\x00\x08\x06 \x06\x00\x00\x00\xd3T(\x10\x00\x00\x00\x06 '
b'\x08\x00\x00\x00\xd3T!\xe0\x1e\xda\x02\x01\x00\x00\x00\xd3% '
b'\x00\x00\x00\x00\xd3X%J\x17XT\x06 '
b'\n\x00\x00\x00\xd3T\x0b\xdd\x1c\x00\x00\x00\t\x16>\t\x00\x00\x00&&&\x19\tY\r+\xf08'
b'\x00\x00\x00\x00\x16\xd38\x01\x00\x00\x00\x07\x03(B\x00\x00\x00*')
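
For reference, the first few bytes of the `test_thin` buffer above decode cleanly with the single-byte opcode values from the ECMA-335 instruction set. The sketch below is not the `print_il` function from this PR: `disassemble` and its `OPCODES` table are hypothetical, cover only a handful of opcodes, and stop at the first byte they do not recognise rather than guessing at operand sizes.

```python
import struct

# opcode byte -> (mnemonic, struct format of the inline operand, or None)
# Only a small, illustrative subset of the ECMA-335 opcode table.
OPCODES = {
    0x00: ("nop", None),
    0x03: ("ldarg.1", None),
    0x0A: ("stloc.0", None),
    0x20: ("ldc.i4", "<i"),   # 4-byte little-endian signed constant
    0x25: ("dup", None),
    0x26: ("pop", None),
    0x28: ("call", "<I"),     # 4-byte token
    0x2A: ("ret", None),
    0x58: ("add", None),
    0xD3: ("conv.i", None),
}


def disassemble(il):
    """Return a list of mnemonic strings for a raw CIL instruction stream."""
    lines = []
    pos = 0
    while pos < len(il):
        op = il[pos]
        pos += 1
        if op not in OPCODES:
            # Operand size is unknown, so decoding further would drift.
            lines.append(f"<unknown 0x{op:02x}>")
            break
        name, fmt = OPCODES[op]
        if fmt is None:
            lines.append(name)
            continue
        size = struct.calcsize(fmt)
        if pos + size > len(il):
            lines.append(f"{name} <truncated operand>")
            break
        (value,) = struct.unpack_from(fmt, il, pos)
        pos += size
        lines.append(f"{name} {value}")
    return lines


if __name__ == "__main__":
    # First nine bytes of the test_thin buffer.
    print("\n".join(disassemble(b"\x03 h\x00\x00\x00\xd3X\n")))
```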

@AndyAyersMS
Member

This is CIL you're generating, or CIL we've created that you just want to parse?

As you likely know the fat format is described in Ecma-335 II.25.4.3. If you prefer code, check out

https://github.com/dotnet/runtime/blob/72b7d236ad634c2280c73499ebfc2b594995ec06/src/coreclr/src/inc/corhdr.h#L1240-L1250

and the various decoder helper classes, eg

https://github.com/dotnet/runtime/blob/72b7d236ad634c2280c73499ebfc2b594995ec06/src/coreclr/src/inc/corhlpr.h#L626-L632
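
As a rough illustration of the tiny and fat header layouts described in Ecma-335 II.25.4.2 and II.25.4.3, the following sketch reads the header at the start of a method body. `MethodHeader` and `parse_method_header` are names invented for this example. It applies only to buffers that actually begin with a method header; whether the byte strings earlier in this thread include one is not established here, and a raw instruction stream would be misread (for instance, a leading `ldarg.1` byte, 0x03, has the same low bits as the fat-format marker).

```python
import struct
from typing import NamedTuple

FORMAT_MASK = 0x3
TINY_FORMAT = 0x2  # CorILMethod_TinyFormat
FAT_FORMAT = 0x3   # CorILMethod_FatFormat


class MethodHeader(NamedTuple):
    fat: bool
    header_size: int        # in bytes
    max_stack: int
    code_size: int          # in bytes
    local_var_sig_tok: int  # 0 when there are no locals
    flags: int


def parse_method_header(data):
    """Parse a tiny or fat CIL method header from the start of `data`."""
    if not data:
        raise ValueError("empty buffer")
    kind = data[0] & FORMAT_MASK
    if kind == TINY_FORMAT:
        # One byte: low 2 bits are the format, upper 6 bits the code size.
        # Tiny methods have no locals and an implied max stack of 8.
        return MethodHeader(False, 1, 8, data[0] >> 2, 0, TINY_FORMAT)
    if kind == FAT_FORMAT:
        if len(data) < 12:
            raise ValueError("fat header needs at least 12 bytes")
        # u16: 12 bits of flags + 4 bits of header size in 4-byte units,
        # then u16 MaxStack, u32 CodeSize, u32 LocalVarSigTok.
        flags_and_size, max_stack, code_size, sig_tok = struct.unpack_from(
            "<HHII", data, 0
        )
        return MethodHeader(
            True,
            (flags_and_size >> 12) * 4,
            max_stack,
            code_size,
            sig_tok,
            flags_and_size & 0x0FFF,
        )
    raise ValueError("first byte is neither a tiny nor a fat header")
```

The instruction bytes start at `header_size` and run for `code_size` bytes; any extra data sections signalled by the flags would follow the code and are not handled in this sketch.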

@tonybaloney
Contributor Author

tonybaloney commented Nov 16, 2020

Merged code will be redirected into tonybaloney#4 and live in this fork


3 participants