Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic Examples Compiler Explanation #51

Open
tripleo1 opened this issue Dec 31, 2021 · 1 comment
Open

Basic Examples Compiler Explanation #51

tripleo1 opened this issue Dec 31, 2021 · 1 comment

Comments

@tripleo1
Copy link

Maybe if you have time you could explain how the compiler works (or is supposed to work), specifically for the two basic examples of factorial (or fibonacci) and hello world in bxl (which, if I understand correctly is abandoned because of typing issues) and your newer fast branch?

@c3d
Copy link
Owner

c3d commented Dec 31, 2021

I'm not sure how much detail you want.

  • In the bxl variant (xl2 subtree), the parse tree would be progressively rewritten using translate statements, until it was lowered to a "bytecode" that would then be mapped to C code using the file xl2/native/library/runtime/C/xl.bytecode (another similar file would generate Java code). The generated code would then be fed to a C compiler, with the help of xl_lib.h to provide some basic support functions.

  • In the fast branch, an early dispatch will instantiate various implementations of the Evaluator class based on the optimization level. After that, parsing will be common to all implementation, sharing the same parse tree, and then passing it to the selected Evaluate function.

  • The interpreter (-O0) will simply rewrite the parse tree progressively in order to evaluate the tree.

  • The bytecode evaluator (-O1) is a compiler to an intermediate "bytecode" representation which is designed for fast interpretation. I'm presently focusing on that implementation because tracking LLVM changes was taking most of my very limited development time. See below what the bytecode looks like. So there is a "real" basic compiler, but it's very incomplete at this point.

  • The "fast compiler" (-O2) is a port of the XL compiler as it existed in Tao3D. Unfortunately, there are a number of issues with it (starting with a very bad type system), and I'm not really looking at it.

  • The LLVM compiler (-O3) was attempting to use Haskell-style type inference and from there generate machine-level code. I've put that on hold after realizing how futile it was to try and track LLVM incompatible API respins. At this point, it supports LLVM up to 10 (I believe), but fails on "Fibonacci" on an assert regarding internal low-level types (Assertion failed: (!boxed[base] || boxed[base] == mtype), function AddBoxedType, file compiler-types.cpp, line 152). Essentially, the type inference fails to correctly deduce consistent integer types. If you are interested, I can explain how it's supposed to work, but for now, I'm not spending much time on this one.

Back to the bytecode one, if you run ./xl -O1 -B f.xl with f.xl containing:

fib 0 is 1
fib 1 is 1
fib N is (fib(N-1) + fib(N-2))

fib 6

then you should get the following bytecode output, slightly commented with //:

// Compiles whole declaration scopes, here `builtins.xl`, as a `get_scope` that returns the scope
Compiled [X:natural + Y:natural as natur…as tree is native "xl_listen"] as
Locals:
  L0	= [X:natural + Y:natural as natur…as tree is native "xl_listen"]
Opcodes (1 locals, 0 parameters, 1 temporaries)
     0: get_scope           L0
     2: ret                 L0

// Compilation of several basic types in order to evaluate them during type check
Compiled [natural] is [natural] as
Constants:   // The bytecode contains a table of constants
  C0	= natural
Types:    // This is the type information gathered during compilation
  natural	as	name
  natural	as	nil  // This means "no type found" (a bug, I'm working on it)
Opcodes (0 locals, 0 parameters)
// A type is evaluated as a constant tree value with the type name (ensures unicity)
     0: ret                 C0

Compiled [integer] is [integer] as
Constants:
  C0	= integer
Types:
  integer	as	name
  integer	as	nil
Opcodes (0 locals, 0 parameters)
     0: ret                 C0

Compiled [real] is [real] as
Constants:
  C0	= real
Types:
  real	as	name
  real	as	nil
Opcodes (0 locals, 0 parameters)
     0: ret                 C0

// Actual compilation of the `fib N` body
Compiled [fib N] is [(fib (N - 1) + fib (N - 2))] as
Constants: // Constant table like above
  C0	= 1
  C1	= 0
  C2	= fib N
  C3	= 2
  C4	= fib (N - 1) + fib (N - 2)
Parameters:
  A0	= N
Locals:
  A0	= [N][N][N]     // A0 is argument 0, here mapped to three different tree representations of N
  L1	= [N - 1]  // L1 is local 1, here mapped to computed value for N-1
  L2	= [fib (N - 1)]  // L2 is local 2, mapped to computed value for fib(N-1)
  L3	= [N - 2]
  L4	= [fib (N - 2)]
Types:
  N	as	natural // Types as deduced
  fib (N - 2)	as	real // This is bogus (work in progress), should have retained natural
  fib (N - 1)	as	real
  fib (N - 1) + fib (N - 2)	as	real
Opcodes (5 locals, 1 parameters, 4 temporaries)
     0: sub_natural         A0, C0, L1    // Here, we subtract C0=1 from N and result in L1 = N-1
     4: check_natural       L1, C1, +5=13 // We check if N-1 is C1=0, if not jump to 13
     8: copy                C0, L2 // For N-1=0, we copy constant C0=1 into L2 (fib N-1)
    11: branch              +14=27
    13: check_natural       L1, C0, +5=22 // We check if N-1 is C0=1, if not jump to 22
    17: copy                C0, L2 // For N-1=1, we copy C0=1 into L2 (fib N-1)
    20: branch              +5=27
    22: call                C2, (1 arguments) // Remaining case (not 0 or 1): recursive call of C2 = fib N
       0: L1 // Pass argument L1 = N-1
       => L2 // Result in L2
    27: natural_typecheck   L2, +36=66 // Check that the result is a natural
    30: sub_natural         A0, C3, L3
    34: check_natural       L3, C1, +5=43
    38: copy                C0, L4
    41: branch              +14=57
    43: check_natural       L3, C0, +5=52
    47: copy                C0, L4
    50: branch              +5=57
    52: call                C2, (1 arguments)
       0: L3
       => L4
    57: natural_typecheck   L4, +6=66
    60: add_natural         L2, L4, L5
    64: branch              +26=92
    66: integer_typecheck   L2, +9=78
    69: integer_typecheck   L4, +6=78
    72: add_integer         L2, L4, L5
    76: branch              +14=92
    78: real_typecheck      L2, +9=90
    81: real_typecheck      L4, +6=90
    84: add_real            L2, L4, L5
    88: branch              +2=92
    90: form_error          C4
    92: ret                 L5

Compiled [fib 0 is 1|fib 1 is 1|fib N is… (N - 1) + fib (N - 2))|fib 6] as
Constants:
  C0	= 6
  C1	= fib N
Locals:
  L0	= [fib 6][fib 0 is 1|fib 1 is 1|fib N is… (N - 1) + fib (N - 2))|fib 6]
Types:
  6	as	natural
Opcodes (1 locals, 0 parameters, 1 temporaries)
     0: call                C1, (1 arguments)
       0: C0
       => L0
     5: ret                 L0

followed by the computation result:

13

Hope this helps getting started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants