Merge 'expression' JIT backend #674

bdw · 2017-09-04T16:01:52Z

Now that the changes to the legacy JIT have been applied and merged, this branch contains (mostly) the addition of the expression JIT compiler.
Added functionality includes:

- Expression tree building
- Code generation by means of tree tiling
- Full register selection via DynASM
- A general register allocator
  - Although it can only handle forward-only graphs for now
- Allocating values in local memory (for spilling)
- Automated 'bad compilation' bisect process
- A minor 'bitmap' library (used in the register allocator)

Not yet implemented, but planned:

- An optimizer
- REPR-specialized templates or tree generation
- Multiple basic blocks per expression tree

A particular node's value may be stored in a multitude of registers and stack locations (and vice versa). Hence we want multiple value (descriptor) structures per node. I first extract the MVMJitExprValue structure out of the NodeInfo structure. The NodeInfo structure is relevant in an earlier phase, not really during register allocation. Value structures are now spesh-allocated.

Each node value may reside in multiple storage locations, and each storage location may contain multiple node values. ValueDescriptor structures are intended to describe just a single value in a single location, and to be arranged in tables and lists for lookup purposes.

jnthn · 2017-10-02T19:51:52Z

docs/jit/ir.md

+single expression-wide scope and that redefinitions are not allowed.
+
+
+**Statement Macro's** are lists of which the node name start with an


jnthn · 2017-10-02T19:53:29Z

docs/jit/ir.md

+
+A tree macro is defined using the 'macro:' keyword, followed by the
+macro name (including the '^' prefix), a list of macro arguments, and
+the actual macro list. For example, the '^p6obody' macro is


The '^spesh_slot' macro appears to be shown below?

jnthn · 2017-10-02T19:53:59Z

docs/jit/ir.md

+
+    (macro: ^spesh_slot (,a)
+       (idx
+          (load (addr (frame) (&offsetof MVMFrame effective_spesh_slots)))


Also, effective_spesh_slots went away iirc, so this example is a tad out-dated.

effective_spesh_slots still exists in src/core/frame.h, should it not?

Oh, right, that one had to survive my MVMFrame shrinkage. :-)

jnthn · 2017-10-02T20:03:50Z

docs/jit/tiles.md

+A tile description is a list that starts with the keyword
+<code>tile:</code>, followed by the tile *name*, the pattern proper,
+the *symbol* that it generates, and the (estimated) tile *cost*. There
+are two supported symbols, namely __reg__, __flag__ and __void__. The


jnthn · 2017-10-02T20:08:41Z

docs/jit/tiles.md

+        int src_b = values[2]->reg_num;
+        if (values[0]->size < 8)
+            MVM_oops(tc, "oops!");
+        if (src_a != dsst) {


s/dsst/dst/

jnthn · 2017-10-02T20:11:30Z

docs/jit/tiles.md

+  node, if any.
+
+These values are thus available to any tile that is compiled. Often, a
+till will require data that is 'deeper' in the tree, for instance the


s/till/tile/

jnthn · 2017-10-02T20:15:08Z

src/core/bitmap.h

+/* NB - make this a separate 'library', use it for register bitmap */
+/* Witness the elegance of the bitmap for our purposes. */
+MVM_STATIC_INLINE void MVM_bitmap_set(MVMBitmap *bits, MVMint32 idx) {
+    bits[idx >> 6] |= (UINT64_C(1) << (idx & 0x3f));


Why 6, out of curiosity? Are we keeping the upper bits of the byte for something?

The idx indicates the bit we want. Its lowest 6 bits represent 0-63, the position within one word, so as we're indexing an array of 64-bit integers, we need to shift down by 6 bits to get the correct array index.
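The index arithmetic described above can be sketched as follows. This is a minimal illustration, not the contents of src/core/bitmap.h: the `DemoBitmap` type and the `demo_*` helpers are hypothetical names, and the sketch assumes `idx` is non-negative and within the allocated words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MVMBitmap: an array of 64-bit words. */
typedef uint64_t DemoBitmap;

/* Word index: idx / 64, written as a shift by 6 because 2^6 == 64. */
static inline size_t demo_word(int32_t idx) {
    return (size_t)((uint32_t)idx >> 6);
}

/* Bit within the word: idx % 64, i.e. the low 6 bits (mask 0x3f). */
static inline uint64_t demo_mask(int32_t idx) {
    return UINT64_C(1) << (idx & 0x3f);
}

static inline void demo_bitmap_set(DemoBitmap *bits, int32_t idx) {
    bits[demo_word(idx)] |= demo_mask(idx);
}

static inline void demo_bitmap_clear(DemoBitmap *bits, int32_t idx) {
    bits[demo_word(idx)] &= ~demo_mask(idx);
}

static inline int demo_bitmap_get(const DemoBitmap *bits, int32_t idx) {
    return (bits[demo_word(idx)] & demo_mask(idx)) != 0;
}
```

For example, bit 70 lives in word 1 (70 >> 6) at bit position 6 (70 & 0x3f), so no upper bits of the index are reserved for anything else.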

jnthn · 2017-10-02T20:29:54Z

src/jit/core_templates.expr

+              (^is_type_obj $1)
+              (zr (^getf (^stable $1) MVMSTable container_spec)))
+         (store $0 $1 ptr_sz)
+         (callv (^stable_cont_func $1 fetch)


I guess this'd be a bit more efficient with a let that dereferenced the container_spec off the STable once?

Yes, but we'd have to guard against $1 being null, and in general complicate the logic quite considerably... It'd be something like:

    (ifv (all (nz $1) (^is_concrete_obj $1))
         (let (($cont_spec (^getf (^stable $1) MVMSTable container_spec)))
              (ifv (nz $cont_spec)
                   (callv (^getf $cont_spec MVMContainerSpec) ...)
                   (store $0 $1 ptr_sz)))
         (store $0 $1 ptr_sz))

And considering that due to tiling, the loads in (^getf) may be done doubly anyway (I'm planning an optimization to prevent that), I'm not sure we're actually doing better, and more importantly, the (future) optimizer should be able to resolve these things automatically.

But thanks for the comment, well found.

jnthn · 2017-10-02T20:37:10Z

src/jit/expr.c

+                /* A label the OSR can jump into to 'start running', so to
+                 * speak. As it breaks the basic-block assumption, arguably,
+                 * this should only ever be at the start of a basic block. But
+                 * it's not. So we have to insert the label and compute it. */


As of a while ago, this actually is the case: https://github.com/MoarVM/MoarVM/blob/master/src/spesh/graph.c#L329

If that's the case then this can be cleaned up. Either way no label will actually be inserted since MVM_jit_is_label_for_ins will be false (it is a basic block label in that case) 😄

jnthn · 2017-10-02T20:48:46Z

src/jit/linear_scan.c

+     * large in this case (32 for RISC architectures, maybe, if we ever support
+     * them; 7 for x86-64. So the time spent on insertion sort is always small
+     * and bounded by a constant, hence O(1). Yes, algorithmics works this way
+     * :-) */
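
The argument in the quoted comment can be illustrated with a small sketch: if the sorted set never holds more entries than there are allocatable registers, each insertion shifts at most that many elements. The names below are hypothetical and are not taken from linear_scan.c; the capacity of 7 is the x86-64 figure mentioned in the comment.

```c
#include <assert.h>

/* Upper bound on simultaneously active live ranges (assumed: one per
 * allocatable register, 7 as quoted for x86-64). */
#define DEMO_MAX_ACTIVE 7

typedef struct {
    int end[DEMO_MAX_ACTIVE]; /* live range end positions, ascending */
    int num;                  /* number of entries in use */
} DemoActiveSet;

/* Insert end_pos while keeping the array sorted. At most
 * DEMO_MAX_ACTIVE - 1 elements are shifted, so the work per insertion
 * is bounded by a constant. Returns 0 on success, -1 if the set is full
 * (a real allocator would have to spill at that point). */
static int demo_active_insert(DemoActiveSet *set, int end_pos) {
    int i;
    if (set->num == DEMO_MAX_ACTIVE)
        return -1;
    i = set->num;
    while (i > 0 && set->end[i - 1] > end_pos) {
        set->end[i] = set->end[i - 1];
        i--;
    }
    set->end[i] = end_pos;
    set->num++;
    return 0;
}
```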


jnthn · 2017-10-02T20:52:04Z

src/jit/macro.expr

+   (load (addr ,object (&offsetof ,type ,field)) (&SIZEOF_MEMBER ,type ,field)))
+(macro: ^setf (,object ,type ,field ,value)
+    (store (addr ,object (&offsetof ,type ,field)) ,value (&SIZEOF_MEMBER ,type ,field)))
+(macro: ^cu_string (,a) (idx  (^getf (cu) MVMCompUnit body.strings) ,a ptr_sz))


Hm, but strings may be lazily loaded, or do we guard against that case somewhere else?

Nope, good point, this is an old legacy and should be fixed
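
For readers unfamiliar with the template macros quoted above, `^getf` expands to a load from `object + offsetof(type, field)` with the size of that member. A rough C analogue, as a sketch only: `DemoSTable`, `DEMO_SIZEOF_MEMBER` and `demo_getf` are hypothetical names and do not reflect how the JIT emits the load.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of a struct member without needing an instance of the struct. */
#define DEMO_SIZEOF_MEMBER(type, field) sizeof(((type *)0)->field)

/* Hypothetical structure standing in for something like MVMSTable. */
typedef struct {
    int   header;
    void *container_spec;
} DemoSTable;

/* Copy `size` bytes found at `offset` bytes into `object` to `out`,
 * mirroring (load (addr object offset) size). */
static void demo_getf(const void *object, size_t offset, size_t size, void *out) {
    memcpy(out, (const char *)object + offset, size);
}
```

A caller would pass `offsetof(DemoSTable, container_spec)` and `DEMO_SIZEOF_MEMBER(DemoSTable, container_spec)`, which is the same pair of compile-time constants the macro computes with `&offsetof` and `&SIZEOF_MEMBER`.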

jnthn · 2017-10-02T20:53:24Z

src/jit/macro.expr

+
+# get spesh slot address
+(macro: ^spesh_slot (,a)
+   (idx (^getf (^frame) MVMFrame effective_spesh_slots) ,a ptr_sz))


Nothing to resolve right now, but I'm curious how we do the "if it's gen2, then we don't need to resolve it again because it won't move" trick with the expr JIT.

In general, with a special node objref or const_obj or somesuch, that should be added by the expr tree builder. And then - to top it off - we can make that a GC root that is automatically updated.

jnthn · 2017-10-02T20:57:02Z

src/jit/x64/arch.c

+
+#ifndef MIN
+#define MIN(a,b) ((a) < (b) ? (a) : (b))
+#endif


I'm sure I saw this defined in internal.h

Correct, and then also undefined, because I didn't want them to leak out, i.e. to have nonsymbolic values... I'm not sure about the legitimacy of that concern

jnthn · 2017-10-02T21:02:19Z

I've left a few small cleanup comments, but overall am very happy with this PR, and happy for it to be merged. Very nice work! 👍

AlexDaniel · 2017-10-02T23:34:02Z

FWIW If it looks good, I'd prefer to see this merged this week. Any later than that raises the possibility of unnecessary headache with the upcoming release (this month we have rakudo * release too).

I don't know, maybe this won't be disruptive at all, but better safe than sorry.

AlexDaniel · 2017-10-02T23:38:41Z

docs/jit/ir.md

@@ -0,0 +1,257 @@
+# Expression 'Tree' Intermediate Representation
+
+The 'expression tree' IR has been developed developed to support


developed developed

jnthn · 2017-10-02T23:43:31Z

@AlexDaniel Yes, getting this into the upcoming release in good time is why I spent a good chunk of my evening reviewing this. :-) Provided @bdw is happy that it's ready, so am I.

The direct access of MVMCompUnit->body.strings was a legacy from simpler days when compunit strings were loaded eagerly. As they're now using lazy loading, that isn't really valid anymore. Possible future development would be to force eager loading during JIT compilation and/or upgrading to second-generation memory.

bdw added 30 commits May 21, 2016 21:59

Remove register locking logic

c9bf2fc

Because we can't allocate scratch registers anyway, this is no longer useful.

Merge remote-tracking branch 'origin/master' into even-moar-jit

cba12fc

Move TileList from linked list to vector

fcb2edb

Allow easier backwards and forwards traversal in the tilelist, which is relevant in several steps of the register allocator (such as precoloring).

Be a sport, clean up yer mess

746b340

Move expression tree walking code

6a6fcc0

May be useful in other contexts (such as optimization)

Reduce register allocator surface area

9a6ca81

We will now need a way to specify per-tile register requirements, as tiles can no longer try and 'fix' their registers at runtime. We don't actually seem to hit the NYI paths, though.

Internalize RegisterAllocator structure

34da4f6

The register allocator only contains temporary structures, and as such does not need to be 'public'.

Simplify the tile function signature

ac8b1e1

Tile objects already contain all the necessary values, so we can just pass it rather than push all its elements.

Use static register buffers

ba7f715

No need to allocate fixed-size buffers

Split live ranges from value descriptors

c217b9a

At least conceptually, a value descriptor can have a quite different (typically shorter) lifespan than a live range does. This allows multiple value descriptors per node, and in the future to manage spilled nodes.

Add tilelist editing functionality

f0c9a86

This will allow inserting tiles in the linear list with a fixed cost per tile and without upsetting ranges prior to the edit.
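
One way to get this behaviour is to queue insertions against positions in the original list and apply them all in a single merge pass. The sketch below assumes that approach; it is not taken from the commit, and `DemoInsert` and `demo_apply_inserts` are hypothetical names. Tiles are represented as plain `int` values, and every `after` is assumed to satisfy 0 <= after < num_items.

```c
#include <assert.h>
#include <stdlib.h>

/* A queued edit: place `tile` after original index `after`. `order`
 * breaks ties between inserts at the same position. */
typedef struct {
    int after;
    int order;
    int tile;
} DemoInsert;

static int demo_cmp_insert(const void *a, const void *b) {
    const DemoInsert *x = a, *y = b;
    if (x->after != y->after)
        return x->after < y->after ? -1 : 1;
    return (x->order > y->order) - (x->order < y->order);
}

/* Build a new list of num_items + num_inserts entries. Positions refer
 * to the original list, so queuing one edit never shifts the position
 * of another. The merge itself touches each entry once; the sort adds
 * O(k log k) for k inserts. Caller frees the result. */
static int *demo_apply_inserts(const int *items, size_t num_items,
                               DemoInsert *inserts, size_t num_inserts) {
    size_t i, j = 0, k = 0;
    int *out = malloc((num_items + num_inserts) * sizeof *out);
    if (!out)
        return NULL;
    if (num_inserts > 0)
        qsort(inserts, num_inserts, sizeof *inserts, demo_cmp_insert);
    for (i = 0; i < num_items; i++) {
        out[k++] = items[i];
        while (j < num_inserts && inserts[j].after == (int)i)
            out[k++] = inserts[j++].tile;
    }
    return out;
}
```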

Publish tile construction API

985bb59

So we can create pseudotiles in the register allocator too

Add preliminary register spilling

168e11a

This is work-in-progress, doesn't work yet

More things in their right place

30a7932

I can't implement loading spilled values just yet, because it requires sensitivity to the relative ordering of insert-after, and I'm not quite sure how to do that.

Merge remote-tracking branch 'origin/master' into even-moar-jit

d62126a

Merge remote-tracking branch 'origin/master' into even-moar-jit

e976b96

Roadmap for register allocator

0ede936

Will develop the register allocator based on linear scan, with 4 separate passes.

Use DYNAR macro arguments only once if possible

d7aa0d3

Bikeshedding - rename DYNAR to VECTOR

f9fe7ca

Vector is probably more common terminology for a resizing array than 'DYNAR', and nearly as short.
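
For context, a "vector" in this sense is an array that grows on demand. The following is a function-based sketch of the idea only; the VECTOR macros in the branch are not shown in this thread, and `DemoVector` and `demo_vector_push` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int   *items; /* heap storage, NULL while empty */
    size_t num;   /* elements in use */
    size_t alloc; /* elements allocated */
} DemoVector;

/* Append a value, doubling the capacity when full (starting at 4).
 * Returns 0 on success, -1 if reallocation fails; the vector is left
 * unchanged in that case. */
static int demo_vector_push(DemoVector *v, int value) {
    if (v->num == v->alloc) {
        size_t new_alloc = v->alloc ? v->alloc * 2 : 4;
        int *grown = realloc(v->items, new_alloc * sizeof *grown);
        if (!grown)
            return -1;
        v->items = grown;
        v->alloc = new_alloc;
    }
    v->items[v->num++] = value;
    return 0;
}
```

Because this is a function, each argument is evaluated exactly once, which is the property the preceding commit ("Use DYNAR macro arguments only once if possible") is concerned with for the macro versions.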

Move tile list editing to tile.c

c07a09f

A bit more sensible that way I hope

Work-in-progress pieces for linear scan

6079744

More notes on linear scan

d99334c

More notes on register allocation

84d8099

Remove redundant macro parameter

cdb7856

Merge remote-tracking branch 'origin/master' into even-moar-jit

8734e0b

Merge remote-tracking branch 'origin/master' into even-moar-jit

bb5c0e2

Treat the different copy/phi nodes separately

7da475f

They were treated as the same, but because they are subtly different, they are best treated differently

Change some types and format strings to make clang happy

f24555a

Contrary to my expectations, MVMint8 != char
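
The underlying C rule is that `char`, `signed char` and `unsigned char` are three distinct types, even though `char` has the same representation as one of the other two. Assuming MVMint8 is a typedef for a signed 8-bit type such as `int8_t` (not confirmed in this thread), a pointer to it is not compatible with `char *`. A C11 sketch that makes the distinction visible, with `DEMO_TYPE_NAME` as a hypothetical helper:

```c
#include <assert.h>
#include <string.h>

/* Report which of the three character types an expression has.
 * Requires C11 for _Generic. */
#define DEMO_TYPE_NAME(x) _Generic((x), \
    char:          "char",              \
    signed char:   "signed char",       \
    unsigned char: "unsigned char",     \
    default:       "other")
```

That all three can appear as separate `_Generic` associations is itself evidence that the compiler treats them as different types; listing the same type twice would be a constraint violation.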

bdw added 2 commits September 28, 2017 08:16

Merge remote-tracking branch 'origin/master' into even-moar-jit

2de682b

RegisterAllocator.active is not a vector

2503441

So it shouldn't be operated on with vector methods, and vector methods should especially not assume that they work on this particular data structure.

bdw requested review from timo, jnthn and niner October 2, 2017 06:36

bdw self-assigned this Oct 2, 2017

Improve the expression JIT documentation

dc3d40f

Add a document describing its most important components (expression template processor / tree builder, tiler, and register allocator).

jnthn reviewed Oct 2, 2017

AlexDaniel reviewed Oct 2, 2017

bdw added 3 commits October 3, 2017 12:47

More documentation fixes

6de7455

Some of the things in tiles.md were no longer true

MVM_cu_string - second argument is *cu

33270f0

Not idx, oops

bdw merged commit 7f07181 into master Oct 3, 2017

zoffixznet mentioned this pull request Oct 8, 2017

Merging of PRs with many commits isn't announced Raku/geth#9

Open

AlexDaniel deleted the even-moar-jit branch April 27, 2018 16:31
