New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge 'expression' JIT backend #674
Conversation
Because we can't allocate scratch registers anyway, this is no longer useful.
A particular nodes' value may be stored in a multitude of registers and stack locations (and vice versa). Hence we want multiple value (descriptor0 structures per node. I first extract the MVMJitExprValue structure out of the NodeInfo structure. The NodeInfo structure is relevant in an earlier phase, not really during register allocation. Value structures are now spesh-allocated.
Allow easier backwards and forwards traversal in the tilelist, which is relevant in several steps of the register allocator (such as precoloring).
May be useful in other contexts (such as optimization)
We will now need way to specify per-tile register requirements, as tiles can no longer try and 'fix' their registers at runtime. We don't actually seem to hit the NYI paths, though.
Each node value may reside in multiple storage locations, and each storage location may contain multiple node values. ValueDescriptor structures are intended to describe just a single value in a single location, and to be arranged in tables and lists for lookup purposes.
The register allocator only contains temporary structures, and as such does not need to be 'public'.
Tile objects already contain all the necessary values, so we can just pass it rather than push all its elements.
No need to allocate fixed-size buffers
At least conceptually, a value descriptor can have a quite different (typically shorter) lifespan than a live range does. This allows multiple value descriptors per node, and in the future to manage spilled nodes.
This will allow inserting tiles in the linear list with a fixed cost per tile and without upsetting ranges prior to the edit.
So we can create pseudotiles in the register allocator too
This is work-in-progress, doesn't work yet
I can't implement loading spilled values just yet, because it requires sensitivity to the relative ordering of insert-after, and I'm not quite sure how to do that.
Will develop the register allocator based on linear scan, with 4 separate passes.
Vector is probably more common terminology for a resizing array than 'DYNAR', and nearly as short.
A bit more sensible that way I hope
They were treated as the same, but because they are subtly different, they are best treated differently
Contrary to my expectations, MVMint8 != char
So it shouldn't be operated on with vector methods, and vector methods should especially not assume that they work on this particular data structure.
Add a document describing its most important components (expression template processor / tree builder, tiler, and register allocator).
docs/jit/ir.md
Outdated
single expression-wide scope and that redefinitions are not allowed. | ||
|
||
|
||
**Statement Macro's** are lists of which the node name start with an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/'//
docs/jit/ir.md
Outdated
|
||
A tree macro is defined using the 'macro:' keyword, followed by the | ||
macro name (including the '^' prefix), a list of macro arguments, and | ||
the actual macro list. For example, the '^p6obody' macro is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The '^spesh_slot` macro appears to be shown below?
|
||
(macro: ^spesh_slot (,a) | ||
(idx | ||
(load (addr (frame) (&offsetof MVMFrame effective_spesh_slots))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, effective_spesh_slots
went away iirc, so this example is a tad out-dated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
effective_spesh_slots still exists in src/core/frame.h
, should it not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, that one had to survive my MVMFrame shrinkage. :-)
docs/jit/tiles.md
Outdated
A tile description is a list that starts with the keyword | ||
<code>tile:</code>, followed by the tile *name*, the pattern proper, | ||
the *symbol* that it generates, and the (estimated) tile *cost*. There | ||
are two supported symbols, namely __reg__, __flag__ and __void__. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
three? :)
docs/jit/tiles.md
Outdated
int src_b = values[2]->reg_num; | ||
if (values[0]->size < 8) | ||
MVM_oops(tc, "oops!"); | ||
if (src_a != dsst) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/dsst/dst/
docs/jit/tiles.md
Outdated
node, if any. | ||
|
||
These values are thus available to any tile that is compiled. Often, a | ||
till will require data that is 'deeper' in the tree, for instance the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/till/tile/
/* NB - make this a separate 'library', use it for register bitmap */ | ||
/* Witness the elegance of the bitmap for our purposes. */ | ||
MVM_STATIC_INLINE void MVM_bitmap_set(MVMBitmap *bits, MVMint32 idx) { | ||
bits[idx >> 6] |= (UINT64_C(1) << (idx & 0x3f)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 6, out of curiosity? Are we keeping the upper bits of the byte for something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idx indicates the bit we want, the first 6 bits represetn 0-63, so as we're dereferencing an array of 64 bit integers, we need to shift down by 6 bits to get the correct array index
(^is_type_obj $1) | ||
(zr (^getf (^stable $1) MVMSTable container_spec))) | ||
(store $0 $1 ptr_sz) | ||
(callv (^stable_cont_func $1 fetch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this'd be a bit more efficient with a let that dereferenced the container_spec off the STable once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but we'd have to guard against $1
being non-null, and in general complicate the logic quite considerably... It'd be something like:
(ifv (all (nz $1) (^is_concrete_obj $1))
(let (($cont_spec (^getf (^stable $1) MVMSTable container_spec)))
(ifv (nz $const_spec)
(callv (^getf $const_spec MVMContainerSpec)
...)
(store $0 $1 ptr_sz))
(store $0 $1 ptr_sz))
And considering that due to tiling, the loads in (^getf)
may be done doubly anyway (I'm planning an optimization to prevent that), I'm not sure we're actually doing better, and more importantly, the (future) optimizer should be able to resolve these things automatically.
But thanks for the comment, well found.
/* A label the OSR can jump into to 'start running', so to | ||
* speak. As it breaks the basic-block assumption, arguably, | ||
* this should only ever be at the start of a basic block. But | ||
* it's not. So we have to insert the label and compute it. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of a while ago, this actually is the case: https://github.com/MoarVM/MoarVM/blob/master/src/spesh/graph.c#L329
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case then this can be cleaned up. Either way no label will actually be inserted since MVM_jit_is_label_for_ins
will be false (it is a basic block label in that case) 😄
* large in this case (32 for RISC architectures, maybe, if we ever support | ||
* them; 7 for x86-64. So the time spent on insertion sort is always small | ||
* and bounded by a constant, hence O(1). Yes, algorithmics works this way | ||
* :-) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:-P
src/jit/macro.expr
Outdated
(load (addr ,object (&offsetof ,type ,field)) (&SIZEOF_MEMBER ,type ,field))) | ||
(macro: ^setf (,object ,type ,field ,value) | ||
(store (addr ,object (&offsetof ,type ,field)) ,value (&SIZEOF_MEMBER ,type ,field))) | ||
(macro: ^cu_string (,a) (idx (^getf (cu) MVMCompUnit body.strings) ,a ptr_sz)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, but strings may be lazily loaded, or do we guard against that case somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, good point, this is an old legacy and should be fixed
|
||
# get spesh slot address | ||
(macro: ^spesh_slot (,a) | ||
(idx (^getf (^frame) MVMFrame effective_spesh_slots) ,a ptr_sz)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing to resolve right now, but I'm curious how we do the "if it's gen2, then we don't need to resolve it again because it won't move" trick with the expr JIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, with a special node objref
or const_obj
or somesuch, that should be added by the expr tree builder. And then - to top it off - we can make that a GC root that is automatically updated.
|
||
#ifndef MIN | ||
#define MIN(a,b) ((a) < (b) ? (a) : (b)) | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure I saw this defined in internal.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, and then also undefined, because I didn't want them to leak out, i.e. to have nonsymbolic values... I'm not sure about the legitimacy of that concern
I've left a few small cleanup comments, but overall am very happy with this PR, and happy for it to be merged. Very nice work! 👍 |
FWIW If it looks good, I'd prefer to see this merged this week. Any later than that raises the possibility of unnecessary headache with the upcoming release (this month we have rakudo * release too). I don't know, maybe this won't be disruptive at all, but better safe than sorry. |
docs/jit/ir.md
Outdated
@@ -0,0 +1,257 @@ | |||
# Expression 'Tree' Intermediate Representation | |||
|
|||
The 'expression tree' IR has been developed developed to support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
developed developed
@AlexDaniel Yes, getting this into the upcoming release in good time is why I spent a good chunk of my evening reviewing this. :-) Provided @bdw is happy that it's ready, so am I. |
Some of the things in tiles.md were no longer true
The direct access of MVMCompUnit->body.strings was a legacy from simpler days when compunit strings were loaded eagerly. As they're now using lazy loading, that isn't really valid anymore. Possible future development would be to force eager loading during JIT compilation and/or upgrading to second-generation memory.
Not idx, oops
Now that the changes to the legacy JIT have been applied and merged, this branch now contains (mostly) the addition of the expression JIT compiler.
Added functionality includes:
Not yet implemented, but planned: