Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge 'expression' JIT backend #674

Merged
merged 472 commits into from Oct 3, 2017
Merged

Merge 'expression' JIT backend #674

merged 472 commits into from Oct 3, 2017

Conversation

bdw
Copy link
Contributor

@bdw bdw commented Sep 4, 2017

Now that the changes to the legacy JIT have been applied and merged, this branch now contains (mostly) the addition of the expression JIT compiler.
Added functionality includes:

  • Expression tree building
  • Code generation by means of tree tiling
  • Full register selection via DynASM
  • A general register allocator
    • Although it can only handle forward-only graphs for now
  • Allocating values in local memory (for spilling)
  • Automated 'bad copmpilation' bisect process
  • A minor 'bitmap' library (used in the register allocator)

Not yet implemented, but planned:

  • An optimizer
  • REPR-specialized templates or tree generation
  • Multiple basic blocks per expression tree

bdw added 30 commits May 21, 2016 21:59
Because we can't allocate scratch registers anyway, this is no
longer useful.
A particular nodes' value may be stored in a multitude of
registers and stack locations (and vice versa). Hence we want
multiple value (descriptor0 structures per node. I first
extract the MVMJitExprValue structure out of the NodeInfo
structure. The NodeInfo structure is relevant in an earlier
phase, not really during register allocation. Value structures
are now spesh-allocated.
Allow easier backwards and forwards traversal in the tilelist,
which is relevant in several steps of the register allocator
(such as precoloring).
May be useful in other contexts (such as optimization)
We will now need way to specify per-tile register requirements,
as tiles can no longer try and 'fix' their registers at runtime.
We don't actually seem to hit the NYI paths, though.
Each node value may reside in multiple storage locations, and each
storage location may contain multiple node values. ValueDescriptor
structures are intended to describe just a single value in a
single location, and to be arranged in tables and lists for lookup
purposes.
The register allocator only contains temporary structures, and as
such does not need to be 'public'.
Tile objects already contain all the necessary values, so we can
just pass it rather than push all its elements.
No need to allocate fixed-size buffers
At least conceptually, a value descriptor can have a quite different
(typically shorter) lifespan than a live range does. This allows
multiple value descriptors per node, and in the future to manage
spilled nodes.
This will allow inserting tiles in the linear list with a fixed cost
per tile and without upsetting ranges prior to the edit.
So we can create pseudotiles in the register allocator too
This is work-in-progress, doesn't work yet
I can't implement loading spilled values just yet, because it
requires sensitivity to the relative ordering of insert-after,
and I'm not quite sure how to do that.
Will develop the register allocator based on linear scan, with
4 separate passes.
Vector is probably more common terminology for a resizing array
than 'DYNAR', and nearly as short.
A bit more sensible that way I hope
They were treated as the same, but because they are subtly
different, they are best treated differently
Contrary to my expectations, MVMint8 != char
So it shouldn't be operated on with vector methods, and vector methods
should especially not assume that they work on this particular data
structure.
@bdw bdw requested review from timo, jnthn and niner October 2, 2017 06:36
@bdw bdw self-assigned this Oct 2, 2017
Add a document describing its most important components (expression
template processor / tree builder, tiler, and register allocator).
docs/jit/ir.md Outdated
single expression-wide scope and that redefinitions are not allowed.


**Statement Macro's** are lists of which the node name start with an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/'//

docs/jit/ir.md Outdated

A tree macro is defined using the 'macro:' keyword, followed by the
macro name (including the '^' prefix), a list of macro arguments, and
the actual macro list. For example, the '^p6obody' macro is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The '^spesh_slot` macro appears to be shown below?


(macro: ^spesh_slot (,a)
(idx
(load (addr (frame) (&offsetof MVMFrame effective_spesh_slots)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, effective_spesh_slots went away iirc, so this example is a tad out-dated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

effective_spesh_slots still exists in src/core/frame.h, should it not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, that one had to survive my MVMFrame shrinkage. :-)

A tile description is a list that starts with the keyword
<code>tile:</code>, followed by the tile *name*, the pattern proper,
the *symbol* that it generates, and the (estimated) tile *cost*. There
are two supported symbols, namely __reg__, __flag__ and __void__. The
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

three? :)

int src_b = values[2]->reg_num;
if (values[0]->size < 8)
MVM_oops(tc, "oops!");
if (src_a != dsst) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/dsst/dst/

node, if any.

These values are thus available to any tile that is compiled. Often, a
till will require data that is 'deeper' in the tree, for instance the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/till/tile/

/* NB - make this a separate 'library', use it for register bitmap */
/* Witness the elegance of the bitmap for our purposes. */
MVM_STATIC_INLINE void MVM_bitmap_set(MVMBitmap *bits, MVMint32 idx) {
bits[idx >> 6] |= (UINT64_C(1) << (idx & 0x3f));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 6, out of curiosity? Are we keeping the upper bits of the byte for something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idx indicates the bit we want, the first 6 bits represetn 0-63, so as we're dereferencing an array of 64 bit integers, we need to shift down by 6 bits to get the correct array index

(^is_type_obj $1)
(zr (^getf (^stable $1) MVMSTable container_spec)))
(store $0 $1 ptr_sz)
(callv (^stable_cont_func $1 fetch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this'd be a bit more efficient with a let that dereferenced the container_spec off the STable once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we'd have to guard against $1 being non-null, and in general complicate the logic quite considerably... It'd be something like:

(ifv (all (nz $1) (^is_concrete_obj $1)) 
      (let (($cont_spec (^getf (^stable $1) MVMSTable container_spec)))
             (ifv (nz $const_spec)
                   (callv (^getf $const_spec MVMContainerSpec) 
                             ...)
                   (store $0 $1 ptr_sz))
    (store $0 $1 ptr_sz))

And considering that due to tiling, the loads in (^getf) may be done doubly anyway (I'm planning an optimization to prevent that), I'm not sure we're actually doing better, and more importantly, the (future) optimizer should be able to resolve these things automatically.

But thanks for the comment, well found.

/* A label the OSR can jump into to 'start running', so to
* speak. As it breaks the basic-block assumption, arguably,
* this should only ever be at the start of a basic block. But
* it's not. So we have to insert the label and compute it. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of a while ago, this actually is the case: https://github.com/MoarVM/MoarVM/blob/master/src/spesh/graph.c#L329

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case then this can be cleaned up. Either way no label will actually be inserted since MVM_jit_is_label_for_ins will be false (it is a basic block label in that case) 😄

* large in this case (32 for RISC architectures, maybe, if we ever support
* them; 7 for x86-64. So the time spent on insertion sort is always small
* and bounded by a constant, hence O(1). Yes, algorithmics works this way
* :-) */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:-P

(load (addr ,object (&offsetof ,type ,field)) (&SIZEOF_MEMBER ,type ,field)))
(macro: ^setf (,object ,type ,field ,value)
(store (addr ,object (&offsetof ,type ,field)) ,value (&SIZEOF_MEMBER ,type ,field)))
(macro: ^cu_string (,a) (idx (^getf (cu) MVMCompUnit body.strings) ,a ptr_sz))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, but strings may be lazily loaded, or do we guard against that case somewhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, good point, this is an old legacy and should be fixed


# get spesh slot address
(macro: ^spesh_slot (,a)
(idx (^getf (^frame) MVMFrame effective_spesh_slots) ,a ptr_sz))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to resolve right now, but I'm curious how we do the "if it's gen2, then we don't need to resolve it again because it won't move" trick with the expr JIT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, with a special node objref or const_obj or somesuch, that should be added by the expr tree builder. And then - to top it off - we can make that a GC root that is automatically updated.


#ifndef MIN
#define MIN(a,b) ((a) < (b) ? (a) : (b))
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure I saw this defined in internal.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, and then also undefined, because I didn't want them to leak out, i.e. to have nonsymbolic values... I'm not sure about the legitimacy of that concern

@jnthn
Copy link
Member

jnthn commented Oct 2, 2017

I've left a few small cleanup comments, but overall am very happy with this PR, and happy for it to be merged. Very nice work! 👍

@AlexDaniel
Copy link
Contributor

FWIW If it looks good, I'd prefer to see this merged this week. Any later than that raises the possibility of unnecessary headache with the upcoming release (this month we have rakudo * release too).

I don't know, maybe this won't be disruptive at all, but better safe than sorry.

docs/jit/ir.md Outdated
@@ -0,0 +1,257 @@
# Expression 'Tree' Intermediate Representation

The 'expression tree' IR has been developed developed to support
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

developed developed

@jnthn
Copy link
Member

jnthn commented Oct 2, 2017

@AlexDaniel Yes, getting this into the upcoming release in good time is why I spent a good chunk of my evening reviewing this. :-) Provided @bdw is happy that it's ready, so am I.

Some of the things in tiles.md were no longer true
The direct access of MVMCompUnit->body.strings was a legacy from
simpler days when compunit strings were loaded eagerly. As they're now
using lazy loading, that isn't really valid anymore.

Possible future development would be to force eager loading during JIT
compilation and/or upgrading to second-generation memory.
@bdw bdw merged commit 7f07181 into master Oct 3, 2017
@AlexDaniel AlexDaniel deleted the even-moar-jit branch April 27, 2018 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants