Merge 'expression' JIT backend #674

Merged
merged 472 commits into from Oct 3, 2017

7 participants

bdw commented Sep 4, 2017

Now that the changes to the legacy JIT have been applied and merged, this branch contains (mostly) the addition of the expression JIT compiler.
Added functionality includes:

  • Expression tree building
  • Code generation by means of tree tiling
  • Full register selection via DynASM
  • A general register allocator
    • Although it can only handle forward-only graphs for now
  • Allocating values in local memory (for spilling)
  • Automated 'bad compilation' bisect process
  • A minor 'bitmap' library (used in the register allocator)

Not yet implemented, but planned:

  • An optimizer
  • REPR-specialized templates or tree generation
  • Multiple basic blocks per expression tree

bdw added some commits May 21, 2016

Remove register locking logic
Because we can't allocate scratch registers anyway, this is no
longer useful.
Extract MVMJitExprValue out of MVMJitExprNodeInfo
A particular node's value may be stored in a multitude of
registers and stack locations (and vice versa). Hence we want
multiple value (descriptor) structures per node. I first
extract the MVMJitExprValue structure out of the NodeInfo
structure. The NodeInfo structure is relevant in an earlier
phase, not really during register allocation. Value structures
are now spesh-allocated.
Move TileList from linked list to vector
Allow easier backwards and forwards traversal in the tilelist,
which is relevant in several steps of the register allocator
(such as precoloring).
Move expression tree walking code
May be useful in other contexts (such as optimization)
Reduce register allocator surface area
We will now need a way to specify per-tile register requirements,
as tiles can no longer try and 'fix' their registers at runtime.
We don't actually seem to hit the NYI paths, though.
Replace ExprValue with ValueDescriptor
Each node value may reside in multiple storage locations, and each
storage location may contain multiple node values. ValueDescriptor
structures are intended to describe just a single value in a
single location, and to be arranged in tables and lists for lookup
purposes.
Internalize RegisterAllocator structure
The register allocator only contains temporary structures, and as
such does not need to be 'public'.
Simplify the tile function signature
Tile objects already contain all the necessary values, so we can
just pass it rather than push all its elements.
Use static register buffers
No need to allocate fixed-size buffers
Split live ranges from value descriptors
At least conceptually, a value descriptor can have a quite different
(typically shorter) lifespan than a live range does. This allows
multiple value descriptors per node, and in the future to manage
spilled nodes.
Add tilelist editing functionality
This will allow inserting tiles in the linear list with a fixed cost
per tile and without upsetting ranges prior to the edit.
Publish tile construction API
So we can create pseudotiles in the register allocator too
Add preliminary register spilling
This is work-in-progress, doesn't work yet
More things in their right place
I can't implement loading spilled values just yet, because it
requires sensitivity to the relative ordering of insert-after,
and I'm not quite sure how to do that.
Roadmap for register allocator
Will develop the register allocator based on linear scan, with
4 separate passes.
Bikeshedding - rename DYNAR to VECTOR
Vector is probably more common terminology for a resizing array
than 'DYNAR', and nearly as short.
Move tile list editing to tile.c
A bit more sensible that way I hope

vendethiel commented on docs/jit/register-allocator.org in 6079744 Aug 18, 2016

register*

RegisterAllocator.active is not a vector
So it shouldn't be operated on with vector methods, and vector methods
should especially not assume that they work on this particular data
structure.

@bdw bdw requested review from timo, jnthn and niner Oct 2, 2017

@bdw bdw self-assigned this Oct 2, 2017

Improve the expression JIT documentation
Add a document describing its most important components (expression
template processor / tree builder, tiler, and register allocator).
docs/jit/ir.md (outdated)
(macro: ^spesh_slot (,a)
(idx
(load (addr (frame) (&offsetof MVMFrame effective_spesh_slots)))

jnthn Oct 2, 2017

Member

Also, effective_spesh_slots went away iirc, so this example is a tad out-dated.

bdw Oct 2, 2017

Contributor

effective_spesh_slots still exists in src/core/frame.h, should it not?

jnthn Oct 2, 2017

Member

Oh, right, that one had to survive my MVMFrame shrinkage. :-)

docs/jit/tiles.md (outdated)

/* NB - make this a separate 'library', use it for register bitmap */
/* Witness the elegance of the bitmap for our purposes. */
MVM_STATIC_INLINE void MVM_bitmap_set(MVMBitmap *bits, MVMint32 idx) {
bits[idx >> 6] |= (UINT64_C(1) << (idx & 0x3f));

jnthn Oct 2, 2017

Member

Why 6, out of curiosity? Are we keeping the upper bits of the byte for something?

bdw Oct 2, 2017

Contributor

The idx indicates the bit we want. The low 6 bits represent 0-63 (the bit position within one word), so as we're indexing into an array of 64-bit integers, we need to shift down by 6 bits to get the correct array index.
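
The index arithmetic described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: the `Bitmap` typedef and the `bitmap_set` / `bitmap_get` names are stand-ins assumed for this example, and only the `idx >> 6` and `idx & 0x3f` expressions are taken from the snippet under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the bitmap type: an array of 64-bit words. */
typedef uint64_t Bitmap;

/* idx >> 6 is idx / 64 and selects the word;
 * idx & 0x3f is idx % 64 and selects the bit inside that word. */
static void bitmap_set(Bitmap *bits, int32_t idx) {
    bits[idx >> 6] |= (UINT64_C(1) << (idx & 0x3f));
}

/* Read back a single bit using the same word/bit split. */
static int bitmap_get(const Bitmap *bits, int32_t idx) {
    return (int)((bits[idx >> 6] >> (idx & 0x3f)) & 1);
}
```

With a two-word array, setting bit 70 touches word 1 (70 >> 6) at bit position 6 (70 & 0x3f) and leaves word 0 unchanged. No bits of the index are reserved: all of them are used, split between word selection and bit selection.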

(^is_type_obj $1)
(zr (^getf (^stable $1) MVMSTable container_spec)))
(store $0 $1 ptr_sz)
(callv (^stable_cont_func $1 fetch)

jnthn Oct 2, 2017

Member

I guess this'd be a bit more efficient with a let that dereferenced the container_spec off the STable once?

bdw Oct 3, 2017

Contributor

Yes, but we'd have to guard against $1 being null, and in general complicate the logic quite considerably... It'd be something like:

(ifv (all (nz $1) (^is_concrete_obj $1))
      (let (($cont_spec (^getf (^stable $1) MVMSTable container_spec)))
             (ifv (nz $cont_spec)
                   (callv (^getf $cont_spec MVMContainerSpec)
                             ...)
                   (store $0 $1 ptr_sz)))
    (store $0 $1 ptr_sz))

And considering that due to tiling, the loads in (^getf) may be done doubly anyway (I'm planning an optimization to prevent that), I'm not sure we're actually doing better, and more importantly, the (future) optimizer should be able to resolve these things automatically.

But thanks for the comment, well found.

/* A label the OSR can jump into to 'start running', so to
* speak. As it breaks the basic-block assumption, arguably,
* this should only ever be at the start of a basic block. But
* it's not. So we have to insert the label and compute it. */

jnthn Oct 2, 2017

Member

As of a while ago, this actually is the case: https://github.com/MoarVM/MoarVM/blob/master/src/spesh/graph.c#L329

bdw Oct 3, 2017

Contributor

If that's the case then this can be cleaned up. Either way no label will actually be inserted since MVM_jit_is_label_for_ins will be false (it is a basic block label in that case) 😄

* large in this case (32 for RISC architectures, maybe, if we ever support
* them; 7 for x86-64). So the time spent on insertion sort is always small
* and bounded by a constant, hence O(1). Yes, algorithmics works this way
* :-) */

jnthn Oct 2, 2017

Member

:-P
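
The argument in the code comment under review (a sorted insert into a list whose length is capped by the number of registers costs a bounded number of steps) can be illustrated with a small sketch. Everything here is assumed for illustration: `MAX_ACTIVE`, `active_insert`, and the use of plain integers as sort keys are hypothetical and not taken from the PR.

```c
#include <assert.h>

/* Assumed bound, mirroring the "7 for x86-64" figure in the comment. */
#define MAX_ACTIVE 7

/* Insert `end` into active[0..n), which is kept sorted ascending.
 * Returns the new element count. The caller must ensure n < MAX_ACTIVE.
 * The loop runs at most MAX_ACTIVE times, so the cost is bounded by a
 * constant regardless of how large the compiled code is. */
static int active_insert(int active[MAX_ACTIVE], int n, int end) {
    int i = n;
    while (i > 0 && active[i - 1] > end) {
        active[i] = active[i - 1];  /* shift larger entries up by one */
        i--;
    }
    active[i] = end;
    return n + 1;
}
```

Inserting 5, 2, 9, 4 in that order leaves the first four slots holding 2, 4, 5, 9.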

src/jit/macro.expr (outdated)
# get spesh slot address
(macro: ^spesh_slot (,a)
(idx (^getf (^frame) MVMFrame effective_spesh_slots) ,a ptr_sz))

jnthn Oct 2, 2017

Member

Nothing to resolve right now, but I'm curious how we do the "if it's gen2, then we don't need to resolve it again because it won't move" trick with the expr JIT.

bdw Oct 3, 2017

Contributor

In general, with a special node objref or const_obj or somesuch, that should be added by the expr tree builder. And then - to top it off - we can make that a GC root that is automatically updated.

#ifndef MIN
#define MIN(a,b) ((a) < (b) ? (a) : (b))
#endif

jnthn Oct 2, 2017

Member

I'm sure I saw this defined in internal.h

bdw Oct 3, 2017

Contributor

Correct, and then also undefined, because I didn't want them to leak out, i.e. to have nonsymbolic values... I'm not sure about the legitimacy of that concern
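
The define-then-undefine pattern mentioned here can be sketched as follows. This is an illustration under assumptions, not the layout of internal.h: the `LOCAL_MIN_DEFINED` marker and the `clamp_to_limit` function are hypothetical names introduced for the example.

```c
#include <assert.h>

/* Define the helper only if nothing else did, and remember that we did. */
#ifndef MIN
#define MIN(a,b) ((a) < (b) ? (a) : (b))
#define LOCAL_MIN_DEFINED 1   /* hypothetical marker, not from the PR */
#endif

/* Code that uses the macro while it is in scope. */
static int clamp_to_limit(int value, int limit) {
    return MIN(value, limit);
}

/* Undefine at the end so the macro does not leak into later code,
 * but only if this file was the one that defined it. */
#ifdef LOCAL_MIN_DEFINED
#undef MIN
#undef LOCAL_MIN_DEFINED
#endif
```

The marker keeps the `#undef` from removing a `MIN` that some other header supplied, which is one way to limit the macro's scope without affecting other code.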

jnthn Oct 2, 2017

Member

I've left a few small cleanup comments, but overall am very happy with this PR, and happy for it to be merged. Very nice work! 👍

AlexDaniel Oct 2, 2017

Member

FWIW If it looks good, I'd prefer to see this merged this week. Any later than that raises the possibility of unnecessary headache with the upcoming release (this month we have rakudo * release too).

I don't know, maybe this won't be disruptive at all, but better safe than sorry.

jnthn Oct 2, 2017

Member

@AlexDaniel Yes, getting this into the upcoming release in good time is why I spent a good chunk of my evening reviewing this. :-) Provided @bdw is happy that it's ready, so am I.

bdw added some commits Oct 3, 2017

More documentation fixes
Some of the things in tiles.md were no longer true
^cu_string - is lazy-loaded so use wrapper
The direct access of MVMCompUnit->body.strings was a legacy from
simpler days when compunit strings were loaded eagerly. As they're now
using lazy loading, that isn't really valid anymore.

Possible future development would be to force eager loading during JIT
compilation and/or upgrading to second-generation memory.
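
Why direct array access stops being valid once strings are loaded lazily can be shown with a simplified model. The `CompUnit` struct, its fields, and the `cu_string` function below are hypothetical stand-ins written for this sketch; they are not MoarVM's actual data structures or API, and the "decoding" step is reduced to a pointer copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified compilation unit: strings are decoded on first use. */
typedef struct {
    const char **raw;      /* encoded string data, always present */
    const char **strings;  /* decoded strings; an entry is NULL until requested */
    int decode_count;      /* number of entries decoded so far (illustration only) */
} CompUnit;

/* Wrapper: decode on demand, then return the cached entry.
 * Reading cu->strings[idx] directly could yield NULL instead. */
static const char *cu_string(CompUnit *cu, size_t idx) {
    if (cu->strings[idx] == NULL) {
        cu->strings[idx] = cu->raw[idx];  /* stand-in for real decoding */
        cu->decode_count++;
    }
    return cu->strings[idx];
}
```

In this model, code that reads `strings[idx]` before anyone has called the wrapper sees `NULL`, which is the failure mode that going through a wrapper avoids.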

@bdw bdw merged commit 7f07181 into master Oct 3, 2017

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
continuous-integration/travis-ci/pr The Travis CI build passed

@AlexDaniel AlexDaniel deleted the even-moar-jit branch Apr 27, 2018
