-
Notifications
You must be signed in to change notification settings - Fork 63
Add On-Stack Replacement for Sulong. #859
base: master
Are you sure you want to change the base?
Conversation
|
||
import com.oracle.truffle.api.nodes.ControlFlowException; | ||
|
||
public class LLVMBreakException extends ControlFlowException{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make this class final
.
|
||
@Child private LoopNode loop; | ||
|
||
@CompilationFinal private final Integer[] successors; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect there's a reason why this isn't an int[]
. Please document it in a comment.
|
||
@Child private LoopNode loop; | ||
|
||
@CompilationFinal private final Integer[] successors; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the entries of this array constant as well? If so, you should change this to @CompilationFinal(dimensions = 1)
.
} | ||
|
||
CompilerDirectives.transferToInterpreter(); | ||
throw new IllegalStateException("Must not reach here!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? You could make this error message more descriptive by e.g. stating that you did not find a loop successor.
} | ||
} catch (FrameSlotTypeException e) { | ||
CompilerDirectives.transferToInterpreter(); | ||
throw new RuntimeException("Error while reading from loop successor frame slot - type mismatch!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can include e
as cause for the RunTimeException
in the call to its constructor.
} | ||
} catch (FrameSlotTypeException e) { | ||
CompilerDirectives.transferToInterpreter(); | ||
throw new RuntimeException("Error while reading from loop successor frame slot - type mismatch!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already mentioned this above, but a more descriptive error message would come in handy for debugging. In general, since this class is in large parts a duplicate of LLVMDispatchBasicBlockNode
it would be preferable to have the common parts in a common baseclass, but it may be hard to accomplish this without any performance regressions. Perhaps you could look into this as future work.
|
||
@Override | ||
public boolean executeRepeating(VirtualFrame frame) { | ||
Object ret = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could replace this method body with return bodyNode.executeI1(frame)
. Instead of a RuntimeException
that would also throw the IMHO more fitting UnexpectedResultException
.
@@ -109,21 +117,83 @@ public RootCallTarget convert() { | |||
method.getParameters().size(), symbols, | |||
method, liveness, notNullable, dbgInfoHandler); | |||
method.accept(visitor); | |||
|
|||
LLVMControlFlowGraph cfg = new LLVMControlFlowGraph(method.getBlocks().toArray(new InstructionBlock[method.getBlocks().size()])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not necessary to pass a preallocated array to toArray
. Its safer to instead use a constant empty array like FunctionDefinition.EMPTY
.
FrameSlot[][] nullableAfterBlock, LLVMSourceLocation location, LLVMStatementNode[] copyArgumentsToFrameArray, UniquesRegion uniquesRegion) { | ||
|
||
if (cfg.getCFGLoops().size() > 0) { | ||
loopSuccessorSlot = frame.addFrameSlot("__LLVMLoopSuccessor", FrameSlotKind.Int); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sulong-internal frameslots usually use an identifier in the form "". See e.g. LLVMStack.FRAME_ID
or LLVMUserException.FRAME_SLOT_ID
. This is by no means an explicitly enforced convention, but I think doing this avoids name conflicts and makes debugging easier.
loopSuccessorSlot = frame.addFrameSlot("__LLVMLoopSuccessor", FrameSlotKind.Int); | ||
} | ||
|
||
LLVMStatementNode[] nodeArray = nodes.toArray(new LLVMStatementNode[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use LLVMStatementNode.NO_STATEMENTS
instead of explicitly allocating an array here.
|
||
LLVMStatementNode[] nodeArray = nodes.toArray(new LLVMStatementNode[0]); | ||
|
||
for (CFGLoop l : cfg.getCFGLoops()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid single-letter variable names if the values are used more often or not obviously in your loop body.
LLVMStatementNode loop = runtime.getNodeFactory().createLoop(loopBody, l.getSuccessorIDs()); | ||
|
||
// replace header block with loop node | ||
nodes.remove(headerId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make removing and adding an atomic operation by calling nodes.set(headerId, loop)
instead.
// add body nodes | ||
int i = 1; | ||
for (CFGBlock b : l.getBody()) { | ||
bodyNodes.add(nodeArray[b.id]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you keep nodeArray
and nodes
in sync in any case, I don't see a need for nodeArray
.
@@ -93,11 +98,36 @@ public Object executeGeneric(VirtualFrame frame) { | |||
int backEdgeCounter = 0; | |||
outer: while (basicBlockIndex != LLVMBasicBlockNode.RETURN_FROM_FUNCTION) { | |||
CompilerAsserts.partialEvaluationConstant(basicBlockIndex); | |||
LLVMBasicBlockNode bb = bodyNodes[basicBlockIndex]; | |||
|
|||
if(bodyNodes[basicBlockIndex] instanceof LLVMLoopNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, a loop node should be a terminating instruction - I think that this would fit more nicely into the current structure as all terminating instructions modify the basic block index.
loop.execute(frame); | ||
|
||
Integer[] successors = loop.getSuccessors(); | ||
for(int i = 0; i < successors.length; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the iteration should stop at successors.length - 1 and there should be some explicit code to handle the the last successor (see the pattern used for the LLVMSwitchNode below). Otherwise, the graph will look a bit worse.
private final FrameSlot exceptionValueSlot; | ||
private final int headerId; | ||
@Children private final LLVMStatementNode[] bodyNodes; | ||
@CompilationFinal private final Integer[] indexMapping; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why you are using Integer instead of int? (same applies to loopSuccessors).
With Integer[], you would need @CompilationFinal(dimensions = 1) - otherwise, the values are not compilation final (also applies to loopSuccessors).
@ExplodeLoop | ||
private boolean isInLoop(int bci) { | ||
for(int i : loopSuccessors) { | ||
if(i == bci) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please always use braces
|
||
@Child private LoopNode loop; | ||
|
||
@CompilationFinal private final Integer[] successors; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see other comment regarding int vs. Integer and @CompilationFinal(dimensions = 1)
@Override | ||
@ExplodeLoop(kind = LoopExplosionKind.MERGE_EXPLODE) | ||
public Object executeGeneric(VirtualFrame frame) { | ||
boolean ret = false; // helper variable to not return at first encounter of headerId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something like firstIteration would be more readable and would make the comment unnecessary
CompilerAsserts.compilationConstant(bodyNodes.length); | ||
int basicBlockIndex = headerId; | ||
|
||
outer: while (isInLoop(basicBlockIndex) && (basicBlockIndex != headerId || !ret)) { // do-while loop fails at PE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general, comments are more readable if they are on a separate line
|
||
ret = true; | ||
|
||
if(bodyNodes[indexMapping[basicBlockIndex]] instanceof LLVMLoopNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment in LLVMDispatchBasicBlockNode
} | ||
|
||
@ExplodeLoop | ||
private static void executePhis(VirtualFrame frame, LLVMControlFlowNode controlFlowNode, int successorIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
share the helper functions that are down here with the LLVMDispatchBasicBlockNode (this will not impact PE)
if (active.contains(inner)) { | ||
// catches case that there is a stack overflow because two loop nodes are being called iteratively | ||
// from one another, without one being left beforehand | ||
throw new RuntimeException("Irreducible nestedness!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw a custom exception and catch it in the build method (applies to all bailouts in here)
private static final int LOOP_HEADER_INITIAL_CAPACITY = 4; | ||
|
||
private CFGBlock[] blocks; | ||
private CFGLoop[] cfgLoops = new CFGLoop[LOOP_HEADER_INITIAL_CAPACITY]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need an array that you resize manually? I think you should use an ArrayList
instead, if not to make your code simpler than at least to make it safer.
indexMapping[b.id] = i++; | ||
} | ||
|
||
LLVMExpressionNode loopBody = runtime.getNodeFactory().createLoopDispatchNode(frame.findFrameSlot(LLVMUserException.FRAME_SLOT_ID), Collections.unmodifiableList(bodyNodes), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSuccessorIDs
allocates a new array with the same content on each call. Since, I assume, the contents of the returned arrays are expected to be identical, you should only call the method once and pass the same array to createLoop
and createLoopDispatchNode
.
calculateSuccessors(); | ||
} | ||
|
||
Integer[] sucIDs = new Integer[successors.size()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the order of elements in sucIDs
guaranteed by your loop detection algorithm? These seem like they should be sorted in some way.
List<CFGLoop> loops = new ArrayList<>(); | ||
|
||
for (CFGLoop l : cfgLoops) { | ||
if (l != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The style gate will probably catch this. Please always use braces to delimit the bodies of conditional branches even if they consist only of a single line.
if (l != null) | ||
loops.add(l); | ||
} | ||
loops.removeIf(l -> l == null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement is unnecessary since you already make sure not to add any null
elements to the list.
|
||
import com.oracle.truffle.api.nodes.ControlFlowException; | ||
|
||
public class LLVMBreakException extends ControlFlowException{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this class unused?
Integer[] successors = loop.getSuccessors(); | ||
for(int i = 0; i < successors.length; i++) { | ||
try { | ||
if(frame.getInt(loopSuccessorSlot) == successors[i]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for accessing the frame, you can use FrameUtil.getIntSafe(...)
@@ -0,0 +1,265 @@ | |||
/* | |||
* Copyright (c) 2017, 2018, Oracle and/or its affiliates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sure that you have the correct copyright year in all files that you added
e3a1818
to
448838a
Compare
* Distinction between low level basic blocks and high level control flow | ||
* structures, like LLVMLoopNode | ||
*/ | ||
if(bodyNodes[basicBlockIndex] instanceof LLVMBasicBlockNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check will probably have a noticeable impact on performance. It should not be necessary either.
This PR adds On-Stack Replacement (OSR) for Sulong.
Based on the block based structure of LLVM IR bitcodes, loops are detected and for each loop a new node is created, which conforms to Truffle interface for loop nodes and replaces the basic block which was identified as the header of the loop.
Thus, Graal can replace loops during (interpreted) execution with optimized (compiled) versions, yielding a faster warmup. However, due to additional nodes, peak performance can be negatively affected.
Parameters (with default values) for playing around with OSR: