Skip to content

Commit

Permalink
compiler: super early exploration of instruction reordering
Browse files Browse the repository at this point in the history
See comments in InstructionReordering.ts. This needs substantial iteration before landing in some form, just putting up to share for discussion.

ghstack-source-id: 765c47c3d6ce37af06de307b71af5a8c2f4a8fa5
Pull Request resolved: #29579
  • Loading branch information
josephsavona committed May 29, 2024
1 parent 867edc6 commit c8c0495
Show file tree
Hide file tree
Showing 59 changed files with 926 additions and 800 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ import {
deadCodeElimination,
pruneMaybeThrows,
} from "../Optimization";
import { instructionReordering } from "../Optimization/InstructionReordering";
import {
CodegenFunction,
alignObjectMethodScopes,
Expand Down Expand Up @@ -195,6 +196,9 @@ function* runWithEnvironment(
deadCodeElimination(hir);
yield log({ kind: "hir", name: "DeadCodeElimination", value: hir });

instructionReordering(hir);
yield log({ kind: "hir", name: "InstructionReordering", value: hir });

pruneMaybeThrows(hir);
yield log({ kind: "hir", name: "PruneMaybeThrows", value: hir });

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -737,6 +737,12 @@ class InferenceState {
* For debugging purposes, dumps the state to a plain
* object so that it can printed as JSON.
*/
inspect(): any {
return {
values: this.#values,
variables: this.#variables,
};
}
debug(): any {
const result: any = { values: {}, variables: {} };
const objects: Map<InstructionValue, number> = new Map();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -441,7 +441,7 @@ function evaluateInstruction(
}
case "LoadLocal": {
const placeValue = read(constants, value.place);
if (placeValue !== null) {
if (placeValue != null) {
instr.value = placeValue;
}
return placeValue;
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,322 @@
import {
BasicBlock,
Environment,
HIRFunction,
IdentifierId,
Instruction,
markInstructionIds,
} from "../HIR";
import { printFunction, printInstruction } from "../HIR/PrintHIR";
import {
eachInstructionValueLValue,
eachInstructionValueOperand,
eachTerminalOperand,
} from "../HIR/visitors";
import { mayAllocate } from "../ReactiveScopes/InferReactiveScopeVariables";
import { getOrInsertDefault } from "../Utils/utils";

/**
* WIP early exploration of instruction reordering. This is a fairly aggressive form and has
* some issues. The idea of what's implemented:
*
* The high-level approach is to build a dependency graph where nodes generally correspond
* either to instructions OR to particular lvalue assignments of an expresssion. So
* `Destructure [x, y] = z` creates 3 nodes: one for the instruction, and one each for x and y.
* The lvalue nodes depend on the instruction node that assigns them.
*
* We add dependency edges for all the rvalues/lvalues of each instruction. In addition, we
* implicitly add dependencies btw non-reorderable instructions (more on that criteria) to
* serialize any instruction where order might be observable.
*
* We then distinguish two types of instructions that are reorderable:
* - Primitives, JSXText, JSX elements, and globals can be *globally* reordered, ie across blocks.
* We defer emitting them until they are first used globally.
* - Array and object expressions are reorderable within basic blocks. This could likely be relaxed to be global.
* - StoreLocal, LoadLocal, and Destructure are reorderable within basic blocks. However, we serialize all
* references to each named variable (reads and writes) to ensure that we aren't changing the order of evaluation
* of variable references.
*
* The variable reordering relies on the fact that any variables that could be reassigned via a function expression
* are promoted to "context" variables and use LoadContext/StoreContext, which are not reorderable.
*
* In theory it might even be safe to do this variable reordering globally, but i want to think through that more.
*
* With the above context, the algorithm is approximately:
* - For each basic block:
* - Iterate the instructions to create the dependency graph
* - Re-emit instructions, "pulling" from all the values that are depended upon by the block's terminal.
* - Emit any remaining instructions that cannot be globally reordered, starting from later instructions first.
* - Save any globally-reorderable instructions into a global map that is shared across blocks, so they can be
* emitted by the first block that needs them.
*
* Emitting instructions is currently naive: we just iterate in the order that the dependencies were established.
* If instruction 4 depends on instructions 1, 2, and 3, we'll visit in depth-first order and emit 1, 2, 3, 4.
* That's true even if instruction 1 and 2 are simple instructions (for ex primitives) while instruction 3 has its
* own large dependency tree.
*
* ## Issues/things to explore:
*
* - An obvious improvement is to weight the nodes and emit dependencies based on weight. Alternatively, we could try to
* determine the reactive dependencies of each node, and try to emit nodes that have the same dependencies together.
* - Reordering destructure statements means that we also end up deferring the evaluation of its RHS. So i noticed some
* `const [state, setState] = useState(...)` getting moved around. But i think i might have just messed up the bit that
* ensures non-reorderable instructions (like the useState() call here) are serialized. So this should just be a simple fix,
* if i didn't already fix it (need to go back through the fixture output changes)
* - I also noticed that destructuring being moved meant that some reactive scopes ended up with less precise input, because
* the destructure moved into the reactive scope itself (so the scope depends on the rvalue of the destructure, not the lvalues).
* This is weird, i need to debug.
* - Probably more things.
*/
export function instructionReordering(fn: HIRFunction): void {
DEBUG && console.log(printFunction(fn));
const globalDependencies: Dependencies = new Map();
for (const [, block] of fn.body.blocks) {
reorderBlock(fn.env, block, globalDependencies);
}
markInstructionIds(fn.body);
DEBUG && console.log(printFunction(fn));
}

const DEBUG = false;

type Dependencies = Map<IdentifierId, Node>;
type Node = {
instruction: Instruction | null;
dependencies: Array<IdentifierId>;
depth: number | null;
};

function reorderBlock(
env: Environment,
block: BasicBlock,
globalDependencies: Dependencies
): void {
DEBUG && console.log(`bb${block.id}`);
const dependencies: Dependencies = new Map();
const locals = new Map<string, IdentifierId>();
let previousIdentifier: IdentifierId | null = null;
for (const instr of block.instructions) {
const node: Node = getOrInsertDefault(
dependencies,
instr.lvalue.identifier.id,
{
instruction: instr,
dependencies: [],
depth: null,
}
);
if (getReorderingLevel(instr) === ReorderingLevel.None) {
if (previousIdentifier !== null) {
node.dependencies.push(previousIdentifier);
}
previousIdentifier = instr.lvalue.identifier.id;
}
for (const operand of eachInstructionValueOperand(instr.value)) {
if (
operand.identifier.name !== null &&
operand.identifier.name.kind === "named"
) {
const previous = locals.get(operand.identifier.name.value);
if (previous !== undefined) {
node.dependencies.push(previous);
}
locals.set(operand.identifier.name.value, instr.lvalue.identifier.id);
} else {
if (dependencies.has(operand.identifier.id) || globalDependencies.has(operand.identifier.id)) {
node.dependencies.push(operand.identifier.id);
}
}
}
dependencies.set(instr.lvalue.identifier.id, node);

for (const lvalue of eachInstructionValueLValue(instr.value)) {
const lvalueNode = getOrInsertDefault(
dependencies,
lvalue.identifier.id,
{
instruction: null,
dependencies: [],
depth: null,
}
);
lvalueNode.dependencies.push(instr.lvalue.identifier.id);
if (
lvalue.identifier.name !== null &&
lvalue.identifier.name.kind === "named"
) {
const previous = locals.get(lvalue.identifier.name.value);
if (previous !== undefined) {
node.dependencies.push(previous);
}
locals.set(lvalue.identifier.name.value, instr.lvalue.identifier.id);
}
}
}

function getDepth(env: Environment, id: IdentifierId): number {
const node = dependencies.get(id);
if (node == null) {
return 0;
}
if (node.depth !== null) {
return node.depth;
}
node.depth = 0;
let depth =
node.instruction != null && mayAllocate(env, node.instruction) ? 1 : 0;
for (const dep of node.dependencies) {
depth += getDepth(env, dep);
}
node.depth = depth;
return depth;
}

const instructions: Array<Instruction> = [];

function print(
id: IdentifierId,
seen: Set<IdentifierId>,
depth: number = 0
): void {
const node = dependencies.get(id) ?? globalDependencies.get(id);
if (node == null || seen.has(id)) {
DEBUG && console.log(`${"\t|".repeat(depth)} skip $${id}`);
return;
}
seen.add(id);
node.dependencies.sort((a, b) => {
const aDepth = getDepth(env, a);
const bDepth = getDepth(env, b);
return bDepth - aDepth;
});
for (const dep of node.dependencies) {
print(dep, seen, depth + 1);
}
DEBUG && console.log(`${"\t|".repeat(depth)} ${printNode(id, node)}`);
}
const seen = new Set<IdentifierId>();
if (DEBUG) {
for (const operand of eachTerminalOperand(block.terminal)) {
print(operand.identifier.id, seen);
}
for (const id of Array.from(dependencies.keys()).reverse()) {
print(id, seen);
}
}

function emit(id: IdentifierId): void {
const node = dependencies.get(id) ?? globalDependencies.get(id);
if (node == null) {
return;
}
dependencies.delete(id);
globalDependencies.delete(id);
node.dependencies.sort((a, b) => {
const aDepth = getDepth(env, a);
const bDepth = getDepth(env, b);
return bDepth - aDepth;
});
for (const dep of node.dependencies) {
emit(dep);
}
if (node.instruction !== null) {
instructions.push(node.instruction);
}
}

for (const operand of eachTerminalOperand(block.terminal)) {
DEBUG && console.log(`terminal operand: $${operand.identifier.id}`);
emit(operand.identifier.id);
}
/**
* Gross hack: for value blocks we want the terminal operand to be emitted last, since that's its value.
* For other blocks the exact order doesn't matter, we assume instructions whose values aren't depended
* upon by the block terminal are used later, so it makes sense to order them last.
*/
const index = instructions.length;
for (const id of Array.from(dependencies.keys()).reverse()) {
const node = dependencies.get(id);
if (node == null) {
continue;
}
if (
node.instruction !== null &&
getReorderingLevel(node.instruction) === ReorderingLevel.Global &&
(block.kind === 'block' || block.kind === 'catch')
) {
globalDependencies.set(id, node);
DEBUG && console.log(`global: $${id}`);
} else {
DEBUG && console.log(`other: $${id}`);
emit(id);
}
}
if (block.kind !== 'block' && block.kind !== 'catch') {
const extra = instructions.splice(index);
instructions.splice(0, 0, ...extra);
}
block.instructions = instructions;
DEBUG && console.log();
}

function printDeps(deps: Dependencies): string {
return (
"[\n" +
Array.from(deps)
.map(([id, dep]) => printNode(id, dep))
.join("\n") +
"\n]"
);
}

function printNode(id: number, node: Node): string {
if (
node.instruction != null &&
node.instruction.value.kind === "FunctionExpression"
) {
return `$${id} FunctionExpression deps=[${node.dependencies
.map((x) => `$${x}`)
.join(", ")}]`;
}
return `$${id} ${
node.instruction != null ? printInstruction(node.instruction) : ""
} deps=[${node.dependencies.map((x) => `$${x}`).join(", ")}] depth=${
node.depth
}`;
}

enum ReorderingLevel {
None = "none",
Local = "local",
Global = "global",
}
function getReorderingLevel(instr: Instruction): ReorderingLevel {
switch (instr.value.kind) {
case "JsxExpression":
case "JsxFragment":
case "JSXText":
case "LoadGlobal":
case "Primitive":
case "TemplateLiteral": {
return ReorderingLevel.Global;
}
/*
For locals, a simple and robust strategy is to figure out the range of instructions where the identifier may be reassigned,
and then allow reordering of LoadLocal instructions which occur after this range. Obviously for const, this means that all
LoadLocals can be reordered, so a simpler thing to start with is just to only allow reordering of loads of known-consts.
With this overall strategy we can allow global reordering of LoadLocals and remove the global/local reordering distinction
(all reordering can be global).
case "Destructure":
case "StoreLocal":
case "LoadLocal":
{
return ReorderingLevel.Local;
}
*/
default: {
return ReorderingLevel.None;
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -165,7 +165,10 @@ export function isMutable({ id }: Instruction, place: Place): boolean {
return id >= range.start && id < range.end;
}

function mayAllocate(env: Environment, instruction: Instruction): boolean {
export function mayAllocate(
env: Environment,
instruction: Instruction
): boolean {
const { value } = instruction;
switch (value.kind) {
case "Destructure": {
Expand Down
Loading

0 comments on commit c8c0495

Please sign in to comment.