Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable data-flow analysis and optimization #58

Open
0x7CFE opened this issue Jun 16, 2014 · 1 comment
Open

Variable data-flow analysis and optimization #58

0x7CFE opened this issue Jun 16, 2014 · 1 comment
Labels

Comments

@0x7CFE
Copy link
Owner

0x7CFE commented Jun 16, 2014

JIT VM boosts up performance by eliminating value stack in context objects. Instead of using Smalltalk stack, we maintain values on the system level. Each PushX instruction generates an llvm::Value which is used by consumer instruction later.

The problem is that we could not simply hold the value in register or push it to the system stack. If garbage collection occurs it should update all references to the objects being moved including the one that was pushed.

We maintain shadow stack with a linked list that holds references to the temporary values so that GC can find them. However, storing value to the stack is time and space consuming operation. Moreover, most of intermediate values are used “in place” so backing them is redundant.

The natural way to suppress unneeded memory operations is to defer them as much as possible or even eliminate them completely.

In case of PushConstant, PushLiteral and PushArgument instructions we have data that is not changed during the method execution. The only way pointer to such data may be changed is the GC operation. Thus we may easily defer the obtaining operation of the value when it is actually needed by other instruction that originally takes the value from the context stack.

However, instructions such as PushTemporary, PushInstance and MarkArguments may not be deferred easily. Consider the following disassembly:

; Preserving the current value on the stack
PushTemporary 1 

; Checking condition
PushArgument 1
DoSpecial branchIfFalse: <skip offset>

; Value modifying code
PushTemporary 1
PushConstant 1
SendBinary +
AssignTemporary 1
popTop

; <skip offset> points here
; Code using the temporary
MarkArguments 1
...

This code reminds the post increment idiom that may be found in C-like languages. We store the original value, then increment variable and pass backed value to the subsequent code. However in this case we have a condition which selects which value to pass: original or modified.

IfTrue branch has the side effect which modifies temp variable depending on the argument's value operands. If PushTemporary is deferred without precaution it may break the logic of the program.

In this particular situation we need to store variables value into the memory and then reload value instead of variable. On the other hand, for all situations where variable is not modified between the PushTemporary and value consuming sites, we may simply read the value of the current variable. It will improve speed and maintenance of data locality which is especially useful in loops.

Correct implementation requires strict data-flow analysis to be performed on the control flow graph. Strictly speaking, we may defer value load if all paths from the PushX instruction to the use instruction do not contain AssignX instructions it it.

Still, we could relatively easy deduce the data-flow only for temporaries (not instance variables) and for methods without nested blocks. Check out this method:

test |temp block|
    temp <- 42.
    block <- [ temp <- temp + 1 ].
    temp <- temp + (self sendSomeMessageUsing: block).

Without proper type inference we could not say, whether variable will or will not be modified inside the expression. Even with it (as it was mentioned before) we may only detect the condition without trying to predict the outcome.

Therefore, in methods with inline blocks containing variable assignment instructions we should always treat PushTemporary instructions for that variable as non-deferrable.

Instance variables are even more hard to optimize because we have no compact lexical context to analyze. Interprocedural analysis on a fully dynamic language became really hard.

@0x7CFE 0x7CFE changed the title Temporary variable access analysis and optimization Variable data-flow analysis and optimization Jun 16, 2014
@0x7CFE
Copy link
Owner Author

0x7CFE commented Jun 18, 2016

Implemented in #92 almost without limitations on the blocks and temporaries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant