code generation
Here is an explanation of how Factor generates assembly code for alien
FFI calls. The process is complicated and a lot of code is generated.
Take the sk_value word as an example:
IN: scratchpad \ sk_value see
USING: alien.c-types alien.syntax ;
IN: openssl.libssl
LIBRARY: libssl FUNCTION: void* sk_value
( _STACK* s, int ) ) ; inline
Its assembly code can be dumped with disassemble:
IN: scratchpad \ sk_value disassemble
00007f627b5e86b0: 89054ac917ff mov [rip-0xe836b6], eax
00007f627b5e86b6: 4883ec18 sub rsp, 0x18
00007f627b5e86ba: 4983c708 add r15, 0x8
.... and so on ...
00007f627b5e87da: 0000 add [rax], al
00007f627b5e87dc: 0000 add [rax], al
00007f627b5e87de: 0000 add [rax], al
Below I will explain the purpose of each group of assembly instructions:
00007fb95ea6b700: 8905fac817ff mov [rip-0xe83706], eax
This seemingly weird instruction is a "safe point." The compiler emits safe points at locations in the code where it considers it safe to switch to another thread, that is, where no data is left hanging in the air.
The addresses above are what I get on my machine in one particular run and are most likely different from yours.
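The operand is rip-relative, and while an instruction executes, rip
holds the address of the next instruction. The mov is six bytes long and
starts at 0x7fb95ea6b700, so rip has the value 0x7fb95ea6b706 and the
instruction stores the contents of the eax register at memory address:
0x7fb95ea6b706 - 0xe83706 = 0x7fb95dbe8000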
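The address arithmetic can be checked with a few lines of C. This is only a sketch: the function names decode_disp32 and rip_relative_target are made up for illustration. It decodes the 32-bit displacement from the last four instruction bytes (fa c8 17 ff, little-endian) and adds it to the address of the following instruction, which is the base x86-64 uses for rip-relative operands.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a little-endian signed 32-bit displacement from four bytes. */
int32_t decode_disp32(const uint8_t *p) {
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)u;
}

/* rip-relative operands are resolved against the address of the next
   instruction, i.e. the instruction's own address plus its length. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp) {
    assert(insn_len >= 1 && insn_len <= 15); /* x86 instruction length limits */
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

For the bytes 89 05 fa c8 17 ff located at 0x7fb95ea6b700, the displacement decodes to -0xe83706 and the target comes out as 0x7fb95dbe8000, which is page-aligned as one would expect of a dedicated page.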
That is the address of the safe point page. Most of the time the store
is a complete dummy which has no effect, but when ctrl+c is pressed to
interrupt the running thread, the safe point page is made inaccessible.
On Linux, that is accomplished with the mprotect call:
mprotect(safepoint_page, getpagesize(), PROT_NONE)
Then the thread continues to run. The next time a safe point is executed, SIGSEGV occurs because you can't write to protected memory, and the thread is interrupted, which was the whole point of pressing ctrl+c.
Why not interrupt the thread directly on SIGINT? Because the thread may be "in the middle of something," so Factor is nice and lets it run to the next safe point before interrupting.
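The mechanism can be tried out in isolation. The following is a minimal sketch for Linux, not Factor's actual signal handling: all names (safepoint_init, safepoint_arm, safepoint_poll, safepoint_take_interrupt) are hypothetical, and the handler simply records the interrupt and re-enables the page so that the faulting store succeeds when it is retried. It also relies on mprotect being usable from a signal handler, which works on Linux but is not guaranteed by POSIX.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile char *safepoint_page;       /* hypothetical global */
static volatile sig_atomic_t interrupted;   /* set by the SIGSEGV handler */

/* Runs when a safe point store faults: note the interrupt and make the
   page writable again so the store succeeds when it is retried. */
static void on_segv(int sig, siginfo_t *info, void *uctx) {
    (void)sig;
    (void)uctx;
    if (info->si_addr != (void *)safepoint_page)
        _exit(1);                           /* a genuine crash, give up */
    interrupted = 1;
    mprotect((void *)safepoint_page, (size_t)getpagesize(),
             PROT_READ | PROT_WRITE);
}

/* Map the page and install the handler. Returns 0 on success. */
int safepoint_init(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    void *p = mmap(NULL, (size_t)getpagesize(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    safepoint_page = p;
    return 0;
}

/* What the ctrl+c path does: revoke all access to the page. */
int safepoint_arm(void) {
    assert(safepoint_page != NULL);
    return mprotect((void *)safepoint_page, (size_t)getpagesize(), PROT_NONE);
}

/* What compiled code does at each safe point: a dummy store. */
void safepoint_poll(void) {
    assert(safepoint_page != NULL);
    *safepoint_page = 0;
}

/* Returns 1 once if a safe point faulted since the last call. */
int safepoint_take_interrupt(void) {
    int seen = interrupted;
    interrupted = 0;
    return seen;
}
```

Calling safepoint_poll() is free of side effects until safepoint_arm() has been called. The first poll after that traps into the handler, and safepoint_take_interrupt() then reports the interrupt exactly once.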
sub rsp, 0x18
The next instruction decrements the stack pointer kept in the rsp register by 24 bytes. Think of it as the function declaring space for three local variables (which are all longs and 8 bytes in size):
void some_func() {
    long a, b, c;
}
The variables are used in the forthcoming assembly code when there is not enough space in the cpu registers to keep all values around.
Remember that the cpu stack grows downwards, which is why the sub
instruction is used. The two other stacks, the data stack and the retain
stack, are managed by Factor and in contrast grow upwards.
Remember the previously shown definition of the sk_value word? It's
fairly simple -- it just takes two parameters from the stack and puts
back the result. Behind the scenes, a few more things are
happening. First the stack elements, which are Factor objects, need
to be coerced to C compatible types. You can see exactly what code is
used for coercing the parameters with this:
IN: scratchpad T{ alien-invoke-params { parameters { void* int } } } param-prep-quot
[ [ [ ] dip >c-ptr ] dip >fixnum ]
This code is inserted during compilation, and Factor is then smart
enough to optimize it, for example by omitting the useless [ ] dip
construct. You end up with these assembly instructions:
add r15, 0x8
sub r14, 0x8
mov rcx, [r14+0x8]
mov [r15], rcx
call 0x7fb95eb17640 (>c-ptr)
sub r15, 0x8
add r14, 0x8
mov rcx, [r15+0x8]
mov [r14], rcx
call 0x7fb95ec72760 (>fixnum)
sub r14, 0x8
The r14 register holds the data stack pointer and r15 the retain stack pointer. Both stacks grow upwards. The first four instructions are just stack shuffling: the topmost item on the data stack is popped and pushed to the retain stack, with rcx used for temporary storage.
Then the >c-ptr word is called with the top of the data stack
containing the Factor value to be converted to a void*.
A similar dance is performed for the >fixnum call. The end result is
that the two items on the data stack are the return values of the two
coercion calls.
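The two shuffles can be modelled in C to make the pointer arithmetic easier to follow. This is an illustrative sketch only: the stacks struct and the function names are hypothetical, ds stands in for r14 and rs for r15, and a cell is assumed to be 8 bytes so that moving a pointer by one cell matches the add/sub of 0x8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t cell;

/* ds plays the role of r14 and rs of r15; each points at the top item. */
typedef struct {
    cell *ds;
    cell *rs;
} stacks;

/* Pop the top of the data stack and push it on the retain stack. */
void data_to_retain(stacks *s) {
    assert(s != NULL);
    s->rs += 1;            /* add r15, 0x8       */
    s->ds -= 1;            /* sub r14, 0x8       */
    cell rcx = s->ds[1];   /* mov rcx, [r14+0x8] */
    s->rs[0] = rcx;        /* mov [r15], rcx     */
}

/* The mirror image: pop the retain stack and push on the data stack. */
void retain_to_data(stacks *s) {
    assert(s != NULL);
    s->rs -= 1;            /* sub r15, 0x8       */
    s->ds += 1;            /* add r14, 0x8       */
    cell rcx = s->rs[1];   /* mov rcx, [r15+0x8] */
    s->ds[0] = rcx;        /* mov [r14], rcx     */
}
```

Because both stacks grow upwards, a push increments the pointer before the store and a pop decrements it, with the popped value still readable one cell above the new top.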
mov rdi, [rsp]
mov rsi, [rsp+0x8]
rdi and rsi carry the first two integer or pointer arguments in the
x86-64 C calling convention (see Wikipedia for details). sk_value
expects the caller to put the two parameter values in those registers.
xor rax, rax
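This instruction looks strange because rax will be clobbered by
sk_value anyway. One plausible explanation is that the System V x86-64
ABI requires al to hold an upper bound on the number of vector registers
used when a variadic function is called, and clearing rax before every C
call satisfies that rule.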
mov r11, 0x7f627a14a610
call r11
The address of the sk_value function is loaded into the r11 register
and then called. More about this strange indirection later.
mov [rsp+0x10], rax
sk_value puts its return value, which is a void pointer, in the rax
register (in accordance with the previously linked conventions). The mov
saves it in the third of the stack slots reserved earlier.
Object allocation is performed in two steps: first Factor checks whether there is enough space in the nursery. If there isn't, garbage is collected, and only then is a new object allocated.
lea rcx, [r13+0x10]
Register r13 always holds a pointer to the running VM, so this
instruction loads the address of the field at offset 0x10 in the VM
struct into the rcx register. That happens to be the nursery_space.
mov rbx, [rcx]
The pointer is dereferenced, which puts the value of the first field of
nursery_space into rbx. Since nursery_space is a bump_allocator, that
field is called here.
add rbx, 0x30
cmp rbx, [rcx+0x10]
jle 0x7f627b5e8780 {skip-minor-gc}
(Actually, the disassembly doesn't show jump labels like the one above; I'm adding them here for clarity.)
Now it gets interesting. The first instruction just adds 48 to rbx, and
the second compares that sum to end, the third field of the
bump_allocator that the rcx register references. These five
instructions correspond to this C code:
if (vm->nursery.here + 0x30 <= vm->nursery.end) {
...
} else {
...
}
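To make the field offsets concrete, here is a small sketch of the check as a standalone C function. The text only establishes that here is the first field and end the third, so the names start and size for the other two fields are assumptions, as is the helper name nursery_has_room. Offsets in the comments assume 8-byte cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t cell;

/* here is the first field and end the third, as described in the text.
   The names start and size are assumptions. */
typedef struct {
    cell here;   /* next free address              (offset 0x00) */
    cell start;  /* assumed: start of the area     (offset 0x08) */
    cell end;    /* first address past the area    (offset 0x10) */
    cell size;   /* assumed: size of area in bytes (offset 0x18) */
} bump_allocator;

/* Mirrors add/cmp/jle: nonzero when `bytes` more bytes still fit. */
int nursery_has_room(const bump_allocator *n, cell bytes) {
    assert(n != NULL);
    return n->here + bytes <= n->end;
}
```

With here at 0x1000 and end at 0x1040, a first 0x30-byte request fits, but a second one does not and would send the compiled code down the garbage collection path.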
The assembly code continues with the fall-through path, which runs when the condition is false:
mov rcx, [r13]
The first field of the VM is the context, which is now pointed to by
rcx. It's like a stash where you save stuff.
lea rbx, [rsp-0x8]
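Not sure about this. lea only computes the address rsp-0x8 without reading memory, and that is the slot where the upcoming call will push its return address, so I think rbx ends up pointing at the return address as seen from inside minor-gc.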
mov [rcx], rbx
mov [rcx+0x10], r14
mov [rcx+0x18], r15
r14 points to the datastack and r15 to the retainstack. So all three
values are stashed away in the context.
call 0x7f627aebd530 (minor-gc)
Performs a small garbage collection cycle. If you combine these assembly instructions with the previous ones you get this C code:
if (vm->nursery.here + 0x30 <= vm->nursery.end) {
    /* enough room, nothing to do */
} else {
    vm->context->callstack_top = ret_addr;
    vm->context->datastack = datastack;
    vm->context->retainstack = retainstack;
    minor_gc();
}
skip-minor-gc:
mov rbx, [rsp+0x10]
mov rax, 0x1
test rbx, rbx
jz 0x7f627b5e87c6 {00007f627b5e87c6}
After the last section, in which a garbage collection cycle may have
been run, Factor guarantees that there is enough space available to
allocate a wrapper for the void* that the FFI call returned. First
the value is moved into the rbx register and rax is preloaded with 0x1.
Then a check is made to see if the value is 0 or NULL. If it is, the
boxing part is skipped and rax keeps the 0x1, which appears to be
Factor's tagged representation of f.
lea rcx, [r13+0x10]
mov rax, [rcx]
The nursery's here pointer is moved into rax. The next instructions
bump that pointer by 48 bytes and fill in the memory at the address
pointed to with an object of type alien. For the layout of alien and
other objects, refer to vm/layouts.hpp.
mov qword [rax], 0x18
The first object cell is the header. It is generated by taking the type number of the object (6 in this case) and left shifting it by the number of object flag bits, which is always 2: 0x6 << 0x2 = 0x18.
or rax, 0x6
Factor uses a tagged pointer system in which the four lowest bits of any pointer are a tag describing what type of object it points to. So or:ing the object pointer in rax with 0x6 marks it as a pointer to an alien.
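The header and tag arithmetic is easy to reproduce. The sketch below uses hypothetical helper names and assumes that objects are aligned to 16 bytes, since that is what leaves the four lowest bits of an address free to carry the tag.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t cell;

enum { ALIEN_TYPE = 6, HEADER_SHIFT = 2, TAG_MASK = 15 };

/* Header cell: the type number shifted past the two flag bits. */
cell make_header(cell type) {
    return type << HEADER_SHIFT;
}

/* Store a type tag in the four lowest bits of an aligned address. */
cell tag_pointer(cell addr, cell tag) {
    assert((addr & TAG_MASK) == 0);  /* assumes 16-byte alignment */
    assert(tag <= TAG_MASK);
    return addr | tag;
}

/* Recover the tag and the plain address from a tagged pointer. */
cell tag_of(cell tagged) { return tagged & TAG_MASK; }
cell untag(cell tagged)  { return tagged & ~(cell)TAG_MASK; }
```

For an alien allocated at address 0x7000, make_header(ALIEN_TYPE) gives the 0x18 stored in the first cell and tag_pointer(0x7000, ALIEN_TYPE) gives 0x7006.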
add qword [rcx], 0x30
Since 0x30 bytes have been allocated, the nursery's here field needs
to be bumped.
mov qword [rax+0x2], 0x1
mov qword [rax+0xa], 0x1
mov [rax+0x12], rbx
mov [rax+0x1a], rbx
00007f627b5e87c6:
(Again: jump label inserted for clarity.)
Here values are moved into the second, third, fourth and fifth cells of the alien object. Remember that the pointer is tagged, which is why the offsets look weird: each one is the cell's byte offset minus the tag 0x6.
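The weird-looking displacements follow directly from the tag. A minimal sketch, assuming 8-byte cells and using a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>

enum { CELL_SIZE = 8 };  /* assumed cell size on x86-64 */

/* Displacement to use with a tagged pointer to reach cell `index`. */
int64_t tagged_slot_offset(unsigned index, unsigned tag) {
    assert(tag < 16);
    return (int64_t)index * CELL_SIZE - (int64_t)tag;
}
```

With tag 6, cells 1 through 4 land at displacements 0x2, 0xa, 0x12 and 0x1a, which are exactly the offsets used by the four mov instructions. The header at cell 0 is written before the pointer is tagged, so it uses displacement 0.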
Together, the assembly corresponds to this C:
b = sk_value(...);
if (vm->nursery.here + 0x30 <= vm->nursery.end) {
    /* enough room, skip the collection */
} else {
    vm->context->callstack_top = ret_addr;
    vm->context->datastack = datastack;
    vm->context->retainstack = retainstack;
    minor_gc();
}
cell ret = 1;
if (b != NULL) {
    /* box the pointer in a freshly allocated alien */
    cell *obj = (cell *)vm->nursery.here;
    obj[0] = 0x18;
    obj[1] = 0x1;
    obj[2] = 0x1;
    obj[3] = (cell)b;
    obj[4] = (cell)b;
    vm->nursery.here += 0x30;
    ret = (cell)obj | 0x6;
}
return ret;