bjourne edited this page Sep 21, 2014 · 11 revisions

Here is an explanation of how Factor generates assembly code for alien FFI calls. It's very complicated and a lot of code is generated. Take for example the sk_value word:

IN: scratchpad \ sk_value see
USING: alien.c-types alien.syntax ;
IN: openssl.libssl
LIBRARY: libssl FUNCTION: void* sk_value
    ( _STACK* s, int ) ) ; inline

Its assembly code can be dumped:

IN: scratchpad \ sk_value disassemble
00007f627b5e86b0: 89054ac917ff          mov [rip-0xe836b6], eax
00007f627b5e86b6: 4883ec18              sub rsp, 0x18
00007f627b5e86ba: 4983c708              add r15, 0x8
.... and so on ...
00007f627b5e87da: 0000                  add [rax], al
00007f627b5e87dc: 0000                  add [rax], al
00007f627b5e87de: 0000                  add [rax], al

Below I will explain what the purpose of each group of assembly instructions is:

Safe Points

00007fb95ea6b700: 8905fac817ff          mov [rip-0xe83706], eax

This seemingly weird instruction is a "safe point." Safe points are emitted by the compiler at locations in the code where it thinks it is safe to switch to another thread, that is, where no data is hanging in the air.

The addresses above are what I get on my machine in one particular run and are most likely different from yours.

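From the disassembly, we see that the mov instruction sits at 0x7fb95ea6b700 and is six bytes long. A rip-relative operand is resolved against the address of the next instruction, so rip holds 0x7fb95ea6b706 when the offset is applied, and the mov instruction puts the contents of the eax register in memory address:

0x7fb95ea6b706 - 0xe83706 = 0x7fb95dbe8000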
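As a hedged worked check of that address arithmetic: on x86-64 the displacement of a rip-relative operand is applied to the address of the following instruction, so the length of the mov (six bytes, going by the encoding 8905fac817ff in the listing) has to be included. The helper name below is made up for illustration.

```c
#include <stdint.h>

/* Address referenced by a rip-relative operand. The displacement is added
   to the address of the NEXT instruction: instruction address + length. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len, int32_t disp) {
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

Fed the values from the listing (address 0x7fb95ea6b700, length 6, displacement -0xe83706) this gives 0x7fb95dbe8000, which is page-aligned, as one would expect of an address that is handed to mprotect.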

That is the address of the safe point page. Most of the time it's a complete dummy instruction which has no effect but when ctrl+c is pressed to interrupt the running thread, the safe point page becomes write protected. On Linux, that is accomplished with the mprotect call:

mprotect(safepoint_page, getpagesize(), PROT_NONE)

Then the thread continues to run. The next time a safe point is executed, SIGSEGV occurs because the page no longer allows writes (PROT_NONE revokes all access). The signal interrupts the thread, which was the whole point of pressing ctrl+c.

Why not interrupt the thread directly on SIGINT? That's because the thread may be "in the middle of something" and Factor is nice and lets it run to the next safe point before interrupting.
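The mechanism can be tried out in isolation. The following is a minimal sketch for Linux, not Factor's actual signal handling: the page is allocated with mmap, sysconf(_SC_PAGESIZE) stands in for getpagesize(), and the handler simply jumps back out of the faulting store. The names faults_seen, on_fault and run_safepoint_demo are made up for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile int *safepoint_page;       /* same role as in the text */
static volatile sig_atomic_t faults_seen;  /* hypothetical counter */
static sigjmp_buf after_fault;

static void on_fault(int sig) {
    (void)sig;
    faults_seen++;
    siglongjmp(after_fault, 1);            /* leave the faulting store */
}

/* C stand-in for `mov [safepoint_page], eax`. */
static void safepoint(void) { *safepoint_page = 0; }

/* Returns how many safe points trapped, or -1 if the setup failed. */
int run_safepoint_demo(void) {
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return -1;
    safepoint_page = page;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0) return -1;  /* some systems raise SIGBUS */

    faults_seen = 0;
    safepoint();                                          /* writable: no effect */
    if (mprotect(page, size, PROT_NONE) != 0) return -1;  /* "ctrl+c" arrives */
    if (sigsetjmp(after_fault, 1) == 0)
        safepoint();                                      /* traps into on_fault */
    if (mprotect(page, size, PROT_READ | PROT_WRITE) != 0) return -1;
    safepoint();                                          /* harmless again */
    munmap(page, size);
    return (int)faults_seen;
}
```

Only the second of the three safepoint() calls should trap, since the page is writable before the first one and after the third.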

Stack Pointer

sub rsp, 0x18

The next instruction decrements the stack pointer kept in the rsp register by 24 bytes. Think of it as the function declaring space for three local variables (which are all longs and 8 bytes in size):

void some_func() {
    long a, b, c;
}

The variables will be used in the forthcoming assembler code when there is not enough space in the cpu registers to keep all values around.

Remember that the cpu stack grows downwards, which is why the sub instruction is used. The two other stacks, which are managed by Factor, are the data and retain stacks, and they in contrast grow upwards.

Parameter Coercion

Remember the previously shown definition of the sk_value word? It's fairly simple -- it just takes two parameters from the stack and puts back the result. Behind the scenes, a few more things are happening. First the stack elements, which are Factor objects, need to be coerced to C compatible types. You can see exactly what code is used for coercing the parameters with this:

IN: scratchpad T{ alien-invoke-params { parameters { void* int } } } param-prep-quot
[ [ [ ] dip >c-ptr ] dip >fixnum ]

This code is inserted during compilation and Factor is then smart enough to optimize it, for example by omitting the useless [ ] dip construct. You end up with these assembly instructions:

add r15, 0x8
sub r14, 0x8
mov rcx, [r14+0x8]
mov [r15], rcx
call 0x7fb95eb17640 (>c-ptr)
sub r15, 0x8
add r14, 0x8
mov rcx, [r15+0x8]
mov [r14], rcx
call 0x7fb95ec72760 (>fixnum)
sub r14, 0x8

The r14 register holds the data stack pointer and r15 the retain stack pointer. They both grow upwards. The first four instructions are just stack shuffling -- the topmost item on the data stack is popped and pushed to the retain stack, with rcx used for temporary storage.

Then the >c-ptr word is called with the top of the data stack containing the Factor value to be converted to a void*.

A similar dance is performed for the >fixnum call. The end result is that the two items on the data stack are the return values of the two coercion calls.
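To make the shuffling easier to follow, here is a hedged C model of the first ten instructions, one statement per instruction. The two stacks are plain arrays whose pointers refer to the top element. fake_to_c_ptr and fake_to_fixnum are hypothetical stand-ins that only mark the value they are given; the real words convert Factor objects and find their argument through r14 rather than through a C parameter.

```c
#include <stdint.h>

typedef intptr_t cell;

/* Hypothetical stand-ins for >c-ptr and >fixnum: they only mark the top of
   the data stack so that the order of the calls becomes visible. */
void fake_to_c_ptr(cell *top) { *top += 1000; }
void fake_to_fixnum(cell *top) { *top += 1; }

typedef struct {
    cell *ds;   /* r14: address of the top data stack element   */
    cell *rs;   /* r15: address of the top retain stack element */
} stacks;

void coerce_params(stacks *s) {
    cell rcx;
    s->rs += 1;             /* add r15, 0x8       */
    s->ds -= 1;             /* sub r14, 0x8       */
    rcx = s->ds[1];         /* mov rcx, [r14+0x8] */
    *s->rs = rcx;           /* mov [r15], rcx     */
    fake_to_c_ptr(s->ds);   /* call >c-ptr        */
    s->rs -= 1;             /* sub r15, 0x8       */
    s->ds += 1;             /* add r14, 0x8       */
    rcx = s->rs[1];         /* mov rcx, [r15+0x8] */
    *s->ds = rcx;           /* mov [r14], rcx     */
    fake_to_fixnum(s->ds);  /* call >fixnum       */
}
```

Starting with the pointer-like value below the int on the data stack, the first stand-in sees only the lower item while the int is parked on the retain stack, and both stack pointers end up where they began.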

FFI Invocation

mov rdi, [rsp]
mov rsi, [rsp+0x8]

rdi and rsi carry the first two integer or pointer arguments in the x86-64 C calling convention used on Linux (see wikipedia for details). sk_value expects the caller to put the two parameter values in those registers.

xor rax, rax

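This instruction looks strange because rax will be clobbered by sk_value anyway. A plausible explanation is that the same calling convention uses al to tell a variadic callee how many vector registers carry arguments, so zeroing rax keeps the call correct even if the callee happens to be variadic.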

mov r11, 0x7f627a14a610
call r11

The address of the sk_value function is loaded into the r11 register and then called. More about this strange indirection later.

mov [rsp+0x10], rax

sk_value puts its return value, which is a void pointer, in the rax register (in accordance with the previously mentioned conventions). The mov then saves it in the third of the stack slots reserved earlier.
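Read together, these instructions suggest that the three slots reserved by sub rsp, 0x18 hold the two arguments and the result. A rough C equivalent, assuming exactly that, is a call through a function pointer. fake_sk_value is a hypothetical stand-in with the same shape as the foreign function; it is not OpenSSL's sk_value.

```c
#include <stdint.h>

/* Hypothetical stand-in with the shape void* f(void*, int). */
void *fake_sk_value(void *stack, int index) {
    return (char *)stack + index;
}

typedef void *(*ffi_fn)(void *, int);

/* slots[] models the 24 bytes at [rsp], [rsp+0x8] and [rsp+0x10]. */
void *invoke(ffi_fn target, intptr_t slots[3]) {
    void *arg0 = (void *)slots[0];      /* mov rdi, [rsp]               */
    int arg1 = (int)slots[1];           /* mov rsi, [rsp+0x8]           */
    void *result = target(arg0, arg1);  /* mov r11, <address>; call r11 */
    slots[2] = (intptr_t)result;        /* mov [rsp+0x10], rax          */
    return result;
}
```

Calling invoke(fake_sk_value, slots) with slots[0] holding a buffer address and slots[1] holding 3 leaves the address three bytes into the buffer in slots[2].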

Object Allocation

Object allocation is performed in two steps. First Factor checks if there is enough space in the nursery. If there isn't, garbage is collected and only then is a new object allocated.

lea rcx, [r13+0x10]

Register r13 always holds a pointer to the running VM. So this instruction loads the address of the field at offset 0x10 in the VM struct into the rcx register. That happens to be the nursery_space.

mov rbx, [rcx]

The pointer is dereferenced which puts the value of the first field of nursery_space into rbx. Since nursery_space is a bump_allocator that field is called here.

add rbx, 0x30
cmp rbx, [rcx+0x10]
jle 0x7f627b5e8780 {skip-minor-gc}

(Actually, the disassembly doesn't show jump labels like I've added above, but I'm adding them here for clarity.)

Now it gets interesting. The first instruction just adds 48 to rbx, and the second compares that sum to the third field of the bump_allocator that the rcx register references -- end. These five instructions correspond to this C code:

if (vm->nursery.here + 0x30 <= vm->nursery.end) {
    ...
} else {
    ...
}

The assembly code continues with the branch taken if the condition is not true:

mov rcx, [r13]

The first field of the VM is the context which is now pointed to by rcx. It's like a stash where you save stuff.

lea rbx, [rsp-0x8]

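Not sure about this, but note that lea does not read memory: it only computes the address rsp-0x8. That is the stack slot where the upcoming call instruction will push its return address, so I think what gets saved is the address of that slot, as the new top of the call stack, rather than the return address itself.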

mov [rcx], rbx
mov [rcx+0x10], r14
mov [rcx+0x18], r15

r14 points to the datastack and r15 to the retainstack. So stuff is stashed away in the context.

call 0x7f627aebd530 (minor-gc)

Performs a small garbage collection cycle. If you combine these assembly instructions with the previous ones you get this C code:

if (vm->nursery.here + 0x30 <= vm->nursery.end) {
} else {
    vm->context->callstack_top = ret_addr;
    vm->context->datastack = datastack;
    vm->context->retainstack = retainstack;
    minor_gc();
}
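The check-then-collect pattern can be exercised on its own with a small model. This is a sketch under assumptions: the field order follows the text (here first, end third), the second field being start is a guess, and fake_minor_gc is a hypothetical stand-in that just empties the nursery instead of really collecting. The allocation itself is folded in so that the effect is observable.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t cell;

/* Field order follows the text: `here` first, `end` third.
   That the second field is `start` is an assumption. */
typedef struct {
    cell here;
    cell start;
    cell end;
} bump_allocator;

/* Hypothetical stand-in for minor-gc: pretend every nursery object was
   moved elsewhere, which leaves the nursery empty again. */
static void fake_minor_gc(bump_allocator *nursery, int *gc_runs) {
    nursery->here = nursery->start;
    (*gc_runs)++;
}

/* Same shape as the generated check, followed by the bump itself. */
cell allot(bump_allocator *nursery, size_t size, int *gc_runs) {
    if (!(nursery->here + size <= nursery->end))
        fake_minor_gc(nursery, gc_runs);
    cell obj = nursery->here;
    nursery->here += size;
    return obj;
}
```

With a nursery of 0x60 bytes, two 0x30-byte requests fit without collecting and the third one triggers the stand-in collector.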

Alien Boxing

skip-minor-gc:
mov rbx, [rsp+0x10]
mov rax, 0x1
test rbx, rbx
jz 0x7f627b5e87c6 {00007f627b5e87c6}

After the last section, in which a garbage collection cycle may have been run, Factor guarantees that there is enough space available to allocate a wrapper for the void* that the FFI call returned. First the value is moved into the rbx register and 0x1 is loaded into rax. Then a check is made to see if the value is 0 or NULL. If it is, the boxing part is skipped and the 0x1 already in rax becomes the result. As far as I can tell, 0x1 is the tagged representation of Factor's f.

lea rcx, [r13+0x10]
mov rax, [rcx]

The nursery's here pointer is moved into rax. The next instructions bump that pointer by 48 bytes and fill in the memory at the address pointed to with an object of type alien. For the layout of alien and other objects, refer to vm/layouts.hpp.

mov qword [rax], 0x18

The first object cell is the header. It is generated by taking the type number of the object (6 in this case) and left shifting it by the number of object flag bits which is always 2. 0x6 << 0x2 = 0x18.

or rax, 0x6

Factor uses a tagged pointer system in which the four lowest bits of any pointer are a tag which describes what type of object it is. So or:ing the object pointer in rax with 0x6 states that the pointer is a pointer to an alien.

add qword [rcx], 0x30

Since 0x30 bytes have been allocated, the nursery's here field needs to be bumped.

mov qword [rax+0x2], 0x1
mov qword [rax+0xa], 0x1
mov [rax+0x12], rbx
mov [rax+0x1a], rbx
00007f627b5e87c6:

(Again: jump label inserted for clarity.)

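Here values are moved into the second, third, fourth and fifth cells of the alien object. Remember that the pointer is tagged, which is why the offsets look weird: rax is 6 higher than the object's real address, so rax+0x2 is really offset 0x8, rax+0xa is offset 0x10, and so on.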
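The boxing sequence can be replayed in C in the same order as the assembly, storing through the tagged pointer with the same odd-looking displacements. This is only a sketch: it assumes 8-byte cells and a here pointer that is 16-byte aligned (so that the four tag bits start out as zero), and the names store_tagged, box_alien and F_VALUE are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cell;

enum { F_VALUE = 1, ALIEN_TYPE = 6, ALIEN_SIZE = 0x30 };

/* Store through a tagged pointer, like `mov qword [rax+disp], value`. */
static void store_tagged(cell tagged, int disp, cell value) {
    *(cell *)(uintptr_t)(tagged + (cell)disp) = value;
}

/* `here` points at the nursery's bump pointer. The caller must already
   have made sure that ALIEN_SIZE bytes are free. */
cell box_alien(cell *here, void *address) {
    if (address == NULL)
        return F_VALUE;                          /* mov rax, 0x1; jz ...    */
    cell rax = *here;                            /* mov rax, [rcx]          */
    *(cell *)(uintptr_t)rax = ALIEN_TYPE << 2;   /* mov qword [rax], 0x18   */
    rax |= ALIEN_TYPE;                           /* or rax, 0x6             */
    *here += ALIEN_SIZE;                         /* add qword [rcx], 0x30   */
    store_tagged(rax, 0x02, F_VALUE);            /* cell 1: 0x02 + 6 = 0x08 */
    store_tagged(rax, 0x0a, F_VALUE);            /* cell 2: 0x0a + 6 = 0x10 */
    store_tagged(rax, 0x12, (cell)(uintptr_t)address);  /* cell 3 */
    store_tagged(rax, 0x1a, (cell)(uintptr_t)address);  /* cell 4 */
    return rax;
}
```

Masking off the low four bits of the returned value gives back the object's start address, and the five cells there hold the header 0x18, two 0x1 values and the wrapped address twice.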

Together, the assembly corresponds to this C:

b = sk_value(...);
if (vm->nursery.here + 0x30 <= vm->nursery.end) {
} else {
    vm->context->callstack_top = ret_addr;
    vm->context->datastack = datastack;
    vm->context->retainstack = retainstack;
    minor_gc();
}
cell *ret = 1;
if (b == NULL) {
} else {
    ret = vm->nursery.here;
    ret[0] = 0x18;
    ret[1] = 0x1;
    ret[2] = 0x1;
    ret[3] = b;
    ret[4] = b;
    vm->nursery.here += 0x30;
    ret |= 0x6;
}
return ret;