Tips and Tricks Introspection

Björn Lindqvist edited this page Aug 18, 2015 · 14 revisions

Factor lets you introspect not only mundane things, such as the names of a tuple's slots (which is about as far as introspection goes in Java), but every part of the running system, including the code heap and the garbage collector.
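
As a small illustration of the mundane kind of introspection, here is a sketch that lists the slot names of a tuple class. The point class is made up for the example, and the code is meant to be typed into the listener; all-slots comes from the classes.tuple vocabulary.

```factor
USING: accessors classes.tuple prettyprint sequences ;

! A hypothetical tuple class, defined only for this example.
TUPLE: point x y ;

! all-slots returns the slot specifiers of the class;
! name>> extracts the name of each one.
point all-slots [ name>> ] map .
! Prints: { "x" "y" }
```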

Inspecting a Word's Relocations and Parameters

Here is how you get at the relocations:

! Get the address of the relocation slot in the code block
\ slot<< word-code drop 1 cells - 
! Read the cell at that address and untag the pointer to the byte array
<alien> 0 alien-cell alien-address 15 unmask
! The byte array's payload is two cells in
[ 2 cells + <alien> ]
! Size of the byte array is a fixnum one cell in
[ 1 cells + <alien> 0 alien-cell alien-address -4 shift ] bi 
! Make a byte array object
memory>byte-array

Using a very similar method, you can find a word's parameters:

\ callstack>array word-code drop 2 cells - 
<alien> 0 alien-cell alien-address 15 unmask 
[ 2 cells + <alien> ] 
[ 1 cells + <alien> 0 alien-cell alien-address -4 shift ] bi 
<direct-uint-array>
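
Both snippets lean on the same bit operation: 15 unmask clears the low four bits of a tagged pointer, leaving the address of the object itself. A minimal sketch of that one step follows. The helper word untag-pointer and the value 0x7f0b are invented for illustration and are not part of the VM.

```factor
USING: math.bitwise math.parser prettyprint ;

! Hypothetical helper: clear the four tag bits of a tagged pointer.
: untag-pointer ( tagged -- address ) 15 unmask ;

! 0x7f0b is a made-up value: address 0x7f00 with 0xb in the tag bits.
0x7f0b untag-pointer >hex .
! Prints: "7f00"
```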

See the Generated Assembler Code

Suppose you define a word, hyp, and want to see the assembler code Factor generates for it:

IN: scratchpad : hyp ( x y -- z ) [ sq ] bi@ + sqrt ;
IN: scratchpad \ hyp disassemble
00007f46fba0e660: 89059a39ecfe    mov [rip-0x113c666], eax
00007f46fba0e666: 4883ec08        sub rsp, 0x8
00007f46fba0e66a: 4983c708        add r15, 0x8
00007f46fba0e66e: 498b0e          mov rcx, [r14]
00007f46fba0e671: 498b5ef8        mov rbx, [r14-0x8]
00007f46fba0e675: 49891e          mov [r14], rbx
00007f46fba0e678: 49890f          mov [r15], rcx
00007f46fba0e67b: e820bb31ff      call 0x7f46fad2a1a0 (*)
00007f46fba0e680: 4983ef08        sub r15, 0x8
00007f46fba0e684: 4983c610        add r14, 0x10
00007f46fba0e688: 498b4f08        mov rcx, [r15+0x8]
00007f46fba0e68c: 49890e          mov [r14], rcx
00007f46fba0e68f: 49894ef8        mov [r14-0x8], rcx
00007f46fba0e693: e808bb31ff      call 0x7f46fad2a1a0 (*)
00007f46fba0e698: e8a3462aff      call 0x7f46facb2d40 (+)
00007f46fba0e69d: 89055d39ecfe    mov [rip-0x113c6a3], eax
00007f46fba0e6a3: 4883c408        add rsp, 0x8
00007f46fba0e6a7: 488d1d05000000  lea rbx, [rip+0x5]
00007f46fba0e6ae: e91dd308ff      jmp 0x7f46faa9b9d0 ([ \ sqrt ~array~ 0 ~array~ inline-cache-miss-tail ])
00007f46fba0e6b3: 0000            add [rax], al
00007f46fba0e6b5: 0000            add [rax], al
00007f46fba0e6b7: 0000            add [rax], al
00007f46fba0e6b9: 0000            add [rax], al
00007f46fba0e6bb: 0000            add [rax], al
00007f46fba0e6bd: 0000            add [rax], al
00007f46fba0e6bf: 00              invalid

Compare that with the assembler generated when the optimizing compiler is turned off:

IN: scratchpad disable-optimizer
IN: scratchpad : hyp ( x y -- z ) [ sq ] bi@ + sqrt ;
IN: scratchpad \ hyp disassemble
00007f46fba0d8b0: 89054a47ecfe          mov [rip-0x113b8b6], eax
00007f46fba0d8b6: 4883ec18              sub rsp, 0x18
00007f46fba0d8ba: 48b8944b670b477f0000  mov rax, 0x7f470b674b94
00007f46fba0d8c4: 4983c608              add r14, 0x8
00007f46fba0d8c8: 498906                mov [r14], rax
00007f46fba0d8cb: e8105428ff            call 0x7f46fac92ce0 (bi@)
00007f46fba0d8d0: e86b542aff            call 0x7f46facb2d40 (+)
00007f46fba0d8d5: 89052547ecfe          mov [rip-0x113b8db], eax
00007f46fba0d8db: 4883c418              add rsp, 0x18
00007f46fba0d8df: 488d1d05000000        lea rbx, [rip+0x5]
00007f46fba0d8e6: e9e5e008ff            jmp 0x7f46faa9b9d0 ([ \ sqrt ~array~ 0 ~array~ inline-cache-miss-tail ])
00007f46fba0d8eb: 0000                  add [rax], al
00007f46fba0d8ed: 0000                  add [rax], al
00007f46fba0d8ef: 00                    invalid

See the Control Flow Graph

Factor even lets you access the control flow graph (CFG) it generates for a word. Here it is for the hyp word defined in the previous section:

USING: compiler.cfg.debugger compiler.cfg.graphviz ;
IN: scratchpad \ hyp test-regs first cfgviz "/tmp/hyp" "png" "dot" graphviz

The graph will be rendered to /tmp/hyp.png. You need to have Graphviz installed.

Hack the SSA

It may not be very useful (though you could build a Jester clone with it!), but modifying the SSA for a word is easy. First define some helper words:

USING: compiler.cfg.finalization compiler.codegen compiler.cfg.optimizer compiler.tree ;
QUALIFIED: compiler

: word>ssa ( word -- ssa )
    [ compiler:start ]
    [ compiler:frontend ] bi ;

: ssa>cfg ( ssa word -- cfg )
    build-cfg [ [ optimize-cfg finalize-cfg ] with-cfg ] map
    first ;

: emit-code ( cfg -- out )
    [ generate ] with-cfg ;

: upload-code ( word cfg -- )
    2array 1array t t modify-code-heap ;

Suppose you have this foo word:

: foo ( -- )
    "Hello, world!" print ;

Then you can use the following code to change the message that print outputs:

: hack-word ( word -- )
    [ 
        word>ssa dup [ 
            { [ #push? ] [ literal>> "Hello, world!" = ] } 1&& 
        ] find nip "Goodbye, world!" >>literal drop 
    ] 
    [ ssa>cfg emit-code ] 
    [ swap upload-code ] tri ;

IN: scratchpad \ foo hack-word
IN: scratchpad foo
Goodbye, world!

Having fun with the context object

All data structures in the Factor VM can be fiddled with using raw alien memory operations. For example, try this:

IN: scratchpad 10 20 30

--- Data stack:
10
20
30
IN: scratchpad 

You can access the three integers on the stack indirectly using:

IN: scratchpad QUALIFIED: vm
IN: scratchpad context vm:context memory>struct datastack>> 
<alien> 0 alien-cell

--- Data stack:
10
20
30
ALIEN: 1e0
IN: scratchpad 

1e0 is hexadecimal, of course, and if we shift it right past its four tag bits we get the number that was previously the top stack item.

IN: scratchpad 0x1e0 -4 shift .
30
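
The same conversion can be applied directly to an alien like the one left on the stack. This is a sketch, assuming the four tag bits described above. The word alien>fixnum is a hypothetical helper name, and the alien is rebuilt from the literal address so the example does not depend on the live stack.

```factor
USING: alien math prettyprint ;

! Hypothetical helper: read the address held by the alien and
! shift away the four tag bits to recover the fixnum.
: alien>fixnum ( alien -- n ) alien-address -4 shift ;

0x1e0 <alien> alien>fixnum .
! Prints: 30
```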

Creating a fake callstack overflow

This recipe uses the same techniques as demonstrated in the previous one.

IN: scratchpad
    ! Load the context
    context vm:context memory>struct 
    ! Get the address of the start of the callstack segment
    callstack-region>> start>> 
    ! Attempt to read a byte from a location two cells below that address
    2 cells - <alien> 0 alien-unsigned-1
Call stack overflow

Type :help for debugging help.

Loading the VM struct

Getting hold of the address of the VM instance is a little harder than getting the context, but still not that hard. First write a word to fetch the address. This one only works on x86.64; I still have to look up how to do it on x86.32:

: vm-addr ( -- addr ) 
    void* { } cdecl [ RAX vm-reg MOV ] alien-assembly ;

Then load the struct:

IN: scratchpad QUALIFIED: vm
IN: scratchpad vm-addr vm:vm memory>struct

--- Data stack:
S{ vm f ~context~ ~context~ ~zone~ 139836174926864...