Skip to content
bjourne edited this page Nov 19, 2014 · 5 revisions

Has three object repositories:

  • Nursery
  • Aging Space
  • Tenured Space

FAQ (or not)

The questions I asked myself while going through the code.

This part deals with general questions about Factor's VM implementation.

Where are the cards stored?

The cards are stored in the cards member of the data_heap struct which is an array of card. A card is just an uint8_t. There are several hundred thousands cards in the array and each card should be seen as an eight bit long bitfield.

And what is the purpose of decks?

Conceptually, a deck should be a stack of cards. There are about 768 decks.

What is an inline cache miss?

When a generic method is called with a dispatch parameter it is not used to.

What is a write barrier?

It's some kind of system for recording pointers from older generations, such as aging, to younger ones, such as nursery.

What is a slot visitor?

An object whose main task is to visit entities in Factor. An example of an entity is the true_object.

What is a root?

Objects which Factor's gc follows to find all live objects.

Why do string objects have an aux slot?

What are data roots?

data_roots is a vector in factor_vm which contains pointers to "saved" objects. Suppose a c++ method pops an object from the data stack. Then there might not be any references to it and it may be eligible for garbage collection which may be triggered when the allot function is called. That would be really bad so c++ code needs to ensure that free pointers are referenced from data_roots which saves them from collection.

Why are array elements added to the data roots?

Vector elements are added to data_roots in factor_vm::std_vector_to_array. They are added because the elements may point to live Factor objects and in the next step Factor makes an allocation which might cause a garbage collection. changed in git

What are bignum roots?

They seem to be redundant when you already have data roots.

What are code roots?

Shouldn't they be just as redundant as the bignum_roots? They serve some purpose in factor_vm::inline_cache_miss but maybe a "normal" data_root could be used instead.

How does Factors garbage collectors mark phase work?

The method factor_vm::collect_mark_impl implements the mark phase. The member mark_stack is a std::vector<cell> containing references to marked or live objects.

The vm has two memory heaps, one called the data heap and the other the code heap. clear_mark_bits is invoked on the code heap which delegates the work out to its free_list_allocator. With all the mark bits cleared both in the code heap and tenured space, no object is seen as alive.

What is the code heap?

It's one of the two main memory heaps in the Factor VM. The size of the heap is given in megabytes by the code_size VM parameter. So when Factor starts, it initializes a code heap of that size. By default it is 64 megabytes.

Each code heap has a segment which I think is like a raw memory area.

What is a card deck?

About Segments

A segment is a pool of memory in which different kinds of Factor objects live. Both the code heap and the data heap has their own segments. The initial size of the code heap is 64 megabytes and the size of the data heap is calculated in some way.

Each segment is protected by a lower and upper guard page. So the first writable byte in a segment is at address:

seg->start + getpagesize() + seh_area_size

Where are segments allocated?

It's in the segment::segment constructor. Each platform has its own memory allocation routine. On Windows, it wraps a call to VirtualAlloc and on Unix to mmap.

See files vm/os-unix.cpp and vm/os-windows.cpp.

Can segments grow? How?

Yes they can, and do grow when the heaps that manages them grows. Seems to be a very expensive operation in which a new segment is allocated and then all live objects in the old segment is copied over.

About Contexts

A context keeps all necessary data for loading and suspending a thread, such as datastack, retainstack, callstack and so on.

What is an active context?

The context that is currently executing.

When and how is a contexts callstack updated?

The callstack_bottom pointer in a context is updated when the http://docs.factorcode.org/content/word-__percent__save-context,cpu.architecture.html instruction is ran.

Does call instructions update the callstack?

About the Nursery Collector

The nursery is where all objects are allocated. It is the memory space that is most frequently garbage collected by Factor because young objects tend to not survive for very long.

What objects are traced?

These references are visited in order:

  1. The roots. They are object literals, c++-level guarded pointers, the special object array and a bunch others.
  2. Then the contexts. Each context has a callstack, retainstack and datastack which also contain objects that are visited.

How are nursery objects promoted to aging space?

Objects in the nursery that survives one garbage collection cycle are automatically promoted to aging space.

All global data structures are visited recursively. First the root objects and all objects that they point to and so on. Then all active contexts in the vm and their different types of stacks and the object pointers on them.

For each visited object, the algorithm checks if it exists in the nursery. If so, the object is "promoted" to aging space. What that means is that a full copy of the object is allocated in aging space and the header cell is set to indicate the memory location the object has been "forwarded" to.

What happens with all objects in tenured and aging space that now has dangling pointers to nursery objects?

Exactly where is the recursion?

Like this:

  • slot_visitor<Fixup>::visit_pointer
  • fixup.fixup_data

I can't see the recursive part.

When is address forwarding resolved?

About Full Garbage Collection

A full garbage collection cycle is different from other types of gc operations in that it will collect all garbage possible and also defragment the heaps (I think).

How does Factor perform a full garbage collection cycle?

A full collection is performed by the factor_vm::collect_full method. First factor_vm::collect_mark_impl runs the mark phase.

Why does running it take so long?

Emptying the mark stack is quite slow.

Why is emptying the mark stack so slow then?

Links

Clone this wiki locally