Prompt Collection

David Nichols edited this page Dec 28, 2017 · 11 revisions

Prompt Collection or Deterministic Garbage Collection describes the ability of Qore to support RAII or "initialism" (basically C++-style class destructors) even in objects that participate in cyclic directed graphs. A "cyclic directed graph" in Qore terms means that objects can be reached from themselves by traversing their members.

Qore generally uses strong reference counts to determine when heap-allocated data values are current (or "in scope"); when such a data value's reference count reaches 0, then it goes out of scope. When an object no longer has any strong references (i.e. it goes out of scope), it is collected by undergoing destruction. Destruction simply means running the object's class destructor() method and dereferencing object members.

So in the classic reference-counted case (without cycle detection), the internal pseudo-code for object collection and destruction looks like this:

if (reference_count == 0)
    destroy_object();

However, in case of a cycle (an object that references itself, either directly or through other objects), the reference count will never reach 0, and, without an algorithm to deal with cycles, such objects would never be collected or destroyed.

Consider the following code:

%new-style
%strict-args
%require-types

class Cycle {
    public {
        Cycle c;
    }

    constructor() {
        c = self;
    }

    destructor() {
        print("goodbye, cruel world!\n");
    }
}

Cycle cyc();

The class creates a cycle in the constructor by assigning c = self. Immediately after this assignment and during construction, object cyc's reference count will be 2. When the local variable cyc goes out of scope, the object's reference count will be 1.

Without accounting for cycles, the object will never be collected, and the goodbye, cruel world! message will never be output on the console (i.e. there will be a memory leak).

To support prompt collection, Qore calculates the cyclic reference count of the object at the latest when cyc goes out of scope, which is calculated as 1.

The cyclic reference count defines the "high water mark" for destruction; in the case of a cycle of one object, when the object has a reference count equal to its cyclic reference count, then the object can be collected.

In the case of cycles of more than one object, all objects in the cycle must have their reference counts equal to their cycle reference counts for the cycle to be collected as a whole. If any one object in the cycle has a reference count greater than its cycle reference count, it means that there is a valid strong reference from outside the cycle, and therefore that all members of the cycle are still valid.

The internal pseudo-code for object collection and destruction looks like this:

bool destroy = true;
foreach object obj in (cycle) {
    if (obj.reference_count != obj.cycle_count) {
        destroy = false;
        break;
    }
}
if (destroy)
    destroy_cycle(cycle);

The most complex parts of the prompt collection / deterministic garbage collection implementation involve the calculation of the cycle counts (the graph scan itself) and the deadlock avoidance approach.

In Qore objects and closure-bound local variables can participate in cycles, and both enforce the atomicity of operations in multithreaded contexts through read-write locks. To perform a consistent scan on a cycle, all members of the cycle must be locked at the same time, therefore there is the potential for deadlocks. Qore implements a transaction-based approach to acquiring multiple locks with a deadlock detection algorithm; if a potential deadlock condition is encountered, then the current thread will roll back the scan transaction (i.e. all locks are released) and wait for the conflicting thread (i.e. the thread holding the conflicting lock) to notify it that the scan is complete.

Support for prompt collection / deterministic garbage collection is important because it allows Qore to support completely automatic memory management and also C++-style destructors (RAII).

The complexity involved in implementing prompt collection / deterministic garbage collection in languages that support automatic memory management and permit cycles of objects is the reason that C++-style destructors and RAII are not supported in popular languages like Java, C#/.NET, Python, Ruby, and more.

For example, Java and C# do not support destructors (and do not support the RAII idiom) but have the finally block for resource management because of the difficultly in solving this problem (see http://blogs.msdn.com/b/brada/archive/2005/02/11/371015.aspx for an interesting reference to this issue regarding .NET development).

Consider the following code in C#:

// To run this code, substitute a valid path from your local machine
string path = @"c:\users\public\test.txt";
System.IO.StreamReader file = new System.IO.StreamReader(path);
char[] buffer = new char[10];
try {
    file.ReadBlock(buffer, index, buffer.Length);
}
catch (System.IO.IOException e) {
    Console.WriteLine("Error reading from {0}. Message = {1}", path, e.Message);
}
finally {
    if (file != null) {
        file.Close();
    }
}

Due to the lack of support for destructors in C#, the resource cleanup (closing the file) must be performed each time the resource is used in a finally block in exception-safe code (implementation effort: O(n)). With a language that support destructors and RAII, resource cleanup can be implemented once in the destructor (in this case closing the file descriptor) instead of every time the resource is used (implementation effort: O(1)).

For example, the Qore equivalent needs no exception handling, because the destructor closes the file descriptor automatically:

# To run this code, substitute a valid path from your local machine
string path = "c:\\users\\public\\test.txt";
StreamReader file(new FileInputStream(path));
*string buffer = file.readString(10);

Advantages of prompt collection / deterministic garbage collection:

  • support for the RAII idiom: intuitive C++-style destructors for scope-based resource management and exception-safe programming
  • predictable and deterministic performance (no periodic delays for garbage collection activities)

The disadvantage is mainly in implementation complexity, but the advantages of supporting the RAII idiom for programmers are enormous.

Here are a couple of blog posts for more information on the implementation and the relevance of this development:

Garbage collection documentation in the Qore reference manual:

You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.