Threading

stringbean edited this page Dec 6, 2011 · 5 revisions

Threading architecture, thread safety, and memory management concerns.

Introduction

Version 3 of XMPPFramework brings massive parallelism and big performance improvements. This article outlines the framework's threading architecture, how to harness it, and what to watch out for.

Grand Central Dispatch

Writing multi-threaded code has historically been difficult and fraught with problems. Apple's Grand Central Dispatch technology changes much of this. And GCD is where the XMPPFramework gets its parallelism and power.

If you're already an expert on GCD you can skip this section. Otherwise, read on for a quick overview.

One of the problems with traditional multi-threading has to do with a few simple questions:

  • "How many threads do I create?"
  • "How many threads is too many?"
  • "What is the performance impact of creating too many threads?"

It all comes down to the fact that creating a thread is an expensive operation. It's rather slow, plus it requires a significant amount of overhead. On top of this, if you create too many threads then your process wastes a lot of CPU cycles switching back and forth between threads. And the cost of these thread-context switches starts to eat away at your performance.

To make matters worse, there are no simple answers for how many threads to create. First of all, even if you knew how many processor cores are available, that doesn't mean the OS is going to give your process all those cores. The correct answer may come down to system load, and only the OS knows about this.

GCD answers these questions with a simple abstraction:

Don't worry about threads. Instead use dispatch queues. These are super lightweight, and you can create tons of them. The GCD library will automatically manage a thread pool of the proper size, and will execute the work you submit to your dispatch queues on threads from this pool.

So a dispatch_queue is NOT a thread. It is an abstraction designed to make you stop thinking in terms of threads. If you're anything like me, this will be a difficult habit to break. So let's look at an example:

// Creating a dispatch queue is a lightweight operation.
// Creating a thread = 512 KB
// Creating a queue = 256 Bytes !
// 
// We're going to create a serial queue.
dispatch_queue_t myQ = dispatch_queue_create("my q name", NULL);

// Placing a work item in a GCD queue is a lightweight operation.
// In fact, it requires only 15 instructions.
// By comparison, setting up a thread, and assigning work to it
// can require hundreds of instructions
// and take more than 50 times longer.

dispatch_async(myQ, task1);
dispatch_async(myQ, task2);
dispatch_async(myQ, task3);

So do task1 through task3 all run on the same thread? Not necessarily. Stop thinking of a queue as a thread. Here are some examples of what might happen:

  • task1 -> Thread D
  • task2 -> Thread G
  • task3 -> Thread B

Or perhaps:

  • task1 -> Thread Z
  • task2 -> Thread H
  • task3 -> Thread Z

The point is, it doesn't matter. We created a serial queue, so the tasks get executed one after the other. And task2 won't start until task1 is complete.
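If you'd like to see the ordering guarantee for yourself, here is a minimal sketch. It assumes ARC and a platform where GCD is available; the function name RunThreeTasksInOrder is made up for this illustration and is not part of the framework.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of XMPPFramework): runs three work items
// on one serial queue and returns the order in which they executed.
static NSArray *RunThreeTasksInOrder(void)
{
    dispatch_queue_t myQ = dispatch_queue_create("my q name", DISPATCH_QUEUE_SERIAL);
    NSMutableArray *order = [NSMutableArray array];

    // Only blocks running on myQ touch `order`, and a serial queue runs
    // one block at a time, so no lock is needed here.
    dispatch_async(myQ, ^{ [order addObject:@"task1"]; });
    dispatch_async(myQ, ^{ [order addObject:@"task2"]; });
    dispatch_async(myQ, ^{ [order addObject:@"task3"]; });

    // dispatch_sync lines up behind the three tasks and waits for its own
    // block to finish, so all three tasks are done when it returns.
    __block NSArray *result = nil;
    dispatch_sync(myQ, ^{ result = [order copy]; });
    return result;
}
```

The returned array is always task1, task2, task3, no matter which threads GCD happened to pick for each block.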

Parallelism via Queues

The XMPPFramework accomplishes its parallelism by allowing all modules and delegates to run in their own queue. Let's take a deeper look at this within the code.

Whenever you add a delegate, you also specify the dispatch_queue you would like your delegate methods to be invoked on.

[xmppStream addDelegate:self delegateQueue:dispatch_get_main_queue()];

The above code specifies that the delegate methods are to be invoked on the main thread. However, it would be trivial to parallelize your XMPP handling code by creating your own queue and specifying that queue instead.

dispatch_queue_t xmppHandlingQ = dispatch_queue_create("XMPP Handling", NULL);

// Look mom! No more blocking the UI thread!
// Now I do all my expensive processing somewhere else!
[xmppStream addDelegate:self delegateQueue:xmppHandlingQ];

When the xmppStream goes to invoke your delegate method, it will essentially do this:

dispatch_async(delegateQueueYouProvided, ^{ @autoreleasepool {
    [delegate xmppStreamDidConnect:self];
}});

Avoiding deadlock is simple: delegates are always invoked asynchronously, so the stream never blocks waiting for your delegate code to finish.
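To make the hand-off concrete, here is a rough sketch of the pattern using stand-in classes. MiniStream, MiniStreamDelegate, Recorder, and simulateConnect are all hypothetical names invented for this example. This is not how XMPPStream is actually implemented, just the general shape of "store the delegate with its queue, then dispatch_async onto that queue". It assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-ins for illustration only; not XMPPFramework classes.
@protocol MiniStreamDelegate <NSObject>
- (void)miniStreamDidConnect:(id)sender;
@end

@interface MiniStream : NSObject
- (void)addDelegate:(id<MiniStreamDelegate>)delegate delegateQueue:(dispatch_queue_t)queue;
- (void)simulateConnect;
@end

@implementation MiniStream {
    dispatch_queue_t _streamQueue;            // serial queue guarding the stream's own state
    __weak id<MiniStreamDelegate> _delegate;  // assumption: held weakly in this sketch
    dispatch_queue_t _delegateQueue;          // the queue the caller asked for
}

- (instancetype)init {
    if ((self = [super init])) {
        _streamQueue = dispatch_queue_create("MiniStream", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)addDelegate:(id<MiniStreamDelegate>)delegate delegateQueue:(dispatch_queue_t)queue {
    dispatch_async(_streamQueue, ^{
        self->_delegate = delegate;
        self->_delegateQueue = queue;
    });
}

- (void)simulateConnect {
    dispatch_async(_streamQueue, ^{
        id<MiniStreamDelegate> delegate = self->_delegate;
        dispatch_queue_t queue = self->_delegateQueue;
        if (delegate == nil || queue == NULL) return;

        // Hand the callback to the delegate's queue and return right away.
        // The stream queue never waits on delegate code.
        dispatch_async(queue, ^{ @autoreleasepool {
            [delegate miniStreamDidConnect:self];
        }});
    });
}
@end

// A tiny delegate that signals a semaphore when the callback arrives.
@interface Recorder : NSObject <MiniStreamDelegate>
@property (nonatomic, strong, readonly) dispatch_semaphore_t done;
@end

@implementation Recorder
- (instancetype)init {
    if ((self = [super init])) {
        _done = dispatch_semaphore_create(0);
    }
    return self;
}
- (void)miniStreamDidConnect:(id)sender {
    dispatch_semaphore_signal(_done);
}
@end
```

Because both the registration and the callback go through dispatch_async, neither side ever sits in a dispatch_sync waiting on the other, which is the usual recipe for a GCD deadlock.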

Thread Safety

If you've invoked:

[xmppStream addDelegate:self delegateQueue:...];

then don't forget to do this somewhere (maybe in dealloc, or even before):

[xmppStream removeDelegate:self];

The same applies to any XMPP module that you may add a delegate to.

Memory Management

XML is basically a tree structure. You deal with nodes that have a parent and children. Further, the XML APIs are designed to allow you to traverse up and down the tree. Thus, the underlying XML tree (as a whole) may not get released if you retain child nodes.

See the detailed discussion on the KissXML page.

(Each XMPPIQ, XMPPMessage, and XMPPPresence element is its own individual tree. Rather than retain the child nodes of such elements, one should prefer to create copies.)
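As a small sketch of the "copy, don't retain" advice: the helper below pulls the body child out of a message element and returns a detached copy. It assumes ARC and the macOS Foundation NSXMLElement class as the element type; CopyOfBody is a hypothetical name, not a framework method.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a standalone copy of the <body/> child of a
// message element, or nil if there is no such child.
static NSXMLElement *CopyOfBody(NSXMLElement *message)
{
    NSXMLElement *body = [[message elementsForName:@"body"] firstObject];

    // -copy yields a detached node with no parent, so holding on to it
    // does not keep the tree it came from alive.
    return [body copy];
}
```

You would then store the returned copy instead of the original child, and let the full stanza go as soon as your delegate method returns.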