Skip to content

Latest commit

 

History

History
709 lines (547 loc) · 22.9 KB

iced.md

File metadata and controls

709 lines (547 loc) · 22.9 KB

What Is IcedCoffeeScript?

IcedCoffeeScript (ICS) is a system for handling callbacks in event-based code. There were two existing implementations, one in the sfslite library for C++, and another in the tamejs translator for JavaScript. This extension to CoffeeScript is a third implementation. The code and translation techniques are derived from experience with JS, but with some new Coffee-style flavoring.

This document first presents a "Iced" tutorial (adapted from the JavaScript version), and then discusses the specifics of the CoffeeScript implementation.

Installing and Running ICS

ICS is available as an npm package:

npm install -g iced-coffee-script

You can alternatively checkout ICS and install from source:

git clone https://github.com/maxtaco/coffee-script
./bin/cake install

This will give you libraries under iced-coffee-script and the binaries iced and icake, which are replacements for coffee and cake respectively. In almost all cases, iced should serve as a drop-in replacement for coffee, since the ICS language is a superset of CoffeeScript.

For more information about CS and ICS, you can also see our brochure page.

Quick Tutorial and Examples

Here is a simple example that prints "hello" 10 times, with 100ms delay slots in between:

# A basic serial loop
for i in [0..10]
  await setTimeout(defer(), 100)
  console.log "hello"

There is one new language addition here, the await ... block (or expression), and also one new primitive function, defer. The two of them work in concert. A function must "wait" at the close of a await block until all deferrals made in that await block are fulfilled. The function defer returns a callback, and a callee in an await block can fulfill a deferral by simply calling the callback it was given. In the code above, there is only one deferral produced in each iteration of the loop, so after it's fulfilled by setTimer in 100ms, control continues past the await block, onto the log line, and back to the next iteration of the loop. The code looks and feels like threaded code, but is still in the asynchronous idiom (if you look at the rewritten code output by the coffee compiler).

This next example does the same, while showcasing power of the await.. language addition. In the example below, the two timers are fired in parallel, and only when both have fulfilled their deferrals (after 100ms), does progress continue...

for i in [0..10]
  await 
    setTimeout defer(), 100
    setTimeout defer(), 10
  console.log ("hello");

Now for something more useful. Here is a parallel DNS resolver that will exit as soon as the last of your resolutions completes:

dns = require("dns");

do_one = (cb, host) ->
  await dns.resolve host, "A", defer(err, ip)
  msg = if err then "ERROR! #{err}" else "#{host} -> #{ip}"
  console.log msg
  cb()

do_all = (lst) ->
  await
    for h in lst
      do_one defer(), h

do_all process.argv[2...]

You can run this on the command line like so:

iced examples/iced/dns.coffee yahoo.com google.com nytimes.com okcupid.com tinyurl.com

And you will get a response:

yahoo.com -> 72.30.2.43,98.137.149.56,209.191.122.70,67.195.160.76,69.147.125.65
google.com -> 74.125.93.105,74.125.93.99,74.125.93.104,74.125.93.147,74.125.93.106,74.125.93.103
nytimes.com -> 199.239.136.200
okcupid.com -> 66.59.66.6
tinyurl.com -> 195.66.135.140,195.66.135.139

If you want to run these DNS resolutions in serial (rather than parallel), then the change from above is trivial: just switch the order of the await and for statements above:

do_all = (lst) ->
  for h in lst
    await
      do_one defer(), h

Slightly More Advanced Example

We've shown parallel and serial work flows, what about something in between? For instance, we might want to make progress in parallel on our DNS lookups, but not smash the server all at once. A compromise is windowing, which can be achieved in IcedCoffeeScript conveniently in a number of different ways. The 2007 academic paper on tame suggests a technique called a rendezvous. A rendezvous is implemented in CoffeeScript as a pure CS construct (no rewriting involved), which allows a program to continue as soon as the first deferral is fulfilled (rather than the last):

do_all = (lst, windowsz) ->
  rv = new iced.Rendezvous
  nsent = 0
  nrecv = 0

  while nrecv < lst.length
    if nsent - nrecv < windowsz and  nsent < n
      do_one rv.id(nsent).defer(), lst[nsent]
      nsent++
    else
      await rv.wait defer evid
      console.log "got back lookup nsent=#{evid}"
      nrecv++

This code maintains two counters: the number of requests sent, and the number received. It keeps looping until the last lookup is received. Inside the loop, if there is room in the window and there are more to send, then send; otherwise, wait and harvest. Rendezvous.defer makes a deferral much like the defer primitive, but it can be labeled with an identifier. This way, the waiter can know which deferral has fulfilled. In this case we use the variable nsent as the defer ID --- it's the ID of this deferral in launch order. When we harvest the deferral, rv.wait fires its callback with the ID of the deferral that's harvested.

Note that with windowing, the arrival order might not be the same as the issue order. In this example, a slower DNS lookup might arrive after faster ones, even if issued before them.

Composing Serial And Parallel Patterns

In IcedCoffeeScript, arbitrary composition of serial and parallel control flows is possible with just normal functional decomposition. Therefore, we don't allow direct await nesting. With inline anonymous CoffeeScript functions, you can concisely achieve interesting patterns. The code below launches 10 parallel computations, each of which must complete two serial actions before finishing:

f = (n,cb) ->
  await
    for i in [0..n]
      ((cb) ->
        await setTimeout defer(), 5 * Math.random()
        await setTimeout defer(), 4 * Math.random()
        cb()
      )(defer())
  cb()

autocb

Most of the time, an iced function will call its callback and return at the same time. To get this behavior "for free", you can simply name this callback autocb and it will fire whenever your iced function returns. For instance, the above example could be equivalently written as:

f = (n,autocb) ->
  await
    for i in [0..n]
      ((autocb) ->
        setTimeout defer(), 5 * Math.random()
        setTimeout defer(), 4 * Math.random()
      )(defer())

In the first example, recall, you call cb() explicitly. In this example, because the callback is named autocb, it's fired automatically when the iced function returns.

If your callback needs to fulfill with a value, then you can pass that value via return. Consider the following function, that waits for a random number of seconds between 0 and 4. After waiting, it then fulfills its callback cb with the amount of time it waited:

rand_wait = (cb) ->
  time = Math.floor Math.random() * 5
  if time is 0
   cb(0)
   return
  await setTimeout defer(), time
  cb(time) # return here, implicitly.....

This function can written equivalently with autocb as:

rand_wait = (autocb) ->
  time = Math.floor  Math.random() * 5 
  return 0 if time is 0
  await setTimeout defer(), time
  return time 

Implicitly, return 0; is mapped by the CoffeeScript compiler to autocb(0); return.

Language Design Considerations

In sum, the iced additions to CoffeeScript consist of three new keywords:

  • await, marking off a block or a single statement.
  • defer, which is quite similar to a normal function call, but is compiled specially to accommodate argument passing.

Finally, autocb isn't a bona-fide keyword, but the compiler searches for it in parameters to CoffeeScript functions, and updates the behavior of the Code block accordingly.

These keywords represent the potential for these iced additions to break existing CoffeeScript code --- any preexisting use of these keywords as regular function, variable or class names will cause headaches.

Debugging and Stack Traces -- Now Greatly Improved!

An oft-cited problem with async-style programming, with ICS or hand-rolled, is that stack traces are often incomplete or incomprehensible. If an exception is caught in a Iced function, the stack trace will only show the "bottom half" of the call stack, or all of those functions that are descendents of the main event loop. The "top half" of the call stack, telling you "who really called this function," is probably long gone.

ICS has a workaround to this problem. When an iced function is entered, the runtime will find the first argument to the function that was output by defer(). Such callbacks are annotated to contain the file, line and function where they were created. They also are annotated to hold a refernce to defer()-generated callback passed to the function in which they were created. This chaining creates an implicit stack that can be walked when an exception is thrown.

Consider this example:

iced.catchExceptions()

foo = (y) ->
  await setTimeout defer(), 10
  throw new Error "oh no!"
  y(10)

bar = (x) ->
  await foo defer()
  x()

baz = () ->
  await bar defer()

baz()

The function iced.catchExceptions sets the uncaughtException handler in Node to print out the standard callstack, and also the Iced "callstack", and then to exit. The callback generated by defer() in the function bar holds a reference to x. Similarly, the callback generated in foo holds a reference to y. Here's what happens when this program is run:

Error: oh no!
    at Deferrals.continuation (/Users/max/src/coffee-script/prog.iced:24:13)
    at Deferrals._call (/Users/max/src/coffee-script/lib/coffee-script/iced.js:86:19)
    at Deferrals._fulfill (/Users/max/src/coffee-script/lib/coffee-script/iced.js:97:23)
    at Object._onTimeout (/Users/max/src/coffee-script/lib/coffee-script/iced.js:53:18)
    at Timer.ontimeout (timers.js:84:39)
Iced 'stack' trace (w/ real line numbers):
   at foo (prog.iced:4)
   at bar (prog.iced:9)
   at baz (prog.iced:13)

The first stack trace is the standard Node stacktrace. It is inscrutable, since it mainly covers node internals, and has line numbering relative to the translated file (I still haven't fixed this bug, sorry). The second stack trace is much better. It tells the sequence of Iced calls the lead to this exception. Line numbers are relative to the original input file.

The relavant API is as follows:

iced.stackWalk cb

Start from the given cb, or use the currently active callback if none was given, and walk up the Iced-generated stack. Return a list of call site descriptions. You can call this from your own exception-handling code.

iced.catchExceptions()

Tell the runtime to catch uncaught exceptions, and to print a Iced-aware stack dump as above.

The Lowdown on defer

The implementation of defer is interesting --- it's trying to emulate ``call by reference'' in languages like C++ or Java. Here is an example that shows off the four different cases required to make this happen:

cb = defer x, obj.field, arr[i], rest...

And here is the output from the iced coffee compiler:

cb = __iced_deferrals.defer({
    assign_fn: (function(__slot_1, __slot_2, __slot_3) {
      return function() {
        x = arguments[0];
        __slot_1.field = arguments[1];
        __slot_2[__slot_3] = arguments[2];
        return rest = __slice.call(arguments, 3);
      };
    })(obj, arr, i)
  });

The __iced_deferrals object is an internal object of type Deferrals that's collecting all calls to defer in the current await block. The one in question should fulfill with 3 or more values. When it does, it will call into the innermost anonymous function to perform the appropriate assignments in the original scope. The four cases are:

  1. Simple assignment --- seen in x = arguments[0]. Here, the x variable is in the scope of the original defer call.

  2. Object slot assignment --- seen in __slot_1.field = arguments[1]. Here, the reference obj must be captured at the time of the defer call, and obj.field is filled in later.

  3. Array cell assignment --- seen in __slot_2[__slot_3] = arguments[2]. This of course will work on an array or an object. Here, the reference to the array, and the value of the index must be captured when defer is called, and the cell is assigned later.

  4. Splat assignment --- seen in res = __slice.call(arguments,3). This is much like a simple assignment, but allows a ``splat'' meaning assignment of multiple values at once, accessed as an array.

These specifics are also detailed in the code in the Defer class, file nodes.coffee.

Awaits No Longer Work as Expressions

The following do not work and will generate syntax errors at compile time:

y = (await foo defer x)
x = if true
  await foo defer y
  y
else 10
my_func 10, (
  await foo defer y
  y
)

That is, you can't treat await statements as expressions. And recursively speaking, you can't treat any blocks that contain await statements as expressions. Previous versions of IcedCoffeeScript supported this arcane feature, but it was extremely difficult to implement properly, and unnecessarily obscured the control flow of iced programs.

Translation Technique

The IcedCoffeeScript addition uses a similar continuation-passing translation to tamejs, but it's been refined to generate cleaner code, and to translate only when necessary. Here are the general steps involved:

  • 1 Run the standard CoffeeScript lexer, rewriter, and parser, with a few small additions (for await and defer), yielding a standard CoffeeScript-style abstract syntax tree (AST).

  • 2 Apply iced annotations:

    • 2.1 Find all await nodes in the AST. Mark these nodes and their ancestors with an A flag.

    • 2.2 Find all for, while, until, or loop nodes marked with A. Flood them and their descendants with an L flag. Stop flooding when the first loop without an A flag is hit.

    • 2.3 Find all continue or break nodes marked with an L flag. Mark them and their descendants with a P flag.

  • 3 ``Rotate'' all those nodes marked with A or P:

    • 3.1 For each Block node b in the AST marked A or P:

      • 3.1.1 Find b's first child c marked with A or P.

      • 3.1.2 Cut b's list of expressions after c, and move those expressions on the right of the cut into a new block, called d. This block is c's continuation block and becomes c's child in the AST. This is the actual ``rotation.''

      • 3.1.3 Call the rotation recursively on the child block d.

      • 3.1.4 Add an additional code to c's body, which is to call the continuation represented by d. For if statements this means calling the continuation in both branches; for switch statements, this means calling the continuation from all branches; for loops, this means calling continue at the end of the loop body; for blocks, this means just calling the continuation as the last statement in the block. See callContinuation in nodes.coffee.

  • 4 Output preamble/boilerplate; for the case of JavaScript output to browsers, inline the small class Deferrals needed during runtime; for node-based server-side JavaScript, a require statement suffices here. Only do this if the source file has a defer statement in it.

  • 5 Compile as normal. The effect of the above is to mutate the original CoffeeScript AST into another valid CoffeeScript AST. This AST is then compiled with the normal rules.

Translation Example

For an example translation, consider the following block of code:

while x1
  f1()

while x2
  if y
    f2() 
    continue
    f3()
  await 
    f4(defer())
  if z
    f5()
    break
    f6()

while x3
  f7()
  • Here is schematic diagram for this AST:

  • After Step 2.1, nodes in blue are marked with A. Recall, Step 2.1 traces upwards from all await blocks.

  • After Step 2.2, nodes in purple are marked with L. Recall, Step 2.2 floods downwards from any any loops marked with A.

  • After Step 2.3, nodes in yellow are marked with P. Recall, Step 2.3 traces upwards from any jumps marked with L.

  • The green nodes are those marked with A or P. They are "marked" for rotations in the next step.

  • In Step 3, rotate all marked nodes AST nodes. This rotation introduces the new orange block nodes in the graph, and attaches them to pivot nodes as continuation blocks.

  • In translated code, the general format of a pivot node is:

(function (k) {
   // the body
   k();
})(function () {
   // the continuation block.
}

To see how pivots and continuations are output in our example, look at this portion of the AST, introduced after Step 3:

detail

Here is the translated output (slightly hand-edited for clarity):

(function() {
  // await block f4()
  (function(k) {
    var __deferrals = new iced.Deferrals(k);
    f4(__deferrals.defer({}));
    __deferrals._fulfill();
  })(function() {
    // The continuation block, starting at 'if z'
    (function(k) {
      if (z) {
        f5();
        (function(k) {
          // 'break' throws away the current continuation 'k'
          // and just calls _break()
          _break();
        })(function() {
          // A continuation block, after 'break', up to 'f6()'
          // This code will never be reached
          f6();
          return k();
        });
      } else {
        return k();
      }
    })(function() {
      // end of the loop, call _continue() to start at the top
      return _continue();
    });
  });
});

API and Library Documentation

iced.Rendezvous

The Rendezvous is a not a core feature, meaning it's written as a straight-ahead CoffeeScript library. It's quite useful for more advanced control flows, so we've included it in the main runtime library.

The Rendezvous is similar to a blocking condition variable (or a "Hoare style monitor") in threaded programming.

iced.Rendezvous.id(i,[multi]).defer slots...

Associate a new deferral with the given Rendezvous, whose deferral ID is i, and whose callbacks slots are supplied as slots. Those slots can take the two forms of defer return as above. As with standard defer, the return value of the Rendezvous's defer is fed to a function expecting a callback. As soon as that callback fires (and the deferral is fulfilled), the provided slots will be filled with the arguments to that callback.

Also, note the optional boolean flag multi. By default, a function generated by defer can be called only once, and will generate an error on subsequent calls. Only with the multi flag set to true (and only in the case of a Rendezvous), can this restriction be relaxed.

iced.Rendezvous.defer slots...

You don't need to explicitly assign an ID to a deferral generated from a Rendezvous. If you don't, one will automatically be assigned, in ascending order starting from 0.

iced.Rendezvous.wait cb

Wait until the next deferral on this rendezvous is fulfilled. When it is, callback cb with the ID of the fulfilled deferral. If an unclaimed deferral fulfilled before wait was called, then cb is fired immediately.

Though wait would work with any hand-rolled JS function expecting a callback, it's meant to work particularly well with tamejs's await function.

Example

Here is an example that shows off the different inputs and outputs of a Rendezvous. It does two parallel DNS lookups, and reports only when the first returns:

hosts = [ "okcupid.com", "google.com" ];
ips = errs = []
rv = new iced.Rendezvous
for h,i in hosts
    dns.resolve hosts[i], rv.id(i).defer errs[i], ips[i]

await rv.wait defer which
console.log "#{hosts[which]}  -> #{ips[which]}"

connectors

A connector is a function that takes as input a callback, and outputs another callback. The best example is a timeout, given here:

iced.timeout(cb, time, res = [])

Timeout an arbitrary async operation.

Given a callback cb, a time to wait time, and an array to output a result res, return another callback. This connector will set up a race between the callback returned to the caller, and the timer that fires after time milliseconds. If the callback returned to the caller fires first, then fill res[0] = true;. If the timer won (i.e., if there was a timeout), then fill res[0] = false;.

In the following example, we timeout a DNS lookup after 100ms:

{timeout} = require 'icedlib'
info = [];
host = "pirateWarezSite.ru";
await dns.lookup host, timeout(defer(err, ip), 100, info)
if not info[0]
    console.log "#{host}: timed out!"
else if (err)
    console.log "#{host}: error: #{err}"
else
    console.log "#{host} -> #{ip}"

The Pipeliner library

There's another way to do the windowed DNS lookups we saw earlier --- you can use the control flow library called Pipeliner, which manages the common pattern of having "m calls total, with only n of them in flight at once, where m > n."

The Pipeliner class is available in the icedlib library:

{Pipeliner} = require 'icedlib'
pipeliner = new Pipeliner w,s 

Using the pipeliner, we can rewrite our earlier windowed DNS lookups as follows:

do_all = (lst, windowsz) ->
  pipeliner = new Pipeliner windowsz
  for x in list
    await pipeliner.waitInQueue defer()
    do_one pipeliner.defer(), x
  await pipeliner.flush defer()

The API is as follows:

new Pipeliner w, s

Create a new Pipeliner controller, with a window of at most w calls out at once, and waiting s seconds before launching each call. The default values are w = 10 and s = 0.

Pipeliner.waitInQueue c

Wait in a queue until there's room in the window to launch a new call. The callback c will be fulfilled when there is room.

Pipeliner.defer args...

Create a new deferal for this pipeline, and pass it to whatever function is doing the actual work. When the work completes, fulfill this deferal --- that will update the accounting in the pipeliner class, allowing queued actions to proceed.

Pipeliner.flush c

Wait for the pipeline to clear out. Fulfills the callback c when the last action in the pipeline is done.