Revactor implements the Actor model using Ruby 1.9's Fibers for concurrency and Rev for event monitoring

= Revactor

Revactor is an Actor model implementation for Ruby 1.9 built on top of the
Rev high performance event library.  Revactor is primarily designed for 
writing Erlang-like network services and tools.

You can load Revactor into your Ruby 1.9 application with:

  require 'revactor'

If you'd like to learn more about the Actor model, more information is
available on the Revactor web site:

http://revactor.org/philosophy

More information about Rev is available on RubyForge:

http://rev.rubyforge.org/rdoc

== Anatomy

Revactor is built out of several parts which interoperate to let you build
network servers painlessly while still guaranteeing correct operation:

* Actors - Actors are the main concurrency primitive used by Revactor.

* Revactor::TCP - This module provides an API duck typed to the Ruby
  Sockets API.  However, rather than blocking calls actually blocking,
  they defer to the underlying event loop.  Actor-focused means of
  receiving data are also provided.

* Filters - Applied to all incoming data directly when it's received, Filters
  can preprocess or postprocess data before it's even delivered to an Actor.
  This is useful for handling protocol framing or other streaming transforms.

== Actors

Actors are lightweight concurrency primitives which communicate using message
passing.  They multitask cooperatively, meaning that many of the worries
surrounding threaded programming disappear.  Any sequence of operations you
perform in an Actor is executed in the order you specify.  You don't (generally)
have to worry about another Actor doing something in the background as you frob
a particular data structure.
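
The ordering guarantee comes from cooperative scheduling on Fibers.  The
following is a minimal plain-Ruby sketch of that idea, not Revactor's actual
scheduler; the worker lambda and the round-robin loop are made up for
illustration.  It shows that a Fiber only gives up control at the point
where it explicitly yields:

```ruby
require 'fiber' # needed for Fiber#alive? on Ruby 1.9; harmless on newer Rubies

# Plain-Ruby illustration of cooperative multitasking with Fibers.
# Hypothetical names throughout; this is not how Revactor schedules Actors.
log = []

worker = lambda do |name|
  Fiber.new do
    log << "#{name}: step 1"
    log << "#{name}: step 2" # no other fiber can run between steps 1 and 2
    Fiber.yield              # the only point where control is handed over
    log << "#{name}: step 3"
  end
end

fibers = [worker.call("a"), worker.call("b")]

# A trivial round-robin loop standing in for a real scheduler.
until fibers.empty?
  fiber = fibers.shift
  fiber.resume
  fibers << fiber if fiber.alive?
end

puts log
# a: step 1, a: step 2, b: step 1, b: step 2, a: step 3, b: step 3
```

Steps 1 and 2 of each worker always appear back to back, because nothing
else runs until the worker yields.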

Actors are created by calling Actor.spawn:

  myactor = Actor.spawn { puts "I'm an Actor!" }

When you spawn an Actor it's scheduled to run after the current Actor either
completes or calls Actor.receive.  Speaking of which, Actor.receive is used
to receive messages:

  myactor = Actor.spawn do
    Actor.receive do |filter|
      filter.when(:dog) { puts "I got a dog!" }
    end
  end

You can send a message to an Actor using its #send method or <<.

Calling:

  myactor << :dog

prints:

  I got a dog!

You can retrieve the current Actor by calling Actor.current.  There will always 
be a default Actor available for every Thread.

== Mailboxes

So, Actors can receive messages.  But where do those messages go?  The answer
is every Actor has a mailbox.  The mailbox is sort of like a message queue,
but you don't have to read it sequentially.  You can apply filters to it, and
change the filterset at any time.

When you call Actor.receive, it yields a filter object and lets you register
message patterns you're interested in, then it sleeps and waits for messages.
Each time the current Actor receives a message, it's scanned by the filter,
and if a match occurs the appropriate action is executed.

Matching is performed by the Filter#when method, which takes a pattern to match
against a message and a block to call if the message matches.  The pattern is
compared to the message using ===, the same thing Ruby uses for case statements.
You can think of the filter as a big case statement.
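
To make the idea of a non-sequential mailbox concrete, here is a small
plain-Ruby sketch.  SketchMailbox is a hypothetical class written for this
example and is not Revactor's implementation; it only models the selective
receive described above, where unmatched messages stay queued:

```ruby
# Hypothetical mailbox with selective receive; not Revactor's implementation.
class SketchMailbox
  def initialize
    @messages = []
  end

  def <<(message)
    @messages << message
    self
  end

  # Scan queued messages in arrival order and run the action of the first
  # pattern that matches (using ===).  Unmatched messages stay queued.
  def receive(handlers)
    @messages.each_with_index do |message, index|
      handlers.each do |pattern, action|
        next unless pattern === message
        @messages.delete_at(index)
        return action.call(message)
      end
    end
    nil # a real Actor would sleep here until another message arrives
  end

  def pending
    @messages.dup
  end
end

mailbox = SketchMailbox.new
mailbox << :cat << [:tcp, "data"] << :dog

result = mailbox.receive(:dog => proc { |m| "got #{m}" })
# result is "got dog"; :cat and [:tcp, "data"] are still pending
```

The return value of receive is the return value of the matching action,
which mirrors how the text describes Actor.receive.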

Like the case statement, a class matches any object of that class.  Since all
classes descend from Object, passing Object will match all messages.  You can
also pass a regexp to match against a string.
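
Because the comparison is plain ===, the behaviour can be tried out with an
ordinary case statement and no Revactor at all.  The describe method below
is a made-up helper for this illustration:

```ruby
# How === treats the pattern kinds mentioned above (plain Ruby only).
def describe(message)
  case message
  when :dog    then "the symbol :dog"
  when /^PING/ then "a string starting with PING"
  when Integer then "an Integer"
  when Object  then "something else" # Object acts as the wildcard
  end
end

describe(:dog)            # "the symbol :dog"
describe("PING :server")  # "a string starting with PING"
describe(42)              # "an Integer"
describe([1, 2])          # "something else"
```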

Revactor installs the Case gem by default.  This is useful for matching against
messages stored in Arrays, or in fixed-size arrays called Tuples.  Case can
be used as follows:

  filter.when(Case[:foobar, Object, Object]) { ... }

This will look for messages which are Arrays or Tuples with three members,
whose first member is the symbol :foobar.  As you can probably guess, Case[]
matches against an Array or Tuple with the same number of members, and
matches each member of the given tuple with ===.  Once again, Object is a 
wildcard, so the other members of the message can be anything.

Want more complex pattern matching?  Case lets you use a block to match any
member by using a guard, like so:

  filter.when(Case[:foobar, Case.guard { |n| n > 100 }, Object]) { ... }

This will look for an Array / Tuple with three members, whose first member is
the symbol :foobar and whose second member is greater than 100.
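
The matching rule can be sketched in a few lines of plain Ruby.
TuplePattern below is a hypothetical stand-in written for this example, not
the Case gem's code, and the guard is an ordinary Proc (which answers === by
calling itself) rather than Case.guard:

```ruby
# Hypothetical stand-in for Case[]; not the Case gem's implementation.
class TuplePattern
  def initialize(*members)
    @members = members
  end

  # Match Arrays of the same length whose members each satisfy ===.
  def ===(message)
    return false unless message.is_a?(Array) && message.size == @members.size
    @members.zip(message).all? { |pattern, value| pattern === value }
  end
end

# Proc#=== invokes the proc, so a proc can play the role of a guard.
over_100 = proc { |n| n.is_a?(Numeric) && n > 100 }

pattern = TuplePattern.new(:foobar, over_100, Object)

big   = pattern === [:foobar, 150, "x"] # true
small = pattern === [:foobar, 50, "x"]  # false: guard rejects 50
short = pattern === [:foobar, 150]      # false: wrong number of members
```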

You can also specify how long you wish to wait for a message before timing out.
This is accomplished with Filter#after:

  filter.after(0.5) { raise 'it timed out ;_;' }

The #after method takes a duration in seconds to wait (in the above example it
waits a half second) before the receive operation times out.

Actor.receive returns whatever value the evaluated action returned.  This means
you don't have to depend on side effects to extract values from receive, 
instead you can just interpret its return value.

== Handling Exceptions

In a concurrent environment, dealing with exceptions can be incredibly 
confusing.  By default, any unhandled exceptions in an Actor are logged
and any remaining Actors continue their normal operation.  However, Actors
also provide a powerful tool for implementing fault-tolerant systems that
can gracefully recover from exceptions.

Actors can be linked to each other:  

  another_actor = Actor.spawn { puts "I'm an Actor!" }
  Actor.link(another_actor)

This can also be done as a single "atomic" operation:

  actor = Actor.spawn_link { puts "I'm an Actor!" }

When Actors are linked, any exceptions which occur in one will be raised in the
other, and vice versa.  This means if one Actor dies, any Actors it's linked to
will also die.  Furthermore, any Actors linked to those will die as well.  This
continues until the entire graph of linked Actors has been walked.

In this way, you can organize Actors into large groups which all die 
simultaneously whenever an error occurs in one.  This means that if an error
occurs and one Actor dies, you're not left with an interdependent network of
Actors which are in an inconsistent state.  You can kill off the whole group
and start over fresh.

But if an Actor crashing kills off every Actor it's linked to, what Actor will
be left to restart the whole group?  The answer is that an Actor can trap
exit events from another and receive them as messages:

  Actor.current.trap_exit = true
  actor = Actor.spawn_link { puts "I'm an Actor!" }
  Actor.receive do |filter|
    filter.when(Case[:exit, actor, Object]) { |msg| p msg }
  end

will print something to the effect of:

  I'm an Actor!
  [:exit, #<Actor:0x54ad6c>, nil]

We were sent a message in the form:

  [:exit, actor, reason]

and in this case reason was nil, which informs us the Actor exited normally.
But what if it dies due to an exception instead?

  Actor.current.trap_exit = true
  actor = Actor.spawn_link { raise "I fail!" }
  Actor.receive do |filter|
    filter.when(Case[:exit, actor, Object]) { |msg| p msg }
  end

We now get the entire exception, captured and delivered as a message:

  [:exit, #<Actor:0x53ec24>, #<RuntimeError: I fail!>]

If the Actor that died were linked to any others which were not trapping exits,
those would all die and the ones trapping exits would remain.  This allows us
to implement supervisors which trap exits and respond to exit messages.  The
supervisor's job is to start an Actor initially, and if it fails log the error
then restart it.
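
The propagation rule can be modelled without any Actors at all.  The sketch
below is a plain-Ruby model under assumptions: the actor names, the links
table and the propagate_exit helper are all invented for illustration, and
the real bookkeeping inside Revactor may differ:

```ruby
require 'set'

# Hypothetical model of linked Actors: each name maps to the names it's linked to.
links = {
  :supervisor => [:worker_a],
  :worker_a   => [:supervisor, :worker_b],
  :worker_b   => [:worker_a],
  :logger     => []
}
trapping = Set[:supervisor] # Actors with trap_exit = true

# Walk the link graph from the crashed Actor.  Actors trapping exits get an
# [:exit, actor, reason] message and survive; all others die and pass it on.
def propagate_exit(crashed, reason, links, trapping)
  dead = Set[crashed]
  notified = {}
  queue = [crashed]
  until queue.empty?
    current = queue.shift
    links[current].each do |peer|
      next if dead.include?(peer)
      if trapping.include?(peer)
        (notified[peer] ||= []) << [:exit, current, reason]
      else
        dead << peer
        queue << peer
      end
    end
  end
  [dead, notified]
end

dead, notified = propagate_exit(:worker_b, "I fail!", links, trapping)
# dead     contains :worker_b and :worker_a
# notified is {:supervisor => [[:exit, :worker_a, "I fail!"]]}
```

In this model the supervisor survives with an exit message in hand, which is
the point at which it could log the error and start the workers again.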

In this way Actors can be used to build complex concurrent systems which fail
gracefully and can respond to errors by restarting interdependent components
of the system en masse.

== Revactor::TCP

The TCP module lets you perform TCP operations on top of the Actor model.  For
those of you familiar with Erlang, it implements something akin to gen_tcp.
Everyone else, read on!

Perhaps the best part of Revactor::TCP is you don't really need to know 
anything about the Actor model to use it.  For the most part it's duck typed
to the Ruby Socket API and can operate as a drop-in replacement.

To make an outgoing connection, call:

  sock = Revactor::TCP.connect(host, port)

This will resolve the hostname for host (if it's not an IPv4 or IPv6 address),
make the connection, and return a socket to it.  The best part is: this call
will "block" until the connection is established, and raise exceptions if the
connection fails.  It works just like the Sockets API.

However, it's not actually blocking.  Underneath, this call uses the
Actor.receive method to wait for events.  This means other Actors can run in
the background while the current one is waiting for a connection.

Furthermore, the Actor making this call can receive other events and they
will remain undisturbed in the mailbox.  The connect method filters for
messages specifically related to making an outgoing connection.

To listen for incoming connections, there's a complementary method:

  listener = Revactor::TCP.listen(addr, port)

This will listen for incoming connections on the given address and port.  It
returns a listen socket with an #accept method:

  sock = listener.accept

Like TCP.connect, this method will block waiting for connections, but in
actuality is calling Actor.receive waiting for messages related to incoming
connections.

Now that you have a handle on a Revactor TCP socket, there are several ways you
can begin using it.  The first is using a standard imperative sockets API:

  data = sock.read(1024)

This call will "block" until it reads a kilobyte from the socket.  However,
you may not be interested in a specific amount of data, just whenever data
is available on the socket.  In that case, you can just call the #read method
without any argument:

  data = sock.read

There's also a corresponding command to write to the socket.  Like read, this
command will "block" until all data has been written out to the socket:

  sock.write data

For Actors that want to deal with both incoming TCP data and messages from
other Actors, Revactor's TCP sockets also support an approach called
active mode.  Active mode automatically delivers incoming data as a message
to what's known as the Socket's controller.  You can assign the Socket's
controller whenever you want:

  sock.controller = Actor.current

Once you've done this, you can turn on active mode to begin receiving messages:

  sock.active = true
  Actor.receive do |filter|
    filter.when(Case[:tcp, sock, Object]) do |_, _, data|
      ...
    end

    filter.when(Case[:somethingelse, Object]) do |_, message|
      ...
    end
  end

(note: _ is an idiom which means ignore/discard a variable)

With active mode, the controller will receive all data as quickly as it can be
read off the socket.  If the Actor processing incoming messages can't process
them as quickly as they're being read, then they'll begin piling up in the
mailbox until the controller is able to catch up (if ever).

In order to prevent this from happening, sockets can be set active once:

  sock.active = :once

This means read the next incoming message, then fall back to active = false.
The underlying system will stop monitoring the socket for incoming data,
and you're free to spend as much time as you'd like handling it.  Once
you're ready for the next message, just set active to :once again.

== Filters

Not to be confused with Mailbox filters, Revactor's TCP sockets can each have
a filter chain.  Filters are specified when a connection is created:

  sock = Revactor::TCP.connect('irc.efnet.org', 6667, :filter => :line)

Filters transform data as it's read off or written to the wire.  In this case
we're connecting to an IRC server, and the IRC protocol is framed using a
newline delimiter.

The line filter will scan incoming messages for a newline, and buffer until
it encounters one.  When it finds one, it will reassemble the entire message
from the buffer and deliver it to you in one fell swoop.

With the line filter on, receiving messages off IRC is easy:

  message = sock.read

This will provide you with the entire next message, with the newline delimiter
already removed.

If the filter name is a symbol, Revactor will look under its filters directory
for a class of the corresponding name.  Alternatively you can pass the name of
a class you created yourself which responds to the methods encode and decode:

  sock = Revactor::TCP.connect(host, port, :filter => MyFilter)
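
A custom filter along these lines might look like the following
line-framing sketch.  SketchLineFilter is a hypothetical class, and the
exact signatures (decode taking a raw chunk and returning an Array of
complete messages, encode taking any number of messages) are assumptions
made for this example rather than Revactor's documented interface:

```ruby
# Hypothetical line-framing filter in plain Ruby.  The decode/encode
# signatures are assumptions for this sketch.
class SketchLineFilter
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @buffer = "".dup # unfrozen buffer for partial lines
  end

  # Accept a raw chunk and return every complete line seen so far,
  # keeping any trailing partial line buffered for the next call.
  def decode(chunk)
    @buffer << chunk
    lines = []
    while (index = @buffer.index(@delimiter))
      lines << @buffer.slice!(0, index)   # the line, without its delimiter
      @buffer.slice!(0, @delimiter.size)  # drop the delimiter itself
    end
    lines
  end

  # Frame outgoing messages by appending the delimiter to each one.
  def encode(*messages)
    messages.map { |message| message + @delimiter }.join
  end
end

filter = SketchLineFilter.new
first  = filter.decode("PING :a\nNOTI") # ["PING :a"], "NOTI" stays buffered
second = filter.decode("CE x\n")        # ["NOTICE x"]
wire   = filter.encode("a", "b")        # "a\nb\n"
```

The second call shows the buffering at work: a message split across two
reads is only delivered once its newline arrives.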

Filter chains can be specified by passing an array:

  sock = Revactor::TCP.connect(host, port, :filter => [MyFilter, :line])

You can pass arguments to your filter's initialize method by placing an array
containing the class followed by its arguments in the filter chain:

  sock = Revactor::TCP.connect(host, port, :filter => [[MyFilter, 42], :line])

In addition to the line filter, Revactor bundles a :packet filter.  This filter
constructs messages with a length prefix that specifies the size of the
remaining message.  This is a simple and straightforward way to frame
discrete messages on top of a streaming protocol like TCP, and is used for,
among other things, DRb.
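
Length-prefix framing can be sketched in plain Ruby as follows.
SketchPacketFilter is a hypothetical class, and the 4-byte big-endian
prefix is an assumption chosen for the example; the text above does not
state which prefix size the :packet filter uses:

```ruby
# Hypothetical length-prefix framing.  The 4-byte big-endian prefix and the
# decode/encode signatures are assumptions for this sketch.
class SketchPacketFilter
  PREFIX_SIZE = 4

  def initialize
    @buffer = "".b # binary buffer for bytes not yet forming a full message
  end

  # Prepend each message with its byte length packed as a 32-bit integer.
  def encode(*messages)
    messages.map { |m| [m.bytesize].pack("N") + m.b }.join
  end

  # Accept a raw chunk and return every complete message now available.
  def decode(chunk)
    @buffer << chunk.b
    messages = []
    loop do
      break if @buffer.bytesize < PREFIX_SIZE
      length = @buffer.byteslice(0, PREFIX_SIZE).unpack("N").first
      break if @buffer.bytesize < PREFIX_SIZE + length
      messages << @buffer.byteslice(PREFIX_SIZE, length)
      @buffer = @buffer.byteslice(PREFIX_SIZE + length, @buffer.bytesize)
    end
    messages
  end
end

packets = SketchPacketFilter.new
wire    = packets.encode("hello", "hi")         # 15 bytes on the wire
partial = packets.decode(wire.byteslice(0, 6))  # []: "hello" is incomplete
rest    = packets.decode(wire.byteslice(6, 9))  # ["hello", "hi"]
```

Because the receiver knows each message's size up front, message bodies may
contain any bytes, including newlines, which a delimiter-based filter
cannot allow.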

== Mongrel

Revactor includes complete support for running Mongrel on top of Actors and
Revactor::TCP.  The implementation monkeypatches two methods in the
Mongrel::HttpServer class.  Initial benchmarks show better throughput
and concurrency than threaded Mongrel running on Ruby 1.8 or 1.9.

To use Mongrel on top of Revactor, just:

  require 'revactor/mongrel'

Then call Mongrel within an Actor.start block.  The semantics are the same.
The only difference is that Actors, not Threads, are being used for 
concurrency.

== Roadmap

Revactor is still in its infancy.  Erlang and its Open Telecom Platform form an
extremely feature-rich platform, and many features can be borrowed and
incorporated into Revactor.

Short term goals include adding thread safety.  This will allow Actors in
different threads to send messages to each other, making it easy to spin off
long-term blocking tasks into separate threads.

Next on the agenda is implementing DRb.  This should be possible simply by
monkeypatching the existing DRb implementation to run on top of Revactor::TCP.
Once DRb has been implemented it should be fairly trivial to implement
distributed Actor networks over TCP using DRb as the underlying message
passing protocol.

Long term items include implementation of more Filters and protocol support
modules.  These include an HTTP client (subclassed from the client in Rev),
as well as an HTTP server adapter (using the Mongrel parser).