
<chapter id="user-building-blocks"><title>Building Blocks</title>

    <para>
        Building blocks are layered on top of channels, and can be used instead of channels whenever
        a higher-level interface is required.
    </para>

    <para>
        Whereas channels are simple socket-like constructs, building blocks may offer a far more sophisticated
        interface. In some cases, building blocks offer access to the underlying channel, so that -- if the building
        block at hand does not offer a certain functionality -- the channel can be accessed directly. Building blocks
        are located in the <classname>org.jgroups.blocks</classname> package.
    </para>


  <section id="MessageDispatcher">
      <title>MessageDispatcher</title>

      <para>
          Channels are simple patterns to <emphasis>asynchronously</emphasis>
          send and receive messages. However, a significant number of communication patterns in group communication
          require <emphasis>synchronous</emphasis> communication. For example, a sender would like to send a message to
          the group and wait for all responses. Or another application would like to send a message to the group and
          wait only until the majority of the receivers have sent a response, or until a timeout occurs.
      </para>

      <para>
          <classname>MessageDispatcher</classname> provides synchronous (blocking) as well as asynchronous
          (non-blocking) request sending, together with request-response correlation, i.e. matching one or
          multiple responses with the original request.
      </para>

      <para>
          An example of using this class would be to send a request message to all cluster members, and block until all
          responses have been received, or until a timeout has elapsed.
      </para>

      <para>
          Contrary to <xref linkend="RpcDispatcher">RpcDispatcher</xref>, MessageDispatcher deals with
          <emphasis>sending message requests and correlating message responses</emphasis>, while RpcDispatcher deals
          with <emphasis>invoking method calls and correlating responses</emphasis>. RpcDispatcher extends
          MessageDispatcher, and offers an even higher level of abstraction over MessageDispatcher.
      </para>

      <para>
          RpcDispatcher is essentially a way to invoke remote procedure calls (RPCs) across a cluster.
      </para>

      <para>
          Both MessageDispatcher and RpcDispatcher sit on top of a channel; therefore an instance of
          <classname>MessageDispatcher</classname> is created with a channel as argument. It can now be
          used in both <emphasis>client and server role</emphasis>: a client sends requests and receives responses and
          a server receives requests and sends responses. <classname>MessageDispatcher</classname> allows for an
          application to be both at the same time. To be able to serve requests in the server role, the
          <methodname>RequestHandler.handle()</methodname> method has to be implemented:
      </para>
      <programlisting language="Java">Object handle(Message msg) throws Exception;</programlisting>


      <para>
          The <methodname>handle()</methodname> method is called whenever a request is received. It must return a value
          (must be serializable, but can be null) or throw an exception. The returned value will be sent to the sender,
          and exceptions are also propagated to the sender.
      </para>

      <para>
          Before looking at the methods of MessageDispatcher, let's take a look at RequestOptions first.
      </para>

      <section id="RequestOptions">
          <title>RequestOptions</title>
          <para>
              Every message sending in MessageDispatcher or request invocation in RpcDispatcher is governed by an
              instance of RequestOptions. This is a class which can be passed to a call to define the various
              options related to the call, e.g. a timeout, whether the call should block or not, the flags (see
              <xref linkend="MessageFlags"/>) etc.
          </para>
          <para>
              The various options are:
              <itemizedlist>
                  <listitem>
                      Response mode: this determines whether the call is blocking and - if yes - which
                      responses it should wait for. The modes are:
                      <itemizedlist>
                          <listitem>GET_ALL: block until responses from all members (minus the suspected ones) have
                              been received.
                          </listitem>
                          <listitem>GET_NONE: wait for none. This makes the call non-blocking</listitem>
                          <listitem>GET_FIRST: block until the first response (from anyone) has been received</listitem>
                          <listitem>GET_MAJORITY: block until a majority of members have responded</listitem>
                      </itemizedlist>
                  </listitem>
                  <listitem>
                      Timeout: number of milliseconds we're willing to block. If the call hasn't terminated after the
                      timeout elapsed, a TimeoutException will be thrown. A timeout of 0 means to wait forever. The
                      timeout is ignored if the call is non-blocking (mode=GET_NONE)
                  </listitem>
                  <listitem>
                      Anycasting: if set to true, unicasts are sent to the individual destination members instead
                      of a multicast. For example, if we have TCP as transport, the cluster is {A,B,C,D,E}, and we
                      send a message through MessageDispatcher with dests={C,D}, then without anycasting the request
                      would be sent to everyone, and everyone except C and D would discard it. If we do
                      <emphasis>not</emphasis> want that, we set anycasting=true. This will send the request to C
                      and D only, as unicasts, which is better if we use a transport such as TCP which cannot use
                      IP multicasting (sending 1 packet to reach all members).
                  </listitem>
                  <listitem>
                      Response filter: A RspFilter allows for filtering of responses and user-defined termination of
                      a call. For example, if we expect responses from 10 members, but can return after having
                      received 3 non-null responses, a RspFilter could be used. See <xref linkend="RspFilter"/> for
                      a discussion on response filters.
                  </listitem>
                  <listitem>
                      Scope: a short, defining a scope. This allows for concurrent delivery of messages from the same
                      sender. See <xref linkend="Scopes"/> for a discussion on scopes.
                  </listitem>
                  <listitem>
                      Flags: the various flags to be passed to the message, see <xref linkend="MessageFlags"/> for details.
                  </listitem>
                  <listitem>
                      Exclusion list: here we can pass a list of members (addresses) that should be excluded. For example,
                      if the view is A,B,C,D,E, and we set exclusion list to A,C then the caller will wait for responses
                      from everyone except A and C.
                  </listitem>
              </itemizedlist>
          </para>

          <para>
          An example of how to use RequestOptions is:
          </para>

          <programlisting language="Java">
RpcDispatcher disp;
RequestOptions opts=new RequestOptions(ResponseMode.GET_ALL, 5000)
                    .setFlags(Message.NO_FC | Message.DONT_BUNDLE);
Object val=disp.callRemoteMethod(target, method_call, opts);
          </programlisting>
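
          <para>
              To make the response modes more concrete, the sketch below models when a blocking call may
              return. It is plain Java and not JGroups code: the class <classname>ResponseModeSketch</classname>
              and the method <methodname>canReturn()</methodname> are hypothetical, and we assume that a
              majority means more than half of the expected members.
          </para>

          <programlisting language="Java">
// Hypothetical model of the response modes; this is not JGroups code.
final class ResponseModeSketch {
    enum Mode { GET_ALL, GET_NONE, GET_FIRST, GET_MAJORITY }

    // expected: members we still expect a response from (suspected ones removed)
    // received: responses received so far
    static boolean canReturn(Mode mode, int expected, int received) {
        switch (mode) {
            case GET_NONE:
                return true;                       // non-blocking, never waits
            case GET_FIRST:
                return received >= Math.min(1, expected);
            case GET_MAJORITY:
                return received >= Math.min(expected, expected / 2 + 1);
            case GET_ALL:
                return received >= expected;
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values())
            System.out.println(mode + ": " + canReturn(mode, 5, 3));
    }
}
          </programlisting>

          <para>
              With 5 expected members and 3 responses received, only GET_ALL still has to wait in this model.
          </para>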

      </section>

      <para>The methods to send requests are:</para>

      <programlisting language="Java">
public &lt;T&gt; RspList&lt;T&gt;
       castMessage(final Collection&lt;Address&gt; dests,
                   Message msg,
                   RequestOptions options) throws Exception;
public &lt;T&gt; NotifyingFuture&lt;RspList&lt;T&gt;&gt;
       castMessageWithFuture(final Collection&lt;Address&gt; dests,
                             Message msg,
                             RequestOptions options) throws Exception;
public &lt;T&gt; T sendMessage(Message msg,
                         RequestOptions opts) throws Exception;
public &lt;T&gt; NotifyingFuture&lt;T&gt;
       sendMessageWithFuture(Message msg,
                             RequestOptions options) throws Exception;
      </programlisting>

      <para>
          <methodname>castMessage()</methodname> sends a message to all members defined in
          <parameter>dests</parameter>. If <parameter>dests</parameter> is null, the message will be sent to all
          members of the current cluster. Note that a possible destination set in the message will be overridden.
          If a message is sent synchronously (defined by options.mode) then <parameter>options.timeout</parameter>
          defines the maximum amount of time (in milliseconds) to wait for the responses.
      </para>

      <para>
          <methodname>castMessage()</methodname> returns a RspList, which contains a map of addresses and Rsps;
          there's one Rsp per member listed in <parameter>dests</parameter>.
      </para>

      <para>
          A Rsp instance contains the response value (or null), an exception if the target handle() method threw
          an exception, whether the target member was suspected, or not, and so on. See the example below for
          more details.
      </para>

      <para>
          <methodname>castMessageWithFuture()</methodname> returns immediately, with a future. The future
          can be used to fetch the response list (now or later), and it also allows for installation of a callback
          which will be invoked whenever the future is done.
          See <xref linkend="NotifyingFuture"/> for details on how to use NotifyingFutures.
      </para>


      <para>
          <methodname>sendMessage()</methodname> allows an application programmer to send a unicast message to a
          single cluster member and receive the response. The destination of the message has to be non-null (valid
          address of a member). The <parameter>mode</parameter> argument is ignored (it is by default set to
          <constant>ResponseMode.GET_FIRST</constant>) unless it is set to <constant>GET_NONE</constant> in which case
          the request becomes asynchronous, i.e. we will not wait for the response.
      </para>

      <para>
          <methodname>sendMessageWithFuture()</methodname> returns immediately with a future, which can be used to
          fetch the result.
      </para>

      <para>
          One advantage of using this building block is that failed members are removed from the set of expected
          responses. For example, when sending a message to 10 members and waiting for all responses, and 2 members
          crash before being able to send a response, the call will return with 8 valid responses and 2 marked as
          failed. The return value of <methodname>castMessage()</methodname> is a <classname>RspList</classname>
          which contains all responses (not all methods shown):
      </para>

      <programlisting language="Java">
public class RspList&lt;T&gt; implements Map&lt;Address,Rsp&gt; {
    public boolean isReceived(Address sender);
    public int     numSuspectedMembers();
    public List&lt;T&gt; getResults();
    public List&lt;Address&gt; getSuspectedMembers();
    public boolean isSuspected(Address sender);
    public Object  get(Address sender);
    public int     size();
}
      </programlisting>

      <para>
          <methodname>isReceived()</methodname> checks whether a response from <parameter>sender</parameter>
          has already been received. Note that it returns false as long as no response from that member has
          arrived; a member which failed is reported by <methodname>isSuspected()</methodname> instead.
          <methodname>numSuspectedMembers()</methodname> returns the number of
          members that failed (e.g. crashed) during the wait for responses. <methodname>getResults()</methodname>
          returns a list of return values. <methodname>get()</methodname> returns the return value for a specific member.
      </para>
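
      <para>
          The bookkeeping for the 10-member scenario above can be sketched as follows. The enum
          <classname>State</classname> is a hypothetical stand-in for what a Rsp records about one member;
          the real classes carry more information (value, exception and so on).
      </para>

      <programlisting language="Java">
import java.util.Arrays;

// Hypothetical model; State stands in for what a Rsp records about one member.
final class ResponseTallySketch {
    enum State { PENDING, RECEIVED, SUSPECTED }

    // A call waiting for all responses can return once no member is pending
    static boolean allAccountedFor(State... states) {
        return Arrays.stream(states).noneMatch(s -> s == State.PENDING);
    }

    static long count(State wanted, State... states) {
        return Arrays.stream(states).filter(s -> s == wanted).count();
    }

    public static void main(String[] args) {
        State[] states = new State[10];
        Arrays.fill(states, State.RECEIVED);
        states[3] = State.SUSPECTED;               // two members crashed
        states[7] = State.SUSPECTED;
        System.out.println(allAccountedFor(states));          // true
        System.out.println(count(State.RECEIVED, states));    // 8
        System.out.println(count(State.SUSPECTED, states));   // 2
    }
}
      </programlisting>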

    <section id="MessageDispatcherExample">
        <title>Example</title>

      <para>
          This section shows an example of how to use a <classname>MessageDispatcher</classname>.
      </para>
    
      <programlisting language="Java">
public class MessageDispatcherTest implements RequestHandler {
    Channel            channel;
    MessageDispatcher  disp;
    RspList            rsp_list;
    String             props; // to be set by application programmer

    public void start() throws Exception {
        channel=new JChannel(props);
        disp=new MessageDispatcher(channel, null, null, this);
        channel.connect("MessageDispatcherTestGroup");

        for(int i=0; i &lt; 10; i++) {
            Util.sleep(100);
            System.out.println("Casting message #" + i);
            rsp_list=disp.castMessage(null,
                new Message(null, null, "Number #" + i),
                new RequestOptions(ResponseMode.GET_ALL, 0));
            System.out.println("Responses:\n" +rsp_list);
        }
        channel.close();
        disp.stop();
    }

    public Object handle(Message msg) throws Exception {
        System.out.println("handle(): " + msg);
        return "Success !";
    }

    public static void main(String[] args) {
        try {
            new MessageDispatcherTest().start();
        }
        catch(Exception e) {
            System.err.println(e);
        }
    }
}
      </programlisting>

        <para>
            The example starts with the creation of a channel. Next, an instance of
            <classname>MessageDispatcher</classname> is created on top of the channel. Then the channel is connected. The
            <classname>MessageDispatcher</classname> will from now on send requests, receive matching responses
            (client role) and receive requests and send responses (server role).
        </para>

        <para>
            We then send 10 messages to the group and wait for all responses. The <parameter>timeout</parameter>
            argument is 0, which causes the call to block until all responses have been received.
        </para>

        <para>
            The <methodname>handle()</methodname> method simply prints out a message and returns a string. This will
            be sent back to the caller as a response value (in Rsp.value). Had <methodname>handle()</methodname>
            thrown an exception, Rsp.exception would be set instead.
        </para>

        <para>
            Finally both the <classname>MessageDispatcher</classname> and channel are closed.
        </para>

    </section>
    
  </section>

    
  <section id="RpcDispatcher">
      <title>RpcDispatcher</title>

      <para>
          <classname>RpcDispatcher</classname> is derived from <classname>MessageDispatcher</classname>. It allows a
          programmer to invoke remote methods in all (or single) cluster members and optionally wait for the return
          value(s). An application will typically create a channel first, and then create an
          <classname>RpcDispatcher</classname> on top of it. RpcDispatcher can be used to invoke remote methods
          (client role) and at the same time be called by other members (server role).
      </para>

      <para>
          Compared to <classname>MessageDispatcher</classname>, no <methodname>handle()</methodname>
          method needs to be implemented. Instead the methods to be called can be placed directly in the class using
          regular method definitions (see example below). The methods will get invoked using reflection.
      </para>

      <para>
          To invoke remote method calls (unicast and multicast) the following methods are used:
      </para>

    <programlisting language="Java">
public &lt;T&gt; RspList&lt;T&gt;
       callRemoteMethods(Collection&lt;Address&gt; dests,
                         String method_name,
                         Object[] args,
                         Class[] types,
                         RequestOptions options) throws Exception;
public &lt;T&gt; RspList&lt;T&gt;
       callRemoteMethods(Collection&lt;Address&gt; dests,
                         MethodCall method_call,
                         RequestOptions options) throws Exception;
public &lt;T&gt; NotifyingFuture&lt;RspList&lt;T&gt;&gt;
       callRemoteMethodsWithFuture(Collection&lt;Address&gt; dests,
                                   MethodCall method_call,
                                   RequestOptions options) throws Exception;
public &lt;T&gt; T callRemoteMethod(Address dest,
                              String method_name,
                              Object[] args,
                              Class[] types,
                              RequestOptions options) throws Exception;
public &lt;T&gt; T callRemoteMethod(Address dest,
                              MethodCall call,
                              RequestOptions options) throws Exception;
public &lt;T&gt; NotifyingFuture&lt;T&gt;
       callRemoteMethodWithFuture(Address dest,
                                  MethodCall call,
                                  RequestOptions options) throws Exception;
    </programlisting>

      <para>
          The family of <methodname>callRemoteMethods()</methodname> methods is invoked with a list of receiver
          addresses. If the list is null, the method will be invoked in all cluster members (including the sender).
          Besides the target members, each call takes a method and a RequestOptions instance.
      </para>

      <para>
          The method can be given as (1) the method name, (2) the arguments and (3) the argument types, or a
          <classname>MethodCall</classname> (containing a java.lang.reflect.Method and its arguments) can be given instead.
      </para>

      <para>
          As with <classname>MessageDispatcher</classname>, a RspList or a future to a RspList is returned.
      </para>

      <para>
          The family of <methodname>callRemoteMethod()</methodname> methods takes almost the same parameters, except
          that there is only one destination address instead of a list. If the <parameter>dest</parameter>
          argument is null, the call will fail.
      </para>

      <para>
          The <methodname>callRemoteMethod()</methodname> calls return the actual result (of type T), or throw an
          exception if the method threw an exception on the target member.
      </para>

      <para>
          Java's Reflection API is used to find the correct method in the target member according to the method name and
          number and types of supplied arguments. There is a runtime exception if a method cannot be resolved.
      </para>
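
      <para>
          The lookup by name and argument types can be sketched with the JDK's reflection API alone. The class
          <classname>ReflectiveInvokeSketch</classname> is a hypothetical stand-in for the receiving side of an
          RPC. Note that plain reflection reports an unresolved method with the checked
          <classname>NoSuchMethodException</classname>; how RpcDispatcher wraps this is not shown here.
      </para>

      <programlisting language="Java">
import java.lang.reflect.Method;

// Hypothetical stand-in for the server side of an RPC; not JGroups code.
final class ReflectiveInvokeSketch {
    public static int print(int number) {
        return number * 2;
    }

    // Resolves a one-argument method by name and argument type, then invokes it
    static Object invoke(String methodName, int arg) throws Exception {
        Method method = ReflectiveInvokeSketch.class.getDeclaredMethod(methodName, int.class);
        return method.invoke(null, arg);           // null target: static method
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invoke("print", 21));   // 42
        try {
            invoke("missing", 1);
        }
        catch (NoSuchMethodException e) {
            System.out.println("cannot resolve: " + e.getMessage());
        }
    }
}
      </programlisting>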

      <para>
          Note that we could also use method IDs and the <classname>MethodLookup</classname> interface to resolve
          methods, which is faster and has every RPC carry less data across the wire. To see how this is done,
          have a look at some of the MethodLookup implementations, e.g. in RpcDispatcherSpeedTest.
      </para>
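
      <para>
          The idea behind method IDs can be sketched as follows: methods are resolved once, and each call then
          only carries a small number instead of a name and a list of types. The class, the IDs and the table
          are hypothetical and do not reproduce the <classname>MethodLookup</classname> interface itself.
      </para>

      <programlisting language="Java">
import java.lang.reflect.Method;

// Hypothetical ID-based lookup; the IDs and the class are illustrative only.
final class MethodIdSketch {
    static final short PRINT = 0;
    static final short NEGATE = 1;

    public static int print(int number)  { return number * 2; }
    public static int negate(int number) { return -number; }

    // Methods are resolved once; afterwards an ID is just an array index
    private static final Method[] METHODS = resolve("print", "negate");

    private static Method[] resolve(String... names) {
        Method[] methods = new Method[names.length];
        int id = 0;
        for (String name : names) {
            try {
                methods[id++] = MethodIdSketch.class.getDeclaredMethod(name, int.class);
            }
            catch (NoSuchMethodException e) {
                throw new IllegalStateException(e);
            }
        }
        return methods;
    }

    static Object invoke(short id, int arg) throws Exception {
        return METHODS[id].invoke(null, arg);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invoke(PRINT, 21));     // 42
        System.out.println(invoke(NEGATE, 5));     // -5
    }
}
      </programlisting>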


    <section id="RpcDispatcherExample">
        <title>Example</title>

      <para>The code below shows an example of using RpcDispatcher:</para>

      <programlisting language="Java">
public class RpcDispatcherTest {
    JChannel           channel;
    RpcDispatcher      disp;
    RspList            rsp_list;
    String             props; // set by application

    public static int print(int number) throws Exception {
        return number * 2;
    }

    public void start() throws Exception {
        MethodCall call=new MethodCall(getClass().getMethod("print", int.class));
        RequestOptions opts=new RequestOptions(ResponseMode.GET_ALL, 5000);
        channel=new JChannel(props);
        disp=new RpcDispatcher(channel, this);
        channel.connect("RpcDispatcherTestGroup");

        for(int i=0; i &lt; 10; i++) {
            Util.sleep(100);
            rsp_list=disp.callRemoteMethods(null,
                                            "print",
                                            new Object[]{i},
                                            new Class[]{int.class},
                                            opts);
            // Alternative: use a (prefabricated) MethodCall:
            // call.setArgs(i);
            // rsp_list=disp.callRemoteMethods(null, call, opts);
            System.out.println("Responses: " + rsp_list);
        }
        channel.close();
        disp.stop();
    }

    public static void main(String[] args) throws Exception {
        new RpcDispatcherTest().start();
    }
}
     </programlisting>


        <para>
            Class <classname>RpcDispatcherTest</classname> defines method <methodname>print()</methodname> which will be
            called subsequently. The entry point <methodname>start()</methodname> creates a channel and an
            <classname>RpcDispatcher</classname> which is layered on top. Method
            <methodname>callRemoteMethods()</methodname> then invokes the remote <methodname>print()</methodname>
            in all cluster members (also in the caller). When all responses have been received, the call returns
            and the responses are printed.
        </para>

        <para>
            As can be seen, the <classname>RpcDispatcher</classname> building block reduces the amount of code that
            needs to be written to implement RPC-based group communication applications by providing a higher
            abstraction level between the application and the primitive channels.
        </para>


        <section id="NotifyingFuture"><title>Asynchronous calls with futures</title>

            <para>
                When invoking a synchronous call, the calling thread is blocked until the response (or responses) has
                been received.
            </para>

            <para>
                A <emphasis>Future</emphasis> allows a caller to return immediately and grab the result(s) later. In
                2.9, two new methods, which return futures, have been added to RpcDispatcher:
            </para>
            <programlisting language="Java">
public &lt;T&gt; NotifyingFuture&lt;RspList&lt;T&gt;&gt;
       callRemoteMethodsWithFuture(Collection&lt;Address&gt; dests,
                                   MethodCall method_call,
                                   RequestOptions options) throws Exception;
public &lt;T&gt; NotifyingFuture&lt;T&gt;
       callRemoteMethodWithFuture(Address dest,
                                  MethodCall call,
                                  RequestOptions options) throws Exception;
            </programlisting>


            <para>
                A NotifyingFuture extends java.util.concurrent.Future, with its regular methods such as isDone(),
                get() and cancel(). NotifyingFuture adds setListener(FutureListener) to get notified when
                the result is available. This is shown in the following code:
            </para>

            <programlisting language="Java">
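NotifyingFuture&lt;RspList&gt; future=dispatcher.callRemoteMethodsWithFuture(...);
future.setListener(new FutureListener&lt;RspList&gt;() {
    public void futureDone(Future&lt;RspList&gt; future) {
        try {
            System.out.println("result is " + future.get());
        }
        catch(Exception e) { // get() throws checked exceptions
            e.printStackTrace();
        }
    }
});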
            </programlisting>
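
            <para>
                The same pattern (return immediately, get notified later) can be tried out without a cluster.
                The sketch below uses the JDK's <classname>CompletableFuture</classname> as a stand-in for
                NotifyingFuture; this substitution and the class <classname>FutureCallbackSketch</classname>
                are assumptions made for illustration only.
            </para>

            <programlisting language="Java">
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Uses the JDK's CompletableFuture as a stand-in for NotifyingFuture.
final class FutureCallbackSketch {
    // Starts an asynchronous "call" and returns what the callback observed
    static int callbackResult(int number) {
        var observed = new AtomicInteger();
        var future = CompletableFuture.supplyAsync(() -> number * 2);
        // The caller is not blocked here; the callback runs once the result is in
        var notified = future.thenAccept(observed::set);
        notified.join();                           // wait only so we can read it
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(callbackResult(21));    // 42
    }
}
            </programlisting>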

        </section>


    </section>

      <section id="RspFilter">
          <title>Response filters</title>
          <para>
              Response filters allow application code to hook into the reception of responses from cluster members and
              can let the request-response execution and correlation code know (1) whether a response is acceptable and
              (2) whether more responses are needed, or whether the call (if blocking) can return. The
              <classname>RspFilter</classname> interface looks as follows:
          </para>
          <programlisting language="Java">
public interface RspFilter {
    boolean isAcceptable(Object response, Address sender);
    boolean needMoreResponses();
}
          </programlisting>

          <para>
              <methodname>isAcceptable()</methodname> is given a response value and the address of the member which sent
              the response, and needs to decide whether the response is valid (should return true) or not
              (should return false).
          </para>

          <para>
              <methodname>needMoreResponses()</methodname> determines whether more responses are needed (true), or
              whether the call can return (false).
          </para>
          <para>
              The sample code below shows how to use a RspFilter:
          </para>
          <programlisting language="Java">
public void testResponseFilter() throws Exception {
    final long timeout = 10 * 1000 ;

    RequestOptions opts;
    opts=new RequestOptions(ResponseMode.GET_ALL,
                            timeout, false,
                            new RspFilter() {
                                int num=0;
                                public boolean isAcceptable(Object response,
                                                            Address sender) {
                                    boolean retval=((Integer)response).intValue() &gt; 1;
                                    if(retval)
                                        num++;
                                    return retval;
                                }
                                public boolean needMoreResponses() {
                                    return num &lt; 2;
                                }
                            });

    RspList rsps=disp1.callRemoteMethods(null, "foo", null, null, opts);
    System.out.println("responses are:\n" + rsps);
    assert rsps.size() == 3;
    assert rsps.numReceived() == 2;
}
          </programlisting>


          <para>
              Here, we invoke a cluster wide RPC (dests=null), which blocks (mode=GET_ALL) for 10 seconds max
              (timeout=10000), but also passes an instance of RspFilter to the call (in options).
          </para>
          <para>
              The filter accepts all responses whose value is greater than 1, and returns as soon as it has received
              2 responses which satisfy the above condition.
          </para>

          <warning>
              <title>Be careful with RspFilters</title>
              <para>
                  If we have a RspFilter which doesn't terminate the call even if responses from all members have
                  been received, we might block forever (if no timeout was given)! For example, if we have 10 members,
                  and every member returns 0 or 1 as return value of foo() in the above code, then
                  <methodname>isAcceptable()</methodname> would always return false, therefore never incrementing 'num',
                  and <methodname>needMoreResponses()</methodname> would always return true; this would never terminate
                  the call if it wasn't for the timeout of 10 seconds!
              </para>
              <para>
                  This will be fixed in 3.1; a blocking call will always return if we've received as many responses as
                  we have members in 'dests', regardless of what the RspFilter says.
              </para>
          </warning>
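
          <para>
              The interplay of the two filter methods can be traced without a cluster. In the sketch below,
              <classname>Filter</classname> is a hypothetical interface which mirrors RspFilter minus the sender
              address, and <methodname>consume()</methodname> is a simplified stand-in for the code which feeds
              responses to the filter; the real correlation code is more involved.
          </para>

          <programlisting language="Java">
// Hypothetical model; Filter mirrors RspFilter without the sender address.
final class RspFilterSketch {
    interface Filter {
        boolean isAcceptable(Object response);
        boolean needMoreResponses();
    }

    // Same rule as the filter above: done after 2 responses greater than 1
    static final class TwoAboveOne implements Filter {
        private int missing = 2;

        public boolean isAcceptable(Object response) {
            boolean accepted = ((Integer) response) > 1;
            if (accepted)
                missing--;
            return accepted;
        }

        public boolean needMoreResponses() {
            return missing > 0;
        }
    }

    // Returns how many responses were consumed before the filter was
    // satisfied, or -1 if it still wanted more after the last response
    static int consume(Filter filter, int... responses) {
        int consumed = 0;
        for (int response : responses) {
            consumed++;
            filter.isAcceptable(response);
            if (!filter.needMoreResponses())
                return consumed;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(consume(new TwoAboveOne(), 1, 2, 3));  // 3
        System.out.println(consume(new TwoAboveOne(), 1, 1, 1));  // -1
    }
}
          </programlisting>

          <para>
              The second call corresponds to the situation described in the warning: no response is acceptable,
              so the filter keeps asking for more even after every member has answered.
          </para>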
      </section>

  </section>


    <section id="ReplicatedHashMap">
        <title>ReplicatedHashMap</title>

        <para>
            This class was written as a demo of how state can be shared between nodes of a cluster. It has never been
            heavily tested and is therefore not meant to be used in production.
        </para>

        <para>A
            <classname>ReplicatedHashMap</classname> uses a concurrent hashmap internally and allows several
            instances of hashmaps to be created in different processes. All of these instances have exactly the same state at all
            times. When creating such an instance, a cluster name determines which cluster of replicated hashmaps will
            be joined. The new instance will then query the state from existing members and update itself before
            starting to service requests. If there are no existing members, it will simply start with an empty state.
        </para>

        <para>
            Modifications such as <methodname>put()</methodname>, <methodname>clear()</methodname> or
            <methodname>remove()</methodname> will be propagated in orderly fashion to all replicas. Read-only requests
            such as <methodname>get()</methodname> will only be invoked on the local hashmap.
        </para>

        <para>
            Since both keys and values of a hashmap will be sent across the network, they have to be
            serializable. Putting a non-serializable value in the map will result in an exception at marshalling time.
        </para>

        <para>
            A <classname>ReplicatedHashMap</classname> allows listeners to register for notifications, e.g. when data is
            added or removed. All listeners will get notified when such an event occurs. Notification is always local;
            for example in the case of removing an element, first the element is removed in all replicas, which then
            notify their listener(s) of the removal (after the fact).
        </para>

        <para>
            <classname>ReplicatedHashMap</classname> allows members in a group to share common state across process
            and machine boundaries.
        </para>

    </section>


    <section id="ReplCache">
        <title>ReplCache</title>
        <para>
            ReplCache is a distributed cache which - contrary to ReplicatedHashMap - doesn't replicate its values to
            all cluster members, but just to selected backups.
        </para>
        <para>
            A <methodname>put(K,V,R)</methodname> method has a <emphasis>replication count R</emphasis> which determines
            on how many cluster members key K and value V should be stored. When we have 10 cluster members, and R=3,
            then K and V will be stored on 3 members. If one of those members goes down, or leaves the cluster, then a
            different member will be told to store K and V. ReplCache tries to always have R cluster members store K
            and V.
        </para>
        <para>
            A replication count of -1 means that a given key and value should be stored on <emphasis>all</emphasis>
            cluster members.
        </para>
        <para>
            The mapping between a key K and the cluster member(s) on which K will be stored is always deterministic, and
            is computed using a <emphasis>consistent hash function</emphasis>.
        </para>
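        <para>
            A deterministic mapping from a key to R members can be sketched with a simple hash ring. This is a
            generic illustration of consistent hashing under our own assumptions (members are plain strings,
            positions are taken from <methodname>String.hashCode()</methodname>, no virtual nodes); the hash
            function which ReplCache actually uses may differ.
        </para>

        <programlisting language="Java">
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

// Hypothetical ring-based placement; ReplCache's real hash function may differ.
final class ConsistentHashSketch {
    // Picks the members that store key; a count of -1 selects all members
    static String[] pick(String key, int count, String... members) {
        String[] ring = members.clone();
        Arrays.sort(ring, Comparator.comparingInt(String::hashCode));
        int n = ring.length;
        int wanted = (count == -1) ? n : Math.min(count, n);
        // Start at the first member at or after the key's position, else wrap
        int start = IntStream.range(0, n)
                .filter(i -> ring[i].hashCode() >= key.hashCode())
                .findFirst()
                .orElse(0);
        return IntStream.range(0, wanted)
                .mapToObj(i -> ring[(start + i) % n])
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] members = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J"};
        System.out.println(Arrays.toString(pick("some-key", 3, members)));
        System.out.println(pick("some-key", -1, members).length);
    }
}
        </programlisting>

        <para>
            Because the ring is sorted by position, every member computes the same result for a given key and
            membership, regardless of the order in which it learned about the other members.
        </para>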
        <para>
            Note that this class was written as a demo of how state can be shared between nodes of a cluster. It has
            never been heavily tested and is therefore not meant to be used in production.
        </para>
    </section>


    <section id="LockService">
        <title>Cluster wide locking</title>
        <para>
             In 2.12, a new distributed locking service was added, replacing DistributedLockManager. The new service is
            implemented as a protocol and is used via org.jgroups.blocks.locking.LockService.
        </para>
        <para>
            LockService talks to the locking protocol via events. The main abstraction of a distributed lock is an
            implementation of java.util.concurrent.locks.Lock. All lock methods are supported; conditions, however,
            are not fully supported, and still need some more testing (as of July 2011).
        </para>
        <para>
            Below is an example of how LockService is typically used:
        </para>

        <programlisting language="Java">
// locking.xml needs to have a locking protocol
JChannel ch=new JChannel("/home/bela/locking.xml");
LockService lock_service=new LockService(ch);
ch.connect("lock-cluster");
Lock lock=lock_service.getLock("mylock");
lock.lock();
try {
    // do something with the locked resource
}
finally {
    lock.unlock();
}
        </programlisting>

        <para>
            In the example, we create a channel, then a LockService, then connect the channel. If the channel's
            configuration doesn't include a locking protocol, an exception will be thrown.
            Then we grab a lock named "mylock", which we lock and subsequently unlock. If another member P had already
            acquired "mylock", we'd block until P released the lock, or P left the cluster or crashed.
        </para>
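        <para>
            Since the returned object is a java.util.concurrent.locks.Lock, a caller that does not want to block
            indefinitely can use <methodname>tryLock()</methodname> with a timeout instead. The following is a
            minimal sketch under the same assumptions as the example above (a locking.xml with a locking protocol);
            the class name and the 5 second timeout are arbitrary choices for illustration.
        </para>

        <programlisting language="Java">
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

import org.jgroups.JChannel;
import org.jgroups.blocks.locking.LockService;

// Hypothetical class name, for illustration only
public class TryLockSketch {
    public static void main(String[] args) throws Exception {
        JChannel ch=new JChannel("/home/bela/locking.xml");
        LockService lock_service=new LockService(ch);
        ch.connect("lock-cluster");
        Lock lock=lock_service.getLock("mylock");

        // Wait at most 5 seconds for the lock (timeout value is an assumption)
        if(lock.tryLock(5, TimeUnit.SECONDS)) {
            try {
                // do something with the locked resource
            }
            finally {
                lock.unlock();
            }
        }
        else {
            System.err.println("could not acquire mylock within 5 seconds");
        }
        ch.close();
    }
}
        </programlisting>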

        <para>
            Note that the owner of a lock is always a given thread in a cluster, so the owner is the JGroups address and
            the thread ID. This means that different threads inside the same JVM trying to access the same named lock
            will compete for it. If thread-22 grabs the lock first, then thread-5 will block until thread-22
            releases the lock.
        </para>

        <para>
            JGroups includes a demo (org.jgroups.demos.LockServiceDemo), which can be used to interactively experiment
            with distributed locks. LockServiceDemo -h dumps all command line options.
        </para>
        
        <para>
            Currently (Jan 2011), there are 2 protocols which provide locking:
            <xref linkend="PEER_LOCK">PEER_LOCK</xref> and <xref linkend="CENTRAL_LOCK">CENTRAL_LOCK</xref>. The locking
            protocol has to be placed at or towards the top of the stack (close to the channel).
        </para>
        
        <section id="LockingAndMerges">
            <title>Locking and merges</title>

            <para>
                The following scenario is susceptible to network partitioning and subsequent merging: we have a cluster
                view of {A,B,C,D} and then the cluster splits into {A,B} and {C,D}. Assume that B and D now acquire a
                lock "mylock". This is what happens (with the locking protocol being CENTRAL_LOCK):
                <itemizedlist>
                    <listitem>There are 2 coordinators: A for {A,B} and C for {C,D}</listitem>
                    <listitem>B successfully acquires "mylock" from A</listitem>
                    <listitem>D successfully acquires "mylock" from C</listitem>
                    <listitem>The partitions merge back into {A,B,C,D}. A is now the only coordinator; C ceases
                        to be a coordinator</listitem>
                    <listitem>Problem: D still holds a lock which should actually be invalid!</listitem>
                </itemizedlist>
                There is no easy way (via the Lock API) to 'remove' the lock from D. We could, for example, simply
                release D's lock on "mylock", but then there is no way of telling D that the lock it holds is actually
                stale!
            </para>
            
            <para>
                Therefore the recommended solution here is for nodes to listen to MergeView changes if they expect
                merging to occur, and re-acquire all of their locks after a merge, e.g.:
            </para>

            <programlisting language="Java">
Lock l1, l2, l3;
LockService lock_service;
...
public void viewAccepted(View view) {
    if(view instanceof MergeView) {
        new Thread() {
            public void run() {
                lock_service.unlockAll();
                // stop all access to resources protected by l1, l2 or l3
                // every thread needs to re-acquire the locks it holds
            }
        }.start();
    }
}
            </programlisting>
        </section>
    </section>
    

    <section id="ExecutionService">
        <title>Cluster wide task execution</title>
        <para>
            In 2.12, a distributed execution service was added. The new service is implemented as a protocol and is used
            via org.jgroups.blocks.executor.ExecutionService.
        </para>

        <para>
            <classname>ExecutionService</classname> extends java.util.concurrent.ExecutorService and distributes tasks
            submitted to it across the cluster, trying to distribute the tasks to the cluster members as evenly as
            possible. When a cluster member leaves or dies, the tasks it was processing are re-distributed to other
            members in the cluster.
        </para>

        <para>
            ExecutionService talks to the executing protocol via events. The main abstraction is an implementation of
            java.util.concurrent.ExecutorService. All methods are supported, with two restrictions: the Callable or
            Runnable must be Serializable, Externalizable or Streamable, and the result produced by the future must
            also be Serializable, Externalizable or Streamable. If the Callable or Runnable is not, an
            IllegalArgumentException is thrown immediately. If a result is not, a NotSerializableException
            with the name of the class is returned to the Future as the exception cause.
        </para>
        
        <para>
            Below is an example of how ExecutionService is typically used:
        </para>

        <programlisting language="Java">
// executing.xml needs to have an executing protocol
JChannel ch=new JChannel("/home/bela/executing.xml");
ExecutionService exec_service =new ExecutionService(ch);
ch.connect("exec-cluster");
Future&lt;Value&gt; future = exec_service.submit(new MyCallable());
try {
    Value value = future.get();
    // Do something with value
}
catch (InterruptedException e) {
    e.printStackTrace();
}
catch (ExecutionException e) {
    e.getCause().printStackTrace();
}
        </programlisting>

        <para>
            In the example, we create a channel, then an ExecutionService, then connect the channel. Then we submit
            our callable, which gives us a Future. We then block on the future until the task has completed, and do
            something with the returned value. If an exception occurs, we print its stack trace.
        </para>
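        <para>
            <classname>MyCallable</classname> and <classname>Value</classname> are not shown in the example above.
            A minimal sketch of a task that meets the serialization requirement might look as follows. The class
            name <classname>SumTask</classname> and its field are hypothetical, and Integer stands in for
            <classname>Value</classname>. The <methodname>roundTrip()</methodname> helper only verifies locally that
            the task survives Java serialization; it does not involve JGroups.
        </para>

        <programlisting language="Java">
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

// Hypothetical task: both the task and its Integer result are Serializable
public class SumTask implements Callable&lt;Integer&gt;, Serializable {
    private static final long serialVersionUID=1L;
    private final int[] numbers;

    public SumTask(int... numbers) {
        this.numbers=numbers;
    }

    public Integer call() {
        int sum=0;
        for(int n: numbers)
            sum+=n;
        return sum;
    }

    // Serializes and deserializes the task in memory, as a local sanity check
    static SumTask roundTrip(SumTask task) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf=new ByteArrayOutputStream();
        try(ObjectOutputStream out=new ObjectOutputStream(buf)) {
            out.writeObject(task);
        }
        try(ObjectInputStream in=new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return (SumTask)in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SumTask copy=roundTrip(new SumTask(1, 2, 3));
        System.out.println(copy.call()); // prints 6
    }
}
        </programlisting>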

        <para>
            The <classname>ExecutionService</classname> follows the producer-consumer pattern closely, with the
            <classname>ExecutionService</classname> acting as the producer. The service therefore
            only hands tasks off to be handled and is not involved in the actual invocation of those tasks.
            The consumer is a separate class, <classname>ExecutionRunner</classname>, which implements
            java.lang.Runnable and can be run on any node of the cluster.
            A user is required to run one or more instances of an <classname>ExecutionRunner</classname> on a node of
            the cluster. By having a thread run one of these runners, that thread has now volunteered to
            run any task that is submitted to the cluster via an <classname>ExecutionService</classname>. This allows
            any node in the cluster to participate or not participate in the running of these tasks, and a
            node can run more than one <classname>ExecutionRunner</classname> if it has additional
            capacity to do so. A runner runs indefinitely until the thread that is running it is
            interrupted. If a task is running when the runner is interrupted, the task is interrupted as well.
        </para>

        <para>
            Below is an example of how a single node can start runners that allow up to 10 distributed tasks to
            be executed simultaneously on it:
        </para>

        <programlisting language="Java">
int runnerCount = 10;
// executing.xml needs to have an executing protocol
JChannel ch=new JChannel("/home/bela/executing.xml");
ch.connect("exec-cluster");

ExecutionRunner runner = new ExecutionRunner(ch);

ExecutorService service = Executors.newFixedThreadPool(runnerCount);
for (int i = 0; i &lt; runnerCount; ++i) {
   // If you want to stop the runner hold onto the future
   // and cancel with interrupt.
   service.submit(runner);
}
        </programlisting>

        <para>
            In the example, we create a channel, then connect the channel, then create an ExecutionRunner. Then we
            create a java.util.concurrent.ExecutorService which starts 10 threads, each of which runs the
            ExecutionRunner. This gives the node 10 threads that actively accept and work on requests
            submitted via any ExecutionService in the cluster.
        </para>
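        <para>
            As the comment in the example notes, a runner can be stopped by holding on to the future returned by
            <methodname>submit()</methodname> and cancelling it with interruption. A possible sketch is shown below;
            the class name is hypothetical, the package of <classname>ExecutionRunner</classname> is assumed to be
            the same as that of <classname>ExecutionService</classname>, and when exactly the runners are stopped is
            left to the application.
        </para>

        <programlisting language="Java">
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.jgroups.JChannel;
import org.jgroups.blocks.executor.ExecutionRunner; // package name is an assumption

// Hypothetical class name, for illustration only
public class RunnerShutdownSketch {
    public static void main(String[] args) throws Exception {
        int runnerCount=10;
        // executing.xml needs to have an executing protocol
        JChannel ch=new JChannel("/home/bela/executing.xml");
        ch.connect("exec-cluster");

        ExecutionRunner runner=new ExecutionRunner(ch);
        ExecutorService service=Executors.newFixedThreadPool(runnerCount);

        // Keep the futures so that the runners can be stopped later
        List&lt;Future&lt;?&gt;&gt; futures=new ArrayList&lt;&gt;();
        for(int i=0; i &lt; runnerCount; ++i)
            futures.add(service.submit(runner));

        // ... later, when this node should stop accepting tasks:
        for(Future&lt;?&gt; future: futures)
            future.cancel(true); // true = interrupt the thread running the runner
        service.shutdown();
        ch.close();
    }
}
        </programlisting>

        <para>
            Note that, as described above, a task that is running when its runner is interrupted will be interrupted
            too.
        </para>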

        <para>
            Since an ExecutionService does not allow non-serializable class instances to be sent across as tasks,
            2 utility classes are provided to get around this problem. For users who are used to using a
            CompletionService with an Executor, an equivalent ExecutionCompletionService is provided which offers
            the same functionality. It would have been preferable to allow the standard
            ExecutorCompletionService to be used, but because its implementation uses a non-serializable object,
            ExecutionCompletionService was implemented to be used instead, in conjunction with an ExecutorService.
            A second utility class was designed to help users submit tasks which use a non-serializable class. The
            Executions class contains a method serializableCallable, to which a user passes a constructor of a
            class that implements Callable, together with the constructor's arguments. The method returns a Callable
            which, when run, automatically creates an object from the constructor, passing the provided arguments
            to it, then calls the call method on that object and returns its result like a normal callable. All the
            arguments provided must still be serializable, as must the returned object, as detailed previously.
        </para>

        <para>
            JGroups includes a demo (org.jgroups.demos.ExecutionServiceDemo), which can be used to interactively
            experiment with a distributed sort algorithm and performance.  This is for demonstration purposes and
            performance should not be assumed to be better than local.
            ExecutionServiceDemo -h dumps all command line options.
        </para>
        <para>
            Currently (July 2011), there is 1 protocol which provides execution:
            <xref linkend="CENTRAL_EXECUTOR">CENTRAL_EXECUTOR</xref>. The executing protocol has to be placed at or
            towards the top of the stack (close to the channel).
        </para>
    </section>

    <section id="CounterService">
        <title>Cluster wide atomic counters</title>
        <para>
            Cluster wide counters provide named counters (similar to AtomicLong) which can be changed atomically. 2
            nodes incrementing the same counter with initial value 10 will see 11 and 12 as results, respectively.
        </para>
        <para>
            To create a named counter, the following steps have to be taken:
            <orderedlist>
                <listitem>
                    Add protocol COUNTER to the top of the stack configuration
                </listitem>
                <listitem>
                    Create an instance of CounterService
                </listitem>
                <listitem>
                    Create a new or get an existing named counter
                </listitem>
                <listitem>
                    Use the counter: increment, decrement, get, set, compare-and-set, etc.
                </listitem>
            </orderedlist>
        </para>
        <para>
            In the first step, we add COUNTER to the top of the protocol stack configuration:
        </para>
        <programlisting language="Java">
&lt;config&gt;
    ...
    &lt;MFC max_credits="2M"
         min_threshold="0.4"/&gt;
    &lt;FRAG2 frag_size="60K"  /&gt;
    &lt;COUNTER bypass_bundling="true" timeout="5000"/&gt;
&lt;/config&gt;
        </programlisting>
        <para>
            Configuration of the COUNTER protocol is described in <xref linkend="COUNTER">COUNTER</xref>.
        </para>
        <para>
            Next, we create a CounterService, which is used to create and delete named counters:
        </para>
        <programlisting language="Java">
ch=new JChannel(props);
CounterService counter_service=new CounterService(ch);
ch.connect("counter-cluster");
Counter counter=counter_service.getOrCreateCounter("mycounter", 1);
        </programlisting>
        <para>
            In the sample code above, we create a channel first, then create the CounterService referencing the channel.
            Then we connect the channel and finally create a new named counter "mycounter", with an initial value of 1.
            If the counter already exists, the existing counter will be returned and the initial value will be ignored.
        </para>
        <para>
            CounterService doesn't consume any messages from the channel over which it is created; instead it grabs
            a reference to the COUNTER protocol and invokes methods on it directly. This has the advantage that
            CounterService is non-intrusive: many instances can be created over the same channel. CounterService even
            co-exists with other services which use the same mechanism, e.g. LockService or ExecutionService (see above).
        </para>
        <para>
            The returned counter instance implements interface Counter:
        </para>
        <programlisting language="Java">
package org.jgroups.blocks.atomic;

public interface Counter {

    public String getName();

    /**
     * Gets the current value of the counter
     * @return The current value
     */
    public long get();

    /**
     * Sets the counter to a new value
     * @param new_value The new value
     */
    public void set(long new_value);

    /**
     * Atomically updates the counter using a CAS operation
     *
     * @param expect The expected value of the counter
     * @param update The new value of the counter
     * @return True if the counter could be updated, false otherwise
     */
    public boolean compareAndSet(long expect, long update);

    /**
     * Atomically increments the counter and returns the new value
     * @return The new value
     */
    public long incrementAndGet();

    /**
     * Atomically decrements the counter and returns the new value
     * @return The new value
     */
    public long decrementAndGet();


    /**
     * Atomically adds the given value to the current value.
     *
     * @param delta the value to add
     * @return the updated value
     */
    public long addAndGet(long delta);
}
        </programlisting>
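        <para>
            A short usage sketch of this interface follows. It uses only the methods listed above; the class name,
            the helper <methodname>raiseTo()</methodname> and the configuration file path are hypothetical, and
            <classname>CounterService</classname> is assumed to live in the same package as
            <classname>Counter</classname>. The helper shows the usual compare-and-set retry loop: if another member
            changes the counter between <methodname>get()</methodname> and
            <methodname>compareAndSet()</methodname>, the update is retried.
        </para>

        <programlisting language="Java">
import org.jgroups.JChannel;
import org.jgroups.blocks.atomic.Counter;
import org.jgroups.blocks.atomic.CounterService; // package name is an assumption

// Hypothetical class name, for illustration only
public class CounterSketch {

    // Raises the counter to candidate unless it is already at least that large
    static long raiseTo(Counter counter, long candidate) {
        for(;;) {
            long current=counter.get();
            if(current &gt;= candidate)
                return current;
            if(counter.compareAndSet(current, candidate))
                return candidate;
            // the counter was changed concurrently: read it again and retry
        }
    }

    public static void main(String[] args) throws Exception {
        // hypothetical path; the configuration needs to include COUNTER
        JChannel ch=new JChannel("/home/bela/counter.xml");
        CounterService counter_service=new CounterService(ch);
        ch.connect("counter-cluster");

        Counter counter=counter_service.getOrCreateCounter("mycounter", 1);
        System.out.println("after increment: " + counter.incrementAndGet());
        System.out.println("after adding 10: " + counter.addAndGet(10));
        System.out.println("raised to: " + raiseTo(counter, 100));
        ch.close();
    }
}
        </programlisting>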

        <section id="CounterServiceDesign">
            <title>Design</title>
            <para>
                The design of COUNTER is described in detail in
                <ulink url="https://github.com/belaban/JGroups/blob/master/doc/design/CounterService.txt">CounterService</ulink>.
            </para>
            <para>
                In a nutshell, in a cluster the current coordinator maintains a hashmap of named counters. Members send
                requests (increment, decrement etc) to it, and the coordinator atomically applies the requests and
                sends back responses.
            </para>
            <para>
                The advantage of this centralized approach is that - regardless of the size of a cluster - every
                request has a constant execution cost, namely a network round trip.
            </para>
            <para>
                A crash or departure of the coordinator is handled as follows. The coordinator maintains a version for
                every counter value. Whenever the counter value is changed, the version is incremented. For every
                request that modifies a counter, both the counter value and the version are returned to the requester.
                The requester caches all counter values and associated versions in its own local cache.
            </para>
            <para>
                When the coordinator leaves or crashes, the next-in-line member becomes the new coordinator. It then
                starts a reconciliation phase, and discards all requests until the reconciliation phase has completed.
                The reconciliation phase solicits all members for their cached values and versions. To reduce traffic,
                the request also carries all version numbers with it.
            </para>
            <para>
                Clients return values whose versions are higher than the ones shipped by the new coordinator. The new
                coordinator waits until it has received responses from all members, or until timeout milliseconds
                have elapsed. Then it updates its own hashmap with values whose versions are higher than its own.
                Finally, it stops discarding requests and sends a resend message to all clients, asking them to
                resend any requests that might be pending.
            </para>
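            <para>
                The version comparison at the heart of the reconciliation phase can be sketched as follows. This is
                a simplified, single-JVM illustration and not the COUNTER implementation: all class and method names
                are hypothetical, and messaging, timeouts and the resend step are left out.
            </para>

            <programlisting language="Java">
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of version-based reconciliation, not JGroups code
public class ReconciliationSketch {

    static final class VersionedValue {
        final long value, version;
        VersionedValue(long value, long version) {
            this.value=value;
            this.version=version;
        }
    }

    // Client side: reply only with entries that are newer than what the
    // new coordinator already has (its versions are shipped with the request)
    static Map&lt;String,VersionedValue&gt; newerThan(Map&lt;String,VersionedValue&gt; cache,
                                                 Map&lt;String,Long&gt; coordVersions) {
        Map&lt;String,VersionedValue&gt; reply=new HashMap&lt;&gt;();
        for(Map.Entry&lt;String,VersionedValue&gt; entry: cache.entrySet()) {
            Long known=coordVersions.get(entry.getKey());
            if(known == null || entry.getValue().version &gt; known)
                reply.put(entry.getKey(), entry.getValue());
        }
        return reply;
    }

    // Coordinator side: adopt every value whose version is higher than its own
    static void merge(Map&lt;String,VersionedValue&gt; counters, Map&lt;String,VersionedValue&gt; reply) {
        for(Map.Entry&lt;String,VersionedValue&gt; entry: reply.entrySet()) {
            VersionedValue mine=counters.get(entry.getKey());
            if(mine == null || entry.getValue().version &gt; mine.version)
                counters.put(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        // The new coordinator last saw value 10 at version 3
        Map&lt;String,VersionedValue&gt; coord=new HashMap&lt;&gt;();
        coord.put("mycounter", new VersionedValue(10, 3));

        // A client has cached a more recent update: value 12 at version 5
        Map&lt;String,VersionedValue&gt; clientCache=new HashMap&lt;&gt;();
        clientCache.put("mycounter", new VersionedValue(12, 5));

        Map&lt;String,Long&gt; versions=new HashMap&lt;&gt;();
        for(Map.Entry&lt;String,VersionedValue&gt; entry: coord.entrySet())
            versions.put(entry.getKey(), entry.getValue().version);

        merge(coord, newerThan(clientCache, versions));
        System.out.println(coord.get("mycounter").value); // prints 12
    }
}
            </programlisting>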
            <para>
                There's another edge case that also needs to be covered: if a client P updates a counter, and both P and
                the coordinator crash, then the update is lost. To reduce the chances of this happening, COUNTER
                can be configured to replicate all counter changes to one or more backup coordinators. The num_backups
                property defines the number of such backups. Whenever a counter is changed on the current coordinator,
                the coordinator also updates the backups (asynchronously). Setting num_backups to 0 disables this.
            </para>
        </section>


    </section>

 
</chapter>