<chapter id="user-building-blocks"><title>Building Blocks</title>
<para>
Building blocks are layered on top of channels, and can be used instead of channels whenever
a higher-level interface is required.
</para>
<para>
Whereas channels are simple socket-like constructs, building blocks may offer a far more sophisticated
interface. In some cases, building blocks offer access to the underlying channel, so that -- if the building
block at hand does not offer a certain functionality -- the channel can be accessed directly. Building blocks
are located in the <classname>org.jgroups.blocks</classname> package.
</para>
<section id="MessageDispatcher">
<title>MessageDispatcher</title>
<para>
Channels are simple patterns to <emphasis>asynchronously</emphasis>
send and receive messages. However, a significant number of communication patterns in group communication
require <emphasis>synchronous</emphasis> communication. For example, a sender would like to send a message to
the group and wait for all responses. Or another application would like to send a message to the group and
wait only until the majority of the receivers have sent a response, or until a timeout occurred.
</para>
<para>
<classname>MessageDispatcher</classname> provides blocking (and non-blocking) request sending and response
correlation. It offers synchronous (as well as asynchronous) message sending with request-response
correlation, e.g. matching one or multiple responses with the original request.
</para>
<para>
An example of using this class would be to send a request message to all cluster members, and block until all
responses have been received, or until a timeout has elapsed.
</para>
<para>
Contrary to <xref linkend="RpcDispatcher">RpcDispatcher</xref>, MessageDispatcher deals with
<emphasis>sending message requests and correlating message responses</emphasis>, while RpcDispatcher deals
with <emphasis>invoking method calls and correlating responses</emphasis>. RpcDispatcher extends
MessageDispatcher, and offers an even higher level of abstraction over MessageDispatcher.
</para>
<para>
RpcDispatcher is essentially a way to invoke remote procedure calls (RPCs) across a cluster.
</para>
<para>
Both MessageDispatcher and RpcDispatcher sit on top of a channel; therefore an instance of
<classname>MessageDispatcher</classname> is created with a channel as argument. It can now be
used in both <emphasis>client and server role</emphasis>: a client sends requests and receives responses and
a server receives requests and sends responses. <classname>MessageDispatcher</classname> allows for an
application to be both at the same time. To be able to serve requests in the server role, the
<methodname>RequestHandler.handle()</methodname> method has to be implemented:
</para>
<programlisting language="Java">Object handle(Message msg) throws Exception;</programlisting>
<para>
The <methodname>handle()</methodname> method is called whenever a request is received. It must return a value
(must be serializable, but can be null) or throw an exception. The returned value will be sent to the sender,
and exceptions are also propagated to the sender.
</para>
<para>
Before looking at the methods of MessageDispatcher, let's take a look at RequestOptions first.
</para>
<section id="RequestOptions">
<title>RequestOptions</title>
<para>
Every message sending in MessageDispatcher or request invocation in RpcDispatcher is governed by an
instance of RequestOptions. This is a class which can be passed to a call to define the various
options related to the call, e.g. a timeout, whether the call should block or not, the flags (see
<xref linkend="MessageFlags"/>) etc.
</para>
<para>
The various options are:
<itemizedlist>
<listitem>
Response mode: this determines whether the call is blocking and - if yes - how long
it should block. The modes are:
<itemizedlist>
<listitem>GET_ALL: block until responses from all members (minus the suspected ones) have
been received.
</listitem>
<listitem>GET_NONE: wait for none. This makes the call non-blocking</listitem>
<listitem>GET_FIRST: block until the first response (from anyone) has been received</listitem>
<listitem>GET_MAJORITY: block until a majority of members have responded</listitem>
</itemizedlist>
</listitem>
<listitem>
Timeout: number of milliseconds we're willing to block. If the call hasn't terminated after the
timeout elapsed, a TimeoutException will be thrown. A timeout of 0 means to wait forever. The
timeout is ignored if the call is non-blocking (mode=GET_NONE)
</listitem>
<listitem>
Anycasting: if set to true, the request is sent as unicasts to the individual target members rather than
as a multicast. For example, if we have TCP as transport, the cluster is {A,B,C,D,E}, and we
send a message through MessageDispatcher where dests={C,D}, then we do <emphasis>not</emphasis>
want to send the request to everyone and have everyone except C and D discard the message. In this
case we'd set anycasting=true. This will send the request to C and D only, as unicasts, which is better if
we use a transport such as TCP which cannot use IP multicasting (sending 1 packet to reach all
members).
</listitem>
<listitem>
Response filter: A RspFilter allows for filtering of responses and user-defined termination of
a call. For example, if we expect responses from 10 members, but can return after having
received 3 non-null responses, a RspFilter could be used. See <xref linkend="RspFilter"/> for
a discussion on response filters.
</listitem>
<listitem>
Scope: a short, defining a scope. This allows for concurrent delivery of messages from the same
sender. See <xref linkend="Scopes"/> for a discussion on scopes.
</listitem>
<listitem>
Flags: the various flags to be passed to the message, see <xref linkend="MessageFlags"/> for details.
</listitem>
<listitem>
Exclusion list: here we can pass a list of members (addresses) that should be excluded. For example,
if the view is A,B,C,D,E, and we set exclusion list to A,C then the caller will wait for responses
from everyone except A and C.
</listitem>
</itemizedlist>
</para>
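<para>
The response modes above can be illustrated with a small, self-contained sketch that computes how many
responses a blocking call would wait for. The <classname>ResponseModeSketch</classname> class and its
<classname>Mode</classname> enum are hypothetical stand-ins written for this illustration, not JGroups
classes, and "majority" is assumed to mean more than half of the (non-suspected) target members.
</para>
<programlisting language="Java">
class ResponseModeSketch {
    // Local stand-in for the four response modes listed above (not the JGroups enum)
    enum Mode { GET_ALL, GET_NONE, GET_FIRST, GET_MAJORITY }

    // Number of responses a blocking call waits for, given the number of
    // non-suspected target members. Assumption: majority = floor(n/2) + 1.
    static int requiredResponses(Mode mode, int members) {
        switch(mode) {
            case GET_ALL:      return members;
            case GET_FIRST:    return Math.min(1, members);
            case GET_MAJORITY: return members == 0? 0 : members / 2 + 1;
            default:           return 0; // GET_NONE: the call does not block
        }
    }

    public static void main(String[] args) {
        for(Mode mode: Mode.values())
            System.out.println(mode + " with 5 members waits for "
                               + requiredResponses(mode, 5) + " response(s)");
    }
}
</programlisting>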
<para>
An example of how to use RequestOptions is:
</para>
<programlisting language="Java">
RpcDispatcher disp;
// block until all members have responded, for at most 5 seconds
RequestOptions opts=new RequestOptions(ResponseMode.GET_ALL, 5000)
                      .setFlags(Message.NO_FC | Message.DONT_BUNDLE);
Object val=disp.callRemoteMethod(target, method_call, opts);
</programlisting>
</section>
<para>The methods to send requests are:</para>
<programlisting language="Java">
public <T> RspList<T>
castMessage(final Collection<Address> dests,
Message msg,
RequestOptions options) throws Exception;
public <T> NotifyingFuture<RspList<T>>
castMessageWithFuture(final Collection<Address> dests,
Message msg,
RequestOptions options) throws Exception;
public <T> T sendMessage(Message msg,
RequestOptions opts) throws Exception;
public <T> NotifyingFuture<T>
sendMessageWithFuture(Message msg,
RequestOptions options) throws Exception;
</programlisting>
<para>
<methodname>castMessage()</methodname> sends a message to all members defined in
<parameter>dests</parameter>. If <parameter>dests</parameter> is null, the message will be sent to all
members of the current cluster. Note that a possible destination set in the message will be overridden.
If a message is sent synchronously (defined by options.mode) then <parameter>options.timeout</parameter>
defines the maximum amount of time (in milliseconds) to wait for the responses.
</para>
<para>
<methodname>castMessage()</methodname> returns a RspList, which contains a map of addresses and Rsps;
there's one Rsp per member listed in <parameter>dests</parameter>.
</para>
<para>
A Rsp instance contains the response value (or null), an exception if the target handle() method threw
an exception, whether the target member was suspected, or not, and so on. See the example below for
more details.
</para>
<para>
<methodname>castMessageWithFuture()</methodname> returns immediately, with a future. The future
can be used to fetch the response list (now or later), and it also allows for installation of a callback
which will be invoked whenever the future is done.
See <xref linkend="NotifyingFuture"/> for details on how to use NotifyingFutures.
</para>
<para>
<methodname>sendMessage()</methodname> allows an application programmer to send a unicast message to a
single cluster member and receive the response. The destination of the message has to be non-null (valid
address of a member). The <parameter>mode</parameter> argument is ignored (it is by default set to
<constant>ResponseMode.GET_FIRST</constant>) unless it is set to <constant>GET_NONE</constant> in which case
the request becomes asynchronous, i.e. we will not wait for the response.
</para>
<para>
<methodname>sendMessageWithFuture()</methodname> returns immediately with a future, which can be used to
fetch the result.
</para>
<para>
One advantage of using this building block is that failed members are removed from the set of expected
responses. For example, when sending a message to 10 members and waiting for all responses, and 2 members
crash before being able to send a response, the call will return with 8 valid responses and 2 marked as
failed. The return value of <methodname>castMessage()</methodname> is a <classname>RspList</classname>
which contains all responses (not all methods shown):
</para>
<programlisting language="Java">
public class RspList<T> implements Map<Address,Rsp> {
public boolean isReceived(Address sender);
public int numSuspectedMembers();
public List<T> getResults();
public List<Address> getSuspectedMembers();
public boolean isSuspected(Address sender);
public Object get(Address sender);
public int size();
}
</programlisting>
<para>
<methodname>isReceived()</methodname> checks whether a response from <parameter>sender</parameter>
has already been received. Note that this is false as long as no response from that member has arrived;
<methodname>isSuspected()</methodname> tells whether the member has been marked as failed.
<methodname>numSuspectedMembers()</methodname> returns the number of
members that failed (e.g. crashed) during the wait for responses. <methodname>getResults()</methodname>
returns a list of return values. <methodname>get()</methodname> returns the return value for a specific member.
</para>
<section id="MessageDispatcherExample">
<title>Example</title>
<para>
This section shows an example of how to use a <classname>MessageDispatcher</classname>.
</para>
<programlisting language="Java">
import org.jgroups.*;
import org.jgroups.blocks.*;
import org.jgroups.util.*;

public class MessageDispatcherTest implements RequestHandler {
    Channel           channel;
    MessageDispatcher disp;
    RspList           rsp_list;
    String            props; // to be set by application programmer

    public void start() throws Exception {
        channel=new JChannel(props);
        disp=new MessageDispatcher(channel, null, null, this);
        channel.connect("MessageDispatcherTestGroup");

        for(int i=0; i < 10; i++) {
            Util.sleep(100);
            System.out.println("Casting message #" + i);
            // dests=null: send to all members; timeout=0: block until all responses are in
            rsp_list=disp.castMessage(null,
                                      new Message(null, null, "Number #" + i),
                                      new RequestOptions(ResponseMode.GET_ALL, 0));
            System.out.println("Responses:\n" +rsp_list);
        }
        channel.close();
        disp.stop();
    }

    // server role: invoked for every request received from the cluster
    public Object handle(Message msg) throws Exception {
        System.out.println("handle(): " + msg);
        return "Success !";
    }

    public static void main(String[] args) {
        try {
            new MessageDispatcherTest().start();
        }
        catch(Exception e) {
            System.err.println(e);
        }
    }
}
</programlisting>
<para>
The example starts with the creation of a channel. Next, an instance of
<classname>MessageDispatcher</classname> is created on top of the channel. Then the channel is connected. The
<classname>MessageDispatcher</classname> will from now on send requests, receive matching responses
(client role) and receive requests and send responses (server role).
</para>
<para>
We then send 10 messages to the group and wait for all responses. The <parameter>timeout</parameter>
argument is 0, which causes the call to block until all responses have been received.
</para>
<para>
The <methodname>handle()</methodname> method simply prints out a message and returns a string. This will
be sent back to the caller as a response value (in Rsp.value). Had the call thrown an exception,
Rsp.exception would be set instead.
</para>
<para>
Finally both the <classname>MessageDispatcher</classname> and channel are closed.
</para>
</section>
</section>
<section id="RpcDispatcher">
<title>RpcDispatcher</title>
<para>
<classname>RpcDispatcher</classname> is derived from <classname>MessageDispatcher</classname>. It allows a
programmer to invoke remote methods in all (or single) cluster members and optionally wait for the return
value(s). An application will typically create a channel first, and then create an
<classname>RpcDispatcher</classname> on top of it. RpcDispatcher can be used to invoke remote methods
(client role) and at the same time be called by other members (server role).
</para>
<para>
Compared to <classname>MessageDispatcher</classname>, no <methodname>handle()</methodname>
method needs to be implemented. Instead the methods to be called can be placed directly in the class using
regular method definitions (see example below). The methods will get invoked using reflection.
</para>
<para>
To invoke remote method calls (unicast and multicast) the following methods are used:
</para>
<programlisting language="Java">
public <T> RspList<T>
callRemoteMethods(Collection<Address> dests,
String method_name,
Object[] args,
Class[] types,
RequestOptions options) throws Exception;
public <T> RspList<T>
callRemoteMethods(Collection<Address> dests,
MethodCall method_call,
RequestOptions options) throws Exception;
public <T> NotifyingFuture<RspList<T>>
callRemoteMethodsWithFuture(Collection<Address> dests,
MethodCall method_call,
RequestOptions options) throws Exception;
public <T> T callRemoteMethod(Address dest,
String method_name,
Object[] args,
Class[] types,
RequestOptions options) throws Exception;
public <T> T callRemoteMethod(Address dest,
MethodCall call,
RequestOptions options) throws Exception;
public <T> NotifyingFuture<T>
callRemoteMethodWithFuture(Address dest,
MethodCall call,
RequestOptions options) throws Exception;
</programlisting>
<para>
The family of <methodname>callRemoteMethods()</methodname> methods is invoked with a list of receiver
addresses. If the list is null, the method will be invoked in all cluster members (including the sender).
Each call takes the target members, a method and a RequestOptions instance.
</para>
<para>
The method can be given as (1) the method name, (2) the arguments and (3) the argument types, or a
<classname>MethodCall</classname> (containing a java.lang.reflect.Method and arguments) can be given instead.
</para>
<para>
As with <classname>MessageDispatcher</classname>, a RspList or a future to a RspList is returned.
</para>
<para>
The family of <methodname>callRemoteMethod()</methodname> methods takes almost the same parameters, except
that there is only one destination address instead of a list. If the <parameter>dest</parameter>
argument is null, the call will fail.
</para>
<para>
The <methodname>callRemoteMethod()</methodname> calls return the actual result (of type T), or throw an
exception if the method threw an exception on the target member.
</para>
<para>
Java's Reflection API is used to find the correct method in the target member according to the method name and
number and types of supplied arguments. There is a runtime exception if a method cannot be resolved.
</para>
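<para>
To make the lookup step more concrete, the following self-contained sketch resolves a method by name and
argument types with plain java.lang.reflect and then invokes it. It only illustrates the reflection
mechanism; the class <classname>ReflectionLookupSketch</classname> and its helper
<methodname>invoke()</methodname> are hypothetical and do not show how RpcDispatcher is implemented internally.
</para>
<programlisting language="Java">
import java.lang.reflect.Method;

class ReflectionLookupSketch {
    // Server-side object exposing a method which can be looked up by name
    public static class Server {
        public int print(int number) {
            return number * 2;
        }
    }

    // Resolves the method by name and argument types, then invokes it on the target
    static Object invoke(Object target, String name, Object[] args, Class<?>[] types) {
        try {
            Method method=target.getClass().getMethod(name, types);
            return method.invoke(target, args);
        }
        catch(ReflectiveOperationException e) {
            // e.g. NoSuchMethodException if name and types don't match any public method
            throw new IllegalArgumentException("cannot invoke " + name, e);
        }
    }

    public static void main(String[] args) {
        Object result=invoke(new Server(), "print", new Object[]{21}, new Class<?>[]{int.class});
        System.out.println("print(21) returned " + result);
    }
}
</programlisting>
<para>
Note that the argument types have to match the declared parameter types: looking up "print" with
<classname>Integer.class</classname> instead of <classname>int.class</classname> would fail in this sketch.
</para>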
<para>
Note that we could also use method IDs and the <classname>MethodLookup</classname> interface to resolve
methods, which is faster and has every RPC carry less data across the wire. To see how this is done,
have a look at some of the MethodLookup implementations, e.g. in RpcDispatcherSpeedTest.
</para>
<section id="RpcDispatcherExample">
<title>Example</title>
<para>The code below shows an example of using RpcDispatcher:</para>
<programlisting language="Java">
import org.jgroups.JChannel;
import org.jgroups.blocks.*;
import org.jgroups.util.*;

public class RpcDispatcherTest {
    JChannel      channel;
    RpcDispatcher disp;
    RspList       rsp_list;
    String        props; // set by application

    // invoked remotely (via reflection) by RpcDispatcher
    public static int print(int number) throws Exception {
        return number * 2;
    }

    public void start() throws Exception {
        MethodCall call=new MethodCall(getClass().getMethod("print", int.class));
        RequestOptions opts=new RequestOptions(ResponseMode.GET_ALL, 5000);
        channel=new JChannel(props);
        disp=new RpcDispatcher(channel, this);
        channel.connect("RpcDispatcherTestGroup");

        for(int i=0; i < 10; i++) {
            Util.sleep(100);
            rsp_list=disp.callRemoteMethods(null,
                                            "print",
                                            new Object[]{i},
                                            new Class[]{int.class},
                                            opts);
            // Alternative: use a (prefabricated) MethodCall:
            // call.setArgs(i);
            // rsp_list=disp.callRemoteMethods(null, call, opts);
            System.out.println("Responses: " + rsp_list);
        }
        channel.close();
        disp.stop();
    }

    public static void main(String[] args) throws Exception {
        new RpcDispatcherTest().start();
    }
}
</programlisting>
<para>
Class <classname>RpcDispatcherTest</classname> defines method <methodname>print()</methodname> which will be
called subsequently. The entry point <methodname>start()</methodname> creates a channel and an
<classname>RpcDispatcher</classname> which is layered on top. Method
<methodname>callRemoteMethods()</methodname> then invokes the remote <methodname>print()</methodname>
in all cluster members (also in the caller). When all responses have been received, the call returns
and the responses are printed.
</para>
<para>
As can be seen, the <classname>RpcDispatcher</classname> building block reduces the amount of code that
needs to be written to implement RPC-based group communication applications by providing a higher
abstraction level between the application and the primitive channels.
</para>
<section id="NotifyingFuture"><title>Asynchronous calls with futures</title>
<para>
When invoking a synchronous call, the calling thread is blocked until the response (or responses) has
been received.
</para>
<para>
A <emphasis>Future</emphasis> allows a caller to return immediately and grab the result(s) later. In
2.9, two new methods, which return futures, have been added to RpcDispatcher:
</para>
<programlisting language="Java">
public <T> NotifyingFuture<RspList<T>>
       callRemoteMethodsWithFuture(Collection<Address> dests,
                                   MethodCall method_call,
                                   RequestOptions options) throws Exception;
public <T> NotifyingFuture<T>
       callRemoteMethodWithFuture(Address dest,
                                  MethodCall call,
                                  RequestOptions options) throws Exception;
</programlisting>
<para>
A NotifyingFuture extends java.util.concurrent.Future, with its regular methods such as isDone(),
get() and cancel(). NotifyingFuture adds setListener(FutureListener) to get notified when
the result is available. This is shown in the following code:
</para>
<programlisting language="Java">
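NotifyingFuture<RspList> future=dispatcher.callRemoteMethodsWithFuture(...);
future.setListener(new FutureListener<RspList>() {
    public void futureDone(Future<RspList> future) {
        try {
            // the future is done, so get() returns without blocking
            System.out.println("result is " + future.get());
        }
        catch(Exception e) {
            System.err.println("call failed: " + e);
        }
    }
});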
</programlisting>
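<para>
For readers who want to experiment with the callback style without a running cluster, the sketch below
shows the same pattern with the JDK's <classname>java.util.concurrent.CompletableFuture</classname>. This
is only an analogy: <classname>CompletableFuture</classname> is not part of the JGroups API, and the class
<classname>FutureCallbackSketch</classname> is a hypothetical name chosen for this illustration.
</para>
<programlisting language="Java">
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

class FutureCallbackSketch {
    // Starts a computation asynchronously and registers a callback which is invoked
    // with the result; the caller gets a future back immediately.
    static CompletableFuture<Void> doubleLater(int number, AtomicReference<String> log) {
        return CompletableFuture.supplyAsync(() -> number * 2)
                                .thenAccept(result -> log.set("result is " + result));
    }

    public static void main(String[] args) {
        AtomicReference<String> log=new AtomicReference<>();
        CompletableFuture<Void> future=doubleLater(21, log);
        future.join(); // wait until the computation and the callback have completed
        System.out.println(log.get());
    }
}
</programlisting>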
</section>
</section>
<section id="RspFilter">
<title>Response filters</title>
<para>
Response filters allow application code to hook into the reception of responses from cluster members and
can let the request-response execution and correlation code know (1) whether a response is acceptable and
(2) whether more responses are needed, or whether the call (if blocking) can return. The
<classname>RspFilter</classname> interface looks as follows:
</para>
<programlisting language="Java">
public interface RspFilter {
boolean isAcceptable(Object response, Address sender);
boolean needMoreResponses();
}
</programlisting>
<para>
<methodname>isAcceptable()</methodname> is given a response value and the address of the member which sent
the response, and needs to decide whether the response is valid (should return true) or not
(should return false).
</para>
<para>
<methodname>needMoreResponses()</methodname> determines whether the call has to wait for more responses
(true) or can return (false).
</para>
<para>
The sample code below shows how to use a RspFilter:
</para>
<programlisting language="Java">
public void testResponseFilter() throws Exception {
    final long timeout=10 * 1000;
    RequestOptions opts;
    opts=new RequestOptions(ResponseMode.GET_ALL,
                            timeout, false,
                            new RspFilter() {
                                int num=0;
                                public boolean isAcceptable(Object response,
                                                            Address sender) {
                                    boolean retval=((Integer)response).intValue() > 1;
                                    if(retval)
                                        num++;
                                    return retval;
                                }
                                public boolean needMoreResponses() {
                                    return num < 2;
                                }
                            });
    RspList rsps=disp1.callRemoteMethods(null, "foo", null, null, opts);
    System.out.println("responses are:\n" + rsps);
    assert rsps.size() == 3;
    assert rsps.numReceived() == 2;
}
</programlisting>
<para>
Here, we invoke a cluster wide RPC (dests=null), which blocks (mode=GET_ALL) for 10 seconds max
(timeout=10000), but also passes an instance of RspFilter to the call (in options).
</para>
<para>
The filter accepts all responses whose value is greater than 1, and returns as soon as it has received
2 responses which satisfy the above condition.
</para>
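<para>
The interplay of the two filter methods can be tried out without a cluster. The sketch below uses a
simplified local copy of the filter contract (<classname>SimpleRspFilter</classname>, a hypothetical
interface without the sender argument) and a loop standing in for the arrival of responses; it applies
the same acceptance rule as the listing above and is not the JGroups correlation code.
</para>
<programlisting language="Java">
import java.util.Arrays;
import java.util.List;

class RspFilterSketch {
    // Simplified local copy of the two-method contract (sender omitted)
    interface SimpleRspFilter {
        boolean isAcceptable(Object response);
        boolean needMoreResponses();
    }

    // Feeds responses to the filter in arrival order and returns how many were
    // consumed before the filter signalled that the call may return
    static int responsesConsumed(List<Integer> responses, SimpleRspFilter filter) {
        int consumed=0;
        for(Integer response: responses) {
            consumed++;
            filter.isAcceptable(response);
            if(!filter.needMoreResponses())
                break;
        }
        return consumed;
    }

    // Accepts values greater than 1 and is satisfied after 2 accepted responses
    static SimpleRspFilter twoValuesGreaterThanOne() {
        return new SimpleRspFilter() {
            int num=0;
            public boolean isAcceptable(Object response) {
                boolean retval=((Integer)response).intValue() > 1;
                if(retval)
                    num++;
                return retval;
            }
            public boolean needMoreResponses() {
                return num < 2;
            }
        };
    }

    public static void main(String[] args) {
        // 1 is rejected, 2 and 3 are accepted: the call can return after 3 responses
        System.out.println(responsesConsumed(Arrays.asList(1, 2, 3, 4), twoValuesGreaterThanOne()));
        // nothing is accepted: the filter never terminates the call on its own
        System.out.println(responsesConsumed(Arrays.asList(0, 1, 1, 0, 1), twoValuesGreaterThanOne()));
    }
}
</programlisting>
<para>
In the second case the loop only ends because the list of responses is exhausted; the filter itself would
keep asking for more responses.
</para>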
<warning>
<title>Be careful with RspFilters</title>
<para>
If we have a RspFilter which doesn't terminate the call even if responses from all members have
been received, we might block forever (if no timeout was given)! For example, if we have 10 members,
and every member returns 0 or 1 as return value of foo() in the above code, then
<methodname>isAcceptable()</methodname> would always return false, therefore never incrementing 'num',
and <methodname>needMoreResponses()</methodname> would always return true; this would never terminate
the call if it wasn't for the timeout of 10 seconds!
</para>
<para>
This will be fixed in 3.1; a blocking call will always return if we've received as many responses as
we have members in 'dests', regardless of what the RspFilter says.
</para>
</warning>
</section>
</section>
<section id="ReplicatedHashMap">
<title>ReplicatedHashMap</title>
<para>
This class was written as a demo of how state can be shared between nodes of a cluster. It has never been
heavily tested and is therefore not meant to be used in production.
</para>
<para>
A <classname>ReplicatedHashMap</classname> uses a concurrent hashmap internally and allows several
instances of the hashmap to be created in different processes. All of these instances have exactly the same state at all
times. When creating such an instance, a cluster name determines which cluster of replicated hashmaps will
be joined. The new instance will then query the state from existing members and update itself before
starting to service requests. If there are no existing members, it will simply start with an empty state.
</para>
<para>
Modifications such as <methodname>put()</methodname>, <methodname>clear()</methodname> or
<methodname>remove()</methodname> will be propagated in orderly fashion to all replicas. Read-only requests
such as <methodname>get()</methodname> will only be invoked on the local hashmap.
</para>
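<para>
The replication scheme described above (state transfer on join, modifications applied to all replicas,
reads served locally) can be sketched in a few lines. The class <classname>ReplicatedMapSketch</classname>
below is a hypothetical, single-process simulation written for this illustration; it does not use
JGroups and ignores ordering and failure handling.
</para>
<programlisting language="Java">
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ReplicatedMapSketch {
    // All replicas of one simulated cluster; in a real cluster they live in different processes
    private final List<Map<String,String>> replicas=new ArrayList<>();

    // A new replica first fetches the state from an existing member (if there is one)
    Map<String,String> join() {
        Map<String,String> replica=new ConcurrentHashMap<>();
        if(!replicas.isEmpty())
            replica.putAll(replicas.get(0));
        replicas.add(replica);
        return replica;
    }

    // Modifications are applied to every replica
    void put(String key, String value) {
        for(Map<String,String> replica: replicas)
            replica.put(key, value);
    }

    void remove(String key) {
        for(Map<String,String> replica: replicas)
            replica.remove(key);
    }

    public static void main(String[] args) {
        ReplicatedMapSketch cluster=new ReplicatedMapSketch();
        Map<String,String> a=cluster.join();
        cluster.put("name", "Bela");
        Map<String,String> b=cluster.join(); // receives the existing state
        cluster.put("city", "Kreuzlingen");
        // reads are local: each replica answers from its own map
        System.out.println(a.get("city") + " / " + b.get("name"));
    }
}
</programlisting>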
<para>
Since both keys and values of a hashmap will be sent across the network, they have to be
serializable. Putting a non-serializable value in the map will result in an exception at marshalling time.
</para>
<para>
A <classname>ReplicatedHashMap</classname> allows listeners to register for notifications, e.g. when data is
added or removed. All listeners will get notified when such an event occurs. Notification is always local;
for example in the case of removing an element, first the element is removed in all replicas, which then
notify their listener(s) of the removal (after the fact).
</para>
<para>
<classname>ReplicatedHashMap</classname> allows members in a group to share common state across process
and machine boundaries.
</para>
</section>
<section id="ReplCache">
<title>ReplCache</title>
<para>
ReplCache is a distributed cache which - contrary to ReplicatedHashMap - doesn't replicate its values to
all cluster members, but just to selected backups.
</para>
<para>
A <methodname>put(K,V,R)</methodname> method has a <emphasis>replication count R</emphasis> which determines
on how many cluster members key K and value V should be stored. When we have 10 cluster members, and R=3,
then K and V will be stored on 3 members. If one of those members goes down, or leaves the cluster, then a
different member will be told to store K and V. ReplCache tries to always have R cluster members store K
and V.
</para>
<para>
A replication count of -1 means that a given key and value should be stored on <emphasis>all</emphasis>
cluster members.
</para>
<para>
The mapping between a key K and the cluster member(s) on which K will be stored is always deterministic, and
is computed using a <emphasis>consistent hash function</emphasis>.
</para>
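<para>
As an illustration of such a deterministic mapping, the sketch below places members on a hash ring and
picks the R members that follow a key's position. The class <classname>ConsistentHashSketch</classname>
and its hash function are assumptions made for this example; the hash function actually used by ReplCache
may differ. The sketch also assumes that all members hash to distinct ring positions.
</para>
<programlisting language="Java">
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

class ConsistentHashSketch {
    // Ring position -> member, sorted by position
    private final TreeMap<Integer,String> ring=new TreeMap<>();

    ConsistentHashSketch(List<String> members) {
        for(String member: members)
            ring.put(hash(member), member);
    }

    // Any deterministic hash function would do for this sketch
    static int hash(Object obj) {
        int h=obj.hashCode();
        return h ^ (h >>> 16);
    }

    // Walks the ring from the key's position and picks the first r members;
    // r=-1 selects all members
    List<String> pickMembers(Object key, int r) {
        int count=r < 0? ring.size() : Math.min(r, ring.size());
        List<String> result=new ArrayList<>(count);
        for(String member: ring.tailMap(hash(key), true).values()) {
            if(result.size() == count)
                break;
            result.add(member);
        }
        for(String member: ring.values()) { // wrap around to the start of the ring
            if(result.size() == count)
                break;
            result.add(member);
        }
        return result;
    }

    public static void main(String[] args) {
        ConsistentHashSketch hash=new ConsistentHashSketch(Arrays.asList("A", "B", "C", "D", "E"));
        System.out.println("key1 with R=3 is stored on " + hash.pickMembers("key1", 3));
        System.out.println("key1 with R=-1 is stored on " + hash.pickMembers("key1", -1));
    }
}
</programlisting>
<para>
Because the result depends only on the key and the current membership, every member computes the same
set of owners for a given key without having to ask anyone.
</para>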
<para>
Note that this class was written as a demo of how state can be shared between nodes of a cluster. It has
never been heavily tested and is therefore not meant to be used in production.
</para>
</section>
<section id="LockService">
<title>Cluster wide locking</title>
<para>
In 2.12, a new distributed locking service was added, replacing DistributedLockManager. The new service is
implemented as a protocol and is used via org.jgroups.blocks.locking.LockService.
</para>
<para>
LockService talks to the locking protocol via events. The main abstraction of a distributed lock is an
implementation of java.util.concurrent.locks.Lock. All lock methods are supported; conditions, however,
are not fully supported and still need some more testing (as of July 2011).
</para>
<para>
Below is an example of how LockService is typically used:
</para>
<programlisting language="Java">
// locking.xml needs to have a locking protocol
JChannel ch=new JChannel("/home/bela/locking.xml");
LockService lock_service=new LockService(ch);
ch.connect("lock-cluster");
Lock lock=lock_service.getLock("mylock");
lock.lock();
try {
    // do something with the locked resource
}
finally {
    lock.unlock();
}
</programlisting>
<para>
In the example, we create a channel, then a LockService, then connect the channel. If the channel's
configuration doesn't include a locking protocol, an exception will be thrown.
Then we grab a lock named "mylock", which we lock and subsequently unlock. If another member P had already
acquired "mylock", we'd block until P released the lock, or P left the cluster or crashed.
</para>
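<para>
Since a distributed lock is used through the java.util.concurrent.locks.Lock interface, the usual JDK
idioms apply, for example acquiring a lock with a timeout rather than blocking indefinitely. The
following sketch uses a local <classname>ReentrantLock</classname> as a stand-in for the lock returned by
<methodname>getLock()</methodname>, so that it runs without a cluster; the helper
<methodname>runLocked()</methodname> is a hypothetical name introduced for this example.
</para>
<programlisting language="Java">
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class TimedLockSketch {
    // Runs the task while holding the lock; gives up if the lock cannot be
    // acquired within timeoutMs milliseconds. Returns true if the task ran.
    static boolean runLocked(Lock lock, long timeoutMs, Runnable task) {
        boolean acquired;
        try {
            acquired=lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch(InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        if(!acquired)
            return false;
        try {
            task.run();
            return true;
        }
        finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        Lock lock=new ReentrantLock(); // local stand-in for lock_service.getLock("mylock")
        boolean done=runLocked(lock, 500, () -> System.out.println("holding the lock"));
        System.out.println("task ran: " + done);
    }
}
</programlisting>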
<para>
Note that the owner of a lock is always a given thread in a cluster, so the owner is the JGroups address and
the thread ID. This means that different threads inside the same JVM trying to access the same named lock
will compete for it. If thread-22 grabs the lock first, then thread-5 will block until thread-22
releases the lock.
</para>
<para>
JGroups includes a demo (org.jgroups.demos.LockServiceDemo), which can be used to interactively experiment
with distributed locks. LockServiceDemo -h dumps all command line options.
</para>
<para>
Currently (Jan 2011), there are 2 protocols which provide locking:
<xref linkend="PEER_LOCK">PEER_LOCK</xref> and <xref linkend="CENTRAL_LOCK">CENTRAL_LOCK</xref>. The locking
protocol has to be placed at or towards the top of the stack (close to the channel).
</para>
<section id="LockingAndMerges">
<title>Locking and merges</title>
<para>
The following scenario is susceptible to network partitioning and subsequent merging: we have a cluster
view of {A,B,C,D} and then the cluster splits into {A,B} and {C,D}. Assume that B and D now acquire a
lock "mylock". This is what happens (with the locking protocol being CENTRAL_LOCK):
<itemizedlist>
<listitem>There are 2 coordinators: A for {A,B} and C for {C,D}</listitem>
<listitem>B successfully acquires "mylock" from A</listitem>
<listitem>D successfully acquires "mylock" from C</listitem>
<listitem>The partitions merge back into {A,B,C,D}. Now A is the only coordinator, and C ceases
to be a coordinator</listitem>
<listitem>Problem: D still holds a lock which should actually be invalid!</listitem>
</itemizedlist>
There is no easy way (via the Lock API) to 'remove' the lock from D. We could for example simply release
D's lock on "mylock", but then there's no way of telling D that the lock it holds is actually stale!
</para>
<para>
Therefore the recommended solution here is for nodes to listen to MergeView changes if they expect
merging to occur, and re-acquire all of their locks after a merge, e.g.:
</para>
<programlisting language="Java">
Lock l1, l2, l3;
LockService lock_service;
...
public void viewAccepted(View view) {
    if(view instanceof MergeView) {
        new Thread() {
            public void run() {
                lock_service.unlockAll();
                // stop all access to resources protected by l1, l2 or l3
                // every thread needs to re-acquire the locks it holds
            }
        }.start();
    }
}
</programlisting>
</section>
</section>
<section id="ExecutionService">
<title>Cluster wide task execution</title>
<para>
In 2.12, a distributed execution service was added. The new service is implemented as a protocol and is used
via org.jgroups.blocks.executor.ExecutionService.
</para>
<para>
<classname>ExecutionService</classname> extends java.util.concurrent.ExecutorService and distributes tasks
submitted to it across the cluster, trying to distribute the tasks to the cluster members as evenly as
possible. When a cluster member leaves or dies, the tasks it was processing are re-distributed to other
members in the cluster.
</para>
<para>
ExecutionService talks to the executing protocol via events. The main abstraction is an implementation of
java.util.concurrent.ExecutorService, and all of its methods are supported. There are two restrictions,
however: the Callable or Runnable must be Serializable, Externalizable or Streamable, and so must the result
produced by the future. If the Callable or Runnable is not, an IllegalArgumentException is thrown
immediately. If a result is not, a NotSerializableException with the name of the class is reported to the
Future as the exception cause.
</para>
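<para>
As a rough illustration of the serialization requirement, the sketch below defines a task that is both a
Callable and Serializable, and checks that it survives a Java serialization round trip. The names SumTask and
SerializationCheck are hypothetical; the round trip only stands in for sending the task to another node and
does not use JGroups at all.
</para>
<programlisting language="Java">
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

// Hypothetical task: sums the integers from 1 to n.
// Both the task and its result (a Long) are serializable.
class SumTask implements Callable<Long>, Serializable {
    private static final long serialVersionUID = 1L;
    private final int n;

    SumTask(int n) { this.n = n; }

    @Override
    public Long call() {
        long sum = 0;
        for (int i = 1; i <= n; i++)
            sum += i;
        return sum;
    }
}

class SerializationCheck {
    // Serializes an object and reads it back, as a stand-in for shipping it to another node
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
</programlisting>
<para>
A task holding a non-serializable field would fail in roundTrip() with a NotSerializableException, which is a
cheap way to catch the problem in a unit test before the task is submitted to a cluster.
</para>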
<para>
Below is an example of how ExecutionService is typically used:
</para>
<programlisting language="Java">
// executing.xml needs to have an executing protocol
JChannel ch=new JChannel("/home/bela/executing.xml");
ExecutionService exec_service=new ExecutionService(ch);
ch.connect("exec-cluster");
Future<Value> future=exec_service.submit(new MyCallable());
try {
    Value value=future.get();
    // Do something with value
}
catch (InterruptedException e) {
    e.printStackTrace();
}
catch (ExecutionException e) {
    e.getCause().printStackTrace();
}
</programlisting>
<para>
In the example, we create a channel, then an ExecutionService, then connect the channel. We then submit
our callable, which gives us a Future, and block on the future until the task has completed and returned our
value, which we can then use. If an exception occurs, we print its stack trace (or that of its cause, in the
case of an ExecutionException).
</para>
<para>
The <classname>ExecutionService</classname> follows the Producer-Consumer pattern very closely, with the
<classname>ExecutionService</classname> acting as the producer. The service therefore only hands tasks off
to be handled and plays no part in actually invoking them. A separate class,
<classname>ExecutionRunner</classname>, was written specifically as the consumer; it implements
java.lang.Runnable and can be run on any node of the cluster.
A user is required to run one or more instances of an <classname>ExecutionRunner</classname> on a node of
the cluster. By running one of these runners, a thread volunteers to run any task that is submitted to the
cluster via an <classname>ExecutionService</classname>. This allows any node in the cluster to participate,
or not participate, in the running of these tasks, and a node can run more than 1
<classname>ExecutionRunner</classname> if it has additional capacity to do so. A runner runs indefinitely,
until the thread that is running it is interrupted. If a task is running when the runner is interrupted,
the task is interrupted as well.
</para>
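<para>
The producer-consumer split described above can be sketched locally, without a cluster. In the hypothetical
code below, a java.util.concurrent.BlockingQueue plays the role of the cluster, LocalRunner plays the role of
an ExecutionRunner, and the code that puts tasks on the queue plays the role of an ExecutionService. None of
these names are JGroups classes; the sketch only shows the pattern, not how the protocol distributes tasks.
</para>
<programlisting language="Java">
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a runner: takes tasks from a shared queue until interrupted
class LocalRunner implements Runnable {
    private final BlockingQueue<Runnable> tasks;

    LocalRunner(BlockingQueue<Runnable> tasks) { this.tasks = tasks; }

    @Override
    public void run() {
        try {
            while (true)
                tasks.take().run(); // blocks until a producer submits a task
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and stop consuming
        }
    }
}

class LocalRunnerDemo {
    // Submits 'count' tasks, consumes them with 2 runner threads, returns how many tasks ran
    static int runTasks(int count) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        CountDownLatch done = new CountDownLatch(count);
        AtomicInteger executed = new AtomicInteger();

        Thread[] runners = { new Thread(new LocalRunner(queue)),
                             new Thread(new LocalRunner(queue)) };
        for (Thread t : runners)
            t.start();

        // Producer side: only hands tasks off, never runs them itself
        for (int i = 0; i < count; i++)
            queue.put(() -> { executed.incrementAndGet(); done.countDown(); });

        done.await();                 // wait until every task has been run
        for (Thread t : runners)
            t.interrupt();            // runners only stop when interrupted
        for (Thread t : runners)
            t.join();
        return executed.get();
    }
}
</programlisting>
<para>
As with the runners described above, adding more LocalRunner threads increases how many tasks can run at the
same time, and a runner stops only when its thread is interrupted.
</para>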
<para>
The example below shows how a single node can start up and allow up to 10 distributed tasks to be executed on it simultaneously:
</para>
<programlisting language="Java">
int runnerCount = 10;
// executing.xml needs to have an executing protocol
JChannel ch=new JChannel("/home/bela/executing.xml");
ch.connect("exec-cluster");
ExecutionRunner runner = new ExecutionRunner(ch);
ExecutorService service = Executors.newFixedThreadPool(runnerCount);
for (int i = 0; i < runnerCount; ++i) {
    // If you want to stop the runner hold onto the future
    // and cancel with interrupt.
    service.submit(runner);
}
</programlisting>
<para>
In the example, we create a channel, connect it, and then create an ExecutionRunner. We then create
a java.util.concurrent.ExecutorService which is used to start 10 threads, each of which runs the
ExecutionRunner. This gives the node 10 threads which actively accept and work on requests
submitted via any ExecutionService in the cluster.
</para>
<para>
Since an ExecutionService does not allow non-serializable class instances to be sent across as tasks,
2 utility classes are provided to get around this problem. For users who are used to using a
CompletionService with an Executor, there is an equivalent ExecutionCompletionService which provides
the same functionality. It would have been preferable to allow the same
ExecutorCompletionService to be used, but because its implementation uses a non-serializable object,
ExecutionCompletionService was implemented to be used instead, in conjunction with an ExecutorService.
The second utility class, Executions, helps users submit tasks which use a non-serializable class. Its
serializableCallable method takes a constructor of a class that implements Callable, plus the
constructor's arguments, and returns a Callable. When run, the returned Callable creates an object from
the constructor, passing it the provided arguments, then invokes call() on that object and returns its
result like a normal callable. All of the arguments must still be serializable, as must the returned
object, as detailed previously.
</para>
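<para>
To make the idea behind serializableCallable more concrete, here is a minimal sketch of how such a wrapper
could work. It is not the implementation in the Executions class: ReflectiveCallable and Greeter are
hypothetical names, and the sketch assumes that remembering the target class, the constructor's parameter
types and the arguments is enough to rebuild the object on the node that runs the task.
</para>
<programlisting language="Java">
import java.io.Serializable;
import java.lang.reflect.Constructor;
import java.util.concurrent.Callable;

// Hypothetical wrapper: serializable itself, creates the real Callable only when it is run
class ReflectiveCallable<V> implements Callable<V>, Serializable {
    private static final long serialVersionUID = 1L;
    private final Class<? extends Callable<V>> type; // class to instantiate
    private final Class<?>[] paramTypes;             // identifies the constructor
    private final Object[] args;                     // must themselves be serializable

    ReflectiveCallable(Constructor<? extends Callable<V>> ctor, Object... args) {
        this.type = ctor.getDeclaringClass();
        this.paramTypes = ctor.getParameterTypes();
        this.args = args;
    }

    @Override
    public V call() throws Exception {
        // Look the constructor up again, build the object and delegate to its call()
        Constructor<? extends Callable<V>> ctor = type.getDeclaredConstructor(paramTypes);
        return ctor.newInstance(args).call();
    }
}

// Hypothetical task class which is NOT Serializable
class Greeter implements Callable<String> {
    private final String name;

    Greeter(String name) { this.name = name; }

    @Override
    public String call() { return "hello " + name; }
}
</programlisting>
<para>
A caller would wrap the constructor and its arguments, e.g.
new ReflectiveCallable<String>(Greeter.class.getDeclaredConstructor(String.class), "world"), and submit the
wrapper instead of a Greeter instance. Only the wrapper, the String argument and the String result need to be
serializable.
</para>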
<para>
JGroups includes a demo (org.jgroups.demos.ExecutionServiceDemo), which can be used to interactively
experiment with a distributed sort algorithm and performance. This is for demonstration purposes and
performance should not be assumed to be better than local.
ExecutionServiceDemo -h dumps all command line options.
</para>
<para>
Currently (July 2011), there is 1 protocol which provides execution:
<xref linkend="CENTRAL_EXECUTOR">CENTRAL_EXECUTOR</xref>. The executing protocol has to be placed at or
towards the top of the stack (close to the channel).
</para>
</section>
<section id="CounterService">
<title>Cluster wide atomic counters</title>
<para>
Cluster wide counters provide named counters (similar to AtomicLong) which can be changed atomically. If two
nodes increment the same counter with initial value 10, one of them will see 11 as the result and the other 12.
</para>
<para>
To create a named counter, the following steps have to be taken:
<orderedlist>
<listitem>
Add protocol COUNTER to the top of the stack configuration
</listitem>
<listitem>
Create an instance of CounterService
</listitem>
<listitem>
Create a new or get an existing named counter
</listitem>
<listitem>
Use the counter: increment, decrement, get, set, compare-and-set etc
</listitem>
</orderedlist>
</para>
<para>
In the first step, we add COUNTER to the top of the protocol stack configuration:
</para>
<programlisting language="Java">
<config>
    ...
    <MFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60K" />
    <COUNTER bypass_bundling="true" timeout="5000"/>
</config>
</programlisting>
<para>
Configuration of the COUNTER protocol is described in <xref linkend="COUNTER">COUNTER</xref>.
</para>
<para>
Next, we create a CounterService, which is used to create and delete named counters:
</para>
<programlisting language="Java">
ch=new JChannel(props);
CounterService counter_service=new CounterService(ch);
ch.connect("counter-cluster");
Counter counter=counter_service.getOrCreateCounter("mycounter", 1);
</programlisting>
<para>
In the sample code above, we create a channel first, then create the CounterService referencing the channel.
Then we connect the channel and finally create a new named counter "mycounter", with an initial value of 1.
If the counter already exists, the existing counter will be returned and the initial value will be ignored.
</para>
<para>
CounterService doesn't consume any messages from the channel over which it is created; instead it grabs
a reference to the COUNTER protocol and invokes methods on it directly. This has the advantage that
CounterService is non-intrusive: many instances can be created over the same channel. CounterService even
co-exists with other services which use the same mechanism, e.g. LockService or ExecutionService (see above).
</para>
<para>
The returned counter instance implements interface Counter:
</para>
<programlisting language="Java">
package org.jgroups.blocks.atomic;

public interface Counter {

    public String getName();

    /**
     * Gets the current value of the counter
     * @return The current value
     */
    public long get();

    /**
     * Sets the counter to a new value
     * @param new_value The new value
     */
    public void set(long new_value);

    /**
     * Atomically updates the counter using a CAS operation
     *
     * @param expect The expected value of the counter
     * @param update The new value of the counter
     * @return True if the counter could be updated, false otherwise
     */
    public boolean compareAndSet(long expect, long update);

    /**
     * Atomically increments the counter and returns the new value
     * @return The new value
     */
    public long incrementAndGet();

    /**
     * Atomically decrements the counter and returns the new value
     * @return The new value
     */
    public long decrementAndGet();

    /**
     * Atomically adds the given value to the current value.
     *
     * @param delta the value to add
     * @return the updated value
     */
    public long addAndGet(long delta);
}
</programlisting>
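<para>
Operations which are not offered directly by the interface can usually be built from get() and
compareAndSet() with a retry loop. The sketch below raises a counter to a given value unless it is already
larger. To keep it runnable on its own, it declares a cut-down local interface (SimpleCounter) with the same
two method signatures as Counter, and an in-memory LocalCounter backed by an AtomicLong; both are
hypothetical stand-ins, as are CounterOps and raiseTo.
</para>
<programlisting language="Java">
import java.util.concurrent.atomic.AtomicLong;

// Cut-down local copy of the two Counter methods used here, so the sketch compiles on its own
interface SimpleCounter {
    long get();
    boolean compareAndSet(long expect, long update);
}

// In-memory stand-in for a cluster wide counter
class LocalCounter implements SimpleCounter {
    private final AtomicLong value;

    LocalCounter(long initial) { value = new AtomicLong(initial); }

    public long get() { return value.get(); }

    public boolean compareAndSet(long expect, long update) {
        return value.compareAndSet(expect, update);
    }
}

class CounterOps {
    // Raises the counter to 'candidate' unless it is already at least that large;
    // returns the value the counter had when the operation completed
    static long raiseTo(SimpleCounter counter, long candidate) {
        while (true) {
            long current = counter.get();
            if (current >= candidate)
                return current;
            if (counter.compareAndSet(current, candidate))
                return candidate;
            // somebody else changed the counter in between: re-read and retry
        }
    }
}
</programlisting>
<para>
The same loop should work unchanged against a real Counter, since only get() and compareAndSet() are used;
each iteration would then cost a request to the cluster.
</para>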
<section id="CounterServiceDesign">
<title>Design</title>
<para>
The design of COUNTER is described in detail in
<ulink url="https://github.com/belaban/JGroups/blob/master/doc/design/CounterService.txt">CounterService</ulink>.
</para>
<para>
In a nutshell, in a cluster the current coordinator maintains a hashmap of named counters. Members send
requests (increment, decrement etc) to it, and the coordinator atomically applies the requests and
sends back responses.
</para>
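<para>
The coordinator's role can be pictured roughly as follows. This is a simplified, hypothetical sketch
(CounterTable is not a JGroups class) which ignores messaging, versioning and coordinator failover, and only
shows a table of named counters to which requests are applied atomically.
</para>
<programlisting language="Java">
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical coordinator-side table of named counters
class CounterTable {
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Returns the current value; the initial value is only used if the counter does not exist yet
    long getOrCreate(String name, long initialValue) {
        return counters.computeIfAbsent(name, n -> new AtomicLong(initialValue)).get();
    }

    // Applies an increment request and returns the new value, which would be sent back
    // to the requester. Assumption of this sketch: unknown counters start at 0.
    long incrementAndGet(String name) {
        return counters.computeIfAbsent(name, n -> new AtomicLong()).incrementAndGet();
    }
}
</programlisting>
<para>
Because every request is applied to a single table, two increment requests on a counter with value 10 yield
11 and 12, whichever members sent them.
</para>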
<para>
The advantage of this centralized approach is that - regardless of the size of a cluster - every
request has a constant execution cost, namely a network round trip.
</para>
<para>
A crash or departure of the coordinator is handled as follows. The coordinator maintains a version for
every counter value. Whenever the counter value is changed, the version is incremented. For every