// -*- Doc -*-
// vim: set syntax=asciidoc:
= UBF User's Guide _1st DRAFT_
:Date: 2011/08/29
:Revision: 0.3.2
:Copyright: 2002 Joe Armstrong
:Copyright: 2010-2011 Gemini Mobile Technologies, Inc. All rights reserved.
<<<
== Preface
UBF is a framework that permits Erlang to talk to the outside
world <<UBFPAPER>>. The acronym "UBF" stands for "Universal Binary
Format", which was designed and implemented by Joe Armstrong.
This document and the corresponding open-source code repositories
hosted on github <<UBF>> are based on Joe Armstrong's original UBF
site <<UBFSITE>> and UBF code with an MIT license file added to the
distribution. Since then, a large number of enhancements and
improvements have been added.
== Introduction
UBF is a language for transporting and describing complex data
structures across a network. It has three components:
- UBF(a) is a "language neutral" data transport format, roughly
equivalent to well-formed XML.
- UBF(b) is a programming language for describing types in UBF(a) and
protocols between clients and servers. This layer is typically
called the "protocol contract". UBF(b) is roughly equivalent to
Verified XML, XML-schemas, SOAP and WSDL.
- UBF(c) is a meta-level protocol used between a UBF client and a UBF
server.
While the XML series of languages had the goal of a human-readable
format, the UBF languages take the opposite view and provide a
"machine friendly" format. UBF is designed to be easy to implement.
.Programming By Contract
svgimage::images/ubf-flow-01["Programming By Contract"]
Central to UBF is the idea of a "Contract" which regulates the set of
legal conversations that can take place between a client and a server.
The client-side is depicted in "red" and the server-side is depicted
in "blue". The client and server communicate with each other via a
TCP/IP connection. All data sent by both the client and the server is
verified by the "Contract Manager" (an Erlang process on the "server"
side of the protocol). Any data that violates the contract is
rejected.
The UBF framework itself is designed to be easy to extend for
supporting other data transport formats and other network transports.
For example, JSON, Thrift, and Erlang native binary serialization data
formats over TCP/IP and JSON-RPC over HTTP are supported alternatives
to the original UBF(a) implementation.
<<<
== Specifications
[[UBFa]]
=== UBF(a)
UBF(a) is a transport format. UBF(a) was designed to be easy to parse
and to be easy to write with a text editor. UBF(a) is based on a
byte-encoded virtual machine in which 26 byte codes are reserved. Instead of
allocating the byte codes from 0, the printable character codes are
used to make the format easy to read.
UBF(a) has four primitive types. When a primitive type is recognized,
it is pushed onto the "recognition stack" in our decoder. The
primitive types are Integer, String, Binary, and Atom. UBF(a) has two
types of "glue" for making compound objects. The compound types are
Tuple and List. Lastly, the operator '$' (i.e. "end of object")
signifies when objects are finished.
For example, the following UBF(a) object:
------
'person'>p # {p "Joe" 123} & {p 'fred' 3~abc~} & $
------
Represents the following UBF(b) term, a list that contains two
3-tuples:
------
[{'person', 'fred', <<"abc">>}, {'person', "Joe", 123}].
------
TIP: In UBF(a), white space and commas are both treated as
delimiters.
For this example, the recognition stack for parsing this UBF(a) object
would be as follows:
------
'person'>p # {p "Joe" 123} & {p 'fred', 3~abc~} & $
       ^   ^ ^^     ^   ^^ ^                  ^ ^ ^
       |   | ||     |   || |                  | | |
       1   2 ab     c   d3 4                  5 6 7

Time  Stack
1     'person'
2     []
2a    { ... incomplete
      []
2b    {'person' ... incomplete
      []
2c    {'person', "Joe", ... incomplete
      []
2d    {'person', "Joe", 123 ... incomplete
      []
3     {'person', "Joe", 123}
      []
4     [{'person', "Joe", 123}]
5     {'person', 'fred', <<"abc">>}
      [{'person', "Joe", 123}]
6     [{'person', 'fred', <<"abc">>}, {'person', "Joe", 123}]
7     [{'person', 'fred', <<"abc">>}, {'person', "Joe", 123}]
------
See <<ABNF-UBFa>> for a formal definition of the UBF(a) syntax.
CAUTION: There is no "Float" primitive type in the original and
current UBF(a) implementation. After Joe Armstrong's original
implementation, a "Float" type was added to UBF(b) for use in network
transports other than UBF(a). In future, UBF(a) could be
enhanced to support a "Float" primitive type.
==== Integer: [-][0-9]+
Integers are sequences of bytes which could be described by the
regular expression [-][0-9]+, that is an optional minus (to denote a
negative integer) and then a sequence of at least one digit.
==== String: "..."
Strings are written enclosed in double quotes. Within a string, two
quoting conventions are observed: " must be written \" and \ must be
written \\. No other quotings are allowed.
==== Binary: [0-9]+ \~...~
Uninterpreted blocks of binary data are encoded as follows. First, an
integer representing the length of the binary data is encoded. This is
followed by a tilde, the data itself (which must be exactly the length
given by the integer), and then a closing tilde. The closing tilde has
no significance and is retained for readability. White space can be
added between the integer length and the data for readability.
==== Atom: \'...'
Atoms are encoded as strings, only using a single quote instead of a
double quote. Atoms are commonly found in symbolic languages like
Lisp, Prolog or Erlang. In C, they would be represented by hashed
strings. The essential property of an atom is that two atoms can be
compared for equality in constant time. These are used for
representing symbolic constants.
==== Tuple: { Obj1 Obj2 ... ObjN-1 ObjN }
Tuples represent _fixed numbers_ of objects. The byte codes for "{"
and "}" are used to delimit a tuple. Obj1, Obj2, ObjN-1, and ObjN are
arbitrary UBF(a) objects.
==== List: # ObjN & ObjN-1 & ... & Obj2 & Obj1
Lists represent _variable numbers_ of objects. The first object in
the list is Obj1, the second object in the list is Obj2, etc. Objects
are presented in reverse order.
Lisp programmers will recognize '#' as an operator that pushes NIL (or
the end of list) onto the recognition stack and '&' as an operator
that takes the top two items on the recognition stack and replaces
them by a list cell.
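As a rough illustration of the encodings described so far, the following Python sketch writes terms as UBF(a) text. It is not part of the UBF distribution: the names `Atom`, `encode`, and `encode_object` are hypothetical, Python `str`, `bytes`, `tuple`, and `list` stand in for String, Binary, Tuple, and List, and strings are assumed to hold one byte per character (Latin-1).

```python
class Atom(str):
    """Marks a Python string that should be written as a UBF(a) atom."""


def _quote(text, quote):
    # Only the backslash and the enclosing quote character are escaped.
    escaped = text.replace("\\", "\\\\").replace(quote, "\\" + quote)
    return quote + escaped + quote


def encode(term):
    """Return the UBF(a) bytes for one term, without the closing '$'."""
    if isinstance(term, bool):
        raise TypeError("booleans have no UBF(a) primitive type")
    if isinstance(term, int):
        return str(term).encode("ascii")
    if isinstance(term, Atom):  # must be tested before plain str
        return _quote(term, "'").encode("latin-1")
    if isinstance(term, str):
        return _quote(term, '"').encode("latin-1")
    if isinstance(term, (bytes, bytearray)):
        return str(len(term)).encode("ascii") + b"~" + bytes(term) + b"~"
    if isinstance(term, tuple):
        return b"{" + b" ".join(encode(t) for t in term) + b"}"
    if isinstance(term, list):
        # '#' pushes the empty list and each '&' adds the object on top of
        # the stack to the front of it, so the last element is written first.
        return b"#" + b"".join(b" " + encode(t) + b" &" for t in reversed(term))
    raise TypeError(f"cannot encode {type(term).__name__}")


def encode_object(term):
    """Encode a complete object, including the end-of-object operator."""
    return encode(term) + b" $"
```

Under these assumptions, encoding the list of two "person" tuples from the earlier example writes the "Joe" tuple first and the 'fred' tuple second, which is the reverse order described above.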
==== Term
Terms represent primitive types and compound types.
==== White space: \s \n \r \t , %...%
For convenience, blank, carriage return, line feed, tab, comma, and
comments are treated as white space. Comments can be included in
UBF(a) with the syntax %...% and the usual quoting convention applies.
==== Tag: \`...`
In addition, any item can be followed by a semantic tag, which is
written \`...`. Within the tag, the close quote is quoted as in the
strings encoding. This tag has no meaning in UBF(a) but might have a
meaning in UBF(b). For example:
------
12456 ~...~ `jpg`
------
Represents 12,456 bytes of raw data with the semantic tag "jpg".
UBF(a) does not know what "jpg" means. The tag is passed on to UBF(b),
which might know what it means. Finally, the end application is
expected to know what to do with an object of type "jpg"; it might,
for example, know that this represents an image. UBF(a) will just
encode the tag, UBF(b) will type check the tag, and the application
should be able to understand the tag.
CAUTION: Currently, this feature of integrating a "tag" in UBF(a) for
the purpose of a "type" in UBF(b) is not implemented. Tags can be
specified in UBF(a) but there is no way for the application to act
upon this semantic information.
==== Register: >C C
So far, exactly 26 control characters have been used, namely:
%"~'`{}#&\s\n\t\r,-01234567890
This leaves us with 230 unallocated byte codes. These are used as
follows:
------
>C
------
Where 'C' is not one of the reserved byte codes, > means store the top
of the recognition stack in the register 'C' and pop the recognition
stack. For caching optimizations, subsequent reuse of the single
character 'C' means push register 'C' onto the recognition stack.
==== Object
Objects represent either a Term, a Register push, or a Register pop
with an optional Tag. The operator '$' signifies "end of object".
When '$' is encountered there should be only one item on the
recognition stack.
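The recognition stack can be made concrete with a small decoder. The Python sketch below is an illustration under stated assumptions rather than the reference implementation: the names `Atom`, `decode`, and `RESERVED` are hypothetical, tags and comments are parsed and then discarded, a backslash is accepted in front of any byte, and truncated input simply raises an exception.

```python
class Atom(str):
    """A decoded UBF(a) atom, kept distinct from a UBF(a) string."""


# '$' and '>' are listed here because they are operators; that goes beyond
# the character list in the text and is an assumption of this sketch.
RESERVED = set(b"%\"~'`{}#&$> \n\r\t,-0123456789")
_MARK = object()  # pushed by '{' so that '}' can find the start of the tuple


def _quoted(data, i, quote):
    """Read up to the closing quote byte; return (text, index after it)."""
    out = bytearray()
    while data[i] != quote:
        if data[i] == ord("\\"):  # a backslash quotes the following byte
            i += 1
        out.append(data[i])
        i += 1
    return out.decode("latin-1"), i + 1


def decode(data):
    """Decode one '$'-terminated UBF(a) object from a bytes value."""
    stack, registers, i = [], {}, 0
    while True:
        c = data[i]
        if c in b" \n\r\t,":                      # white space
            i += 1
        elif c in b"%`":                          # comment or tag: skipped
            _, i = _quoted(data, i + 1, c)
        elif c == ord('"'):
            text, i = _quoted(data, i + 1, c)
            stack.append(text)
        elif c == ord("'"):
            text, i = _quoted(data, i + 1, c)
            stack.append(Atom(text))
        elif c in b"-0123456789":
            j = i + 1
            while data[j:j + 1].isdigit():
                j += 1
            stack.append(int(data[i:j]))
            i = j
        elif c == ord("~"):                       # length is on top of the stack
            n = stack.pop()
            if data[i + 1 + n:i + 2 + n] != b"~":
                raise ValueError("binary is not closed by '~'")
            stack.append(bytes(data[i + 1:i + 1 + n]))
            i += n + 2
        elif c == ord("{"):
            stack.append(_MARK)
            i += 1
        elif c == ord("}"):
            items = []
            while (top := stack.pop()) is not _MARK:
                items.append(top)
            stack.append(tuple(reversed(items)))
            i += 1
        elif c == ord("#"):
            stack.append([])
            i += 1
        elif c == ord("&"):                       # top object goes in front of the list below
            head, tail = stack.pop(), stack.pop()
            stack.append([head] + tail)
            i += 1
        elif c == ord(">"):                       # store top in a register and pop
            if data[i + 1] in RESERVED:
                raise ValueError("reserved byte used as register name")
            registers[data[i + 1]] = stack.pop()
            i += 2
        elif c == ord("$"):
            if len(stack) != 1:
                raise ValueError("expected exactly one item at '$'")
            return stack[0]
        else:                                     # any other byte pushes its register
            stack.append(registers[c])
            i += 1
```

Feeding this sketch the "person" example from the UBF(a) introduction walks through the same stack states as the time table shown there.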
=== UBF(b)
UBF(b) is a language independent type system and protocol description
language. The protocol description language allows one to specify
client server interaction in terms of a non-deterministic finite state
machine. The type system allows one to specify the asynchronous
events and synchronous request/response pairs that define transitions
of this finite state machine.
The type system and protocol description language together define the
basis of "Contracts" between clients and servers. All data sent by
both the client and the server is verified by the "Contract Manager"
(an Erlang process on the "server" side of the protocol). Any data
that violates the contract is rejected.
A UBF(b) contract is defined by 2 mandatory sections and 3 optional
sections. The mandatory sections are the "+NAME" and the "+VSN"
of the contract. The optional sections are the "+TYPES", the
"+STATE", and the "+ANYSTATE" of the contract.
For example, the following UBF(b) contract having the filename
"irc_plugin.con" defines a simple IRC (Internet Relay Chat) protocol
between clients and a server:
[source,erlang]
------
include::misc-codes/irc_plugin.con[]
------
See <<ABNF-UBFb>> for a formal definition of the UBF(b) syntax.
NOTE: The astute reader (and otherwise :) ) may notice that UBF(a) and
UBF(b) are Erlang-centric. By design, the two languages are supposed
to be language neutral and yet _by design_ the two are highly
influenced by Erlang. For example, the difference between a string
type and a binary type is directly due to Erlang's implementation of
binaries and strings. Similarly, the reason for supporting a record
type and extended record type is also directly due to Erlang's
implementation of records.
==== Name: +NAME("...").
The name of the contract is specified as a double-quoted string.
==== Version: +VSN("...").
The version of the contract is specified as a double-quoted string.
==== Types: +TYPES.
The UBF(b) type system has user-defined types and predefined types.
User-defined types and predefined types are either primitive types or
complex types.
The primitive types are Integer, Range, Float, Binary, String, Atom,
and Reference. The complex types are Alternative, Tuple, Record,
Extended Record, and List. User-defined "complex types" are defined
recursively.
===== Definition: X() = T
New types are defined by the notation:
------
X() = T;
------
and the last type definition in the section must end with a period:
------
X() = T.
------
The name of the type is 'X' and the type's definition 'T' is either a
user-defined type or a predefined type.
===== Integer: [-][0-9]+ _or_ [0-9]\+#[0-9a-f]+
Positive and negative integer constants are expressed as in UBF(a).
Integer constants may also be expressed in other bases using Erlang
syntax.
===== Range: [-][0-9]\+..[-][0-9]+ _or_ [-][0-9]\+.. _or_ ..[-][0-9]+
Bounded, left unbounded, and right unbounded integer ranges are
supported.
===== Float: [-][0-9]\+.[0-9]+
Positive and negative float constants are supported for network
transports other than UBF(a).
NOTE: In future, the implementation of UBF(b) could be enhanced to
specify a float more compactly using scientific notation
(e.g. "6.02e23").
===== Binary: \<<"...">>
Binary constants are expressed like strings in UBF(a) but with two
leading "less than" brackets and two trailing "greater than"
brackets.
===== String: "..."
String constants are expressed as in UBF(a).
===== Atom: \'...' _or_ [a-z][a-zA-Z0-9_]*
Atom constants are expressed as UBF(a) atoms. Atom constants starting
with lowercase letters do not require single quotes.
===== Reference: R()
Defined types are referenced by the notation:
------
R()
------
The name of the type is 'R'.
===== Alternative: T1 | T2
An object X is of type "T1 | T2" if X is of type T1 or if X is of type
T2.
===== Tuple: {T1, T2, ..., Tn}
An object {X1, X2, ..., Xn} is of type "{T1, T2, ..., Tn}" if X1 is of
type T1, X2 is of type T2, ... and Xn is of type Tn.
===== Record: name#{x=T1, y=T2, ..., z=Tn}
A record type is syntactic sugar for a tuple of type "{name, T1, T2,
..., Tn}" where name, x, y, ..., and z are atoms.
===== Extended Record: name##{x=T1, y=T2, ..., z=Tn}
An extended record type is syntactic sugar for a tuple of type "{name,
T1, T2, ..., Tn, '$fields'=[x,y,...,z], '$extra'=Extra}" where name,
x, y, ..., and z are atoms and Extra is any valid term.
===== List: [T]
An object [X1, X2, ..., Xn] is of type [T] if all of Xi are of type T.
===== Predefined: P() _or_ P(A1, A2, ..., An)
Predefined types are referenced by the notation:
------
P()
------
or by the notation:
------
P(A1, A2, ..., An)
------
The name of the predefined type is 'P'. Using the second notation,
attributes can be specified to make the predefined type less general
and thus more specific when matching objects.
|============
| | ascii | asciiprintable | nonempty | nonundefined
| integer | X | X | X | X
| float | X | X | X | X
| binary | O | O | O | X
| string | O | O | O | X
| atom | O | O | O | O
| tuple | X | X | O | X
| list | X | X | O | X
| proplist | X | X | O | X
| term | X | X | O | O
| void | X | X | X | X
|============
The above table summarizes the set of supported predefined types and
their respective optional attributes. An "O" marks an attribute that
can be applied to the type and an "X" marks one that cannot.
The "integer", "float", "binary", "string", "atom", "tuple", and
"list" predefined types match directly to the corresponding primitive
or complex type.
The "term" predefined type matches any object.
The "proplist" predefined type is a specialized version of the "list"
predefined type that matches the following types:
------
[{term(), term()}]
------
The "void" predefined type is a placeholder to describe the return
value of a function call that does not return to the caller.
The "ascii" attribute permits matches with binaries, strings, and
atoms containing only ASCII values <<RFC20>>. Similarly, the
"asciiprintable" attribute permits matches with only printable ASCII
values.
The "nonempty" attribute permits matches with binaries, strings,
atoms, tuples, lists, proplists, and terms that are of length greater
than zero. The following objects would not be matched with the
"nonempty" attribute:
------
<<"">>
""
''
{}
[]
------
The "nonundefined" attribute permits matches with atoms and terms that
are not equal to the 'undefined' atom.
NOTE: By convention, the 'undefined' atom is commonly used to indicate
a default value or an undefined value in Erlang programs. The purpose
of 'undefined' is similar to NULL in C, to None in Python, etc.
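The attribute table can be read as a lookup from predefined type to permitted attributes. The Python sketch below is one possible reading, not the contract checker itself: the names `predefined`, `BASE`, `ALLOWED`, and `ATTRS` are hypothetical, printable ASCII is assumed to mean codes 32 to 126, terms without a length are assumed to pass "nonempty", and "void" is left out because it never describes a value.

```python
class Atom(str):
    """Stands in for an Erlang atom; plain str stands in for a UBF string."""


BASE = {
    "integer": lambda x: type(x) is int,
    "float": lambda x: type(x) is float,
    "binary": lambda x: isinstance(x, bytes),
    "string": lambda x: type(x) is str,
    "atom": lambda x: isinstance(x, Atom),
    "tuple": lambda x: isinstance(x, tuple),
    "list": lambda x: isinstance(x, list),
    "proplist": lambda x: isinstance(x, list)
    and all(isinstance(p, tuple) and len(p) == 2 for p in x),
    "term": lambda x: True,
}

# One entry per "O" in the table above.
ALLOWED = {
    "integer": set(),
    "float": set(),
    "binary": {"ascii", "asciiprintable", "nonempty"},
    "string": {"ascii", "asciiprintable", "nonempty"},
    "atom": {"ascii", "asciiprintable", "nonempty", "nonundefined"},
    "tuple": {"nonempty"},
    "list": {"nonempty"},
    "proplist": {"nonempty"},
    "term": {"nonempty", "nonundefined"},
}


def _codes(x):
    """Character codes of a binary, string, or atom."""
    return list(x) if isinstance(x, bytes) else [ord(ch) for ch in x]


ATTRS = {
    "ascii": lambda x: all(c < 128 for c in _codes(x)),
    "asciiprintable": lambda x: all(32 <= c <= 126 for c in _codes(x)),
    # Assumption: objects without a length (e.g. integers) count as nonempty.
    "nonempty": lambda x: not hasattr(x, "__len__") or len(x) > 0,
    "nonundefined": lambda x: not (isinstance(x, Atom) and x == "undefined"),
}


def predefined(name, *attrs):
    """Build a predicate for P() or P(A1, ..., An)."""
    unsupported = set(attrs) - ALLOWED[name]
    if unsupported:
        raise ValueError(f"{name}() does not support {sorted(unsupported)}")
    return lambda x: BASE[name](x) and all(ATTRS[a](x) for a in attrs)
```

For example, `predefined("atom", "nonundefined")` rejects `Atom("undefined")`, and asking for `predefined("integer", "nonempty")` raises an error because the table marks that combination with an "X".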
==== State: +STATE.
The "+STATE" sections of UBF(b) define a finite state machine (FSM)
to model the interaction between the client and server. Symbolic
names expressed as "atoms" are the states of the FSM.
Transitions expressed as request, response, and next state triplets
are the edges of the FSM. Transitions are "synchronous" calls from
the client to the server. Any request sent by the client that cannot
match at least one valid transition is ignored and a "client broke
contract" error response is returned to the client. Likewise, any
response returned by the server that cannot match at least one valid
transition is ignored and a "server broke contract" error response is
returned to the client.
The states of the FSM may also be annotated with events expressed as
"asynchronous" casts. Events are asynchronous casts either from the
client to the server or from the server to the client. Please see
the next section for additional details.
NOTE: The terminology of "call" and "cast" to distinguish between
synchronous and asynchronous interaction is borrowed from Erlang.
==== Anystate: +ANYSTATE.
The "+ANYSTATE" section of UBF(b) is used to define request and
response pairs and to define events that are valid in _all_ states of
the FSM.
Events are checked based on direction first, on the current state's
valid events next, and finally on the valid anystate events. Any cast
sent by the client or sent by the server that cannot match at least
one valid event is ignored and dropped.
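The checking rules for calls and casts can be sketched in a few lines of Python. This is a simplified model and not the Contract Manager's code: the function names, the dictionary layout of `contract`, and the example states are all hypothetical, and the sketch assumes that an anystate call and a rejected call both leave the current state unchanged.

```python
def check_call(contract, state, request, handler):
    """Check one synchronous call; return (response, next_state)."""
    here = contract["states"].get(state, [])
    # An anystate pair is treated as a transition that keeps the current state.
    everywhere = [(req, resp, state) for req, resp in contract["anystate"]]
    candidates = [t for t in here + everywhere if t[0](request)]
    if not candidates:
        # The request is ignored: the handler is never called.
        return ("error", "client broke contract"), state
    response, next_state = handler(state, request)
    if any(resp(response) and target == next_state
           for _, resp, target in candidates):
        return response, next_state
    return ("error", "server broke contract"), state


def accept_event(contract, state, direction, event):
    """Return True if a cast may pass; otherwise the caller drops it."""
    per_state = contract["events"].get((state, direction), [])
    in_any_state = contract["anyevents"].get(direction, [])
    return any(ok(event) for ok in per_state + in_any_state)


# A made-up two-state contract: each transition is
# (request predicate, response predicate, next state).
contract = {
    "states": {
        "start": [(lambda r: r == ("logon", "joe"), lambda r: r == "ok", "active")],
        "active": [(lambda r: r == "logoff", lambda r: r == "bye", "start")],
    },
    "anystate": [(lambda r: r == "info", lambda r: isinstance(r, str))],
    "events": {("active", "out"): [lambda e: isinstance(e, tuple)]},
    "anyevents": {"in": [lambda e: e == "ping"]},
}
```

In this model a request with no matching transition never reaches the handler, and a cast that matches neither the current state's events nor the anystate events is silently dropped by the caller.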
=== UBF(c)
UBF(c) is a meta-level protocol used between a UBF client and a UBF
server. UBF(c) has two primitives: synchronous "calls" and
asynchronous "casts".
==== Calls: Request $ => {Response, NextState} $
Synchronous calls have the following form for the request:
------
Request $
------
and for the response:
------
{Response, NextState} $
------
where "Request" is a UBF(a) type sent by the client, and "Response" is
a UBF(a) type and "NextState" is a UBF(a) atom, both sent by the server.
If the client sends an invalid request, the server will respond with
the following "client broke contract" error:
------
{{'clientBrokeContract', Request, ExpectsIn}, State} $
------
where "ExpectsIn" is a UBF(a) type to describe the acceptable list of
input types and "State" is a UBF(a) atom.
If the server sends an invalid response, the response is replaced with
the following "server broke contract" error, which is returned to the client:
------
{{'serverBrokeContract', Response, ExpectsOut}, State} $
------
where "ExpectsOut" is a UBF(a) type to describe the acceptable list of
output types and "State" is a UBF(a) atom.
CAUTION: By convention, the 3-tuples {\'clientBrokeContract\', _, _}
and {\'serverBrokeContract', _, _} are reserved terms for responses.
Please be careful when designing your application not to use either of
these 3-tuples.
==== Casts: {\'event_in', Event} $ _or_ {\'event_out', Event} $
Asynchronous casts from the client to server have the following form:
------
{'event_in', Event} $
------
and from the server to the client have the following form:
------
{'event_out', Event} $
------
where "Event" is a UBF(a) type.
If the client or the server sends an invalid event, the event is
ignored and dropped by the server.
See <<ABNF-UBFc>> for a formal definition of the UBF(c) syntax.
CAUTION: By convention, the 2-tuples {\'event_in', _} and
{\'event_out', _} are reserved terms for requests and responses
respectively. Please be careful when designing your application not
to use either of these two tuples. This limitation, introduced
unintentionally after the original UBF implementation, may be removed
in the future.
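Because of these reserved tuples, a client can tell the kinds of incoming messages apart by shape alone. The Python sketch below shows one way to do so for messages that have already been decoded; the function name `classify` and its return values are hypothetical, and atoms are represented as plain Python strings for brevity.

```python
BROKE_CONTRACT = ("clientBrokeContract", "serverBrokeContract")


def classify(message):
    """Classify one decoded message received by a client."""
    if not (isinstance(message, tuple) and len(message) == 2):
        raise ValueError("not a UBF(c) message")
    head, tail = message
    if head == "event_out":
        # {'event_out', Event}: an asynchronous cast from the server.
        return ("event", tail)
    if isinstance(head, tuple) and len(head) == 3 and head[0] in BROKE_CONTRACT:
        # {{Error, Offending, Expected}, State}: a contract violation.
        error, offending, expected = head
        return ("contract_error", error, offending, expected, tail)
    # {Response, NextState}: the normal reply to a call.
    return ("reply", head, tail)
```

This also shows why an application response of the form `{'event_out', _}` would be ambiguous: the client could not distinguish it from a cast.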
<<<
== Contracts & Plugins
"Contracts" and "Plugins" are the basic building blocks of an Erlang
UBF server. Contracts are a server's specifications. Plugins are a
server's implementations.
=== Contract
A contract is a UBF(b) specification stored to a file. By convention,
a contract's filename has ".con" as the suffix part. Since all
sections of a UBF(b) specification are optional except for the "+NAME"
and "+VSN" sections, it is possible to have "+TYPES" only
contracts, "+STATE" only contracts, "+ANYSTATE" only contracts, or any
combination of such contracts.
For example, a "+TYPES" only contract having the filename
"irc_types_plugin.con" is as follows:
[source,erlang]
------
include::misc-codes/irc_types_plugin.con[]
------
For example, a "+STATE" and "+ANYSTATE" contract having the filename
"irc_fsm_plugin.con" is as follows:
[source,erlang]
------
include::misc-codes/irc_fsm_plugin.con[]
------
=== Plugin
A plugin is just a "normal" Erlang module that follows a few simple
rules. For a "+TYPES" only contract, the plugin contains just the
name of its contract. Otherwise, the plugin contains the name of
its contract plus the necessary Erlang "glue code" needed to bind the
UBF server to the server's application. In either case, a plugin can
also import all or a subset of "+TYPES" from other plugins. This
simple yet powerful import mechanism permits sharing and re-use of
types between plugins and servers.
NOTE: The necessary Erlang "glue code" is presented later in the
<<Servers>> section.
For the full example IRC contract described in a previous section, the
plugin having the filename "irc_plugin.erl" is as follows:
------
-module(irc_plugin).
-compile({parse_transform,contract_parser}).
-add_contract("irc_plugin").
------
The plugin for the "+TYPES" only contract having the filename
"irc_types_plugin.erl" is as follows:
------
-module(irc_types_plugin).
-compile({parse_transform,contract_parser}).
-add_contract("irc_types_plugin").
------
=== Importing Types
The plugin for the "+STATE" and "+ANYSTATE" contract having the
filename "irc_fsm_plugin.erl" is as follows:
------
-module(irc_fsm_plugin).
-compile({parse_transform,contract_parser}).
-add_types(irc_types_plugin).
-add_contract("irc_fsm_plugin").
------
The "-add_types(\'there\')" directive imports all "+TYPES" from the
plugin named \'there' into the containing plugin. An alternative
syntax "-add_types({\'elsewhere\', [\'t1\', \'t2\', ..., \'tn\']})."
for this directive imports a subset of "+TYPES" from the plugin named
\'elsewhere' into the containing plugin. Multiple import directives
of either syntax can be freely declared as long as the "-add_types"
directives are listed before the "-add_contract" directive. A plugin
can have only one "-add_contract" directive.
By using this Erlang "parse transform", the contract is parsed and the
imported types (if any) are processed during the compilation of the
plugin's Erlang module. The normal search path used by Erlang's
compiler to locate modules is used to import types from other plugins.
=== Compilation Errors
The plugin will fail to compile if the plugin's contract cannot be
found, cannot be parsed properly, or if one of the following errors
occurs:
{\'duplicated_records', L}::
One or more records having the same name are found.
{\'duplicated_states', L}::
One or more states having the same name are found.
{\'duplicated_types', L}::
One or more types having the same name are found.
{\'duplicated_unmatched_import_types', L}::
One or more imported types having the same name but different
definitions are found. _Type duplicates are permitted as long as
the type(s) are imported and all duplicates have the same
definition._
{\'missing_states', L}::
One or more states were found to be missing.
{\'missing_types', L}::
One or more types were found to be missing.
{\'unused_types', L}::
One or more types were found to be unused in the contract. _Unused
types are permitted as long as the unused type(s) are imported._
where L is an Erlang list.
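The rule for duplicated imported types can be illustrated with a short Python sketch. It models each plugin's types as a dictionary from type name to definition, which is an assumption made for this example only; the function name `merge_imports` is hypothetical and the real check happens inside the parse transform at compile time.

```python
def merge_imports(*imported):
    """Merge the type tables of several imported plugins.

    Each argument maps a type name to its definition. The same name may
    appear more than once as long as every definition is identical.
    """
    merged, clashes = {}, set()
    for types in imported:
        for name, definition in types.items():
            if name in merged and merged[name] != definition:
                clashes.add(name)
            merged.setdefault(name, definition)
    if clashes:
        raise ValueError(("duplicated_unmatched_import_types", sorted(clashes)))
    return merged
```

Importing the same definition of a type from two plugins is accepted, while two different definitions under one name are reported together in a single list, mirroring the shape of the error term above.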
=== Miscellaneous
As a by-product of a plugin's compilation and if one or more "record"
or "extended record" types were declared in a plugin's contract, an
Erlang "header" file containing the plugin's record definitions is
automatically created. This Erlang "header" file can be included by
the plugin module itself or by other Erlang modules used by the
server's application. By convention, this Erlang "header" file has
the same base filename as the plugin but with ".huc" as the suffix
part.
TIP: There are 2 experimental prototypes for extending UBF's type and
plugin framework. <<UBF_ABNF>> is a framework for integrating UBF and
ABNF specifications. <<UBF_EEP8>> is a framework for integrating UBF
and EEP8 types.
<<<
== Transports
The original "UBF" network transport is UBF(a) over TCP/IP. Since
then, a number of new transports *not* based on UBF(a) and not based
on TCP/IP have been added. Nevertheless, these transports are still
considered as part of the overall UBF framework. Most importantly,
applications can share and re-use the same UBF contracts and plugins
regardless of the network transport.
=== TCP/IP
==== UBF: Universal Binary Format
The name "UBF" is short for "Universal Binary Format". UBF is
commonly used to refer to the network transport based on UBF(a) and to
the overall UBF framework.
See <<UBFa>> for further information.
==== EBF: Erlang Binary Format
EBF is an implementation of UBF(b) but it does not use UBF(a) for the
client and server communication. Instead, Erlang-style conventions
are used:
- Structured terms are serialized via the Erlang BIFs term_to_binary()
and binary_to_term().
- Terms are framed using the 'gen_tcp' {packet, 4} format: a 32-bit
unsigned big-endian integer specifies the packet length.
+
------
+-------------------------+-------------------------------+
| Packet length (32 bits) | Packet data (variable length) |
+-------------------------+-------------------------------+
------
The name "EBF" is short for "Erlang Binary Format".
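The framing itself is simple enough to reproduce outside Erlang. The Python sketch below shows the {packet, 4} layout only: the payload is treated as opaque bytes (in EBF it would be the output of term_to_binary()), the function names `frame` and `unframe` are hypothetical, and the length header is written in big-endian byte order as documented for 'gen_tcp'.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 32-bit big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def unframe(buffer: bytes):
    """Split a receive buffer into complete packets and the unread rest."""
    packets = []
    while len(buffer) >= 4:
        (length,) = struct.unpack_from(">I", buffer)
        if len(buffer) < 4 + length:
            break  # the packet has not fully arrived yet
        packets.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return packets, buffer
```

A reader would typically call `unframe` after every socket read and keep the returned remainder for the next read, since TCP may deliver a packet in several pieces.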
==== JSF: JavaScript Format
JSF is an implementation of UBF(b) but it does not use UBF(a) for the
client and server communication. Instead, JSON <<RFC4627>> is used
as the wire format. The name "JSF" is short for "JavaScript
Format".
There is no generally agreed upon convention for converting Erlang
terms to JSON objects. JSF uses the convention set forth by
MochiWeb's JSON library <<MOCHIJSON2>>. In addition, there are a
couple of other conventions layered on top of MochiWeb's
implementation.
- The UBF(b) contract checker has been modified to make a distinction
between an Erlang record and an arbitrary Erlang tuple. An
experienced Erlang developer would view such a distinction either
with skepticism or with approval.
- For the skeptics, the contract author has the option of having the
UBF(b) contract compiler automatically generate Erlang -record()
definitions for appropriate tuples within the contract. Such record
definitions are very convenient for developers on the Erlang side of
the world, but they introduce more complication to the JavaScript
side of the world. For example, JavaScript does not have a concept
of an arbitrary atom, as Erlang does. Also, the JavaScript side
must make a distinction between {foo, 42} and {bar, 42} when #foo is
a record on the Erlang side but #bar is not.
This extra convention creates something slightly messy-looking, if you
look at the raw JSON passed back-and-forth. The examples of the
Erlang record {foo, 42} and the general tuple {bar, 42} would look
like this:
------
record (defined in the contract as "foo() = #foo{attribute1 = term()};")
{"$R":"foo", "attribute1":42}
general tuple
{"$T":[{"$A":"bar"}, 42]}
------
However, it requires very little JavaScript code to convert objects
with the "$R", "$T", and "$A" notation (for records, tuples, and
atoms) into whatever object is most convenient.
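The same conversion is equally short in other languages. As an illustration, the Python sketch below turns parsed JSF into native objects. The names `Atom` and `from_jsf` are hypothetical, the choice to represent a record as a pair of record name and field dictionary is just one convenient option, and JSON objects without a "$R", "$T", or "$A" key are assumed to pass through as ordinary dictionaries.

```python
import json


class Atom(str):
    """Stands in for an Erlang atom on the client side."""


def from_jsf(value):
    """Convert parsed JSF JSON into tuples, atoms, and records."""
    if isinstance(value, list):
        return [from_jsf(v) for v in value]
    if isinstance(value, dict):
        if set(value) == {"$A"}:
            return Atom(value["$A"])
        if set(value) == {"$T"}:
            return tuple(from_jsf(v) for v in value["$T"])
        if "$R" in value:
            fields = {k: from_jsf(v) for k, v in value.items() if k != "$R"}
            return (Atom(value["$R"]), fields)  # record name plus its fields
        return {k: from_jsf(v) for k, v in value.items()}
    return value


record = from_jsf(json.loads('{"$R":"foo", "attribute1":42}'))
general = from_jsf(json.loads('{"$T":[{"$A":"bar"}, 42]}'))
```

Here `record` keeps the field name "attribute1" that the plain tuple form would lose, and `general` becomes an ordinary two-element tuple whose first element is an atom.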
See <<UBF_JSONRPC>> for further information.
TIP: Gemini Mobile Technologies, Inc. has implemented and open-sourced
a module for classifying the input character set to detect non-UTF8
JSON inputs <<GMTCHARSET>>.
==== TBF / FTBF / NTBF / FNTBF: Binary Format - Thrift / Framed Thrift / Native Thrift / Framed Native Thrift
TBF and NTBF are implementations of UBF(b) but they do not use UBF(a)
for the client and server communication. Instead, Thrift <<THRIFT>>
is used as the wire format. The name "TBF" is short for
"Thrift Binary Format". The name "NTBF" is short for "Native Thrift
Binary Format". FTBF and FNTBF are framed versions of TBF and NTBF,
respectively.
TBF follows the conventions set forth by the Thrift community by
re-using Thrift's binary wire-protocol, with the following
exceptions:
- The names of Thrift messages are hard-coded to the Thrift name
"$UBF".
- The names of Thrift structs are not removed before being written to
the network.
- TBF does not use nor require a Thrift IDL.
- TBF by convention requires the client to read a "server hello"
message at the start of establishing a new TCP/IP connection.
TBF *can* encode and decode all UBF(b) objects. Synchronous calls are
implemented as Thrift 'T-CALL' and 'T-REPLY' message pairs.
Asynchronous casts are implemented as Thrift 'T-ONEWAY' messages.
CAUTION: TBF is not compatible with standard Thrift clients and
servers.
NTBF follows all of the conventions set forth by the Thrift community
by re-using Thrift's binary wire-protocol. A standard Thrift client
can communicate with a UBF "NTBF" server and a UBF "NTBF" client can
communicate with a standard Thrift server.
NTBF *cannot* encode and decode all UBF(b) objects. There is no
straightforward convention for converting Erlang terms to Thrift
messages. Synchronous calls are implemented as Thrift 'T-CALL' and
'T-REPLY' message pairs or 'T-CALL' and 'T-EXCEPTION' message pairs.
Asynchronous casts are implemented as Thrift 'T-ONEWAY' messages.
The NTBF transport is under active development to enhance, improve,
and simplify the integration of Thrift with the UBF framework. The
impedance mismatch between the two approaches of Thrift and UBF can
only be addressed by further development.
CAUTION: Currently, NTBF only implements the encoding and decoding of
Thrift's binary wire-protocol. Unlike standard Thrift clients and
servers, an NTBF client and server must "manually" implement the
features provided by the Thrift IDL.
See <<UBF_THRIFT>> for further information.
==== Miscellaneous
It is worthwhile to mention two new TCP/IP transports, PBF and ABF,
that are under investigation. The name "PBF" is short for "Google's
Protocol Buffers Format" <<PROTOBUF>>. The name "ABF" is short for
"Avro Binary Format" <<AVRO>>.
=== HTTP
==== JSON-RPC
JSON-RPC <<JSONRPC>> is a lightweight remote procedure call protocol
similar to XML-RPC. The UBF framework implementation of JSON-RPC
brings together JSF's encoder/decoder, UBF(b)'s contract checking, and
an HTTP transport.
.Programming By Contract w/ Multiple Transports
svgimage::images/ubf-flow-02["Programming By Contract w/ Multiple Transports"]
As previously stated, central to UBF is the idea of a "Contract" which
regulates the set of legal conversations that can take place between a
client and a server. The client-side is depicted in "red" and the
server-side is depicted in "blue". The client and server communicate
with each other via TCP/IP and/or HTTP.
Also central to UBF is the idea that contract(s) can be shared and
re-used by multiple transports. Any data that violates the _same_
contract(s) is rejected regardless of the transport.
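
The idea of one contract guarding several transports can be sketched in plain Erlang. This is only an illustration and not the UBF framework's actual contract checker: the module name, the `{add, A, B}` request, and the `tcp_frame` / `http_body` wrappers are all hypothetical stand-ins for real wire decoding.

```erlang
-module(shared_contract_sketch).
-export([handle/2, demo/0]).

%% Hypothetical contract: the only legal request is {add, integer(), integer()}.
check({add, A, B}) when is_integer(A), is_integer(B) -> ok;
check(_Other) -> {error, contract_violation}.

%% Hypothetical per-transport decoders. A real transport would parse bytes
%% from the wire; here each one only unwraps an already-decoded term.
decode(tcp, {tcp_frame, Term}) -> Term;
decode(http, {http_body, Term}) -> Term.

%% Every transport funnels its decoded request into the same contract check.
handle(Transport, Payload) ->
    Request = decode(Transport, Payload),
    case check(Request) of
        ok ->
            {add, A, B} = Request,
            {reply, A + B};
        {error, _Reason} = Error ->
            Error
    end.

%% The same legal request is accepted, and the same illegal request is
%% rejected, whichever transport delivered it.
demo() ->
    {reply, 3} = handle(tcp, {tcp_frame, {add, 1, 2}}),
    {reply, 3} = handle(http, {http_body, {add, 1, 2}}),
    {error, contract_violation} = handle(tcp, {tcp_frame, {add, 1, <<"2">>}}),
    {error, contract_violation} = handle(http, {http_body, {add, 1, <<"2">>}}),
    ok.
```

Because the check sits behind the decoders, adding another transport to this sketch would only require another `decode/2` clause.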
See <<UBF_JSONRPC>> for further information.
=== Miscellaneous
Several transports that do not require an explicit network socket have
been added to the UBF framework. These transports permit an
application to call a plugin directly without the need for TCP/IP or
HTTP.
==== ETF: Erlang Term Format
The concept "ETF" was added to the UBF framework. This transport
relies on Erlang's Native Distribution for synchronous calls and
asynchronous casts.
The name "ETF" is short for "Erlang Term Format".
==== LPC: Local Procedure Call
The concept "LPC" was added to the UBF framework. This transport is a
"non-transport" that invokes synchronous calls directly to a plugin.
Support for asynchronous casts has not been added (or designed) yet.
The name "LPC" is short for "Local Procedure Call".
NOTE: LPC is used to implement the JSON-RPC transport.
<<<
[[Servers]]
== Servers
The UBF framework provides two types of Erlang servers: "stateless"
and "stateful". The "stateful" server is Joe Armstrong's original UBF
server implementation. The "stateless" server is an extension of that
original implementation.
UBF servers are introspective - which means the servers can describe
themselves. The following commands (described in UBF(a) format) are
always available:
\'help' $::
Help information
\'info' $::
Short information about the current service
\'description' $::
Long information about the current service
\'services' $::
A list of available services
\'contract' $::
Return the service contract
{\'startSession', "Name", Args} $::
To start a new session for the Name service. Args are the initial
arguments for the Name service and are specific to that service.
{\'restartService', "Name", Args} $::
To restart the Name service. Args are the restart arguments for the
Name service and are specific to that service.
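
The introspection commands listed above might be exercised from an Erlang client roughly as follows. This is a minimal sketch under several assumptions: that `ubf_client:connect/4`, `ubf_client:rpc/2`, and `ubf_client:stop/1` behave as used here, that replies arrive as `{reply, Reply, State}` tuples, and that a UBF(b) string is represented on the Erlang side as `{'#S', String}`. The module name and the 5000 ms connect timeout are hypothetical.

```erlang
-module(ubf_introspect_sketch).
-export([session_request/1, describe/2]).

%% Build a startSession request for the named service with no arguments.
%% Assumption: UBF(b) strings are tagged as {'#S', String} in Erlang.
session_request(Name) when is_list(Name) ->
    {startSession, {'#S', Name}, []}.

%% Connect to a UBF server, ask it to describe itself, and disconnect.
%% Assumption: connect/4 returns {ok, Pid, Service} on success.
describe(Host, Port) ->
    {ok, Pid, _Service} = ubf_client:connect(Host, Port, [{proto, ubf}], 5000),
    try
        {reply, Info, _State1} = ubf_client:rpc(Pid, info),
        {reply, Services, _State2} = ubf_client:rpc(Pid, services),
        {Info, Services}
    after
        %% Always release the client process, even if an rpc fails.
        ubf_client:stop(Pid)
    end.
```

A session for one of the returned services could then be requested by passing the result of `session_request/1` to the same rpc call before the connection is stopped.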
The "ubf_server" Erlang module implements most of the commonly-used
server-side functions and provides several ways to start a server.
Configuration options for both types of servers are the same.
However, the plugin callback API is different.
------
-module(ubf_server).
-type name() :: atom().
-type plugins() :: [module()].
-type ipport() :: pos_integer().
-type options() :: [{atom(), term()}].
-spec start(plugins(), ipport()) -> true.
-spec start(name(), plugins(), ipport()) -> true.
-spec start(name(), plugins(), ipport(), options()) -> true.
-spec start_link(plugins(), ipport()) -> true.
-spec start_link(name(), plugins(), ipport()) -> true.
-spec start_link(name(), plugins(), ipport(), options()) -> true.
------
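
As a rough illustration of the specs above, a server could be started with start/4 along these lines. The server name `my_server`, the plugin module `my_plugin`, the port 3000, and the option values are all hypothetical; the option names themselves are taken from the options() list documented in this section.

```erlang
-module(ubf_start_sketch).
-export([options/0, start/0]).

%% Example options() proplist (the values are illustrative only).
options() ->
    [{idletimer, 60000},   %% close connections that stay idle for 60 seconds
     {maxconn, 1000},      %% allow at most 1000 simultaneous TCP connections
     {proto, ebf}].        %% use the EBF wire format instead of the default

%% Start a server registered as my_server that listens on TCP port 3000
%% and serves the (hypothetical) my_plugin contract implementation.
start() ->
    ubf_server:start(my_server, [my_plugin], 3000, options()).
```

Calling start_link/4 with the same arguments would be the natural choice when the server is started from a supervisor.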
The start/{2,3,4} and start_link/{2,3,4} functions start a registered
server and a TCP listener on ipport(), and register all of the
protocol implementation modules in the plugins() list. If name() is
undefined, the server is not registered. The supported options() are
as follows:
{\'idletimer', non_neg_integer() | \'infinity'}::
Maximum time (in milliseconds) that a client connection may remain
idle before the server will close the connection. Default:
\'infinity'
{\'maxconn', non_neg_integer()}::
Maximum number of simultaneous TCP connections allowed. Default:
10000.
{\'proto', {\'ubf' | \'ebf' | \'jsf' | \'tbf' | \'ftbf' | atom()}}::
Enable the UBF, EBF, JSF, TBF, FTBF, or an alternative protocol wire
format. Default: \'ubf'.
{\'proto', {\'ubf' | \'ebf' | \'jsf' | \'tbf' | \'ftbf' | atom(), proplist()}}::
Enable the UBF, EBF, JSF, TBF, FTBF, or an alternative protocol wire
format with options. Default: {\'ubf', []}. Supported options:
\'safe';;
Prevents decoding data that may be used to attack the Erlang
system. In the event of receiving unsafe data, decoding fails
with a badarg error.
{\'registeredname', name()}::
Set the name to be registered for the TCP listener. If
\'undefined', a default name is automatically registered. Default: