{ "!": "Meta", "Status": "pre-Alpha", "Revision": "!date 18 June 2010", "Authors": [ "Ingy döt Net" ] }
JSYNC (pronounced jay-sink; IPA: /ˈdʒeɪsɪŋk/) stands for JavaScript and YAML Notation Coding. It is a data serialization language based on the JSON data interchange format and the YAML serialization language. It takes the simplicity of JSON and adds just a few YAML concepts to become a complete serialization language.
YAML is a language that was first conceived in 2001. It started with the 3 basic data models of modern programming languages: mappings (aka objects/hashes/dictionaries), sequences (aka arrays/lists), and scalars (aka single values). It added a URL based type system and a simple reference notation. With just those simple primitives, YAML is able to serialize any computer data graph.
JSON was also started in 2001, albeit completely independently of YAML. It used a subset of the JavaScript data syntax to describe the same primitives as YAML: mappings, sequences and scalars. JSON is not a complete serialization language, nor was it intended to be. It is a data interchange format for easily communicating common data structures.
Both in syntax and data model, it turns out that JSON is a proper subset of YAML. In other words, any YAML loader can be used to properly load (decode) any valid JSON stream. In its syntax, YAML is far richer and more complex than JSON, but in terms of the data model, only a handful of things are missing from JSON that would give it the full power of YAML and make it a complete serialization language:
- JSON has no node type system, beyond Mapping, Array, String, Number, Boolean and Null.
- JSON has no node reference system.
- JSON only allows Strings as mapping keys. YAML allows any node.
- JSON only allows the encoding of top level mappings and sequences, not scalars.
- JSON only allows exactly 1 top level node. YAML allows 0 or more.
JSYNC is a format that is 100% JSON, but adds these missing concepts to become a complete serialization language.
YAML is widely used by dynamic languages like Ruby, Python, Perl and PHP, but it has suffered to some degree because the quality of implementations has varied so widely. The YAML specification is quite complex and thus difficult to implement.
JSON, on the other hand, has spread like wildfire through the above languages and dozens more. This is generally attributed to its simplicity and ease of implementation.
JSYNC hopes to leverage the power of both of these formats to provide a very simple and very interoperable serialization language.
This specification uses the (programming language agnostic) terminology set forth in the YAML specification. Please refer to it as a guide. One primary term that is used is "mapping", which is the same as "object" in JSON. In this spec, "mapping" is always used to refer to a collection of key/value pairs. "Object" is used in the Object Oriented sense, meaning an in-memory instance of a class.
For every possible JSYNC serialization, there is an equivalent YAML form. For this reason, JSYNC examples will often be shown next to their YAML equivalents. Therefore, knowledge of YAML is required to understand the full meaning of the examples.
To keep this specification reasonably simple, concepts that are defined in the YAML and JSON specifications are not fully respecified here.
JSYNC attempts to:
- Be a portable, language-independent data serialization language.
- Be a proper superset of JSON.
- Be a minimal extension of JSON.
- Offer all the serialization capability of YAML.
JSYNC does not attempt to:
- Be human friendly (readable/editable by everyone).
- Be forgiving of syntax or semantic errors.
- Add more YAML concepts than necessary.
This section gives a series of simple examples to demonstrate the capabilities of JSYNC.
The following mapping would be loaded into a given programming language environment as an instance of a Soldier class:
{ "!": "Soldier", "name": "Benjamin", "rank": "Private", "serial number": 123456789 }
Equivalent YAML:
--- !Soldier
name: Benjamin
rank: Private
serial number: 123456789
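One way a loader might turn the tagged mapping above into a class instance is sketched below. This is only an illustration under stated assumptions: the `Soldier` class, the `REGISTRY` table and the `construct` helper are hypothetical names, and the spec does not prescribe how tags are bound to classes.

```python
import json


class Soldier:
    """Hypothetical application class; not defined by the JSYNC spec."""


# Assumed tag-to-class table that the application hands to the loader.
REGISTRY = {"Soldier": Soldier}


def construct(mapping, registry):
    """Build an object from a decoded JSYNC mapping that may carry a "!" tag."""
    fields = dict(mapping)
    tag = fields.pop("!", None)
    if tag is None:
        return fields  # untagged mappings stay plain dicts
    cls = registry[tag]  # KeyError for tags the application does not know
    obj = cls.__new__(cls)  # bypass __init__; attributes come from the data
    obj.__dict__.update(fields)
    return obj


text = '{ "!": "Soldier", "name": "Benjamin", "rank": "Private", "serial number": 123456789 }'
soldier = construct(json.loads(text), REGISTRY)
```

Keys such as "serial number" are not valid Python identifiers, so in this sketch they are reachable only through `getattr(soldier, "serial number")`.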
He and she share the same car:
{ "His car": { "&": "001", "make": "Volvo", "vin": "918273645" }, "Her car": "*001" }
YAML:
His car: &001
  make: Volvo
  vin: "918273645"
Her car: *001
Looking into a mirror infinitely:
{ "&": "Mirror", "look": "*Mirror" }
YAML:
--- &Mirror
look: *Mirror
If you have streaming implementations on both ends of a JSYNC communication, you could send and receive/process a non-terminating JSON stream. Here is how to send a stream of multiple top level documents in JSYNC:
[
  {"%JSYNC":"1.0"}
  ,{ "!": "event", "coordinates": [10, 13] }
  ,{ "!": "event", "coordinates": [10, 15] }
  ,{ "!": "event", "coordinates": [10, 15.5] }
]
Note: The first mapping is special JSYNC meta information. See Directives below.
YAML:
--- !event
coordinates: [10, 13]
...
--- !event
coordinates: [10, 15]
...
--- !event
coordinates: [10, 15.5]
...
This section introduces the concepts that JSYNC adds to JSON: Tags, Anchors, Aliases, Complex Keys and Directives. It also discusses the top level node rules that are added from YAML.
These concepts are all fully described in the YAML Spec, so please refer to that for the complete details.
Tags are URLs that denote data types. A tag always begins with a "!" character.
A fully qualified tag URL looks like this:
!<tag:example.com,2010:Thing>
More often, tags are abbreviated to something that looks like one of these:
!example!Thing
!!Thing
!Thing
The tags are expanded by a JSYNC processor into their fully qualified forms, by %TAG directives (described below) or by configuring the processor directly in a program.
JSYNC uses YAML’s Anchor/Alias system to serialize multiple references to an identical node, including circular references. The first time such a node is serialized, it is marked with a unique string, preceded by a "&" character. This string is called an Anchor, and it looks like this:
&001
Subsequent serializations of the same node are identified by the same string preceded by a "*" character. This is called an Alias:
*001
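As a rough illustration of how a loader could rebuild shared and circular references, the sketch below resolves "&" keys and "*name" strings in decoded JSON, using the car and mirror examples shown earlier. The function name `resolve_aliases` is hypothetical, and the sketch assumes every anchor appears before its aliases in document order. It ignores tags, sequence and scalar anchors, and escaping.

```python
import json


def resolve_aliases(node, anchors=None):
    """Copy `node`, registering "&" anchors and replacing "*name" aliases."""
    if anchors is None:
        anchors = {}
    if isinstance(node, dict):
        result = {}
        # Register before descending so a circular alias can find its target.
        if "&" in node:
            anchors[node["&"]] = result
        for key, value in node.items():
            if key != "&":
                result[key] = resolve_aliases(value, anchors)
        return result
    if isinstance(node, list):
        return [resolve_aliases(item, anchors) for item in node]
    if isinstance(node, str) and node.startswith("*"):
        return anchors[node[1:]]  # KeyError if the anchor was never seen
    return node


cars = resolve_aliases(json.loads(
    '{ "His car": { "&": "001", "make": "Volvo", "vin": "918273645" },'
    ' "Her car": "*001" }'))
mirror = resolve_aliases(json.loads('{ "&": "Mirror", "look": "*Mirror" }'))
```

After loading, `cars["His car"]` and `cars["Her car"]` are the same dict object, and `mirror["look"]` is `mirror` itself.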
Any node can be used as a mapping key by first using the node as a mapping value whose key is of the form "&" plus an identifier. Then the alias string form can be used to reference it. The original key/value pair is not loaded as part of the graph, only the alias references are.
{ "!": "DiceDistribution", "&11": [1, 1], "&66": [6, 6], "*11": 42, "*66": 53 }
YAML:
--- !DiceDistribution
[1, 1]: 42
[6, 6]: 53
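The complex key rule can be sketched as a two-pass walk over one decoded mapping. The helper names `freeze` and `load_complex_keys` are hypothetical. Converting lists to tuples is a Python-specific assumption, needed only because Python dict keys must be hashable. The tag entry is left in place as an ordinary key.

```python
import json


def freeze(node):
    """Make a key node hashable: lists become tuples (Python-specific)."""
    if isinstance(node, list):
        return tuple(freeze(item) for item in node)
    return node


def load_complex_keys(mapping):
    """Replace "*id" keys with the nodes defined under matching "&id" keys."""
    key_nodes = {}
    # Pass 1: collect key definitions. A bare "&" is the mapping's own anchor.
    for key, value in mapping.items():
        if len(key) > 1 and key.startswith("&"):
            key_nodes[key[1:]] = value
    # Pass 2: build the result, dropping the definitions themselves.
    result = {}
    for key, value in mapping.items():
        if len(key) > 1 and key.startswith("&"):
            continue
        if len(key) > 1 and key.startswith("*"):
            result[freeze(key_nodes[key[1:]])] = value
        else:
            result[key] = value
    return result


dice = load_complex_keys(json.loads(
    '{ "!": "DiceDistribution", "&11": [1, 1], "&66": [6, 6],'
    ' "*11": 42, "*66": 53 }'))
```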
A directive is a piece of information that gives the parser some extra information. YAML has only 2 directives:
%YAML 1.2
%TAG !foo! tag:foo.com,2009:
%TAG !bar! tag:bar.com,2010:
The %YAML directive indicates the YAML specification version used, and the %TAG directive provides a way to turn tag abbreviations into fully qualified tags.
If a directive is needed in JSYNC, you wrap the entire stream with a sequence that has a special mapping as its first value. This mapping contains the directives, and it is required to have a %JSYNC key.
[ { "%JSYNC": "1.0", "%TAG": { "!foo!": "tag:foo.com,2009:", "!bar!": "tag:bar.com,2010:" } }, { "!": "foo!this", "some": { "!": "bar!that", "thing": "borrowed" } } ]
YAML:
%YAML 1.2
%TAG !foo! tag:foo.com,2009:
%TAG !bar! tag:bar.com,2010:
--- !foo!this
some: !bar!that
  thing: borrowed
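The tag expansion that %TAG enables can be sketched as follows. The function `expand_tag` is a hypothetical helper, and the default handles are the ones the YAML specification defines ("!" for local tags, "!!" for tag:yaml.org,2002:). The input is assumed to be in YAML shorthand form with its leading "!". A tag taken from the "!" key of a JSYNC mapping, such as "foo!this", would need the "!" prepended first.

```python
import re

# Default handles as defined by the YAML specification.
DEFAULT_HANDLES = {"!": "!", "!!": "tag:yaml.org,2002:"}

# A handle is "!", "!!" or "!name!"; the rest of the tag is the suffix.
_SHORTHAND = re.compile(r"^(!(?:[0-9A-Za-z-]*!)?)(.+)$")


def expand_tag(tag, handles=None):
    """Expand a shorthand or verbatim tag using a %TAG handle table."""
    if tag.startswith("!<") and tag.endswith(">"):
        return tag[2:-1]  # verbatim tags are already fully qualified
    table = dict(DEFAULT_HANDLES)
    table.update(handles or {})
    match = _SHORTHAND.match(tag)
    if match is None:
        raise ValueError(f"not a tag: {tag!r}")
    handle, suffix = match.groups()
    if handle not in table:
        raise ValueError(f"undeclared tag handle: {handle!r}")
    return table[handle] + suffix


directives = {"!foo!": "tag:foo.com,2009:", "!bar!": "tag:bar.com,2010:"}
expanded = expand_tag("!foo!this", directives)
```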
Encoding zero or more top level nodes in JSYNC uses the same wrapping mechanism described in the previous section. After the first special mapping in the sequence, each subsequent element represents a top level node.
[ {"%JSYNC":"1.0"}, {"!": "Message", "text": "Hello there"}, {"!": "Message", "text": "O HAI"}, {"!": "Message", "text": "KTHXBAI"} ]
YAML:
--- !Message
text: Hello there
--- !Message
text: O HAI
--- !Message
text: KTHXBAI
A minimal JSYNC serialization of zero top level nodes would be:
[{"%JSYNC":"1.0"}]
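The wrapping mechanism can be sketched with Python's standard json module. The names `wrap_stream` and `unwrap_stream` are hypothetical, not part of any JSYNC API. The sketch assumes the documents are already valid JSYNC nodes, so it performs no escaping or tag handling.

```python
import json

JSYNC_VERSION = "1.0"


def wrap_stream(documents, directives=None):
    """Serialize zero or more top level nodes inside the JSYNC wrapper."""
    header = {"%JSYNC": JSYNC_VERSION}
    header.update(directives or {})
    return json.dumps([header, *documents])


def unwrap_stream(text):
    """Return (directives, documents) for a wrapped or unwrapped stream."""
    data = json.loads(text)
    wrapped = (isinstance(data, list) and len(data) > 0
               and isinstance(data[0], dict) and "%JSYNC" in data[0])
    if not wrapped:
        # Plain JSON: a single top level node and no directives.
        return {}, [data]
    return data[0], data[1:]


messages = [{"!": "Message", "text": "Hello there"},
            {"!": "Message", "text": "O HAI"}]
header, documents = unwrap_stream(wrap_stream(messages))
```

Returning the directives separately lets the caller apply them, for example %TAG handles, before constructing any of the documents.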
JSYNC extends this wrapper notion to allow top level scalars to be serialized. This is not possible in JSON.
[ {"%JSYNC":"1.0"}, "!Quote A rose by any other name would smell as sweet." ]
YAML:
--- !Quote A rose by any other name would smell as sweet.
Of course, multiple top level scalar nodes, or any combination of top level mappings, sequences and/or scalars, are allowed.
Every JSYNC stream must be valid JSON syntax. JSYNC uses all of the JSON syntax, and adds nothing to it.
JSYNC adds extra information to mappings, sequences and scalars, using 3 simple and similar techniques. These techniques will be covered separately in the following 3 sections.
JSYNC adds extra information to JSON mappings by using 2 special mapping keys: "!" (for tag) and "&" (for anchor).
Note: To use these strings as actual keys, see JSYNC Escaping below.
For example:
{ "!": "Fruit", "&": "001", "name": "apple", "color": "red" }
YAML:
--- !Fruit &001
name: apple
color: red
JSYNC adds extra information to JSON sequences by using a special string in the first position of the sequence. The string must contain a tag, an anchor, or both (separated by a single space).
For example:
[ "!Groceries &002", "Bread", "Milk", "Orange Juice" ]
YAML:
--- !Groceries &002
- Bread
- Milk
- Orange Juice
JSYNC adds extra information to JSON scalars by prepending a tag or an anchor (or both), each followed by a single space, to the start of a string scalar.
For example:
[ "!Fruit apple", "!Fruit pear", "!Vegetable carrot", "!null " ]
YAML:
- !Fruit apple
- !Fruit pear
- !Vegetable carrot
- !null ''
Note: When tagging an empty string, a space is still required after the tag.
Note: In practice, programming languages do not care whether or not equivalent scalars are actually identical. Therefore, anchors are not typically used with scalar values.
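A minimal sketch of splitting such a string into its parts is shown below. The helper name `split_scalar` is hypothetical. It assumes that when both are present the tag precedes the anchor, which matches the examples here but is not stated as a rule. Escaped strings (see the next section) do not start with "!" or "&", so they pass through untouched.

```python
def split_scalar(text):
    """Split a JSYNC string into (tag, anchor, value)."""
    tag = anchor = None
    rest = text
    if rest.startswith("!"):
        # The tag runs up to the first space; the space itself is dropped.
        tag, _, rest = rest.partition(" ")
    if rest.startswith("&"):
        name, _, rest = rest.partition(" ")
        anchor = name[1:]
    return tag, anchor, rest


apple = split_scalar("!Fruit apple")
empty = split_scalar("!null ")
```

The same helper also reads the first element of a tagged sequence: `split_scalar("!Groceries &002")` yields the tag, the anchor, and an empty remainder.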
JSYNC does not reserve any strings as special. In other words, you can serialize any scalar value in JSYNC. To distinguish literal text from JSYNC notation, a "." is used as an escape prefix. On serialization, any string beginning with a "!", "&", "%" or "*" (or with one or more contiguous "." characters followed by one of those four) must have a period added to the start. On deserialization, a starting period followed by one of those sequences is removed. A starting period that is not followed by one of those sequences is *not* removed.
Consider this YAML:
--- !T1 &A1
"!": .! .! .!
"&": ...&hmm
"%": ".1"
.: ...
"*A1": *A1
The equivalent JSYNC would be:
{ "!": "T1", "&": "A1", ".!": "..! .! .!", ".&": "....&hmm", ".%": ".1", ".": "...", ".*A1": "*A1" }
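The escaping rule can be captured in a pair of small functions. This is a sketch of one reading of the rule. The names `escape` and `unescape` are hypothetical, and both functions operate on a single string (a mapping key or a scalar value).

```python
import re

# Zero or more leading periods followed by one of the four special characters.
_SPECIAL = re.compile(r"^\.*[!&%*]")


def escape(text):
    """Prefix a period when literal text could be mistaken for JSYNC notation."""
    return "." + text if _SPECIAL.match(text) else text


def unescape(text):
    """Remove the period added by escape(); leave every other string alone."""
    if text.startswith(".") and _SPECIAL.match(text[1:]):
        return text[1:]
    return text


escaped_key = escape("!")
escaped_value = escape(".! .! .!")
```

Strings such as ".1", "." and "..." contain none of the four special characters after their periods, so both functions return them unchanged, as in the example above.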
Note: This section is under heavy construction at the moment. Don’t take it too seriously yet.
It is strongly encouraged that all JSYNC implementations use the same API. The lack of a common API has been an adoption barrier in both YAML and JSON. To that end, the following API is suggested.
A complete YAML implementation has the following processing stack. A complete JSYNC implementation could have the same.
Loader Stack      Memory Representation       Dumper Stack
==============================================================
Loader                                              Dumper
      \                                           /
                 (Native Data/Objects)
      /                                           \
Constructor                                    Representer
      \                                           /
                 (Generic Node Graph)
      /                                           \
Composer   <-->        Resolver        <-->     Serializer
      \                                           /
                     (Event Tree)
      /                                           \
Parser                                             Emitter
      \                                           /
                    (Token Stream)
      /                                           /
Scanner
      \                                           /
                  (Character Stream)
      /                                           \
Reader                                              Writer
      \                                           /
                (String or File Handle)
Typically a JSON implementation will simply have an encode() and decode() function set. The first implements Dumper→Writer and the other implements Reader→Loader, all in one atomic operation. This is very simple, although it prevents useful processing from being done at the intermediate stages.