Commit bf8eb81

Merge pull request #264 from mitza-oci/master
Import the Intermediate Type Language (itl) library which was previously in a separate git repo.
2 parents 74ac421 + 5360b3f commit bf8eb81

37 files changed

Lines changed: 8618 additions & 3 deletions

.gitmodules

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,3 @@
1+
[submodule "tools/IntermediateTypeLang/cpp/rapidjson"]
2+
path = tools/IntermediateTypeLang/cpp/rapidjson
3+
url = git://github.com/miloyip/rapidjson.git

.travis.yml

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ branches:
25 25
before_script:
26 26
- export
27 27
- tar czf modeling_plugins.tar.gz tools/modeling/plugins
28-
- rm -rf tools/modeling/plugins
28+
- rm -rf tools/modeling/plugins tools/IntermediateTypeLang/cpp/rapidjson
29 29
- perl $DDS_ROOT/tools/scripts/dds_fuzz.pl
30 30
- if [ "$CXX" == "g++" ]; then ./configure; fi
31 31
- if [ "$CXX" == "clang++" ]; then ./configure --compiler=clang++ --no-inline; fi

MPC/config/itl.mpb

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
1 1
feature(!no_itl) {
2-
includes += $(ITL_ROOT)/cpp
3-
includes += $(ITL_ROOT)/cpp/rapidjson/include
2+
includes += $(DDS_ROOT)/tools/IntermediateTypeLang/cpp
3+
includes += $(DDS_ROOT)/tools/IntermediateTypeLang/cpp/rapidjson/include
4 4
}
5 5

6 6
feature(no_itl) {
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
1+
* General
2+
** TODO Pick a license
3+
** TODO Examples
4+
** TODO Common annotations (enums, bitsets, rational numbers, strings, maps, sets, etc.)
5+
* Grammar
6+
** TODO Tuples
7+
* Bindings
8+
Support for generating and parsing ITL in various languages.
9+
** C++
10+
*** Features
11+
*** Bugs
12+
*** Tests
13+
*** Documentation
14+
** Java
15+
*** Features
16+
*** Bugs
17+
*** Tests
18+
*** Documentation
19+
* Producers
20+
Add support for generating ITL.
21+
- IDL (DDS)
22+
- FAST
23+
* Consumers
24+
Add support for consuming ITL or producing from ITL.
25+
- FROM (Pronghorn)
26+
- Avro
27+
- Wireshark (DDS)
28+
- XMI (DDS Modeling)
29+
- DDS Modeling Eclipse Plugin
30+
- Protobuf
31+
- Thrift
32+
- TypeCode
33+
* Demos
Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
1+
Intermediate Type Language
2+
==========================
3+
4+
Intermediate Type Language (ITL) is an attempt to provide a common
5+
type system for serialization schemes in a machine-friendly format.
6+
7+
# Problem
8+
A common practice for applications that transmit or store data is to
9+
define the structure of the data in a neutral representation and then
10+
generate types that provide a native representation of the data in a
11+
particular programming language and code that can serialize and
12+
deserialize those types.
13+
14+
There are three common problems with this approach. First, the
15+
neutral representation used to describe the data is often too heavy
16+
for automated translation. For example, IDL has a number of
17+
user-friendly features that are not machine friendly. Second, the
18+
type system is often coupled to the serialization scheme.
19+
Translating between serialization schemes leads to an N^2 problem of
20+
matching types in different systems. Third, serialization often
21+
assumes that the source or target is a native object in a programming
22+
language. That is, the language-neutral type has a corresponding
23+
concrete type in a programming language and the goal is to serialize
24+
to and from a value of the concrete type. This prevents potential
25+
optimization where the data is left in a serialized form and
26+
selectively deserialized as needed.
27+
28+
# Use Case
29+
30+
The primary use case for ITL is the development of translators that
31+
convert from one serialization scheme to another. The user provides a
32+
description of the incoming/outgoing data using ITL. The translator
33+
uses the ITL description of the data to perform the appropriate
34+
translation. The translator may interpret the ITL at run-time or the
35+
translator may be generated from ITL.
36+
37+
# Design Goals
38+
39+
1. General - ITL should support common types found in existing
40+
infrastructure such as IDL, FAST, Avro, Google Protocol Buffers,
41+
Thrift, etc.
42+
2. Machine-friendly - ITL should be easy to generate, easy to parse,
43+
and easy to use once parsed.
44+
3. Extensible - ITL should provide a means of annotating types with
45+
their intended use and external encoding-specific details, e.g., delta
46+
compression.
47+
48+
ITL is descriptive and not prescriptive. The types that can be
49+
described with ITL may be a subset or superset of the types that can
50+
be described in another language. If a tool cannot describe a type in
51+
ITL, then ITL should not be used (and the user should be informed).
52+
If a serializer or deserializer is given a type that cannot be
53+
represented in that serialization scheme, then an appropriate failure
54+
mode should be adopted.
55+
56+
# Scalar Types
57+
58+
- **int** - Represents an integral number. An int has the number of
59+
bits needed to represent values of this type and a flag indicating
60+
if the values are unsigned. If the number of bits is not present,
61+
then the values of this type may have arbitrary magnitude. The
62+
unsigned flag is optional and assumed to be false.
63+
- **float** - Represents a floating-point number. A float has an
64+
optional model that describes the values represented by this
65+
type.
66+
- **fixed** - Represents a fixed-point number. A fixed has a base,
67+
the total number of digits, and a scale that indicates the number of
68+
digits after the decimal point.
69+
- **string** - Represents a text sequence.
70+
71+
Integers, fixed-point numbers, and strings have an optional set of
72+
name-value pairs and an optional flag indicating if values of this
73+
type are constrained to the specified set of values. A value in a
74+
name-value pair is stored as a string for use as a union
75+
discriminator. If an integer, fixed-point, or string is used as a
76+
discriminator, then the set of name-value pairs must be one-to-one.
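
The scalar descriptions above can be sketched as plain JSON-compatible objects. This is only an illustration under stated assumptions: the key names follow the grammar given later in this README, while the type names (`uint32`, `color`, `price`) and the helper `is_one_to_one` are hypothetical and not part of ITL or its bindings.

```python
import json

# A 32-bit unsigned integer.
uint32 = {"kind": "int", "bits": 32, "unsigned": True}

# An enum-like 8-bit integer. Named values are stored as strings, and
# "constrained" limits the type to exactly the listed values.
color = {
    "kind": "int",
    "bits": 8,
    "values": {"RED": "0", "GREEN": "1", "BLUE": "2"},
    "constrained": True,
}

# A fixed-point number: base 10, 9 digits in total, 2 after the point.
price = {"kind": "fixed", "base": 10, "digits": 9, "scale": 2}


def is_one_to_one(values):
    """Return True if no two names map to the same value (hypothetical helper)."""
    return len(set(values.values())) == len(values)


if __name__ == "__main__":
    print(json.dumps(color, sort_keys=True))
    print(is_one_to_one(color["values"]))
```

A producer could run a check like `is_one_to_one` before using a scalar type as a union discriminator, since the text requires the name-value pairs to be one-to-one in that case.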
77+
78+
# Compound Types
79+
80+
- **sequence** - Represents a homogeneous sequence of values of a given
81+
type. A sequence has one of the following:
82+
1. No size or capacity indicating a dynamic size.
83+
2. An integer size indicating a fixed size.
84+
3. An array of sizes indicating the size of each dimension.
85+
4. A capacity indicating a dynamic size with a limit on the number of values in the sequence.
86+
The size setting is preferred to the capacity. If the elements
87+
have a fixed size, then size and capacity can be used to
88+
pre-allocate buffers.
89+
- **record** - A record represents a potentially heterogeneous sequence of
90+
named values. A record is defined by a list of fields. Each field
91+
has a name, a type, and an optional flag indicating if the field is
92+
optional. The name of each field must be unique.
93+
- **union** - A union represents a value from a finite set of types.
94+
A union has a discriminator type (int, fixed, string) that is used
95+
to determine the actual type and a non-empty set of fields. A union
96+
field has a name, a type, and a set of labels of the discriminator
97+
type. A label must correspond to a named value of the discriminator
98+
type. The name of each field must be unique. The label sets of
99+
any two union fields must be disjoint. An empty set
100+
of labels means that this field is the default.
101+
- **alias** - An alias for another type. An alias has a name and type.
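
As a rough sketch of the compound types above, the following builds a bounded sequence, a record, and a union, then checks the two union rules stated in the text (unique field names, disjoint label sets). The names `samples`, `point`, `shape`, and `check_union` are hypothetical. The example also assumes that a label is matched against the *name* of a discriminator value; the text leaves open whether the name or the value is meant.

```python
from itertools import combinations

# A sequence of 32-bit ints with a dynamic size of at most 100 elements.
samples = {"kind": "sequence", "type": {"kind": "int", "bits": 32}, "capacity": 100}

# A record with one required and one optional field.
point = {
    "kind": "record",
    "fields": [
        {"name": "x", "type": {"kind": "float", "model": "binary64"}},
        {"name": "label", "type": {"kind": "string"}, "optional": True},
    ],
}

# A union discriminated by an enum-like int.
shape_kind = {
    "kind": "int",
    "bits": 8,
    "values": {"CIRCLE": "0", "SQUARE": "1"},
    "constrained": True,
}
shape = {
    "kind": "union",
    "discriminator": shape_kind,
    "fields": [
        {"name": "radius", "type": {"kind": "float"}, "labels": ["CIRCLE"]},
        {"name": "side", "type": {"kind": "float"}, "labels": ["SQUARE"]},
        {"name": "other", "type": {"kind": "string"}, "labels": []},  # default
    ],
}


def check_union(union):
    """Return True if field names are unique and label sets are disjoint."""
    names = [field["name"] for field in union["fields"]]
    if len(names) != len(set(names)):
        return False
    for a, b in combinations(union["fields"], 2):
        if set(a["labels"]) & set(b["labels"]):
            return False
    return True


if __name__ == "__main__":
    print(check_union(shape))
```

The check deliberately does not verify that each label exists in the discriminator type, because that depends on the label interpretation assumed above.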
102+
103+
# Float Models
104+
105+
A float model refers to a specification for floating-point numbers.
106+
When a model is specified for a floating-point type, it means that any
107+
value of the corresponding type *may* be represented by an
108+
implementation of the model. An implementation is not restrained by
109+
the model in its approach to encoding the number. However,
110+
implementations and users must be prepared to handle lossy conversions
111+
and respond appropriately.
112+
113+
- "binary16" - IEEE 754 of same name
114+
- "binary32" - IEEE 754 of same name
115+
- "binary64" - IEEE 754 of same name
116+
- "binary128" - IEEE 754 of same name
117+
- "decimal32" - IEEE 754 of same name
118+
- "decimal64" - IEEE 754 of same name
119+
- "decimal128" - IEEE 754 of same name
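
One way a translator might detect the lossy conversions mentioned above is to round-trip a value through the target model and compare. The sketch below uses Python's standard `struct` module, which only covers binary16, binary32, and binary64; the helper name `fits_exactly` is hypothetical and the approach is an assumption, not something ITL prescribes.

```python
import struct

# struct format codes for the IEEE 754 binary models that struct supports.
_FORMATS = {"binary16": "<e", "binary32": "<f", "binary64": "<d"}


def fits_exactly(value, model):
    """Return True if value survives a round trip through model unchanged.

    NaN never compares equal to itself, so it is reported as not fitting.
    """
    fmt = _FORMATS[model]
    try:
        packed = struct.pack(fmt, value)
    except OverflowError:
        # The magnitude is too large for the target model.
        return False
    return struct.unpack(fmt, packed)[0] == value


if __name__ == "__main__":
    for model in ("binary16", "binary32", "binary64"):
        print(model, fits_exactly(0.1, model))
```

A translator that finds `fits_exactly` returning False can then decide whether to round, warn, or fail, which is the "respond appropriately" left open by the text.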
120+
121+
# Annotations
122+
123+
Annotations provide a way to capture semantics about encoded data that
124+
govern its use. To illustrate,
125+
consider the problem of serializing a set. A serialized
126+
set looks like a sequence. However, when deserializing, the
127+
translator should attempt to restore set semantics by using an
128+
appropriate data type. In this case, the sequence should
129+
be annotated as a set. Annotations also provide a way to record
130+
details related to a particular encoding. For example, FAST delta
131+
compression assumes a known starting value and then sends updates to
132+
that value. In this case, the field containing the value should be
133+
annotated with delta compression so that translators will know (and
134+
can take advantage of) this fact.
135+
136+
Annotations are a set of key/value pairs where each key corresponds to a
137+
system. For the set example, the key may be "semantic" and the value
138+
may be { "preferredDataType" : "set" }. For the delta compression
139+
example, the key may be "FAST" and the value may be { "compression" :
140+
"delta" }. Value nesting is allowed to facilitate the creation of
141+
ontologies for different systems.
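
The two annotation examples above might look as follows when attached through the optional "note" field that this README describes for every JSON object. The objects `tags` and `last_price` and the helpers `annotation` and `restore` are hypothetical names used only for this sketch.

```python
# A sequence annotated so that consumers can restore set semantics.
tags = {
    "kind": "sequence",
    "type": {"kind": "string"},
    "note": {"semantic": {"preferredDataType": "set"}},
}

# A record field annotated with a FAST-specific encoding detail.
last_price = {
    "name": "lastPrice",
    "type": {"kind": "fixed", "base": 10, "digits": 9, "scale": 2},
    "note": {"FAST": {"compression": "delta"}},
}


def annotation(obj, system, key, default=None):
    """Look up one annotation value, tolerating a missing note or system."""
    return obj.get("note", {}).get(system, {}).get(key, default)


def restore(values, type_def):
    """Return a set if the type is annotated as a set, otherwise a list."""
    if annotation(type_def, "semantic", "preferredDataType") == "set":
        return set(values)
    return list(values)


if __name__ == "__main__":
    print(restore(["a", "b", "a"], tags))
    print(annotation(last_price, "FAST", "compression"))
```

Because unknown systems and keys simply fall through to the default, a translator can ignore annotations it does not understand.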
142+
143+
# Implementation
144+
145+
ITL is written using JSON to achieve machine friendliness. ITL
146+
presents a self-contained representation of types. There is no
147+
facility for importing types from external resources. There is no
148+
direct support for inheritance.
149+
150+
# Grammar
151+
152+
The grammar is presented as a JSON/BNF hybrid. Non-terminals are
153+
capitalized (Root) and terminals are lower-case (integer).
154+
Terminals refer to JSON values with the same name. The terminal
155+
"value" represents any JSON value. The construct ( ... )? represents
156+
an optional group.
157+
158+
```
159+
Root:
160+
{ "types" : [ TypeDef ] }
161+
162+
TypeDef:
163+
{ "kind" : "int" (, "bits" : integer)? (, "unsigned" : boolean)? (, "values" : Values)? (, "constrained" : boolean)? }
164+
| { "kind" : "float" (, "model" : FloatModel)? }
165+
| { "kind" : "fixed", "base" : integer, "digits" : integer, "scale" : integer (, "values" : Values)? (, "constrained" : boolean)? }
166+
| { "kind" : "string" (, "values" : Values)? (, "constrained" : boolean)? }
167+
| { "kind" : "sequence", "type" : Type (, "size" : (integer | [ integer ]) )? (, "capacity" : integer )? }
168+
| { "kind" : "record", "fields" : [ Field ] }
169+
| { "kind" : "union", "discriminator" : Type, "fields" : [ UnionField ] }
170+
| { "kind" : "alias", "name" : string, "type" : Type }
171+
172+
Type:
173+
string
174+
| TypeDef
175+
176+
Field:
177+
{ "name" : string, "type" : Type (, "optional" : boolean)? }
178+
179+
UnionField:
180+
{ "name" : string, "type" : Type, "labels" : [ string ] }
181+
182+
FloatModel: "binary16" | "binary32" | "binary64" | "binary128" | "decimal32" | "decimal64" | "decimal128"
183+
184+
Values: JSON object where all field values are strings
185+
```
186+
187+
Every JSON Object ({ ... }) has an optional note field ("note" : {
188+
... }) for annotating the field, type, etc.
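
To make the grammar concrete, here is a small ITL document together with a shallow structural check. It is a sketch under stated assumptions: the check only verifies the "kind" and the required keys of each top-level TypeDef, the names `REQUIRED`, `check_typedef`, `check_document`, and `DOC` are hypothetical, and using the string "Celsius" to refer to the alias of that name is an assumed reading of `Type: string`.

```python
import json

# Required keys per kind, read off the TypeDef alternatives in the grammar.
REQUIRED = {
    "int": set(),
    "float": set(),
    "fixed": {"base", "digits", "scale"},
    "string": set(),
    "sequence": {"type"},
    "record": {"fields"},
    "union": {"discriminator", "fields"},
    "alias": {"name", "type"},
}


def check_typedef(type_def):
    """Return a list of problems found in one TypeDef (empty if none)."""
    kind = type_def.get("kind")
    if kind not in REQUIRED:
        return ["unknown kind: %r" % (kind,)]
    missing = REQUIRED[kind] - set(type_def)
    return ["%s is missing %s" % (kind, key) for key in sorted(missing)]


def check_document(text):
    """Parse an ITL document and check each top-level TypeDef."""
    document = json.loads(text)
    problems = []
    for type_def in document.get("types", []):
        problems.extend(check_typedef(type_def))
    return problems


DOC = """
{ "types" : [
    { "kind" : "alias", "name" : "Celsius",
      "type" : { "kind" : "float", "model" : "binary32" },
      "note" : { "units" : { "symbol" : "degC" } } },
    { "kind" : "record", "fields" : [
        { "name" : "reading", "type" : "Celsius" } ] },
    { "kind" : "fixed", "base" : 10, "digits" : 5 }
] }
"""

if __name__ == "__main__":
    print(check_document(DOC))
```

The last TypeDef in `DOC` omits "scale" on purpose, so the check reports one problem for it. Nested types and name resolution are not checked here.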
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
main: main.cpp itl/itl.hpp
2+
	g++ -g -Wall -Irapidjson/include/ main.cpp -o main
3+
4+
.PHONY: clean
5+
clean:
6+
	-rm -f main
