Kotlin makes it fairly easy to define and describe complex object graphs. A contrived and simplified example could be:
data class Hobby(val name: String)
data class Person(
    val name: String,
    val age: Int,
    val hobby: Hobby,
    val friends: List<Person> = listOf())
// use the data definition to define a particular object
Person("pete", 10, Hobby("reading"),
    listOf(
        Person("tom", 33, Hobby("writing")),
        Person("mike", 9, Hobby("transpiling"))))
This ease of definition may inspire the programmer to create DSLs, for example to describe statically maintained lookup data in the source code or data for testing purposes. To read or write such data dynamically though, e.g. reading it from a file or sending it over the network, the programmer quickly loses the nice syntax of her "object DSL." High-quality JSON (de)serializers are often used at this point to get a somewhat human-readable form of the data for transmission, but that only gets our programmer so far. The JSON data will expose many implementation details that the object's creator never meant to be part of the DSL in the first place, e.g. the package names of the classes involved.
Sono provides a parser (and printer), inspired by JSON's and
Kotlin's syntax, to load (and write) objects of statically defined
shape. The following are the main drivers behind Sono:
- The serialized form of the objects, i.e. the notation, is to be consumed and produced primarily by humans, with tooling that is good at supporting textual presentation.
- Type safety: loading Sono-serialized data shall result in an explicit error if it cannot be mapped to predefined structures.
- Single source of truth: the "predefined structures" mentioned above must be conventional constructs in Kotlin, e.g. class, object, interface definitions. The (Kotlin) programmer is supposed to keep her DSL defined as is and merely leverage Sono as a convenient, user-facing notation to describe objects.
- Hide implementation details: the notation in Sono shall abstract away implementation details of an object's underlying structure, such that a certain degree of change to those structures is possible over time without breakage.
As an example of the notation, "pete" from the above Kotlin code
could be described in Sono as follows:
Person {
    name: "pete",
    age: 10,
    hobby: "reading",
    friends: [
        Person("tom", 33, Hobby("writing")),
        Person("mike", 9, Hobby("transpiling"), [])
    ]
}
(The precise ways available to describe a Person vary with the
exact definition of the class / object. A discussion of the
possibilities follows further down.) Loading a string with such
content into an object at runtime looks like this:
val s = "..."
val p = try {
    Sono().parse<abc.def.Person>(s)
} catch (ex: ParseException) {
    // report every individual problem on its own line
    for (e in ex.errors) {
        println(e)
    }
    // keep the original exception as the cause
    throw IllegalStateException("could not load person: $ex", ex)
}
Right now, Sono is at the proof-of-concept stage with its public API
mostly stable. Future work will focus on:
- Language feature: support varargs parameters in call syntax
- Thread safety: making a shared Sono instance safe for use from multiple threads
- Performance: this aspect has not yet received much effort and the current implementation is suboptimal in certain respects
- Error messages: more work needs to be done to provide good user-facing error messages
- Documentation: formal syntax definition and a language reference with many explanatory examples
- Testing: a rigorous test suite to guard against regressions from future changes
Parsing a Sono document always targets a programmer-provided type.
That is, parsing either resolves the document into an object of the
specified type or fails.
Sono understands the following basic types:
- string: multiline capable, enclosed in double quotes, e.g. "...", with the escape character \ supporting the sequences \\, \", \b, \f, \n, \r, and \t. An escape character immediately followed by a literal newline discards that newline, allowing multiline input to be parsed into a single-line string.
- int: always decimal; at least one digit (0..9), optionally preceded by a minus sign to denote a negative value
- float: always decimal; at least one digit (0..9) followed by a dot (.) followed by at least one more digit
- boolean: the literals true and false
- null: the one billion dollar mistake ;-)
- list: zero or more items enclosed in brackets, i.e. a pair of [ and ], with items separated by a comma (,); if at least one item is specified, a trailing comma is allowed and silently discarded
- keyword map: zero or more key / value pairs enclosed in braces, i.e. a pair of { and }. Keys and values are separated by a colon (:). Keys are unqualified identifiers (matching the regex [a-zA-Z_][a-zA-Z0-9_]*). Values are, well, just that, i.e. native types or objects. ("Keyword maps" are used to construct objects. See below.)
- value map: zero or more key / value pairs enclosed in braces, i.e. a pair of { and }. Keys and values are separated by a colon (:). Both keys and values are native types or objects, with the assumption that the key type properly implements the equals / hashCode contract. (Unlike "keyword maps", "value maps" cannot be used to construct arbitrary objects; they are solely intended to construct mappings from keys to values.)
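The lexical rules for strings, ints, floats and identifiers can be made concrete with a small sketch. This is not Sono's scanner: the names INT_LITERAL, FLOAT_LITERAL, IDENTIFIER and decodeStringBody are hypothetical, and the code only mirrors the rules as described above. The description does not say whether a float may carry a sign, so the sketch leaves that out.

```kotlin
// Hypothetical names; a sketch of the literal rules described above, not Sono's scanner.
val INT_LITERAL = Regex("-?[0-9]+")
val FLOAT_LITERAL = Regex("[0-9]+\\.[0-9]+") // assumption: no sign, the text does not mention one
val IDENTIFIER = Regex("[a-zA-Z_][a-zA-Z0-9_]*")

/** Decodes the body of a string literal, i.e. the text between the double quotes. */
fun decodeStringBody(body: String): String {
    val out = StringBuilder()
    var i = 0
    while (i < body.length) {
        val c = body[i]
        if (c != '\\') {
            out.append(c)
            i++
            continue
        }
        require(i + 1 < body.length) { "dangling escape character at end of string" }
        when (val next = body[i + 1]) {
            '\\' -> out.append('\\')
            '"' -> out.append('"')
            'b' -> out.append('\b')
            'f' -> out.append('\u000C') // form feed; Kotlin has no '\f' literal
            'n' -> out.append('\n')
            'r' -> out.append('\r')
            't' -> out.append('\t')
            '\n' -> { /* escaped literal newline: discard it */ }
            else -> throw IllegalArgumentException("unsupported escape sequence: \\$next")
        }
        i += 2
    }
    return out.toString()
}
```

For instance, decodeStringBody applied to a body that contains a backslash directly followed by a line break joins the two lines into one, which is the "single-line string" behaviour mentioned for the string type.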
Composed values, i.e. instances of classes, are written in Sono
using:
- Named keyword maps, written as Name { param1: value1, param2: value2 } with Name referring to the "simple" class name of the value's type, instantiate the named class using the constructor whose parameter names match the key set of the provided map.
- Call syntax, written as Name(value1, value2) with Name referring to the "simple" class name of the value's type, instantiates the named class using the constructor with the specified number of required arguments. (Note: since the call syntax does not allow referring to parameters by name, but addresses them solely by position, only those optional parameters following the last required one are supported.)
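To illustrate how the two notations constrain the choice of constructor, here is a minimal sketch. It works on a hypothetical Param model rather than on real Kotlin constructors, the function names are made up, and it assumes that a keyword map with a key that names no parameter is rejected; Sono's actual matching may differ in such details.

```kotlin
// Hypothetical model of a constructor parameter; Sono itself inspects real Kotlin constructors.
data class Param(val name: String, val optional: Boolean = false)

/** Keyword map: every key names a parameter and every required parameter is covered. */
fun acceptsKeywords(params: List<Param>, keys: Set<String>): Boolean {
    val names = params.map { it.name }.toSet()
    val required = params.filter { !it.optional }.map { it.name }
    return names.containsAll(keys) && keys.containsAll(required)
}

/** Call syntax: arguments are positional, so only trailing optional parameters may be left out. */
fun acceptsPositional(params: List<Param>, argCount: Int): Boolean {
    // everything up to and including the last required parameter must be supplied
    val minArgs = params.indexOfLast { !it.optional } + 1
    return argCount in minArgs..params.size
}

// The primary constructor of Person from the introductory example.
val personParams = listOf(
    Param("name"), Param("age"), Param("hobby"), Param("friends", optional = true)
)
```

With personParams, both Person("tom", 33, Hobby("writing")) and the four-argument form with an explicit friends list pass acceptsPositional, while a two-argument call does not.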
As already mentioned, when parsing a Sono source, the programmer
specifies a particular type which the source has to represent; hence,
it is the specified type that ultimately defines the validity of the
source. Essentially, the parser makes up its mind about how the
source string has to be structured and maps the input tokens onto the
target type. The following rules apply when mapping a Sono value
to a Kotlin type:
- strings can be specified in place of a type implementing CharSequence or of an Enum type, in which case the string must match one of the enumeration's values.
- ints can be used in place of Kotlin's integer types, i.e. Byte, Short, Int, Long, and their equivalent "unsigned variants", as well as Float or Double. Additionally, BigInteger and BigDecimal are supported as targets for Sono ints.
- floats can be used in places where Floats, Doubles, and BigDecimals are required.
- booleans correspond to Kotlin's Boolean.
- nulls can be placed in positions allowing nullable types.
- lists can be used in places expecting a List or Set.
- "Unnamed" keyword maps (i.e. keyword maps without a preceding identifier) can be used in place of Map<String, *>, since the keys of keyword maps are identifiers.
- "Named" keyword maps (i.e. keyword maps with a preceding identifier) must target a type whose simple name equals the map's name and which has a constructor whose required parameters are fully covered by the map's keys. Additionally, if the target type is a final class and there is no ambiguity, the map's name can be omitted, i.e. an "unnamed" keyword map can be used directly.
- The "call syntax" above can be thought of as a "named list" where the name, just like for named maps, must equal the target type's simple name and the type must provide a constructor with an arity of the list's length.
- value maps can be used in place of Map<*, *>. They cannot be used to construct arbitrary objects.
Some of these rules are demo'ed in the above "Person" example.
Sono is strict about types and won't arbitrarily convert between
types. This was a deliberate design choice to 1) place control
in the hands of the target type's author and 2) keep the notation
consistent and predictable.
However, Sono provides one shortcut that's worth exploring. If a
target type is final and has a single-argument constructor, the call
to that type's constructor can be omitted and the argument specified
directly instead. For example, the following is supported:
data class Kg(val amount: Int)
val a = Sono().parse<Kg>("Kg(12)")
val b = Sono().parse<Kg>("12")
assertEquals(a, b)
Kotlin allows convenient declaration of singletons through its
object keyword and allows referring to that single instance by the
class's name. Sono, however, deliberately hides the distinction
between ordinary instances and singletons, requiring what looks like
a constructor call:
sealed interface Baz
data class BazCls(val b: Int) : Baz
data object BazObj : Baz
val a = Sono().parse<Baz>("BazCls(12)")
val b = Sono().parse<Baz>("BazObj()")
assertEquals(BazObj, b)
Sometimes one might want a Sono-encoded object to deviate slightly
in structure from its Kotlin equivalent, but still parse transparently
into the original Kotlin object without modifying it. One
motivation might be to keep compatibility with already persisted
Sono snippets while the original structure evolves, or simply to make
the user-facing DSL nicer while still allowing the application's code
to rely on the original structure.
Sono provides the Into interface for this purpose. When parsing a
value for which there is an Into<TargetType> implementation, that
implementation will additionally be allowed in the Sono script in
place of TargetType.
// ~ the main data structure
interface Exercise
enum class Level { Easy, Normal, Hard }
data class PushUp(val level: Level, val reps: Int, val name: String): Exercise
// ~ let's define two alternatives for `PushUp`
data class WallPushUp(val reps: Int): Into<PushUp> {
    override fun into(): PushUp = PushUp(Level.Easy, reps, "Wall Push-Up")
}
data class OneArmPushUp(val reps: Int): Into<PushUp> {
    override fun into(): PushUp = PushUp(Level.Hard, reps, "One Arm Push-Up")
}
val sono = Sono()
    .withImport(WallPushUp::class)
    .withImport(OneArmPushUp::class)
val easy = sono.parse<Exercise>("WallPushUp(10)")
assertEquals(PushUp(Level.Easy, 10, "Wall Push-Up"), easy)
val hard = sono.parse<Exercise>("OneArmPushUp(5)")
assertEquals(PushUp(Level.Hard, 5, "One Arm Push-Up"), hard)
Note: the two "push-up" alternatives in the above example have no
direct relation to the PushUp or Exercise types; hence, Sono
would not know about them. To make them available to the user in
scripts, we explicitly "import" them.
User-defined types are referred to in Sono scripts by the
corresponding class's simple name. To choose a different name for a
type, you can specify one when making an import:
val sono = Sono()
    .withImport(WallPushUp::class, "SuperEasyPushUp")
The new name must be a valid identifier. The original name, i.e. "WallPushUp" in this example, will then no longer be available to users. To give more than one name to a particular type, we can import it multiple times, each time with a different "alias."
Just like in Kotlin, Sono allows sub-types in places of their
corresponding super types. The particular implementation class a
value will be parsed into, based on its simple name as encountered in
the script, is discovered by the Sono parser as follows:
- Attempts to match the name against the set of "imported" types
- Attempts to match the name against the simple name of the target type class
- If the target type is a sealed class (or interface), attempts to match the name from the script against the simple names of the sealed sub-classes. (Exactly one may match; multiple matches are considered an error.)
At this point, if no class could be determined, an error will
be reported. Not all type hierarchies are "sealed" though, and some
might contain multiple classes with the same "simple name". Sono provides
the extension point .withSubtypes(..) to allow resolving all
sub-types of a given type, typically one originating from somewhere in
the target type being parsed into.
Note that "imports" (as demonstrated above) are a safe and preferred way to avoid any ambiguity and give you, the programmer, precise control.
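The lookup order listed above can be sketched with a small string-based model. TypeInfo and resolve are hypothetical names for illustration only; the real parser works on Kotlin classes, and details such as whether nested sealed hierarchies are searched recursively are not covered here (the sketch only looks at direct sub-types).

```kotlin
// Hypothetical, string-based model of the lookup; not Sono's implementation.
data class TypeInfo(
    val qualifiedName: String,
    val sealedSubtypes: List<TypeInfo> = emptyList()
) {
    val simpleName: String get() = qualifiedName.substringAfterLast('.')
}

fun resolve(name: String, imports: Map<String, TypeInfo>, target: TypeInfo): TypeInfo {
    // 1. explicitly imported types (including aliases) win
    imports[name]?.let { return it }
    // 2. the target type itself
    if (target.simpleName == name) return target
    // 3. sealed sub-types of the target; exactly one may match
    val candidates = target.sealedSubtypes.filter { it.simpleName == name }
    return when (candidates.size) {
        1 -> candidates.single()
        0 -> throw IllegalArgumentException("unknown type: $name")
        else -> throw IllegalArgumentException("ambiguous type: $name")
    }
}

// Model of the sealed Baz hierarchy from the singleton example (package name made up).
val bazType = TypeInfo(
    "demo.Baz",
    listOf(TypeInfo("demo.BazCls"), TypeInfo("demo.BazObj"))
)
```

Because imports are consulted first, an alias registered through an import takes precedence over any name found via the target type, which matches the advice that imports give the programmer precise control.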
Sono provides a .print(..) method to render a given value in
Sono notation. Values printed this way must be parsable back
into an equivalent value.
The default implementation of the printer is rather conservative and
might not necessarily produce what a user would type in. To allow
tweaking the output of the printer, Sono provides the PrintAs
interface, allowing types to choose an alternative presentation when
being printed.
However, using PrintAs can break the "be parsable back"
requirement, and explicit "imports" might then be necessary for
parsing. Therefore, printing the PrintAs alternatives is disabled
by default and has to be enabled using .withAllowedPrintAs(..).
Sono supports comments. These are introduced by a double slash,
i.e. //, and extend to the end of the line. They are fully ignored by
the parser (by the scanner, actually) and cannot be captured for
consumption at runtime.
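As a rough illustration of what "ignored by the scanner" means, the following sketch removes // comments from a source text before any parsing would happen. The function name stripComments is hypothetical, and the sketch assumes that a // inside a string literal is ordinary string content rather than the start of a comment; Sono's scanner may handle comments differently.

```kotlin
/** Removes `//` comments from a source text, leaving string literals untouched. */
fun stripComments(source: String): String {
    val out = StringBuilder()
    var inString = false
    var i = 0
    while (i < source.length) {
        val c = source[i]
        if (inString && c == '\\' && i + 1 < source.length) {
            // keep escape sequences intact so an escaped quote does not end the string
            out.append(c).append(source[i + 1])
            i += 2
        } else if (!inString && source.startsWith("//", i)) {
            // skip to the end of the line; the newline itself is kept
            while (i < source.length && source[i] != '\n') i++
        } else {
            if (c == '"') inString = !inString
            out.append(c)
            i++
        }
    }
    return out.toString()
}
```

Applied to a line such as age: 10, // years, only the part before the double slash remains, while a string value like "a//b" is passed through unchanged.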