xitep/sono
Sono – A "Simple Object NOtation" language for Kotlin

Kotlin makes it fairly easy to define and describe complex object graphs. A contrived and simplified example:

data class Hobby(val name: String)
data class Person(
    val name: String,
    val age: Int,
    val hobby: Hobby,
    val friends: List<Person> = listOf())

// use the data definition to define a particular object
Person("pete", 10, Hobby("reading"),
    listOf(
        Person("tom", 33, Hobby("writing")),
        Person("mike", 9, Hobby("transpiling"))))

This ease of definition may inspire the programmer to create DSLs, for example to describe statically maintained lookup data in the source code or data for testing purposes. When such data has to be read or written dynamically though, e.g. read from a file or sent over the network, the programmer quickly loses the nice syntax of her "object DSL." High-quality JSON (de)serializers are often used at this point to get a somewhat human-readable form of the data for transmission, but this only gets our programmer so far. The JSON data will expose many implementation details that the object's creator never meant to be part of the DSL in the first place, e.g. the package names of the classes involved.

Sono provides a parser (and printer), inspired by JSON's and Kotlin's syntax, to load (and write) objects of statically defined shape. The following are the main drivers behind Sono:

  • The serialized form of the objects, i.e. the notation, is to be consumed and produced primarily by humans, using tooling that is good at supporting textual presentation.
  • Type safety: loading Sono-serialized data shall result in explicit errors if it cannot be mapped to predefined structures.
  • Single source of truth: the "predefined structures" mentioned above must be conventional constructs in Kotlin, e.g. class, object, interface definitions. The (Kotlin) programmer is supposed to keep her DSL defined as is, and merely leverage Sono as a convenient, user-facing notation to describe objects.
  • Hide implementation details: the notation in Sono shall abstract away implementation details of an object's underlying structure, such that a certain degree of changes to those structures is possible over time without breakage.

As an example of the notation, "pete" from the above Kotlin code could be described in Sono as follows:

Person {
    name: "pete",
    age: 10,
    hobby: "reading",
    friends: [
        Person("tom", 33, Hobby("writing")),
        Person("mike", 9, Hobby("transpiling"), [])
    ]
}

(The precise ways available to describe a Person vary with the exact definition of the class / object. A discussion of the possibilities follows further down.) Loading a string with such content into an object at runtime looks like this:

val s = "..."
val p = try {
    Sono().parse<abc.def.Person>(s)
} catch (ex: ParseException) {
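    // report every individual parse error on its own line
    for (e in ex.errors) {
        println(e)
    }
    // keep the original exception as the cause
    throw IllegalStateException("could not load person: $ex", ex)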
}

Status / TODOs

Right now, Sono is a proof of concept with a mostly stable public API. Future work will focus on:

  • Language feature: Support varargs parameters in call-syntax
  • Thread safety: making a shared Sono instance safe for use from multiple threads
  • Performance: this aspect has not yet received much effort and the current implementation is suboptimal in places
  • Error messages: more work needs to be done to provide good user-facing error messages
  • Documentation: formal syntax definition and a language reference with many explanatory examples
  • Testing: a rigorous test suite to safeguard future changes

Minutiae

Parsing a Sono document always targets a programmer-provided type. That is, parsing either resolves the document into an object of the specified type or fails.

Sono understands the following basic types:

  • string: multiline capable, enclosed in double quotes, e.g. "...", with the escape character \ supporting the sequences \\, \", \b, \f, \n, \r, and \t. An escape character immediately followed by a literal newline discards that newline, allowing multiline input to be parsed into a single-line string.
  • int: always decimal; at least one digit (0..9), optionally preceded by a minus sign to denote a negative number
  • float: always decimal; at least one digit (0..9) followed by a dot (.) followed by at least one more digit
  • boolean: the literals true and false
  • null: the one billion dollar mistake ;-)
  • list: a list of zero or more other items enclosed in brackets, ie. a pair of [ and ], with each item separated by a comma (,); if at least one item is specified, a trailing comma is allowed and silently discarded
  • keyword map: zero or more key / value pairs enclosed in braces, i.e. a pair of { and }. Keys and values are separated by a colon (:). Keys are unqualified identifiers (matching the regex [a-zA-Z_][a-zA-Z0-9_]*). Values are, well, just that, i.e. native types or objects. ("Keyword maps" are used to construct objects. See below.)
  • value map: zero or more key / value pairs enclosed in braces, i.e. a pair of { and }. Keys and values are separated by a colon (:). Both keys and values are native types or objects, with the assumption that the key type properly implements the equals/hashCode contract. (Unlike "keyword maps", "value maps" cannot be used to construct arbitrary objects; they are solely intended to construct mappings from keys to values.)
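The lexical rules in the list above can be restated as a small, self-contained Kotlin sketch. This is not Sono's scanner: the names classify and unescape are hypothetical, and accepting a leading minus on floats is an assumption (the list only mentions it for ints).

```kotlin
// Illustrative only: these patterns restate the prose rules, they are not Sono's scanner.
val intLiteral = Regex("-?[0-9]+")
val floatLiteral = Regex("-?[0-9]+\\.[0-9]+") // assumption: a leading minus is allowed here too
val identifier = Regex("[a-zA-Z_][a-zA-Z0-9_]*")

// Classifies a single token by the rules for the scalar types.
fun classify(token: String): String = when {
    token == "true" || token == "false" -> "boolean"
    token == "null" -> "null"
    intLiteral.matches(token) -> "int"
    floatLiteral.matches(token) -> "float"
    identifier.matches(token) -> "identifier"
    else -> "unknown"
}

// Resolves the escape sequences of a string body (the text between the double quotes).
fun unescape(body: String): String {
    val out = StringBuilder()
    var i = 0
    while (i < body.length) {
        val c = body[i]
        if (c != '\\') { out.append(c); i++; continue }
        require(i + 1 < body.length) { "dangling escape character" }
        when (val next = body[i + 1]) {
            '\\' -> out.append('\\')
            '"' -> out.append('"')
            'b' -> out.append('\b')
            'f' -> out.append('\u000C') // form feed; Kotlin has no \f escape
            'n' -> out.append('\n')
            'r' -> out.append('\r')
            't' -> out.append('\t')
            '\n' -> {} // escaped literal newline: discard it
            else -> throw IllegalArgumentException("unsupported escape: \\$next")
        }
        i += 2
    }
    return out.toString()
}
```

With these definitions, a body consisting of a, a backslash, a literal newline and b unescapes to the single-line string "ab".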

Composed values, ie. instances of classes, are written in Sono using:

  • Named keyword maps, written as Name { param1: value1, param2: value2 } with Name referring to the "simple" class name of the value's type, instantiate the named class using the constructor whose parameter names match the key set of the provided map.
  • Call syntax, written as Name(value1, value2) with Name referring to the "simple" class name of the value's type, instantiates the named class using a constructor that accepts the specified number of arguments. (Note: since the call syntax cannot refer to parameters by name but addresses them solely by position, optional parameters are supported only if they follow the last required one.)
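One way to read the two constructor-matching rules above is sketched below. The Param model and both function names are hypothetical; Sono works on the real Kotlin class, and rejecting unknown keys is an assumption of this sketch.

```kotlin
// Hypothetical model of one constructor parameter.
data class Param(val name: String, val optional: Boolean = false)

// Keyword map: every key must name a parameter, and every required parameter must be present.
fun acceptsKeywords(params: List<Param>, keys: Set<String>): Boolean {
    val names = params.map { it.name }.toSet()
    val required = params.filterNot { it.optional }.map { it.name }
    return names.containsAll(keys) && keys.containsAll(required)
}

// Call syntax: arguments fill parameters left to right, so only optional
// parameters after the last required one can be left out.
fun acceptsPositional(params: List<Param>, argCount: Int): Boolean {
    val minArgs = params.indexOfLast { !it.optional } + 1
    return argCount in minArgs..params.size
}

// The Person constructor from the introduction: friends has a default value.
val person = listOf(Param("name"), Param("age"), Param("hobby"), Param("friends", optional = true))
```

Under this reading, Person accepts three or four positional arguments, which matches the two friends in the introductory Sono example.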

As already mentioned, when parsing a Sono source, the programmer specifies a particular type which the source has to represent. Hence, it is the specified type that ultimately defines the validity of the source. Essentially, the parser derives from the target type how the source string has to be structured and maps the input tokens onto that type. The following rules apply when mapping a Sono value to a Kotlin type:

  • strings can be specified in place of a type implementing CharSequence or a type implementing Enum, in which case the string must match one of the enumeration's values.
  • ints can be used in place of Kotlin's integer types, i.e. Byte, Short, Int, Long, and their equivalent "unsigned variants", as well as Float or Double. Additionally, BigInteger and BigDecimal are supported as targets for Sono ints.
  • floats can be used in places where Floats, Doubles, and BigDecimals are required.
  • booleans correspond to Kotlin's Boolean.
  • nulls can be placed in positions allowing nullable types.
  • lists can be used in places expecting a List or Set.
  • "Unnamed" keyword maps (i.e. keyword maps without a preceding identifier) can be used in place of Map<String, *>, because the keys of a keyword map are identifiers.
  • "Named" keyword maps (i.e. keyword maps with a preceding identifier) must target a type whose simple name equals the map's name and which has a constructor whose required parameters are fully covered by the map's keys. Additionally, if the target type is a final class and there is no ambiguity, the map's name can be omitted, i.e. an "unnamed" keyword map can be used directly.
  • The "call syntax" above can be thought of as a "named list" where the name, just like for named maps, must equal the target type's simple name and the type provides a constructor with an arity of the list's length.
  • value maps can be used in places of Map<*, *>. They cannot be used to construct arbitrary objects.
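The scalar rules from this list can be written down as a lookup, shown here as a rough sketch rather than Sono's implementation. The Literal enum, accepts and enumFromString are hypothetical names, and the exact, case-sensitive match on enum names is an assumption.

```kotlin
import java.math.BigDecimal
import java.math.BigInteger
import kotlin.reflect.KClass

// Hypothetical tags for the scalar literals of the notation.
enum class Literal { STRING, INT, FLOAT, BOOLEAN }

// Target types listed for Sono ints and floats.
val intTargets: Set<KClass<*>> = setOf(
    Byte::class, Short::class, Int::class, Long::class,
    UByte::class, UShort::class, UInt::class, ULong::class,
    Float::class, Double::class, BigInteger::class, BigDecimal::class)
val floatTargets: Set<KClass<*>> = setOf(Float::class, Double::class, BigDecimal::class)

// Whether a literal of the given kind may appear where `target` is expected.
fun accepts(literal: Literal, target: KClass<*>): Boolean = when (literal) {
    Literal.STRING -> CharSequence::class.java.isAssignableFrom(target.java) ||
        Enum::class.java.isAssignableFrom(target.java)
    Literal.INT -> target in intTargets
    Literal.FLOAT -> target in floatTargets
    Literal.BOOLEAN -> target == Boolean::class
}

// A string targeting an enum must also name one of its values
// (assumption: exact, case-sensitive match).
inline fun <reified E : Enum<E>> enumFromString(s: String): E? =
    enumValues<E>().firstOrNull { it.name == s }

enum class Level { Easy, Normal, Hard }
```

Note the asymmetry this encodes: an int literal may target Double, but a float literal may not target Int.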

Some of these rules are demo'ed in the above "Person" example.

Auto-coercion

Sono is strict about types and won't arbitrarily convert between types. This was a deliberate design choice to 1) place the control into the hands of the target type's author and 2) to keep the notation consistent and predictable.

However, Sono provides one shortcut that's worth exploring. If a target type is final and has a single-argument constructor, the call to that type's constructor can be omitted and the argument specified directly instead. For example, the following is supported:

data class Kg(val amount: Int)

val a = Sono().parse<Kg>("Kg(12)")
val b = Sono().parse<Kg>("12")
assertEquals(a, b)

Singletons / Objects

Kotlin allows convenient declaration of singletons through its object keyword and allows referring to that single instance by the class's name. Sono, however, deliberately hides the distinction between ordinary instances and singletons, requiring what looks like a constructor call:

sealed interface Baz
data class BazCls(val b: Int) : Baz
data object BazObj : Baz

val a = Sono().parse<Baz>("BazCls(12)")
val b = Sono().parse<Baz>("BazObj()")
assertEquals(BazObj, b)

Into / Alternatives

Sometimes one might want a Sono-encoded object to deviate slightly in structure from its Kotlin equivalent, but still have it parsed transparently into the original Kotlin object without modifying that object's definition. One motivation might be to keep compatibility with already persisted Sono snippets while the original structure evolves, or simply to make the user-facing DSL nicer while still allowing the application's code to rely on the original structure.

Sono provides the Into interface for this purpose. If there is an Into<TargetType> implementation when parsing a value, that implementation will additionally be allowed in the Sono script in place of TargetType.

// ~ the main data structure
interface Exercise
enum class Level { Easy, Normal, Hard }
data class PushUp(val level: Level, val reps: Int, val name: String): Exercise

// ~ let's define two alternatives for `PushUp`
data class WallPushUp(val reps: Int): Into<PushUp> {
    override fun into(): PushUp = PushUp(Level.Easy, reps, "Wall Push-Up")
}
data class OneArmPushUp(val reps: Int): Into<PushUp> {
    // pass the parsed number of reps through instead of a fixed value
    override fun into(): PushUp = PushUp(Level.Hard, reps, "One Arm Push-Up")
}

val sono = Sono()
    .withImport(WallPushUp::class)
    .withImport(OneArmPushUp::class)

val easy = sono.parse<Exercise>("WallPushUp(10)")
assertEquals(PushUp(Level.Easy, 10, "Wall Push-Up"), easy)

val hard = sono.parse<Exercise>("OneArmPushUp(5)")
assertEquals(PushUp(Level.Hard, 5, "One Arm Push-Up"), hard)

Note: the two "push-up" alternatives in the above example have no direct relation to the PushUp or Exercise class, hence, Sono would not know about them. To make them available to the user in the scripts, we explicitly "import" them.

Imports

User defined types are referred to in Sono scripts by the corresponding class' simple name. To choose a different name for a type, you can specify one when making an import:

val sono = Sono()
    .withImport(WallPushUp::class, "SuperEasyPushUp")

The new name must be a valid identifier. The original name, i.e. "WallPushUp" in this example, will then no longer be available to users. To give more than one name to a particular type, we can import it multiple times, each time with a different "alias."

Sub-Classes / Sub-Types

Just like in Kotlin, Sono allows sub-types in places of their corresponding super types. The particular implementation class a value will be parsed into, based on its simple name as encountered in the script, is discovered by the Sono parser as follows:

  1. Attempts to match the name against the set of "imported" types
  2. Attempts to match the name against the simple name of the target type class
  3. If the target type is a sealed class (or interface), attempts to match the name from the script against the simple names of the sealed sub-classes. (Only exactly one may be matching; multiple possibilities are considered an error.)

If no class can be determined at this point, an error is reported. Not all type hierarchies are "sealed" though (and some might contain multiple classes with the same "simple name"). Sono provides the extension point .withSubtypes(..) to allow resolving all sub-types of a given type, typically one originating from somewhere in the target type being parsed into.

Note: "imports" (as demonstrated above) are a safe and preferred way to avoid any ambiguity and give you, the programmer, precise control.
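The lookup order described in this section can be sketched in plain Kotlin as follows. TypeInfo, resolve and the demo.* qualified names are hypothetical stand-ins for illustration; the sketch models classes as data and does not cover .withSubtypes(..).

```kotlin
// Hypothetical description of a class as the resolver sees it.
data class TypeInfo(
    val simpleName: String,
    val qualifiedName: String,
    val sealedSubtypes: List<TypeInfo> = emptyList())

// Resolves a name from the script, trying imports, then the target type, then sealed sub-types.
fun resolve(name: String, imports: Map<String, TypeInfo>, target: TypeInfo): TypeInfo {
    // 1. explicitly imported types (keyed by simple name or alias) win
    imports[name]?.let { return it }
    // 2. the target type itself
    if (target.simpleName == name) return target
    // 3. sealed sub-types of the target; exactly one may match
    val matches = target.sealedSubtypes.filter { it.simpleName == name }
    return when (matches.size) {
        1 -> matches.single()
        0 -> error("unknown type: $name")
        else -> error("ambiguous type: $name")
    }
}

val bazCls = TypeInfo("BazCls", "demo.BazCls")
val bazObj = TypeInfo("BazObj", "demo.BazObj")
val baz = TypeInfo("Baz", "demo.Baz", listOf(bazCls, bazObj))
```

Because imports are consulted first, an alias registered there resolves regardless of the target type, which is why imports give the most precise control.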

PrintAs / Printing

Sono provides a .print(..) method to render a given value in Sono notation. Values printed this way must be parsable back into an equivalent value.

The default implementation of the printer is rather conservative and might not necessarily produce what a user would type in. To allow tweaking the output of the printer, Sono provides the PrintAs interface, allowing types to choose an alternative presentation when being printed.

However, using PrintAs makes it possible to break the "parsable back" requirement, and explicit "imports" might then be necessary for parsing. Therefore, printing the PrintAs alternatives is disabled by default and has to be enabled using .withAllowedPrintAs(..).

Comments

Sono supports comments. These are introduced by a double slash, ie. // and extend to the end of line. They are fully ignored by the parser (the scanner actually) and cannot be captured for consumption at runtime.
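To illustrate what dropping comments at the scanner level could look like, here is a small sketch. It is not Sono's scanner, stripLineComments is a hypothetical name, and leaving a // inside a string literal untouched is an assumption.

```kotlin
// Removes // comments up to (but not including) the end of the line,
// leaving string literals untouched.
fun stripLineComments(source: String): String {
    val out = StringBuilder()
    var inString = false
    var i = 0
    while (i < source.length) {
        val c = source[i]
        when {
            // inside a string, copy an escape sequence as a unit so \" does not end the string
            inString && c == '\\' && i + 1 < source.length -> {
                out.append(c).append(source[i + 1])
                i += 2
            }
            c == '"' -> {
                inString = !inString
                out.append(c)
                i++
            }
            // comment start outside of a string: skip to the end of the line
            !inString && c == '/' && i + 1 < source.length && source[i + 1] == '/' -> {
                while (i < source.length && source[i] != '\n') i++
            }
            else -> {
                out.append(c)
                i++
            }
        }
    }
    return out.toString()
}
```

The newline that ends a comment is kept, so line numbers in later error messages would stay the same.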
