Skip to content

Egg #3: Structured String Literals

Ramkumar Ramachandra edited this page Aug 12, 2017 · 6 revisions

Motivation

In many programming languages, there is often a dilemma when working with structured data types such as XML, JSON or S-Expressions. Usually, the programmer could either use the native syntax of the particular language in the form of strings, lacking the expected type safety; or the programmer could use a type-safe abstract representation, loosing much of the readability. Structured string literals seek to solve this problem by providing a type-safe compile-time of parsing structured data, while still providing the usual safety guarantees one could expect in a staticly-typed language.

Similar Technologies

Description

Adding Support for Structured String Literals

The basic idea for creating a structured string literal parser, should be to implement a type class akin to the following:

class Parse t where
  parse : String -> Result (Description t)

where the relevant data types are:

data Result a = Success a | Error String

data Description : Type -> Type where
  Escape : String                                -> Description t
  Ap     : Description (a -> t) -> Description a -> Description t
  Pure   : t                                     -> Description t

Essentially the Description data type is a defunctionalisation of applicative functors/idiom brackets, but with the additional property that some parts of the expression might be escaped Idris expressions of the relevant data type.

Using Structured String Literals

Since structured string literal parsers are usually defined in modules, using structured string literals is as simple as importing the relevant module.

Syntactically, structured data literals would start with a sigil @ followed by an optional Idris identifier and end with a string literal. The structured string literal parser author is free to choose whatever representation for escaped expressions the author desires (although a recommendation should be made to ensure consistency). Examples of literals:

xmlPerson : String -> Nat -> Xml
xmlPerson name age = @"""<person>
  <name>$name</name>
  <age>$age</age>
</person>
"""

prettiedPerson : String -> Nat -> String
prettiedPerson name age = @format"name: %s=name\t %d=age"

Semantically, structured string literal resolution would for a given target type t find the relevant Parse instance, parse the content and then try to reify the structure to t. If an identifier was provided, then the relevant instance would be the named instance that has the identifier as name, e.g. the format structured string literal would correspond to instance [format] Parse String where ....

Example: JSON

JSON is a popular language for representing structured data. An example of a JSON document containing a persons name, age and a list of hobbies would look something like the following:

{
   "name" : "John Doe",
   "age" : 42,
   "hobbies" : [ { "type" : "Capoeira", "club" : "São Paulo Hobbyist Capoeira Club"},
                 { "type" : "Go", "club" : "International Go Club - Los Angeles" } ]
}

However this notation is not available in Idris, and therefore the corresponding structure must be represented as abstract syntax internally. A translation of the above document to abstracts could look something like the following:

JsonObject([
  ("name", JsonString "John Doe"),
  ("age",  JsonInt 42),
  ("hobbies", 
             JsonArray [ JsonObject [("type", JsonString "Capoeira"), ("club", JsonString "São Paulo Hobbyist Capoeira Club")],
                         JsonObject [("type", JsonString "Go"), ("club", JsonString "International Go Club - Los Angeles") ]])
])

While the abstract representation provides a good approximation of the concrete syntax, its readability is somewhat reduced due to verbosity. Furthermore, the programmer is required to map between the different representations which might introduce some unforeseen bugs.

With structured string literals, the programmer can have the readability of the concrete syntax with the type safety of the abstract syntax. For example:

jdDoc : Json
jdDoc = @"""
{
   "name" : "John Doe",
   "age" : 42,
   "hobbies" : [ { "type" : "Capoeira", "club" : "São Paulo Hobbyist Capoeira Club"},
                 { "type" : "Go", "club" : "International Go Club - Los Angeles" } ]
}
"""

In this case the compiler (with support for structured string literals) would automatically parse the literal to JSON and check that it is well structured.

Resolution of Literals

helloWorld = """Hello
  World"""

now works.

Clone this wiki locally