Macro-based type providers for Scala (examples)
Scala
Latest commit c60c3b1 Mar 8, 2015 @travisbrown Updated versions

README.markdown

Scala type provider examples

Build status

This repository contains type provider examples that were discussed by Eugene Burmako and Travis Brown at the 2014 Scalar Conference. The talk was not recorded, but our slides are available in this repository.

A type provider is a compile-time metaprogramming component that allows the user to generate types (and implementations) from an external schema or other information source—you can think of them as a more principled solution to problems that would traditionally be solved with textual code generation (or a lot of repetitive boilerplate). For more background information, see the following resources:

Type providers are made possible in Scala by the experimental macro system introduced in 2.10; neither of the approaches outlined here will work on earlier versions.

After some conversations at the Scalar conference, we also began work on a regular expression type provider for Scala. This project is currently a fairly rough proof of concept, but it may help to show the range of possibilities for these kinds of components.

Motivation

We'll take as a running example the construction of RDF graphs using Banana RDF, a Scala RDF library developed by the World Wide Web Consortium.

Banana provides a clear and concise embedded DSL for building RDF graphs. For example, we might describe the second draft notebook of Mary Shelley's Frankenstein as follows:

val frankensteinNotebookB = (
  URI("http://shelleygodwinarchive.org/data/ox/ox-ms_abinger_c57")
    .a(dct.Text)
    -- dc.title ->- "Frankenstein Draft Notebook B"
    -- dc.creator ->- URI("https://en.wikipedia.org/wiki/Mary_Shelley")
)

dct and dc here are org.w3.banana.Prefix objects that represent the Dublin Core Metadata Initiative's types and terms vocabularies.

The following is an example of how you'd define a Prefix class in Banana:

class DCPrefix[Rdf <: RDF](implicit ops: RDFOps[Rdf])
  extends PrefixBuilder("dc", "http://purl.org/dc/terms/")(ops) {
  val title = apply("title")
  val creator = apply("creator")
  // and so on...
}

And then somewhere else in our project we'd write the following:

object dc extends DCPrefix[Rdf]

Which would allow the usage above.

This isn't too bad, but in some cases our vocabularies can be quite large (the Dublin Core terms vocabulary defines 77 properties, for example), which can make manually creating Prefix classes inconvenient and error-prone. Creating these classes manually can be especially frustrating when we have access to a machine-readable description of the vocabulary in the form of an RDF Schema. The terms schema, for example, is published on the web by the DCMI under a Creative Commons license.

Type providers allow us to avoid the boilerplate of translating these schemas into Prefix class definitions manually. This repository includes two macro-based implementations of type providers in Scala, one demonstrating the "public" approach, in which the body of a publicly-visible class is provided by the macro, and the other demonstrating the "anonymous" approach, where the macro defines and instantiates an anonymous class that is visible to the rest of the program as a structural type.

Anonymous type providers

The anonymous approach is supported by def macros, which means that it can be used in Scala 2.10 without additional compiler plugins (although note that the example implementation provided here uses quasiquotes and therefore does require the Macro Paradise plugin).

The syntax looks quite natural—we simply call a method with some arguments (in this case a single argument—the path to the schema resource).

val dct = PrefixGenerator.fromSchema[Rdf]("/dctype.rdf")

val dc = PrefixGenerator.fromSchema[Rdf]("/dcterms.rdf")

The macro does impose some additional constraints on the user of the type provider, however. In particular, the path argument must be a string literal, and it must point to a valid RDF Schema resource on the build classpath. If either of these constraints is not satisfied, the type provider will fail with a compile-time error.

The inferred types of dct and dc in this example are structural types that allow the usage demonstrated in the definition of frankensteinNotebookB above. More specifically, the generated code for the second line will look something like the following:

val dc = {
  class Prefix2 extends PrefixBuilder("dc", "http://purl.org/dc/terms/") {
    val title = apply("title")
    val creator = apply("creator")
    // and so on...
  }
  new Prefix2 {}
}

I.e., we're defining a class inside a block and then instantiating an anonymous subclass of that class (see this Stack Overflow answer and this earlier question for some discussion of why this two-step process is necessary). We can't see the class itself outside of the block, but we can see its methods on the structural type that will be inferred for dc.

Note that the type provider has also inferred the schema URI from the RDF Schema file (the second argument to the PrefixBuilder constructor) and has picked a reasonable short name for the Prefix (the first argument).

Please see the comments in the implementation for more detail about how exactly this approach works.

Public type providers

The public approach uses macro annotations, which allow us to expand the body of an annotated object definition.

@fromSchema("/dctype.rdf") object dct extends PrefixBuilder[Rdf]

@fromSchema("/dcterms.rdf") object dc extends PrefixBuilder[Rdf]

These definitions support the same usage as the anonymous examples above, but dct and dc are full-fledged objects, not instances of structural types. The generated code looks fairly similar:

object dc extends PrefixBuilder[Rdf]("dc", "http://purl.org/dc/terms/") {
  val title = apply("title")
  val creator = apply("creator")
  // and so on...
}

The implementation is also pretty similar to the anonymous type provider implementation.

Vampire methods

One of the disadvantages of using structural types in Scala is that they involve reflective access, which means you have to deal with warnings (and a hit to performance). For example, when you compile the example project you'll see the following:

[warn] ...reflective access of structural type member value title should be enabled
[warn] by making the implicit value scala.language.reflectiveCalls visible.
[warn]       -- dc.title ->- "Frankenstein Draft Notebook B"
[warn]             ^

It's possible, however, to use "vampire methods" to avoid this penalty. Vampire methods are macro methods on the anonymous class that read their values from some location at compile time (in this case we're using a static annotation on the method; if this sounds confusing, that's because it is).

(Note that in Scala 2.10.4 the use of vampire methods will still result in a reflective access warning, but this has been fixed in 2.10.5.)

While we provide an implementation of our example using vampire methods here, in general it's probably better to avoid the added complexity, unless you know for a fact that the performance of calls to methods on the structural type is a problem in your application.

Licenses

Portions of this software may use RDF schemas copyright © 2011 DCMI, the Dublin Core Metadata Initiative. These are licensed under the Creative Commons 3.0 Attribution license.

All other code is released under the Apache License, Version 2.0.