
Next: Tracking Effects


Introduction

In this tutorial we’ll see, step by step, how to define a DSL and its concrete syntax, how to interpret it, how to build an intermediate representation on top of which domain-specific optimizations can be performed, and finally how to generate efficient code for a given target.

Project Setup

First, check that LMS is published locally on your system (see the README for instructions on how to do this).

Create an empty sbt project using Scala 2.10 and depending on LMS. Your build.sbt file should look like the following:

name := "lms-tutorial"

scalaOrganization := "org.scala-lang.virtualized"

scalaVersion := "2.10.1"

libraryDependencies += "EPFL" %% "lms" % "0.3-SNAPSHOT"

scalacOptions += "-Yvirtualize"

Create an empty file LinearAlgebra.scala in the src/main/scala/ subdirectory of your project.

DSL concepts and concrete syntax

The first step is to define the DSL concepts and their concrete syntax. In our case, we’ll define a DSL for linear algebra, so we want to be able to express vectors and arithmetic operations on them. For brevity, we will restrict this tutorial to only a few concepts. For instance, we want to be able to express scaling a vector with an expression like the following:

val u = v * 12.34

where v is a vector.

Thus we have a concept of vector and a scale operation. Let’s write these concepts in terms of functions and types:

import scala.virtualization.lms.common._

trait LinearAlgebra extends Base {
  // We use an abstract type to represent the vector data type
  type Vector
  // Scaling a vector `v` by a factor `k` gives another vector
  def vector_scale(v: Rep[Vector], k: Rep[Double]): Rep[Vector]
}

The Base trait comes from LMS and defines two main things:

  • The Rep[T] abstract type member, used to distinguish between values and representations of values;
  • The unit[T](x: T): Rep[T] abstract method, used to lift a value x into a representation of it.
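
A minimal sketch of these two members (simplified; the actual LMS trait defines more machinery):

trait Base {
  // Representation of a staged value of type T
  type Rep[+T]
  // Lift a regular value into a representation of it
  protected def unit[T: Manifest](x: T): Rep[T]
}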

Then, since we want DSL users to have a convenient syntax, we define an infix operator * taking a vector and a scalar and returning a vector. Here is what our user-facing DSL trait looks like:

import scala.virtualization.lms.common._

trait LinearAlgebra extends Base {
  // Concepts
  type Vector
  def vector_scale(v: Rep[Vector], k: Rep[Double]): Rep[Vector]

  // Concrete syntax
  implicit class VectorOps(v: Rep[Vector]) {
    def * (k: Rep[Double]) = vector_scale(v, k)
  }
}

Now that we have a DSL we can write programs such as:

trait Prog extends LinearAlgebra {
  def f(v: Rep[Vector]): Rep[Vector] = v * unit(12.34)
}

The call unit(12.34) lifts the Double into a Rep[Double], as required by our * operator. We could also define an implicit conversion that lifts any Double to a Rep[Double], when required, by calling unit, allowing us to write just v * 12.34.
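
Such a conversion could look like the following (a sketch; the trait and method names here are chosen for illustration, and LMS also ships lifting traits that serve the same purpose):

trait LinearAlgebraLift extends LinearAlgebra {
  // Implicitly lift a plain Double into its staged representation,
  // so that `v * 12.34` compiles without an explicit call to `unit`
  implicit def liftDouble(x: Double): Rep[Double] = unit(x)
}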

The Prog trait is still abstract since its vector_scale operation has no implementation. In the end we’ll build an intermediate representation of the concept of scaling a vector, generate code from the intermediate representation graph of expressions, and this code will finally be runnable.

Shallow Interpreter

For prototyping purposes it can be useful to have a shallow interpreter for a DSL, bypassing the intermediate representation and simplifying the build chain.

To that end, we can write an Interpreter trait specifying that the intermediate representation of a type is the type itself:

trait Interpreter extends Base {
  override type Rep[+A] = A
  override protected def unit[A : Manifest](a: A) = a
}

And we can write an interpreter for our linear algebra DSL:

trait LinearAlgebraInterpreter extends LinearAlgebra with Interpreter {
  override type Vector = Seq[Double]
  override def vector_scale(v: Seq[Double], k: Double) = v map (_ * k)
}

Now we are able to run the program written above:

val prog = new Prog with LinearAlgebraInterpreter
println(prog.f(Seq(1.0, 2.0))) // prints “Seq(12.34, 24.68)”

Note that this shallow interpreter can’t benefit from the optimizations performed at the deep embedding level, which we’ll see in the next section.

Intermediate Representation

Building an intermediate representation (IR) of a program allows us to generate code for several (possibly different) targets and to perform domain-specific optimizations at the IR level.

Basically, at the IR level we reify each DSL concept as an abstract definition:

trait LinearAlgebraExp extends LinearAlgebra with BaseExp {

  // Reification of the concept of scaling a vector `v` by a factor `k`
  case class VectorScale(v: Exp[Vector], k: Exp[Double]) extends Def[Vector]

  override def vector_scale(v: Exp[Vector], k: Exp[Double]) = toAtom(VectorScale(v, k))

  // Here we say how a Rep[Vector] will be bound to a Seq[Double] in regular Scala code
  override type Vector = Seq[Double]
}

We use the BaseExp trait, which defines intermediate representations in terms of constants and symbols pointing to composite definitions:

(Figure: expressions class diagram)

It aliases the type Rep[T] to Exp[T], the base class defining the abstract concept of an expression. Such an expression may be a constant (Const[T]), e.g. Const(42) or Const("foo"), or a symbol (Sym[T]), namely an identifier referencing some composite definition (a Def[T]). The framework internally registers each program statement (Stm) and lets us navigate from a symbol (Sym[T]) to its definition (Def[T]) and vice versa.
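
In simplified form, these types look roughly like the following (a sketch; the actual definitions in LMS carry more information):

// Simplified shapes of the expression IR (sketch only)
abstract class Exp[+T]                        // any expression
case class Const[+T](x: T) extends Exp[T]     // a constant value
case class Sym[+T](id: Int) extends Exp[T]    // a symbol referencing a definition
abstract class Def[+T]                        // a composite definition (e.g. VectorScale)
case class TP[T](sym: Sym[T], rhs: Def[T])    // a statement (Stm) binding a symbol to its definition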

Finally, we call the toAtom function to associate a symbol with the value produced by VectorScale. Indeed, the composite node VectorScale is not an Exp[Vector] itself, so we ask the framework to create a fresh symbol and to associate it with our node. This separation between Exp and Def enables common subexpression elimination: if toAtom is called with a value equal to a previously registered definition, the framework returns the symbol created for that previous value instead of creating a new one.
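
For example, staging the same scaling operation twice should yield the same symbol (a sketch, reusing the LinearAlgebraExp trait defined above; the trait name CseExample is chosen for illustration):

trait CseExample extends LinearAlgebraExp {
  def g(v: Rep[Vector]): Rep[Vector] = {
    val a = vector_scale(v, unit(2.0))
    val b = vector_scale(v, unit(2.0))
    // `a` and `b` reference the same Sym: the second VectorScale(v, Const(2.0))
    // matches an already registered definition, so no new symbol is created
    b
  }
}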

Now if we write and run the following code:

new Prog with LinearAlgebraExp {
  println(f(unit(Seq(1.0))))
}

We see Sym(0) printed in the console. f is a regular Scala function working at the IR level; here it returned the symbol created to reference the VectorScale definition. The following objects diagram may help to visualize what happened:

(Figure: expression objects diagram)

Everything starts with the v constant value we give to the f function. The scaling operation in the function body creates a value k, also a constant, and the definition of the scaling operation: the vs value. Then, the call to toAtom creates a fresh symbol, sym, and a statement, stm, linking the symbol to the vs definition. Finally, the sym value is the intermediate representation of the scaling operation: a symbol referencing the operation’s definition.
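
In terms of values, the picture corresponds roughly to the following (a sketch of the objects involved):

// Values created while staging f(unit(Seq(1.0))), roughly:
// v   : Exp[Vector] = Const(Seq(1.0))     -- the lifted argument
// k   : Exp[Double] = Const(12.34)        -- the lifted scaling factor
// vs  : Def[Vector] = VectorScale(v, k)   -- the composite definition
// sym : Exp[Vector] = Sym(0)              -- fresh symbol returned by toAtom(vs)
// stm               = TP(sym, vs)         -- statement linking the symbol to its definition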

So the Exp intermediate representation helps us build a graph of constants and symbols referencing composite nodes, which are in turn linked to other constants or symbols.

At this point, to evaluate a staged program we could write another interpreter (a “deep” interpreter) or generate executable code from the IR graph. In this tutorial we choose the second solution.

Generating Code

To generate code we need to traverse the IR graph of expressions and generate the code corresponding to each statement, watching carefully for dependencies between them. Fortunately, LMS already provides the traversal code, so the only remaining task is to define the code corresponding to each statement. LMS allows us to do this in a modular fashion, so in our case we only need to say how to generate code for linear algebra statements:

trait ScalaGenLinearAlgebra extends ScalaGenBase {
  // This code generator works with IR nodes defined by the LinearAlgebraExp trait
  val IR: LinearAlgebraExp
  import IR._

  override def emitNode(sym: Sym[Any], node: Def[Any]): Unit = node match {
    case VectorScale(v, k) => {
      emitValDef(sym, quote(v) + ".map(x => x * " + quote(k) + ")")
    }
    case _ => super.emitNode(sym, node)
  }
}

The quote function returns a literal representation of an expression in the target language (here Scala, since we extend ScalaGenBase).
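
For instance, with the Scala code generator, constants are printed as literals and symbols as generated variable names (a sketch):

// quote(Const(12.34)) yields "12.34"
// quote(Sym(0))       yields "x0"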

Our code generator can be used as follows:

val prog = new Prog with LinearAlgebraExp with EffectExp
val codegen = new ScalaGenEffect with ScalaGenLinearAlgebra { val IR: prog.type = prog }
codegen.emitSource(prog.f, "F", new java.io.PrintWriter(System.out))

Running the above code produces the following output:

class F extends ((scala.collection.Seq[Double])=>(scala.collection.Seq[Double])) {
  def apply(x0:scala.collection.Seq[Double]): scala.collection.Seq[Double] = {
    val x1 = x0.map(x => x * 12.34)
    x1
  }
}

This generated code can then be compiled and used as follows:

val f = new F
println(f(Seq(1.0, 2.0))) // prints “Seq(12.34, 24.68)”

Applying Domain Specific Optimizations

The IR level allows us to transform expressions in order to apply domain-specific optimizations. For instance, in our case we could handle scaling a vector by a factor of 1 by simply returning the vector itself instead of producing a VectorScale node:

trait LinearAlgebraExpOpt extends LinearAlgebraExp {
  override def vector_scale(v: Exp[Vector], k: Exp[Double]) = k match {
    case Const(1.0) => v
    case _ => super.vector_scale(v, k)
  }
}

This optimization is defined in its own trait and can be mixed into a program as follows:

val prog = new Prog with LinearAlgebraExpOpt
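
To see the optimization in action, a staged function that scales by 1.0 should now generate code that simply returns its argument (a minimal sketch, reusing the codegen setup shown above; the names Prog1, g, prog1 and codegen1 are chosen for illustration):

trait Prog1 extends LinearAlgebra {
  // With LinearAlgebraExpOpt mixed in, this stages to the argument itself
  def g(v: Rep[Vector]): Rep[Vector] = v * unit(1.0)
}

val prog1 = new Prog1 with LinearAlgebraExpOpt with EffectExp
val codegen1 = new ScalaGenEffect with ScalaGenLinearAlgebra { val IR: prog1.type = prog1 }
// The emitted `apply` body should just return x0, with no call to `map`
codegen1.emitSource(prog1.g, "G", new java.io.PrintWriter(System.out))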

Generating and Compiling in One Step

It may be more convenient to generate, compile, and load the code produced by a staged program in one step. LMS provides a tool to do this through the CompileScala trait:

val prog = new Prog with LinearAlgebraExpOpt with EffectExp with CompileScala { self =>
  override val codegen = new ScalaGenEffect with ScalaGenLinearAlgebra { val IR: self.type = self }
}
val f = prog.compile(prog.f)
println(f(Seq(1.0, 2.0))) // prints “Seq(12.34, 24.68)”

Worth reading

  • The Scala-virtualized wiki and the Delite tutorial
  • Polymorphic Embedding of DSLs
  • Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs
  • Tool Demo: Scala-Virtualized
