# Introduction

In this tutorial we’ll see, step by step, how to define a DSL and its concrete syntax, how to interpret it, how to build an intermediate representation on top of which domain-specific optimizations can be performed, and finally how to generate efficient code for a given target.

# Project Setup

First, check that LMS is published locally on your system (see the README for instructions on how to do this).

Create an empty sbt project using Scala 2.10 and depending on LMS. Your `build.sbt` file should look like the following:

```scala
name := "lms-tutorial"

scalaOrganization := "org.scala-lang.virtualized"

scalaVersion := "2.10.1"

libraryDependencies += "EPFL" %% "lms" % "0.3-SNAPSHOT"

scalacOptions += "-Yvirtualize"
```

Create an empty file `LinearAlgebra.scala` in the `src/main/scala/` subdirectory of your project.

# DSL concepts and concrete syntax

The first step is to define the DSL concepts and their concrete syntax. In our case, we’ll define a DSL for linear algebra, so we want to be able to express vectors and arithmetic operations on them. For brevity, we’ll restrict this tutorial to just a few concepts. For instance, we want to be able to express scaling a vector with an expression like the following:

`val u = v * 12.34`

where `v` is a vector.

Thus we have a concept of vector and a scale operation. Let’s write these concepts in terms of functions and types:

```scala
import scala.virtualization.lms.common._

trait LinearAlgebra extends Base {
  // We use an abstract type to represent the vector data type
  type Vector
  // Scaling a vector `v` by a factor `k` gives another vector
  def vector_scale(v: Rep[Vector], k: Rep[Double]): Rep[Vector]
}
```

The `Base` trait comes from LMS and defines two main things:

- the `Rep[T]` abstract type member, used to distinguish between values and representations of values;
- the `unit[T](x: T): Rep[T]` abstract method, used to lift a value `x` into a representation of it.
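
To make this distinction concrete, here is a minimal self-contained sketch (with hypothetical names, not LMS code) that gives the same interface two meanings for `Rep`: a shallow one where a value represents itself, and a deep one where it is represented by an expression node:

```scala
// Hypothetical miniature of the Base pattern (not LMS's actual code).
trait Api {
  type Rep[+T]              // representation of a value of type T
  def unit[T](x: T): Rep[T] // lift a value into its representation
}

// Shallow meaning: a value represents itself.
trait Direct extends Api {
  type Rep[+T] = T
  def unit[T](x: T): T = x
}

// Deep meaning: a value is represented by an expression-tree node.
trait Staged extends Api {
  sealed trait Exp[+T]
  case class Const[T](x: T) extends Exp[T]
  type Rep[+T] = Exp[T]
  def unit[T](x: T): Rep[T] = Const(x)
}
```

DSL code is written once against `Api`; mixing in `Direct` or `Staged` then decides whether running it computes values directly or builds a tree.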

Then, since we want to offer DSL users a pleasant syntax, we define a convenient infix operator `*` taking a vector and a scalar and returning a vector. Here is what our user-facing DSL trait looks like:

```scala
import scala.virtualization.lms.common._

trait LinearAlgebra extends Base {
  // Concepts
  type Vector
  def vector_scale(v: Rep[Vector], k: Rep[Double]): Rep[Vector]

  // Concrete syntax
  implicit class VectorOps(v: Rep[Vector]) {
    def * (k: Rep[Double]) = vector_scale(v, k)
  }
}
```

Now that we have a DSL we can write programs such as:

```scala
trait Prog extends LinearAlgebra {
  def f(v: Rep[Vector]): Rep[Vector] = v * unit(12.34)
}
```

The call `unit(12.34)` lifts the `Double` into a `Rep[Double]`, as required by our `*` operator. Alternatively, we could bring into scope an implicit conversion that lifts any `Double` to a `Rep[Double]` by calling `unit` when required, allowing us to write just `v * 12.34`.
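
As a sketch of that alternative, here is a self-contained miniature (hypothetical names, independent of LMS, which ships its own lifting machinery) where an implicit conversion performs the call to `unit` for us:

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for the LinearAlgebra trait above (not LMS code).
trait LiftedAlgebra {
  type Rep[+T]
  type Vector
  def unit[T](x: T): Rep[T]
  def vector_scale(v: Rep[Vector], k: Rep[Double]): Rep[Vector]

  // The implicit conversion calls `unit` whenever a Double appears
  // where a Rep[Double] is expected.
  implicit def liftDouble(x: Double): Rep[Double] = unit(x)

  implicit class VectorOps(v: Rep[Vector]) {
    def * (k: Rep[Double]) = vector_scale(v, k)
  }

  // The literal 12.34 is now lifted automatically: no explicit `unit` call.
  def f(v: Rep[Vector]): Rep[Vector] = v * 12.34
}

// Shallow instantiation, just to run `f` directly.
object LiftDemo extends LiftedAlgebra {
  type Rep[+T] = T
  type Vector = Seq[Double]
  def unit[T](x: T): T = x
  def vector_scale(v: Seq[Double], k: Double): Seq[Double] = v.map(_ * k)
}
```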

The `Prog` trait is still abstract since its `vector_scale` operation has no implementation. In the end we’ll build an intermediate representation of the concept of scaling a vector, generate code from the resulting expression graph, and that code will finally be runnable.

# Shallow Interpreter

For prototyping purposes it can be useful to have a shallow interpreter for a DSL, one that bypasses the intermediate representation and simplifies the build chain.

So we can write an `Interpreter` trait defining that the intermediate representation of a type is this type itself:

```scala
trait Interpreter extends Base {
  override type Rep[+A] = A
  override protected def unit[A : Manifest](a: A) = a
}
```

And we can write an interpreter for our linear algebra DSL:

```scala
trait LinearAlgebraInterpreter extends LinearAlgebra with Interpreter {
  override type Vector = Seq[Double]
  override def vector_scale(v: Seq[Double], k: Double) = v map (_ * k)
}
```

Now we are able to run the program written above:

```scala
val prog = new Prog with LinearAlgebraInterpreter
println(prog.f(Seq(1.0, 2.0))) // prints "List(12.34, 24.68)"
```

Note that this shallow interpreter can’t benefit from the optimizations performed at the deep embedding level we’ll see in the next section.

# Intermediate Representation

Building an intermediate representation (IR) of a program allows us to generate code for several (possibly different) targets and to perform domain-specific optimizations at the IR level.

Basically, at the IR level we reify each DSL concept in an abstract definition:

```scala
trait LinearAlgebraExp extends LinearAlgebra with BaseExp {

  // Reification of the concept of scaling a vector `v` by a factor `k`
  case class VectorScale(v: Exp[Vector], k: Exp[Double]) extends Def[Vector]

  override def vector_scale(v: Exp[Vector], k: Exp[Double]) = toAtom(VectorScale(v, k))

  // Here we say how a Rep[Vector] will be bound to a Seq[Double] in regular Scala code
  override type Vector = Seq[Double]
}
```

We use the `BaseExp` trait, which defines intermediate representations in terms of constants and of symbols pointing to composite definitions.

It aliases the type `Rep[T]` to `Exp[T]`, the base class defining the abstract concept of an expression. Such an expression may be a constant (`Const[T]`), e.g. `Const(42)` or `Const("foo")`, or a symbol (`Sym[T]`), namely an identifier referencing some composite definition (a `Def[T]`). The framework internally registers each program statement (`Stm`) and allows navigating from a symbol (`Sym[T]`) to its definition (`Def[T]`) and vice versa.

Finally, we call the `toAtom` function to associate a symbol with the value produced by `VectorScale`. Indeed, the composite node `VectorScale` is not an `Exp[Vector]` itself, so we ask the framework to create a fresh symbol and associate it with our node. This separation between `Exp` and `Def` enables common subexpression elimination: if `toAtom` is called with a value equal to a previously registered one, the framework returns the symbol created for that previous value instead of creating a new one.
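
This sharing behaviour can be illustrated with a self-contained miniature of such a memo table (a simplification with hypothetical names, not LMS’s actual implementation):

```scala
import scala.collection.mutable

// Miniature Exp/Def hierarchy (hypothetical, simplified from LMS's).
sealed trait Exp[+T]
case class Const[T](x: T) extends Exp[T]
case class Sym[T](id: Int) extends Exp[T]
trait Def[+T]

object MiniIR {
  private var nextId = 0
  private val table = mutable.Map.empty[Def[Any], Sym[Any]]

  // Return the symbol already associated with a structurally equal
  // definition, or create a fresh one: this enables CSE.
  def toAtom[T](d: Def[T]): Exp[T] = {
    val sym = table.getOrElseUpdate(d, { val s = Sym[Any](nextId); nextId += 1; s })
    sym.asInstanceOf[Sym[T]] // the type parameter is only a phantom tag here
  }
}

case class VectorScale(v: Exp[Seq[Double]], k: Exp[Double]) extends Def[Seq[Double]]
```

Because `VectorScale` is a case class, two structurally equal definitions hash and compare equal, so `toAtom` hands back the same symbol for both.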

Now if we write and run the following code:

```scala
new Prog with LinearAlgebraExp {
  println(f(unit(Seq(1.0))))
}
```

We see `Sym(0)` printed in the console. `f` is a regular Scala function working at the IR level; here it returned the symbol created to reference the `VectorScale` definition. The following object diagram may help to see what happened:

Everything starts with the constant value `v` that we pass to the `f` function. The scaling operation in the function body creates a value `k`, also a constant, and the definition of the scaling operation: the `vs` value. Then the call to `toAtom` creates a fresh symbol, `sym`, and a statement, `stm`, linking the symbol to the `vs` definition. Finally, the `sym` value is the intermediate representation of the scaling operation: a symbol referencing the operation’s definition.

So the `Exp` intermediate representation helps us build a graph of constants and symbols, where symbols reference composite nodes that are themselves linked to other constants or symbols.

At this point, to evaluate a staged program we could write another interpreter (a “deep” interpreter) or generate executable code from the IR graph. In this tutorial we choose the second solution.

# Generating Code

To generate code we need to traverse the expression IR graph and emit the code corresponding to each statement, watching carefully for dependencies between them. Fortunately, LMS already provides the traversal code, so the only remaining task is to define the code corresponding to each statement. LMS lets us do this in a modular fashion: in our case we only need to say how to generate code for linear algebra statements:

```scala
trait ScalaGenLinearAlgebra extends ScalaGenBase {
  // This code generator works with IR nodes defined by the LinearAlgebraExp trait
  val IR: LinearAlgebraExp
  import IR._

  override def emitNode(sym: Sym[Any], node: Def[Any]): Unit = node match {
    case VectorScale(v, k) => {
      emitValDef(sym, quote(v) + ".map(x => x * " + quote(k) + ")")
    }
    case _ => super.emitNode(sym, node)
  }
}
```

The `quote` function returns a literal representation of an expression for the target language (here Scala since we extend `ScalaGenBase`).
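
To make `quote` and `emitValDef` concrete, here is a self-contained miniature of this string-based emission (a simplification with hypothetical names, not LMS’s actual implementation):

```scala
// Miniature expression types (hypothetical, simplified from LMS's).
sealed trait Exp
case class Const(x: Any) extends Exp
case class Sym(id: Int) extends Exp

object MiniGen {
  // A constant is quoted as its literal; a symbol as a generated name.
  def quote(e: Exp): String = e match {
    case Const(x) => x.toString
    case Sym(id)  => "x" + id
  }

  // Bind the right-hand side expression to the symbol's generated name.
  def emitValDef(sym: Sym, rhs: String): String =
    "val " + quote(sym) + " = " + rhs
}
```

For the `VectorScale` node, `MiniGen.emitValDef(Sym(1), MiniGen.quote(Sym(0)) + ".map(x => x * " + MiniGen.quote(Const(12.34)) + ")")` yields the line `val x1 = x0.map(x => x * 12.34)`.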

Our code generator can be used as follows:

```scala
val prog = new Prog with LinearAlgebraExp with EffectExp
val codegen = new ScalaGenEffect with ScalaGenLinearAlgebra { val IR: prog.type = prog }
codegen.emitSource(prog.f, "F", new java.io.PrintWriter(System.out))
```

Running the above code produces the following output:

```scala
class F extends ((scala.collection.Seq[Double])=>(scala.collection.Seq[Double])) {
  def apply(x0:scala.collection.Seq[Double]): scala.collection.Seq[Double] = {
    val x1 = x0.map(x => x * 12.34)
    x1
  }
}
```

This generated code can then be compiled and used as follows:

```scala
val f = new F
println(f(Seq(1.0, 2.0))) // prints "List(12.34, 24.68)"
```

# Applying Domain Specific Optimizations

The IR level allows us to transform expressions in order to apply domain specific optimizations. For instance, in our case we could handle the case of scaling a vector by a factor of 1 by just returning the vector itself instead of producing a `VectorScale` node:

```scala
trait LinearAlgebraExpOpt extends LinearAlgebraExp {
  override def vector_scale(v: Exp[Vector], k: Exp[Double]) = k match {
    case Const(1.0) => v
    case _ => super.vector_scale(v, k)
  }
}
```
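
The pattern at work here is a “smart constructor”: the operation inspects its arguments before deciding whether to build an IR node at all. A self-contained sketch of the same idea in plain Scala (hypothetical names, independent of LMS):

```scala
// Miniature IR (hypothetical, simplified from LMS's).
sealed trait Exp
case class Const(x: Double) extends Exp
case class Sym(id: Int) extends Exp
case class VectorScale(v: Exp, k: Exp) extends Exp

object SmartConstructor {
  // Scaling by 1.0 is the identity, so we return `v` itself
  // and no VectorScale node ever enters the graph.
  def vectorScale(v: Exp, k: Exp): Exp = k match {
    case Const(1.0) => v
    case _          => VectorScale(v, k)
  }
}
```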

This optimization is defined in its own trait and can be mixed into a program as follows:

`val prog = new Prog with LinearAlgebraExpOpt`

# Generating and Compiling in One Step

It may be more convenient to generate, compile and load the code produced by a staged program in one step. LMS provides this through the `CompileScala` trait:

```scala
val prog = new Prog with LinearAlgebraExpOpt with EffectExp with CompileScala { self =>
  override val codegen = new ScalaGenEffect with ScalaGenLinearAlgebra { val IR: self.type = self }
}
val f = prog.compile(prog.f)
println(f(Seq(1.0, 2.0))) // prints "List(12.34, 24.68)"
```