# Emma Source Language

Emma is a *domain-specific language (DSL)* for parallel data analysis embedded in Scala. As such, Emma accepts a subset of Scala as valid source expressions. The language induced by this subset is called *Emma Source* and is introduced by example below.

## Notebook Setup

This notebook assumes that you have installed the current version of Emma in your local Maven repository before opening it. If this is not the case, run the following Maven command from the project root.

```
mvn clean install -DskipTests
```

We can then load the `emma-language` artifact as follows.

In [1]:
// register the maven repository
classpath.addRepository(
  s"file://${System.getenv("HOME")}/.m2/repository/"
)
// add the required Maven modules
classpath.add("eu.stratosphere" % "emma-language" % "1.0-SNAPSHOT")

Adding 18 artifact(s)




## Compiler Infrastructure

Scala offers facilities for both compile-time and runtime reflection (for more information, read the [Reflection Overview](http://docs.scala-lang.org/overviews/reflection/overview.html) documentation).
While the two APIs are mostly similar, there are some subtle differences that need to be considered.

Emma unifies the two approaches under a single `Compiler` interface with two implementations:

- [`MacroCompiler`](https://github.com/stratosphere/emma/blob/newir/emma-language/src/main/scala/eu/stratosphere/emma/compiler/MacroCompiler.scala) (which operates at compile-time), and 
- [`RuntimeCompiler`](https://github.com/stratosphere/emma/blob/newir/emma-language/src/main/scala/eu/stratosphere/emma/compiler/RuntimeCompiler.scala) (which operates at runtime).

This unified approach gives people experimenting with the Emma compiler infrastructure the freedom to decide ad hoc which parts of the compiler pipeline are performed statically and which dynamically.

The examples below are illustrated based on the `RuntimeCompiler`, which is instantiated as follows.

In [2]:
import eu.stratosphere.emma.compiler.RuntimeCompiler
val compiler = new RuntimeCompiler()

[32mimport [36meu.stratosphere.emma.compiler.RuntimeCompiler[0m
[36mcompiler[0m: [32meu[0m.[32mstratosphere[0m.[32memma[0m.[32mcompiler[0m.[32mRuntimeCompiler[0m = eu.stratosphere.emma.compiler.RuntimeCompiler@7bc175bb

Once we have a `Compiler` instance, we can import the (path-dependent) `Tree` universe.

In [3]:
import compiler.universe._

[32mimport [36mcompiler.universe._[0m

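The universe is path-dependent: its `Tree` type belongs to one specific `compiler` instance, so trees obtained from two different instances have unrelated types. A minimal sketch in plain Scala illustrates the mechanism. The names `MiniUniverse` and `MiniCompiler` are hypothetical stand-ins and are not part of Emma.

```scala
// Hypothetical stand-ins for illustration only; not part of the Emma API.
object PathDependentDemo {
  class MiniUniverse {
    // Each MiniUniverse instance carries its own, distinct Tree type.
    case class Tree(label: String)
  }

  class MiniCompiler {
    val universe = new MiniUniverse
    // Accepts only trees that belong to this compiler's universe.
    def show(tree: universe.Tree): String = s"Tree(${tree.label})"
  }

  val c1 = new MiniCompiler
  val c2 = new MiniCompiler

  // Importing the members of c1.universe brings c1's Tree into scope.
  import c1.universe._
  val t: Tree = Tree("x")

  val shown: String = c1.show(t) // compiles: t has type c1.universe.Tree
  // c2.show(t)                  // would not compile: c2.universe.Tree is expected
}
```

This is why the import has to go through a concrete `compiler` value rather than through a class name.
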
## Compiler Pipelines

The `Compiler` trait mixes in a sequence of traits which add functionality in a modular way.

Each trait defines a set of referentially transparent (i.e., *functional*) transformations that consume a Scala `Tree` and produce a new `Tree`.

Compilation pipelines can be defined in the so-called *point-free* style by chaining such transformation functions with the `andThen` combinator.

For example, a trivial compiler pipeline that just typechecks a reified Scala expression can be defined as follows.

In [4]:
def typeCheck[T]: Expr[T] => Tree = {
  (_: Expr[T]).tree
} andThen {
  compiler.Type.check(_: Tree)
}

defined function typeCheck
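
The `andThen` combinator is the standard method on Scala's `Function1`, so the same chaining style can be tried without any Emma dependency. The following sketch uses hypothetical string-processing stages (`trim`, `tokenize`, `count`) purely to show how stages compose without naming intermediate values.

```scala
// Illustrative stages only; unrelated to the Emma compiler.
object PipelineDemo {
  // Three small, referentially transparent stages.
  val trim: String => String = _.trim
  val tokenize: String => Seq[String] = _.split("\\s+").toSeq
  val count: Seq[String] => Int = _.size

  // Point-free composition: (f andThen g)(x) is g(f(x)).
  val wordCount: String => Int = trim andThen tokenize andThen count

  val result: Int = wordCount("  emma source language  ")
}
```

Because each stage is a pure function, stages can be reordered, removed, or tested in isolation as long as the input and output types line up.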

To see this pipeline at work, reify a Scala code snippet and pass it as an argument to the `typeCheck` method.

In [5]:
val QandA = typeCheck(reify {
  val Q = "What is the meaning of Life, the Universe, and Everything?"
  val A = 42
})

QandA: Tree = {
  val Q: String = "What is the meaning of Life, the Universe, and Everything?";
  val A: Int = 42;
  ()
}

## Language Constructs

The [`Source`](https://github.com/stratosphere/emma/blob/newir/emma-language/src/main/scala/eu/stratosphere/emma/compiler/lang/Source.scala) trait defines a `Source` object that contains facilities for handling *Emma Source* expressions.

In [6]:
import compiler.Source

[32mimport [36mcompiler.Source[0m

The `Source.Language` member contains a set of objects corresponding to the basic language features.

In [7]:
import Source.Language._

[32mimport [36mSource.Language._[0m

We are also going to use the `eq` method which checks alpha-equivalence between Source trees.

In [8]:
import Source.Language.eq

[32mimport [36mSource.Language.eq[0m

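Two trees are alpha-equivalent when they differ only in the names of their bound variables. As a rough sketch of the idea, the code below checks this on a tiny hypothetical expression type. `Term`, `Let`, and `alphaEq` are illustrative names, and this is not how Emma's `eq` is implemented.

```scala
// A toy model of alpha-equivalence; not Emma's implementation.
object AlphaDemo {
  sealed trait Term
  final case class Lit(value: Int) extends Term
  final case class Ref(name: String) extends Term
  // Models `val name = rhs; body`.
  final case class Let(name: String, rhs: Term, body: Term) extends Term

  def alphaEq(a: Term, b: Term): Boolean = {
    // left/right map each bound name to the depth of its innermost binder.
    def go(a: Term, b: Term,
           left: Map[String, Int], right: Map[String, Int],
           depth: Int): Boolean = (a, b) match {
      case (Lit(x), Lit(y)) => x == y
      case (Ref(x), Ref(y)) => (left.get(x), right.get(y)) match {
        case (Some(i), Some(j)) => i == j // both bound: same binder position?
        case (None, None)       => x == y // both free: names must agree
        case _                  => false
      }
      case (Let(x, r1, b1), Let(y, r2, b2)) =>
        go(r1, r2, left, right, depth) &&
          go(b1, b2, left + (x -> depth), right + (y -> depth), depth + 1)
      case _ => false
    }
    go(a, b, Map.empty, Map.empty, 0)
  }

  val p1: Term = Let("a", Lit(42), Ref("a"))
  val p2: Term = Let("b", Lit(42), Ref("b")) // p1 with `a` renamed to `b`
  val p3: Term = Let("a", Lit(42), Ref("c")) // body refers to a free name
}
```

Here `alphaEq(p1, p2)` holds because only the bound name differs, while `alphaEq(p1, p3)` does not because `c` is free in `p3`.
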
### Atomic Expressions

The `QandA` code snippet gives us some ideas about the Scala language constructs supported by Emma.

The `Source.Language` primitive that models literals is `lit`. It enables (1) explicit construction and (2) pattern matching of literal trees.

In [9]:
// quoted Emma Source code
object lit$act {
  val code: Tree = typeCheck(reify { 
    42
  })
}

// constructed Emma Source tree
object lit$exp {
  val code: Tree = lit(42)
}

// destructed Emma Source tree (pattern matching)
Seq(lit$act.code, lit$exp.code) collect {
  case lit(v) => v
}

defined object lit$act
defined object lit$exp
res8_2: Seq[Any] = List(42, 42)
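
A single name can serve for both construction and pattern matching because Scala lets an object define `apply` and `unapply`. The sketch below illustrates that pattern with a hypothetical `MiniTree` type and a `num` object. Emma's `lit` operates on Scala `Tree`s, and its actual implementation may differ.

```scala
// Hypothetical types for illustration; not part of the Emma API.
object ExtractorDemo {
  sealed trait MiniTree
  final case class Const(value: Any) extends MiniTree
  final case class Name(id: String) extends MiniTree

  object num {
    // (1) construction: num(42) builds a tree
    def apply(value: Any): MiniTree = Const(value)
    // (2) destruction: `case num(v)` extracts the wrapped value
    def unapply(tree: MiniTree): Option[Any] = tree match {
      case Const(v) => Some(v)
      case _        => None
    }
  }

  val trees: Seq[MiniTree] = Seq(num(42), Name("x"), num("life"))
  // Only trees built from constants match; Name("x") is skipped.
  val values: Seq[Any] = trees collect { case num(v) => v }
}
```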