Scala AST reference

Vivek Srikumar edited this page Jan 3, 2014 · 52 revisions

This document is not a formal reference for the AST created by the Scala compiler. It provides several examples for the kinds of nodes in the AST that we may encounter.

General notes

  • All the case classes here (say X) should be prefixed by global.X when used in a case statement. To avoid that, we could import global._.

  • Type names (like Int, Map, etc) either belong to a class called TypeName_S or TypeName_R. But these are private members of the class Names so we cannot pattern match against them. However, any AST node that is a type name has the flag isTypeName set to true. This could be used to identify type names. Below, all type names are represented by the string newTypeName("<some-name>").

Types

Types are identified by either Ident, Select or AppliedTypeTree. See notes on Select below. Here are some examples.

Original code AST after type inference
Int Ident(newTypeName("Int"))
String Ident(newTypeName("String"))
Foo Ident(newTypeName("Foo"))
java.util.Random Select(Select(Ident(newTermName("java")), newTermName("util")), newTypeName("Random"))
scala.util.Random Select(Select(Ident(newTermName("scala")), newTermName("util")), newTypeName("Random"))
Set[String] AppliedTypeTree(Ident(newTypeName("Set")),List(Ident(newTypeName("String"))))
Map[String, Int] AppliedTypeTree(Ident(newTypeName("Map")), List(Ident(newTypeName("String")), Ident(newTypeName("Int"))))
(String, Int) AppliedTypeTree(Select(Ident(scala), newTypeName("Tuple2")), List(Ident(newTypeName("String")), Ident(newTypeName("Int"))))
(String, Int, Double) AppliedTypeTree(Select(Ident(scala), newTypeName("Tuple3")), List(Ident(newTypeName("String")), Ident(newTypeName("Int")), Ident(newTypeName("Double"))))
String => Double AppliedTypeTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Function1")), List(Ident(newTypeName("String")), Ident(newTypeName("Double"))))
(String, Int) => Double AppliedTypeTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Function2")), List(Ident(newTypeName("String")), Ident(newTypeName("Int")), Ident(newTypeName("Double")))
(String, Int, Boolean) => Double AppliedTypeTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Function3")), List(Ident(newTypeName("String")), Ident(newTypeName("Int")), Ident(newTypeName("Boolean")), Ident(newTypeName("Double"))))
String => Int => Double AppliedTypeTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Function1")), List(Ident(newTypeName("String")), AppliedTypeTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Function1")), List(Ident(newTypeName("Int")), Ident(newTypeName("Double"))))))

Notes

  • In general, AppliedTypeTree takes two arguments. The first argument tells us about the type and the second argument is a list that gives information about the types of parameters used to construct the types.
  • Select is used to pick a member of a type or an object as Select(identifier, member). In the examples above, note that this is nested repeatedly till we drill down to the member we need. We will see more of this below too.
  • Some Idents take the word scala as a parameter. Most likely, we won't be using this. So, for now, let's ignore them. Ditto for nme.ROOTPKG.

Aliasing new types

New types defined using the type construction are represented by TypeDef nodes in the AST. TypeDef takes four arguments:

  1. Modifiers, which indicates whether the new type is public/private/etc. We won't be needing this for now.
  2. The name of the new type
  3. A list of parameters for the type (eg, for cases like type Foo[T, S] = Map[T, S]. Not sure if we will need this.
  4. The RHS which should be a valid type expression. Since we will be looking at ASTs after many compiler phases, in all cases, we will be looking at an object called TypeTree(). This object should have a member called original which will give us a type expressionthat looks like what is defined in the Types section above.
Original code AST after type inference
type Foo1 = Int TypeDef(Modifiers(), newTypeName("Foo1"), List(), TypeTree().setOriginal(Select(Ident(scala), scala.Int)))
type Foo2 = String TypeDef(Modifiers(), newTypeName("Foo2"), List(), TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))))
type Foo3 = Set[String] TypeDef(Modifiers(), newTypeName("Foo3"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("Set")), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String")))))))
type Foo4 = Map[String, Int] TypeDef(Modifiers(), newTypeName("Foo4"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("Map")), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))), TypeTree().setOriginal(Select(Ident(scala), scala.Int))))))
type Foo5 = (String, Int) TypeDef(Modifiers(), newTypeName("Foo5"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Ident(scala), scala.Tuple2), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))), TypeTree().setOriginal(Select(Ident(scala), scala.Int))))))
type Foo6 = scala.util.Random TypeDef(Modifiers(), newTypeName("Foo6"), List(), TypeTree().setOriginal(Select(Select(Ident(scala), scala.util), scala.util.Random)))
type Foo7 = (String, Int, Double) TypeDef(Modifiers(), newTypeName("Foo7"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Ident(scala), scala.Tuple3), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))), TypeTree().setOriginal(Select(Ident(scala), scala.Int)), TypeTree().setOriginal(Select(Ident(scala), scala.Double))))))
type Foo9 = String => Double TypeDef(Modifiers(), newTypeName("Foo9"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Select(Ident(_root_), scala), scala.Function1), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))), TypeTree().setOriginal(Select(Ident(scala), scala.Double))))))
type Foo10 = (String, Int, Boolean) => Double TypeDef(Modifiers(), newTypeName("Foo10"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Select(Ident(_root_), scala), scala.Function3), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))), TypeTree().setOriginal(Select(Ident(scala), scala.Int)), TypeTree().setOriginal(Select(Ident(scala), scala.Boolean)), TypeTree().setOriginal(Select(Ident(scala), scala.Double))))))
type Foo11 = String => Int => Double TypeDef(Modifiers(), newTypeName("Foo11"), List(), TypeTree().setOriginal(AppliedTypeTree(Select(Select(Ident(_root_), scala), scala.Function1), List(TypeTree().setOriginal(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("String"))), TypeTree().setOriginal(AppliedTypeTree(Select(Select(Ident(_root_), scala), scala.Function1), List(TypeTree().setOriginal(Select(Ident(scala), scala.Int)), TypeTree().setOriginal(Select(Ident(scala), scala.Double)))))))))
type Foo12[T] = Set[T] TypeDef(Modifiers(), newTypeName("Foo12"), List(TypeDef(Modifiers(PARAM), newTypeName("T"), List(), TypeTree().setOriginal(TypeBoundsTree(TypeTree().setOriginal(Select(Select(Ident(_root_), scala), scala.Nothing)), TypeTree().setOriginal(Select(Select(Ident(_root_), scala), scala.Any)))))), TypeTree().setOriginal(AppliedTypeTree(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("Set")), List(TypeTree().setOriginal(Ident(newTypeName("T")))))))
type Foo13[T, S] = Map[T, S] TypeDef(Modifiers(), newTypeName("Foo13"), List(TypeDef(Modifiers(PARAM), newTypeName("T"), List(), TypeTree().setOriginal(TypeBoundsTree(TypeTree().setOriginal(Select(Select(Ident(_root_), scala), scala.Nothing)), TypeTree().setOriginal(Select(Select(Ident(_root_), scala), scala.Any))))), TypeDef(Modifiers(PARAM), newTypeName("S"), List(), TypeTree().setOriginal(TypeBoundsTree(TypeTree().setOriginal(Select(Select(Ident(_root_), scala), scala.Nothing)), TypeTree().setOriginal(Select(Select(Ident(_root_), scala), scala.Any)))))), TypeTree().setOriginal(AppliedTypeTree(Select(Select(This(newTypeName("scala")), scala.Predef), newTypeName("Map")), List(TypeTree().setOriginal(Ident(newTypeName("T"))), TypeTree().setOriginal(Ident(newTypeName("S")))))))

Creating objects

Scala objects are either literals (integers, strings, etc) or are created by applying a constructor. Symbols look like a combination of both.

Scala object AST case class
1 Literal(Constant(1))
"wolfe" Literal(Constant("wolfe"))
1.3 Literal(Constant(1.3))
true Literal(Constant(true))
(1, 2) Apply(Select(Ident(scala), newTermName("Tuple2")), List(Literal(Constant(1)), Literal(Constant(2))))
'Anna Apply(Select(Ident(scala), newTermName("Symbol")), List(Literal(Constant("Anna"))))
new Foo Apply(Select(New(Ident(newTypeName("Foo"))), nme.CONSTRUCTOR), List())
new Foo(a, b) Apply(Select(New(Ident(newTypeName("Foo"))), nme.CONSTRUCTOR), List(Ident(newTermName("a")), Ident(newTermName("b"))))
new some.where.Foo Apply(Select(New(Select(Select(Ident(newTermName("some")), newTermName("where")), newTypeName("Foo"))), nme.CONSTRUCTOR), List())
new some.where.Foo(a,b) Apply(Select(New(Select(Select(Ident(newTermName("some")), newTermName("where")), newTypeName("Foo"))), nme.CONSTRUCTOR), List(Ident(newTermName("a")), Ident(newTermName("b"))))

After typer: | new String | Apply(Select(New(Ident(newTypeName("String"))), nme.CONSTRUCTOR), List()) |

After refchecks: | new String | Apply(Select(New(TypeTree()), nme.CONSTRUCTOR), List())|

Assigning to val

Assignment to a val is represented using the ValDef case class, which takes four arguments:

  1. Modifier, representing private/public/etc, which we won't use for now
  2. The name of the new term
  3. The type of the new term (if explicitly stated), as a type node (see section on Types)
  4. The right hand side (which could represent any expression)

Some examples below

Scala statement AST case class
val num = 1 ValDef(Modifiers(), newTermName("num"), TypeTree(), Literal(Constant(1)))
private val dbl: Double = 1.3 ValDef(Modifiers(PRIVATE), newTermName("dbl"), Ident(newTypeName("Double")), Literal(Constant(1.3)))
val foo = new Foo ValDef(Modifiers(), newTermName("foo"), TypeTree(), Apply(Select(New(Ident(newTypeName("Foo"))), nme.CONSTRUCTOR), List()))
val fooab = new Foo(a, b) ValDef(Modifiers(), newTermName("fooab"), TypeTree(), Apply(Select(New(Ident(newTypeName("Foo"))), nme.CONSTRUCTOR), List(Ident(newTermName("a")), Ident(newTermName("b")))))

Function calls

Function calls are translated to the Apply case class. Apply takes two arguments -- a Select or an Ident for the function and a list of arguments. Each element in the argument list is an AST representing it.

Examples:

Scala statement AST case class
f(10) Apply(Ident(newTermName("f")), List(Literal(Constant(10))))
val a = foo(10) ValDef(Modifiers(), newTermName("a"), TypeTree(), Apply(Ident(newTermName("f")), List(Literal(Constant(10)))))
foo.call(1, 3) Apply(Select(Ident(newTermName("foo")), newTermName("call")), List(Literal(Constant(1)), Literal(Constant(3))))
1 + 3 Apply(Select(Literal(Constant(1)), newTermName("$plus")), List(Literal(Constant(3))))
bar.execute("mary", "lamb") Apply(Select(Ident(newTermName("bar")), newTermName("execute")), List(Literal(Constant("mary")), Literal(Constant("lamb"))))
a + 3 Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Literal(Constant(3))))
f1 + f2 Apply(Select(Ident(newTermName("f1")), newTermName("$plus")), List(Ident(newTermName("f2"))))
(f1 + f2) dot weights Apply(Select(Apply(Select(Ident(newTermName("f1")), newTermName("$plus")), List(Ident(newTermName("f2")))), newTermName("dot")), List(Ident(newTermName("weights"))))

Notes

  1. Operators are just functions, selected from the previous identifier. This is consistent with the way we define operators.
  2. One weird case is the x :: xs construction used to create a list. The :: is really a function of xs. To handle this, this is translated to val x$1 = x; xs.$colon$colon(x$1). This is then parsed as a Block object containing a list of two statements: a val assignment followed by a function call.
  3. Note that as far as the AST is concerned, we don't distinguish between calling a function with a dot or with a space.

Defining functions

Functions are defined using the DefDef construction. DefDef takes five arguments:

  1. Modifiers: public/private/etc
  2. Name of the function
  3. Type parameters: Any generic type parameters associated with the function
  4. Parameters: A list of list of ValDefs. Each inner list is one set of parameters.
  5. A type tree that may contain information about the function's return type
  6. The body of the function, as a tree

Examples

Function definition AST case class
def f = 10 DefDef(Modifiers(), newTermName("f"), List(), List(), TypeTree(), Literal(Constant(10)))
def f: Int = 10 DefDef(Modifiers(), newTermName("f"), List(), List(), Ident(newTypeName("Int")), Literal(Constant(10)))
def f(a: Int) = a + 3 DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("Int")), EmptyTree))), TypeTree(), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Literal(Constant(3)))))
def f[T](a: T) = a.toString DefDef(Modifiers(), newTermName("f"), List(TypeDef(Modifiers(PARAM), newTypeName("T"), List(), TypeBoundsTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Nothing")), Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Any"))))), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("T")), EmptyTree))), TypeTree(), Select(Ident(newTermName("a")), newTermName("toString")))
def f[T, S](a: T) = createS(a) DefDef(Modifiers(), newTermName("f"), List(TypeDef(Modifiers(PARAM), newTypeName("T"), List(), TypeBoundsTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Nothing")), Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Any")))), TypeDef(Modifiers(PARAM), newTypeName("S"), List(), TypeBoundsTree(Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Nothing")), Select(Select(Ident(nme.ROOTPKG), scala), newTypeName("Any"))))), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("T")), EmptyTree))), TypeTree(), Apply(Ident(newTermName("createS")), List(Ident(newTermName("a")))))
def f(a: Int) = { a + 3 } DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("Int")), EmptyTree))), TypeTree(), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Literal(Constant(3)))))
def f(a: Int = 2) = a + 3 DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM | DEFAULTPARAM/TRAIT), newTermName("a"), Ident(newTypeName("Int")), Literal(Constant(2))))), TypeTree(), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Literal(Constant(3)))))
def f(a: Int = 2): Int = a + 3 DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM | DEFAULTPARAM/TRAIT), newTermName("a"), Ident(newTypeName("Int")), Literal(Constant(2))))), Ident(newTypeName("Int")), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Literal(Constant(3)))))
def f(a: Int) = { println(a); a + 3 } DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("Int")), EmptyTree))), TypeTree(), Block(List(Apply(Ident(newTermName("println")), List(Ident(newTermName("a"))))), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Literal(Constant(3))))))
def f(a: Int, b: String) = a + b DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("Int")), EmptyTree), ValDef(Modifiers(PARAM), newTermName("b"), Ident(newTypeName("String")), EmptyTree))), TypeTree(), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Ident(newTermName("b")))))
def f(a: Int)(b: String) = a + b DefDef(Modifiers(), newTermName("f"), List(), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(newTypeName("Int")), EmptyTree)), List(ValDef(Modifiers(PARAM), newTermName("b"), Ident(newTypeName("String")), EmptyTree))), TypeTree(), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Ident(newTermName("b")))))

Lambda expressions

Lambda expressions are defined using the case class Function, which takes two arguments

  1. The parameters, as a list of ValDefs
  2. The body of function as a Tree

Examples:

Lambda expression AST case class
{x: Int => x} Function(List(ValDef(Modifiers(PARAM), newTermName("x"), Ident(newTypeName("Int")), EmptyTree)), Ident(newTermName("x")))
{p: (Int, Int) => p._1 + p._2} Function(List(ValDef(Modifiers(PARAM), newTermName("p"), AppliedTypeTree(Select(Ident(scala), newTypeName("Tuple2")), List(Ident(newTypeName("Int")), Ident(newTypeName("Int")))), EmptyTree)), Apply(Select(Select(Ident(newTermName("p")), newTermName("_1")), newTermName("$plus")), List(Select(Ident(newTermName("p")), newTermName("_2")))))
{x: Int => y: Int => x + y} Function(List(ValDef(Modifiers(PARAM), newTermName("x"), Ident(newTypeName("Int")), EmptyTree)), Function(List(ValDef(Modifiers(PARAM), newTermName("y"), Ident(newTypeName("Int")), EmptyTree)), Apply(Select(Ident(newTermName("x")), newTermName("$plus")), List(Ident(newTermName("y"))))))

Defining classes/case classes

These are huge. Not sure if we need this.

Annotations

Annotations are part of the Modifiers argument of the ValDef/DefDef type definitions. If there are any annotations, then the modifiers is populated. The case class Modifiers has three arguments:

  1. A Long indicating flags (like PARAM, etc) which we probably don't care for at this point
  2. A qualifier for public/private/etc
  3. A list of Trees, each of which contains an annotation

For example, if a function/field is marked as @deprecated, its AST will contain the following modifier Modifiers(NoFlags, tpnme.EMPTY, List(Apply(Select(New(Ident(newTypeName("deprecated"))), nme.CONSTRUCTOR).

Note that the annotation is created by calling the constructor of the class deprecated.

Important note: The scala doc for the Modifier class says

the typechecker drops these annotations, use the AnnotationInfo's (Symbol.annotations) in later phases.

Not sure what this means, but we should watch out.

Others

  1. for comprehensions: For comprehensions are syntactically translated into expressions that use map, filter, etc. So we will only be dealing with such expressions. To see examples of the translations see http://docs.scala-lang.org/tutorials/FAQ/yield.html

  2. Blocks: Blocks are enclosed in a Block object, which contains a list of trees.