Skip to content

Commit

Permalink
Import Stratego manual from metaborg.org website
Browse files Browse the repository at this point in the history
  • Loading branch information
Gohla committed Apr 5, 2016
1 parent 7ea46ef commit d1c193f
Show file tree
Hide file tree
Showing 16 changed files with 3,365 additions and 7 deletions.
4 changes: 3 additions & 1 deletion conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -400,7 +400,9 @@ class StrategoLexer(RegexLexer):

tokens = {
'root': [
(words(('strategies','rules','where','with'), suffix=r'\b'), Keyword),
(words(('module','imports','strategies','rules','where','with','signature','constructors','sorts','not','one',
'some','all','let','in','end','external','call','if','then','else','switch','case','otherwise','rec'),
suffix=r'\b'), Keyword),
(r'(\+|\?|!)', Operator),
(r'"[^"^\n]*"', Literal.String),
(r'\d+', Literal.Number),
Expand Down
2 changes: 1 addition & 1 deletion source/langdev/meta/lang/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ Meta Languages Manual

esv
sdf3
stratego
Stratego <stratego/index>
nabl
ts
nabl2
Expand Down
5 changes: 0 additions & 5 deletions source/langdev/meta/lang/stratego.md

This file was deleted.

45 changes: 45 additions & 0 deletions source/langdev/meta/lang/stratego/01-introduction.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
```eval_rst
.. highlight:: str
```

# 1. Introduction

Program transformation is the mechanical manipulation of a program in order to improve it relative to some cost function $C$, such that $C(P) > C(tr(P))$, i.e. the cost decreases as a result of applying the transformation. The cost of a program can be measured in different dimensions such as performance, memory usage, understandability, flexibility, maintainability, portability, correctness, or satisfaction of requirements. Related to these goals, program transformations are applied in different settings; e.g. compiler optimizations improve performance and refactoring tools aim at improving understandability.

While transformations can be achieved by manual manipulation of programs, in general, the aim of program transformation is to increase programmer productivity by automating programming tasks, thus enabling programming at a higher-level of abstraction, and increasing maintainability and re-usability of programs. Automatic application of program transformations requires their implementation in a programming language. In order to make the implementation of transformations productive, such a programming language should support abstractions for the domain of program transformation.

Stratego is a language designed for this purpose. It is a language based on the paradigm of rewriting with programmable rewriting strategies and dynamic rules.


## 1.1. Transformation by Rewriting

Term rewriting is an attractive formalism for expressing basic program transformations. A rewrite rule `p1 -> p2` expresses that a program fragment matching the left-hand side pattern `p1` can be replaced by the instantiation of the right-hand side pattern `p2`. For instance, the rewrite rule

|[ i + j ]| -> |[ k ]| where (i, j) => k

expresses constant folding for addition, i.e. replacing an addition of two constants by their sum. Similarly, the rule

|[ if 0 then e1 else e2 ]| -> |[ e2 ]|

defines unreachable code elimination by reducing a conditional statement to its right branch since the left branch can never be executed. Thus, rewrite rules can directly express laws derived from the semantics of the programming language, making the verification of their correctness straightforward. A correct rule can be safely applied anywhere in a program. A set of rewrite rules can be directly operationalized by rewriting to normal form, i.e. exhaustive application of the rules to a term representing a program. If the rules are confluent and terminating, the order in which they are applied is irrelevant.


## 1.2. Limitations of Pure Rewriting

However, there are two limitations to the application of standard term rewriting techniques to program transformation: the need to intertwine rules and strategies in order to control the application of rewrite rules and the context-free nature of rewrite rules.


## 1.3. Transformation Strategies

Exhaustive application of all rules to the entire abstract syntax tree of a program is not adequate for most transformation problems. The system of rewrite rules expressing basic transformations is often non-confluent and/or non-terminating. An ad hoc solution that is often used is to encode control over the application of rules into the rules themselves by introducing additional function symbols. This intertwining of rules and strategies obscures the underlying program equalities, incurs a programming penalty in the form of rules that define a traversal through the abstract syntax tree, and disables the reuse of rules in different transformations.

Stratego solves the problem of control over the application of rules while maintaining the separation of rules and strategies. A strategy is a little program that makes a selection from the available rules and defines the order and position in the tree for applying the rules. Thus rules remain pure, are not intertwined with the strategy, and can be reused in multiple transformations.


## 1.4. Context-Sensitive Transformation

The second problem of rewriting is the context-free nature of rewrite rules. A rule has access only to the term it is transforming. However, transformation problems are often context-sensitive. For example, when inlining a function at a call site, the call is replaced by the body of the function in which the actual parameters have been substituted for the formal parameters. This requires that the formal parameters and the body of the function are known at the call site, but these are only available higher-up in the syntax tree. There are many similar problems in program transformation, including bound variable renaming, typechecking, data flow transformations such as constant propagation, common-subexpression elimination, and dead code elimination. Although the basic transformations in all these applications can be expressed by means of rewrite rules, these require contextual information.



The following chapters give a tutorial for the Stratego language in which these ideas are explained and illustrated.
114 changes: 114 additions & 0 deletions source/langdev/meta/lang/stratego/02-terms.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
```eval_rst
.. highlight:: str
```

# 2. Terms

Stratego programs transform terms. When using Stratego for program transformation, terms typically represent the abstract syntax tree of a program. But Stratego does not much care what a term represents. Terms can just as well represent structured documents, software models, or anything else that can be rendered in a structured format. The chapters on the [XT Transformation Tools](#) show how to transform a program text into a term by means of parsing, and to turn a term into program text again by means of pretty-printing. From now on we will just assume that we have terms that should be transformed and ignore parsing and pretty-printing.


## 2.1. Annotated Term Format

Terms in Stratego are terms in the _Annotated Term Format_, or _ATerms_ for short. The ATerm format provides a set of constructs for representing trees, comparable to XML or abstract data types in functional programming languages. For example, the code `4 + f(5 * x)` might be represented in a term as:

Plus(Int("4"), Call("f", [Mul(Int("5"), Var("x"))]))

ATerms are constructed from the following elements:

* **Integer**: An integer constant, that is a list of decimal digits, is an ATerm.

Examples: `1`, `12343`.
* **String**: A string constant, that is a list of characters between double quotes is an ATerm. Special characters such as double quotes and newlines should be escaped using a backslash. The backslash character itself should be escaped as well.

Examples: `"foobar"`, `"string with quotes\""`, `"escaped escape character\\ and a newline\n"`.
* **Constructor application**: A constructor is an identifier, that is an alphanumeric string starting with a letter, or a double quoted string.

A constructor application `c(t1,...,tn)` creates a term by applying a constructor to a list of zero or more terms. For example, the term `Plus(Int("4"),Var("x"))` uses the constructors `Plus`, `Int`, and `Var` to create a nested term from the strings `"4"` and `"x"`.

When a constructor application has no subterms the parentheses may be omitted. Thus, the term `Zero` is equivalent to `Zero()`. Some people consider it good style to explicitly write the parentheses for nullary terms in Stratego programs. Through this rule, it is clear that a string is really a special case of a constructor application.
* **List**: A list is a term of the form `[t1,...,tn]`, that is a list of zero or more terms between square brackets. While all applications of a specific constructor typically have the same number of subterms, lists can have a variable number of subterms. The elements of a list are typically of the same type, while the subterms of a constructor application can vary in type.

Example: The second argument of the call to `"f"` in the term `Call("f",[Int("5"),Var("x")])` is a list of expressions.
* **Tuple**: A tuple `(t1,...,tn)` is a constructor application without a constructor.

Example: `(Var("x"), Type("int"))`
* **Annotation**: The elements defined above are used to create the structural part of terms. Optionally, a term can be annotated with a list terms. These annotations typically carry additional semantic information about the term. An annotated term has the form `t{t1,...,tn}`.

Example: `Lt(Var("n"),Int("1")){Type("bool")}`. The contents of annotations is up to the application.


## 2.2. Exchanging Terms

The term format described above is used in Stratego programs to denote terms, but is also used to exchange terms between programs. Thus, the internal format and the external format exactly coincide. Of course, internally a Stratego program uses a data-structure in memory with pointers rather than manipulating a textual representation of terms. But this is completely hidden from the Stratego programmer. There are a few facts that are useful to be aware of, though.

> TODO: Is maximal sharing currently preserved?
The internal representation used in Stratego programs maintains maximal sharing of subterms. This means that all occurrences of a subterm are represented by a pointer to the same node in memory. This makes comparing terms in Stratego for syntactic equality a very cheap operation, i.e., just a pointer comparison.

> TODO: picture of tree vs dag representation
When writing a term to a file in order to exchange it with another tool there are several representations to choose from. The textual format described above is the canonical _meaning_ of terms, but does not preserve maximal sharing. Therefore, there is also a _Binary ATerm Format (BAF)_ that preserves sharing in terms. The program _baffle_ can be used to convert between the textual and binary representations.


## 2.3. Inspecting Terms

> TODO: Does `pp-aterm` still exist?
As a Stratego programmer you will be looking a lot at raw ATerms. Stratego pioneers would do this by opening an ATerm file in _emacs_ and trying to get a sense of the structure by parenthesis highlighting and inserting newlines here and there. These days your life is much more pleasant through the tool `pp-aterm`, which adds layout to a term to make it readable. For example, parsing the following program

let function fact(n : int) : int =
if n < 1 then 1 else (n * fact(n - 1))
in printint(fact(10))
end

produces the following ATerm (say in file `fac.trm`):

Let([FunDecs([FunDec("fact",[FArg("n",Tp(Tid("int")))],Tp(Tid("int")),
If(Lt(Var("n"),Int("1")),Int("1"),Seq([Times(Var("n"),Call(Var("fact"),
[Minus(Var("n"),Int("1"))]))])))])],[Call(Var("printint"),[Call(Var(
"fact"),[Int("10")])])])

By pretty-printing the term using `pp-aterm` as

> TODO: Is this syntax still correct?
$ pp-aterm -i fac.trm -o fac-pp.trm --max-term-size 20

we get a much more readable term:

Let(
[ FunDecs(
[ FunDec(
"fact"
, [FArg("n", Tp(Tid("int")))]
, Tp(Tid("int"))
, If(
Lt(Var("n"), Int("1"))
, Int("1")
, Seq([ Times(Var("n"), Call(Var("fact"), [Minus(Var("n"), Int("1"))]))
])
)
)
]
)
]
, [ Call(Var("printint"), [Call(Var("fact"), [Int("10")])])
]
)


## 2.4. Signatures

To use terms in Stratego programs, their constructors should be declared in a signature. A signature declares a number of sorts and a number of constructors for these sorts. For each constructor, a signature declares the number and types of its arguments. For example, the following signature declares some typical constructors for constructing abstract syntax trees of expressions in a programming language:

signature
sorts Id Exp
constructors
: String -> Id
Var : Id -> Exp
Int : Int -> Exp
Plus : Exp * Exp -> Exp
Mul : Exp * Exp -> Exp
Call : Id * List(Exp) -> Exp

Currently, the Stratego compiler only checks the arity of constructor applications against the signature. Still, it is considered good style to also declare the types of constructors in a sensible manner for the purpose of documentation. Also, a later version of the language may introduce type checking.

0 comments on commit d1c193f

Please sign in to comment.