Where to find stuff
- Yuvi Masory
- https://github.com/ymasory/ToyLisp (code)
- https://github.com/ymasory/ToyLisp/README.md (language handout)
- https://github.com/ymasory/ToyLisp/PHASE-outline.md (talk outline handout)
- Break the barrier! language user -> language tinkerer
- Give an intro to the parser combinators API.
- Give an intro to evaluation.
- Who is comfortable with basic regular expressions?
- Who could sit down and write an interpreter?
- Who could sit down and write a compiler?
Overview of Lisp
- Lisp is easy to parse.
- Lisp is easy to evaluate.
- I'm not a Lisper.
- Simple literals:
[1 2 3]
- Function call:
(+ 2 2)
- Anonymous function:
(lambda [x] (* x 2))
- Assignment:
(set! timestwo (lambda [x] (* x 2)))
We want to interpret this program:
(set! - (lambda [x y] (+ x (opp y))))
(- 10 15)
Overview of Interpretation
1. Sequence of characters -> tokens (lexical parsing).
2. Sequence of tokens -> parse trees (syntactic parsing).
3. Parse trees -> abstract syntax trees (AST).
4. AST -> value (evaluation).
A value is an expression that cannot be evaluated any further.
The parser (lisp-speak: the reader) does 1-3. The evaluator (lisp-speak: the
eval function) does 4.
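The four stages can be sketched end to end without any parsing library. This is a minimal illustration, not the ToyLisp code: the names `StagesSketch`, `tokenize`, `parse`, `Atom`, and `Call` are hypothetical, only round brackets, integers, and `+` are handled, and parse trees and ASTs are collapsed into one tree type for brevity.

```scala
// Hypothetical names throughout; a sketch of the stages, not the ToyLisp implementation.
object StagesSketch {
  // Stage 1: characters -> tokens. Pad brackets with spaces, then split on whitespace.
  def tokenize(src: String): List[String] =
    src.replace("(", " ( ").replace(")", " ) ")
      .trim.split("\\s+").toList.filter(_.nonEmpty)

  // Stages 2-3: tokens -> tree (parse tree and AST are the same type in this sketch).
  sealed trait Tree
  case class Atom(text: String) extends Tree
  case class Call(items: List[Tree]) extends Tree

  // Parses one form and returns it together with the tokens it did not consume.
  // A missing closing bracket is silently tolerated here to keep the sketch short.
  def parse(tokens: List[String]): (Tree, List[String]) = tokens match {
    case "(" :: rest =>
      var items = List.empty[Tree]
      var remaining = rest
      while (remaining.nonEmpty && remaining.head != ")") {
        val (item, next) = parse(remaining)
        items = items :+ item
        remaining = next
      }
      (Call(items), remaining.drop(1)) // drop the closing ")"
    case atom :: rest => (Atom(atom), rest)
    case Nil => sys.error("unexpected end of input")
  }

  // Stage 4: tree -> value, supporting only integer literals and "+".
  def eval(tree: Tree): Int = tree match {
    case Atom(n) => n.toInt
    case Call(Atom("+") :: args) => args.map(eval).sum
    case other => sys.error("cannot evaluate " + other)
  }
}
```

With these definitions, `StagesSketch.eval(StagesSketch.parse(StagesSketch.tokenize("(+ 2 2)"))._1)` yields `4`.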
`interpret` is a procedure that transforms the program text into a value.
`interpret` feeds the source text to the `read` function, of type `String => ToyList`.
The resulting `ToyList`, along with an empty `Environment`, is passed to `eval`, which is of type `(ToyList, Environment) => (ToyForm, Environment)`.
`interpret` then takes the last `ToyForm` of the program and prints the result.
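The wiring between `read`, `eval`, and `interpret` might look roughly like the sketch below. The type names `ToyForm`, `ToyList`, and `Environment` come from the outline, but their definitions here are assumptions: `ToyInt`, `ToySymbol`, the `Map`-based environment, and the helper `evalForm` are hypothetical. Only integer literals, symbol lookup, and `set!` are evaluated, and `read` is passed in as a parameter so the example does not need a parser.

```scala
// A sketch under stated assumptions; the real ToyLisp types may differ.
object InterpretSketch {
  sealed trait ToyForm
  case class ToyInt(value: Int) extends ToyForm
  case class ToySymbol(name: String) extends ToyForm
  case class ToyList(forms: List[ToyForm]) extends ToyForm

  // Assumption: an environment is an immutable map from names to values.
  type Environment = Map[String, ToyForm]

  // Evaluates a single form, returning its value and the (possibly extended) environment.
  def evalForm(form: ToyForm, env: Environment): (ToyForm, Environment) = form match {
    case ToyInt(_) => (form, env)
    case ToySymbol(name) =>
      (env.getOrElse(name, sys.error("unbound symbol: " + name)), env)
    case ToyList(List(ToySymbol("set!"), ToySymbol(name), expr)) =>
      val (value, env2) = evalForm(expr, env)
      (value, env2 + (name -> value))
    case other => sys.error("unsupported form in this sketch: " + other)
  }

  // Evaluates each top-level form in order, threading the environment through,
  // and returns the value of the last form.
  def eval(program: ToyList, env: Environment): (ToyForm, Environment) =
    program.forms.foldLeft((ToyList(Nil): ToyForm, env)) {
      case ((_, currentEnv), form) => evalForm(form, currentEnv)
    }

  // Reads the source, evaluates it in an empty environment, and prints the last value.
  def interpret(read: String => ToyList, source: String): ToyForm = {
    val (last, _) = eval(read(source), Map.empty)
    println(last)
    last
  }
}
```

Returning the environment alongside the value lets a `set!` in one top-level form be visible to the forms after it without any mutable state.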
Why use parser combinators
Hand-written parsers
- Advantages: fast, perfectly customized to your needs, can reasonably integrate lexing and parsing
- Disadvantages: difficult to write, difficult to maintain, lack formal results
Parsers generated by parser generators
- Advantages: relatively fast, formal results
- Disadvantages: steep learning curve, separate tools for lexical parsing (JFlex) and syntactic parsing (ANTLR), require extensive customization to produce exactly the kinds of objects you want, will not produce idiomatic Scala objects
Parser combinators
- Advantages: fast to write, easy to learn, bridge lexical and syntactic parsing, compositional, maintainable, high-level look-alike of EBNF, idiomatic Scala
- Disadvantages: extremely slow, lack formal results
Top-level types and methods
- `Parser[X]` is the type of a parser that turns some input into `X`s. It's a function object so it has an `apply` method. It also has some combinators unique to it.
- Our parser (er, reader) will be a `Parser[ToyList]`.
- `Parsers.parseAll` parses all of its input or fails. Contrast with `parse`, which only has to match a prefix of the input.
- But how do we get our very first `Parser`? We don't want to subclass `Parser` ourselves.
- `RegexParsers` gives you implicit conversions from regexes or strings to `Parser`s.
- `JavaTokenParsers` gives you `Parser` objects for various Java tokens.
- FYI: you need to override `skipWhitespace`, otherwise you will lack fine-grained control over whitespace.
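A small sketch of these entry points, assuming the `scala-parser-combinators` module is on the classpath (it has shipped separately from the standard library since Scala 2.11). The object name `FirstParserSketch` and the parsers `number` and `plus` are made up for illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical example object; not part of ToyLisp.
object FirstParserSketch extends RegexParsers {
  // Turn off automatic whitespace skipping so whitespace must be handled explicitly.
  override def skipWhitespace: Boolean = false

  // A Regex is implicitly converted to a Parser[String]...
  val number: Parser[String] = """\d+""".r
  // ...and so is a plain String literal.
  val plus: Parser[String] = "+"

  // parse only needs a matching prefix; parseAll must consume the entire input.
  def prefixOnly(in: String): ParseResult[String] = parse(number, in)
  def wholeInput(in: String): ParseResult[String] = parseAll(number, in)
}
```

Here `prefixOnly("42 rest")` succeeds with `"42"`, while `wholeInput("42 rest")` fails because input remains. With `skipWhitespace` overridden to `false`, `prefixOnly(" 42")` also fails, since the leading space is no longer skipped.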
- A combinator is a function that takes two elements from some domain, and returns another element from that same domain.
- A parser combinator therefore takes two parsers and gives you a new parser.
- Sequential combinators: `parser1 ~ parser2`, `parserIgnored ~> parser`, `parser <~ parserIgnored`. `~!` guarantees no backtracking.
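- Optional combinator: `opt(parser)`.
- Repetition combinators: `rep(parser)`, `rep1(parser)`, `repsep(parser, separator)`.
- Alternative combinators: `parser1 | parser2`, `parser1 ||| parser2` (longest match).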
- Debugging: `log(parser)(name)` prints the parsing attempts and their results as they happen.
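The combinators listed above can be tried out in a few lines. This is an illustrative sketch that assumes the `scala-parser-combinators` module is available; `CombinatorTour` and all parser names in it are hypothetical, and whitespace skipping is left at its default (on) to keep the example short.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical example object; not part of ToyLisp.
object CombinatorTour extends RegexParsers {
  val number: Parser[String] = """\d+""".r
  val word: Parser[String] = """[a-z]+""".r

  // Sequence: ~ keeps both results, paired in a ~ value.
  val pair: Parser[String ~ String] = word ~ number

  // ~> keeps only the right result, <~ keeps only the left one.
  val bracketed: Parser[String] = literal("[") ~> number <~ literal("]")

  // Optional: opt wraps the result in an Option.
  val signed: Parser[Option[String] ~ String] = opt(literal("-")) ~ number

  // Repetition: rep collects zero or more results into a List.
  val numbers: Parser[List[String]] = rep(number)

  // Repetition with a separator that is dropped from the result.
  val csv: Parser[List[String]] = repsep(number, literal(","))

  // Alternative: try the left parser first, then the right one.
  val atom: Parser[String] = number | word

  // log wraps a parser and prints a trace to standard output whenever it is tried.
  val traced: Parser[String] = log(number)("number")
}
```

Every combinator returns another `Parser`, so the pieces can be nested freely, for example `rep(bracketed | atom)`.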
Mapping parser outputs
- We need `ToyForm` objects as the output of our parsers! That's what `^^` is for. It also has a variant, `^?`, for partial functions.
- Antipattern: manipulating the parsed string inside the transformation function.
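As a rough illustration of `^^` and `^?`, the sketch below maps parser results into a small form hierarchy. It assumes the `scala-parser-combinators` module is available. `ToyForm` and `ToyList` are names from the outline, but their definitions here, along with `ToyInt`, `ToySymbol`, `MappingSketch`, and the grammar itself, are assumptions made for the example.

```scala
import scala.util.parsing.combinator.RegexParsers

// A sketch under stated assumptions; the real ToyLisp reader may differ.
object MappingSketch extends RegexParsers {
  sealed trait ToyForm
  case class ToyInt(value: Int) extends ToyForm
  case class ToySymbol(name: String) extends ToyForm
  case class ToyList(forms: List[ToyForm]) extends ToyForm

  val digits: Parser[String] = """\d+""".r

  // ^^ transforms a successful result: here the matched String becomes a ToyForm.
  val int: Parser[ToyForm] = """-?\d+""".r ^^ { text => ToyInt(text.toInt) }
  val symbol: Parser[ToyForm] = """[a-zA-Z+*!?-]+""".r ^^ { name => ToySymbol(name) }

  // ^? takes a partial function; the parse fails where the function is undefined.
  val smallInt: Parser[ToyForm] =
    digits ^? { case text if text.length <= 3 => ToyInt(text.toInt) }

  // Recursive parsers are defs so they can refer to each other.
  // The list structure comes from combinators, not from splitting a matched
  // String inside the ^^ function (the antipattern mentioned above).
  def form: Parser[ToyForm] = int | symbol | list
  def list: Parser[ToyForm] =
    literal("(") ~> rep(form) <~ literal(")") ^^ { forms => ToyList(forms) }
}
```

Because `^^` binds more loosely than `~>` and `<~`, the function in `list` receives the `List[ToyForm]` produced by `rep(form)` after both brackets have been discarded.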
Sorry no outline for this :(