Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
tree: 8d88f3af66
Fetching contributors…

Cannot retrieve contributors at this time

99 lines (79 sloc) 4.218 kb

Intro

Where to find stuff

My goals

  • Break the barrier! language user -> language tinkerer
  • Give an intro to the parser combinators API.
  • Give an intro to evaluation.

Audience background

  • Who is comfortable with basic regular expressions?
  • Who could sit down and write an interpreter?
  • Who could sit down and write a compiler?

Overview of Lisp

Why Lisp?

  • Lisp is easy to parse.
  • Lisp is easy to evaluate.
  • I'm not a Lisper.

ToyLisp

Available forms

  • Simple literals: "a string", 'c', 3
  • List: [1 2 3]
  • Function call: (+ 2 2)
  • Lambda: (lambda [x] (* x 2))
  • Assignment: (set! timestwo (lambda [x] (* x 2)))

Our goal

We want to interpret this program:

(set! -
      (lambda [x y]
        (+ x (opp y))))

(- 10 15)

Overview of Interpretation

Conceptual overview

  1. Sequence of characters -> tokens. (lexical parsing).
  2. Sequence of tokens -> parse trees. (syntactic parsing).
  3. Parse trees -> abstract syntax trees (AST).
  4. AST -> value. (evaluation).

A value is an expression that cannot be evaluated any further.

The parser (lisp-speak: the reader) does 1-3. The evaluator (lisp-speak: the eval function) does 4.

Scala overview

  • interpret is a procedure that transforms the program text into a value.
  • interpret feeds source text the read function of type String => ToyList.
  • interpret gives the ToyList, along with an empty Environment, to eval which is of type (ToyList, Environment) -> (ToyForm, Environment).
  • interpret then takes the last ToyForm of the program and prints the result.

Parsing

Why use parser combinators

Hand-written parsers

  • Advantages: fast, perfectly customized to your needs, can reasonably integrate lexing and parsing
  • Disadvantages: difficult to write, difficult to maintain, lack formal results

Parsers generated by parser generators

  • Advantages: relatively fast, formal results
  • Disadvantages: steep learning curve, separate tools for lexical parsing (JFlex) and syntactic parsing (ANTLR), require extensive customization to produce exactly the kinds of objects you want, will not produce idiomatic Scala objects

Parser combinators

  • Advantages: fast to write, easy to learn, bridge lexical and syntactic parsing, compositional, mantainable, high-level look-alike of EBNF, idiomatic Scala
  • Disadvantages: extremely slow, lack formal results

Top-level types and methods

  • Parser[X] is the type of a class for some input into Xs. It's a function object so it has an apply method. It also has some combinators unique to it.
  • Our parser (er, reader) will be a Parser[ToyList].
  • Parsers.parseAll parses all of its input or fails. Contrast with Parsers.parse.
  • But how do we get our very first Parser? We don't want to subclass Parser.
  • RegexParsers gives you implicit conversions from regexes or Strings to Parser objects.
  • JavaTokenParsers gives you Parser objects for various Java tokens.
  • FYI: you need to override skipWhitespace otherwise you will lack fine grained control over whitespace.

Combinators

  • A combinator is a function that takes two elements from some domain, and returns another element from that same domain.
  • A parser combinator therefore takes two parsers and gives you a new parser.
  • Sequential combinators: parser1 ~ parser2, parserIgnored ~> parser, parser <~ parserIgnored, ~! guarantees no backtracking
  • Optional combinator: parser?
  • Repetition combinators: parser*, parser+
  • Alternative combinators: |, |||
  • log(parser) prints the parsing

Mapping parser outputs

  • We need ToyForm objects as the output of our parsers! That's what ^^ is for. It also has a variant ^? for partial functions.
  • Antipattern: manipulating the parsed string inside the transformation function.

Evaluation

Sorry no outline for this :(

Jump to Line
Something went wrong with that request. Please try again.