jolonf/SimpleParsec

SimpleParsec

Simple parser combinator library for Swift inspired by NimbleParsec for Elixir.

Each function in the library creates a Parser which can be used to create an AST from text.

For example, the string(_: String) -> Parser function returns a Parser which matches the exact string passed to it:

import SimpleParsec

let parser: Parser = string("def")

The above example constructs a parser which matches the exact string "def".

Parser is just a typealias for a function which we can call to perform the parsing. The Parser typealias is defined as:

public typealias Parser = (Substring) -> ParserResult

Therefore we need to invoke the parser with a substring (the text to be parsed) and we get a ParserResult.

let result = parser("def functionName")

The above will successfully parse the text as it begins with "def".
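
To make the idea of a function-typed Parser concrete, here is a minimal sketch of how a parser like string() could be written. It is not SimpleParsec's actual implementation: SketchAST, SketchResult, SketchParser and sketchString are hypothetical stand-ins, defined here so the snippet runs without the library.

```swift
// Hypothetical stand-ins for the library's types (assumptions, not the real API).
enum SketchAST: Equatable {
    case value(String)
}

enum SketchResult: Equatable {
    case ok(Substring, SketchAST?)
    case error(Substring, String)
}

typealias SketchParser = (Substring) -> SketchResult

// Returns a parser that succeeds only if the input begins with `expected`.
func sketchString(_ expected: String) -> SketchParser {
    return { input in
        guard input.starts(with: expected) else {
            // On failure, hand back the untouched input and a message.
            return .error(input, "expected \"\(expected)\"")
        }
        // On success, drop the matched prefix; the rest is still to be parsed.
        return .ok(input.dropFirst(expected.count), .value(expected))
    }
}

let sketchParser = sketchString("def")
let sketchResult = sketchParser("def functionName")
let sketchFailure = sketchParser("let x")
```

Because a parser is just a closure, building one and running it are two separate steps: sketchString("def") does no parsing until the returned closure is called with some text.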

Note that the Parser parameter type is Substring, not String. This keeps processing efficient throughout the parsers: a substring does not copy the string but instead refers to a range of indices within it. Swift automatically converts a string literal into a Substring; however, a String that has already been defined must be converted to a Substring first:

let text = "def name"
let result = parser(Substring(text))
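
The index-sharing behaviour of Substring can be seen with the standard library alone. The following sketch uses no SimpleParsec calls; the names source, rest, offset and copy are chosen only for illustration.

```swift
let source = "def name"

// Slicing a String yields a Substring that refers to the original storage.
let rest: Substring = source.dropFirst(4)

// The slice uses the same indices as the string it came from.
let offset = source.distance(from: source.startIndex, to: rest.startIndex)

// Convert back to String only when an independent copy is needed.
let copy = String(rest)
```

Here rest.startIndex is a valid index into source, which is what lets a parser pass along "the remaining text" without copying it.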

A parser returns a ParserResult which is an enum with two cases, either .ok or .error.

public enum ParserResult {
    case ok(Substring, AST?)
    case error(Substring, String)
}

Both cases include the remaining text that still needs to be parsed (the Substring). .ok also includes the AST constructed up to this point, whereas .error includes an error message. We can deconstruct the enum using if case let:

if case let .ok(remain, astOpt) = result,
    let ast = astOpt {
    print(ast)
}
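
When the failure case also matters, a switch can handle every outcome in one place. The sketch below uses hypothetical stand-in types (DemoAST, DemoResult) that only mirror the shape of ParserResult shown above, so it runs without the library; describe is likewise an illustrative helper, not part of SimpleParsec.

```swift
// Hypothetical stand-ins mirroring the shape of ParserResult (assumptions).
enum DemoAST {
    case value(String)
}

enum DemoResult {
    case ok(Substring, DemoAST?)
    case error(Substring, String)
}

// Summarises a result, covering success with an AST, success without one,
// and failure.
func describe(_ result: DemoResult) -> String {
    switch result {
    case .ok(let remain, .some(.value(let text))):
        return "matched \(text), \(remain.count) characters left"
    case .ok(let remain, .none):
        return "matched without AST, \(remain.count) characters left"
    case .error(_, let message):
        return "failed: \(message)"
    }
}
```

The compiler checks the switch for exhaustiveness, so the case where the result is .ok but the AST is nil cannot be forgotten.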

Note that astOpt is an optional, i.e. it can be nil even if the result is .ok. The AST can be nil for parsers such as ignore and optional where no match is okay or the result is not intended to be added to the AST. Here is a more complex example:

func functionHeader() -> Parser {
    tag(label: "function", concat([
       ignore(string("def")),
       ignore(iws()),
       tag(label: "functionName", alphaString()),
       ignore(string("(")),
       tag(label: "params", optional(params())),
       ignore(string(")"))
    ]))
}

func params() -> Parser {
    choose([
       concat([
         times(min: 1, concat([
            param(),
            ignore(string(",")),
            ignore(optional(iws()))
         ])),
         param()
       ]),
       param()
   ])
}

func param() -> Parser {
    tag(label: "param", alphaString())
}

let parser = functionHeader()

let result = parser("def myFunction(paramOne, paramTwo, paramThree)")

if case let .ok(_, ast) = result {
    print(ast!)
}

Outputs:

tag("function", SimpleParsec.AST.list([
    SimpleParsec.AST.tag("functionName", SimpleParsec.AST.value("myFunction")), 
    SimpleParsec.AST.tag("params", SimpleParsec.AST.list([
        SimpleParsec.AST.tag("param", SimpleParsec.AST.value("paramOne")), 
        SimpleParsec.AST.tag("param", SimpleParsec.AST.value("paramTwo")), 
        SimpleParsec.AST.tag("param", SimpleParsec.AST.value("paramThree"))
    ]))
]))
  • tag() adds a label to a nested parser result which can be used for processing the AST later.
  • ignore() will match its parser but not add the results to the AST.
  • concat() takes an array of parsers and ensures they all occur one after the other.
  • times() expects the parser to match repeatedly, at least the specified minimum number of times.
  • iws() is short for in-line whitespace, i.e. whitespace that doesn't include newlines (spaces and tabs only). It matches one or more characters. To match a single character, use iwsChar(). See also ws(), which also matches newlines, and its single-character version wsChar().
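
As an illustration of processing a tagged AST afterwards, the sketch below collects every value stored under a given label. The TreeSketch enum is a hypothetical stand-in whose cases (tag, list, value) are inferred from the printed output above; the library's real AST type may differ, and values(taggedAs:in:) is a helper invented for this example.

```swift
// Hypothetical AST shape inferred from the printed output (an assumption).
indirect enum TreeSketch {
    case tag(String, TreeSketch)
    case list([TreeSketch])
    case value(String)
}

// Recursively collects the values directly wrapped by tags named `label`.
func values(taggedAs label: String, in node: TreeSketch) -> [String] {
    switch node {
    case .tag(let name, .value(let text)) where name == label:
        return [text]
    case .tag(_, let child):
        // Other tags are transparent: keep searching inside them.
        return values(taggedAs: label, in: child)
    case .list(let children):
        return children.flatMap { values(taggedAs: label, in: $0) }
    case .value:
        return []
    }
}

// A tree with the same structure as the output shown above.
let tree: TreeSketch = .tag("function", .list([
    .tag("functionName", .value("myFunction")),
    .tag("params", .list([
        .tag("param", .value("paramOne")),
        .tag("param", .value("paramTwo")),
        .tag("param", .value("paramThree"))
    ]))
]))

let paramNames = values(taggedAs: "param", in: tree)
let functionNames = values(taggedAs: "functionName", in: tree)
```

Labels added with tag() are what make this kind of lookup possible without depending on the position of a node in the tree.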
