Skip to content
/ gll Public
forked from rust-lang/gll

GLL parsing framework.

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

anyska/gll

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

GLL parsing framework

Build Status Latest Version Rust Documentation

Usage

Easiest way to get started is through gll-macros:

[dependencies]
gll = "0.0.2"
gll-macros = "0.0.2"
proc-macro2 = "0.4.20"
extern crate gll;
extern crate gll_macros;
extern crate proc_macro2;

As an example, this is what you might write for a JSON-like syntax, that uses plain identifiers instead of string literals for field names, and allows values to be parenthesized Rust expressions:

mod json_like {
    ::gll_macros::proc_macro_parser! {
        Value =
            | Null:"null"
            | False:"false"
            | True:"true"
            | Literal:LITERAL
            | Array:{ "[" elems:Value* % "," "]" }
            | Object:{ "{" fields:Field* % "," "}" }
            | InterpolateRust:{ "(" TOKEN_TREE+ ")" }
            ;
        Field = name:IDENT ":" value:Value;
    }
}

You can also use a build script to generate the parser (TODO: document).

To parse a string with that grammar:

use proc_macro2::TokenStream;

let tokens: TokenStream = string.parse().unwrap();
json_like::Value::parse_with(tokens, |parser, result| {
    let value = result.unwrap();
    // ...
});

Grammar

All grammars contain a set of named rules, with the syntax Name = rule;. (the order between the rules doesn't matter)

Rules are made out of:

  • grouping, using {...}
  • string literals, matching input characters / tokens exactly
  • character ranges: 'a'..='d' is equivalent to "a"|"b"|"c"|"d"
    • only in scannerless mode
  • builtin rules: IDENT, PUNCT, LITERAL, TOKEN_TREE
    • only in proc macro mode
  • named rules, referred to by their name
  • concatenation: A B - "A followed by B"
  • alternation: A | B - "either A or B"
  • optionals: A? - "either A or nothing"
  • lists: A* - "zero or more As", A+ - "one or more As"
    • optional separator: A* % "," - "comma-separated As"

Parts of a rule can be labeled with field names, to allow later access to them:

LetDecl = "let" pat:Pat { "=" init:Expr }? ";"; produces:

// Note: generic parameters omitted for brevity.
struct LetDecl {
    pat: Handle<Pat>,
    init: Option<Handle<Expr>>,
}

One Rust-specific convention is that alternation fields are enum variants.

Expr = Lit:LITERAL | Add:{ a:Expr "+" b:Expr }; produces:

enum Expr {
    Lit(Handle<LITERAL>),
    Add {
        a: Handle<Expr>,
        b: Handle<Expr>,
    },
}

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

About

GLL parsing framework.

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Rust 100.0%