Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Silently ignores circular rules #3

Open
safinaskar opened this issue Jul 10, 2022 · 1 comment
Open

Silently ignores circular rules #3

safinaskar opened this issue Jul 10, 2022 · 1 comment

Comments

@safinaskar
Copy link

Consider this code:

use santiago::lexer::LexerRules;
use santiago::grammar::Grammar;

pub fn lexer_rules() -> LexerRules {
    santiago::lexer_rules!(
        "DEFAULT" | "A" = string "a";
    )
}

pub fn grammar() -> Grammar<()> {
    santiago::grammar!(
        "X" => lexemes "A";
        "X" => rules "X";
    )
}

fn main() {
    let input = "a";
    let lexer_rules = lexer_rules();
    let lexemes = santiago::lexer::lex(&lexer_rules, input).unwrap();
    let grammar = grammar();
    let parse_trees = santiago::parser::parse(&grammar, &lexemes).unwrap();
    println!("{:?}", parse_trees);
}

I use santiago 1.3.1 .

This grammar is infinitely ambiguous on this input, so the program should (theoretically) output this infinite output:

[Γ := rules "X"
  X := lexemes "A"
    A "a" (1, 1)
, Γ := rules "X"
  X := rules "X"
    X := lexemes "A"
      A "a" (1, 1)
, ...

But instead the program prints one parse tree only:

[Γ := rules "X"
  X := lexemes "A"
    A "a" (1, 1)
]

I think this is a bug.

Of course, in practice this is impossible to return infinite vector, so the library should somehow report error. But the library simply returns one parse tree instead. This probably means that there is some bug in the library. So I lose trust in this library

@safinaskar
Copy link
Author

Even worse bug! I found another bug. I suspect both bugs caused by one root cause. Consider this code:

use santiago::lexer::LexerRules;
use santiago::grammar::Grammar;

pub fn lexer_rules() -> LexerRules {
    santiago::lexer_rules!(
        "DEFAULT" | "ID" = pattern r"[a-z]+";
        "DEFAULT" | "==>" = string "==>";
        "DEFAULT" | "::" = string "::";
    )
}

pub fn grammar() -> Grammar<()> {
    santiago::grammar!(
        "t0" => rules "t1";
        "t1000" => lexemes "ID";
        "t3" => rules "t4" "dotdot" "t0";
        "t1" => rules "t2" "arr" "t1";
        "t1" => rules "t2";
        "t2" => rules "t3";
        "t3" => rules "t4";
        "t4" => rules "t1000";
        "dotdot" => lexemes "::";
        "arr" => lexemes "==>";
    )
}

fn main() {
    let input = "x::y==>z";
    let lexer_rules = lexer_rules();
    let lexemes = santiago::lexer::lex(&lexer_rules, input).unwrap();
    let grammar = grammar();
    let parse_trees = santiago::parser::parse(&grammar, &lexemes).unwrap();
    println!("{:?}", parse_trees);
}

x::y==>z allows two parse trees: (x::y)==>z and x::(y==>z). But this program prints one only:

[Γ := rules "t0"
  t0 := rules "t1"
    t1 := rules "t2" "arr" "t1"
      t2 := rules "t3"
        t3 := rules "t4" "dotdot" "t0"
          t4 := rules "t1000"
            t1000 := lexemes "ID"
              ID "x" (1, 1)
          dotdot := lexemes "::"
            :: "::" (1, 2)
          t0 := rules "t1"
            t1 := rules "t2"
              t2 := rules "t3"
                t3 := rules "t4"
                  t4 := rules "t1000"
                    t1000 := lexemes "ID"
                      ID "y" (1, 4)
      arr := lexemes "==>"
        ==> "==>" (1, 5)
      t1 := rules "t2"
        t2 := rules "t3"
          t3 := rules "t4"
            t4 := rules "t1000"
              t1000 := lexemes "ID"
                ID "z" (1, 8)
]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant