Upgrade to Lezer system 1.0.0 #18

Merged
merged 8 commits into from
Jun 30, 2022

Conversation

zampino
Contributor

@zampino zampino commented Jun 27, 2022

This upgrades Lezer to 1.0.0.

We also move the package name to @lezer/clojure to be consistent with how other grammars are now published.

It also removes the constructor call syntax to work around the error: Inconsistent skip sets after "#" ident.

Closes #13, closes #14 and closes #15. Fixes #16.

@zampino
Contributor Author

zampino commented Jun 27, 2022

If I move conflicting prefixed terms out of the @skip{}{...} block and allow ambiguity to be resolved later like this:

DataLiteral { dataLiteral ~amb }
ConstructorPrefix[prefixEdge] { dataLiteral ~amb }
Constructor[prefixColl] { ConstructorPrefix (Vector | Map) }

then the inconsistency moves to the ignore-next-form ("#_") rule:

Inconsistent skip sets after "#_"

If I remove Discard from the top skip set, the parser builds (but the 2 tests covering Discard expressions fail, of course).

@zampino
Contributor Author

zampino commented Jun 28, 2022

Notes on a minimal grammar which reproduces issue #16.

@top Program { expression+ }
@skip { whitespace | Skipped }
Skipped { "#_" expression }

expression { A | B | ReaderTag | Constructor }
A { "a" }
B { "b" }

@skip{}{
  ReaderTag { "#" ident ~amb }
  prefix { "#" ident ~amb }
}

Constructor { prefix B }

@tokens {
  "a" "b"
  ident { "*" }
  whitespace { std.whitespace+ }
}
@detectDelim

Build with lezer-generator src/mini.grammar -o src/mini. The grammar builds and the tests below pass with lezer-generator v0.13.4:

import {parser} from './src/mini.js'
// With @lezer/generator 1.0.0, import from "@lezer/generator/dist/test" instead
import {testTree} from "lezer-generator/dist/test"

let p = parser.configure({strict: true})

testTree(p.parse("a"),           "Program(A)")
testTree(p.parse("a #_ a"),      "Program(A, Skipped(A))")
testTree(p.parse("a #_ #_ b b"), "Program(A, Skipped(Skipped(B), B))")
testTree(p.parse("a #* a"),      "Program(A, ReaderTag, A)")
testTree(p.parse("a #* b"),      "Program(A, Constructor( B ))")

but it does not build with @lezer/generator 1.0.0, which fails with the error

Inconsistent skip sets after "#_" "#" ident

The offending expression is Constructor: if we remove it from the valid expressions, the parser can be generated and all tests pass except, of course, the Constructor one.

@zampino
Contributor Author

zampino commented Jun 28, 2022

Asked about this on the Lezer discussion forum.

@zampino
Contributor Author

zampino commented Jun 29, 2022

Given @marijnh's answer, I think maybe we shouldn't handle reader tags as expressions (DataLiteral) on their own, unless we had a specific reason for doing so in the past? This is also reflected by our current highlighting and navigation behaviour (clojure-mode on the left):

[Two screenshots: CleanShot 2022-06-29 at 10 44 31@2x, CleanShot 2022-06-29 at 10 42 53@2x]

which doesn't seem right. Maybe we should introduce a compound WithReaderTag expression like this:

@top Program { expression+ }
@skip { whitespace | Skipped }
Skipped { "#_" expression }

@precedence { ct @right }
expression { A | B | WithReaderTag | Constructor }
A { "a" }
B { "b" }

@skip{}{
  readertag { "#" ident  }
  constructortag { !ct "#" ident  }
}

Constructor { constructortag B }
WithReaderTag { readertag expression }

@tokens {
  "a" "b"
  ident { "*" }
  whitespace { std.whitespace+ }
}

@detectDelim

This seems to comply with the latest Lezer:

import {parser} from './src/mini.js'
import {testTree} from "@lezer/generator/dist/test"
// With lezer-generator 0.13.x, import from "lezer-generator/dist/test" instead

let p = parser.configure({strict: true})

testTree(p.parse("a"),           "Program(A(a))")
testTree(p.parse("a #_ a"),      "Program(A(a), Skipped(A(a)))")
testTree(p.parse("a #_ #_ b b"), "Program(A(a), Skipped( Skipped(B(b)) , B(b) ))")

testTree(p.parse("a #* a"),      "Program(A(a), WithReaderTag(A(a)))")
testTree(p.parse("a #* b"),      "Program(A(a), Constructor(B(b)))")

testTree(p.parse("a #* #* a"),   "Program(A(a), WithReaderTag(WithReaderTag(A(a))))")
testTree(p.parse("a #* #* b"),   "Program(A(a), WithReaderTag(Constructor(B(b))))")

testTree(p.parse("a #_ #* a"),   "Program(A(a), Skipped(WithReaderTag(A(a))))")
testTree(p.parse("a #_ #* b"),   "Program(A(a), Skipped(Constructor( B(b) )) )")
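
To make the intended tree shapes easier to check without building the grammar, here is a small hand-written sketch of the same mini language. It is not Lezer's LR algorithm and not the generated parser: parseMini is a hypothetical recursive-descent stand-in that assumes the precedence marker makes a tag followed by b resolve to Constructor, and that Skipped nodes attach to whichever node is being parsed when they occur. The output mimics the strings used in the tests above, with whitespace removed.

```javascript
// Hypothetical stand-in for the generated parser (NOT Lezer's algorithm).
// Returns a tree string in the same shape as the testTree specs above.
function parseMini(input) {
  let pos = 0

  // Consume whitespace and "#_ expression" forms, mirroring
  // `@skip { whitespace | Skipped }`. Each skipped form is recorded
  // as a child of the node currently being built.
  function skip(children) {
    for (;;) {
      while (pos < input.length && /\s/.test(input[pos])) pos++
      if (!input.startsWith("#_", pos)) return
      pos += 2
      const inner = []
      skip(inner) // skipping is also allowed inside Skipped
      inner.push(expression())
      children.push(`Skipped(${inner.join(",")})`)
    }
  }

  // expression { A | B | WithReaderTag | Constructor }
  function expression() {
    if (input[pos] === "a") { pos++; return "A(a)" }
    if (input[pos] === "b") { pos++; return "B(b)" }
    if (input.startsWith("#*", pos)) {
      pos += 2 // "#" and ident are adjacent: no skipping inside the tag
      const children = []
      skip(children)
      // Assumption: a tag directly followed by B is a Constructor;
      // anything else wraps the next expression in WithReaderTag.
      if (input[pos] === "b") {
        pos++
        children.push("B(b)")
        return `Constructor(${children.join(",")})`
      }
      children.push(expression())
      return `WithReaderTag(${children.join(",")})`
    }
    throw new SyntaxError(`Unexpected input at offset ${pos}`)
  }

  // @top Program { expression+ }
  const children = []
  let count = 0
  for (;;) {
    skip(children)
    if (pos >= input.length) break
    children.push(expression())
    count++
  }
  if (count === 0) throw new SyntaxError("Program needs at least one expression")
  return `Program(${children.join(",")})`
}

for (const src of ["a #* a", "a #* b", "a #_ #* b"]) {
  console.log(src, "=>", parseMini(src))
}
```

Under those assumptions the sketch yields the same shapes as the specs listed above, for example WithReaderTag(Constructor(B(b))) for "a #* #* b". Whether the generated parser agrees in every case still has to be confirmed by running the testTree calls against the built grammar.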

@zampino zampino marked this pull request as ready for review June 30, 2022 09:42
@mk mk merged commit bc0789b into nextjournal:master Jun 30, 2022
@mk mk deleted the lezer-1.0.0 branch June 30, 2022 09:43
Successfully merging this pull request may close these issues.

Inconsistent skip sets after "#" ident
2 participants