Draft: Introduce new `%error` handler mode and `catch` for resumable parsing #272

sgraf812 · 2024-02-02T15:33:51Z

A drafty PoC serving as motivation for a GSoC proposal; I don't want to see this merged or reviewed for now.

Consider this excerpt from an example (errormonad-resume.y):

%name parseStmts Stmts
%tokentype { Token }
%error { \_ -> abort } { reportError } -- the entire point of this test

%monad { ParseM } { (>>=) } { return }

%token
  '1' { TOne }
  '+' { TPlus }
  ';' { TSemi }

%%

Stmts : {- empty -}           { [] }
      | Stmt                  { [$1] }
      | Stmts ';' Stmt        { $1 ++ [$3] }
      | catch ';' Stmt %shift { [$3] } -- Could insert error AST token here in place of $1

Stmt : Exp { ExpStmt $1 }

Exp : '1'                { One }
    | Exp '+' Exp %shift { Plus $1 $3 }

{
recordParseError :: [String] -> ParseM ()
recordParseError expected = recordError [ParseError expected]

reportError :: [Token] -> [String] -> ([Token] -> ParseM a) -> ParseM a
reportError ts expected resume = do
  recordParseError expected
  resume ts

The point of this example is that reporting a parse error (i.e., adding a diagnostic) is independent of aborting a parse (i.e., a fatal error that can't produce a syntax tree). Hence the new %error form takes two code blocks: one for the abort handler, the other for the report handler.

Additionally, the report handler gets a resume continuation that it may call to resume parsing; otherwise it would simply have to abort (TODO: In the current encoding it should perhaps simply be ParseM () and we call the resumption unconditionally).
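
The excerpt uses `ParseM`, `recordError` and `abort` without showing their definitions. As a rough sketch only, here is one way such a monad could be encoded, assuming diagnostics accumulate in a list and an aborted parse is represented by `Nothing`. The representation, the `ParseError` type and the two example values are assumptions made for illustration, not the definitions used in the PR, and `reportError` is generalised over the token type so the sketch stands alone.

```haskell
-- Hypothetical stand-in for the diagnostic type used in the excerpt.
newtype ParseError = ParseError [String]
  deriving (Eq, Show)

-- Assumed encoding: diagnostics accumulate on the side, and the result
-- becomes Nothing once the parse has been aborted.
newtype ParseM a = ParseM { runParseM :: ([ParseError], Maybe a) }

instance Functor ParseM where
  fmap f (ParseM (es, r)) = ParseM (es, fmap f r)

instance Applicative ParseM where
  pure x = ParseM ([], Just x)
  ParseM (es1, mf) <*> ParseM (es2, mx) = case mf of
    Nothing -> ParseM (es1, Nothing)          -- aborted: ignore the rest
    Just f  -> ParseM (es1 ++ es2, fmap f mx)

instance Monad ParseM where
  ParseM (es, Nothing) >>= _ = ParseM (es, Nothing)
  ParseM (es, Just x)  >>= k =
    let ParseM (es', r) = k x in ParseM (es ++ es', r)

-- Reporting: add diagnostics, keep going.
recordError :: [ParseError] -> ParseM ()
recordError es = ParseM (es, Just ())

-- Aborting: no result, but earlier diagnostics survive.
abort :: ParseM a
abort = ParseM ([], Nothing)

-- Same shape as the report handler in the excerpt, with the token type
-- left polymorphic (the excerpt fixes it to Token).
reportError :: [tok] -> [String] -> ([tok] -> ParseM a) -> ParseM a
reportError ts expected resume = do
  recordError [ParseError expected]
  resume ts

-- Reporting and then resuming keeps both the diagnostic and a result.
reportThenResume :: ([ParseError], Maybe Int)
reportThenResume =
  runParseM (reportError "xy" ["';'"] (\ts -> pure (length ts)))

-- Reporting and then aborting keeps the diagnostic but yields no result.
reportThenAbort :: ([ParseError], Maybe Int)
reportThenAbort = runParseM (recordError [ParseError ["'1'"]] >> abort)
```

Under this encoding, `reportThenResume` carries one diagnostic together with a `Just` result, while `reportThenAbort` carries one diagnostic and `Nothing`, which is the separation between reporting and aborting described above.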

Where does the parser resume parsing? That is mostly up to the user to specify, through use of the special catch terminal. In the example above, catch occurs before a ; is shifted, so that when an error occurs while parsing a Stmt, the parser will "unwind" the stack until it finds a situation in which catch can be shifted. After having done that, it will discard input until it finds the next ; so as to resume parsing.

The result is that for inputs such as 1++1;1;+, two errors (and a partial syntax tree) can be reported: One at the second + and the other at the third.
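
To make the control flow concrete, here is a small hand-written model of the strategy: report the error, skip input up to the next `;`, and resume. It is only an approximation written for illustration. It does not use happy's LR tables or the `catch` terminal, it does not reproduce the grammar exactly (for example, a leading `;` is treated as an error), and the tree it keeps may differ from what the generated parser returns. All function names are hypothetical, and error positions are 0-based character offsets, with `Nothing` standing for end of input.

```haskell
import Data.Maybe (listToMaybe)

data Token = TOne | TPlus | TSemi deriving (Eq, Show)
data Exp = One | Plus Exp Exp deriving (Eq, Show)
newtype Stmt = ExpStmt Exp deriving (Eq, Show)

type Pos = Int

-- Tag every recognised character with its offset; other characters are skipped.
lexer :: String -> [(Pos, Token)]
lexer s = [ (i, t) | (i, c) <- zip [0 ..] s, Just t <- [tok c] ]
  where
    tok '1' = Just TOne
    tok '+' = Just TPlus
    tok ';' = Just TSemi
    tok _   = Nothing

-- Exp : '1' | '1' '+' Exp. On failure, return the offending position.
parseExp :: [(Pos, Token)] -> Either (Maybe Pos) (Exp, [(Pos, Token)])
parseExp ((_, TOne) : (_, TPlus) : rest) = do
  (rhs, rest') <- parseExp rest
  Right (Plus One rhs, rest')
parseExp ((_, TOne) : rest) = Right (One, rest)
parseExp bad = Left (listToMaybe (map fst bad))

-- Returns the positions of all reported errors and the statements kept.
parseStmts :: [(Pos, Token)] -> ([Maybe Pos], [Stmt])
parseStmts [] = ([], [])
parseStmts ts = stmtThenRest ts

stmtThenRest :: [(Pos, Token)] -> ([Maybe Pos], [Stmt])
stmtThenRest ts = case parseExp ts of
  Right (e, []) -> ([], [ExpStmt e])
  Right (e, (_, TSemi) : more) ->
    let (errs, ss) = stmtThenRest more in (errs, ExpStmt e : ss)
  Right (_, (p, _) : _) -> recover (Just p) ts
  Left pos -> recover pos ts

-- Report the error, discard input up to and including the next ';',
-- then resume. If no ';' is left, stop with what has been collected.
recover :: Maybe Pos -> [(Pos, Token)] -> ([Maybe Pos], [Stmt])
recover errPos ts = case dropWhile ((/= TSemi) . snd) ts of
  [] -> ([errPos], [])
  (_ : more) ->
    let (errs, ss) = stmtThenRest more in (errPos : errs, ss)
```

In this model, `parseStmts (lexer "1++1;1;+")` reports errors at offsets 2 and 7, i.e. the second and third `+`, and keeps the single statement `ExpStmt One` that sits between them.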

I've already tried to apply this patch to GHC, and it seems to work; see https://gitlab.haskell.org/ghc/ghc/-/merge_requests/11990 for a worked example.

God, this was annoying

sgraf812 · 2024-02-05T08:43:42Z

For context, I proposed bringing this PR into a mergeable state and applying the result to GHC as a GSoC proposal.

kd1729 · 2024-03-04T12:02:18Z

Hi @sgraf812, I went through this issue: https://summer.haskell.org/ideas.html#parse-error-recovery
I am interested in this and want to take it up for GSoC'24. I am going to write a draft proposal for it and would like you to review it. Any guidelines or suggestions are appreciated.

sgraf812 · 2024-03-06T13:40:16Z

Hi Kaustubh, that's great! Perhaps it's good to have a short chat in private to see whether you would actually enjoy working on this project.

For example,

What is your background in Haskell?
Do you maintain any open source projects in Haskell or related to compilers?
Have you previously used parser generators such as bison or yacc?
Have you attended any previous classes on compiler engineering at your university?
What kind of improvements to happy would you find exciting to have that are not listed in the GSoC proposal, and why?

alinab · 2024-03-16T09:00:16Z

@sgraf812 It would be great if I could ask a few questions on the scope of this work. Would setting up a chat work for this? If so, please just let me know how. Thanks.

xevor11 · 2024-03-19T01:58:18Z

I'm a potential GSoC contributor and I am interested in this project! I am currently taking a compiler course at my university along with a functional programming course in Haskell. My thesis project revolves around extending the Cool Compiler (a subset of Scala, utilized by Stanford) with LLVM as its backend and utilizing ANTLR. I have just recently worked with yacc in building a lexer and scala bison (a version of bison) in building a parser for the aforementioned language. Last year, I was a GSoC contributor for the GNU organization, working on adding support for the Hurd OS to the Rust Compiler. Since I am a beginner in Haskell and have had some experience working with parser generators and defining grammars, I think this project would be a good starting point for me. I'd be happy to have a short private chat @sgraf812 to further discuss my background and the projects I have worked on, to see if this would be a good fit. This is one project I attempted using the E-Graph Library (that might be interesting):
https://github.com/xevor11/E-Graph-Optimizer
Thanks!
Vedant

sgraf812 · 2024-03-19T15:46:03Z

Hi Alina and Vedant, feel free to reach out to me via mail (Vedant already did so) or via Matrix (@sgraf812:matrix.org).

xevor11 · 2024-03-19T22:35:00Z

Thanks for the update! I wanted to mention that I sent an email with my first rough draft proposal. I would be eager to get your feedback, if possible, on improving the Milestones and Deliverables section; I provided the specifications in the email.

sgraf812 added 10 commits January 26, 2024 16:46

Update .gitignore

22b3dbb

-Wno-incomplete-uni-patterns

086b3b0

God, this was annoying

tmp

96c3943

Introduce catchTok

be178d9

Implement error fixup and resume logic

c530267

32 bit table entries

8d4f7d6

Simulate reductions and overhaul expected token generation

8e94258

Better abort/report factoring

1db8446

Only report when not already trying to resume

cb667c7

Fix bootstrapped parser

ece7d39

sgraf812 mentioned this pull request Feb 4, 2024

Encode 32 bit integers in arrays #266

Open

sgraf812 mentioned this pull request Feb 5, 2024

Bold change: Continue supporting LR's --array --ghc --coerce mode, abandon the rest #268

Open
