Skip to content

Conversation

klausler
Copy link
Contributor

Rather than admit complete confusion with a "FAIL" message when parsing runs into the end of the source file unexpectedly, deal better with missing or garbled END statements through the use of consumedAllInput in end-of-line and end-of-statement parsers and error recovery. Adds a new test (the original motivation) and updates the expected results in two extant tests that were failing before.

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:parser labels Sep 19, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 19, 2025

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)

Changes

Rather than admit complete confusion with a "FAIL" message when parsing runs into the end of the source file unexpectedly, deal better with missing or garbled END statements through the use of consumedAllInput in end-of-line and end-of-statement parsers and error recovery. Adds a new test (the original motivation) and updates the expected results in two extant tests that were failing before.


Full diff: https://github.com/llvm/llvm-project/pull/159834.diff

7 Files Affected:

  • (modified) flang/docs/ParserCombinators.md (+1)
  • (modified) flang/lib/Parser/executable-parsers.cpp (+19-14)
  • (modified) flang/lib/Parser/program-parsers.cpp (+7-4)
  • (modified) flang/lib/Parser/stmt-parser.h (+11-11)
  • (modified) flang/test/Parser/at-process.f (+1-1)
  • (added) flang/test/Parser/come-to-a-bad-end.f90 (+13)
  • (modified) flang/test/Parser/unparseable.f90 (+5-1)
diff --git a/flang/docs/ParserCombinators.md b/flang/docs/ParserCombinators.md
index 076e76f703c49..72cc4ba00307b 100644
--- a/flang/docs/ParserCombinators.md
+++ b/flang/docs/ParserCombinators.md
@@ -63,6 +63,7 @@ These objects and functions are (or return) the fundamental parsers:
   the value that the parser never returns.
 * `nextCh` consumes the next character and returns its location,
   and fails at EOF.
+* `consumedAllInput` is equivalent, but preferable, to `!nextCh`.
 * `"xyz"_ch` succeeds if the next character consumed matches any of those
   in the string and returns its location.  Be advised that the source
   will have been normalized to lower case (miniscule) letters outside
diff --git a/flang/lib/Parser/executable-parsers.cpp b/flang/lib/Parser/executable-parsers.cpp
index ecabbe1db6def..fadec1f11d1db 100644
--- a/flang/lib/Parser/executable-parsers.cpp
+++ b/flang/lib/Parser/executable-parsers.cpp
@@ -65,21 +65,26 @@ constexpr auto obsoleteExecutionPartConstruct{recovery(ignoredStatementPrefix >>
         statement("REDIMENSION" >> name /
                 parenthesized(nonemptyList(Parser<AllocateShapeSpec>{}))))))};
 
-TYPE_PARSER(recovery(
-    CONTEXT_PARSER("execution part construct"_en_US,
-        first(construct<ExecutionPartConstruct>(executableConstruct),
-            construct<ExecutionPartConstruct>(statement(indirect(formatStmt))),
-            construct<ExecutionPartConstruct>(statement(indirect(entryStmt))),
-            construct<ExecutionPartConstruct>(statement(indirect(dataStmt))),
-            extension<LanguageFeature::ExecutionPartNamelist>(
-                "nonstandard usage: NAMELIST in execution part"_port_en_US,
+// The "!consumedAllInput >>" test prevents a cascade of errors at EOF.
+TYPE_PARSER(!consumedAllInput >>
+    recovery(
+        CONTEXT_PARSER("execution part construct"_en_US,
+            first(construct<ExecutionPartConstruct>(executableConstruct),
                 construct<ExecutionPartConstruct>(
-                    statement(indirect(Parser<NamelistStmt>{})))),
-            obsoleteExecutionPartConstruct,
-            lookAhead(declarationConstruct) >> SkipTo<'\n'>{} >>
-                fail<ExecutionPartConstruct>(
-                    "misplaced declaration in the execution part"_err_en_US))),
-    construct<ExecutionPartConstruct>(executionPartErrorRecovery)))
+                    statement(indirect(formatStmt))),
+                construct<ExecutionPartConstruct>(
+                    statement(indirect(entryStmt))),
+                construct<ExecutionPartConstruct>(
+                    statement(indirect(dataStmt))),
+                extension<LanguageFeature::ExecutionPartNamelist>(
+                    "nonstandard usage: NAMELIST in execution part"_port_en_US,
+                    construct<ExecutionPartConstruct>(
+                        statement(indirect(Parser<NamelistStmt>{})))),
+                obsoleteExecutionPartConstruct,
+                lookAhead(declarationConstruct) >> SkipTo<'\n'>{} >>
+                    fail<ExecutionPartConstruct>(
+                        "misplaced declaration in the execution part"_err_en_US))),
+        construct<ExecutionPartConstruct>(executionPartErrorRecovery)))
 
 // R509 execution-part -> executable-construct [execution-part-construct]...
 TYPE_CONTEXT_PARSER("execution part"_en_US,
diff --git a/flang/lib/Parser/program-parsers.cpp b/flang/lib/Parser/program-parsers.cpp
index 5f4e62ffdbbf2..92c0a64b39a9d 100644
--- a/flang/lib/Parser/program-parsers.cpp
+++ b/flang/lib/Parser/program-parsers.cpp
@@ -67,8 +67,11 @@ static constexpr auto programUnit{
     lookAhead(maybe(label) >> validFunctionStmt) >>
         construct<ProgramUnit>(indirect(functionSubprogram)) ||
     construct<ProgramUnit>(indirect(Parser<MainProgram>{}))};
-static constexpr auto normalProgramUnit{StartNewSubprogram{} >> programUnit /
-        skipMany(";"_tok) / space / recovery(endOfLine, SkipPast<'\n'>{})};
+
+static constexpr auto normalProgramUnit{
+    !consumedAllInput >> StartNewSubprogram{} >> programUnit /
+        skipMany(";"_tok) / space / recovery(endOfLine, skipToNextLineIfAny)};
+
 static constexpr auto globalCompilerDirective{
     construct<ProgramUnit>(indirect(compilerDirective))};
 
@@ -86,7 +89,7 @@ static constexpr auto globalOpenACCCompilerDirective{
 TYPE_PARSER(
     construct<Program>(extension<LanguageFeature::EmptySourceFile>(
                            "nonstandard usage: empty source file"_port_en_US,
-                           skipStuffBeforeStatement >> !nextCh >>
+                           skipStuffBeforeStatement >> consumedAllInput >>
                                pure<std::list<ProgramUnit>>()) ||
         some(globalCompilerDirective || globalOpenACCCompilerDirective ||
             normalProgramUnit) /
@@ -107,7 +110,7 @@ constexpr auto actionStmtLookAhead{first(actionStmt >> ok,
     // first in the execution part
     "ALLOCATE ("_tok, "CALL" >> name >> "("_tok, "GO TO"_tok, "OPEN ("_tok,
     "PRINT"_tok / space / !"("_tok, "READ ("_tok, "WRITE ("_tok)};
-constexpr auto execPartLookAhead{first(actionStmtLookAhead >> ok,
+constexpr auto execPartLookAhead{first(actionStmtLookAhead,
     openaccConstruct >> ok, openmpExecDirective >> ok, "ASSOCIATE ("_tok,
     "BLOCK"_tok, "SELECT"_tok, "CHANGE TEAM"_sptok, "CRITICAL"_tok, "DO"_tok,
     "IF ("_tok, "WHERE ("_tok, "FORALL ("_tok, "!$CUF"_tok)};
diff --git a/flang/lib/Parser/stmt-parser.h b/flang/lib/Parser/stmt-parser.h
index ee45c6fd5d38c..b2bb8dd843642 100644
--- a/flang/lib/Parser/stmt-parser.h
+++ b/flang/lib/Parser/stmt-parser.h
@@ -27,21 +27,22 @@ template <typename PA>
 inline constexpr auto unterminatedStatement(const PA &p) {
   return skipStuffBeforeStatement >>
       sourced(construct<Statement<typename PA::resultType>>(
-          maybe(label), space >> p));
+          maybe(label / space), p));
 }
 
 constexpr auto atEndOfStmt{space >>
     withMessage("expected end of statement"_err_en_US, lookAhead(";\n"_ch))};
 constexpr auto checkEndOfKnownStmt{recovery(atEndOfStmt, SkipTo<'\n'>{})};
 
-constexpr auto endOfLine{
-    "\n"_ch >> ok || fail("expected end of line"_err_en_US)};
+constexpr auto endOfLine{consumedAllInput ||
+    withMessage("expected end of line"_err_en_US, "\n"_ch >> ok)};
 
 constexpr auto semicolons{";"_ch >> skipMany(";"_tok) / space / maybe("\n"_ch)};
 constexpr auto endOfStmt{
     space >> withMessage("expected end of statement"_err_en_US,
                  semicolons || endOfLine)};
-constexpr auto forceEndOfStmt{recovery(endOfStmt, SkipPast<'\n'>{})};
+constexpr auto skipToNextLineIfAny{consumedAllInput || SkipPast<'\n'>{}};
+constexpr auto forceEndOfStmt{recovery(endOfStmt, skipToNextLineIfAny)};
 
 template <typename PA> inline constexpr auto statement(const PA &p) {
   return unterminatedStatement(p) / endOfStmt;
@@ -70,17 +71,17 @@ constexpr auto ignoredStatementPrefix{
 // Error recovery within a statement() call: skip *to* the end of the line,
 // unless at an END or CONTAINS statement.
 constexpr auto inStmtErrorRecovery{!"END"_tok >> !"CONTAINS"_tok >>
-    SkipTo<'\n'>{} >> construct<ErrorRecovery>()};
+    (consumedAllInput || SkipTo<'\n'>{}) >> construct<ErrorRecovery>()};
 
 // Error recovery within statement sequences: skip *past* the end of the line,
 // but not over an END or CONTAINS statement.
 constexpr auto skipStmtErrorRecovery{!"END"_tok >> !"CONTAINS"_tok >>
-    SkipPast<'\n'>{} >> construct<ErrorRecovery>()};
+    (consumedAllInput || SkipPast<'\n'>{}) >> construct<ErrorRecovery>()};
 
 // Error recovery across statements: skip the line, unless it looks
 // like it might end the containing construct.
 constexpr auto stmtErrorRecoveryStart{ignoredStatementPrefix};
-constexpr auto skipBadLine{SkipPast<'\n'>{} >> construct<ErrorRecovery>()};
+constexpr auto skipBadLine{skipToNextLineIfAny >> construct<ErrorRecovery>()};
 constexpr auto executionPartErrorRecovery{stmtErrorRecoveryStart >>
     !"END"_tok >> !"CONTAINS"_tok >> !"ELSE"_tok >> !"CASE"_tok >>
     !"TYPE IS"_tok >> !"CLASS"_tok >> !"RANK"_tok >>
@@ -93,7 +94,7 @@ constexpr auto noNameEnd{"END" >> missingOptionalName};
 
 // For unrecognizable construct END statements.  Be sure to not consume
 // a program unit's END statement.
-constexpr auto progUnitEndStmt{
+constexpr auto progUnitEndStmt{consumedAllInput ||
     "END" >> (lookAhead("\n"_ch) || "SUBROUTINE"_tok || "FUNCTION"_tok ||
                  "PROCEDURE"_tok || "MODULE"_tok || "SUBMODULE"_tok ||
                  "PROGRAM"_tok || "BLOCK DATA"_tok)};
@@ -103,9 +104,8 @@ constexpr auto namedConstructEndStmtErrorRecovery{
     constructEndStmtErrorRecovery >> missingOptionalName};
 
 constexpr auto progUnitEndStmtErrorRecovery{
-    (many(!"END"_tok >> SkipPast<'\n'>{}) >>
-        ("END"_tok >> SkipTo<'\n'>{} || consumedAllInput)) >>
-    missingOptionalName};
+    many(!"END"_tok >> SkipPast<'\n'>{}) >>
+    maybe("END"_tok >> SkipTo<'\n'>{}) >> missingOptionalName};
 
 constexpr auto beginDirective{skipStuffBeforeStatement >> "!"_ch};
 constexpr auto endDirective{space >> endOfLine};
diff --git a/flang/test/Parser/at-process.f b/flang/test/Parser/at-process.f
index 4f54c6b65638b..e1abedf04ec84 100644
--- a/flang/test/Parser/at-process.f
+++ b/flang/test/Parser/at-process.f
@@ -19,4 +19,4 @@ subroutine f()
 !CHECK: Character in fixed-form label field must be a digit
 @precoss 
 
-!CHECK: at-process.f:14:1: error: parser FAIL (final position)
+      end
diff --git a/flang/test/Parser/come-to-a-bad-end.f90 b/flang/test/Parser/come-to-a-bad-end.f90
new file mode 100644
index 0000000000000..5fbadfae0ef54
--- /dev/null
+++ b/flang/test/Parser/come-to-a-bad-end.f90
@@ -0,0 +1,13 @@
+!RUN: not %flang_fc1 -fsyntax-only %s 2>&1 | FileCheck %s
+!CHECK:/come-to-a-bad-end.f90:13:4: error: expected '('
+!CHECK: in the context: statement function definition
+!CHECK: in the context: SUBROUTINE subprogram
+!CHECK:error: expected declaration construct
+!CHECK:/come-to-a-bad-end.f90:13:1: in the context: specification part
+!CHECK: in the context: SUBROUTINE subprogram
+!CHECK:error: end of file
+!CHECK: in the context: SUBROUTINE subprogram
+subroutine a
+end
+subroutine b
+gnd
diff --git a/flang/test/Parser/unparseable.f90 b/flang/test/Parser/unparseable.f90
index 9e7a890b67a34..f3e47e294001c 100644
--- a/flang/test/Parser/unparseable.f90
+++ b/flang/test/Parser/unparseable.f90
@@ -1,5 +1,9 @@
 ! RUN: not %flang_fc1 -fsyntax-only %s 2>&1 | FileCheck %s
-! CHECK: unparseable.f90:5:1: error: parser FAIL (final position)
+! CHECK:error: end of file
+! CHECK:in the context: END PROGRAM statement
+! CHECK:/unparseable.f90:9:1: in the context: main program
+! CHECK:error: end of file
+! CHECK:/unparseable.f90:9:1: in the context: SELECT TYPE construct
 module m
 end
 select type (barf)

Rather than admit complete confusion with a "FAIL" message
when parsing runs into the end of the source file unexpectedly,
deal better with missing or garbled END statements through
the use of `consumedAllInput` in end-of-line and end-of-statement
parsers and error recovery.  Adds a new test (the original
motivation) and updates the expected results in two extant tests
that were failing before.
@klausler klausler merged commit 977e75c into llvm:main Sep 23, 2025
10 checks passed
@klausler klausler deleted the scott branch September 23, 2025 22:44
@DanielCChen
Copy link
Contributor

DanielCChen commented Sep 25, 2025

@klausler
It seems this PR caused some regressions in our downstream run for the fixed-form.
A reducer:

!dir$ fixed
        integer a(20)
!dir$ free

        end

Flang now complains:

error: Could not parse t.f
error: expected declaration construct
in the context: specification part
error: end of file
in the context: END PROGRAM statement
in the context: main program
error: expected declaration construct
in the context: specification part

@klausler
Copy link
Contributor Author

Will fix; thanks for the bug report.

@klausler
Copy link
Contributor Author

#160780

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:parser flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants