[flang] Fix parsing time explosion #76533

Merged
merged 1 commit into from
Jan 2, 2024

klausler
Contributor

When parsing a deeply-nested expression like
A1(A2(A3(A4(A5(A6(...A99(i)...))))))
the parser can take exponential time, because each "An(...)" might be the beginning of a reference to a procedure component ("An(...)%PROC(...)"). That alternative has to be attempted first, before the parser tries "An(...)" as a function reference or as an array element designator. The parser for a structure component, which is used by the procedure designator parser, was not protected with the usual failure memoization technique, so the same failed parses were repeated at each level of nesting. Fix by using the instrumented() parser combinator so that failed structure component parses aren't repeated.

Fixes #76477.
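
For readers unfamiliar with failure memoization, here is a minimal sketch of the idea on a toy backtracking parser. It is not flang's implementation: the grammar, `ToyParser`, `nested`, and `countComponentAttempts` are all hypothetical names made up for illustration. The toy `expr` tries a `component` alternative (`call '%' name`) before falling back to a plain `call`, so without memoization every nesting level parses its argument twice. Recording the positions where `component` already failed avoids the repeated work.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Toy grammar (an assumption for illustration, not flang's grammar):
//   expr      := component | call | name
//   component := call '%' name
//   call      := name '(' expr ')'
struct ToyParser {
  std::string text;
  bool memoizeFailures;
  std::set<std::size_t> failedComponentAt; // positions where `component` failed
  long componentAttempts = 0;              // real (non-memoized) attempts

  ToyParser(std::string t, bool memoize)
      : text(std::move(t)), memoizeFailures(memoize) {}

  bool name(std::size_t &pos) const {
    std::size_t p = pos;
    while (p < text.size() &&
           std::isalnum(static_cast<unsigned char>(text[p]))) {
      ++p;
    }
    if (p == pos) return false;
    pos = p;
    return true;
  }

  bool literal(std::size_t &pos, char c) const {
    if (pos < text.size() && text[pos] == c) { ++pos; return true; }
    return false;
  }

  bool call(std::size_t &pos) {
    std::size_t p = pos;
    if (name(p) && literal(p, '(') && expr(p) && literal(p, ')')) {
      pos = p;
      return true;
    }
    return false;
  }

  bool component(std::size_t &pos) {
    // Failure memoization: if this parser already failed here, fail at once.
    if (memoizeFailures && failedComponentAt.count(pos)) return false;
    ++componentAttempts;
    std::size_t p = pos;
    if (call(p) && literal(p, '%') && name(p)) { pos = p; return true; }
    if (memoizeFailures) failedComponentAt.insert(pos);
    return false;
  }

  bool expr(std::size_t &pos) {
    std::size_t p = pos;
    if (component(p)) { pos = p; return true; } // must be tried first
    p = pos;
    if (call(p)) { pos = p; return true; }      // re-parses the same text
    p = pos;
    if (name(p)) { pos = p; return true; }
    return false;
  }
};

// Builds "A1(A2(...Adepth(i)...))".
std::string nested(int depth) {
  std::string s;
  for (int k = 1; k <= depth; ++k) s += "A" + std::to_string(k) + "(";
  s += "i";
  s.append(static_cast<std::size_t>(depth), ')');
  return s;
}

// Returns the number of real `component` attempts, or -1 if parsing failed.
long countComponentAttempts(int depth, bool memoize) {
  ToyParser parser(nested(depth), memoize);
  std::size_t pos = 0;
  bool ok = parser.expr(pos) && pos == parser.text.size();
  return ok ? parser.componentAttempts : -1;
}
```

In this toy grammar, `countComponentAttempts(n, false)` grows as 2^(n+1) - 1, while `countComponentAttempts(n, true)` is n + 1, one real attempt per distinct starting position. Only failures are cached here, so successful sub-parses are still repeated; that is enough to remove the exponential growth in this sketch.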

@llvmbot
Collaborator

llvmbot commented Dec 28, 2023

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)

Full diff: https://github.com/llvm/llvm-project/pull/76533.diff

1 Files Affected:

  • (modified) flang/lib/Parser/Fortran-parsers.cpp (+3-2)
diff --git a/flang/lib/Parser/Fortran-parsers.cpp b/flang/lib/Parser/Fortran-parsers.cpp
index c070bc1de37352..0dd95d69d3c662 100644
--- a/flang/lib/Parser/Fortran-parsers.cpp
+++ b/flang/lib/Parser/Fortran-parsers.cpp
@@ -1151,8 +1151,9 @@ TYPE_PARSER(construct<PartRef>(name,
 
 // R913 structure-component -> data-ref
 // The final part-ref in the data-ref is not allowed to have subscripts.
-TYPE_PARSER(construct<StructureComponent>(
-    construct<DataRef>(some(Parser<PartRef>{} / percentOrDot)), name))
+TYPE_CONTEXT_PARSER("component"_en_US,
+    construct<StructureComponent>(
+        construct<DataRef>(some(Parser<PartRef>{} / percentOrDot)), name))
 
 // R919 subscript -> scalar-int-expr
 constexpr auto subscript{scalarIntExpr};

Contributor

@vzakhari vzakhari left a comment

LGTM

@klausler klausler merged commit 3bbdbb2 into llvm:main Jan 2, 2024
4 checks passed
@klausler klausler deleted the bug76477 branch January 2, 2024 16:54
Labels: flang:parser, flang

Linked issue: [Flang] Compilation of a chain of indirect accesses of arrays needs a lot of time