[sql/plsql] Degradation of performance with recent changes #4014

kaby76 · 2024-03-13T04:54:35Z

There have been changes to plsql that I suspect to impact performance. I wrote a script to track performance in the grammar over the last 8 relevant PRs. The script performs a grouped-parse of the 374 .sql input files for the grammar 10 times and computes the mean and standard deviation, taking care to make sure no new input files were tested in subsequent PRs, which would add to the length of the time of test and be misleading.. The results are shown in the figure below.

I have not done a t-test, but eyeballing the data suggest that the performance took a hit with #4003, which may be caused by a large "k" lookahead being introduced. I suggest investigating whether #4003 indeed caused a performance drop.

I wouldn't say it's a large drop in performance, but I have seen grammars with significant ambiguity or large k. It's sometimes hard to know whether a change causes a catastrophic performance drop given how Github Actions only returns "pass" or "fail".

I am planning to repeat the test with a larger repetition count in order to see if that affects the standard deviation. I also plan to perform an "individual" parse test (one test input file per one run of the test application) to see how the performance was affected.

I am planning to make a PR to test performance changes with a PR.

kaby76 · 2024-03-13T10:43:14Z

Updated stats with a sample size of 40. We now "see" the clear decrease in performance with #4003.

As I look at #4003, rules were added to enforce the type of the expression through syntax. This is something that is never done. Static semantics is enforced by a static analysis phase. Why was this added?

Ideally, precedence of expression operators should be implemented using the Antlr "ordered alt" rather than the "chain of rules" method. Then, expressions would be far more compact and parse times much improved. But, that is another issue.

lbovet · 2024-03-13T16:12:39Z

So a solution would be to move json_condition as a unary_logical_operation. This would maybe make the grammar a bit too lax since this expression could be used in contexts where it is not allowed (not verified, just an assumption).
For me it would be ok. @KvanTTT, what do you think?

kaby76 · 2024-03-13T16:29:24Z

The problem is that you added ambiguity to expressions because AND and OR operators are now in two rules for expressions. We now have two different ways to parse logical expressions from condition. Correcting "lax"-ness should be done via semantic analysis not syntax.

lbovet · 2024-03-13T17:09:56Z

Yes, I completely figure out the problem, now. I am quite new in the grammar area so I did it the naive way.
Working on a fix.

- Replace antlr#4003 - Avoids performance impact (fixes antlr#4014) - Avoids semantic rules that should be done in static analysis

- Replace #4003 - Avoids performance impact (fixes #4014) - Avoids semantic rules that should be done in static analysis

kaby76 changed the title ~~[sql/plsql] Significant degradation of performance with recent changes~~ [sql/plsql] Degradation of performance with recent changes Mar 13, 2024

KvanTTT added plsql performance labels Mar 13, 2024

lbovet added a commit to lbovet/grammars-v4 that referenced this issue Mar 13, 2024

[PlSql] Simplify JSON condition handling

89356d2

- Replace antlr#4003 - Avoids performance impact (fixes antlr#4014) - Avoids semantic rules that should be done in static analysis

lbovet mentioned this issue Mar 13, 2024

[PlSql] Simplify JSON condition handling #4015

Merged

KvanTTT closed this as completed in #4015 Mar 14, 2024

KvanTTT pushed a commit that referenced this issue Mar 14, 2024

[PlSql] Simplify JSON condition handling

3663cc5

- Replace #4003 - Avoids performance impact (fixes #4014) - Avoids semantic rules that should be done in static analysis

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[sql/plsql] Degradation of performance with recent changes #4014

[sql/plsql] Degradation of performance with recent changes #4014

kaby76 commented Mar 13, 2024 •

edited

Loading

kaby76 commented Mar 13, 2024 •

edited

Loading

lbovet commented Mar 13, 2024

kaby76 commented Mar 13, 2024 •

edited

Loading

lbovet commented Mar 13, 2024

[sql/plsql] Degradation of performance with recent changes #4014

[sql/plsql] Degradation of performance with recent changes #4014

Comments

kaby76 commented Mar 13, 2024 • edited Loading

kaby76 commented Mar 13, 2024 • edited Loading

lbovet commented Mar 13, 2024

kaby76 commented Mar 13, 2024 • edited Loading

lbovet commented Mar 13, 2024

kaby76 commented Mar 13, 2024 •

edited

Loading

kaby76 commented Mar 13, 2024 •

edited

Loading

kaby76 commented Mar 13, 2024 •

edited

Loading