Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sql/plsql] Degradation of performance with recent changes #4014

Closed
kaby76 opened this issue Mar 13, 2024 · 4 comments · Fixed by #4015
Closed

[sql/plsql] Degradation of performance with recent changes #4014

kaby76 opened this issue Mar 13, 2024 · 4 comments · Fixed by #4015

Comments

@kaby76
Copy link
Contributor

kaby76 commented Mar 13, 2024

@lbovet @KvanTTT

There have been changes to plsql that I suspect to impact performance. I wrote a script to track performance in the grammar over the last 8 relevant PRs. The script performs a grouped-parse of the 374 .sql input files for the grammar 10 times and computes the mean and standard deviation, taking care to make sure no new input files were tested in subsequent PRs, which would add to the length of the time of test and be misleading.. The results are shown in the figure below.

times

I have not done a t-test, but eyeballing the data suggest that the performance took a hit with #4003, which may be caused by a large "k" lookahead being introduced. I suggest investigating whether #4003 indeed caused a performance drop.

I wouldn't say it's a large drop in performance, but I have seen grammars with significant ambiguity or large k. It's sometimes hard to know whether a change causes a catastrophic performance drop given how Github Actions only returns "pass" or "fail".

I am planning to repeat the test with a larger repetition count in order to see if that affects the standard deviation. I also plan to perform an "individual" parse test (one test input file per one run of the test application) to see how the performance was affected.

I am planning to make a PR to test performance changes with a PR.

@kaby76 kaby76 changed the title [sql/plsql] Significant degradation of performance with recent changes [sql/plsql] Degradation of performance with recent changes Mar 13, 2024
@kaby76
Copy link
Contributor Author

kaby76 commented Mar 13, 2024

Updated stats with a sample size of 40. We now "see" the clear decrease in performance with #4003.

times

As I look at #4003, rules were added to enforce the type of the expression through syntax. This is something that is never done. Static semantics is enforced by a static analysis phase. Why was this added?

Ideally, precedence of expression operators should be implemented using the Antlr "ordered alt" rather than the "chain of rules" method. Then, expressions would be far more compact and parse times much improved. But, that is another issue.

@lbovet
Copy link
Contributor

lbovet commented Mar 13, 2024

So a solution would be to move json_condition as a unary_logical_operation. This would maybe make the grammar a bit too lax since this expression could be used in contexts where it is not allowed (not verified, just an assumption).
For me it would be ok. @KvanTTT, what do you think?

@kaby76
Copy link
Contributor Author

kaby76 commented Mar 13, 2024

The problem is that you added ambiguity to expressions because AND and OR operators are now in two rules for expressions. We now have two different ways to parse logical expressions from condition. Correcting "lax"-ness should be done via semantic analysis not syntax.

@lbovet
Copy link
Contributor

lbovet commented Mar 13, 2024

Yes, I completely figure out the problem, now. I am quite new in the grammar area so I did it the naive way.
Working on a fix.

lbovet added a commit to lbovet/grammars-v4 that referenced this issue Mar 13, 2024
- Replace antlr#4003
- Avoids performance impact (fixes antlr#4014)
- Avoids semantic rules that should be done in static analysis
KvanTTT pushed a commit that referenced this issue Mar 14, 2024
- Replace #4003
- Avoids performance impact (fixes #4014)
- Avoids semantic rules that should be done in static analysis
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants