C++: Eliminate recursion from toString(). #3387
I'll just @-mention @github/codeql-c-analysis since the |
It looks like the check is failing because of the identical files check:
I'll fix this up, but it would be good to get feedback on those changes first.
Thanks for putting in the effort to eliminate this recursion. I had no idea how many changes it would require, but it looks like it's fairly minimal. Will all the changes here preserve the exact same output we had before? It looks like |
Yes, though I should mention that I've only looked at what happens when you
With the exception of |
Fixed the "sync identical files" issue. That means the part of this change in |
I don't see any issue with the XML changes as far as Python is concerned.
It's causing some slightly concerning tuple count differences for JS (XSS query on juliusjs). I understand it's not exactly the fault of this PR, but rather a consequence of the unfortunate fact that XMLFile overrides File.toString.
It seems to me that XMLFile should just not be a subclass of File at all - logically it makes no sense that File.toString should have a special value for XML files, it's just an artifact of multiple inheritance. Is it feasible to remove the File base class instead?
With that said if it's important for C++ then LGTM. I'll deal with the fallout if it turns out to be an issue for JS.
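To make the multiple-inheritance concern concrete, here is a minimal Python sketch, not QL. The class names File and XMLFile mirror the discussion, but the XMLLocatable stand-in, the method names, and the returned strings are all hypothetical. QL's class semantics differ from Python's, so treat this only as an analogy for how an override on a subclass changes what code written against the base class sees.

```python
class File:
    """Stand-in for the generic File class."""

    def __init__(self, path):
        self.path = path

    def to_string(self):
        return self.path


class XMLLocatable:
    """Hypothetical stand-in for the XML side of the hierarchy."""

    def to_string(self):
        return "XML node"


class XMLFile(XMLLocatable, File):
    """Inherits from both sides and supplies its own string form."""

    def __init__(self, path):
        File.__init__(self, path)

    def to_string(self):
        # Hypothetical format; the real XMLFile string is not shown in this thread.
        return "XML file " + self.path


def describe(files):
    """Code written against File, unaware of any subclass."""
    return [f.to_string() for f in files]
```

Calling describe on a mix of File and XMLFile objects returns the XML-specific string for the latter, which is the kind of surprise the comment above objects to.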
I take back what I said in the meeting yesterday about this PR not having much benefit unless every last bit of recursion is eliminated. I just looked at the evaluation logs for a large customer snapshot and found this in iteration 2:
Notice how it concatenates "using " with all results from the previous iteration before filtering those results to contain only
The RA for iteration 1 seems fine in contrast, with the filtering always happening before the concatenation.
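The two evaluation orders can be sketched in Python as an analogy (this is not the actual RA). The function names and sample data are hypothetical. The sketch only shows that both orders produce the same final rows while the number of intermediate rows differs.

```python
def concat_then_filter(names, wanted):
    """Build "using <name>" for every name, then discard unwanted rows."""
    concatenated = [(n, "using " + n) for n in names]
    result = [s for n, s in concatenated if n in wanted]
    # Return the result and how many intermediate rows were built.
    return result, len(concatenated)


def filter_then_concat(names, wanted):
    """Discard unwanted names first, then build "using <name>"."""
    kept = [n for n in names if n in wanted]
    result = ["using " + n for n in kept]
    return result, len(result)


# Hypothetical data: four names from a previous iteration, one of interest.
names = ["ns::f", "g", "ns::h", "k"]
wanted = {"ns::f"}

slow_result, slow_rows = concat_then_filter(names, wanted)
fast_result, fast_rows = filter_then_concat(names, wanted)
```

Both calls yield the same single result row, but the first builds one concatenated string per input name while the second builds one only for the name that survives the filter.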
Thanks for spotting this. As far as I can tell from your report this isn't affecting results, just the number of rows in some intermediate computations - with potential performance impact. Do we know whether performance changes on this snapshot? Is the concern the extra Unfortunately I'm not very well set up to test/debug this - I'm not set up for JS dev and haven't been able to find a single CPP snapshot that actually has |
There's clearly something I don't understand here. AFAICT the
There's no telling whether external users depend on this inheritance relationship, but I can easily imagine that they do. If we ensure that all ambiguous overrides on |
(@geoffw0 sorry I missed your comment from 14d ago.)
I agree the results should be the same, it's just a question of how well the optimizer can handle the code. In any case, I'm not able to reproduce the results anymore; I must have done something wrong in my initial test. Instead I get a large reduction in tuple counts now. Sorry about that. Please just move ahead with the PR.
I've re-triggered the tests and will merge if they pass.
Removes recursion from toString(), resulting in moderately better performance when it's computed. The improvement comes from Element::ElementBase::toString_dispred#ff, which is still large and complex but is no longer recursive.
Some results:
As above but with #3352:
As a sequence (clearing cache only at the start):
And:
Note that in the case of UsingDeclarationEntry I have traded detail in the result for better performance. There's a lot of scope for more changes like this if it's something we're comfortable doing.
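The kind of trade-off described for UsingDeclarationEntry can be sketched in Python, purely as an illustration. Every class, method, and string format below is hypothetical and is not taken from the actual QL library. The recursive variant asks the target element for its own string, while the flat variant uses only a name it already holds, so its string depends on no other element's to_string.

```python
class Element:
    """Hypothetical base element with a plain name."""

    def __init__(self, name):
        self.name = name

    def to_string(self):
        return self.name


class Function(Element):
    """Hypothetical element whose string carries extra detail."""

    def to_string(self):
        return self.name + "()"


class UsingEntryRecursive(Element):
    """String form depends on the target's own to_string()."""

    def __init__(self, target):
        super().__init__(target.name)
        self.target = target

    def to_string(self):
        return "using " + self.target.to_string()


class UsingEntryFlat(Element):
    """String form uses only the stored name, so nothing recursive is needed."""

    def __init__(self, target):
        super().__init__(target.name)

    def to_string(self):
        return "using " + self.name
```

In this sketch, UsingEntryRecursive(Function("f")).to_string() gives "using f()" and UsingEntryFlat(Function("f")).to_string() gives "using f": the flat form loses the target-specific detail but no longer needs the target's string to be computed first.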