C#/Java: Materialize sink/source/summary predicates to avoid bad join order. #12118

Merged on Feb 9, 2023 (3 commits)

michaelnebel (Contributor) commented Feb 7, 2023

Running java/telemetry/supported-external-api-sinks on glopezGitHub/Android23 (database artifact: https://github.com/codeql-testing/glopezGitHub--Android23/actions/runs/4050012179) produced a log containing the following bad join:

[2023-02-07 09:36:48] Evaluated non-recursive predicate _Expr#20226873::Call::getCallee#0#dispred#ff_ExternalApi#3b305853::ExternalApi::getAnInput#0#dispred__#count_range@3483fcrm in 644206ms (size: 102815).
Evaluated relational algebra for predicate _Expr#20226873::Call::getCallee#0#dispred#ff_ExternalApi#3b305853::ExternalApi::getAnInput#0#dispred__#count_range@3483fcrm with tuple counts:
             357828        ~0%    {3} r1 = SCAN Expr#20226873::Call::getCallee#0#dispred#ff OUTPUT In.1, 1, In.0
             357828        ~0%    {3} r2 = r1 AND NOT _Expr#20226873::Call::getCallee#0#dispred#ff_GeneratedFiles#c1ec4513::GeneratedFile#class#f_Location__#antijoin_rhs(1, Lhs.2, Lhs.0)
             357828        ~0%    {3} r3 = JOIN r2 WITH Member#9eba3c33::Callable::getSourceDeclaration#0#dispred#ff ON FIRST 1 OUTPUT Rhs.1, 1, Lhs.2
        24250368013        ~2%    {4} r4 = JOIN r3 WITH ExternalApi#3b305853::ExternalApi::getAnInput#0#dispred#ff ON FIRST 1 OUTPUT Rhs.1, 1, Lhs.2, Lhs.0
         5820107998  ~5344227%    {3} r5 = JOIN r4 WITH ExternalFlow#030a639a::Cached::sinkNode#2#ff ON FIRST 1 OUTPUT Lhs.3, 1, Lhs.2
         1649881354  ~1631058%    {4} r6 = JOIN r5 WITH ExternalApi#3b305853::ExternalApi::getApiName#0#dispred#bf ON FIRST 1 OUTPUT Rhs.1, Lhs.2, Lhs.0, 1
                                  return r6

The intermediate result r4 grows to roughly 24 billion tuples because the join with getAnInput happens before the filtering join with sinkNode. To avoid this, we materialize the predicate isSink (that is, we add pragma[nomagic] to it). We do the same for isSource and hasSummary, as they could exhibit the same issue. The same change is made for C#.
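
The effect of the join order can be illustrated with a small Python sketch. It is not the QL change from this PR: the relations `callee`, `get_an_input`, and `sink_node`, their sizes, and both function names are made up for illustration. It only shows why computing the filtered `is_sink` set first keeps the intermediate result small.

```python
# Toy relations; all names and sizes are hypothetical, not taken from the CodeQL libraries.
callee = [(f"call{i}", f"api{i % 50}") for i in range(1000)]  # (call, api)
get_an_input = [(f"api{a}", f"node{a}_{k}")                   # (api, input node)
                for a in range(50) for k in range(20)]
sink_node = {f"node{a}_0" for a in range(0, 50, 10)}          # small filtering relation


def join_then_filter():
    """Join calls with every input node first, then filter on sink_node."""
    inputs_by_api = {}
    for api, node in get_an_input:
        inputs_by_api.setdefault(api, []).append(node)
    intermediate = [(call, api, node)
                    for call, api in callee
                    for node in inputs_by_api.get(api, [])]
    result = {(call, api) for call, api, node in intermediate if node in sink_node}
    return len(intermediate), result


def filter_then_join():
    """Compute the small is_sink set first, then join calls against it."""
    is_sink = {api for api, node in get_an_input if node in sink_node}
    intermediate = [(call, api) for call, api in callee if api in is_sink]
    return len(intermediate), set(intermediate)


if __name__ == "__main__":
    print(join_then_filter()[0], filter_then_join()[0])
```

Both functions return the same set of (call, api) pairs; only the size of the intermediate relation differs.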

After introducing the pragmas on the predicates mentioned above, another bad join appeared in the characteristic predicate for TestLibrary: the names of all elements were matched against the regexp before being joined with only those elements that represent packages. This has also been fixed.
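
The TestLibrary issue has the same shape and can be sketched the same way. The element data, the `kind` tags, and the pattern below are assumptions made for illustration and are not the actual regexp or schema; the sketch only counts how many regexp matches are attempted in each order.

```python
import re

# Hypothetical elements as (name, kind) pairs; not the real database schema.
elements = [(f"org.example.pkg{i}", "package") for i in range(10)]
elements += [("org.example.test", "package"), ("junit.framework", "package")]
elements += [(f"method{i}", "method") for i in range(5000)]

# Illustrative pattern only; the regexp used by TestLibrary is not shown in this PR text.
TEST_PATTERN = re.compile(r".*(test|junit).*")


def match_all_then_restrict():
    """Run the regexp on every element name, then keep only packages."""
    attempts = 0
    matched = []
    for name, kind in elements:
        attempts += 1
        if TEST_PATTERN.fullmatch(name):
            matched.append((name, kind))
    return attempts, sorted(name for name, kind in matched if kind == "package")


def restrict_then_match():
    """Restrict to packages first, then run the regexp on that small set."""
    attempts = 0
    result = []
    for name, kind in elements:
        if kind != "package":
            continue
        attempts += 1
        if TEST_PATTERN.fullmatch(name):
            result.append(name)
    return attempts, sorted(result)


if __name__ == "__main__":
    print(match_all_then_restrict()[0], restrict_then_match()[0])
```

Both orders yield the same packages, but the second attempts a match only on package names.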

michaelnebel force-pushed the telemetry/performancefix branch from aff041f to 02364d0 on February 8, 2023 11:00

michaelnebel (Contributor, Author) commented

DCA looks good.

michaelnebel marked this pull request as ready for review on February 9, 2023 07:34
michaelnebel requested review from a team as code owners on February 9, 2023 07:34
aschackmull merged commit 3e2bf23 into github:main on Feb 9, 2023
michaelnebel deleted the telemetry/performancefix branch on April 21, 2023 08:56
Labels: C#, Java, no-change-note-required (this PR does not need a change note)