Ruby: do fewer regexp matches in SensitiveActions #16314

Merged 1 commit on Apr 25, 2024

nickrolfe
Contributor

Follow-up to #16306, making the same changes for Ruby.

Before:

[2024-04-24 10:42:04] Evaluated non-recursive predicate _Constant::ConstantValue.isStringlikeValue/1#dispred#ff1ad965_SensitiveDataHeuristics::HeuristicName__#shared@fd52ac02 in 10567ms (size: 991).
Evaluated relational algebra for predicate _Constant::ConstantValue.isStringlikeValue/1#dispred#ff1ad965_SensitiveDataHeuristics::HeuristicName__#shared@fd52ac02 with tuple counts:
        3539396  ~4%    {4} r1 = JOIN `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76` WITH `Constant::ConstantValue.isStringlikeValue/1#dispred#ff1ad965` CARTESIAN PRODUCT OUTPUT Lhs.0, Lhs.1, Rhs.0, Rhs.1
            993  ~2%    {4}    | JOIN WITH PRIMITIVE regexpMatch#bb ON Lhs.3,Lhs.1
            993  ~4%    {3}    | SCAN OUTPUT In.0, In.2, In.3
                        return r1

[2024-04-24 11:53:40] Evaluated non-recursive predicate SensitiveActions::BasicSensitiveWrite#35b96d84@defc92e9 in 794ms (size: 474).
Evaluated relational algebra for predicate SensitiveActions::BasicSensitiveWrite#35b96d84@defc92e9 with tuple counts:
         216317  ~0%    {4} r1 = SCAN `SensitiveActions::writesProperty/2#249e0c5f` OUTPUT In.0, _, In.1, _
         216317  ~4%    {2}    | REWRITE WITH Tmp.1 := "^@*", Tmp.3 := "", Out.1 := regexpReplaceAll(In.2,Tmp.1,Tmp.3) KEEPING 2
         214165  ~4%    {2}    | AND NOT `_SensitiveActions::writesProperty/2#249e0c5f#antijoin_rhs`(FIRST 2)
        1499155  ~0%    {4}    | JOIN WITH `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76` CARTESIAN PRODUCT OUTPUT Lhs.0, Lhs.1, Rhs.0, Rhs.1
            474  ~0%    {4}    | JOIN WITH PRIMITIVE regexpMatch#bb ON Lhs.1,Lhs.3
            474  ~0%    {2}    | SCAN OUTPUT In.0, In.2
                        return r1

[2024-04-24 12:08:52] Evaluated non-recursive predicate SensitiveActions::BasicSensitiveVariableAccess#56ade80c@1dba36vl in 8106ms (size: 1775).
Evaluated relational algebra for predicate SensitiveActions::BasicSensitiveVariableAccess#56ade80c@1dba36vl with tuple counts:
        1152036  ~0%    {4} r1 = SCAN SensitiveActions::SensitiveVariableAccess#bbd64d3d OUTPUT In.0, In.1, _, _
        1152036  ~2%    {3}    | REWRITE WITH Tmp.2 := "^@*", Tmp.3 := "", Out.2 := regexpReplaceAll(InOut.1,Tmp.2,Tmp.3) KEEPING 3
        1144165  ~1%    {3}    | AND NOT _SensitiveActions::SensitiveVariableAccess#bbd64d3d#antijoin_rhs(FIRST 3)
        8009155  ~0%    {5}    | JOIN WITH `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76` CARTESIAN PRODUCT OUTPUT Lhs.0, Lhs.1, Lhs.2, Rhs.0, Rhs.1
           1775  ~4%    {5}    | JOIN WITH PRIMITIVE regexpMatch#bb ON Lhs.2,Lhs.4
           1775  ~2%    {3}    | SCAN OUTPUT In.0, In.3, In.1
                        return r1

After:

[2024-04-24 11:05:49] Evaluated non-recursive predicate _Constant::ConstantValue.isStringlikeValue/1#dispred#ff1ad965_SensitiveDataHeuristics::HeuristicName__#shared@750c34s0 in 6618ms (size: 912).
Evaluated relational algebra for predicate _Constant::ConstantValue.isStringlikeValue/1#dispred#ff1ad965_SensitiveDataHeuristics::HeuristicName__#shared@750c34s0 with tuple counts:
             1  ~0%    {1} r1 = AGGREGATE `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76_1_#concat_range`, `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76_1_1_#concat_term` ON In.2, In.3 WITH CONCAT<0 ASC> OUTPUT , Agg.0
        505628  ~5%    {5}    | JOIN WITH `Constant::ConstantValue.isStringlikeValue/1#dispred#ff1ad965` CARTESIAN PRODUCT OUTPUT _, Rhs.0, Rhs.1, Lhs.0, _
        505628  ~4%    {3}    | REWRITE WITH Tmp.0 := "(?:", Tmp.4 := ")", Out.0 := (Tmp.0 ++ In.3 ++ Tmp.4) KEEPING 3
           912  ~2%    {3}    | JOIN WITH PRIMITIVE regexpMatch#bb ON Lhs.2,Lhs.0
           912  ~2%    {2}    | SCAN OUTPUT In.1, In.2
                       return r1

[2024-04-24 12:00:43] Evaluated non-recursive predicate _SensitiveActions::writesProperty/2#249e0c5f_SensitiveDataHeuristics::HeuristicNames::maybeSensitive__#shared@9f327ffo in 379ms (size: 478).
Evaluated relational algebra for predicate _SensitiveActions::writesProperty/2#249e0c5f_SensitiveDataHeuristics::HeuristicNames::maybeSensitive__#shared@9f327ffo with tuple counts:
             1  ~0%    {1} r1 = AGGREGATE `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76_1_#concat_range`, `SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76_1_1_#concat_term` ON In.2, In.3 WITH CONCAT<0 ASC> OUTPUT , Agg.0
        216317  ~2%    {6}    | JOIN WITH `SensitiveActions::writesProperty/2#249e0c5f` CARTESIAN PRODUCT OUTPUT _, Rhs.0, _, Lhs.0, Rhs.1, _
        216317  ~0%    {3}    | REWRITE WITH Tmp.0 := "(?:", Tmp.2 := ")", Out.0 := (Tmp.0 ++ In.3 ++ Tmp.2), Tmp.2 := "^@*", Tmp.5 := "", Out.2 := regexpReplaceAll(In.4,Tmp.2,Tmp.5) KEEPING 3
           478  ~0%    {3}    | JOIN WITH PRIMITIVE regexpMatch#bb ON Lhs.2,Lhs.0
           478  ~2%    {2}    | SCAN OUTPUT In.1, In.2
                       return r1

[2024-04-24 12:15:32] Evaluated non-recursive predicate _SensitiveActions::SensitiveVariableAccess#bbd64d3d__SensitiveDataHeuristics::HeuristicNames::maybeS__#shared@e7090b16 in 11116ms (size: 1784).
Evaluated relational algebra for predicate _SensitiveActions::SensitiveVariableAccess#bbd64d3d__SensitiveDataHeuristics::HeuristicNames::maybeS__#shared@e7090b16 with tuple counts:
        1152036  ~0%    {5} r1 = JOIN `_SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp/1#d87d0a76_1_#concat_range_SensitiveD__#shared` WITH SensitiveActions::SensitiveVariableAccess#bbd64d3d CARTESIAN PRODUCT OUTPUT Lhs.0, Rhs.0, Rhs.1, _, _
        1152036  ~1%    {4}    | REWRITE WITH Tmp.3 := "^@*", Tmp.4 := "", Out.3 := regexpReplaceAll(InOut.2,Tmp.3,Tmp.4) KEEPING 4
           1784  ~0%    {4}    | JOIN WITH PRIMITIVE regexpMatch#bb ON Lhs.3,Lhs.0
           1784  ~0%    {3}    | SCAN OUTPUT In.1, In.2, In.3
                        return r1
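
For readers unfamiliar with the technique, the "After" plans concatenate the heuristic regexps into a single pattern wrapped in `(?:` ... `)` and run one `regexpMatch` per name, rather than one match per (name, regexp) pair. Below is a rough Python sketch of that idea, not the actual CodeQL change. The pattern list and helper names are hypothetical, and joining the alternatives with `|` is an assumption, since the logs show only the `CONCAT` aggregate and the surrounding `(?:` and `)`.

```python
import re

# Hypothetical stand-ins for the maybeSensitiveRegexp heuristics;
# the real CodeQL patterns are different and more numerous.
PATTERNS = [
    r".*pass(word|wd).*",
    r".*user_?name.*",
    r".*secret.*",
    r".*account.*",
]

# Assumption: alternatives are joined with "|" inside one non-capturing group,
# mirroring the "(?:" ++ concat ++ ")" rewrite visible in the "After" plans.
COMBINED = re.compile("(?:" + "|".join(PATTERNS) + ")")


def strip_at_prefix(name: str) -> str:
    """Drop leading '@' characters, like regexpReplaceAll(name, "^@*", "")."""
    return re.sub(r"^@*", "", name, count=1)


def matches_separately(name: str) -> bool:
    """'Before' shape: one whole-string match attempt per pattern."""
    return any(re.fullmatch(p, name) for p in PATTERNS)


def matches_combined(name: str) -> bool:
    """'After' shape: a single whole-string match against the combined pattern."""
    return COMBINED.fullmatch(name) is not None


if __name__ == "__main__":
    for raw in ["@password", "@@user_name", "counter", "account_id"]:
        name = strip_at_prefix(raw)
        # Both approaches should agree on whether the name looks sensitive.
        assert matches_separately(name) == matches_combined(name)
        print(raw, "->", name, matches_combined(name))
```

In this sketch the combined form no longer reports which individual pattern matched, so it only fits callers that need a yes/no answer.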

@github-actions github-actions bot added the Ruby label Apr 24, 2024
@nickrolfe
Contributor Author

DCA shows stage timing improvements for SensitiveActions on a couple of databases.

@nickrolfe nickrolfe marked this pull request as ready for review April 24, 2024 15:01
@nickrolfe nickrolfe requested a review from a team as a code owner April 24, 2024 15:01
@nickrolfe nickrolfe added the no-change-note-required This PR does not need a change note label Apr 24, 2024
@nickrolfe nickrolfe merged commit 116873c into main Apr 25, 2024
@nickrolfe nickrolfe deleted the nickrolfe/rb-sensitive branch April 25, 2024 10:56
Labels: no-change-note-required (This PR does not need a change note), Ruby