Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace common patterns in query object model #7284

Closed
lukaseder opened this issue Mar 12, 2018 · 3 comments
Closed

Replace common patterns in query object model #7284

lukaseder opened this issue Mar 12, 2018 · 3 comments

Comments

@lukaseder
Copy link
Member

lukaseder commented Mar 12, 2018

There are some common expressions that can be simplified by the jOOQ parser into better equivalents, such as:

-- Input
RTRIM(LTRIM(something))

-- Output
TRIM(something)

In the above example, if jOOQ has a trim() function in its expression tree, that will perform (and look) better in all dialects that support it, while it still can be transformed back to RTRIM(LTRIM(...)) on the other dialects.

This should be an configurable feature. There are some new Settings flags that allow for opting in to a set of sub-flags:

  • transformPatterns allows for removing excessively verbose syntax, replacing some patterns by a shorter, equivalent syntax
    • transformPatternsTrim: RTRIM(LTRIM(x)) => TRIM(x)
    • transformPatternsTrim: LTRIM(RTRIM(x)) => TRIM(x)
    • transformPatternsIdempotentFunctionRepetition:
      • LTRIM(LTRIM(x)) or LTRIM(TRIM(x))
      • RTRIM(RTRIM(x)) or RTRIM(TRIM(x))
      • TRIM(TRIM(x))
      • UPPER(UPPER(x))
      • LOWER(LOWER(x))
      • ABS(ABS(x))
      • SIGN(SIGN(x))
      • CEIL(CEIL(x))
      • FLOOR(FLOOR(x))
      • ROUND(ROUND(x))
      • TRUNC(TRUNC(x))
      • CAST(CAST(x AS t) AS t)
    • transformPatternsArithmeticExpressions:
      • Identities
        • 0 + x => x
        • 1 * x => x
        • POWER(x, 1) => x
      • Negation
        • 0 - x => -x
        • x + -y => x - y
        • x - -y => x + y
        • -x * -y => x * y
        • -x / -y => x / y
        • -(const) => -const (change the value, don't use Neg)
        • -1 * x => -x
      • Other
        • 1 / y * x => x / y
        • x * x => SQUARE(x)
        • x <commutative op> const => const <commutative op> x
          • x + const => const + x
          • x * const => const * x
          • No other commutative ops are affected, currently, e.g. for =, the inverse is being done
    • transformPatternsTrigonometricFunctions:
      • SIN(x) / COS(x) => TAN(x)
      • 1 / COT(x) => TAN(x)
      • COS(x) / SIN(x) => COT(x)
      • 1 / TAN(x) => COT(x)
    • transformPatternsHyperbolicFunctions (from https://en.wikipedia.org/wiki/Hyperbolic_functions, some of these might be derived from other maths transformations):
      • (EXP(x) - EXP(-x)) / 2 => SINH(x)
      • (EXP(2 * x) - 1) / (2 * EXP(x)) => SINH(x)
      • (1 - EXP(-2 * x)) / (2 * EXP(-x)) => SINH(x)
      • (EXP(x) + EXP(-x)) / 2 => COSH(x)
      • (EXP(2 * x) + 1) / (2 * EXP(x)) => COSH(x)
      • (1 + EXP(-2 * x)) / (2 * EXP(-x)) => COSH(x)
      • SINH(x) / COSH(x) => TANH(x)
      • 1 / COTH(x) => TANH(x)
      • (EXP(x) - EXP(-x)) / (EXP(x) + EXP(-x)) => TANH(x)
      • (EXP(2 * x) - 1) / (EXP(2 * x) + 1) => TANH(x)
      • COSH(x) / SINH(x) => COTH(x)
      • 1 / TANH(x) => COTH(x)
      • (EXP(x) + EXP(-x)) / (EXP(x) - EXP(-x)) => COTH(x)
      • (EXP(2 * x) + 1) / (EXP(2 * x) - 1) => COTH(x)
    • transformPatternsInverseHyperbolicFunctions (from https://en.wikipedia.org/wiki/Inverse_hyperbolic_functions)
      • LN(x + SQRT(SQUARE(x) + 1)) => ASINH(x)
      • LN(x + SQRT(SQUARE(x) - 1)) => ACOSH(x)
      • LN((1 + x) / (1 - x)) / 2) => ATANH(x)
      • LN((x + 1) / (x - 1)) / 2) => ACOTH(x)
    • transformPatternsLogarithmicFunctions
      • LN(value) / LN(base) => LOG(base, value)
      • EXP(1) => e
      • LOG(e, x) => LN(x)
      • LOG(10, x) => LOG10(x)
      • POWER(e, x) => EXP(x)
    • transformPatternsOrEqToIn:
      • x = a OR x = b OR x = c OR ... => x IN (a, b, c, ...)
      • x = a OR x IN (b, c, ...) => x IN (a, b, c, ...)
      • This should work for both x = a and a = x. Related (but not dependent on: transformPatternsNormaliseFieldCompareValue)
      • This should work even if other types of predicates are in the middle of the chain
    • transformPatternsAndNeToNotIn: x != a AND x != b AND x != c AND ... => x NOT IN (a, b, c, ...)
      • All details analogous to the above
    • transformPatternsMergeOrComparison:
      • x = a OR x > a => x >= a
      • x = a OR x < a => x <= a
      • x > a OR x < a => x != a
      • x > a OR x != a => x != a
      • x < a OR x != a => x != a
      • And many more
    • transformPatternsMergeAndComparison:
      • x = a AND x >= a => x = a
      • x = a AND x <= a => x = a
      • x > a AND x < a => x != x
      • x > a AND x != a => x > a
      • x < a AND x != a => x < a
      • And many more
    • transformPatternsMergeInPredicates: x IN (a, b, c) AND x IN (b, c, d) => x IN (b, c)
    • transformPatternsMergeRangePredicates: x >= a AND x <= b => x BETWEEN a AND b
    • transformPatternsRedundantPredicates: (done as part of transformPatternsMergeOrComparison)
      • a <op> b AND/OR a <op> b => a <op> b
      • a <op> b AND/OR b <inverse op> a => a = b
    • transformPatternsTrivialCaseAbbreviation:
      • NVL(NULL, a) => a
      • NULLIF(a, a) => NULL
      • NULLIF(NULL, a) => NULL
      • NULLIF(a, NULL) => a
    • transformPatternsTrivialPredicates:
      • a is not distinct from a => TRUE
      • a is distinct from a => FALSE
      • a >= a and a <= a => a = a (leave as is, because of NULL semantics)
      • a > a and a < a => a != a (leave as is, because of NULL semantics)
      • <const> IS NOT NULL => TRUE
      • <const> IS NULL => FALSE
      • TRUE AND FALSE => FALSE (and the whole truth table)
      • TRUE OR FALSE => TRUE (and the whole truth table)
      • NOT (TRUE) => FALSE and NOT (FALSE) => TRUE
      • SELECT .. WHERE TRUE => SELECT ..
      • SELECT .. HAVING TRUE => SELECT ..
      • SELECT .. QUALIFY TRUE => SELECT ..
    • transformPatternsNotNot: NOT (NOT (p)) => p
    • transformPatternsNotComparison:
      • NOT (a = b) => A != B
      • NOT (a != b) => A = B
      • NOT (a > b) => A <= B
      • NOT (a >= b) => A < B
      • NOT (a < b) => A >= B
      • NOT (a <= b) => A > B
    • transformPatternsNotNotDistinct: MySQL NOT(a <=> b) => a IS DISTINCT FROM b
    • transformPatternsNegNeg: -(-(i)) => i
    • transformPatternsBitNotBitNot: ~(~(i)) => i
    • transformPatternsBitNotNand: (both for scalar and aggregate functions)
      • ~(bitnand(a, b)) => bitand(a, b)
      • ~(bitand(a, b)) => bitnand(a, b)
    • transformPatternsBitNotNor:
      • ~(bitnor(a, b)) => bitor(a, b)
      • ~(bitnor(a, b)) => bitnor(a, b)
    • transformPatternsBitNotXNor:
      • ~(bitxnor(a, b)) => bitxor(a, b)
      • ~(bitxor(a, b)) => bitxnor(a, b)
    • (a + b) + c => a + b + c (works the same way for all associative operations). It's now a binary operator, so this no longer applies

These are normalisation transformations that transform more esoteric syntaxes to more standardised ones. By normalising things, we can greatly reduce the search space of possible patterns, as not to miss any possible transformations:

  • transformPatternsNormaliseAssociativeOps: (a + b) + (c + d) => (((a + b) + c) + d)

    • +
    • *
    • AND
    • OR
  • transformPatternsNormaliseFieldCompareValue: 1 = a => a = 1

  • transformPatternsNormaliseInListSingleElementToComparison:

    • x IN (a) => x = a
    • x NOT IN (a) => x != a
  • transformPatternsScalarSubqueryCountAsteriskGtZero: (SELECT COUNT(*) ..) > 0 => EXISTS (SELECT 1 ..)

    • Only in the absence of UNION and other set operations
    • Only in the absence of GROUP BY and HAVING
    • Only with COUNT(*), not with COUNT(expr)
    • Also (SELECT COUNT(*) ..) >= 1
  • transformPatternsScalarSubqueryCountExpressionGtZero: (SELECT COUNT(expr) ..) > 0 => EXISTS (SELECT 1 .. WHERE expr IS NOT NULL)

  • transformPatternsEmptyScalarSubquery: SELECT .. FROM .. WHERE FALSE => NULL

A part of this task is to also add support for this functionality (but not each individual flag (yet)):

Caveats

See follow-up task: #13593

@lukaseder
Copy link
Member Author

This isn't really related to parsing, but could be done in general. Once we have a better query object model, pattern matching the object model in such ways would be quite common

We'll review this in the context of the query object model redesign project.

@lukaseder lukaseder changed the title Add parser option to recognise and replace patterns Replace common patterns in query object model Jan 16, 2020
lukaseder added a commit that referenced this issue Jan 18, 2022
- transformPatternsNormaliseAssociativeOps (WIP)
- transformPatternsNormaliseFieldCompareValue
- transformPatternsOrEqToIn (WIP)
lukaseder added a commit that referenced this issue Jan 19, 2022
- transformPatternsNormaliseAssociativeOps (complete)
lukaseder added a commit that referenced this issue Jan 19, 2022
- transformPatternsOrEqToIn (support interleaving unrelated predicates)
- transformPatternsAndNeToNotIn
- transformPatternsMergeOrComparison
lukaseder added a commit that referenced this issue Jan 19, 2022
- transformPatternsNormaliseInListSingleElementToComparison
- transformPatternsMergeInLists
- Support the feature in ParserCLI
lukaseder added a commit that referenced this issue Jan 19, 2022
- transformPatternsMergeRangePredicates
lukaseder added a commit that referenced this issue Jan 19, 2022
lukaseder added a commit that referenced this issue Jan 19, 2022
- transformPatternsMergeAndComparison
lukaseder added a commit that referenced this issue Jan 20, 2022
- transformPatternsMergeInLists (merge also Eq with InList)
@lukaseder
Copy link
Member Author

The various edge cases that arose from testing the transformations around hyperbolic functions have shown that it is important to separate canonicalisation/normalisation steps from merging steps. For example

(EXP(2 * x) + 1) / (EXP(2 * x) - 1) => COTH(x)

That's the formula on wikipedia, but it is equivalent to these:

  • (1 + EXP(2 * x)) / (EXP(2 * x) - 1) => COTH(x) (1 + x could be seen as more "canonical" than x + 1)
  • (1 + EXP(x)) / (EXP(x) - 1) => COTH(x / 2) (the 2 * x formula makes recognising the pattern harder)

Much more thought will need to go into this. We're hardly breaking new grounds here, these symbol transformation algorithms must have been studied well already.

@lukaseder
Copy link
Member Author

The status quo will ship in jOOQ 3.17. More work in jOOQ 3.18 here: #13593

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant