Skip to content

Reuse existing columns as discriminators#1247

Merged
szeiger merged 3 commits into
masterfrom
tmp/issue-1241
Aug 24, 2015
Merged

Reuse existing columns as discriminators#1247
szeiger merged 3 commits into
masterfrom
tmp/issue-1241

Conversation

@szeiger
Copy link
Copy Markdown
Member

@szeiger szeiger commented Aug 20, 2015

Plus some more performance improvements. Fixes #1241.

- In `expandSums` we now check for a suitable existing column in the
  type of the join side to expand, and use this as the discriminator
  instead of generating an additional column that always leads to a
  subquery in SQL ("select 1, ..."). Suitable columns must be of a
  primitive non-Option type. Primary keys are preferred over other
  fields and those over computed columns. A conditional expression in
  the wrapping Bind converts the column to the proper discriminator
  type `Option[Int]`.

- These conditional expressions are eliminated in `hoistClientOps` if
  they occur at the top level, passing the substitute discriminator
  directly to the client side.

- A more flexible `IsDefinedResultConverter` is used for reading
  discriminator columns of any Option type (only NULL vs non-NULL
  matters). JdbcProfile uses a specialized, fused version.

- Computing the new StructNodes in `flattenProjections` now removes
  duplicate definitions, except at the top level (where the projection
  has to match the translated top-level type) and directly below a
  Union (where both sides have to match up).

- `hoistClientOps` can now pull operations out of the non-Option sides
  of joins which is required for avoiding subqueries in the case of
  nested outer joins.

Tests in NewQuerySemanticsTest.testNewFusion, UnionTest.testBasicUnions.
Fixes #1241.
An important take-away from these improvements is that it is generally
cheaper to broadly discard Path types when a NominalType has been
invalidated and rely on inference to rebuild them instead of performing
expensive type transformations.

Even with the extra transformations for the improved discriminator
column handling, performance is now >30% better than in 3.0.
@szeiger
Copy link
Copy Markdown
Member Author

szeiger commented Aug 20, 2015

This could still use some peephole optimization of the 3VL shenanigans that can remain in the join condition when using nested outer joins:

select s4."id", s4."a", s4."b"
from "A_NEWFUSION" s5
left outer join "A_NEWFUSION" s6
on s5."id" = s6."id"
left outer join "A_NEWFUSION" s4
on (case when ((case when (s6."id" is null) then null else 1 end) = 1) then s6."b" else null end) = s4."b"

- A new compiler phase `optimizeScalar` performs the required local
  optimizations to eliminate unnecessary null checks arising from
  outer joins after `expandSums`.

- `expandSums` now keeps track of paths referenced within OptionApply
  nodes in Join and Filter predicates. These are preferred over other
  fields when picking a discriminator column for an outer join. The
  goal is to avoid NVL2 checks of the discriminator column in Join and
  Filter conditions (where this could prevent the use of an index).
  In the long term this may not be enough. A more complex alternative
  would be to pick *all* possible discriminators in `expandSums` and
  narrow them down to the best one in a later phase on a case by case
  basis.
@szeiger
Copy link
Copy Markdown
Member Author

szeiger commented Aug 21, 2015

Done.

select s4."id", s4."a", s4."b"
from "A_NEWFUSION" s5
left outer join "A_NEWFUSION" s6
on s5."id" = s6."id"
left outer join "A_NEWFUSION" s4
on s6."b" = s4."b"

szeiger added a commit that referenced this pull request Aug 24, 2015
Reuse existing columns as discriminators
@szeiger szeiger merged commit f109739 into master Aug 24, 2015
@szeiger szeiger deleted the tmp/issue-1241 branch August 24, 2015 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant