This repository has been archived by the owner on Jan 28, 2021. It is now read-only.
Right now, all analyzer rules have the same priority and are executed in the order they were added to the analyzer. This makes it difficult to insert foreign rules (as gitquery will do) at a specific position so that they run before certain other rules.
For example, let's imagine a rule to squash inner joins. It needs the tables resolved, so it has to run after "resolve_tables". But it also has to run before the pushdown, because after that the tree won't be transformed again.
Instead of knowing the rule order by heart and relying on an index that may change to insert a rule in a specific place, we should implement a few phases.
For example:
    type AnalyzerPhase int

    const (
    	// ResolutionPhase is the phase in which all unresolved nodes are resolved.
    	ResolutionPhase AnalyzerPhase = iota
    	// PostResolutionPhase is the phase in which all nodes are already resolved and they can
    	// be changed and rearranged with the certainty everything is resolved and before the
    	// tree is optimized.
    	PostResolutionPhase
    	// OptimizationPhase is the phase in which optimizations to improve query performance
    	// are applied.
    	OptimizationPhase
    )
Then we could insert rules in any of these phases, knowing what state the tree is in.
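To make the idea concrete, here is a rough sketch of how rules could be registered per phase. Everything except the three phase constants is hypothetical: `Rule`, `Analyzer`, `AddRule`, `Analyze` and `traceRule` are illustrative names, not the current analyzer API, and a plain string stands in for the plan tree.

```go
package main

// AnalyzerPhase identifies the stage in which a rule runs.
type AnalyzerPhase int

const (
	ResolutionPhase AnalyzerPhase = iota
	PostResolutionPhase
	OptimizationPhase
)

// Rule is a simplified, hypothetical rule. A string stands in for the
// plan tree so the sketch stays self-contained.
type Rule struct {
	Name  string
	Apply func(node string) string
}

// Analyzer keeps one ordered list of rules per phase (hypothetical type).
type Analyzer struct {
	phases map[AnalyzerPhase][]Rule
}

// NewAnalyzer returns an analyzer with no rules registered.
func NewAnalyzer() *Analyzer {
	return &Analyzer{phases: make(map[AnalyzerPhase][]Rule)}
}

// AddRule appends a rule to the given phase. The caller only needs to know
// the phase, not the position of any other rule.
func (a *Analyzer) AddRule(phase AnalyzerPhase, r Rule) {
	a.phases[phase] = append(a.phases[phase], r)
}

// Analyze runs the phases in their declared order and, within each phase,
// the rules in insertion order.
func (a *Analyzer) Analyze(node string) string {
	for phase := ResolutionPhase; phase <= OptimizationPhase; phase++ {
		for _, r := range a.phases[phase] {
			node = r.Apply(node)
		}
	}
	return node
}

// traceRule builds a rule that only records its name on the node, which
// makes the execution order visible.
func traceRule(name string) Rule {
	return Rule{
		Name:  name,
		Apply: func(node string) string { return node + "/" + name },
	}
}
```

With something like this, a foreign rule such as the join squash would be added with `AddRule(PostResolutionPhase, ...)` and would run after "resolve_tables" and before the pushdown, regardless of the order in which the three rules were registered.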
I agree. We should separate PreAnalysis, Analysis, PostAnalysis, Optimization, PostOptimization, Planning and PostPlanning (or some subset). That would better match the theory and practice of how to implement a SQL engine.
The bad thing about the squash rule is that it is physical planning (which is supposed to run after optimization), but it interacts with logical optimization. Fortunately, since we don't really have decoupled logical and physical plans, it could just run in PostAnalysis.
Maybe we can do what Spark does and have a Batch type that allows you to execute a group of rules n times. This would give us more options for creating different phases, even for implementors of the library.
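A minimal sketch of what such a Batch could look like, assuming a simplified `Rule` type over strings. All names here (`Batch`, `Iterations`, `Run`, `doubleNegation`) are hypothetical, and stopping early once the node no longer changes is an assumption loosely modelled on Spark's fixed-point batches, not something decided in this thread.

```go
package main

import "strings"

// Rule is a simplified, hypothetical rule over a string that stands in
// for the plan tree.
type Rule struct {
	Name  string
	Apply func(node string) string
}

// Batch is a named group of rules that is executed up to Iterations times.
type Batch struct {
	Name       string
	Iterations int
	Rules      []Rule
}

// Run applies every rule of the batch in order, repeating the whole group
// until Iterations is reached or a full pass leaves the node unchanged.
func (b Batch) Run(node string) string {
	for i := 0; i < b.Iterations; i++ {
		prev := node
		for _, r := range b.Rules {
			node = r.Apply(node)
		}
		if node == prev {
			break // fixed point reached, further passes would be no-ops
		}
	}
	return node
}

// doubleNegation removes a single "NOT NOT " pair per application, so it
// needs several passes to fully simplify deeply nested negations.
func doubleNegation() Rule {
	return Rule{
		Name: "double_negation",
		Apply: func(node string) string {
			return strings.Replace(node, "NOT NOT ", "", 1)
		},
	}
}
```

A phase could then simply be a Batch with `Iterations: 1`, while optimization batches could use a higher limit so that rules which enable each other get another chance to fire.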
Thoughts? @smola @jfontan @ajnavarro