-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mega QP context and entrypoint overhaul. Reduce QP hairiness by 30% #35465
Conversation
… not around [ci skip]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite a PR!
I think the changes make a lot of sense. Dropping context
for a handful of dynamic vars feels like a win, and dropping unnecessary async is a huge win.
Just some small edits and suggestions.
(let [qp (ee.qp.perms/check-download-permissions | ||
(fn [query _rff] | ||
query))] | ||
(qp query qp.reducible/default-rff))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't gone looking for other examples of this pattern, but it seems like the identity transformation here is likely to recur and might deserve a name in eg. qp.reducible
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't think of too many other places we do this, maybe in a few other tests but in that case it probably belongs in test tooling. I'll try to look around for more
modules/drivers/bigquery-cloud-sdk/src/metabase/driver/bigquery_cloud_sdk.clj
Show resolved
Hide resolved
was executed in QueryExecution and other places. It is considered bad form for middleware to change its behavior | ||
based on this information, don't do it!" | ||
;; TODO - this schema is somewhat misleading because if you use a function | ||
;; like [[metabase.query-processor/userland-query]] some of these keys (e.g. `:context`) are in fact required |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is :context
the best example of something actually-required now? I'm not sure if this is the same context as the context
parameter getting dropped from many QP functions and replaced with dynamic vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah if things weren't confusing enough already, we had the :context
key which was the context the query was getting executed in, like :pulse
or :dashboard
or whatever. It is actually required to save a QueryExecution
(u/minutes->ms | ||
(if config/is-prod? | ||
20 | ||
3))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be going in the opposite direction we took (or wanted to take) with the API stuff: there we said we would prefer fewer dynamic vars, didn't we?
Maybe we can implement a function that we could call when debugging that would give the values of the "context" vars and also say if they have been redefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah in general less dynamic variables is a win, but I think sometimes there is still a good use case for them. I played around with this a lot and in this case having a few dynamic vars makes it a simpler than passing around that context map in like 100 different functions. Maybe we can revisit it in the future tho, I want to get this PR in before it gets any more merge conflicts, maybe following on with more (smaller) tweaks makes sense
Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>
@camsaul Did you forget to add a milestone to the issue for this PR? When and where should I add a milestone? |
The overall goal of this PR is to simplify the surface area of the QP and make its behavior a little more obvious and easy to understand, clearly separate out different stages of the QP pipeline, and to improve performance by eliminating unneeded core.async nonsense.
Major changes:
(qp/process-userland-query query)
are replaced with(qp/process-query (qp/userland-query query))
qp.setup
code does all the stuff needed to run any part of the QP -- resolve Database ID, set up QP store, binddriver/*driver*
, bind database-local settings, etc. This is used at the top level and also in substeps likepreprocess
; duplicate calls no-op so there is no perf hitqp.preprocess
; this can now be called completely independently of the larger QPqp.postprocess
qp.compile
qp.execute
core.async
promise channel and blocking until that channel got results from the background thread. Eliminate completely unneeded and hard-to-followcore.async
nonsenseqp.reducible
qp.schema
namespace with Malli schemas for a handful of things. Add typechecking around more QP stuff. This caught a bunch of bugs already e.g.info
maps accidentally being passed in where we meant to pass incontext
(qp/process-query (mt/mbql-query :venues))
now takes 8.6ms on average on my computer compared to 10.2, which is 18% faster. Not sure 2ms is going to make a huge difference for anything besides test H2 databases, but when we do it ten thousand times it might speed up the test suite a nice amount