New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MBQL Refactor: Combine various Field clauses into one new clause #14897
Conversation
It's easy enough to "upgrade" the old-style clauses to the new clause on the backend -- I have this all working. I still need to update a lot of QP middleware to handle the new clause and update a lot tests as well -- there are about 400 backend tests left to update (which would take another day or so) |
Co-authored-by: Paul Rosenzweig <paulrosenzweig@users.noreply.github.com>
@paulrosenzweig RE sniffing stuff: there's two new methods in the new |
@paulrosenzweig I added normalization coming out of the DB for all Card & DashboardCard |
The JSON-serialized field refs might trip this up. Column formatting and pivot table partitions are two places where that happens. It's probably worth having tests around some of those cases specifically. I expect some saved questions out in the wild will never be updated and rely on this migration continuing to work for years. |
;; this is a separate function so we can use the same tests for DashboardCards as well | ||
(defn test-visualization-settings-normalization [f] | ||
(testing "visualization settings should get normalized to use modern MBQL syntax" | ||
(testing "Field references in column settings" | ||
(doseq [[original expected] {[:ref [:field-literal "foo" :type/Float]] | ||
[:ref [:field "foo" {:base-type :type/Float}]] | ||
|
||
[:ref [:field-id 1]] | ||
[:ref [:field 1 nil]] | ||
|
||
[:ref [:expression "wow"]] | ||
[:ref [:expression "wow"]]} | ||
;; also check that normalization of already-normalized refs is idempotent | ||
original [original expected] | ||
;; frontend uses JSON-serialized versions of the MBQL clauses as keys | ||
:let [original (json/generate-string original) | ||
expected (json/generate-string expected)]] | ||
(testing (format "Viz settings field ref key %s should get normalized to %s" | ||
(pr-str original) | ||
(pr-str expected)) | ||
(f | ||
{:column_settings {original {:currency "BTC"}}} | ||
{:column_settings {expected {:currency "BTC"}}})))) | ||
|
||
(testing "Other MBQL field clauses" | ||
(let [original {:map.type "region" | ||
:map.region "us_states" | ||
:pivot_table.column_split {:rows [["datetime-field" ["field-id" 807] "year"]], | ||
:columns [["fk->" ["field-id" 805] ["field-id" 808]]], | ||
:values [["aggregation" 0]]}} | ||
expected {:map.type "region" | ||
:map.region "us_states" | ||
:pivot_table.column_split {:rows [[:field 807 {:temporal-unit :year}]] | ||
:columns [[:field 808 {:source-field 805}]] | ||
:values [[:aggregation 0]]}}] | ||
(f original expected))) | ||
|
||
(testing "Don't normalize non-MBQL arrays" | ||
(let [original {:graph.show_goal true | ||
:graph.goal_value 5.9 | ||
:graph.dimensions ["the_day"] | ||
:graph.metrics ["total_per_day"]}] | ||
(f original original))) | ||
|
||
(testing "Don't normalize key-value pairs in maps that could be interpreted as MBQL clauses" | ||
(let [original {:field-id 1}] | ||
(f original original))))) | ||
|
||
(deftest normalize-visualization-settings-test | ||
(test-visualization-settings-normalization | ||
(fn [original expected] | ||
(mt/with-temp Card [card {:visualization_settings original}] | ||
(is (= expected | ||
(db/select-one-field :visualization_settings Card :id (u/the-id card)))))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paulrosenzweig yup, I already added tests for normalization of serialized Card/DashboardCard field references as keys in viz settings here (specifically the json/generate-string
part at the top). Are the serialized keys used anywhere else besides visualization_settings.column_settings
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the serialized keys used anywhere else besides visualization_settings.column_settings?
Sorry slow responding to that. I had thought both conditional formatting and pivot tables did, but I just checked and neither do! So thankfully I guess it's just column settings.
@paulrosenzweig this is green no so I'm going to go ahead and merge it in so we don't end up in merge-conflict hell since the PR is so big. Once it's up on stats I'll try to go test things and make sure everything is working properly and add more normalization test cases/fixes as needed |
Part of the #14057 wish list. This change has been on my wish list for years now, but I have an actual reason to do it -- implementing #9934 is super difficult without being able to attach arbitrary info for Field clauses.
Background
The original version of MBQL (known as the "Structured Query" language at the time, which was a little confusing) just referred to all fields by integer ID. e.g.
That made clauses like this ambiguous:
To clear up ambiguity and to let you filter Fields against Fields, we added the
:field-id
clause to make it explicit that you were referring to a Field, not an Integer:We soon added a couple of new types of Field clauses,
:fk->
and:datetime-field
, both of which wrap the original:field-id
:So far things were still pretty reasonable, the worst you'd have to do is something like
Things started to get out-of-hand IMO when we continued to add more Field clause types that just wrapped everything else. We added
:binning-strategy
:then
:field-literal
, to refer to a Field from a nested native source query:then we added
:joined-field
to specify the Table you're explicitly joining against that is the source of a Field:This ends up getting really hairy when you combine things. This is an actual real clause you can use right now:
And even with all of those clauses, we still can't pass around arbitrary extra information in a way that would make it easy to implement performance optimizations. If we ever try to add another new clause, the whole house of cards is going to come crashing down.
The proposal
Combine all of the Field clauses into a single new
:field
clauses with the schema[:field id-or-name options-map]
Here are some before & after examples: