Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BE: add based_on_upload to GET /api/collections/:collection/items response #38322

Merged
merged 22 commits into from
Feb 5, 2024

Conversation

calherries
Copy link
Contributor

@calherries calherries commented Jan 31, 2024

epic
product doc
eng doc

This PR implements the BE part of milestone 1 of the epic.

It adds a :based_on_upload key to the response of the GET /api/collections/:id/items and GET /api/collections/root/items endpoints.

The :based_on_upload key is only added for items that correspond to models, not questions.

The :based_on_upload is either:

  1. the ID of the underlying table if the model can be uploaded to
  2. missing if the model cannot be uploaded to

It reuses the same logic as #37052 to derive the :based_on_upload key.

calherries and others added 4 commits January 3, 2024 16:05
* update upload function to take arguments as an object

* upload to model from context menu

* add e2e tests for appends

* update unit tests

* clean up

* fix types

* ellipsify status title

* don't restrict appends to models on the frontend

* move upload button out of context menu

* automatically show upload error modal

* update tests

* better data fetching and selecting
Copy link

replay-io bot commented Jan 31, 2024

StatusComplete ↗︎
Commit269ac86
Results
⚠️ 1 Flaky
2268 Passed

Base automatically changed from appends/milestone-0-model-page to master February 1, 2024 19:09
@calherries calherries added the no-backport Do not backport this PR to any branch label Feb 2, 2024
@calherries calherries requested a review from a team February 2, 2024 18:08
(let [table-ids (->> models
;; as an optimization, we might already know that the table is not an upload
;; if is_upload=false. We can skip making more queries if so
(remove #(false? (:is_upload %)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in which case we benefit from this optimization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code path from the collection items endpoint. I'll add this to the comment

(:id t)))
tables)))))

(defn- no-joins?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(defn- no-joins?
(defn- query-no-joins?

IMO, if the object that a helper function operates on is not the main object of the namespace, we should prefix it.

For example, it doesn't give much context if you call (upload/no-joins ...) since upload doesn't have a join property by itself.

Copy link
Contributor Author

@calherries calherries Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see much benefit of applying that rule for private helper functions like this one which are only used in one function. I would agree with your example though, no one should use uploads/no-joins? without renaming it appropriately or moving it to an appropriate namespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on not needing a prefix for a private method (implicit in the arg name), but I do find this a strange name for a predicate. Having a has-joins? method that just gets negated at the call site feels a bit clearer to me.

[:table_id [:maybe ms/PositiveInt]]
;; is_upload can be provided for an optional optimization
[:is_upload {:optional true} [:maybe :boolean]]]]]
(let [table-ids (->> models
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these lets binding are not aligned, and I noticed you no longer do that for some of your PRs. Have we stopped going that :( IMO, it aids readability by a good amount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still align let bindings by default but sometimes break it if I think it leads to significantly more new lines. I can align this one though and can tend more towards aligning things in the future

(contains? uploadable-table-ids (:table_id model))
(no-joins? query))
(lib/source-table-id query)))))]
(map #(m/assoc-some % :based_on_upload (based-on-upload %))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why assoc-some ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought it was cleaner. Why not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect based_on_upload to be a boolean key after calling model-hydrate-based-on-upload.

it'd be inconsistent if, for a list of models, some will have the key, and some are not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed it feels a little surprising. Using assoc here would mean based_on_upload would be not present for cards, but based_on_upload is present for models, which feels equally inconsistent to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly prefer using assoc here, as I think having the key uniformly present for all models is easier to reason about.

For a regular card we know the field is not relevant, but for models a response without the key could either be an implicit false, or just a response from an older version of the server. This is why it's good in general to have explicit nulls in server responses, since checking which keys were optional for a given past version will be painful when working on or providing support for old versions.

src/metabase/upload.clj Outdated Show resolved Hide resolved
calherries and others added 2 commits February 5, 2024 10:20
Co-authored-by: Ngoc Khuat <qn.khuat@gmail.com>
Copy link
Contributor

@qnkhuat qnkhuat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, LGTM

@calherries calherries merged commit 8201e54 into master Feb 5, 2024
105 checks passed
@calherries calherries deleted the csv-appends-be-based-on-upload-collection branch February 5, 2024 16:31
Copy link

github-actions bot commented Feb 5, 2024

@calherries Did you forget to add a milestone to the issue for this PR? When and where should I add a milestone?

Comment on lines +412 to +416
(map #(update % :dataset_query (comp mbql.normalize/normalize json/parse-string)))
upload/model-hydrate-based-on-upload
(map (fn [dataset_query row]
(assoc row :dataset_query dataset_query))
dataset_queries)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "normalize then undo" pattern could be made clearer IMHO. I see this has already shipped, so I'll send another PR for it.

(defn uploadable-table-ids
"Returns the subset of table ids where the user can upload to the table."
[table-ids]
(if (empty? table-ids)
Copy link
Contributor

@crisptrutski crisptrutski Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does toucan / honeysql not have a fast "no query" return path for [:in []] queries? It just seems a bit sad to hand roll this optimization 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they don't

[table-ids]
(if (empty? table-ids)
#{}
(let [tables (t2/hydrate (t2/select :model/Table :id [:in table-ids]) :db)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this would have been clearer with a threading macro:

(->> (t2/select ..)
      t2/hydrate
      (keep ...)
      set)

"Batch hydrates `:based_on_upload` for each item of `models`. Assumes each item of `model` represents a model."
[models :- [:sequential [:map
;; query_type and dataset_query can be null in tests, so we make them nullable here.
;; they should never be null in production
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How sure are we really that it's never null in production?

I'd love to have a backlog item to backfill and make non-nullable, and update all the tests.

;; as an optimization when listing collection items (GET /api/collection/items),
;; we might already know that the table is not an upload if is_upload=false. We
;; can skip making more queries if so
(remove #(false? (:is_upload %)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this relevant to the discussion about assoc versus assoc-with, or would there still be call paths that never populate the key for models?

based-on-upload (fn [model]
(when-let [dataset_query (:dataset_query model)] ; dataset_query is sometimes null in tests
(let [query (lib/->pMBQL dataset_query)]
(when (and (= (name (:query_type model)) "query")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be "failing closed" for nil query_type? (= (name (:query_type model "query")) "query")

(remove #(false? (:is_upload %)))
(keep :table_id)
set)
uploadable-table-ids (set (uploadable-table-ids table-ids))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method already returns a set.

(keep :table_id)
set)
uploadable-table-ids (set (uploadable-table-ids table-ids))
based-on-upload (fn [model]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it this case it would be more natural to have (for [m models] ...) in the body below this let, than to define this method to call in a map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-backport Do not backport this PR to any branch .Team/BackendComponents also known as BEC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants