Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move archived items to the trash #41543

Conversation

johnswanson
Copy link
Contributor

@johnswanson johnswanson commented Apr 17, 2024

This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Migration

Tables with a collection_id had ON DELETE SET NULL on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what SET NULL actually does.

Moving things to the Trash

When we get an API request to update the archived flag on a card, collection, or dashboard, we either move the item to the trash (if archived is true) or from the trash (if archived is false). We set the trashed_from_collection_id flag as appropriate, and use it when restoring an item if possible.

Permissions changes

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of perms-objects-set-for-parent-collection so that it returns the permission set for the collection the object was trashed from, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can write. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Collection API changes

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

/collection/tree

This endpoint still takes an exclude-archived parameter, which defaults to false. When it's false, we return the entire tree including the Trash collection and archived children. When it's true, we exclude the Trash collection (and its subtree) from the results.

/collection/:id/items

Previously, this endpoint took an archived parameter, which defaulted to false. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default archived to the archived status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

/collection/root/items

No change - this endpoint still takes an archived parameter. When archived=true, we return the Trash collection, as it is in the root collection. Otherwise, we don't.

@metabase-bot metabase-bot bot added the .Team/AdminWebapp Admin and Webapp team label Apr 17, 2024
Copy link

replay-io bot commented Apr 17, 2024

Status In Progress ↗︎ 51 / 52
Commit d36b9c3
Results
7 Failed
⚠️ 10 Flaky
2419 Passed

@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch from f1d82f5 to 026b05b Compare April 17, 2024 20:34
@johnswanson johnswanson force-pushed the create-trash-collection-on-startup branch 2 times, most recently from e112920 to 9f04779 Compare April 18, 2024 15:19
@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch 4 times, most recently from a2970a7 to 7729a80 Compare April 22, 2024 16:27
Base automatically changed from create-trash-collection-on-startup to make-trash-usable-feature-branch April 22, 2024 16:47
@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch 5 times, most recently from 57b4c12 to 957d582 Compare April 22, 2024 21:30
@johnswanson johnswanson changed the title Move archived collections to the trash Move archived items to the trash Apr 24, 2024
@johnswanson johnswanson force-pushed the make-trash-usable-feature-branch branch from 015ef6d to 5a4598d Compare April 26, 2024 19:07
@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch 5 times, most recently from 33828dd to 4e1c433 Compare April 30, 2024 11:33
@@ -220,7 +220,6 @@
:visualization-settings (:visualization_settings card)}
(and (= (:type card) :model) (seq (:result_metadata card)))
(assoc :metadata/model-metadata (:result_metadata card)))]
(api/check-not-archived card)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this hasn't ever actually checked anything, because we select the card with an itemized list of columns and :archived isn't one of them 😭

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check it, though? Ought we to add :archived to the columns we select from card?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the goals here was to allow people to view cards that had been archived, so I think we need to remove this check. I did add a similar check to the /public API, to prevent running public cards that had been archived - but generally speaking it should be allowable to process an archived card.

@johnswanson johnswanson force-pushed the make-trash-usable-feature-branch branch from 5a4598d to 7efc23b Compare April 30, 2024 20:31
When a collection is marked as `archived`, we:

- move the collection to the trash collection

- set `trashed_from_location` and `archived` appropriately on the
collection and all of its descendants.
When retrieving collection contents, we no longer need to exclude
archived items, because if we're not looking at the trash we're
guaranteed to not get archived things back, and if we *are* looking at
the trash we'll want to get archived things back.
@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch from f74d89a to 09f95b9 Compare April 30, 2024 20:32
It now:

- optionally returns both the Trash and Archived items (if `archived=true`)

- by default, returns neither
I'm not sure what I was thinking: we don't need to change the search API
right now. We can change it down the road if/when it's needed.
@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch from f8e1f1e to b195c5e Compare April 30, 2024 23:19
Copy link
Member

@noahmoss noahmoss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I mostly left nits; the only real potential bug I found was enforcing permissions for model index search results properly (not totally sure if it's an issue or not).

Stamping to unblock since I'm about to be OOO, but this is a lot to take in at once so wouldn't hurt if @escherize or someone else reviews too

Comment on lines 91 to 96
false [:and
[:not= :id collection/trash-collection-id]
[:not :archived]]
true [:or
[:= :id collection/trash-collection-id]
:archived])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason we still need to check the archived flag, or could we rely entirely on the presence of trash-collection-id to indicate whether something is archived?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check both - we're checking that the ID is the trash-collection-id (i.e. that it is actually the Trash collection itself). We need to separately check whether we're looking at a collection that was trashed.

(defn is-trash?
"Is this the trash collection?"
[collection]
(and (not (collection.root/is-root-collection? collection))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: doesn't hurt, but maybe not necessary? looks like the root collection ID is either nil or "root" (ugh) so it'll fail the ID check anyway

src/metabase/api/collection.clj Outdated Show resolved Hide resolved
src/metabase/api/collection.clj Show resolved Hide resolved
Comment on lines +1019 to +1021
(when-let [alerts (seq (pulse/retrieve-alerts-for-cards
{:card-ids (t2/select-pks-set Card :collection_id (u/the-id collection-before-update))}))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure I understand — what happens to items in a nested collection if the parent collection gets trashed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They all get trashed. There is an existing bug that you probably noticed here - we're only sending notifications for things one level down, directly in the trashed collection. We should fix this at some point but I just kept it as-is for now.

@@ -110,7 +110,7 @@
"Add a `WHERE` clause to the query to only return Collections the Current User has access to; join against Collection
so we can return its `:name`."
[honeysql-query :- ms/Map
collection-id-column :- keyword?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was there a reason to remove the schema here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's no longer a keyword, because sometimes it's [:coalesce ...] - but I just removed this "temporarily" and forgot to add back a new schema. Adding that back now, thanks for catching this!

@@ -216,6 +216,17 @@
{:arglists '([model search-context])}
(fn [model _] model))

(defn collection-id-for-table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use this in add-model-index-permissions-clause as well? It uses model.collection_id to enforce perms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we do, thanks for pointing this out!

Comment on lines 223 to 224
When we're not looking for collections, use `{my-table}.collection_id` or `{my-table}.trashed_from_collection_id`,
depending on if we're looking for archived items."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get what this is going for — trashed_from_collection_id will only be non-nil when something is in the trash, so we always can use it if it's set, and otherwise use collection_id.

But the wording here is a bit confusing without that knowledge. It seems to be implying that this function should be returning something different depending on the input. Maybe you could clarify that a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yes - I actually wrote this when I was imagining the implementation slightly differently and forgot to update it. Will change that!

src/metabase/models/collection.clj Outdated Show resolved Hide resolved
src/metabase/models/collection.clj Outdated Show resolved Hide resolved
@johnswanson johnswanson force-pushed the move-archived-collections-to-trash branch from d659492 to 9bd31bd Compare May 2, 2024 12:26
Copy link
Contributor

@escherize escherize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing major, but I had 1 question that I'm curious about double clicking on:


Should we keep using the '1337133x' pattern, or let the db determine the id for e.g. the trash collection?


(defn- model-name->toucan-model [model-name]
(case (keyword model-name)
:collection Collection
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we are trying to move away from using these vars, and should prefer the :model/Collection style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I'll switch these over.

Comment on lines 503 to 514
(api/column-will-change? :archived card-before-update card-updates)
(assoc :collection_id (cond
(:archived card-updates) collection/trash-collection-id

(api/column-will-change? :collection_id card-before-update card-updates)
(:collection_id card-updates)

:else
(:trashed_from_collection_id card-before-update))
:trashed_from_collection_id (when (:archived card-updates)
(:collection_id card-before-update)))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to (move it somewhere else and then) call the api.dashboard/move-on-archive-or-unarchive function here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming they're exactly the same (I think they are, but I'm not super sure).

@@ -49,15 +49,30 @@
"The fixed ID of the trash collection."
13371339)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this horse left the barn? Would it be better to let the db assign a db for the trash collection here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of think we should change this, but I don't think it's been decided yet. I might keep it like this for now and, if we do want to change it, do it in a followup PR, since this is going into a feature branch. WDYT?

@@ -327,7 +343,7 @@
([collection-ids :- VisibleCollections]
(visible-collection-ids->honeysql-filter-clause :collection_id collection-ids))

([collection-id-field :- :keyword
([collection-id-field :- [:or [:vector :keyword] :keyword] ;; `[:vector :keyword]` allows `[:coalesce :option-1 :option-2]`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it just literally [:coalesce :option-1 :option-2] that you wanna accept, in addition to vector?

if so you could do that with this:

[:tuple [:= :coalesce] :keyword :keyword]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like that.

will-be-trashed? (str/starts-with? new-location trash-path)]
(when will-be-trashed?
(throw (ex-info "Cannot `move-collection!` into the Trash. Call `archive-collection!` instead."
{})))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could return the collection in the exception, that might help track down a bug

:write collection-readwrite-path)]
:write collection-readwrite-path)
collection-id (if (and (contains? this :trashed_from_collection_id)
(contains? this :archived)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to figure out what this check does, but I think it's redundant with (:archied this)

  1. {... :archived true}

The (:archived this) call will be truthy

  1. {... :archived false}

The (:archived this) call will be falsey

  1. {...} ;; no archived

The (:archived this) call will be falsey


Am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! You're right, I'll take this out.

Comment on lines +46 to +47
(deferred-tru "trashed {0}" identifier)
(deferred-tru "untrashed {0}" identifier))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📤 ➡️ 🗑️

@@ -220,7 +220,6 @@
:visualization-settings (:visualization_settings card)}
(and (= (:type card) :model) (seq (:result_metadata card)))
(assoc :metadata/model-metadata (:result_metadata card)))]
(api/check-not-archived card)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check it, though? Ought we to add :archived to the columns we select from card?

@johnswanson johnswanson merged commit 52d8a65 into make-trash-usable-feature-branch May 3, 2024
104 of 107 checks passed
@johnswanson johnswanson deleted the move-archived-collections-to-trash branch May 3, 2024 20:24
sloansparger pushed a commit that referenced this pull request May 9, 2024
This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Tables with a `collection_id` had `ON DELETE SET NULL` on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what `SET NULL` actually does.

When we get an API request to update the `archived` flag on a card, collection, or dashboard, we either move the item *to* the trash (if `archived` is `true`) or *from* the trash (if `archived` is `false`). We set the `trashed_from_collection_id` flag as appropriate, and use it when restoring an item if possible.

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of `perms-objects-set-for-parent-collection` so that it returns the permission set for the collection the object was *trashed from*, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can *write*. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

This endpoint still takes an `exclude-archived` parameter, which defaults to `false`. When it's `false`, we return the entire tree including the Trash collection and archived children. When it's `true`, we exclude the Trash collection (and its subtree) from the results.

Previously, this endpoint took an `archived` parameter, which defaulted to `false`. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default `archived` to the `archived` status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

No change - this endpoint still takes an `archived` parameter. When `archived=true`, we return the Trash collection, as it is in the root collection. Otherwise, we don't.
johnswanson added a commit that referenced this pull request May 13, 2024
This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Tables with a `collection_id` had `ON DELETE SET NULL` on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what `SET NULL` actually does.

When we get an API request to update the `archived` flag on a card, collection, or dashboard, we either move the item *to* the trash (if `archived` is `true`) or *from* the trash (if `archived` is `false`). We set the `trashed_from_collection_id` flag as appropriate, and use it when restoring an item if possible.

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of `perms-objects-set-for-parent-collection` so that it returns the permission set for the collection the object was *trashed from*, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can *write*. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

This endpoint still takes an `exclude-archived` parameter, which defaults to `false`. When it's `false`, we return the entire tree including the Trash collection and archived children. When it's `true`, we exclude the Trash collection (and its subtree) from the results.

Previously, this endpoint took an `archived` parameter, which defaulted to `false`. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default `archived` to the `archived` status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

No change - this endpoint still takes an `archived` parameter. When `archived=true`, we return the Trash collection, as it is in the root collection. Otherwise, we don't.
johnswanson added a commit that referenced this pull request May 14, 2024
This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Tables with a `collection_id` had `ON DELETE SET NULL` on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what `SET NULL` actually does.

When we get an API request to update the `archived` flag on a card, collection, or dashboard, we either move the item *to* the trash (if `archived` is `true`) or *from* the trash (if `archived` is `false`). We set the `trashed_from_collection_id` flag as appropriate, and use it when restoring an item if possible.

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of `perms-objects-set-for-parent-collection` so that it returns the permission set for the collection the object was *trashed from*, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can *write*. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

This endpoint still takes an `exclude-archived` parameter, which defaults to `false`. When it's `false`, we return the entire tree including the Trash collection and archived children. When it's `true`, we exclude the Trash collection (and its subtree) from the results.

Previously, this endpoint took an `archived` parameter, which defaulted to `false`. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default `archived` to the `archived` status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

No change - this endpoint still takes an `archived` parameter. When `archived=true`, we return the Trash collection, as it is in the root collection. Otherwise, we don't.
johnswanson added a commit that referenced this pull request May 14, 2024
This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Tables with a `collection_id` had `ON DELETE SET NULL` on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what `SET NULL` actually does.

When we get an API request to update the `archived` flag on a card, collection, or dashboard, we either move the item *to* the trash (if `archived` is `true`) or *from* the trash (if `archived` is `false`). We set the `trashed_from_collection_id` flag as appropriate, and use it when restoring an item if possible.

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of `perms-objects-set-for-parent-collection` so that it returns the permission set for the collection the object was *trashed from*, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can *write*. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

This endpoint still takes an `exclude-archived` parameter, which defaults to `false`. When it's `false`, we return the entire tree including the Trash collection and archived children. When it's `true`, we exclude the Trash collection (and its subtree) from the results.

Previously, this endpoint took an `archived` parameter, which defaulted to `false`. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default `archived` to the `archived` status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

No change - this endpoint still takes an `archived` parameter. When `archived=true`, we return the Trash collection, as it is in the root collection. Otherwise, we don't.
sloansparger pushed a commit that referenced this pull request May 16, 2024
This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Tables with a `collection_id` had `ON DELETE SET NULL` on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what `SET NULL` actually does.

When we get an API request to update the `archived` flag on a card, collection, or dashboard, we either move the item *to* the trash (if `archived` is `true`) or *from* the trash (if `archived` is `false`). We set the `trashed_from_collection_id` flag as appropriate, and use it when restoring an item if possible.

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of `perms-objects-set-for-parent-collection` so that it returns the permission set for the collection the object was *trashed from*, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can *write*. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

This endpoint still takes an `exclude-archived` parameter, which defaults to `false`. When it's `false`, we return the entire tree including the Trash collection and archived children. When it's `true`, we exclude the Trash collection (and its subtree) from the results.

Previously, this endpoint took an `archived` parameter, which defaulted to `false`. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default `archived` to the `archived` status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

No change - this endpoint still takes an `archived` parameter. When `archived=true`, we return the Trash collection, as it is in the root collection. Otherwise, we don't.
sloansparger added a commit that referenced this pull request May 17, 2024
* Migration to add `trashed_from_*` (#41529)

We want to record where things were trashed *from* for two purposes:

- first, we want to be able to put things back if they're "untrashed".

- second, we want to be able to enforce permissions *as if* something is
still in its previous location. That is, if we trash a card or a
dashboard from Collection A, the permissions of Collection A should
continue to apply to the card or dashboard (e.g. in terms of who can
view it).

To achieve this, collections get a `trashed_from_location` (paralleling
their `location`) and dashboards/cards get a
`trashed_from_collection_id` (paralleling their `collection_id`).

* Create the trash collection on startup (#41535)

* Create the trash collection on startup

The trash collection (and its descendants) can't have its permissions
modified.

Note that right now, it's possible to move the Trash collection. I'll
fix this when I implement the API for moving things to the Trash.

* s/TRASH_COLLECTION_ID/trash-collection-id/g

* Add a comment to explain null comparison

* Move archived items to the trash (#41543)

This PR is probably too big. Hopefully documenting all the changes made here will make it easier to review and digest. So:

Tables with a `collection_id` had `ON DELETE SET NULL` on the foreign key. Since we only deleted collections in testing and development, this didn't really affect anything. But now that we are actually deleting collections in production, it's important that nothing accidentally get moved to the root collection, which is what `SET NULL` actually does.

When we get an API request to update the `archived` flag on a card, collection, or dashboard, we either move the item *to* the trash (if `archived` is `true`) or *from* the trash (if `archived` is `false`). We set the `trashed_from_collection_id` flag as appropriate, and use it when restoring an item if possible.

Because moving something to the trash by itself would have permissions implications (since permissions are based on the parent collection) we need to change the way permissions are enforced for trashed items.

First, we change the definition of `perms-objects-set-for-parent-collection` so that it returns the permission set for the collection the object was *trashed from*, if it is archived.

Second, when listing objects inside the Trash itself, we need to check permissions to make sure the user can actually read the object - usually, of course, if you can read a collection you can read the contents, but this is not true for the trash so we need to check each one. In this case we're actually a little extra restrictive and only return objects the user can *write*. The reasoning here is that you only really want to browse the Trash to see things you could "act on" - unarchive or permanently delete. So there's no reason to show you things you only have read permissions on.

Because previously the Collections API expected archived collections to live anywhere, not just in a single tree based in the Trash, I needed to make some changes to some API endpoints.

This endpoint still takes an `exclude-archived` parameter, which defaults to `false`. When it's `false`, we return the entire tree including the Trash collection and archived children. When it's `true`, we exclude the Trash collection (and its subtree) from the results.

Previously, this endpoint took an `archived` parameter, which defaulted to `false`. Thus it would only return non-archived results. This is a problem, because we want the frontend to be able to ask for the contents of the Trash or an archived subcollection without explicitly asking for archived results. We just want to treat these like normal collections.

The change made to support this was to just default `archived` to the `archived` status of the collection itself. If you're asking for the items in an archived collection, you'll only get archived results. If you're asking for the items in a non-archived collection, you'll only get unarchived results.

This is, for normal collections, the same thing as just returning all results. But see the section on namespaced collections for details on why we needed this slightly awkward default.

No change - this endpoint still takes an `archived` parameter. When `archived=true`, we return the Trash collection, as it is in the root collection. Otherwise, we don't.

* Make Trash Usable - UI (#41666)

* Remove Archive Page + Add `/trash` routing (#42226)

* removes archive page and related resources, adds new /trash route for trash collection, adds redirects to ensure consistent /trash routing instead of collection path

* fixes unit + e2e tests, corrects links generated for trash collection to use /trash over /collect/:trashId route

* updates comment

* Serialize trash correctly (#42345)

Also, create the Trash in a migration rather than on startup. Don't set a specific trash collection.id, instead just select based on the `type` when necessary.

* Fix collection data for trashed items (#42284)

* Fix collection IDs for trashed items

When something is in the trash, we need to check permissions on the
`trashed_from_collection_id` rather than the `collection_id`. We were
doing this. However, we want the actual collection data on the search
result to represent the actual collection it's in (the trash). I added
the code to do this, a test to make sure it works as intended, and a
comment to explain what we're doing here and why.

* Refactor permission for trashed_from_collection_id

Noah pointed out that the logic of "what collection do I check for
permissions" was getting sprinkled into numerous places, and it felt a
little scary. In fact, there was a bug in the previous implementation.
If you selected a collection with `(t2/select [:model/Collection ...])`,
selecting a subset of columns, and *didn't* include the
`trashed_from_collection_id` in that set of columns, then called
`mi/can-write?` on the result, you'd get erroneous results.
Specifically, we'd return `nil` (representing the root collection).

I think this is a reasonable fix, though I'm pretty new to using fancy
multimethods with `derive` and such. But basically, I wanted a way to
annotate a model to say "check these permissions using
`:trashed_from_collection_id` if the item is archived." And in this
case, we'll throw an exception if `:trashed_from_collection_id` is not
present, just like we currently throw an exception if `:collection_id`
is not present when checking permissions the normal way.

* Move existing archived items to the trash (#42241)

Move everything that's archived directly to the trash. It's not ideal
because we're wiping out the existing collection structure, but the
existing Archive page also has no collection structure.

* lint fixes from rebase

* Fix backend tests (#42506)

`can_write` on collection items was wrong because we didn't have a
`trashed_from_collection_id` - the refactor to ensure we didn't make
that mistake worked! 🎉

* Add an /api/collections/trash endpoint (#42476)

This fetches the Trash in exactly the same way as if we'd fetched it
with `/api/collection/:id` with the Trash ID. I hadn't realized that the
frontend was doing this with the previously hardcoded Trash ID.

* Make Trash Usable - Dynamic Trash Collection Id (#42532)

* wip

* fixes hardcoded reference to trash id in sidebar

* remove TRAHS_COLLECTION

* fixes line and e2e tests

* fix invert logic mistake and fixes lint

* Make Trash Usable - Search Filter UI (#42402)

* adds filtering for archived items on the search page

* fix typing mistake

* Make Trash Usable - Bug Bash 1 (#42541)

* disables reordering columns in archived questions, disables modifying archived question name in question header, collection picker no longer defaults to archived item, keeps trashed collection from appearing collection picker search results, shops showing empty area in trashed dashboard info sidebar, disables uploading csvs to trashed collections

* impls pr feedback

* fix e2e failure

* Localize the name of the Trash (#42514)

There are at least two spots where we can't just rely on the
after-select hook, and select the collection name directly from the
database: the Search and Collection API.

In these cases we need to call `collection/maybe-localize-trash-name`,
which will localize the name if the passed Collection is the Trash
collection.

* Update migration IDs

Migration IDs need to be in order.

* Fix failing mariadb:latest test (#42608)

Hooooly cow. Glad to finally track this down. The issue was that
booleans come back from MariaDB as NON BOOLEANS, and clojure says 0 is
truthy, and together that makes for FUN TIMES.

We need to turn MariaDB's bits to booleans before we can work with them.

* Make Trash Usable: Add `can_restore` to API endpoints (#42654)

Rather than having two separate implementations of the `can_restore`
hydration for cards and dashboards, I set up automagic hydration for the
`trashed_from_collection_id`. Then the hydration method for
`can_restore` can just first hydrate `trashed_from_collection` and then
operate based on that.

* fix can_restore for collections trashed from root

* Fix `trashed_from_location` for trashed subcols

We were setting `trashed_from_location` on subcollections to the
`trashed_from_location` of the ancestor that was trashed - which is
wrong.

This would have caused bugs where collections would be restored to the
root collection, regardless of where they were originally trashed from.

---------

Co-authored-by: Sloan Sparger <sloansparger@gmail.com>

* Fix Trash rollbacks (#42710)

* Fix Trash rollbacks

There were a few errors in my initial implementation.

First, we can't simply assume that `trashed_from_location` and
`trashed_from_collection_id` are set for anything in the archive. They
should be set for things *directly* trashed, but not for things trashed
as part of a collection.

Therefore, we need to set `location` and `collection_id` to something
like "if the item is in the trash, then the `trashed_from` - otherwise,
don't modify it".

Second, because `trashed_from_collection_id` is not a foreign key but
`collection_id` is, we have a potential problem. If someone upgrades,
Trashes a dashboard, then Trashes and permanently deletes the collection
that dashboard _was_ in, then downgrades, how do we move the dashboard
"back"? What do we set the dashboard's `collection_id` to?

The solution here is that when we downgrade, we don't actually move
dashboards, collections, or cards out of the Trash collection. Instead
we just make Trash a normal collection and leave everything in it.

* Make Trash Usable - Bug Bash 2 (#42787)

* wip

* fixes access to property on null value

* pr clean up

* more clean up

* Fix up tests

- permissions will check the archived from location. So recents need to
select :report_card.trashed_from_collection_id and
:dash.trashed_from_collection_id to satisfy mi/can-read?
- some commented out code with `(def wtf (mt/user-http-request
...))`. Restore tests.
- probably commented out because the recent views come back in an order
but we don't care about the order. And it was asserting against an
ordered collection. Just go from [{:id <id> :model <model>} ...] to a
map of {<id> <model>} and then our comparison works fine and is not
sensitive to order.

* put try/finally around read-only-mode

ensure the setting read-only-mode! to false happens in a finally
block. Also, set read-only-mode to false in

```clojure
(deftest read-only-login-test
  (try
    (cloud-migration/read-only-mode! true)
    (mt/client :post 200 "session" (mt/user->credentials :rasta))
    (finally
      (cloud-migration/read-only-mode! false))))
```

But worryingly, I think we have a lurking problem that I'm not sure why
we haven't hit yet. We run tests in parallel and then put all of the
non-parallel tests on the same main thread. And when one of these puts
the app in read-only-mode, the parallel tests will fail. HOPEFULLY since
they are parallel they won't be hitting the app-db necessarily, but
there could be lots of silly things that break.

---------

Co-authored-by: John Swanson <john.swanson@metabase.com>
Co-authored-by: dan sutton <dan@dpsutton.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
.Team/AdminWebapp Admin and Webapp team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants