Dataset not visible in a user's history (compressed/uncompressed, same hid) #17220

Closed
mira-miracoli opened this issue Dec 20, 2023 · 8 comments

@mira-miracoli
Contributor

Describe the bug
We got a support request from a user who ran out of storage and cannot delete two large datasets because they are not shown in the history, regardless of the filter combination.
The datasets are listed in the storage manager, and you can also find them under User→Datasets, but when you then click "view in history" they do not appear (filters: deleted:any visible:any hid:61).
What does appear is the compressed version, which has the same hid.
I looked the datasets up in our database using:

gxadmin query q "select dataset_id, d.uuid, name, hid, visible, d.deleted, d.purged, d.file_size  \
from history_dataset_association inner join dataset d on dataset_id = d.id where history_id = xxxxx and d.purged = 'f'"

The datasets showed up as follows:

 dataset_id |                                name                                | hid | visible | deleted | purged |  file_size
------------+--------------------------------------------------------------------+-----+---------+---------+--------+--------------
...
  xxxxxxx55 | Concatenate datasets on data 53, data 57, and data 55 uncompressed |  61 | f       | f       | f      | 166396579020
  xxxxxxx58 | Concatenate datasets on data 63, data 65, and data 64 uncompressed |  66 | f       | f       | f      | 166396579020

Galaxy Version and/or server at which you observed the bug
Galaxy Version: 23.1_europe
Commit: 5af8ba0

To Reproduce
Steps to reproduce the behavior:
Not sure how.
The user reported that two jobs were running; one finished, and the other was paused because the user was at ~200% of their disk quota.
Galaxy shows the following message in the paused job:

Execution of this dataset's job is paused because you were over your disk quota at the time it was ready to run

Expected behavior
The dataset is shown in the history and can be deleted and purged;
both the compressed and the uncompressed versions are visible, and the uncompressed version can be deleted.
Screenshots

Screenshot from 2023-12-20 09-06-26-censored
Screenshot from 2023-12-20 09-05-27-censored
Screenshot from 2023-12-20 16-18-21

Screenshot from 2023-12-20 16-18-39

@mvdbeek
Member

mvdbeek commented Apr 22, 2024

@ahmedhamidawan I think you fixed this in 84b6272

@mvdbeek mvdbeek closed this as completed Apr 22, 2024
@ahmedhamidawan
Member

> @ahmedhamidawan I think you fixed this in 84b6272

Yes @mvdbeek, thank you!

@mvdbeek
Member

mvdbeek commented Aug 2, 2024

Hmm, guess that wasn't all of it, here's a history with duplicate hids from an implicit conversion:
https://usegalaxy.org/u/marius/h/copy-of-human22chrsnps

Compare UI:
Screenshot 2024-08-02 at 19 22 13

with API https://usegalaxy.org/api/histories/c845ae1a2747ea06/contents
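
One way to spot this kind of mismatch is to scan the contents listing for hids that occur more than once. The following is a minimal TypeScript sketch, not Galaxy code: `ContentItem` and `findDuplicateHids` are hypothetical names, the sample data is made up, and it assumes each entry returned by the contents endpoint carries at least `id`, `hid`, and `name` fields.

```typescript
// Hypothetical shape: only the fields this sketch needs (assumed, not taken from Galaxy's schema).
interface ContentItem {
    id: string;
    hid: number;
    name: string;
}

// Return a map from hid to the items sharing it, keeping only hids used more than once.
function findDuplicateHids(items: ContentItem[]): Map<number, ContentItem[]> {
    const byHid = new Map<number, ContentItem[]>();
    for (const item of items) {
        const group = byHid.get(item.hid) ?? [];
        group.push(item);
        byHid.set(item.hid, group);
    }
    const duplicates = new Map<number, ContentItem[]>();
    byHid.forEach((group, hid) => {
        if (group.length > 1) {
            duplicates.set(hid, group);
        }
    });
    return duplicates;
}

// Made-up sample data for illustration only.
const sampleContents: ContentItem[] = [
    { id: "a", hid: 1, name: "variants" },
    { id: "b", hid: 1, name: "variants (converted)" },
    { id: "c", hid: 2, name: "report" },
];
const duplicateHids = findDuplicateHids(sampleContents);
console.log(Array.from(duplicateHids.keys()));
```

Any hid reported here has more entries in the API response than a one-item-per-hid UI can display.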

@ahmedhamidawan
Member

> Hmm, guess that wasn't all of it, here's a history with duplicate hids from an implicit conversion: https://usegalaxy.org/u/marius/h/copy-of-human22chrsnps

For that history, for example, I came up with a solution where we add an item.sub_items property to items in the historyItemsStore, which allows us to do something like this:

sub_items_in_history.mp4

I will open a PR tomorrow for this
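
The grouping idea can be sketched roughly as follows. This is an illustrative TypeScript sketch under assumptions, not the code from the actual change: `HistoryItem`, `GroupedItem`, and `groupByHid` are hypothetical names, and the rule that the first item seen for a hid becomes the top-level entry is an arbitrary choice made for the example. Only the `sub_items` property name comes from the description above.

```typescript
// Hypothetical item shape; real history items carry many more fields.
interface HistoryItem {
    id: string;
    hid: number;
    name: string;
}

// A top-level entry plus any other items that share its hid.
interface GroupedItem extends HistoryItem {
    sub_items: HistoryItem[];
}

// Collapse items with the same hid into one entry, preserving input order.
// Assumption: the first item encountered for a hid is treated as the top-level one.
function groupByHid(items: HistoryItem[]): GroupedItem[] {
    const primaries = new Map<number, GroupedItem>();
    const result: GroupedItem[] = [];
    for (const item of items) {
        const primary = primaries.get(item.hid);
        if (primary === undefined) {
            // Copy the item so the caller's objects are not mutated.
            const entry: GroupedItem = { ...item, sub_items: [] };
            primaries.set(item.hid, entry);
            result.push(entry);
        } else {
            primary.sub_items.push({ ...item });
        }
    }
    return result;
}

// Made-up sample data for illustration only.
const sampleItems: HistoryItem[] = [
    { id: "a", hid: 61, name: "dataset A" },
    { id: "b", hid: 61, name: "dataset A (other version)" },
    { id: "c", hid: 62, name: "dataset B" },
];
console.log(groupByHid(sampleItems).map((entry) => [entry.hid, entry.sub_items.length]));
```

Which item should be the top-level one is a separate decision; the first-seen rule here is only a stand-in for it.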

@mvdbeek
Member

mvdbeek commented Aug 8, 2024

This looks cool, but the original dataset is the one you'd want to see by default

@martenson
Member

How does it show in the toolform dataset input?

@ahmedhamidawan
Member

> How does it show in the toolform dataset input?

Oh that would show the "original" dataset, which is set based on what the api returns for the current filter...

ahmedhamidawan added a commit to ahmedhamidawan/galaxy that referenced this issue Aug 12, 2024
…id in history

This adds a `sub_items` array to history items when they are fetched in the `historyItemsStore`, so that if any history item has other related items with the same hid, they are pushed to this array, and can be shown in the history.

_Using an incomplete `COMPRESSED_EXTENSIONS` array here, there is probably a better way of confirming a dataset is compressed versus the original dataset (maybe using the backend?)_

Fixes galaxyproject#17220
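
To make the caveat about `COMPRESSED_EXTENSIONS` concrete, an extension-based check might look like the sketch below. It is hypothetical: the entries in the list are placeholders chosen for illustration and are not the list used in the commit, `DatasetSummary`, `isLikelyCompressed`, and `splitByCompression` are made-up names, and the `extension` field is an assumption about the item shape.

```typescript
// Placeholder list for illustration only; deliberately incomplete.
const COMPRESSED_EXTENSIONS: string[] = ["fastqsanger.gz", "fasta.gz", "vcf_bgzip"];

// Hypothetical item shape with only the fields this sketch needs.
interface DatasetSummary {
    id: string;
    hid: number;
    extension: string;
}

// True only when the extension appears in the hard-coded list above.
function isLikelyCompressed(item: DatasetSummary): boolean {
    return COMPRESSED_EXTENSIONS.indexOf(item.extension) !== -1;
}

// Partition items that share a hid into "listed as compressed" and everything else.
function splitByCompression(items: DatasetSummary[]): { compressed: DatasetSummary[]; other: DatasetSummary[] } {
    const compressed = items.filter((item) => isLikelyCompressed(item));
    const other = items.filter((item) => !isLikelyCompressed(item));
    return { compressed, other };
}

// Made-up sample data: the last extension is compressed but missing from the list.
const sameHidItems: DatasetSummary[] = [
    { id: "a", hid: 61, extension: "fastqsanger.gz" },
    { id: "b", hid: 61, extension: "fastqsanger" },
    { id: "c", hid: 61, extension: "fastqsanger.bz2" },
];
const split = splitByCompression(sameHidItems);
console.log(split.compressed.map((item) => item.id), split.other.map((item) => item.id));
```

The third sample item shows the weakness the commit message hints at: a compressed datatype that is not in the list is silently classified with the uncompressed items.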
ahmedhamidawan added a commit to ahmedhamidawan/galaxy that referenced this issue Aug 13, 2024
…id in history

@mira-miracoli
Contributor Author

Thank you for fixing this @ahmedhamidawan ❤️
