What limits usability of collections? #8403

mvdbeek · 2019-08-02T13:22:57Z

I would like to see wider adoption of collections among Galaxy users, and I'm trying to understand how we can improve collections (or the documentation surrounding collections) to reach that goal. So if you've tried them recently and were stuck at any point and felt that going back to regular datasets was the better option please let us know!

ThomasWollmann · 2019-08-02T13:30:40Z

Visual feedback for tool outputs that go into collections is unsatisfactory: you cannot tell whether the collection content is just being fetched or the results still have to be computed. Features such as job logs and rerun are also not available.
Handling of collections in the tool wrappers is very cumbersome, since discover_datasets and datatypes are messy.
Collection manipulation: bgruening/docker-jupyter-notebook#11
Support for complex dataset filtering would be nice.
Support for sending subsets of nested collections to a workflow would be awesome.
https://toolshed.g2.bx.psu.edu/repository/view_repository?sort=name&operation=view_or_manage_repository&id=c30c030673c90378 does not support creation of nested collections from zipped files. I do not know how to fix it, since I cannot get the "recurse" option to work.

mmiladi · 2019-08-02T13:52:13Z

Collections do not scale well. My browser has difficulty opening and searching large collections.
Collections create too many (twice as many? unnecessary?) hidden entries. Histories in "show hidden" mode are super lengthy and unmanageable. A "show hidden but not the single collection entries" button would be great! :)
Restarting an entry from an output collection: there seems to be no way to automatically switch between restarting a single entry and restarting the whole collection (at least as far as I know).
Different tags for different entries of a collection are not shown.
A sustainable and reliable way to make a collection out of a zipped input would be great!

hexylena · 2019-08-02T13:53:15Z

Failed collections which contain no datasets do not offer a way to view or report errors.

ThomasWollmann · 2019-08-02T13:54:07Z

> A sustainable and reliable way to make a collection out of a zipped input would be great!

Check out: https://toolshed.g2.bx.psu.edu/repository/view_repository?sort=name&operation=view_or_manage_repository&id=c30c030673c90378

bernt-matthias · 2019-08-06T10:42:51Z

For me the problem is often the automatic naming of data sets, which mainly comes from the fact that with multiple="true" collections are not treated as collections but as a bag of data sets. Take for instance the following history that I created in the (short) mothur tutorial from the GTN.

Here the list "168 sub.sample shared" has been chosen as input to "179 collapse collection", but the name of the data set says "Collapse collection on data set 171", and data set 171 is hidden in the history. So the connection between the steps is obfuscated.

This has been raised here:

#7467
#7392

Problems are at least documented now:

galaxyproject/planemo#930

In addition, I don't see a way to find out which data sets are contained in a collection. For instance, the single member of 179 has the name "0.03", which is data set 171. If this data set number were shown in the collection display, the connection would be clear(er).

bernt-matthias · 2019-08-06T20:59:12Z

But for sure collections are a good step in the right direction. The alternative is a flat history with no structure (apart from the linear structure corresponding to creation time).

Maybe collections just should be used more -- in particular nested collections. For instance, the output of each tool could be a collection containing all the other outputs. The tools of the stacks suite do something in this direction, but without nesting. On the other hand the user then needs to click more to get to the desired information.

birnbera · 2019-08-12T17:40:57Z

At the moment, it seems collections are primarily for grouping inputs and outputs vis a vis a mapped workflow situation. I am more interested in using collections to group outputs together for easy download as a single unit. Currently, it doesn't seem like you can apply data filters to elements of a collection (as opposed to the whole collection itself), which makes sense for a mapped workflow situation, but not so much for simply grouping outputs. Also, the documentation on filtering output collections vs output data appears to have mistakenly duplicated the docs for filtering data outputs only.

hexylena · 2019-08-15T12:51:48Z

Some UX bugs I found during my trawl through old issues, mostly around "missing the standard edit/view/bug icons of normal datasets".

nsoranzo · 2019-10-07T16:06:10Z

Re-run of dataset collection elements via API: Rerun error'ed dataset / job using bioblend API bioblend#277

simonbray · 2020-03-19T12:26:28Z

Not being able to change the datatype for all datasets in a collection. It's not even possible to write a script with the API, which would be an acceptable second option. This is just torture, please focus on fixing these bread-and-butter issues because it really affects usability.

simonbray · 2020-03-19T13:50:51Z

~~Downloading all files in a collection (can be done with bioblend).~~

bernt-matthias · 2020-03-19T13:52:57Z

> Downloading all files in a collection (can be done with bioblend).

This is already possible in the web UI. Open the collection. Then you see a little download symbol at the top.

simonbray · 2020-03-19T13:56:09Z

Thanks @bernt-matthias, you are right. I wish it was more visible, though.

simonbray · 2020-04-14T15:55:41Z

Collections cannot be published in the same way that datasets can. Published datasets can be downloaded via /datasets/{id}/download; this doesn't work for collections.

(The download button mentioned in the comments above points to /api/dataset_collections/{id}/download (/api/histories/{hid}/contents/dataset_collections/{id}/download seems to work fine too) but these require API authentication so are not helpful. Yes, you can probably log in, import the history and download, but IMO 'public' means 'available to everyone, not just people with Galaxy accounts'.)

If I missed something here, please let me know, it would be very helpful.
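To make the endpoints mentioned above concrete, here is a minimal sketch that only builds the three URLs. The server address `https://galaxy.example.org/` and the function names are hypothetical placeholders, and the comments about authentication simply restate what is reported in this comment rather than documented behaviour.

```python
from urllib.parse import urljoin

# Hypothetical server address; replace with a real Galaxy instance.
GALAXY_URL = "https://galaxy.example.org/"


def dataset_download_url(dataset_id: str) -> str:
    """Path reported above to work for published datasets."""
    return urljoin(GALAXY_URL, f"datasets/{dataset_id}/download")


def collection_download_url(collection_id: str) -> str:
    """Path the collection download button points to.

    Reported above to require API authentication.
    """
    return urljoin(GALAXY_URL, f"api/dataset_collections/{collection_id}/download")


def history_collection_download_url(history_id: str, collection_id: str) -> str:
    """Alternative path via the history contents ({hid} in the text above).

    Also reported above to require API authentication.
    """
    return urljoin(
        GALAXY_URL,
        f"api/histories/{history_id}/contents/dataset_collections/{collection_id}/download",
    )


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    print(dataset_download_url("DATASET_ID"))
    print(collection_download_url("COLLECTION_ID"))
    print(history_collection_download_url("HISTORY_ID", "COLLECTION_ID"))
```

Only the first form is usable by someone without a Galaxy account, which is the gap described here: there is no equivalent unauthenticated path for a collection.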

mvdbeek added the planning, help wanted (also "hacktoberfest", beginner friendly set of issues), and area/dataset-collections labels Aug 2, 2019

hexylena added status/planning and removed planning labels Aug 22, 2019

hexylena mentioned this issue Apr 20, 2020: UX user feedback #9630 (Open)

mvdbeek mentioned this issue May 21, 2020: Allow tool submission with DatasetCollectionElements / make build_for_rerun consumable by API #9802 (Merged)

bernt-matthias mentioned this issue Apr 13, 2021: tags can not be set in collections #8553 (Closed)

hexylena mentioned this issue Jun 8, 2021: Failed collections which contain no datasets do not offer a way to view or report errors #12106 (Open)
