Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Improved Collection and Workflow State with Applications #5013
This is a fairly expansive reworking on the concept of state for both workflows and dataset collections and many new features and fixes these state changes allow. These features include displaying more information with less noise to users of collections and making both collections and workflows simultaneously more correct and more efficient.
(Sorry for the unwieldy PR in terms of both scope and code lines - I've broken many smaller PRs off this branch but the core changes each involve modifying the same database migration repeatedly and so are inter-dependent despite seemingly being seemingly disjoint.)
The main features are as follows and described in detail below.
Display collection state in the GUI.
To do this I implement a model abstraction called
With the addition of
We don't have the connections to properly produce this data efficiently for existing collections - so legacy collections (collections created prior to 18.01) will have the improved display of collection counts and such but will not have proper gray, yellow, green, red states. New collections however will.
Optimize dataset collection APIs and UX interactions.
Element collection counts used to be relatively expensive to compute and not included in the summary collection API view so the history panel would wait for the history contents to render and then would pause a little and issue a separate AJAX request for the collection counts and populate them in the GUI. I've reworked the collection models so this information is persisted (this was easy since it is only set once and in one thread ever) and then I included this information in the summary API. This allows the eliminate of that separate request from the GUI and allows rendering more information about collections in the GUI more quickly.
Progress toward #3939.
Hide outputs of collection mapping immediately.
When executing tools (ad-hoc and in workflows) create "mapped over" collections ahead of time instead of at the end only if they are valid and use this to track state of collection mapping progress so we can hide output datasets immediately.
I believe this should improve performance and prevent locks because we no longer have conflicting threads trying to write dataset state. The intermittent failures described in #1790 likely are causing undue contention at the database level as well.
Pre-creating these collections and the ImplicitCollectionJobs allows us to re-schedule workflows steps as discussed below.
Allow empty collections and empty collection mapping.
If one maps a tool over an empty collection, one should get an empty output collection for each of the tool's outputs.
This should have always been a valid operation, the code just wasn't quite structured right when the collection work was merged - there were too many places I was assuming the existence of at least one actual output or input that was mapped over to continue going. The improved state handling finally allowed me to do away with this and this reworks all the collection plumbing above the state level to handle empty collections appropriately.
This was used by real world CWL workflows in the GA4GH challenge - so I think that is good evidence this is the right thing to do and continued progress on the challenge and CWL will require this work.
Explicitly track workflow invocation outputs
This is modeled somewhat on job outputs, output datasets and output dataset collections are now tracked for each workflow invocation and exposed via the workflow invocation API. This required adding new tables (linked to
Previously one could imagine backtracking this information for simple tool steps via the
This makes it much clearer what a workflow is producing via the API at least and required for workflow testing and CWL from Planemo.
Explicitly track workflow invocation step outputs
Tracking the outputs of
Tracking these job outputs for
Rather then a
This transition involved fairly substantial changes to the workflow module interface. Any place a list of
This also fixes a bug (or implements a missing feature) where Subworkflow modules had no
Add concept of workflow invocation step state - allow re-scheduling same step multiple times.
This can be exercised by setting
If this doesn't fix all the memory issues - the rest of this state tracking will positions us to do things a lot more cleverly now - so we should be able to follow up with even more fixes.
Many tests covering workflow and collection applications.
In addition to various testing framework enhancements - this PR includes tests:
referenced this pull request
Nov 15, 2017
This was referenced
Nov 20, 2017
@martenson Sure - what would you prefer - any easy visual clues we could use to distinguish these?
Update: We discussed in person and gray is fine but I will switch this to use the circular spinning icon instead of the stop watch (like when waiting for tool search for instance).
Nov 30, 2017
1 of 7 checks passed
Thanks @jmchilton for the updates. I wish this PR would solve the problem I encountered sub-workflow scheduling :-)
Today I did some tests using the docker-galaxy built on the dev branch
This is a feature, the client is lazy-loading the collection contents.
This sounds like a bug but I am unable to replicate.
@mmiladi Thanks a bunch for the testing!
I was going to ping you and ask you to check after the next release :( - I swear this solves some serious problems with subworkflows but clearly there are still issues. If you get that stack trace I'll definitely take a look. These subworkflow issues are tricky because it is hard to determine if it is a problem with how the GUI is constructing the workflow or how the backend is running it.
Yes - the UI renders all the datasets and collections first and then fetches the collection states after a bit of delay when it has more information. Hopefully the job status bar indicates that it is fetching this information. This keeps the initial rendering quick (in theory much quicker than before for very large collections since we don't need to grab the full collection contents to put the initial box on the screen - this probably isn't noticeable for small collections though). I could probably fetch the more detailed job information with less of a delay if it is disorienting - I think it is like two seconds or something currently but that isn't needed I suppose - it is kind of a remanent of how collection element counts used to be fetched .
Well that is just a flat out bug I think - nothing that was green should go back to yellow I don't think - really odd. I'll try to recreate it.
@martenson @jmchilton : Thanks much for your quick reply! The 2-seconds delay is not annoying, just was not sure if it's a feature or a bug :) Also I am running a cloud instance that might be slower than usual hardware resources being used for galaxy servers.
With the 2nd visualization error: This happens after rerunning the terminated workflow in the same history. So it might be dependent on the termination issue.
@jmchilton Thanks again! Actually Bjoern recommended to test this branch and see if sub-workflow issue is resolved :) . With this branch I receive streams of HTTP log messages in the console, from the webserver I guess. So it's a bit difficult to catch error messages. Could you please mention the log files that I should first check?
Ok finally I managed to get full log, this seems to be the error:
@jmchilton The tool that fails here (cmsearch) has dynamic number of input fields. By default there is a single input, but here 'covariance models from history' is selected instead of the "locally installed cm files". This would dynamically add a second input pin to the tool.
Ahhhh - I think I get it now. I bet optional output
@jmchilton Thanks much for the help!
The error log should be this one:
added a commit
this pull request
Dec 6, 2017
referenced this pull request
Dec 6, 2017
@mmiladi Thanks so much for the stack traces - this is really helpful!
Are you in a position to update the running Galaxy code and let me know if it fixes this problem? I have a vague sense that this f2b584f might fix the problem but I need to think about it more before actually opening a PR to fix it. I'd like to be able to re-create this in my testing also so I need to do some more work.