Analyses fail randomly due to IndexError #2993
Comments
This error occurs 100% of the time with my local setup, ~50% of the time with Ilya's local setup, and 0% of the time with a CloudMan cluster and an AWS Refinery instance. I'm hesitant to write a fix for something that we haven't observed in our production environment, especially since I am able to reproduce this error reliably with my local setup. |
The code in question is broken by design. The different error rates are simply due to the arbitrary order of the items returned by the filter() function. It's just a matter of time before this error occurs in production. Also, it may have already caused some analysis outputs to be attached to the wrong inputs. |
After taking the time and testing this with a CloudMan cluster, the error is 100% reproducible and isolated to the @hackdna Your 50% success rate was because you were also running the What the error stems from is the fact that we hadn't updated the Rename DataSet field properly for the When this is done properly the order of that You can try this out locally with the following updated version of |
I did not use Test workflow: SPP analog. The IndexError is due to the value of the |
@hackdna are you able to reproduce the failure condition with the Workflow that I've provided? This error will not occur (nor do we care about the ordering of the mentioned filter) with properly annotated workflows. Specifically the |
Updating workflow annotations is useful but it does not fix fundamental problems with the code in _get_output_connection_to_analysis_result_mapping(). |
@hackdna That didn't answer my question. When we properly annotate our workflows (as we should be doing to provide more meaningful names for the derived results of our workflow runs; ref #2373), the ordering of the filter that you have mentioned does not matter. I believe that I have done my part to dive deeper and investigate the root cause of the issue that you're experiencing. It would be helpful if you could confirm that this behavior does not exist with the workflow I've provided you. If you still feel that this issue is critical, please feel free to present your findings in our next team meeting. |
The function in question
|
A Travis build just failed again due to this issue:
|
Steps to reproduce
Run an analysis using a test workflow
Observed behavior
Analysis completes successfully in Galaxy but fails randomly in Refinery with the following error message in the Celery log:
Expected behavior
No error
Notes
There is no default order for items returned by a filter() call.
Also, this may cause outputs to be added incorrectly to the provenance graph in case of analyses with more than one input.
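The two notes above can be illustrated with a small, self-contained sketch. It assumes that filter() refers to a query whose results come back in no guaranteed order; it does not reproduce the actual Refinery code. All names here (`OUTPUTS`, `RESULTS`, `fetch_unordered`, `pair_by_position`, `pair_by_key`) are hypothetical. Pairing two unordered result sets by position only works when both happen to arrive in the same order, while pairing by an explicit key is independent of order.

```python
import random

# Hypothetical stand-ins for workflow outputs and the analysis results
# that should be attached to them. Not taken from the Refinery code base.
OUTPUTS = [{"name": "peaks"}, {"name": "log"}, {"name": "report"}]
RESULTS = [
    {"name": "peaks", "file": "peaks.bed"},
    {"name": "log", "file": "run.log"},
    {"name": "report", "file": "report.html"},
]
EXPECTED = {"peaks": "peaks.bed", "log": "run.log", "report": "report.html"}


def fetch_unordered(rows, seed):
    """Return a shuffled copy of rows, mimicking a query with no explicit ordering."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def pair_by_position(outputs, results):
    """Fragile: correct only if both sequences happen to share the same order."""
    return {o["name"]: r["file"] for o, r in zip(outputs, results)}


def pair_by_key(outputs, results):
    """Order-independent: look each result up by an explicit key."""
    files_by_name = {r["name"]: r["file"] for r in results}
    return {o["name"]: files_by_name[o["name"]] for o in outputs}


if __name__ == "__main__":
    outs = fetch_unordered(OUTPUTS, seed=1)
    res = fetch_unordered(RESULTS, seed=2)
    print("by position:", pair_by_position(outs, res))
    print("by key:     ", pair_by_key(outs, res))
```

If the real code relies on a database query, adding an explicit ordering or matching on a stable identifier would be the analogous change, but which option fits depends on details not shown in this issue.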