CU's fail on Stampede - no STDOUT no STDERR #620
Comments
Duplicate of your own #281? |
No, it is not, but in principle it could just as well be a triplicate, given the 127 open issues. |
So what are you reporting? That a unit fails? That all units fail? That there is no STDOUT and no STDERR? |
FWIW, the agent.err contains: 2015:05:07 15:07:37 16837 StageinWorker-0 radical.pilot.agent :
I would assume there is an input file missing?
99 little bugs in the code. 127 little bugs in the code... |
This file is missing because unit.000467 failed, which is an Amber MD run. Contents of unit.000467: |
How do you see that this unit failed? The DB records are gone, |
Thanks. Antons, can you please try to rerun it and see if this is reproducible? It seems you hit an internal error on the radical.utils layer. I don't understand it yet I'm afraid -- knowing if it is reproducible would help to triage it. Thanks... |
And the stacktrace:
|
Thank you for your input, gentlemen. If I may ask, apart from verifying that this issue is reproducible, is there any other reason to re-run? |
Mainly getting an intuition about occurrence frequency and more hints about where to instrument the code to get further debugging information |
An attempt to reproduce radical-cybertools/radical.pilot#620
Reproduced with RADICAL_PILOT_VERBOSE=debug SAGA_VERBOSE=debug RADICAL_VERBOSE=debug RADICAL_REPEX_VERBOSE=info |
Great! Would you mind giving us instructions on how exactly to run your code to reproduce it? Thanks! |
Sorry, logs and terminal output are missing. Will post soon. |
Antons, could you please try to add the following entries to your
Also, see above, could you provide instructions on how to run your code? Thanks! |
This was the US use case from repex. You can run it by doing this:
|
The proposed solution results in the following error:
|
Hi Antons -- we dropped the ball on this ticket. Does this still pop up for you? |
Optimistically closing due to inactivity (and also because the lease manager has seen change since then) |
Terminal output
agent.err
agent.log
agent.out