HTCondor CI is failing #568
Thanks @guillaumeeb ! Yes, this looks like the worker jobs are failing directly.
I certainly broke this with my changes, I guess because Otherwise, seeing stderr would help a lot, probably -- I remember that I had to do some manual downgrades of click for some Python versions. But I tried a number of setups, so I don't remember really...
Due to config read with `default=[]`, `env_extra` will not stay `None` but become an empty list. This resulted in a command template starting with a semicolon. Possibly related to dask#568
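To illustrate the failure mode described above, here is a minimal sketch. It is not the actual dask-jobqueue code: the helper names `build_command_buggy` and `build_command_fixed`, the `"; "` separator, and the worker command string are assumptions made up for this example. It only shows how a check against `None` lets an empty list through and produces a leading semicolon.

```python
def build_command_buggy(env_extra, worker_command):
    # Hypothetical helper: only guards against None, so an empty list
    # (e.g. from a config read with default=[]) still takes this branch.
    if env_extra is not None:
        return "; ".join(env_extra) + "; " + worker_command
    return worker_command


def build_command_fixed(env_extra, worker_command):
    # Hypothetical fix: a truthiness check treats None and [] the same way,
    # so no separator is emitted when there are no extra lines.
    if env_extra:
        return "; ".join(env_extra) + "; " + worker_command
    return worker_command


# Placeholder worker command, assumed for illustration only.
worker = "python -m distributed.cli.dask_worker tcp://scheduler:8786"

print(repr(build_command_buggy([], worker)))  # starts with "; "
print(repr(build_command_fixed([], worker)))  # just the worker command
```

With the truthiness check, `None` and `[]` both yield the bare worker command, while a non-empty list is still prepended as before.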
* Fix command template for empty `env_extra` in HTCondor. Due to config read with `default=[]`, `env_extra` will not stay `None` but become an empty list. This resulted in a command template starting with a semicolon. Possibly related to #568
* formatting
* reenable HTCondor CI workflow
* DEBUG: downgrade click
* DEBUG: condor_q + condor logs
* DEBUG: more output
* DEBUG
* DEBUG
* test_basic[HTCondorCluster]: use 2GiB memory
* DEBUG more output
* test_basic[HTCondorCluster]: use 500MiB memory
* adapt assertion to 500MiB
* DEBUG: revert all debugging stuff
* test_basic[HTCondorCluster]: use 500MiB memory
* Revert "DEBUG: downgrade click". This reverts commit 56e56d4.
* EMPTY to trigger CI
* Fix also test_extra_args_broken_cancel
* Re-add some debugging outputs for HTCondor CI just in case

Co-authored-by: Guillaume EB <g.eynard.bontemps@gmail.com>
Closed by #570.
I've spent some time trying to debug the HTCondor CI (see #562 (comment)).
Although it worked on my laptop, I have not been able to make it work in GitHub Actions, and I don't know why yet.
Some details:
From the last run here, we can see from the Cleanup sections and the `condor_history` command output that the HTCondor queuing system is working. The problem seems to come from the worker jobs, which complete really fast: the Dask workers never connect to the Scheduler. I guess we would need to see the jobs' stdout/stderr to debug the problem further.
cc @riedel @mivade @jolange.