When running desi_proc --batch on an arc exposure with a single camera, the generated batch script has the wrong number of MPI ranks leading to a specex wrapper failure:
(for simplicity I dropped the full path to desi_proc and the long --timing-file option)
Note that the generated script has 11 ranks instead of 21. Running it causes:
...
Traceback (most recent call last):
  File "/global/common/software/desi/cori/desiconda/20200801-1.4.0-spec/code/desispec/0.48.0/bin/desi_proc", line 7, in <module>
    proc.main(args)
  File "/global/common/software/desi/cori/desiconda/20200801-1.4.0-spec/code/desispec/0.48.0/lib/python3.8/site-packages/desispec/scripts/proc.py", line 464, in main
    desispec.scripts.specex.run(comm,cmds,args.cameras)
  File "/global/common/software/desi/cori/desiconda/20200801-1.4.0-spec/code/desispec/0.48.0/lib/python3.8/site-packages/desispec/scripts/specex.py", line 336, in run
    sc = Schedule(fitbundles,comm=comm,njobs=len(cameras),group_size=group_size)
  File "/global/common/software/desi/cori/desiconda/20200801-1.4.0-spec/code/desispec/0.48.0/lib/python3.8/site-packages/desispec/workflow/schedule.py", line 67, in __init__
    raise Exception("can't have group_size larger than world size - 1")
Exception: can't have group_size larger than world size - 1
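To make the failure concrete, here is a minimal sketch of the size check implied by the exception message. It is not the desispec `Schedule` code. The helper names are hypothetical, and `group_size = 20` is an assumption chosen only because it is consistent with the expected 21 ranks (20 workers plus one extra rank); the traceback does not state the actual value.

```python
def min_ranks(group_size: int) -> int:
    """Smallest MPI world size that satisfies group_size <= world_size - 1."""
    return group_size + 1


def check_world_size(world_size: int, group_size: int) -> None:
    """Raise if the condition named in the exception message is violated.

    Hypothetical stand-in for the check at schedule.py line 67; the real
    Schedule class may do more than this.
    """
    if group_size > world_size - 1:
        raise Exception("can't have group_size larger than world size - 1")


GROUP_SIZE = 20  # assumption: not given in the issue, inferred from "21 ranks"

check_world_size(21, GROUP_SIZE)      # the intended rank count passes
try:
    check_world_size(11, GROUP_SIZE)  # the generated rank count fails
except Exception as err:
    print(f"11 ranks: {err}")
```

Under that assumption, any single-camera batch script needs at least `min_ranks(GROUP_SIZE)` ranks for the specex step to start.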
@marcelo-alvarez please update the bookkeeping and test desi_proc --batch --nosubmit ... with various combinations of single cameras, N>1 random individual cameras, complete spectrographs, and all spectrographs to confirm that it generates the intended number of ranks.
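The camera combinations requested above could be enumerated with something like the sketch below. It only builds the camera lists; it does not call desi_proc or know the correct rank count for each case. The camera naming (arms `b`, `r`, `z` on spectrographs 0 to 9) and the helper names are assumptions for illustration.

```python
import random

ARMS = "brz"               # assumed camera arms
SPECTROGRAPHS = range(10)  # assumed spectrograph numbers 0..9


def all_cameras():
    """Every camera name, e.g. b0, r0, z0, b1, ..."""
    return [f"{arm}{sp}" for sp in SPECTROGRAPHS for arm in ARMS]


def camera_test_cases(seed=0, n_random=5):
    """One camera list per category named in the request above."""
    rng = random.Random(seed)  # seeded so the random picks are repeatable
    cams = all_cameras()
    return {
        "single camera": [rng.choice(cams)],
        "N>1 random cameras": sorted(rng.sample(cams, n_random)),
        "complete spectrograph": [f"{arm}0" for arm in ARMS],
        "all spectrographs": cams,
    }


for label, cameras in camera_test_cases().items():
    print(f"{label}: {len(cameras)} cameras -> {','.join(cameras)}")
```

Each list can then be passed to a `desi_proc --batch --nosubmit ...` run and the rank count in the generated script compared with the intended value.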