desi_proc single exposure script #837
Just a note: if you are running this inside a "real" production directory, you will need to do:
afterwards for this info to get into the database. Also, I assume that you know you can do:
Remember we designed these tools to allow manual running of single exposures. If these need to be tweaked we should make those changes...
I had some offline discussion with @sbailey, and understand better the reason for this tool (i.e. we don't even have fibermaps that we can use to set up a production in the first place).
We don't have fibermaps in the raw data stream yet, so we can't "desi_pipe create" a new production with the standard tools to be able to proceed with "desi_pipe tasks" etc. And we also have data that we want to process that don't have corresponding afternoon arcs/flats to make the psfnight and fiberflat night, so we also needed a way to just use the $DESI_SPECTRO_CALIB versions. I considered adding these special cases to the standard pipeline, but in the end wanted to focus on something lightweight that meets the special cases of current needs without also trying to support long term scaling.
Runs fast at Kitt Peak. That's great. Example on desi-11.
export OMP_NUM_THREADS=1
time mpirun -n 20 desi_proc -n 20191026 -e 21528 --cameras r3 --mpi
...
Summary of completion times:
start Mon Oct 28 10:52:01 2019
init Mon Oct 28 10:52:03 2019 (0.0 min)
preproc Mon Oct 28 10:52:11 2019 (0.1 min)
traceshift Mon Oct 28 10:52:25 2019 (0.2 min)
extract Mon Oct 28 10:53:30 2019 (1.1 min)
picksky Mon Oct 28 10:53:31 2019 (0.0 min)
sky Mon Oct 28 10:54:06 2019 (0.6 min)
done Mon Oct 28 10:54:06 2019 (0.0 min)
real 2m5.398s
user 39m59.555s
sys 1m32.372s
But there are some bugs:
- several cameras at once fail
time mpirun -n 20 desi_proc -n 20191026 -e 21528 --cameras b3,r3,z3 --mpi
...
OSError: extension not found: b3,r3,z3 (case insensitive)
- error with ARC exposures when running without mpi (with mpi there is another error downstream because specex has to be recompiled with the latest version)
time desi_proc -n 20191027 -e 21828 --cameras r3
...
INFO:calibfinder.py:202:__init__: Found data version V20191001 for camera r3 in /software/datasystems/desi_spectro_calib/trunk/spec/sp3/r3.yaml
WARNING:desi_proc:253:<module>: fitting PSFs without MPI parallelism; this will be SLOW
ERROR:util.py:47:runcmd: missing input r3
After specex update, PSF fit on r camera takes 3 min on one KPNO node.
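The OSError above reads as if the whole string "b3,r3,z3" was used as a single extension name. A minimal sketch of splitting a comma-separated --cameras value into individual camera names follows. The helper name parse_cameras and its validation rule (one of b/r/z plus a single digit) are assumptions for illustration, not the fix that was actually pushed.

```python
def parse_cameras(arg):
    """Split a comma-separated camera string such as 'b3,r3,z3' into a list.

    Hypothetical helper: names are lowercased, surrounding whitespace and
    empty entries are dropped, and each name must be b/r/z plus one digit.
    """
    cameras = [c.strip().lower() for c in arg.split(",") if c.strip()]
    for cam in cameras:
        if len(cam) != 2 or cam[0] not in "brz" or not cam[1].isdigit():
            raise ValueError(f"unrecognized camera name: {cam!r}")
    return cameras


if __name__ == "__main__":
    print(parse_cameras("b3,r3,z3"))
```

Each element of the returned list can then be used as its own extension name instead of the unsplit string.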
Thanks for testing; bugs fixed and pushed; please retest.
Now working fine. Merge when you want.
@julienguy please test this, since you seem quite good at immediately finding the problems that I missed... :)
This PR adds a work-in-progress script desi_proc to simplify processing a single exposure independent of other exposures and outside of the constraints of the pipeline (e.g. it works even if there aren't daily arcs/flats).
It looks for data in $DESI_SPECTRO_DATA/{night}/{expid}/desi-{expid}.fits.fz (or --input filename) and outputs to $DESI_SPECTRO_REDUX/$SPECPROD/{night}/{expid}/preproc/ and exposures/ following the same directory structure as the full pipeline (currently not overridable).
Examples are in /global/project/projectdirs/desi/spectro/redux/sjbailey/
It selects the steps to perform based upon the OBSTYPE, which can be overridden with the --obstype option.
Any steps that already have their outputs are skipped so that while debugging you can just keep re-running it and it will pick up where it left off.
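For readers new to the layout, the path convention, the OBSTYPE-based step selection, and the skip-if-done behavior described above can be sketched roughly as follows. This is not the desi_proc source: the function names, the STEPS_BY_OBSTYPE table, and the "psf" step name are hypothetical, and expid is passed as an already-formatted string because any zero-padding is not specified here.

```python
import os

# Hypothetical table: step names for SCIENCE come from the timing summary in
# this thread; the grouping per OBSTYPE and the "psf" name are assumptions.
STEPS_BY_OBSTYPE = {
    "SCIENCE": ["preproc", "traceshift", "extract", "picksky", "sky"],
    "ARC": ["preproc", "psf"],
}


def raw_data_path(night, expid, data_root=None):
    """Build {data_root}/{night}/{expid}/desi-{expid}.fits.fz.

    data_root defaults to the DESI_SPECTRO_DATA environment variable.
    """
    if data_root is None:
        data_root = os.environ["DESI_SPECTRO_DATA"]
    return os.path.join(data_root, str(night), str(expid),
                        f"desi-{expid}.fits.fz")


def select_steps(header_obstype, override=None):
    """Pick the step list from OBSTYPE, letting an override take precedence."""
    obstype = (override or header_obstype).upper()
    return STEPS_BY_OBSTYPE[obstype]


def run_step(name, outputs, func):
    """Call func() unless every expected output file already exists."""
    if all(os.path.exists(path) for path in outputs):
        return "skipped"
    func()
    return "ran"


if __name__ == "__main__":
    print(raw_data_path(20191026, "21528", data_root="/data"))
    print(select_steps("SCIENCE"))
```

Because run_step only checks for the presence of output files, re-running after a failure resumes at the first step whose outputs are missing. A partially written output would be treated as complete in this sketch.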
In general you should give it either 20 ranks (1 node) or 60 ranks (2 nodes) per spectrograph.
Examples run from an interactive session:
Directly submitting to the realtime queue from a login node
(only works for pre-approved users):
I suggest that we fix anything that is outright broken in this PR and save other features for additional PRs. Top of my list for additional features:
Current steps: