desi_proc single exposure script #837

Merged
merged 6 commits into from Oct 28, 2019

Conversation

sbailey
Contributor

@sbailey sbailey commented Oct 25, 2019

@julienguy please test this, since you seem quite good at immediately finding the problems that I missed... :)

This PR adds a work-in-progress script desi_proc to simplify processing a single exposure independent of other exposures and outside of the constraints of the pipeline (e.g. it works even if there aren't daily arcs/flats).

It looks for data in $DESI_SPECTRO_DATA/{night}/{expid}/desi-{expid}.fits.fz (or --input filename) and outputs to $DESI_SPECTRO_REDUX/$SPECPROD/{night}/{expid}/preproc/ and exposures/ following the same directory structure as the full pipeline (currently not overridable).
Examples are in /global/project/projectdirs/desi/spectro/redux/sjbailey/
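
The input lookup described above can be sketched in a few lines. This is only an illustration, not the code in this PR: the function name `find_raw_file`, its parameters, and the zero-padding of the exposure ID to 8 digits are assumptions.

```python
import os

def find_raw_file(night, expid, input_file=None, data_root=None, expid_width=8):
    """Return the raw data path for one exposure (hypothetical helper).

    An explicit input_file (the --input case) takes precedence; otherwise
    the default path under $DESI_SPECTRO_DATA is built.
    """
    if input_file is not None:
        return input_file
    if data_root is None:
        data_root = os.environ["DESI_SPECTRO_DATA"]
    # Assumption: EXPID is zero-padded on disk; pass expid_width=0 to disable.
    exp = str(expid).zfill(expid_width)
    return os.path.join(data_root, str(night), exp, f"desi-{exp}.fits.fz")
```

Passing `data_root` explicitly makes the helper usable without the environment variable set, e.g. `find_raw_file(20191024, 20672, data_root="/data")`.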

It selects the steps to perform based upon the OBSTYPE, which can be overridden with the --obstype option.

Any steps that already have their outputs are skipped, so while debugging you can just keep re-running it and it will pick up where it left off.
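
A minimal sketch of that skip-if-done pattern, assuming each step can name its output files up front. `run_if_needed` and its arguments are hypothetical names, not the script's actual API.

```python
import os

def run_if_needed(step, outputs, action):
    """Run action() unless every file in outputs already exists.

    Returns True if the step ran and False if it was skipped.
    """
    if outputs and all(os.path.exists(path) for path in outputs):
        print(f"skipping {step}: outputs already exist")
        return False
    action()
    return True
```

Because the check is on files rather than on in-memory state, a step that crashed before writing its output is simply retried on the next run.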

In general you should give it either 20 ranks (1 node) or 60 ranks (2 nodes) per spectrograph.
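
One way to picture the 20-versus-60 rank choice is as splitting the ranks into groups of about 20 and giving each group its own cameras. The sketch below is plain arithmetic under that assumption; `assign_cameras` is a hypothetical helper and the script's real logic may differ.

```python
def assign_cameras(rank, size, cameras, ranks_per_group=20):
    """Return (cameras handled by this rank's group, rank within the group).

    With size=60 and 3 cameras there are 3 groups, one camera each, all
    running at once. With size=20 there is a single group that works
    through all cameras one after another.
    """
    ngroups = max(1, min(len(cameras), size // ranks_per_group))
    group_size = size // ngroups
    # Leftover ranks (when size is not a multiple) join the last group.
    group = min(rank // group_size, ngroups - 1)
    subrank = rank - group * group_size
    # With mpi4py, comm.Split(color=group, key=rank) would create the
    # matching sub-communicator; it is omitted to keep this runnable anywhere.
    return cameras[group::ngroups], subrank
```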

Examples run from an interactive session:

#- process night 20191024 exposure 20662 (ARC)
#- Use 60 ranks so that it processes 3 cameras in parallel with 20 ranks each
time srun -n 60 -c 2 desi_proc -n 20191024 -e 20662 --mpi

#- for a FLAT, the walltime is dominated by the serial fiberflat step,
#- so we just give it 20 ranks and it does 3 extractions in a row and
#- then 3 fiberflats in parallel 
time srun -n 20 -c 2 desi_proc -n 20191024 -e 20672 --mpi 

#- override the OBSTYPE for earlier data with no OBSTYPE
#- and misleading FLAVOR
time srun -n 20 -c 2 desi_proc -n 20191022 -e 19980 --obstype flat --mpi

Directly submitting to the realtime queue from a login node
(only works for pre-approved users):

time srun -n 20 -c 2 -C haswell -t 10:00 --qos realtime \
    desi_proc -n 20191022 -e 19980 --obstype flat --mpi

I suggest that we fix anything that is outright broken in this PR and save other features for additional PRs. Top of my list for additional features:

  • if OBSTYPE=SKY and no skies in the fibermap, make up some sky fibers and proceed with sky modeling
  • separate the logging to different files for different tasks (like the full pipeline)
  • Add --batch option to generate and submit a batch job instead of blocking while running.
  • Move into desispec/py/scripts/ and general structural cleanup

Current steps:

             SCIENCE  SKY  TWILIGHT  ARC  FLAT  ZERO/DARK
preproc      yes      yes  yes       yes  yes   yes
traceshift   yes      yes  yes            yes
psf                                  yes
extract      yes      yes  yes            yes
fiberflat                                 yes
sky          not yet...
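
The step selection can be sketched as a lookup keyed on OBSTYPE, with the --obstype value taking precedence over the header. This sketch assumes psf applies to ARC only, fiberflat to FLAT only, and traceshift/extract to SCIENCE, SKY, TWILIGHT and FLAT. The names `STEPS` and `select_steps` are hypothetical.

```python
# Assumed mapping from OBSTYPE to processing steps; see the caveat above.
STEPS = {
    "SCIENCE":  ["preproc", "traceshift", "extract"],
    "SKY":      ["preproc", "traceshift", "extract"],
    "TWILIGHT": ["preproc", "traceshift", "extract"],
    "ARC":      ["preproc", "psf"],
    "FLAT":     ["preproc", "traceshift", "extract", "fiberflat"],
    "ZERO":     ["preproc"],
    "DARK":     ["preproc"],
}

def select_steps(header_obstype, override=None):
    """Return (obstype, steps); an explicit override beats the header value."""
    obstype = (override or header_obstype or "").strip().upper()
    if obstype not in STEPS:
        raise ValueError(f"unknown or missing OBSTYPE: {obstype!r}")
    return obstype, list(STEPS[obstype])
```

Raising on an unknown OBSTYPE, rather than guessing, matches the case where older data has no usable header value and the override is required.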

@tskisner
Member

Just a note: if you are running this inside a "real" production directory, you will need to do:

desi_pipe sync

afterwards for this info to get into the database. Also, I assume that you know you can do:

desi_pipe tasks --night <blah> --expid 12345 --tasktypes psf | desi_pipe run --nersc cori_haswell ....

Remember we designed these tools to allow manual running of single exposures. If these need to be tweaked, we should make those changes...

@tskisner
Member

I had some offline discussion with @sbailey, and understand better the reason for this tool (i.e. we don't even have fibermaps that we can use to set up a production in the first place).

@sbailey
Contributor Author

sbailey commented Oct 26, 2019

We don't have fibermaps in the raw data stream yet, so we can't "desi_pipe create" a new production with the standard tools to be able to proceed with "desi_pipe tasks" etc. And we also have data that we want to process that don't have corresponding afternoon arcs/flats to make the psfnight and fiberflat night, so we also needed a way to just use the $DESI_SPECTRO_CALIB versions. I considered adding these special cases to the standard pipeline, but in the end wanted to focus on something lightweight that meets the special cases of current needs without also trying to support long term scaling.

Contributor

@julienguy julienguy left a comment


Runs fast at Kitt Peak. That's great. Example on desi-11.

export OMP_NUM_THREADS=1 
time mpirun -n 20 desi_proc -n 20191026 -e 21528 --cameras r3 --mpi
...
Summary of completion times:
  start      Mon Oct 28 10:52:01 2019
  init       Mon Oct 28 10:52:03 2019 (0.0 min)
  preproc    Mon Oct 28 10:52:11 2019 (0.1 min)
  traceshift Mon Oct 28 10:52:25 2019 (0.2 min)
  extract    Mon Oct 28 10:53:30 2019 (1.1 min)
  picksky    Mon Oct 28 10:53:31 2019 (0.0 min)
  sky        Mon Oct 28 10:54:06 2019 (0.6 min)
  done       Mon Oct 28 10:54:06 2019 (0.0 min)

real	2m5.398s
user	39m59.555s
sys	1m32.372s
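
The per-step minutes in the summary are the gaps between consecutive timestamps (for example 10:52:25 to 10:53:30 is 65 s, shown as 1.1 min). A sketch of producing such a summary follows; `summarize` is a hypothetical helper, not the script's own code, and the exact formatting is an assumption.

```python
from datetime import datetime

def summarize(timestamps):
    """Format (step, datetime) pairs, given in order, as summary lines.

    Each line after the first shows the minutes elapsed since the
    previous step finished.
    """
    lines = []
    previous = None
    for step, when in timestamps:
        stamp = when.strftime("%a %b %d %H:%M:%S %Y")
        if previous is None:
            lines.append(f"  {step:10s} {stamp}")
        else:
            minutes = (when - previous).total_seconds() / 60
            lines.append(f"  {step:10s} {stamp} ({minutes:.1f} min)")
        previous = when
    return lines
```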

But there are some bugs:

  • specifying several cameras at once fails
time mpirun -n 20 desi_proc -n 20191026 -e 21528 --cameras b3,r3,z3 --mpi
...
OSError: extension not found: b3,r3,z3 (case insensitive)
  • ARC exposures fail when running without MPI (with MPI there is a different error downstream, because specex has to be recompiled with the latest version)
time desi_proc -n 20191027 -e 21828 --cameras r3 
...
INFO:calibfinder.py:202:__init__: Found data version V20191001 for camera r3 in /software/datasystems/desi_spectro_calib/trunk/spec/sp3/r3.yaml
WARNING:desi_proc:253:<module>: fitting PSFs without MPI parallelism; this will be SLOW
ERROR:util.py:47:runcmd: missing input r3
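
The OSError in the first bug suggests the whole string "b3,r3,z3" was treated as a single extension name. That is an inference from the message, not something confirmed in this thread. A sketch of splitting the option value before use, with `parse_cameras` as a hypothetical helper and the b/r/z-plus-digit naming check as an assumption:

```python
def parse_cameras(value):
    """Split a --cameras style string such as "b3,r3,z3" into a list."""
    cameras = [cam.strip().lower() for cam in value.split(",") if cam.strip()]
    for cam in cameras:
        # Assumed naming: arm letter (b, r or z) followed by one digit.
        if len(cam) != 2 or cam[0] not in "brz" or not cam[1].isdigit():
            raise ValueError(f"unrecognized camera name: {cam!r}")
    return cameras
```

Each element of the returned list can then be looked up as its own extension, so a single camera and a comma-separated list go through the same code path.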

@julienguy
Contributor

After specex update, PSF fit on r camera takes 3 min on one KPNO node.

time mpirun -n 20 desi_proc -n 20191027 -e 21828 --cameras r3 --mpi
...
real	3m4.123s
user	60m20.409s
sys	0m48.772s

@sbailey
Contributor Author

sbailey commented Oct 28, 2019

Thanks for testing; bugs fixed and pushed; please retest.

Contributor

@julienguy julienguy left a comment


Now working fine. Merge when you want.

@sbailey sbailey merged commit 9506125 into master Oct 28, 2019
@sbailey sbailey deleted the desiproc branch October 28, 2019 22:26
sbailey added a commit that referenced this pull request Oct 28, 2019