Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major refactor of survey simulation #60

Merged
merged 55 commits into from
Nov 26, 2018
Merged

Major refactor of survey simulation #60

merged 55 commits into from
Nov 26, 2018

Conversation

dkirkby
Copy link
Member

@dkirkby dkirkby commented Nov 16, 2018

This version is a major refactoring of the code to take advantage of the refactoring in desisurvey 0.11.0 and simplify the simulation classes used to track survey statistics and exposure metadata.

Refer to the updated tutorial for an overview (written assuming that 18.12 is already available at NERSC).

@dkirkby
Copy link
Member Author

dkirkby commented Nov 16, 2018

The files owned by surveysim that probably need documented data models are:

  • Survey per-tile and per-night statistics written by surveysim.stats.SurveyStatistics.save
  • Simulated exposure metadata written bysurveysim.exposures.ExposureList.save

Something similar to these are required during operations, but might duplicate tracking that is already handled better by ICS. This should become clearer when we finalize the wrapping of desisurvey afternoon planning and next-tile selection into ICS applications. For now, these contain everything needed to save/restore simulation state and study simulation outputs.

@dkirkby
Copy link
Member Author

dkirkby commented Nov 16, 2018

Note to self:

@sbailey
Copy link
Contributor

sbailey commented Nov 16, 2018

Overall, looks good. One request that is needed for integration with the rest of the end-to-end simulation chain: this branch broke surveysim.util.add_calibration_exposures because of the fewer columns in the new exposures_surveysim.fits file compared to the table that used to be generated by Progress.get_exposures().

Closely related: Although these are derivable from the TILEID, MJD, ephemeris, etc they would be really convenient to retain pre-calculated in the exposures file: RA, DEC, EBMV, NIGHT, PASS, PROGRAM, FLAVOR, MOONALT, MOONFRAC, MOONSEP. I also don't see an obvious way to tell if an exposure was a twilight exposure without crossmatching the MJD to the ephemeris file (e.g. a SUNALT column).

Traceback:

In [1]: from surveysim.util import add_calibration_exposures
   ...: 

In [2]: explist = Table.read('exposures_surveysim.fits', 'EXPOSURES')

In [3]: e2 = add_calibration_exposures(explist)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-45c3483604ff> in <module>()
----> 1 e2 = add_calibration_exposures(explist)

/global/cscratch1/sd/sjbailey/desi/code/surveysim/py/surveysim/util.py in add_calibration_exposures(exposures, flats_per_night, arcs_per_night, darks_per_night, zeroes_per_night, exptime, readout)
     49     expid_in_exposures = 'EXPID' in exposures.colnames
     50     for c in ('RA', 'DEC'):
---> 51         if output[c].unit is None:
     52             output[c].unit = 'deg'
     53     current_night = '19730703'

/global/common/software/desi/cori/desiconda/20180709-1.2.6-spec/conda/lib/python3.6/site-packages/astropy/table/table.py in __getitem__(self, item)
   1220     def __getitem__(self, item):
   1221         if isinstance(item, six.string_types):
-> 1222             return self.columns[item]
   1223         elif isinstance(item, (int, np.integer)):
   1224             return self.Row(self, item)

/global/common/software/desi/cori/desiconda/20180709-1.2.6-spec/conda/lib/python3.6/site-packages/astropy/table/table.py in __getitem__(self, item)
    107         """
    108         if isinstance(item, six.string_types):
--> 109             return OrderedDict.__getitem__(self, item)
    110         elif isinstance(item, (int, np.integer)):
    111             return self.values()[item]

KeyError: 'RA'

@dkirkby
Copy link
Member Author

dkirkby commented Nov 17, 2018

I will look at fixing add_calibration_exposures now.

(What was the no-op to include [ci skip] commit for?)

@dkirkby
Copy link
Member Author

dkirkby commented Nov 17, 2018

@sbailey What per-exposure fields are currently required by the end-to-end simulation chain? I propose to add those now, so this PR does not break anything, then we should separately discuss how exposure tracking should work during ops (and who owns the authoritative exposure list).

@dkirkby
Copy link
Member Author

dkirkby commented Nov 17, 2018

Trying to answer my own question: the only caller of add_calibration_exposures I found is wrap_newexp in desisim, which then references the following fields:

  • existing: MJD, TILEID
  • need to be added: PROGRAM, NIGHT, EXPID, FLAVOR

@dkirkby
Copy link
Member Author

dkirkby commented Nov 19, 2018

@sbailey I have updated add_calibration_exposures (and its unit tests) to be backwards compatible for wrap_newexp: can you test this easily?

@sbailey
Copy link
Contributor

sbailey commented Nov 21, 2018

@dkirkby I'm trying to get the minitest working with these new branches. Can you help debug the files in /global/cscratch1/sd/sjbailey/desi/dev/end2end/survey/ ?

  • test-tiles.fits
  • desisurvey-config.yaml
  • generated ephem*.fits, surveyinit*.fits, exposures*.fits, stats*.fits, planner*.fits, scheduler*.fits

There are 10 test tiles across all layers, which the old surveysim was able to observe within a month. New surveysim is only getting 1 or 2 of the 3 bright tiles. Perhaps I need a different fiberassign delay strategy for the bright tiles?

It would be helpful if surveyinit et al would use the start/stop dates from the config file instead of hardcoded ranges in desisurvey/ephem.py . I know I could copy the ephem file from $DESI_ROOT/datachallenge/surveysim2018/shared/, but I would prefer the minitest to be standalone and include running surveyinit without it taking 50 minutes. For now I can work around it by loading the module, changing ephem.START_DATE and ephem.STOP_DATE before proceeding with programmatically calling surveysim.main(), but it would be helpful to also be able to use the command line with the limited date range too.

Are the planner and scheduler files documented somewhere, even if they aren't in the formal data model yet? That would be helpful for me trying to understand this. e.g. I'm not even certain about the conceptual difference between "planner" and "scheduler".

Thanks for the help.

@sbailey
Copy link
Contributor

sbailey commented Nov 21, 2018

I was able to fix the bright tiles issue by using the following config fiber_assignment_order:

fiber_assignment_order:
    P1: P0 delay 1
    P2: P0+P1 delay 1
    P3: P0+P1 delay 1

i.e. leaving P5, P6, and P7 unspecified which I think means they are all assigned together at the start.

For exposures_surveysim.fits: this is used later by newexp-mock (via the MPI-aware wrap-newexp wrapper), which additionally needs MOONFRAC, MOONALT, and MOONSEP columns to pass forward to specsim.

@dkirkby
Copy link
Member Author

dkirkby commented Nov 22, 2018

The moon params are not currently factored into the exposure times, except for an average factor, so the BRIGHT tiles simulated by newexp-mock should not be taken too seriously. With this in mind, I suggest that we fix MOONFRAC=0.5, MOONALT=-10deg and MOONSEP=ALT+10deg. This will be revisited soon when I implement the full moon exposure-time model.

@sbailey
Copy link
Contributor

sbailey commented Nov 22, 2018

Summarizing:

I've verified that I can use these desisurvey and surveysim branches with the end-to-end integration test, though it requires some hacks. I would prefer the following changes before merging, but if they are unduly complicated let's chat options:

Provide options to surveyinit to override the hardcoded first and last days of the ephemeris calculation, which takes a long time, plus an option to surveysim to control which ephemeris file is loaded. Or provide both with an option to use the first_day and last_day from the config file or similar.

Add columns MOONFRAC, MOONALT, MOONSEP to exposures_surveysim.fits output. Even if they aren't used by the survey simulations yet, they have well defined values and can be meaningfully simulated and used downstream. At minimum let's get these into the data model.

Add columns RA, DEC to exposures_surveysim.fits output. This is less critical since this is an easy join with the tiles table but would still be nice to have; the moon params are harder.

@dkirkby
Copy link
Member Author

dkirkby commented Nov 26, 2018

Provide options to surveyinit to override the hardcoded first and last days of the ephemeris
calculation, which takes a long time, plus an option to surveysim to control which ephemeris file is
loaded.

One of the changes in this PR is to decouple the ephemerides date range from the nominal survey start/stop dates. This is slightly less convenient for testing and special runs, but more robust for commissioning and operations. Specifically, the ephemerides are now calculated (by default) to cover all of 2019 - 2025, so one file should be sufficient for all needs. At NERSC, a copy of this file is in:

$DESI_ROOT/datachallenge/surveysim2018/shared/ephem_2019-01-01_2025-12-31.fits

Note that the time-consuming step of surveyinit is not calculating the ephemerides, but optimizing the design hour angles for some nominal survey start / stop dates, which are taken from the config file, as always. The ephem date range only needs to cover the survey start / stop dates. For the default start / stop dates, the design hour angles are precomputed in:

$DESI_ROOT/datachallenge/surveysim2018/shared/surveyinit.fits

For testing in an environment where all files need to be downloaded or created from scratch (e.g., travis), I set the ephem and survey date ranges to just one month (Dec 2019). This is handled in one place for all unit tests by a unittest.TestCase superclass defined in desisurvey.test.base.

For special runs at NERSC, you can copy / link one or both of the files above to your $DESISURVEY_OUTPUT to avoid regenerating them.

@dkirkby
Copy link
Member Author

dkirkby commented Nov 26, 2018

Add columns MOONFRAC, MOONALT, MOONSEP to exposures_surveysim.fits output.

These columns are now added by util.add_calibration_exposures.

Add columns RA, DEC to exposures_surveysim.fits output.

The exposures table has no redundancy, by design, so that it clearly identifies and puts in one place the minimal set of info that must be recorded and saved/restored to capture the observing history for the purposes of progress tracking and forecasting (at least in simulations). I can provide code to expand these minimal columns into a "fat exposures" table with MOONFRAC, RA, DEC, etc, but this will be auxiliary code that is not essential (or used) for the primary operations and simulation tasks.

I will create a new issue so we can start defining the "fat exposures" (to be implemented in a future PR).

@sbailey
Copy link
Contributor

sbailey commented Nov 26, 2018

OK, the ephemeris calculation takes 2 minutes, not too bad.

MOONFRAC, MOONALT, MOONSEP are in the datamodel but we'll need to address #61 with the expanded exposures table for them to be useful downstream. I'll work around this for now in the integration tests.

Proceeding with merging desisurvey and surveysim PRs.

@sbailey sbailey merged commit 292718d into master Nov 26, 2018
@sbailey sbailey deleted the refactor branch November 26, 2018 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants