Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-16702: Add reference star support to fgcmcal fitting #13

Merged
merged 10 commits into from
May 30, 2019
116 changes: 69 additions & 47 deletions cookbook/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,8 @@ The environment should be set up as follows:
```
setup lsst_distrib

export RCRERUN=RC/w_2018_38/DM-15690
export COOKBOOKRERUN=fgcm_cookbook_w_2018_38
export RCRERUN=RC/w_2019_18/DM-19151-sfm
export COOKBOOKRERUN=fgcm_cookbook_w_2019_18
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! Updated doc!

The `RCRERUN` env variable should be set to the most recent completed rerun
Expand Down Expand Up @@ -119,6 +119,12 @@ to scan for available `src` catalogs speeds things up by x100, which is
necessary. You can also specify by field or other valid dataId. A [sample
config](fgcmBuildStarsHsc.py) for HSC is available.

If `doReferenceCalibration = True` in the configuration (the new default), then
stars from a reference catalog (e.g. PS1) will be loaded and matched to the
internal stars. The signal-to-noise cut specified here should be the minimum
one expects to use in the actual fit, and the fit may also be performed without
the use of reference stars if desired.

### Running `fgcmBuildStars.py`

This is also a simple command-line task. This task will take around 2.5 hours
Expand All @@ -141,18 +147,19 @@ by MJD) which should correspond to changes in observing (years, camera
warm-ups, etc), preferably coordinated with when new flat-fields were
generated. At the moment flats are regenerated for HSC every couple of weeks
which is too frequent for a good fit from FGCM. This is being
investigated. There is also a place to describe when the mirror was washed (in
MJD), as the system throughput usually jumps at these times.

There are three important differences between the first fit cycle and subsequent
cycles. The first is that a "bright observation" algorithm is employed in the
first cycle to choose approximately photometric observations, while for
subsequent iterations the fit parameters from the previous iterations are used
to find photometric observations. The second is that it is recommended that
you freeze the atmosphere to the "standard parameters" for the first fit
cycle. The third is that for HSC it can be useful to set
`precomputeSuperStarInitialCycle` which can build a "super-star flat" based on
the bright observation selection.
investigated. There is also a place to describe when the mirror was washed or
recoated (in MJD), as the system throughput usually jumps at these times.

There are three important differences between the first fit cycle ("cycle 0")
and subsequent cycles. The first is that a "bright observation" algorithm is
employed in the first cycle to choose approximately photometric observations,
while for subsequent iterations the best fit parameters from the previous
iterations are used to determine photometric observations. The second is that
it is recommended that you freeze the atmosphere to the "standard parameters"
for the first fit cycle (the default). The third is that for HSC it can be
useful to set `precomputeSuperStarInitialCycle` which can build a "super-star
flat" based on the bright observation selection to smooth out the large
illumination correction term.

At the end of the first (and all subsequent) cycles a whole bunch of diagnostic
plots are made in a subdirectory of the current directory where you invoked the
Expand All @@ -171,30 +178,46 @@ run on `lsst-dev01`.
fgcmFitCycle.py /datasets/hsc/repo --rerun \
private/${USER}/${COOKBOOKRERUN}/wide:private/${USER}/${COOKBOOKRERUN}/fit1 \
--configfile $FGCMCAL_DIR/cookbook/fgcmFitCycleHscCookbook_cycle00_config.py \
|& tee ${COOKBOOKRERUN}_cycle00.log
|& tee fgcmFitCycleHscCookbook_cycle00.log
```

### Subsequent Fit Cycles

Before running any subsequent cycle, you should look at the diagnostic plots.
The most important plots are `cycleNUMBER_expgray_BAND.png` plots. Use these
histograms to choose appropriate cuts for each band to select "photometric"
exposures in the next cycle with the `expGrayPhotometricCut` (negative side)
and `expGrayHighCut` (positive side) parameters. You want to capture the core
and reject outliers, but not too tight that you lose a lot of visits and can't
constrain the fit. You can also up the number of iterations per fit cycle. In
my experience, the fit does not improve if you go beyond ~50 iterations. The
best way to get the fit to improve is to remove non-photometric exposures. In
the future, I will explore more automated metrics of convergence, but my
assumption is that global calibration is run infrequently enough, and is
important enough to require manual checks.
Before running subsequent fit cycles, it is helpful to look at the diagnostic
plots. The most important plots are `cycleNUMBER_expgray_BAND.png` plots. As
of [DM-17305](https://jira.lsstcorp.org/browse/DM-17305), the width of these
histograms are used to set a default photometricity cut for the next fit
cycle. However, it is worth inspecting these histograms to ensure that they
look "normal". The expectation is that they have a Gaussian core with an
excess non-photometric tail out to the negative side. Any significant
bi-modality is a sign that something has gone horribly awry with the fits,
perhaps indicating issues with the input data or the configuration parameters.

The relevant configuration parameters that you might want to adjust based on
these plots are `expGrayPhotometricCut` (on the negative side) and
`expGrayHighCut` (on the positive side). You want to capture the core and
reject outliers, but not too tight that you lose a lot of visits and can't
constrain the fit.

You can also increase the number of iterations per fit cycle. In my
experience, the fit does not improve if you go beyond ~50 iterations with small
datasets, but large datasets can use ~100 iterations per cycle.

```bash
fgcmFitCycle.py /datasets/hsc/repo --rerun private/${USER}/${COOKBOOKRERUN}/fit1 \
fgcmFitCycle.py /datasets/hsc/repo --rerun \
private/${USER}/${COOKBOOKRERUN}/fit1 \
--configfile fgcmFitCycleHscCookbook_cycle01_config.py |& tee \
${COOKBOOKRERUN}_cycle01.log
fgcmFitCycleHscCookbook_cycle01.log
```

```bash
fgcmFitCycle.py /datasets/hsc/repo --rerun \
private/${USER}/${COOKBOOKRERUN}/fit1 \
--configfile fgcmFitCycleHscCookbook_cycle02_config.py |& tee \
fgcmFitCycleHscCookbook_cycle02.log
```


### Final fit cycle

After the user has concluded that the fit has converged to her satisfaction,
Expand All @@ -207,9 +230,9 @@ cycle run should only take 2-3 minutes.

```bash
fgcmFitCycle.py /datasets/hsc/repo --rerun private/${USER}/${COOKBOOKRERUN}/fit1 \
--configfile fgcmFitCycleHscCookbook_cycle02_config.py \
--config isFinalCycle=True --config outputStandards=True |& tee \
${COOKBOOKRERUN}_cycle02.log
--configfile fgcmFitCycleHscCookbook_cycle03_config.py \
--config isFinalCycle=True |& tee \
fgcmFitCycleHscCookbook_cycle03.log
```

## Outputs
Expand Down Expand Up @@ -280,16 +303,18 @@ used by downstream processing in the stack.
The final task can output atmosphere transmission curves derived from the model
fit for each visit (`doAtmosphereOutput'); output a reference catalog of
"standard stars" that may be used for single-ccd calibration
(`doRefcatOutput`); and `jointcal_photoCalib` files for each visit/ccd
(`doRefcatOutput`); and `fgcm_photoCalib` files for each visit/ccd
(`doZeropointOutput`). All of these options are on by default.

One key thing is that FGCM does not use any absolute reference sources, so the
overall throughput for each band is unconstrained. Therefore, to get the
output reference catalog and zeropoints on a physical scale, we must use
calspec standards (preferred, and deferred to the future) or a reference
catalog. At the moment, this uses the same PS1 catalog that is used as a
reference for the `PhotoCal` and `jointcal` tasks. This is set with
`doReferenceCalibration`, and the default is on. Please see the [sample
The FGCM calibration can be run with or without reference stars as an anchor.
The new default, including in this cookbook, is to run with reference sources
from PS1. However, if you run without reference sources you get a good
relative calibraiton but the overall throughput for each band is
unconstrained. To get the output reference catalog and zeropoints on a
physical scale, we must use calspec standards (which is preferred, but not
currently supported) or a reference catalog such as PS1. To turn on this final
computation of the system throughput, set `doReferenceCalibration=True` in the
`FgcmOutputProducts` task. (The default is now `False`). Please see the [sample
config](fgcmOutputProductsHsc.py) for example usage of setting up the reference
catalog.

Expand All @@ -301,7 +326,7 @@ be converged and should be output. This should take around 5 minutes on `lsst-d
fgcmOutputProducts.py /datasets/hsc/repo --rerun \
private/${USER}/${COOKBOOKRERUN}/wide:private/${USER}/${COOKBOOKRERUN}/fit1 \
--configfile $FGCMCAL_DIR/cookbook/fgcmOutputProductsHsc.py \
--config cycleNumber=2 |& tee ${COOKBOOKRERUN}_output.log
--config cycleNumber=3 |& tee fgcmFitCycleHscCookbook_output.log
```

In the output repo
Expand All @@ -312,21 +337,18 @@ atmosphere parameters to the "standard" values, these will be computed
specifically for the airmass of each individual observation.

In the output repo
`/datasets/hsc/repo/rerun/private/erykoff/rc2_w_2018_32/fit1/ref_cats/fgcm_stars/'
`/datasets/hsc/repo/rerun/private/${USER}/${COOKBOOKRERUN}/fit1/ref_cats/fgcm_stars/'
you will find a sharded reference catalog suitable to use for any observations
overlapping the survey images that have been calibrated.

And in the output repo
`/datasets/hsc/repo/rerun/private/erykoff/rc2_w_2018_32/fit1/jointcal-results`
you will find all of the `jointcal_photoCalib` spatially-variable zeropoint
`/datasets/hsc/repo/rerun/private/${USER}/${COOKBOOKRERUN}/fit1/fgcmcal-results`
you will find all of the `fgcmcal_photoCalib` spatially-variable zeropoint
files that can be used in coaddition. At the moment, these combine information
from the WCS Jacobian distortions as well as the spatially variable
transmission (due to the illumination correction derived by FGCM). In the
future, the Jacobian information will be factored out.

Note that all the files are set to tract `0000`, as FGCM has no concept of tracts
and patches.

### FGCM-Specific Data Products

The data products that may be of use to the end user, and grabbed from the butler,
Expand Down
24 changes: 20 additions & 4 deletions cookbook/fgcmBuildStarsHsc.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@
HSC-specific overrides for FgcmBuildStars
"""

import os.path

from lsst.utils import getPackageDir

# Minimum number of observations per band for a star to be considered for calibration
config.minPerBand = 2
# Match radius to associate stars from src catalogs (arcseconds)
Expand All @@ -14,12 +18,24 @@
config.densityCutMaxPerPixel = 1500
# Dictionary that maps "filters" (instrumental configurations) to "bands"
# (abstract names). All filters must be listed in the LUT.
config.filterToBand = {'g':'g', 'r':'r', 'i':'i', 'z':'z', 'y':'y'}
config.filterMap = {'g': 'g', 'r': 'r', 'i': 'i', 'z': 'z', 'y': 'y'}
# Which bands are required to be observed to be considered a calibration star
config.requiredBands = ['g','r','i','z']
# The reference band is used for initial star selection
config.referenceBands = ['i']
config.requiredBands = ['g', 'r', 'i', 'z']
# The reference CCD is a good CCD used to select visit to speed up the scanning
config.referenceCCD = 13
# If smatch matching is available, use this nside. Not used with default LSST stack.
config.matchNside = 4096
# A star must be observed in one of these bands to be considered as a calibration star
config.primaryBands = ['i']
# Match reference catalog as additional constraint on calibration
config.doReferenceMatches = True

# Reference object loader configuration parameters
config.fgcmLoadReferenceCatalog.refObjLoader.ref_dataset_name = 'ps1_pv3_3pi_20170110'
config.fgcmLoadReferenceCatalog.applyColorTerms = True
hscConfigDir = os.path.join(getPackageDir("obs_subaru"), "config", "hsc")
config.fgcmLoadReferenceCatalog.colorterms.load(os.path.join(hscConfigDir, 'colorterms.py'))
config.fgcmLoadReferenceCatalog.referenceSelector.doSignalToNoise = True
config.fgcmLoadReferenceCatalog.referenceSelector.signalToNoise.fluxField = 'i_flux'
config.fgcmLoadReferenceCatalog.referenceSelector.signalToNoise.errField = 'i_fluxErr'
config.fgcmLoadReferenceCatalog.referenceSelector.signalToNoise.minimum = 10.0
17 changes: 12 additions & 5 deletions cookbook/fgcmFitCycleHscCookbook_cycle00_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,9 @@
config.requiredFlag = (1, 1, 1, 1, 1)
# Dictionary that maps "filters" (instrumental configurations) to "bands"
# (abstract names). All filters must be listed in the LUT.
config.filterToBand = {'g':'g', 'r':'r', 'i':'i', 'z':'z', 'y':'y'}
config.filterMap = {'g':'g', 'r':'r', 'i':'i', 'z':'z', 'y':'y'}
# Maximum number of fit iterations (15 for testing, 50+ for a full run.)
config.maxIterBeforeFinalCycle = 15
config.maxIterBeforeFinalCycle = 30
# Number of cores to run with python multiprocessing
config.nCore = 4
# Cycle number (should start at 0)
Expand All @@ -24,9 +24,9 @@
# This value will depend on your longitude/time zone!
config.utBoundary = 0.0
# MJD dates on which the mirror was washed
config.washMjds = (0.0, )
config.washMjds = (56700.0, 57500.0, 57700.0, 58050.0)
# Dividing point between observing epochs (years, camera events, etc.)
config.epochMjds=[0.0, 57000.0, 57200.0, 100000.0]
config.epochMjds = (56700., 57420., 57606.)
# Latitude of the observatory
config.latitude = 19.8256
# Pixel scale (arcseconds)
Expand Down Expand Up @@ -57,4 +57,11 @@
config.superStarSubCcd = True
# Chebyshev order of sub-ccd superstar fits
config.superStarSubCcdChebyshevOrder = 2

# Model instrumental variation over time per band
config.instrumentParsPerBand = True
# Use reference catalog as additional constraint on calibration
config.doReferenceCalibration = True
# Reference star signal-to-noise minimum to use in calibration
config.refStarSnMin = 50.0
# Number of sigma compared to average mag for reference star to be considered an outlier
config.refStarOutlierNSig = 4.0
1 change: 0 additions & 1 deletion cookbook/fgcmMakeLutHscFromModtran.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@
# Short-code Filter names
config.filterNames = ('g', 'r', 'i', 'z', 'y')
# Each filter maps onto a "standard filter".
# Here, both i and i2 map onto i2 which will be the standard
config.stdFilterNames = ('g', 'r', 'i', 'z', 'y')

# Telescope elevation in meters
Expand Down
1 change: 0 additions & 1 deletion cookbook/fgcmMakeLutHscFromTable.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@
# Short-code Filter names
config.filterNames = ('g', 'r', 'i', 'z', 'y')
# Each filter maps onto a "standard filter".
# Here, both i and i2 map onto i2 which will be the standard
config.stdFilterNames = ('g', 'r', 'i', 'z', 'y')
# Pre-generated atmosphere table distributed with FGCM
config.atmosphereTableName = 'fgcm_atm_subaru2'
Expand Down
4 changes: 2 additions & 2 deletions cookbook/fgcmOutputProductsHsc.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,8 @@

from lsst.utils import getPackageDir

# Do the reference catalog calibration
doReferenceCalibration = True
# Do not do the post-fit reference catalog calibration
doReferenceCalibration = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to contract the inline comment right above

# Do the standard star catalog output
doRefcatOutput = True
# Do the atmosphere output
Expand Down
1 change: 1 addition & 0 deletions python/lsst/fgcmcal/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,5 +9,6 @@
from .fgcmBuildStars import *
from .fgcmMakeLut import *
from .fgcmOutputProducts import *
from .fgcmLoadReferenceCatalog import *

__path__ = pkgutil.extend_path(__path__, __name__)