## How to Create Reference Files

The recommended way to work with reference files for the JWST pipeline is to use [jwst.datamodels](<http://jwst-pipeline.readthedocs.io/en/latest/jwst/datamodels/index.html#classes>).

As an example, let's read in one of the NIRSpec **dark** reference files.

In [1]:
from jwst import datamodels
dm = datamodels.DarkModel('../jwst_nirspec_dark_0136.fits')


A datamodel that represents a reference file has all the general attributes of any datamodel.

In [2]:
print(dm.meta.instrument.name)
print(dm.meta.instrument.detector)
print(dm.meta.exposure.type)

NIRSPEC
NRS1
N/A


It also has the mandatory reference file keywords:

In [3]:
print(dm.meta.author)
print(dm.meta.pedigree)
print(dm.meta.useafter)
print(dm.meta.description)
print(dm.meta.telescope)
print(dm.meta.reftype)

ESA JWST SOT
GROUND
2015-01-01T00:00:00
Master dark reference file
JWST
DARK


And it has the fields/attributes specific to this reference file. For example, the **dark** reference file has a *data* array, *err* array, *dq* array and so on ...

In [4]:
dm.data
dm.err
dm.dq

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ..., 
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint16)

When all else fails - every datamodel has an **_instance** attribute which holds all data for this specific model.

Don't use this in code, as it is a private attribute. However, it can occasionally be useful for debugging.

In [5]:
#dm._instance

Reference data models have schemas associated with them. The schemas are used to validate the reference files. They list all attributes for a specific reference file and their types. For example, the schema for the dark reference file is:



~~~
allOf:
- $ref: referencefile.schema.yaml
- $ref: keyword_exptype.schema.yaml
- $ref: keyword_readpatt.schema.yaml
- $ref: keyword_preadpatt.schema.yaml
- $ref: keyword_nframes.schema.yaml
- $ref: keyword_ngroups.schema.yaml
- $ref: keyword_groupgap.schema.yaml
- $ref: keyword_gainfact.schema.yaml
- $ref: subarray.schema.yaml
- type: object
  properties:
    data:
      title: Dark current array
      fits_hdu: SCI
      default: 0.0
      ndim: 3
      datatype: float32
    dq:
      title: 2-D data quality array for all planes
      fits_hdu: DQ
      default: 0
      ndim: 2
      datatype: uint16
    err:
      title: Error array
      fits_hdu: ERR
      default: 0.0
      ndim: 3
      datatype: float32
    dq_def:
      $ref: dq_def.schema.yaml
$schema: http://stsci.edu/schemas/fits-schema/fits-schema
~~~
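To make the `ndim` and `datatype` entries in this schema concrete, here is a minimal sketch that checks NumPy arrays against those two fields. The `DARK_CONSTRAINTS` table and the `check_arrays` helper are hypothetical names used only for illustration; they are not part of jwst.datamodels, and the real validation may behave differently (for example, by casting arrays instead of rejecting them).

```python
import numpy as np

# Constraints copied by hand from the dark schema (illustration only).
DARK_CONSTRAINTS = {
    "data": {"ndim": 3, "datatype": "float32"},
    "dq": {"ndim": 2, "datatype": "uint16"},
    "err": {"ndim": 3, "datatype": "float32"},
}


def check_arrays(arrays, constraints=DARK_CONSTRAINTS):
    """Return a list of readable problems; an empty list means no mismatch."""
    problems = []
    for name, rule in constraints.items():
        if name not in arrays:
            problems.append(f"{name}: missing")
            continue
        arr = np.asarray(arrays[name])
        if arr.ndim != rule["ndim"]:
            problems.append(f"{name}: ndim {arr.ndim}, expected {rule['ndim']}")
        if arr.dtype != np.dtype(rule["datatype"]):
            problems.append(f"{name}: dtype {arr.dtype}, expected {rule['datatype']}")
    return problems


good = {
    "data": np.zeros((2, 4, 4), dtype=np.float32),
    "dq": np.zeros((4, 4), dtype=np.uint16),
    "err": np.zeros((2, 4, 4), dtype=np.float32),
}
# Same arrays, but with a dq plane of the wrong shape and type.
bad = dict(good, dq=np.zeros((2, 4, 4), dtype=np.int32))

print(check_arrays(good))
print(check_arrays(bad))
```

A check like this can be handy while assembling arrays, before they are handed to a datamodel.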

To validate the file, run the **validate** method:

In [6]:
dm.validate()

### The ASDF format

[ASDF](<http://asdf-standard.readthedocs.org/en/latest/>) is a next-generation interchange format for scientific data. The *asdf* package contains the Python implementation of the ASDF Standard.

The ASDF format has the following features:

- A hierarchical, human-readable metadata format (implemented using YAML)
- Numerical arrays are stored as binary data blocks which can be memory mapped.
  Data blocks can optionally be compressed.
- The structure of the data can be automatically validated using schemas (implemented using JSON Schema)
- Native Python data types (numerical types, strings, dicts, lists) are serialized automatically
- ASDF can be extended to serialize custom data types

In [7]:
import asdf
from asdf import AsdfFile

# Save "Hello World!" to a file
f = AsdfFile()
f.tree['Hello'] = 'World!'
f.write_to('hw.asdf')

In [8]:
# The top of the file has software and version information
#! less hw.asdf

In [9]:
# Read back the file
# Every ASDF file object has an attribute "tree" which is a dict
fa = asdf.open('hw.asdf')  # AsdfFile.open is no longer available in recent asdf releases
fa.tree

{'Hello': 'World!',
 'asdf_library': {'author': 'Space Telescope Science Institute',
  'homepage': 'http://github.com/spacetelescope/asdf',
  'name': 'asdf',
  'version': '2.1.0.dev1406'},
 'history': {'extensions': [<asdf.tags.core.ExtensionMetadata at 0x7f4170dd2c50>]}}

In [10]:
# Save an array to file
import numpy as np

data = np.random.random((3,4))
tree = {'data': data}
f = AsdfFile(tree)
f.write_to('data.asdf')

In [11]:
# The array is saved in a binary block at the end of the file
# It's possible to save the binary block in compressed form
#!less data.asdf

Many astropy objects can already be serialized to file. The following example saves an astropy compound model.

In [12]:
from astropy.modeling.models import Rotation2D, Polynomial2D, Mapping

model = (Mapping((0, 1, 0, 1)) | 
         Polynomial2D(1, c0_0=1) & Polynomial2D(1, c0_0=2) |
         Rotation2D(23.5))

f = AsdfFile()
f.tree['model'] = model
f.write_to('model.asdf')

In [13]:
# Reading it back reconstructs the model and it can be evaluated directly.
fa = asdf.open('model.asdf')  # AsdfFile.open is no longer available in recent asdf releases
new_model = fa.tree['model']
new_model(1, 1)

(0.11956193653463165, 2.2328692176954945)

The primary motivation for ASDF was serializing complex WCS objects and transforms which can't be described by the FITS WCS or written to FITS files.

## WCS reference files are in ASDF format

While it is possible to write or edit an ASDF file in a text editor, or to use the ASDF interface directly, the best way to create
reference files is to use the datamodels in the jwst pipeline,
[jwst.datamodels](<http://jwst-pipeline.readthedocs.io/en/latest/jwst/datamodels/index.html#classes>), together with
[astropy.modeling](<http://astropy.readthedocs.io/en/latest/modeling/index.html>).

There are two steps in this process:

- create a transform using the simple models and the rules to combine them
- save the transform to an ASDF file (this automatically validates it)

The rest of this document provides a brief description and examples of models in
[astropy.modeling](<http://astropy.readthedocs.org/en/latest/modeling/index.html>)
which are most relevant to WCS and examples of creating WCS reference files.

### Create a transform

All models are imported under the **models** namespace. If necessary, all fitters can be imported through the **fitting** module.

In [14]:
from astropy.modeling import models as astmodels
from astropy.modeling import fitting

Many analytical models are already implemented and it is
easy to implement new ones. Models are initialized with their parameter values.
They are evaluated by passing the inputs directly, similar
to the way functions are called. For example,

In [15]:
poly_x = astmodels.Polynomial2D(degree=2, c0_0=.2, c1_0=.11, c2_0=2.3, c0_1=.43, c0_2=.1, c1_1=.5)
poly_x(1, 1)

3.6399999999999997

If a model has an analytical inverse, it is defined and accessible through the **inverse** property.
An inverse model can also be (re)defined by assigning to the **inverse** property.


In [16]:
rotation = astmodels.Rotation2D(angle=23.4)
print(rotation.inverse)

Model: Rotation2D
Inputs: ('x', 'y')
Outputs: ('x', 'y')
Model set size: 1
Parameters:
    angle
    -----
    -23.4


In [17]:
coeffs = {'c0_0': 2.3, 'c0_1': 1.2, 'c1_0': 1}
pinv = astmodels.Polynomial2D(degree=1, **coeffs)
poly_x.inverse = pinv

astropy.modeling also provides the means to combine models in various ways.

Model concatenation uses the **&** operator. Models are evaluated on independent
inputs and results are concatenated. The total number of inputs must be equal to the
sum of the number of inputs of all models.

In [18]:
shift_x = astmodels.Shift(-34.2)
shift_y = astmodels.Shift(-120)
model = shift_x & shift_y
print(model(1, 1))

(-33.2, -119.0)


Model composition uses the **|** operator. The output of one model is passed
as input to the next one, so the number of outputs of one model must be equal to the number
of inputs to the next one.
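The two operators can be pictured in plain Python. The sketch below is not how astropy implements them; `shift`, `rotation2d`, `concatenate` and `compose` are hypothetical helpers that only mimic the input and output bookkeeping. Note that Python evaluates `&` before `|`, so an expression such as `a & b | c` means `(a & b) | c`.

```python
import math


def shift(offset):
    """One input, one output: x -> x + offset."""
    return lambda x: (x + offset,)


def rotation2d(angle_deg):
    """Two inputs, two outputs: counterclockwise rotation by angle_deg."""
    a = math.radians(angle_deg)
    return lambda x, y: (x * math.cos(a) - y * math.sin(a),
                         x * math.sin(a) + y * math.cos(a))


def concatenate(left, n_left, right):
    """Mimic `left & right`: split the inputs, join the outputs."""
    return lambda *inputs: left(*inputs[:n_left]) + right(*inputs[n_left:])


def compose(first, second):
    """Mimic `first | second`: feed the outputs of `first` into `second`."""
    return lambda *inputs: second(*first(*inputs))


# Two shifts joined side by side, then passed through a rotation.
shifts = concatenate(shift(-34.2), 1, shift(-120))
shift_then_rotate = compose(shifts, rotation2d(23.4))

print(shifts(1, 1))
print(shift_then_rotate(1, 1))
```

The rule on matching counts is visible here: `shifts` returns two values, which is exactly what the rotation expects as input.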


In [19]:
model = shift_x & shift_y | rotation

Two models, **Mapping** and **Identity**, are useful for axes manipulation - dropping
or creating axes, or switching the order of the inputs.

Mapping takes a tuple of integers and an optional number of inputs. The tuple
represents indices into the inputs. For example, to represent a 2D Polynomial distortion
in ``x`` and ``y``, preceded by a shift in both axes:

In [20]:
poly_y = astmodels.Polynomial2D(degree=2, c0_0=.2, c1_0=1.1, c2_0=.023, c0_1=3, c0_2=.01, c1_1=2.2)
model = shift_x & shift_y | astmodels.Mapping((0, 1, 0, 1)) | poly_x & poly_y
print(model(1, 1))

(5872.03, 8465.401520000001)
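What the index tuple does can be shown without astropy. The `mapping` function below is a hypothetical stand-in that only reproduces the index bookkeeping of **Mapping**; it is a sketch, not the library implementation.

```python
def mapping(indices):
    """Mimic Mapping(indices): output k is the input at position indices[k]."""
    return lambda *inputs: tuple(inputs[i] for i in indices)


duplicate = mapping((0, 1, 0, 1))  # feed (x, y) to two 2D models
swap = mapping((1, 0))             # reverse the order of two inputs
keep_first = mapping((0,))         # drop the second input

print(duplicate(10, 20))   # (10, 20, 10, 20)
print(swap(10, 20))        # (20, 10)
print(keep_first(10, 20))  # (10,)
```

In astropy the last case needs the number of inputs spelled out, for example `Mapping((0,), n_inputs=2)`, because it cannot be inferred from the tuple alone.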


**Identity** takes an integer which represents the number of inputs to be passed unchanged.
This can be useful when one of the inputs does not need more processing. As an example,
two spatial (V2V3) and one spectral (wavelength) inputs are passed to a composite model which
transforms the spatial coordinates to celestial coordinates and needs to pass the wavelength unchanged.


In [21]:
tan = astmodels.Pix2Sky_TAN()
model = tan & astmodels.Identity(1)
print(model(0.2, 0.3, 10**-6))

(146.30993247402023, 89.63944963170002, 1e-06)


### Create the reference file

The [CollimatorModel](<http://jwst-pipeline.readthedocs.io/en/latest/api/jwst.datamodels.CollimatorModel.html#jwst.datamodels.CollimatorModel>) in jwst.datamodels is used as an example of how to create a reference file. Data models should be used in the same way to create other types of reference files, because this process validates the file structure.


In [22]:
from jwst.datamodels import CollimatorModel
collimator = CollimatorModel(model=model)

In [23]:
collimator.validate()
collimator.save("new_distortion.asdf")

'new_distortion.asdf'

In [24]:
poly_x = astmodels.Polynomial2D(degree=2, c0_0=.2, c1_0=.11, c2_0=2.3, c0_1=.43, c0_2=.1, c1_1=.5)
poly_y = astmodels.Polynomial2D(degree=2, c0_0=.2, c1_0=1.1, c2_0=.023, c0_1=3, c0_2=.01, c1_1=2.2)
shift_x = astmodels.Shift(-34.2)
shift_y = astmodels.Shift(-120)
model = shift_x & shift_y | astmodels.Mapping((0, 1, 0, 1)) | poly_x & poly_y

# To validate it, pass the "strict_validation" flag
collimator = datamodels.CollimatorModel(model=model, strict_validation=True)

#collimator.validate()



In [25]:
collimator.meta.author="NRS team"
collimator.meta.description="Latest model"
collimator.meta.reftype="collimator"
collimator.meta.useafter="2018-10-12"
collimator.meta.pedigree="GROUND"
collimator.validate()

### Matching Keyword Patterns

To support automatic *rmap* updates in CRDS, any keyword used to assign best references must be added to the reference metadata. For example, if a FITS reference type uses *DETECTOR* to help assign a reference in the rmap, the reference file is required to define *detector* or *p_detector*. The values in the keywords starting with *p_* are matched against and used to automatically derive *rmaps* in CRDS.

For example, the **collimator** reference file is valid for NRS1 and NRS2. It is also valid for most EXP_TYPEs.


In [26]:
collimator.meta.instrument.p_detector = "NRS1|NRS2|"


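# Adjacent string literals are concatenated, so no stray whitespace
# from line continuation ends up inside the pattern.
collimator.meta.exposure.p_exptype = (
    "NRS_TACQ|NRS_TASLIT|NRS_TACONFIRM|"
    "NRS_CONFIRM|NRS_FIXEDSLIT|NRS_IFU|NRS_MSASPEC|NRS_IMAGE|NRS_FOCUS|"
    "NRS_MIMF|NRS_BOTA|NRS_LAMP|NRS_BRIGHTOBJ|"
)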



The string value above uses the **|** (or) symbol. It means that detector NRS1 or detector NRS2 is a match.

**Note: A pattern matching string must end with | .**
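Because the trailing `|` is easy to forget, it can help to build pattern strings from a list. The helpers below, `make_pattern` and `pattern_values`, are hypothetical conveniences written for this document; they are not part of CRDS or jwst.datamodels, and the membership test at the end is only a rough picture of matching, not the CRDS algorithm.

```python
def make_pattern(values):
    """Join values with '|' and append the required trailing '|'."""
    cleaned = [v.strip() for v in values if v.strip()]
    return "|".join(cleaned) + "|"


def pattern_values(pattern):
    """Split a pattern string back into its individual values."""
    return [v for v in pattern.split("|") if v]


detectors = make_pattern(["NRS1", "NRS2"])
print(detectors)                            # NRS1|NRS2|
print(pattern_values(detectors))            # ['NRS1', 'NRS2']
print("NRS2" in pattern_values(detectors))  # True
```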

Alternatively, we could write

collimator.meta.instrument.detector = "ANY"

A value of **ANY** means that any DETECTOR will be a match.

When thinking about and writing rmaps, it is always helpful to look at the CRDS web interface.

https://jwst-crds.stsci.edu/

There are scripts for converting WCS reference files from the ESA format to ASDF. The repository is

https://github.com/spacetelescope/jwreftools
