# DM-1766: After

Now let's open that file with a master version of the LSST stack and see what the `Schema` looks like:

In [2]:
import lsst.afw.table

catalog = lsst.afw.table.SourceCatalog.readFits("empty-v0.fits")
print(catalog.schema)

Schema(
    (Field['L'](name="id", doc="unique ID"), Key<L>(offset=0, nElements=1)),
    (Field['Angle'](name="coord_ra", doc="position in ra/dec"), Key<Angle>(offset=8, nElements=1)),
    (Field['Angle'](name="coord_dec", doc="position in ra/dec"), Key<Angle>(offset=16, nElements=1)),
    (Field['L'](name="parent", doc="unique ID of parent source"), Key<L>(offset=24, nElements=1)),
    (Field['Flag'](name="flag_negative", doc="set if source was detected as significantly negative"), Key['Flag'](offset=32, bit=0)),
    (Field['I'](name="deblend_nchild", doc="Number of children this object has (defaults to 0)"), Key<I>(offset=40, nElements=1)),
    (Field['Flag'](name="deblend_deblended-as-psf", doc="Deblender thought this source looked like a PSF"), Key['Flag'](offset=32, bit=1)),
    (Field['D'](name="deblend_psf-center_x", doc="If deblended-as-psf, the PSF centroid"), Key<D>(offset=48, nElements=1)),
    (Field['D'](name="deblend_psf-center_y", doc="If deblended-as-psf, the PSF centro

### Field Names

First off, you'll note that it seems we've translated periods to underscores.  In fact, all we really did was *not* translate underscores to periods when reading; if we look at the raw form of the original file, we actually saved all those periods as underscores:

In [3]:
import pyfits
fits = pyfits.open("empty-v0.fits")
print(fits[1].header["ttype5"])

deblend_nchild


This translation came from my misreading a recommendation in the FITS standard (stick to lowercase letters, numbers, and underscores in field names) as a requirement.  Of course, we're still mostly in line with that recommendation, though we do use capital letters now too (and none of this is mandated by the code - these are all just conventions).
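
The difference between the old and new read behavior can be sketched in a few lines.  This is only an illustration of the description above: the two function names are hypothetical and are not part of `afw.table`.

```python
def old_style_read_name(fits_name):
    """Mimic the old reader: underscores in the FITS column name became periods."""
    return fits_name.replace("_", ".")


def new_style_read_name(fits_name):
    """Mimic the current reader: the FITS column name is used unchanged."""
    return fits_name


on_disk = "deblend_nchild"  # what the file actually stores (see ttype5 above)
print(old_style_read_name(on_disk))  # deblend.nchild
print(new_style_read_name(on_disk))  # deblend_nchild
```

Note that the old translation only touched underscores, so a name like `deblend_deblended-as-psf` kept its hyphens either way.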

We haven't tried here to create camelCase field names that look like the ones the measurement framework produces now.  That would probably still be confusing, since the resulting names wouldn't actually be in use in either the old codebase or the new one.  And it's not really possible to do automatically anyway, since we conflated namespace separation with word separation in the old conventions.

### Compound Fields to FunctorKeys

We've also removed all the compound fields, replacing them with the scalar fields that comprise them (e.g. two `Angle` fields instead of a `Coord` field).  Those compound field types have now been completely removed from `afw`, with a new `FunctorKey` system to replace them.  `FunctorKey`s are subclasses of `InputFunctorKey` and/or `OutputFunctorKey`, and they're really just a simple callback mechanism: when you pass a `FunctorKey` to a `Record` instead of a regular `Key`, it just calls the `get` or `set` method on the `FunctorKey` and returns the result.  Most `FunctorKey`s simply hold a few regular `Key`s and use them to pack or unpack scalar values into first-class objects.  `FunctorKey`s have a few advantages over the old compound keys:
 - There's no hard limit on the number of types of `FunctorKey`s we can support (there is a limit on the number of intrinsic types, imposed by Boost.Variant).
 - `FunctorKey`s can be defined in any package (intrinsic types must be defined in `afw.table`).
 - Only the constituent scalar fields can be extracted as column arrays in either approach, but with `FunctorKey`s these scalar fields are what appear when printing or otherwise introspecting the `Schema`.  With the old compound fields, users were frequently confused by the fact that these fields could not be accessed as column arrays, and the workaround (accessing subfields of the compound fields) was not at all intuitive.
 - `FunctorKey`s can do more than just aggregate fields: we could also use them to do calculations.  For instance, one could hold a `Calib` and convert fluxes to magnitudes on-the-fly.  (At present, all `FunctorKey`s are aggregates, though `CovarianceMatrixKey` does some squares and square-roots to produce matrices from stored sigma values and vice versa.)
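
The callback mechanism is simple enough to model in pure Python.  The sketch below is a toy, not the `afw.table` implementation: `ToyRecord`, `ToyPointKey`, and `ToyMagnitudeKey` are hypothetical names, plain strings stand in for regular `Key`s, and a bare zero point stands in for a `Calib`.

```python
import math


class ToyRecord(object):
    """Stand-in for a record: scalar values stored by field name."""

    def __init__(self, **values):
        self._values = dict(values)

    def get(self, key):
        # A functor-style key gets called back; a plain key is looked up directly.
        if hasattr(key, "get"):
            return key.get(self)
        return self._values[key]

    def set(self, key, value):
        if hasattr(key, "set"):
            key.set(self, value)
        else:
            self._values[key] = value


class ToyPointKey(object):
    """Aggregating key: packs two scalar fields into an (x, y) tuple."""

    def __init__(self, x_key, y_key):
        self.x_key = x_key
        self.y_key = y_key

    def get(self, record):
        return (record.get(self.x_key), record.get(self.y_key))

    def set(self, record, value):
        x, y = value
        record.set(self.x_key, x)
        record.set(self.y_key, y)


class ToyMagnitudeKey(object):
    """Input-only key: computes a magnitude from a stored flux on the fly."""

    def __init__(self, flux_key, zero_point):
        self.flux_key = flux_key
        self.zero_point = zero_point  # assumed stand-in for a Calib

    def get(self, record):
        return self.zero_point - 2.5 * math.log10(record.get(self.flux_key))


record = ToyRecord(centroid_x=1.5, centroid_y=-2.0, psf_flux=100.0)
point_key = ToyPointKey("centroid_x", "centroid_y")
record.set(point_key, (3.0, 4.0))
print(record.get(point_key))     # the packed point
print(record.get("centroid_x"))  # the scalar field is still directly accessible
print(record.get(ToyMagnitudeKey("psf_flux", zero_point=27.0)))
```

The point of the design is visible even in the toy: the record only ever stores scalars, and anything richer lives in the key object, which can be defined wherever it is needed.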

### Aliases, Slots, and Naming Conventions

At the bottom of the `Schema`, we can see some alias definitions.  These are how slots are now defined, but we're using them for more than that here: we've created aliases to convert between some of the old conventions and the new ones.

In the old conventions, algorithms had the following fields:
 - `"<algorithm-name>"`
 - `"<algorithm-name>.err"`
 - `"<algorithm-name>.flags"`

They typically also had `"<algorithm-name>.flag.<problem>"` fields to indicate specific failure modes.  For flux algorithms, the first two fields were scalars holding the flux and its 1-sigma uncertainty.  For centroids (and shapes) the first field was a `PointD` (`MomentsD`) field and the uncertainty a `CovPointF` (`CovMomentsF`).  This convention couldn't really handle algorithms that produced more than one kind of result, and the fact that the `Flag` field had a plural name was confusing.

In the new conventions, field names start with the algorithm name (which is now CamelCase with a prefix indicating the package, e.g. `"base_SdssCentroid"`), and have suffixes for different kinds of measurements:
 - `"<algorithm-name>_flux"`
 - `"<algorithm-name>_x"`, `"<algorithm-name>_y"`

Uncertainties are handled by the `CovarianceMatrixKey` `FunctorKey`, which aggregates fields that match the conventions specified in the LSST database schema, e.g.:
 - `"<algorithm-name>_fluxSigma"`
 - `"<algorithm-name>_xSigma"`, `"<algorithm-name>_ySigma"`, `"<algorithm-name>_x_y_Cov"`

These naturally handle algorithms with more than one kind of result, and they make it easier to get uncertainties on individual scalar components as well.  We've also removed the `"s"` from the main flag field:
 - `"<algorithm-name>_flag"` (with `"<algorithm-name>_flag_<problem>"` optional).

Note that the new conventions often lead to names with some duplication, e.g. `"base_PsfFlux_flux"`, but I think that's a price worth paying for consistency and predictability.
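
To make the suffix conventions concrete, here is a rough sketch of how a covariance matrix for a centroid could be assembled from fields that follow them.  It is not how `CovarianceMatrixKey` is implemented: `centroid_covariance` is a hypothetical helper, a plain dict stands in for a record, and treating a missing covariance field as zero is an assumption made for the example.

```python
def centroid_covariance(fields, name):
    """Build [[xx, xy], [xy, yy]] from <name>_xSigma, <name>_ySigma, <name>_x_y_Cov."""
    x_sigma = fields[name + "_xSigma"]
    y_sigma = fields[name + "_ySigma"]
    # Assumption for this sketch: a missing off-diagonal field means zero covariance.
    xy_cov = fields.get(name + "_x_y_Cov", 0.0)
    # Diagonal elements are stored as sigmas, so square them to get variances.
    return [[x_sigma ** 2, xy_cov], [xy_cov, y_sigma ** 2]]


fields = {
    "base_SdssCentroid_xSigma": 0.5,
    "base_SdssCentroid_ySigma": 2.0,
    "base_SdssCentroid_x_y_Cov": 0.25,
}
print(centroid_covariance(fields, "base_SdssCentroid"))
```

Because every name is derived from the algorithm name plus a fixed suffix, the same helper works for any algorithm that follows the conventions.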

We've partially renamed some of those fields to match the new conventions when it was required to convert from a compound field to a `FunctorKey` (mostly the case for the covariance matrices).  But instead of renaming the rest, we've created aliases that map any field that obeyed the old conventions to the new ones.  Crucially, that lets the algorithms be used in the slot mechanism, which relies on the suffix conventions to provide consistent access to different algorithms.  You can see the slots in the last few aliases in the file; we simply alias the algorithm name to the slot definition name, and because aliases are resolved recursively, we can get from the expected field names in the new conventions to the actual field names being used.
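
Recursive resolution is easiest to see with a toy model.  The sketch below assumes aliases are matched against the beginning of a name and that the longest match wins; that is a simplification for illustration, not the actual `afw.table` alias code, and the alias entries and the `resolve` function are made-up examples.

```python
def resolve(name, aliases, max_depth=16):
    """Apply prefix aliases repeatedly until none matches the start of the name."""
    for _ in range(max_depth):
        matches = [alias for alias in aliases if name.startswith(alias)]
        if not matches:
            return name
        best = max(matches, key=len)  # prefer the most specific alias
        name = aliases[best] + name[len(best):]
    raise ValueError("alias chain too deep (possible cycle)")


# Hypothetical aliases for an old-style PSF flux algorithm.
aliases = {
    "slot_PsfFlux": "flux_psf",            # slot -> algorithm name
    "flux_psf_flux": "flux_psf",           # new-style name -> old flux field
    "flux_psf_fluxSigma": "flux_psf_err",  # new-style name -> old error field
}

print(resolve("slot_PsfFlux_flux", aliases))       # flux_psf
print(resolve("slot_PsfFlux_fluxSigma", aliases))  # flux_psf_err
```

The slot alias only rewrites the prefix; the second pass then maps the new-style suffix onto whatever the old field was actually called.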

### Versioning

Before the changes on DM-1766, we supported both the new-style `Schema`s and the old ones, using an integer version number attached to the `Schema` object and a lot of special-case code all over the pipeline (version=0 indicated the old conventions, version=1 the new ones).  That in-memory version number has now been removed, and all tables are now written as version 1.  When we detect a version 0 table on disk (either because it's explicitly saved as version 0, or because it has no version number *and* it has a tag indicating it was written by `afw.table`), we apply all of the above conversions, and it becomes a version 1 table in memory.
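
The detection rule in that last sentence amounts to a small decision function.  In the sketch below a plain dict stands in for the FITS header, and the keyword names `TABLE_VERSION` and `WRITTEN_BY_AFW_TABLE` are placeholders invented for the example, not the keywords `afw.table` actually writes.

```python
def needs_v0_conversion(header,
                        version_key="TABLE_VERSION",        # placeholder keyword
                        writer_key="WRITTEN_BY_AFW_TABLE"):  # placeholder keyword
    """Return True if a table on disk should be treated as version 0."""
    if version_key in header:
        # Explicit version number: convert only if it says version 0.
        return header[version_key] == 0
    # No version number: treat as version 0 only if afw.table wrote the file.
    return bool(header.get(writer_key, False))


print(needs_v0_conversion({"TABLE_VERSION": 0}))            # True
print(needs_v0_conversion({"TABLE_VERSION": 1}))            # False
print(needs_v0_conversion({"WRITTEN_BY_AFW_TABLE": True}))  # True
print(needs_v0_conversion({}))                              # False
```

The last case is the important one: a FITS table with no version number and no writer tag is left alone.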