# The What and Why of ASDF (Advanced Scientific Data Format)

## Outline

This notebook is a brief summary of the ASDF format. It outlines the rationale for the format and its high-level structure, and lists some of its features, particularly in comparison with FITS. Other tutorials go into greater detail about the rationale and the internal format. Finally, it lists the tutorial resources available (more will be added as they become available). **Note the list of tutorials at the end of this notebook.**

## Why

These are the reasons we developed ASDF instead of using FITS:

- 8 character limitation on keyword sizes
- Keyword, value, comment restriction to 80 characters
- Lack of any widely accepted mechanism for hierarchical organization of metadata or data
- Poor support for widely used binary types (particularly unsigned 16 bit integers)
- Lack of versioning of the standard
- Lack of good validation tools
- Allowance of many nonstandard FITS files that libraries must support
- Lack of streaming support
- Inflexible data models for cloud use
- Very poor support for complex World Coordinate System transforms (a very large headache for HST data)

The last item is a topic in its own right and probably the single biggest factor in developing ASDF.

Other existing formats were considered, and all had serious flaws. (See xxx for more details.)
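
The first two FITS limitations listed above can be made concrete with a short sketch. The function below formats a deliberately simplified FITS header card (real FITS has further rules for value alignment and data types), and the names `fits_card`, `ExampleCam`, and `exposure_time_seconds` are hypothetical, chosen only for illustration. The nested dictionary at the end stands in for the kind of metadata tree an ASDF file can hold.

```python
def fits_card(keyword: str, value: str, comment: str = "") -> str:
    """Format one simplified FITS header card of exactly 80 characters."""
    if len(keyword) > 8:
        raise ValueError(f"keyword {keyword!r} is longer than 8 characters")
    quoted = f"'{value}'"  # FITS string values are wrapped in single quotes
    card = f"{keyword.upper():<8}= {quoted:<20}"
    if comment:
        card += f" / {comment}"
    if len(card) > 80:
        raise ValueError("keyword, value, and comment do not fit in 80 characters")
    return card.ljust(80)  # cards are padded with spaces to a fixed width


print(fits_card("INSTRUME", "ExampleCam", "instrument name"))

# A descriptive name does not fit in a FITS keyword.
try:
    fits_card("EXPOSURE_TIME", "120.0")
except ValueError as err:
    print("FITS card rejected:", err)

# In a nested tree (illustrative names only) the same information
# needs no abbreviation and can be grouped hierarchically.
asdf_style_metadata = {
    "instrument": {
        "name": "ExampleCam",
        "exposure_time_seconds": 120.0,
    },
}
print(asdf_style_metadata["instrument"]["exposure_time_seconds"])
```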

## What to retain from FITS

- Readable metadata
- Support for binary data, including arrays and tables
- Useful as an archival format (i.e., not just a software library)

## High Level ASDF File Structure

An ASDF file consists of a single YAML header, followed by zero or more binary blocks.

YAML (a recursive acronym for "YAML Ain't Markup Language", originally "Yet Another Markup Language") is a standard text format that is easier to read than XML and more concise and readable than JSON. The YAML header is generally intended to hold all the metadata for the data contained in the file. It also describes how the data are organized and how they are associated with the relevant metadata.
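
As a rough illustration of this layout, the sketch below splits the raw bytes of an ASDF-style file into its text header and whatever binary data follows. It relies only on the standard library and on the block magic token `\xd3BLK` that the ASDF standard uses to mark the start of a binary block. The function name `split_header_and_blocks` and the toy file contents are assumptions made for this example. The toy bytes are hand-written, were not produced by the asdf library, and are not a complete valid ASDF file. A real reader would use the asdf library rather than this simplification.

```python
BLOCK_MAGIC = b"\xd3BLK"  # byte sequence that starts a binary block


def split_header_and_blocks(raw: bytes):
    """Return (header_text, block_bytes) for the raw bytes of an ASDF-style file."""
    start = raw.find(BLOCK_MAGIC)
    if start == -1:
        # No binary blocks: the whole file is the text header.
        return raw.decode("utf-8"), b""
    return raw[:start].decode("utf-8"), raw[start:]


# Toy stand-in for a real file: a short YAML header followed by
# placeholder bytes where a real block header and array data would be.
toy_file = (
    b"#ASDF 1.0.0\n"
    b"%YAML 1.1\n"
    b"---\n"
    b"observer: A. Astronomer\n"
    b"exposure_time_seconds: 120.0\n"
    b"...\n"
) + BLOCK_MAGIC + b"<placeholder for block header and array bytes>"

header, blocks = split_header_and_blocks(toy_file)
print(header)
print(f"{len(blocks)} bytes follow the header")
```

The same function can be pointed at the bytes of a real file (for example via `pathlib.Path(...).read_bytes()`) to see that the metadata at the top is readable as plain text.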

## Added Features

- No limit on the length of attribute names (analogous to FITS keyword names).
- No limit on attribute values, which may themselves be complex values. This allows arbitrarily nested structures.
- Ability to "tag" values as having a special type that software may know how to specially handle.
- Ability to have different attributes reference the same data or object without making copies, so that data or object can be shared by many things.
- Full support for all standard numerical binary values.
- Support for storing some or all data "in-line", i.e., as text values.
- Support for a very flexible WCS model.
- Mechanism for extensions to the standard format, both in the standard itself, and for local purposes.
- Versioned globally, and for specific extensions.
- Support for schema files that can be used to validate the proper structure of the file, both for reading and writing.
- Support for streamed data.
- Language agnostic, but currently only fully supported in Python

The value of many of these items may not yet be obvious to the reader, but the tutorials will illustrate it with examples.

## FITS Features not yet Supported

- Variable length records in tables (using heap storage)
- Random Groups

## Prerequisites to Tutorials

### Installations

Python 3.6 or later must be installed, along with ipython, jupyter, numpy, astropy, gwcs, asdf, and matplotlib. We generally recommend installing Python using miniconda, since conda allows easy cloning of different Python environments ([miniconda download](https://docs.conda.io/en/latest/miniconda.html)), and then using `pip install` for the rest after activating the miniconda environment. For example, at the shell level:

```
pip install ipython
pip install jupyter
pip install numpy
pip install astropy
pip install asdf
pip install gwcs
pip install matplotlib
```

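
After installation, a quick check such as the following sketch can confirm which of the packages are visible to the active Python environment. It uses `importlib.metadata` from the standard library, which requires Python 3.8 or later (newer than the 3.6 minimum stated above). The helper name `installed_versions` is an assumption made for this example.

```python
from importlib import metadata

# Package names taken from the pip commands above.
REQUIRED = ["ipython", "jupyter", "numpy", "astropy", "asdf", "gwcs", "matplotlib"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name:<12} {version or 'NOT INSTALLED'}")
```
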
### Data Files

The tutorials generally require accessing sample ASDF files. The tutorials usually do this by referring to files on the network that will be downloaded if not already present on your system. The needed downloads will be part of the tutorials.
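
The download-if-missing pattern described above might look roughly like the sketch below. It is not the code the tutorials use. The function name `fetch_if_missing` and the URL in the usage comment are hypothetical, and each tutorial supplies its own download step.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def fetch_if_missing(url, destination):
    """Download url to destination unless a file is already there."""
    path = Path(destination)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        # Stream the response to disk rather than holding it all in memory.
        with urlopen(url) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)
    return path


# Hypothetical usage (the URL is a placeholder, not a real data location):
# sample = fetch_if_missing("https://example.org/data/sample.asdf", "data/sample.asdf")
```

Because the function returns early when the file exists, rerunning a notebook does not download the same sample file again.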

## Tutorials

- [Anatomy of an ASDF file](Anatomy_of_an_ASDF_file.ipynb): Basic explanation of the ASDF file structure. Recommended if you are starting from the beginning with no particular ASDF data to deal with.
- [Reading a JWST ASDF file](Reading_a_JWST_ASDF_file.ipynb): Works through an example JWST ASDF file. The most useful place to start if you have been given a JWST file and want to examine its contents quickly. It shows more realistic examples, using a more complex file than the tutorial above.
- [ASDF and World Coordinate Systems](ASDF_and_World_Coordinate_Systems.ipynb): Examples of using Generalized World Coordinate Systems saved to and read from ASDF files. Start with one of the two tutorials above first.
- [Validation and dealing with errors](Validation_and_Dealing_with_Errors.ipynb): How to deal with errors that may arise.

## References and Documentation

- [Original ASDF Paper](https://www.sciencedirect.com/science/article/pii/S2213133715000645)
- [Python ASDF Library Documentation](https://asdf.readthedocs.io/en/)
- [ASDF Standard Documentation](https://asdf-standard.readthedocs.io/en/)
- [asdf-users mailing list](https://groups.google.com/g/asdf-users)