In [2]:
import asdf
import numpy as np

np.random.seed(42)

# 3 - Creating ASDF Files

## Introduction

ASDF files store their information using a tree (nested key/value) structure. This allows the stored
information to be hierarchically organized within the file. Without any extensions, this tree is a
nested combination of basic data structures:
- maps,
- lists,
- arrays,
- strings,
- booleans,
- and numbers.

All of these are stored using `yaml`. More complex structures (ones not directly supported
by `yaml`) are denoted using `yaml` tags. However, those tagged "sub-trees" are still composed of the
above basic structures and other tagged sub-trees. Additional tagged objects are supported via
ASDF extensions.

The Python analogs for these types are:
- maps -> `dict`,
- lists -> `list`,
- arrays -> `np.ndarray`,
- strings -> `str`,
- booleans -> `bool`,
- and numbers -> `int`, `float`, `complex` (depending on the type of number).

Here `np.ndarray` objects are treated specially: their data is stored in binary blocks rather than
in regular `yaml`. Note that ASDF limits dictionary keys to `bool`, `int`, or `str` types
only, while values can be any of the above data types.

Typically, when creating an ASDF file using the Python library, one begins by creating a nested Python
dictionary which corresponds to the nested tree structure one wants the file to have. Indeed, one can
interact with any `AsdfFile` object as if it were a dictionary representing this tree structure.

## Creating ASDF files using basic Python types

Let's first create an ASDF file with the key/value pair `"hello": "world"`:

In [3]:
tree = {"hello": "world"}
af = asdf.AsdfFile(tree)
af.write_to("hello.asdf")
af["hello"]

'world'

Open the `hello.asdf` file in your favorite text editor. You should see something that looks like:

In [4]:
with open("hello.asdf") as f:
    print(f.read())

#ASDF 1.0.0
#ASDF_STANDARD 1.5.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.1.0
asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
  name: asdf, version: 2.12.0}
history:
  extensions:
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension.BuiltinExtension
    software: !core/software-1.0.0 {name: asdf, version: 2.12.0}
hello: world
...



Notice that the file contains more information than just the `"hello": "world"` key/value pair that we
entered. It contains information on the library used to create the file under `asdf_library`, and
information on what the ASDF library needs (schemas, extensions, etc.) to deserialize the stored
data under `history`.


### Exercise 1
Create an ASDF file that stores information using all the basic Python types
except `np.ndarray`:

In [9]:
content = {
    "integer": 5,
    "float": 5.0,
    "complex": complex(5.0, 2.0),
    "bool": False,
    "string": "my string",
    "list": [1, 2.0, "three"],
    "dict": {"a": 5, "b": 5.0, "c": "five point oh"},
    "set": {5, 5.0, "five point oh"}
}
asdf.AsdfFile(content).write_to("my.asdf")

In [10]:
!cat my.asdf

#ASDF 1.0.0
#ASDF_STANDARD 1.5.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.1.0
asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
  name: asdf, version: 2.12.0}
history:
  extensions:
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension.BuiltinExtension
    software: !core/software-1.0.0 {name: asdf, version: 2.12.0}
bool: false
complex: !core/complex-1.0.0 (5+2j)
dict: {a: 5, b: 5.0, c: five point oh}
float: 5.0
integer: 5
list: [1, 2.0, three]
set: !!set {five point oh: null, 5: null}
string: my string
...


## Creating ASDF files with `np.ndarray`

Beyond the maps, lists, strings, and numbers built into Python, ASDF can save arrays, in particular
numpy arrays (`np.ndarray`). Indeed, much of ASDF is dedicated to efficiently saving arrays.

For example, suppose we want to save a random 8x8 numpy array:

In [11]:
tree = {"random_array": np.random.rand(8, 8)}
af = asdf.AsdfFile(tree)
af.write_to("random.asdf")

Now opening this file in your text editor will result in something like:

In [12]:
with open("random.asdf", "r", encoding="unicode_escape") as f:
    print(f.read())

#ASDF 1.0.0
#ASDF_STANDARD 1.5.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.1.0
asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
  name: asdf, version: 2.12.0}
history:
  extensions:
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension.BuiltinExtension
    software: !core/software-1.0.0 {name: asdf, version: 2.12.0}
random_array: !core/ndarray-1.0.0
  source: 0
  datatype: float64
  byteorder: little
  shape: [8, 8]
...
ÓBLK 0                             ±PÑ×vd=ZÏìQ_wø×?TÖ»h@lî?Qg~lç?°,cÖ5(ã?!"7køÃ?Lá ÷Ã?pUd"½­?µ·U	··ë?­KU<ã?Ò¨æ?~8?¥Sãb	ï?öÁ¿^£ê?0í-Ë?8à_
FÇ?H4áÌyÇ?hªät´xÓ?c< ÎÊà?Z;/¸ü¤Û?½2£Ò?'uLã?|$¿PïÚÁ?T@Êw²Ò?8·Üñxr×?Ü¹(@0Ý?.:yV) é?téÉ?ÔîÇtà?÷Ýeõâ?pC	7YÈ§?¢¸æqã?´>5¼ÓÅ? ªl 8§°?ÐÄ3E]î? Î@&uæî?ªdÞé?ê² ÀÊ~Ó?p{c'
¹?aÏö¨<åå?ÜRw]u+Ü?°Óå=¿?bª~ú°ß? hK_¡?1r'í?08uéÙÐ?v}ïa3å?¸fóÓ?¨îÔ°e¤à?<SE¦~á?\BO©Ç?æ^WÖï?XÌYãÍè?C¢`î?æ´÷l¢ì

Observe that the end of the file contains binary data. This binary data holds the contents of the
random array we wrote. When ASDF writes arrays to a file, it stores them as binary data in a block after
the YAML section of the file rather than in the YAML section itself. Note that `random_array` in the YAML section stores some
information about the nature of the array and includes the `source` key. This `source` value identifies which binary block
(in this case block `0`) the data is stored in.

ASDF stores this data efficiently: an array shared between different objects in the ASDF tree is
stored only once as a binary block, and every entry in the yaml metadata references that same block.
This extends to objects which reference different views of the same data: all views reference the
same binary block, and the metadata stores only the information describing each view.

### Exercise 2

Create a tree containing the same `np.ndarray` twice, and multiple views on the same `np.ndarray`.

In [15]:
np.random.rand(8,8)

array([[0.28093451, 0.54269608, 0.14092422, 0.80219698, 0.07455064,
        0.98688694, 0.77224477, 0.19871568],
       [0.00552212, 0.81546143, 0.70685734, 0.72900717, 0.77127035,
        0.07404465, 0.35846573, 0.11586906],
       [0.86310343, 0.62329813, 0.33089802, 0.06355835, 0.31098232,
        0.32518332, 0.72960618, 0.63755747],
       [0.88721274, 0.47221493, 0.11959425, 0.71324479, 0.76078505,
        0.5612772 , 0.77096718, 0.4937956 ],
       [0.52273283, 0.42754102, 0.02541913, 0.10789143, 0.03142919,
        0.63641041, 0.31435598, 0.50857069],
       [0.90756647, 0.24929223, 0.41038292, 0.75555114, 0.22879817,
        0.07697991, 0.28975145, 0.16122129],
       [0.92969765, 0.80812038, 0.63340376, 0.87146059, 0.80367208,
        0.18657006, 0.892559  , 0.53934224],
       [0.80744016, 0.8960913 , 0.31800347, 0.11005192, 0.22793516,
        0.42710779, 0.81801477, 0.86073058]])

In [16]:
block = np.random.rand(8, 4)
content = {
    "a": block,
    "b": {"c": block}
}
asdf.AsdfFile(content).write_to("views.asdf")

In [17]:
!cat views.asdf

#ASDF 1.0.0
#ASDF_STANDARD 1.5.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.1.0
asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
  name: asdf, version: 2.12.0}
history:
  extensions:
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension.BuiltinExtension
    software: !core/software-1.0.0 {name: asdf, version: 2.12.0}
a: &id001 !core/ndarray-1.0.0
  source: 0
  datatype: float64
  byteorder: little
  shape: [8, 4]
b:
  c: *id001
...
�BLK 0                             ��\���l
�y���w�@��MT�y|?V�"�
����?��Q��?ЀBl[�?����W��?f�����?d+�����?@�~H��?V��1S�?�)6�r��?��{ɪ��?6�Rh"��?��z0_�?��3�j�?Dj6_�M�?#ASDF BLOCK INDEX�
%YAML 1.1
---
- 528
...


In [18]:
!xxd views.asdf

00000000: 2341 5344 4620 312e 302e 300a 2341 5344  #ASDF 1.0.0.#ASD
00000010: 465f 5354 414e 4441 5244 2031 2e35 2e30  F_STANDARD 1.5.0
00000020: 0a25 5941 4d4c 2031 2e31 0a25 5441 4720  .%YAML 1.1.%TAG 
00000030: 2120 7461 673a 7374 7363 692e 6564 753a  ! tag:stsci.edu:
00000040: 6173 6466 2f0a 2d2d 2d20 2163 6f72 652f  asdf/.--- !core/
00000050: 6173 6466 2d31 2e31 2e30 0a61 7364 665f  asdf-1.1.0.asdf_
00000060: 6c69 6272 6172 793a 2021 636f 7265 2f73  library: !core/s
00000070: 6f66 7477 6172 652d 312e 302e 3020 7b61  oftware-1.0.0 {a
00000080: 7574 686f 723a 2054 6865 2041 5344 4620  uthor: The ASDF 
00000090: 4465 7665 6c6f 7065 7273 2c20 686f 6d65  Developers, home
000000a0: 7061 6765 3a20 2768 7474 703a 2f2f 6769  page: 'http://gi
000000b0: 7468 7562 2e63 6f6d 2f61 7364 662d 666f  thub.com/asdf-fo
000000c0: 726d 6174 2f61 7364 6627 2c0a 2020 6e61  rmat/asdf',.  na
000000d0: 6d65 3a20 6173 6466 2c20 7665 7273 696f  me: asdf, versio
000000e0: 6e3a 2032 2e31 322e 307d 0a68 6973 746

## Serializing Other Objects

As mentioned above, other types of objects can also be serialized by ASDF, including objects outside
the ASDF standard. However, support for these objects requires the creation of an ASDF extension, which
we will describe in a later tutorial.

For our current purposes, recall that these objects are denoted in the `yaml` metadata via a `yaml`
tag. Indeed, some of the objects already discussed are tagged in the metadata. These tags are used by
ASDF to determine which extension to use when reading an ASDF file. This enables the "seamless" nature
of reading objects from an ASDF file, provided the necessary ASDF extension is installed. Note that
when a tagged object is present in an ASDF file but no extension can be found to handle that tag, ASDF
will raise a warning and return that "object" in its "raw" form, meaning you will get the nested dictionary
object rather than a fully realized instance of the object you wrote.

On the other hand, ASDF extensions specify what Python objects they support. This is how ASDF can
seamlessly recognize a complex object and serialize it with no input from the user (other than installing
the correct ASDF extensions).

For example, as part of the install for this course we installed the `asdf-astropy` package, which provides
extensions for writing many `astropy` objects. Indeed, `asdf-astropy` enables ASDF support for:

- `astropy` `unit` and `quantity` objects.
- (Most) `astropy` model objects.
- `astropy` `Time` objects.
- `astropy` coordinate and frame objects.
- `astropy` `Table` objects.

Thus we can serialize an `astropy` `Table` object directly:

In [19]:
from astropy.table import Table

tree = {"table": Table(dtype=[("a", "f4"), ("b", "i4"), ("c", "S2")])}
af = asdf.AsdfFile(tree)
af.write_to("table.asdf")

Notice how no additional effort was needed to write the ASDF file since `asdf-astropy` was installed
already. Now let's perform a cursory inspection of the `table.asdf` file:

In [21]:
with open("table.asdf", "r", encoding="unicode_escape") as f:
    print(f.read())

#ASDF 1.0.0
#ASDF_STANDARD 1.5.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.1.0
asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
  name: asdf, version: 2.12.0}
history:
  extensions:
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension.BuiltinExtension
    software: !core/software-1.0.0 {name: asdf, version: 2.12.0}
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension._manifest.ManifestExtension
    extension_uri: asdf://asdf-format.org/core/extensions/core-1.5.0
    software: !core/software-1.0.0 {name: asdf-astropy, version: 0.2.1}
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension._manifest.ManifestExtension
    extension_uri: asdf://astropy.org/astropy/extensions/astropy-1.0.0
    software: !core/software-1.0.0 {name: asdf-astropy, version: 0.2.1}
table: !<tag:astropy.org:astropy/table/table-1.0.0>
  colnames: [a, b, c]
  columns:
  - !core/column-1.0.0

### Exercise 3

Write an ASDF file containing the following `astropy` objects:
1. `Quantity`
2. A `model`

   Hint: The `astropy.modeling` package provides a framework for representing models and performing model evaluation and fitting. Models are initialized using their parameters
   ```
   from astropy.modeling import models
   gauss = models.Gaussian1D(amplitude=10, mean=3, stddev=1.2)
   ```
3. A `Time` object

    Hint: The `astropy.time` package provides functionality for manipulating times and dates. To initialize it supply a string and a format, or supply a datetime object.
    
4. A Celestial coordinate object (astronomy specific).