In [None]:
import asdf
import os
import numpy as np

from dataclasses import dataclass
from pathlib import Path

# 6 - Creating Custom ASDF Extensions

Often we want to be able to save our "custom" Python objects to ASDF in a "seamless"
fashion, just as we were able to save various `astropy` objects using `asdf-astropy`
in [tutorial 3](./03-Creating_ASDF_Files.ipynb). Here we will work through how to create
the ASDF extensions needed to support this for a "custom" object.

## Example Object

Let's start with a simple geometric ellipse as a custom Python object:

In [None]:
@dataclass
class Ellipse:
    """An ellipse defined by semi-major and semi-minor axes.

    Note: Using a dataclass to define the object so that we get `==` for free.
    """

    semi_major: float
    semi_minor: float

## Introduction to Writing an Extension

Our ultimate goal is to create a `Converter` class which converts from an `Ellipse` object
to an ASDF file and from an ASDF file back into an `Ellipse` object, such that an `Ellipse`
object "round-trips".

Formally, the `Converter` interface defines a mapping between tagged objects in the `ASDF` tree
and their corresponding Python objects. Typically, this means one converter per tag/object
pair; however, a single `Converter` can also support many-to-one and many-to-many mappings
between tags and objects.

Thus to successfully create an ASDF extension to support `Ellipse`, we need three things:

1. A `tag` for `Ellipse`.
2. A `Converter` for `Ellipse`.
3. An `Extension` for `Ellipse`.

### Creating a `tag`

Recall that ASDF supports schemas for validating the information stored in its files. The
determination of which schema(s) need to be used to validate what parts of the ASDF tree
is noted by a yaml `tag` within the metadata.

This means we need to:

1. Create a `schema` for `Ellipse`.
2. Create a `tag` for that schema.

Note that to create the `tag` for the schema, we will need to create an "extension manifest".

#### Create a `schema`

Note that schemas are typically stored in `yaml` files which are then loaded into ASDF via an
"entry point"; however, to begin with let's create the schema for `Ellipse` dynamically:

In [None]:
ellipse_uri = "asdf://example.com/example-project/schemas/ellipse-1.0.0"

ellipse_schema_content = f"""
%YAML 1.1
---
$schema: http://stsci.edu/schemas/yaml-schema/draft-01
id: {ellipse_uri}

type: object
properties:
  semi_major:
    type: number
  semi_minor:
    type: number
required: [semi_major, semi_minor]
...
"""

Note that ASDF uses JSON schema for its schema language; however, ASDF uses `yaml`
as its file format for schemas, not `json`.
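
To make that relationship concrete, here is a small sketch of the same constraints written as a plain Python dictionary, which is roughly what a JSON schema document looks like once parsed. The variable name `ellipse_schema_as_dict` is ours, and the `$schema` and `id` keys are left out for brevity; serializing it with `json` shows the JSON form that the YAML document above mirrors.

```python
import json

# Hypothetical name; same constraints as the YAML schema above, minus `$schema` and `id`.
ellipse_schema_as_dict = {
    "type": "object",
    "properties": {
        "semi_major": {"type": "number"},
        "semi_minor": {"type": "number"},
    },
    "required": ["semi_major", "semi_minor"],
}

# JSON and YAML are two serializations of the same nested mapping.
print(json.dumps(ellipse_schema_as_dict, indent=2))
```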

Now we can dynamically add this schema to ASDF using `add_resource_mapping`:

In [None]:
asdf.get_config().add_resource_mapping({ellipse_uri: ellipse_schema_content})

Later we will go over how to perform this automatically via an entry-point.

Let's now load the schema we just created and check that it is valid:

In [None]:
schema = asdf.schema.load_schema(ellipse_uri)
asdf.schema.check_schema(schema)

Note that `asdf.schema.check_schema` will work directly on any `yaml` file loaded
through the `pyyaml` interface.

Let's also attempt to validate a portion of an ASDF tree for `Ellipse` against this schema:

In [None]:
test_ellipse_object = {"semi_major": 1.0, "semi_minor": 2.0}

asdf.schema.validate(test_ellipse_object, schema=schema)

#### Creating the `tag` Extension

The mechanism that ASDF uses to bind a `tag` to a schema is a `manifest`: a
special YAML file which lists pairs of `tag_uri` and `schema_uri`. Each pair
associates a `schema_uri` (the URI that identifies the schema) with a specific `tag_uri`
(the URI used to reference a specific type of object). This allows a `schema` to be reused
for multiple types of objects, which may contain identical information but have different
functionality.
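
As a rough illustration of that pairing, the sketch below uses plain dictionaries in place of a manifest's `tags` list. The second tag (`orbit-1.0.0`) and the names `manifest_tags` and `schema_for_tag` are hypothetical; they only show two tags sharing one schema.

```python
# Illustrative only: plain dictionaries standing in for a manifest's `tags` list.
SCHEMA_URI = "asdf://example.com/example-project/schemas/ellipse-1.0.0"

# Hypothetical second tag ("orbit") that reuses the ellipse schema.
manifest_tags = [
    {
        "tag_uri": "asdf://example.com/example-project/tags/ellipse-1.0.0",
        "schema_uri": SCHEMA_URI,
    },
    {
        "tag_uri": "asdf://example.com/example-project/tags/orbit-1.0.0",
        "schema_uri": SCHEMA_URI,
    },
]

# Build a tag -> schema lookup from the (tag_uri, schema_uri) pairs.
schema_for_tag = {entry["tag_uri"]: entry["schema_uri"] for entry in manifest_tags}

for tag_uri, schema_uri in schema_for_tag.items():
    print(f"{tag_uri} -> {schema_uri}")
```

Both tags resolve to the same schema, yet each could be handled by a different converter and produce a different Python type.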

Now let's create a manifest for ASDF which has a tag pointing to the schema for `Ellipse`:

In [None]:
ellipse_manifest_uri = "asdf://example.com/example-project/manifests/shapes-1.0.0"
ellipse_extension_uri = "asdf://example.com/example-project/extensions/shapes-1.0.0"
ellipse_tag = "asdf://example.com/example-project/tags/ellipse-1.0.0"

ellipse_manifest_content = f"""
%YAML 1.1
---
id: {ellipse_manifest_uri}
extension_uri: {ellipse_extension_uri}

title: Example Shape extension 1.0.0
description: Tags for example shape objects.

tags:
  - tag_uri: {ellipse_tag}
    schema_uri: {ellipse_uri}
...
"""

asdf.get_config().add_resource_mapping({ellipse_manifest_uri: ellipse_manifest_content})

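# Check: validate the parsed manifest against ASDF's extension manifest schema.
import yaml

manifest_schema = asdf.schema.load_schema(
    "asdf://asdf-format.org/core/schemas/extension_manifest-1.0.0"
)
asdf.schema.validate(yaml.safe_load(ellipse_manifest_content), schema=manifest_schema)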

### Create a `Converter`

All converters should be constructed as subclasses of the abstract type `asdf.extension.Converter`,
which requires that you define two methods:

1. `to_yaml_tree`: converts a Python object into an ASDF tree.
2. `from_yaml_tree`: converts an ASDF tree into a Python object.

Both methods receive the `tag` of the object being converted, so a converter can tailor its
behavior to the specific tag or type.

Your converter also needs to define the following two class attributes:

1. `tags`: A list of tags that this converter will use when reading ASDF.
2. `types`: A list of Python (object) types that this converter will use when writing ASDF.

Note that the two lists are independent: entries in `tags` do not need to line up positionally
with entries in `types`. For the converter to actually be used by ASDF, at least one of the
`tags` needs to be registered as a resource with ASDF (usually via the entry point).

An example converter for `Ellipse`:

In [None]:
class EllipseConverter(asdf.extension.Converter):
    tags = [ellipse_tag]
    types = [Ellipse]

    def to_yaml_tree(self, obj, tag, ctx):
        return {
            "semi_major": obj.semi_major,
            "semi_minor": obj.semi_minor,
        }

    def from_yaml_tree(self, node, tag, ctx):
        return Ellipse(semi_major=node["semi_major"], semi_minor=node["semi_minor"])

Note that, to keep entry-point loading fast, one will normally defer the `import` of the object to be
created until `from_yaml_tree` is actually called.
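
A minimal sketch of that deferred-import pattern follows. To stay runnable without a real package, it uses the standard library's `fractions.Fraction` as a stand-in for the custom type and a plain class named `LazyFractionConverter`; both, along with the tag URI, are our choices for illustration. A real converter would subclass `asdf.extension.Converter` and import from its own package.

```python
class LazyFractionConverter:
    # Hypothetical tag URI, for illustration only.
    tags = ["asdf://example.com/example-project/tags/fraction-1.0.0"]
    # The type is named by a string, so this module does not import it up front.
    types = ["fractions.Fraction"]

    def to_yaml_tree(self, obj, tag, ctx):
        return {"numerator": obj.numerator, "denominator": obj.denominator}

    def from_yaml_tree(self, node, tag, ctx):
        # Import here so the cost is paid only when an object is actually read.
        from fractions import Fraction

        return Fraction(node["numerator"], node["denominator"])


# Exercise the two methods directly, without going through an ASDF file.
converter = LazyFractionConverter()
node = {"numerator": 3, "denominator": 4}
restored = converter.from_yaml_tree(node, converter.tags[0], None)

print(restored)
print(converter.to_yaml_tree(restored, converter.tags[0], None) == node)
```

Calling the methods directly like this is also a convenient way to unit test a converter's logic before wiring it into an extension.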

### Create the Full Extension

Now let's dynamically create an extension around the `ellipse_tag` and `EllipseConverter`:

In [None]:
class EllipseExtension(asdf.extension.Extension):
    extension_uri = ellipse_extension_uri
    converters = [EllipseConverter()]
    tags = [ellipse_tag]


asdf.get_config().add_extension(EllipseExtension())

#### Testing the `Ellipse` Extension

Let's now check that we can round-trip an `Ellipse` object through ASDF:

In [None]:
ellipse = Ellipse(1.0, 2.0)

with asdf.AsdfFile() as af:
    af["ellipse"] = ellipse
    af.write_to("ellipse.asdf")

Let's examine the contents of the ASDF file and then read/compare them to our original object:

In [None]:
with open("ellipse.asdf") as f:
    print(f.read())

with asdf.open("ellipse.asdf") as af:
    print(af["ellipse"])
    assert af["ellipse"] == ellipse

## Using Entry-Points to Automatically Extend ASDF

Now let's move on to making the above extension automatically available to
ASDF, much like the extensions found in `asdf-astropy`.

Recall that in order for the extension to function, we must have the required schema and manifest
"resources" available for ASDF to use. Only then can a functional extension be added to ASDF.
This means we need to:

1. Package and add the resources to ASDF using an entry point
2. Add the extension to ASDF using an entry point

### Resources

Normally we organize the resource files into a directory structure which can be parsed to form part
of the URI (`id`) used for each resource document. This is done so that adding resources can be
performed by ASDF by crawling these directory structures.

Let's first take our two "resources" for `Ellipse` and turn them into these resource files:

In [None]:
schema_root = "resources/schemas"
manifest_root = "resources/manifests"

os.makedirs(schema_root, exist_ok=True)
os.makedirs(manifest_root, exist_ok=True)

with open(f"{schema_root}/ellipse-1.0.0.yaml", "w") as f:
    f.write(ellipse_schema_content)

with open(f"{manifest_root}/shapes-1.0.0.yaml", "w") as f:
    f.write(ellipse_manifest_content)


ASDF provides the `asdf.resource.DirectoryResourceMapping` object to crawl resource directories.
It allows us to turn these directory structures into objects which can subsequently be added to
ASDF using the entry points.

These objects require two input parameters:
1. A path to the root directory which contains the resources to be added.
2. The prefix that will be used together with the file names to generate the URI
for the resource in question.

There are some optional inputs:
1. `recursive`: (default `False`) which determines if the object will search recursively through
subdirectories.
2. `filename_pattern`: (default: `*.yaml`) Glob pattern for the files that should be added.
3. `stem_filename`: (default: `True`) determine if the file extension should be removed when creating
the URI.
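
The sketch below approximates how a file name and a prefix combine into a resource URI under the default settings (non-recursive, `*.yaml`, extension removed). The helper `uris_for_directory` is hypothetical and is not how `DirectoryResourceMapping` is implemented; it only illustrates the naming rule described above.

```python
import tempfile
from pathlib import Path


def uris_for_directory(root, uri_prefix, filename_pattern="*.yaml", stem_filename=True):
    """Hypothetical helper: list the URIs that files directly under `root` would map to."""
    uris = []
    for path in sorted(Path(root).glob(filename_pattern)):
        # Drop the file extension when `stem_filename` is set.
        name = path.stem if stem_filename else path.name
        uris.append(uri_prefix.rstrip("/") + "/" + name)
    return uris


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ellipse-1.0.0.yaml").write_text("placeholder")
    (Path(tmp) / "notes.txt").write_text("ignored: does not match the pattern")
    uris = uris_for_directory(tmp, "asdf://example.com/example-project/schemas/")

print(uris)
```

With these inputs the only URI produced is `asdf://example.com/example-project/schemas/ellipse-1.0.0`, which matches the `id` declared inside the schema file.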

In this case we do not need to set any of the optional inputs; we only need to do the following:

In [None]:
schema_prefix = "asdf://example.com/example-project/schemas/"
schema_mapping = asdf.resource.DirectoryResourceMapping(schema_root, schema_prefix)

manifest_prefix = "asdf://example.com/example-project/manifests/"
manifest_mapping = asdf.resource.DirectoryResourceMapping(
    manifest_root, manifest_prefix
)

Now these "mapping" objects can be returned as elements of a list by a function, which
we will later reference when writing our entry point:

In [None]:
# In module asdf_shapes.integration
def get_resource_mappings():
    return [schema_mapping, manifest_mapping]

Then in the `setup.cfg` for your package you can define a section `[options.entry_points]` which defines
an `asdf.resource_mappings` entry point as follows:

```
[options.entry_points]
asdf.resource_mappings =
    asdf_shapes_schemas = asdf_shapes.integration:get_resource_mappings
```

Once your package is installed, these resources are available to ASDF without any direct intervention
on the user's part (i.e. `add_resource_mapping` calls are unnecessary).

### Extensions

Since extensions are pure Python objects, there is not as much boilerplate needed in order
to conveniently add them to ASDF using entry points. Indeed, one only needs to return all
the extensions as elements of a list via a function, as we did for resources (though it
needs to be a different function):

In [None]:
# In module asdf_shapes.integration
def get_extensions():
    return [EllipseExtension()]

Then we add a different entry point in the `setup.cfg` under `asdf.extensions`, giving us:

```
[options.entry_points]
asdf.resource_mappings =
    asdf_shapes_schemas = asdf_shapes.integration:get_resource_mappings
asdf.extensions =
    asdf_shapes_extensions = asdf_shapes.integration:get_extensions
```

Once your package is installed with the above entry points, ASDF will be able to seamlessly
handle reading and writing the Python objects you created extensions for.
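
One way to confirm that the entry points were registered after installation is to list them with the standard library's `importlib.metadata`. The helper name `asdf_entry_points` is ours; the group names are the ones used in the `setup.cfg` above. If the example package is not installed, only entry points from other installed packages (if any) are printed.

```python
import sys
from importlib.metadata import entry_points


def asdf_entry_points(group):
    """Return the installed entry points registered under `group`."""
    if sys.version_info >= (3, 10):
        return list(entry_points(group=group))
    # Older Pythons return a mapping of group name to entry points.
    return list(entry_points().get(group, []))


for group in ("asdf.resource_mappings", "asdf.extensions"):
    for ep in asdf_entry_points(group):
        print(f"{group}: {ep.name} -> {ep.value}")
```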

## Extending Our Example Object

Suppose that we want to extend our object so that it represents an ellipse in 3D
(centered at the origin) by adding a position angle:

In [None]:
@dataclass
class Ellipse3D(Ellipse):
    position_angle: float

### Extend the Schema

JSON schema does not support the concept of inheritance, which makes "extending"
an existing schema somewhat awkward. What we do instead is create a schema which
adds attributes to the existing schema via the `allOf` operation. In this case,
we can define a schema for `Ellipse3D` by adding a `position_angle` property:

In [None]:
ellipse3d_uri = "asdf://example.com/example-project/schemas/ellipse3d-1.0.0"

ellipse3d_schema_content = f"""
%YAML 1.1
---
$schema: http://stsci.edu/schemas/yaml-schema/draft-01
id: {ellipse3d_uri}

allOf:
  - $ref: {ellipse_uri}
  - properties:
      position_angle:
        type: number
    required: [position_angle]
...
"""

asdf.get_config().add_resource_mapping({ellipse3d_uri: ellipse3d_schema_content})

# check
schema = asdf.schema.load_schema(ellipse3d_uri)
asdf.schema.check_schema(schema)

test_ellipse3d_object = {"semi_major": 1.0, "semi_minor": 2.0, "position_angle": 3.0}
asdf.schema.validate(test_ellipse3d_object, schema=schema)
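
To see why `allOf` behaves like extension, the following toy check requires an instance to pass every listed subschema. This is not ASDF's validator: `satisfies`, `satisfies_all_of`, and the two dictionaries are hypothetical simplifications that handle only `required` keys and numeric types.

```python
# Toy subschemas mirroring the two `allOf` branches above.
ellipse_part = {
    "properties": {"semi_major": "number", "semi_minor": "number"},
    "required": ["semi_major", "semi_minor"],
}
angle_part = {
    "properties": {"position_angle": "number"},
    "required": ["position_angle"],
}


def satisfies(instance, subschema):
    """Check required keys and numeric types only (a deliberate simplification)."""
    if any(key not in instance for key in subschema["required"]):
        return False
    return all(
        isinstance(instance[key], (int, float)) and not isinstance(instance[key], bool)
        for key in subschema["properties"]
        if key in instance
    )


def satisfies_all_of(instance, subschemas):
    # `allOf` passes only when every subschema passes.
    return all(satisfies(instance, sub) for sub in subschemas)


full = {"semi_major": 1.0, "semi_minor": 2.0, "position_angle": 3.0}
partial = {"semi_major": 1.0, "semi_minor": 2.0}

print(satisfies_all_of(full, [ellipse_part, angle_part]))     # True
print(satisfies_all_of(partial, [ellipse_part, angle_part]))  # False
```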

### Create an Updated Manifest

Let's assume that we already have the `shapes-1.0.0` manifest (already released and
in use). Following our suggested versioning system, we should create a new manifest,
`shapes-1.1.0`, which adds the new `ellipse3d-1.0.0` tag:

In [None]:
ellipse3d_manifest_uri = "asdf://example.com/example-project/manifests/shapes-1.1.0"
ellipse3d_extension_uri = "asdf://example.com/example-project/extensions/shapes-1.1.0"
ellipse3d_tag = "asdf://example.com/example-project/tags/ellipse3d-1.0.0"

ellipse3d_manifest_content = f"""
%YAML 1.1
---
id: {ellipse3d_manifest_uri}
extension_uri: {ellipse3d_extension_uri}

title: Example Shape extension 1.1.0
description: Tags for example shape objects.

tags:
  - tag_uri: {ellipse_tag}
    schema_uri: {ellipse_uri}

  - tag_uri: {ellipse3d_tag}
    schema_uri: {ellipse3d_uri}
...
"""

asdf.get_config().add_resource_mapping(
    {ellipse3d_manifest_uri: ellipse3d_manifest_content}
)


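# Check: validate the parsed manifest against ASDF's extension manifest schema.
import yaml

manifest_schema = asdf.schema.load_schema(
    "asdf://asdf-format.org/core/schemas/extension_manifest-1.0.0"
)
asdf.schema.validate(yaml.safe_load(ellipse3d_manifest_content), schema=manifest_schema)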

### Create an Updated `Converter`

The "simplest" approach to creating a `Converter` for `Ellipse3D` would be to simply
create a new converter as we did above for `Ellipse`; however, we can also take advantage
of the fact that multiple `tags` and `types` can be listed. Note that when multiple tags
are handled by the same `Converter`, we need to also implement a `select_tag` method:

In [None]:
class UpdatedEllipseConverter(asdf.extension.Converter):
    tags = [ellipse_tag, ellipse3d_tag]
    types = [Ellipse, Ellipse3D]

    def select_tag(self, obj, tag, ctx):
        if isinstance(obj, Ellipse3D):
            return ellipse3d_tag
        elif isinstance(obj, Ellipse):
            return ellipse_tag
        else:
            raise ValueError(f"Unknown object {type(obj)}")

    def to_yaml_tree(self, obj, tag, ctx):
        tree = {
            "semi_major": obj.semi_major,
            "semi_minor": obj.semi_minor,
        }

        if tag == ellipse3d_tag:
            tree["position_angle"] = obj.position_angle

        return tree

    def from_yaml_tree(self, node, tag, ctx):
        if tag == ellipse_tag:
            return Ellipse(**node)
        elif tag == ellipse3d_tag:
            return Ellipse3D(**node)
        else:
            raise ValueError(f"Unknown tag {tag}")
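
The order of the `isinstance` checks in `select_tag` matters because `Ellipse3D` is a subclass of `Ellipse`. The standalone sketch below redefines the two dataclasses and uses short placeholder strings instead of the tag URIs (the function names are ours) to show what goes wrong when the base class is tested first:

```python
from dataclasses import dataclass


@dataclass
class Ellipse:
    semi_major: float
    semi_minor: float


@dataclass
class Ellipse3D(Ellipse):
    position_angle: float


def tag_subclass_first(obj):
    # Most specific type first, as in `select_tag` above.
    if isinstance(obj, Ellipse3D):
        return "ellipse3d"
    if isinstance(obj, Ellipse):
        return "ellipse"
    raise ValueError(f"Unknown object {type(obj)}")


def tag_base_first(obj):
    # Incorrect ordering: every Ellipse3D is also an Ellipse.
    if isinstance(obj, Ellipse):
        return "ellipse"
    if isinstance(obj, Ellipse3D):
        return "ellipse3d"
    raise ValueError(f"Unknown object {type(obj)}")


shape = Ellipse3D(1.0, 2.0, 3.0)
print(tag_subclass_first(shape))  # ellipse3d
print(tag_base_first(shape))      # ellipse
```

With the base class checked first, an `Ellipse3D` would be written under the `Ellipse` tag and its `position_angle` would be dropped by `to_yaml_tree`.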

### Creating an Updated `Extension`

We can now use this converter to create a new "updated" extension:

In [None]:
class UpdatedEllipseExtension(asdf.extension.Extension):
    extension_uri = ellipse3d_extension_uri
    converters = [UpdatedEllipseConverter()]
    tags = [ellipse_tag, ellipse3d_tag]


asdf.get_config().add_extension(UpdatedEllipseExtension())

### Checking the New Extension

Let's check this new extension by writing both an `Ellipse` and an `Ellipse3D` object
to ASDF:

In [None]:
ellipse = Ellipse(1.0, 2.0)
ellipse3d = Ellipse3D(1.0, 2.0, 3.0)

with asdf.AsdfFile() as af:
    af["ellipse"] = ellipse
    af["ellipse3d"] = ellipse3d
    af.write_to("ellipse3d.asdf")

# Check
with open("ellipse3d.asdf") as f:
    print(f.read())

with asdf.open("ellipse3d.asdf") as af:
    print(af["ellipse"])
    assert af["ellipse"] == ellipse

    print(af["ellipse3d"])
    assert af["ellipse3d"] == ellipse3d