# Extending ASDF

One of the most useful features of ASDF is the ability to extend it to enable the serialization of arbitrary custom types. In this tutorial we will demonstrate how to use the extension mechanism to allow ASDF to process custom types. See the official documentation for [Extending ASDF](https://asdf.readthedocs.io/en/latest/asdf/extensions.html) for more details.

## A Simple Example

The built-in `fractions` module provides a simple `Fraction` type. ASDF does not know how to serialize this type out of the box, so it makes for a good simple example of how to develop a custom type extension.

First, we'll create an instance of `Fraction`:

In [None]:
from fractions import Fraction
f = Fraction(2, 3)

Next we'll create a tree containing our `Fraction` and try to write it out to an ASDF file:

In [None]:
import asdf
tree = dict(fraction=f)
af = asdf.AsdfFile(tree)
af.write_to('fraction.asdf')

We get a `RepresenterError`, which indicates that ASDF doesn't know how to store this object. So what can we do?

### Tag Classes

We need to tell ASDF how to convert an instance of `Fraction` into a YAML representation. We do this by writing a "tag" class that defines how `Fraction` should be converted into a YAML node (or "tree"). All custom tag types for ASDF inherit from `asdf.CustomType` and override the `to_tree` class method. We also define several class-level attributes that provide information to ASDF about how our custom type should be labeled (or tagged, in YAML parlance) in the tree:

In [None]:
class FractionType(asdf.CustomType):
    # An arbitrary name that ASDF will use for the type being serialized
    name = 'fraction'
    # Corresponds to the organization that defines this type.
    # (e.g. stsci.edu, astropy.org, etc.)
    organization = 'example.org'
    # The name of the "standard" that this type belongs to.
    # In general this will correspond to the name of the package defining the type.
    # (e.g. asdf, gwcs, astropy, etc.)
    standard = 'custom'
    # The version of the type. By convention new types begin at 1.0.0.
    version = '1.0.0'
    # A list of the types that will be serialized by this tag class.
    types = [Fraction]

    @classmethod
    def to_tree(cls, node, ctx):
        """
        Takes an instance of Fraction as input and converts it into a tree representation.
        
        By convention, the instance argument is named `node`, although in this case it will
        be an instance of `Fraction`. The argument name is arbitrary.
        
        This function must return a basic Python type that corresponds to the YAML node that will
        be written to represent this type. In most cases, this function will return a `dict`.
        
        `ctx` is the `AsdfFile` instance that is being written out. It is not used in this example.
        """
        tree = dict()
        # This is where we define the way that the attributes of a Fraction instance are
        # represented in a YAML tree.
        tree['numerator'] = node.numerator
        tree['denominator'] = node.denominator
        return tree

For more details about the attributes and methods of `asdf.CustomType` see the [API documentation](https://asdf.readthedocs.io/en/latest/api/asdf.CustomType.html#asdf.CustomType).
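
To make the mapping concrete, here is a minimal sketch that reproduces the body of `to_tree` as a standalone function, without ASDF. The name `fraction_to_tree` is hypothetical and is not part of the ASDF API; it only illustrates the basic Python structure that the tag class hands back for serialization.

```python
from fractions import Fraction


def fraction_to_tree(node):
    """Hypothetical standalone copy of the mapping used in FractionType.to_tree."""
    # Only basic Python types (str keys, int values) end up in the result.
    return {'numerator': node.numerator, 'denominator': node.denominator}


print(fraction_to_tree(Fraction(2, 3)))
# Fraction reduces to lowest terms on construction, so 4/6 is stored as 2/3.
print(fraction_to_tree(Fraction(4, 6)))
```

Both calls print `{'numerator': 2, 'denominator': 3}`, because the stored values come from the already-normalized `Fraction`.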

### Extension Classes

Our `FractionType` class defines how an instance of `Fraction` can be written to a YAML tree. But this is not enough. We still need to tell ASDF how to find `FractionType` and use it when writing out a file. For this, we have to define an "extension" class. All custom ASDF extensions inherit from `asdf.AsdfExtension`.

In [None]:
class CustomFractionExtension(asdf.AsdfExtension):
    @property
    def types(self):
        """
        Returns a list of tag classes that are implemented by this extension.
        """
        return [FractionType]
    
    @property
    def tag_mapping(self):
        """
        This property must return a list, but it can be ignored for now.
        """
        return []
    
    @property
    def url_mapping(self):
        """
        This property must return a list, but it can be ignored for now.
        """
        return []

At a minimum, our extension must implement the `types` property to indicate which tag classes it provides. The `tag_mapping` and `url_mapping` properties must also be overridden, but they are not used in this example (we'll cover them later). For more details about the `asdf.AsdfExtension` base class see the [API Documentation](https://asdf.readthedocs.io/en/latest/api/asdf.AsdfExtension.html#asdf.AsdfExtension).

### Connecting the Pieces

Now we can try to write out our ASDF file containing a `Fraction` instance. We simply need to tell ASDF to use our extension when writing:

In [None]:
af = asdf.AsdfFile(tree, extensions=CustomFractionExtension())
af.write_to('fraction.asdf')

It worked! Let's look at the contents of the file we created:

In [None]:
!cat fraction.asdf

Notice that ASDF used the `name`, `organization`, `standard`, and `version` attributes that we defined in `FractionType` to create a YAML tag for the node that represents our `Fraction` instance.
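
As a rough illustration of how those four attributes combine, the sketch below builds the tag string with plain string formatting. The helper `make_tag` is hypothetical, and the pattern is inferred from the tags that appear in this tutorial rather than taken from ASDF's source.

```python
def make_tag(organization, standard, name, version):
    """Hypothetical helper: assemble a tag in the form seen in this tutorial."""
    return "tag:{}:{}/{}-{}".format(organization, standard, name, version)


# The same values we assigned as class attributes on FractionType
print(make_tag('example.org', 'custom', 'fraction', '1.0.0'))
```

This prints `tag:example.org:custom/fraction-1.0.0`.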

### Reading the File

But what happens when we try to read this new file?

In [None]:
new_af = asdf.open('fraction.asdf')

Reading the file works, but notice that we get a few warnings:
> AsdfConversionWarning: tag:example.org:custom/fraction-1.0.0 is not recognized, converting to raw Python data structure

Also:
> UserWarning: File 'file:///Users/dan/stsci/asdf-notebooks/fraction.asdf' was created with extension '\_\_main\_\_.CustomFractionExtension

Notice that the tree appears to be valid, but it does not contain a `Fraction` instance:

In [None]:
new_af.tree

This is because we didn't use our `CustomFractionExtension` when reading the file. More importantly, we never defined how to turn a YAML node back into a `Fraction` instance!

We need to go back to our `FractionType` class and define another class method called `from_tree`. This method will take a Python data structure representing a YAML node as input and will return a `Fraction` instance:

In [None]:
class FractionType(asdf.CustomType):
    name = 'fraction'
    organization = 'example.org'
    standard = 'custom'
    version = '1.0.0'
    types = [Fraction]

    @classmethod
    def to_tree(cls, node, ctx):
        tree = dict()
        tree['numerator'] = node.numerator
        tree['denominator'] = node.denominator
        return tree
    
    @classmethod
    def from_tree(cls, tree, ctx):
        """
        Takes a YAML data structure as input and returns an instance of Fraction
        
        By convention, the YAML data structure is named `tree`.
        
        `ctx` is the `AsdfFile` instance that is being read. It is not used in this example.
        """
        return Fraction(tree['numerator'], tree['denominator'])

We need to redefine our extension class to reflect the update to `FractionType`:

In [None]:
class CustomFractionExtension(asdf.AsdfExtension):
    @property
    def types(self):
        return [FractionType]
    
    @property
    def tag_mapping(self):
        return []
    
    @property
    def url_mapping(self):
        return []

Now let's try to read the file using our updated tag class and extension:

In [None]:
new_af = asdf.open('fraction.asdf', extensions=CustomFractionExtension())
new_af.tree

It worked! Here's the proof:

In [None]:
isinstance(new_af.tree['fraction'], Fraction)
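
The round trip itself can also be checked without ASDF. The following is a minimal sketch using two hypothetical standalone functions, `fraction_to_tree` and `fraction_from_tree`, that mirror the bodies of the `to_tree` and `from_tree` class methods. It shows only that the two conversions are inverses of each other, not how ASDF invokes them.

```python
from fractions import Fraction


def fraction_to_tree(node):
    """Hypothetical mirror of FractionType.to_tree."""
    return {'numerator': node.numerator, 'denominator': node.denominator}


def fraction_from_tree(tree):
    """Hypothetical mirror of FractionType.from_tree."""
    return Fraction(tree['numerator'], tree['denominator'])


original = Fraction(2, 3)
restored = fraction_from_tree(fraction_to_tree(original))

print(restored == original)
print(isinstance(restored, Fraction))
```

Both lines print `True`. A quick check like this is a cheap way to catch a mismatch between the keys written by `to_tree` and the keys read by `from_tree`.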

## A More Complicated Example

The YAML representation of `Fraction` contains only two top-level attributes, which are both basic Python types. This means that the `to_tree` and `from_tree` methods of our tag class are relatively straightforward. But what happens when we want to represent more complicated types in ASDF? We need to consider cases where the attributes of the type we are trying to represent are themselves more complicated types.

To demonstrate this, we will use a fairly contrived example. We will define a class called `Fractional2DCoord` where each of the coordinate components is a `Fraction` instance:

In [None]:
class Fractional2DCoord:
    """
    This class is silly. Why not just use a tuple?
    
    Okay fine, but in that case you need to come up with a better example.
    """
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return "<Fractional2DCoord({!r}, {!r})>".format(self.x, self.y)

Let's try to store an instance of `Fractional2DCoord` to ASDF using the extension we defined above:

In [None]:
x = Fraction(1, 2)
y = Fraction(2, 3)
coord = Fractional2DCoord(x, y)

In [None]:
tree = dict(coord=coord)
coord_af = asdf.AsdfFile(tree, extensions=CustomFractionExtension())
coord_af.write_to('coord.asdf')

We get a `RepresenterError` since ASDF doesn't know how to serialize this type yet. So let's write the tag class:

In [None]:
from asdf.yamlutil import custom_tree_to_tagged_tree, tagged_tree_to_custom_tree

class Fractional2DCoordType(asdf.CustomType):
    name = 'fractional_2d_coord'
    organization = 'example.org'
    standard = 'custom'
    version = '1.0.0'
    types = [Fractional2DCoord]
    
    @classmethod
    def to_tree(cls, node, ctx):
        tree = dict()
        # These calls ensure that all custom types are processed recursively.
        # In this case, it means that each coordinate component, which is a Fraction,
        # will be processed by FractionType.to_tree.
        tree['x'] = custom_tree_to_tagged_tree(node.x, ctx)
        tree['y'] = custom_tree_to_tagged_tree(node.y, ctx)
        return tree
    
    @classmethod
    def from_tree(cls, tree, ctx):
        # This recurses through the tree and ensures each attribute gets converted
        # to a custom type, if applicable. In this case, it means that each tree attribute
        # is processed by FractionType.from_tree
        x = tagged_tree_to_custom_tree(tree['x'], ctx)
        y = tagged_tree_to_custom_tree(tree['y'], ctx)
        return Fractional2DCoord(x, y)

Note that our `to_tree` and `from_tree` methods make use of the `custom_tree_to_tagged_tree` and `tagged_tree_to_custom_tree` functions from `asdf.yamlutil`, respectively. These functions help us recursively process the subtrees, which ensures that the `to_tree` and `from_tree` methods specific to `Fraction` will be called automatically. (We could have also called `FractionType.to_tree` and `FractionType.from_tree` explicitly, but that would be more susceptible to being broken by future changes. The functions from `asdf.yamlutil` handle things in a more general way.)
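
To show the idea of recursive dispatch in isolation, here is a small sketch in plain Python. It is not how `asdf.yamlutil` is implemented: the `CONVERTERS` registry, the `to_tagged` function, the `Coord` stand-in class, and the `{'tag': ..., 'value': ...}` layout are all assumptions made for illustration.

```python
from fractions import Fraction


class Coord:
    """Stand-in for Fractional2DCoord, kept minimal for this sketch."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Hypothetical registry: maps a Python type to a (tag name, converter) pair.
CONVERTERS = {}


def to_tagged(obj):
    """Convert obj by looking up its type; unregistered types pass through."""
    entry = CONVERTERS.get(type(obj))
    if entry is None:
        return obj
    tag, convert = entry
    return {'tag': tag, 'value': convert(obj)}


CONVERTERS[Fraction] = (
    'fraction',
    lambda f: {'numerator': f.numerator, 'denominator': f.denominator},
)
# The Coord converter never mentions Fraction; it just recurses on its parts.
CONVERTERS[Coord] = (
    'fractional_2d_coord',
    lambda c: {'x': to_tagged(c.x), 'y': to_tagged(c.y)},
)

result = to_tagged(Coord(Fraction(1, 2), Fraction(2, 3)))
print(result['tag'])
print(result['value']['x'])
```

The point of the design is that the converter for the outer type does not need to know which types its attributes hold. If `x` were later changed to some other registered type, the `Coord` converter would not need to change.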

Note that the `name` attribute of `Fractional2DCoordType` reflects the type being represented, but the `organization` and `standard` attributes are the same as those of `FractionType`. This means that both tag classes can be provided by the same extension class, which we redefine below:

In [None]:
class CustomFractionExtension(asdf.AsdfExtension):
    @property
    def types(self):
        # Note that both tag classes are now returned here
        return [FractionType, Fractional2DCoordType]
    
    @property
    def tag_mapping(self):
        return []
    
    @property
    def url_mapping(self):
        return []

Now let's try storing our `Fractional2DCoord` instance using the newly defined extension:

In [None]:
tree = dict(coord=coord)
coord_af = asdf.AsdfFile(tree, extensions=CustomFractionExtension())
coord_af.write_to('coord.asdf')

It worked! Let's look at the resulting file:

In [None]:
!cat coord.asdf

Note that our `coord` attribute was tagged as `tag:example.org:custom/fractional_2d_coord-1.0.0`, and the `x` and `y` attributes of that object were both tagged as `tag:example.org:custom/fraction-1.0.0`.

Let's see what happens when we read the file:

In [None]:
read_coord = asdf.open('coord.asdf', extensions=CustomFractionExtension())
# Using a shortcut here that enables us to get attributes from the top-level AsdfFile instead of using .tree
read_coord['coord']