# Extension Types

## Basics

The Arrow columnar format supports extension types (also called user-defined types), which extend the standard Arrow data types with custom semantics specific to a system or application.

For example:

* A universally unique identifier (`uuid`) can be represented as a 16-byte `FixedSizeBinary` type
* A trading time can be represented as a `Timestamp` with metadata indicating the market trading calendar

An extension type is defined by annotating any of the built-in Arrow logical types (the “storage type”) with a **custom type name** and an **optional serialized representation**. These are stored under the `'ARROW:extension:name'` and `'ARROW:extension:metadata'` keys in the `Field` metadata structure.

Source: https://arrow.apache.org/docs/dev/format/Columnar.html#extension-types

## Canonical Extension Types

Sharing the definitions of well-known extension types improves interoperability between systems that integrate Arrow columnar data. For this reason, canonical extension types are defined in Arrow itself.

Examples:

* Fixed and variable shape tensor
  - https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor
  - https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#variable-shape-tensor

Source: https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html

## Community Extension Types

These are Arrow extension types that have been established as standards within specific domain areas.

Example:

* GeoArrow: a collection of Arrow extension types for representing vector geometries
  - https://github.com/geoarrow/geoarrow

  ```text
  PointArray:PointType(geoarrow.point)[3]
  <POINT (1 3)>
  <POINT (2 4)>
  <POINT (3 5)>
  ```

## Subclassing ExtensionType from Python

Extension types are defined from Python by subclassing pyarrow's [`ExtensionType`](https://arrow.apache.org/docs/dev/python/generated/pyarrow.ExtensionType.html#pyarrow.ExtensionType) and giving the derived class its own extension name and serialization mechanism.

UUID example:

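```python
import pyarrow as pa


class UuidType(pa.ExtensionType):

    def __init__(self):
        # Storage type (16-byte binary) and the extension name.
        super().__init__(pa.binary(16), "my_package.uuid")

    def __arrow_ext_serialize__(self):
        # The type has no parameters, so there is nothing to serialize.
        return b''

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        # Sanity-check the inputs before reconstructing the type.
        assert storage_type == pa.binary(16)
        assert serialized == b''
        return UuidType()
```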