# How-To: Add an Operator

This notebook shows how to add a custom operator.

- For users: End users currently cannot add an operator. By design, it requires modifying the library code.
- For developers: Read the comments in each code block to decide where to add or modify code.

A new operator requires symbolic definition(s) for each layer supporting it and the implementation(s) corresponding to the backend(s).

In this notebook, we illustrate the process with `MYDIFFERENTIATION` and its `torch` backend, which is a replica of `DIFFERENTIATION` in the library.

## Enums

First of all, we need to register the new operator in two `Enum` classes: `CircuitOperator` and `LayerOperator`.

This can only be done by modifying the corresponding classes in the library; it is not possible from user code.

In the rare case that an operator on circuits does not involve any transformation of the layers (e.g. `CircuitOperator.CONCATENATE`), it may skip the `LayerOperator` class and the layer-level implementation of the operator.

[cirkit/symbolic/circuit.py](../cirkit/symbolic/circuit.py)

In [None]:
from enum import IntEnum, auto


class CircuitOperator(IntEnum):
    ...  # Any existing enum values.
    MYDIFFERENTIATION = auto()

[cirkit/symbolic/layers.py](../cirkit/symbolic/layers.py)

In [None]:
class LayerOperator(IntEnum):
    ...  # Any existing enum values.
    MYDIFFERENTIATION = auto()
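
As a minimal sketch of what appending a member does, the toy enum below shows that `auto()` assigns the next free integer, so adding a value at the end leaves the existing ones untouched. `ToyLayerOperator` and its first two member names are placeholders for illustration, not the library's actual list of values.

```python
from enum import IntEnum, auto


class ToyLayerOperator(IntEnum):
    # Placeholder members standing in for the library's existing values.
    INTEGRATION = auto()
    DIFFERENTIATION = auto()
    # The newly appended operator receives the next free integer.
    MYDIFFERENTIATION = auto()


new_value = int(ToyLayerOperator.MYDIFFERENTIATION)
```

Appending at the end, rather than inserting in the middle, keeps the integer values of existing members stable.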

## Symbolic

In the symbolic part, we will have to:
- Decide which layers support this operator;
  - Identify the parameter operations required by the operator;
- Define the process to operate a circuit with its symbolic representation.

None of the above involves actual tensors, only configs and shapes.

We can add rules for as many layers as the operator supports, but for illustrative purposes we only use `PolynomialLayer` from the library here.

For layers that do not support the operator, simply omit the rule and the missing case will be handled properly.

### Parameter Operation

After deciding which layer(s) we want to support, we must define the parameter operations the layer(s) need(s).

Since we are only looking at `PolynomialLayer` here, and the layer has only one parameter, `coeff`, we only need to define one parameter operation.

As differentiation is a unary operator, we can inherit from `UnaryParameterOp` to make the best use of the existing infrastructure. Alternatively, the more general `ParameterOp` class may be inherited.

The minimum definition should include the `shape` property, which defines the output shape of this parameter operation.

In this case, we also need `__init__` because differentiation takes an additional argument `order`, and `config` should be overridden to include `order`.

[cirkit/symbolic/parameters.py](../cirkit/symbolic/parameters.py)

In [None]:
from typing import Any

from cirkit.symbolic.parameters import UnaryParameterOp


class PolynomialMyDifferential(UnaryParameterOp):
    def __init__(self, in_shape: tuple[int, ...], *, order: int = 1):
        if order <= 0:
            raise ValueError("The order of differentiation must be positive.")
        super().__init__(in_shape)
        self.order = order

    @property
    def shape(self) -> tuple[int, ...]:
        # if dp1>order, i.e., deg>=order, then diff, else const 0.
        return (
            self.in_shapes[0][0],  # dim Ko
            self.in_shapes[0][1] - self.order
            if self.in_shapes[0][1] > self.order
            else 1,  # dim dp1
        )

    @property
    def config(self) -> dict[str, Any]:
        return {**super().config, "order": self.order}
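
The shape rule above can be checked in isolation. The following is a minimal sketch in plain Python, with no library dependency. `differential_coeff_shape` is a hypothetical helper introduced only for illustration, and it assumes the coefficient shape is `(Ko, dp1)`, where `dp1` is the degree plus one.

```python
def differential_coeff_shape(in_shape: tuple[int, int], order: int = 1) -> tuple[int, int]:
    """Shape of the coefficients after differentiating `order` times."""
    if order <= 0:
        raise ValueError("The order of differentiation must be positive.")
    num_units, deg_plus_1 = in_shape
    # Each differentiation drops one coefficient. A polynomial whose degree is
    # below the order collapses to the constant 0, which still needs one slot.
    return (num_units, deg_plus_1 - order if deg_plus_1 > order else 1)


# Degree 3 (four coefficients), differentiated twice: two coefficients remain.
shape_cubic = differential_coeff_shape((4, 4), order=2)
# Degree 1 (two coefficients), differentiated twice: the constant 0.
shape_linear = differential_coeff_shape((4, 2), order=2)
```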

### Layer Operator

After the parameter operation has been defined, we can define how the operator acts on the layer by writing a rule function and adding it to the rules registry.

To share the underlying parameters across operations, pass `param.ref()` when building the new parameter from the operator.

The resulting new layer (or layers, if needed) should then be wrapped in a `CircuitBlock` and returned.

[cirkit/symbolic/operators.py](../cirkit/symbolic/operators.py)

In [None]:
from cirkit.symbolic.circuit import CircuitBlock
from cirkit.symbolic.layers import PolynomialLayer
from cirkit.symbolic.operators import DEFAULT_OPERATOR_RULES
from cirkit.symbolic.parameters import Parameter


def my_differentiate_polynomial_layer(
    sl: PolynomialLayer, *, var_idx: int, ch_idx: int, order: int = 1
) -> CircuitBlock:
    # PolynomialLayer is constructed univariate, but we still take the 2 idx for unified interface
    assert (var_idx, ch_idx) == (0, 0), "This should not happen"
    if order <= 0:
        raise ValueError("The order of differentiation must be positive.")
    coeff = Parameter.from_unary(
        PolynomialMyDifferential(sl.coeff.shape, order=order), sl.coeff.ref()
    )
    sl = PolynomialLayer(
        sl.scope, sl.num_output_units, sl.num_channels, degree=coeff.shape[-1] - 1, coeff=coeff
    )
    return CircuitBlock.from_layer(sl)


DEFAULT_OPERATOR_RULES[LayerOperator.MYDIFFERENTIATION].append(my_differentiate_polynomial_layer)
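
To make the registration pattern concrete, here is a toy registry in plain Python. It is only a sketch: all `Toy*` names are hypothetical, and matching a rule by the annotation of its first parameter is an assumption made for illustration, not a description of how the library resolves rules.

```python
from collections import defaultdict
from typing import Callable, get_type_hints


class ToyLayer:
    pass


class ToyPolynomialLayer(ToyLayer):
    pass


class ToyGaussianLayer(ToyLayer):
    pass


# Operator name -> list of rule functions, mirroring the append-to-list registration.
TOY_RULES: dict[str, list[Callable]] = defaultdict(list)


def toy_diff_polynomial(sl: ToyPolynomialLayer, *, order: int = 1) -> str:
    return f"d^{order} polynomial"


TOY_RULES["MYDIFFERENTIATION"].append(toy_diff_polynomial)


def toy_retrieve_rule(op: str, layer_cls: type) -> Callable:
    # Assumed lookup strategy: compare against the first parameter's annotation.
    for rule in TOY_RULES[op]:
        first_param_type = next(iter(get_type_hints(rule).values()))
        if first_param_type is layer_cls:
            return rule
    raise NotImplementedError(f"No rule for {op} on {layer_cls.__name__}")


rule = toy_retrieve_rule("MYDIFFERENTIATION", ToyPolynomialLayer)
result = rule(ToyPolynomialLayer(), order=2)
```

In this sketch, a layer type without a registered rule (such as `ToyGaussianLayer`) raises `NotImplementedError` at lookup time.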

### Operation on Symbolic Circuit

To implement the operator symbolically, we need to define a function that takes the symbolic circuit(s) and an optional custom registry, along with any other arguments the operator needs. The function should return the circuit that results from applying the operator, with proper parameter sharing.

The following code omits many algorithmic details and focuses on the points that matter when writing such a function. Please read the comments to see how it is expected to be defined.

[cirkit/symbolic/functional.py](../cirkit/symbolic/functional.py)

In [None]:
import itertools
from collections.abc import Sequence

from cirkit.symbolic.circuit import Circuit, CircuitOperation, StructuralPropertyError
from cirkit.symbolic.layers import InputLayer, Layer, SumLayer
from cirkit.symbolic.registry import OPERATOR_REGISTRY, OperatorRegistry


# sc (or several circuits for an n-ary op) and registry form the common interface; order is specific to diff.
def my_differentiate(
    sc: Circuit, registry: OperatorRegistry | None = None, *, order: int = 1
) -> Circuit:
    # Sanity checks of args.
    if not sc.is_smooth or not sc.is_decomposable:
        raise StructuralPropertyError(
            "Only smooth and decomposable circuits can be efficiently differentiated."
        )
    if order <= 0:
        raise ValueError("The order of differentiation must be positive.")

    # Use the registry in the current context, if not specified otherwise.
    if registry is None:
        registry = OPERATOR_REGISTRY.get()

    # Keep a mapping from the layers in the input circuit to the blocks in the output circuit.
    # Depending on the algorithm, another form of mapping may be used.
    layers_to_blocks: dict[Layer, list[CircuitBlock]] = {}

    # The directed edges connecting the blocks in the output circuit, as a mapping from each block
    # to its inputs. This must be defined in this form to construct the output circuit.
    in_blocks: dict[CircuitBlock, Sequence[CircuitBlock]] = {}

    # Iterate all the symbolic layers in the input circuit in the topological order.
    for sl in sc.topological_ordering():
        # For an InputLayer, e.g. PolynomialLayer, the rule should exist in the registry.
        if isinstance(sl, InputLayer):
            # Retrieve the differentiation rule from the registry.
            func = registry.retrieve_rule(LayerOperator.MYDIFFERENTIATION, type(sl))
            # Get the differential using the rule.
            diff_blocks = [func(sl, var_idx=0, ch_idx=0, order=order)]

            # Save the blocks as corresponding to the current symbolic layer.
            layers_to_blocks[sl] = diff_blocks

            # Updating in_blocks can be omitted: a block that is absent is treated as having no input.
            # in_blocks[diff_blocks[0]] = []

        # For a SumLayer, the original connectivity and params are copied in the differential.
        elif isinstance(sl, SumLayer):
            # An idiom to make a copy of a layer and keep the params shared.
            diff_blocks = [
                CircuitBlock.from_layer(
                    type(sl)(**sl.config, **{name: p.ref() for name, p in sl.params.items()})
                )
            ]
            # Each item corresponds to the item in diff_blocks, meaning its inputs are the blocks
            # corresponding to the inputs of the current layer.
            diff_in_blocks = [
                [layers_to_blocks[sl_in][0] for sl_in in sc.layer_inputs(sl)],
            ]

            # Save the blocks as corresponding to the current symbolic layer.
            layers_to_blocks[sl] = diff_blocks

            # Update in_blocks with the blocks and the corresponding inputs.
            in_blocks.update(zip(diff_blocks, diff_in_blocks))

        # There can be other cases processed based on need.
        else:
            pass

    # End of `for sl in sc.topological_ordering():`

    # Construct the differential symbolic circuit and set the differentiation operation metadata.
    return Circuit.from_operation(
        sc.scope,  # The scope is the same, or may change based on the algorithm.
        sc.num_channels,  # Channels shall not change in most cases.
        itertools.chain.from_iterable(layers_to_blocks.values()),  # All the blocks constructed.
        in_blocks,  # The edges as recorded.
        itertools.chain.from_iterable(layers_to_blocks[sl] for sl in sc.outputs),  # Idiom.
        operation=CircuitOperation(  # Metadata of the operation.
            operator=CircuitOperator.MYDIFFERENTIATION,  # The Enum value for the op.
            operands=(sc,),  # The operands, anything passed into this operator function.
            metadata=dict(order=order),  # Any additional args of the operator.
        ),
    )
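
The bookkeeping in the function above (one mapping from layers to blocks, one mapping from each block to its inputs) can be sketched without the library. The toy version below uses hypothetical `ToyLayer` and `ToyBlock` classes and assumes the layers are already given in topological order. It only reproduces the wiring, not any differentiation logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyLayer:
    name: str
    kind: str  # "input" or "sum"


@dataclass(frozen=True)
class ToyBlock:
    name: str


def toy_operate(ordering: list[ToyLayer], inputs: dict[ToyLayer, list[ToyLayer]]):
    layers_to_blocks: dict[ToyLayer, list[ToyBlock]] = {}
    in_blocks: dict[ToyBlock, list[ToyBlock]] = {}
    for sl in ordering:  # assumed to be topologically sorted
        block = ToyBlock(f"d({sl.name})")
        layers_to_blocks[sl] = [block]
        if sl.kind == "sum":
            # Inputs were visited earlier, so their blocks already exist.
            in_blocks[block] = [layers_to_blocks[sl_in][0] for sl_in in inputs[sl]]
        # Input layers add no entry: an absent block means "no inputs".
    return layers_to_blocks, in_blocks


x1, x2 = ToyLayer("x1", "input"), ToyLayer("x2", "input")
s = ToyLayer("s", "sum")
blocks, edges = toy_operate([x1, x2, s], {s: [x1, x2]})
```

Visiting layers in topological order is what guarantees that `layers_to_blocks[sl_in]` is populated before any layer that consumes it.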

## Implementation with Backend

In the backend implementation, we will have to:
- Implement the actual computation for the layer and operator(s);
- Specify the rule that maps the symbolic layer/operator(s) to the implementation above.

Everything provided in the symbolic part should have a corresponding implementation in the backend, although it is the rules that actually determine whether and how the symbolic representation is translated.

### `torch` Implementation

The `torch` backend also provides `TorchUnaryParameterOp` for easier implementation of parameter operations, with `TorchParameterOp` available for more customization.

A minimal implementation needs only the `shape` of the output parameter and the `forward` method that transforms the input parameter(s) into the output.

In this case, we also need `__init__` because differentiation takes an additional argument `order`, and `config` should be overridden to include `order`.

Optionally, `fold_settings` can be provided to include any additional settings that affect folding (here the default is enough, as the input shape decides everything).

[cirkit/backend/torch/parameters/nodes.py](../cirkit/backend/torch/parameters/nodes.py)

In [None]:
from typing import Any

import torch
from torch import Tensor

from cirkit.backend.torch.parameters.nodes import TorchUnaryParameterOp


class TorchPolynomialMyDifferential(TorchUnaryParameterOp):
    def __init__(self, in_shape: tuple[int, ...], *, num_folds: int = 1, order: int = 1) -> None:
        if order <= 0:
            raise ValueError("The order of differentiation must be positive.")
        super().__init__(in_shape, num_folds=num_folds)
        self.order = order

    @property
    def shape(self) -> tuple[int, ...]:
        # if dp1>order, i.e., deg>=order, then diff, else const 0.
        return (
            self.in_shapes[0][0],  # dim Ko
            self.in_shapes[0][1] - self.order
            if self.in_shapes[0][1] > self.order
            else 1,  # dim dp1
        )

    @property
    def config(self) -> dict[str, Any]:
        return {**super().config, "order": self.order}

    def forward(self, coeff: Tensor) -> Tensor:
        if coeff.shape[-1] <= self.order:
            return torch.zeros_like(coeff[..., :1])  # shape (F, K, 1).

        for _ in range(self.order):
            degp1 = coeff.shape[-1]  # shape (F, K, dp1).
            arange = torch.arange(1, degp1).to(coeff)  # shape (deg,).
            coeff = coeff[..., 1:] * arange  # a_n x^n -> n a_n x^(n-1), with a_0 disappeared.

        return coeff  # shape (F, K, dp1-ord).

    # -------- unnecessary in this case, directly use inherited --------

    # @property
    #     def fold_settings(self) -> tuple[Any, ...]:
    #     return super().fold_settings
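
To sanity-check the arithmetic in `forward`, the same coefficient update can be written in plain Python for a single polynomial. This is a sketch only: `differentiate_coeffs` is a hypothetical helper that works on a list `[a_0, a_1, ..., a_d]` instead of a folded tensor.

```python
def differentiate_coeffs(coeff: list[float], order: int = 1) -> list[float]:
    """Differentiate a_0 + a_1 x + ... + a_d x^d `order` times."""
    if order <= 0:
        raise ValueError("The order of differentiation must be positive.")
    if len(coeff) <= order:
        # Degree below the order: the result is the constant 0.
        return [0.0]
    for _ in range(order):
        # a_n x^n -> n a_n x^(n-1); the a_0 term vanishes.
        coeff = [n * a_n for n, a_n in enumerate(coeff)][1:]
    return coeff


# 1 + 2x + 3x^2 + 4x^3
first = differentiate_coeffs([1.0, 2.0, 3.0, 4.0])  # 2 + 6x + 12x^2
second = differentiate_coeffs([1.0, 2.0, 3.0, 4.0], order=2)  # 6 + 24x
vanished = differentiate_coeffs([1.0, 2.0], order=2)  # constant 0
```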

### Rules

Now we need to register the mapping between the symbolic definitions and their torch implementations. It should be simple to define in most cases.

Note that each backend has its own registry instead of one large dict for everything.

[cirkit/backend/torch/rules/parameters.py](../cirkit/backend/torch/rules/parameters.py)

In [None]:
from cirkit.backend.torch.compiler import TorchCompiler
from cirkit.backend.torch.rules.parameters import DEFAULT_PARAMETER_COMPILATION_RULES


def compile_polynomial_my_differential(
    compiler: "TorchCompiler", p: PolynomialMyDifferential
) -> TorchPolynomialMyDifferential:
    return TorchPolynomialMyDifferential(*p.in_shapes, order=p.order)


DEFAULT_PARAMETER_COMPILATION_RULES.update(
    {PolynomialMyDifferential: compile_polynomial_my_differential}
)
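
The idea behind a compilation rule can be sketched with a type-keyed dictionary. The example below is a toy under stated assumptions: the `Toy*` classes and `toy_compile` are hypothetical stand-ins, and the real rule signature also receives the compiler, which is left out here.

```python
from dataclasses import dataclass


@dataclass
class ToySymbolicDiff:  # stand-in for a symbolic parameter operation
    in_shape: tuple[int, int]
    order: int = 1


@dataclass
class ToyBackendDiff:  # stand-in for its backend implementation
    in_shape: tuple[int, int]
    order: int = 1


def compile_toy_diff(p: ToySymbolicDiff) -> ToyBackendDiff:
    # Forward every setting that affects the computation, here the order.
    return ToyBackendDiff(p.in_shape, order=p.order)


# Symbolic class -> function that builds the backend counterpart.
TOY_COMPILATION_RULES = {ToySymbolicDiff: compile_toy_diff}


def toy_compile(p):
    # Dispatch on the exact symbolic type.
    return TOY_COMPILATION_RULES[type(p)](p)


compiled = toy_compile(ToySymbolicDiff((4, 5), order=2))
```

Forgetting to forward a setting such as `order` would silently compile to the default, which is why the rule copies it explicitly.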

## Pipeline-level Convenience Method

Once the definitions above are in place, the new operator is available as `cirkit.symbolic.functional.my_differentiate`. For convenience, we can also add the following method to the `PipelineContext` class.

Note that this must be done by modifying the class in the library instead of in the user code.

[cirkit/pipeline.py](../cirkit/pipeline.py)

In [None]:
from contextlib import AbstractContextManager

import cirkit.symbolic.functional as SF
from cirkit.backend.compiler import CompiledCircuit


class PipelineContext(AbstractContextManager):
    ...  # Any existing defs

    def my_differentiate(self, cc: CompiledCircuit, *, order: int = 1) -> CompiledCircuit:
        if not self._compiler.has_symbolic(cc):
            raise ValueError("The given compiled circuit is not known in this pipeline")
        if order <= 0:
            raise ValueError("The order of differentiation must be positive.")
        sc = self._compiler.get_symbolic_circuit(cc)
        diff_sc = SF.my_differentiate(sc, registry=self._op_registry, order=order)
        return self.compile(diff_sc)