# Implementing a Block

## What Should Be Implemented as a Block?

First, check that what you want to implement is actually a block.
The main benefit of a block is that the order of its operations can be changed.
If that flexibility is useful for your module, implement it as a block.

Another reason to implement a block is to let users add or remove steps,
for example by adding an activation function or removing a dropout layer.

Also, remember that blocks should be small and modular. If a block is growing
too big, consider breaking it down into smaller blocks.

Finally, blocks should be strict about the shapes of their input and output.
This is what lets the user change the order of operations without breaking
the model.
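
To make the idea of reorderable steps concrete, here is a minimal sketch in plain Python. `ToyBlock`, its `order` list, and the three helper functions are hypothetical names introduced for illustration; this is not how deeplay implements blocks. It only shows that when every step takes and returns data of the same form, steps can be reordered or removed without breaking the chain.

```python
from typing import Callable, Dict, List

Vector = List[float]


class ToyBlock:
    """Hypothetical stand-in for a block: named steps applied in a given order."""

    def __init__(self, order: List[str], **steps: Callable[[Vector], Vector]) -> None:
        self.steps: Dict[str, Callable[[Vector], Vector]] = dict(steps)
        self.order = list(order)

    def __call__(self, x: Vector) -> Vector:
        # Apply each named step in the configured order.
        for name in self.order:
            x = self.steps[name](x)
        return x


def scale(x: Vector) -> Vector:
    # Stand-in for a "layer": doubles every element.
    return [2.0 * v for v in x]


def shift(x: Vector) -> Vector:
    # Stand-in for a "normalization": subtracts 1 from every element.
    return [v - 1.0 for v in x]


def relu(x: Vector) -> Vector:
    # Stand-in for an "activation": clamps negative values to zero.
    return [max(0.0, v) for v in x]


block = ToyBlock(
    ["layer", "normalization", "activation"],
    layer=scale,
    normalization=shift,
    activation=relu,
)
x = [-1.0, 0.25, 2.0]
print(block(x))  # scale, then shift, then relu

# Reorder: every step maps a vector to a vector, so any order is valid.
block.order = ["layer", "activation", "normalization"]
print(block(x))  # scale, then relu, then shift

# Remove a step by dropping its name from the order.
block.order = ["layer", "activation"]
print(block(x))
```

If one of the steps changed the form of its output (for instance, returned a single number), only some orderings would remain valid, which is why strict input and output matter.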

## Implementing a Block

This section walks through the steps for implementing a block in deeplay, using `MyConv1dBlock` as a running example.

### 1. Create a New File

The first step is to create a new file in the `deeplay/blocks` directory. It
can be in a deeper subdirectory if it makes sense.

**The base class: `BaseBlock`.**
The file should contain a class that inherits from `BaseBlock`.

**The arguments.**
The first arguments should specify the input (for example, `in_channels`) and output (for example, `out_channels`) shapes of the block. This is
important to ensure that the block is used correctly.

The next arguments should configure the default layer class.
In the example here, you'll use `torch.nn.Conv1d`, so you should specify the kernel
size, stride, padding, etc.

Finally, the class should accept `**kwargs`, which are passed to the superclass.

This example implements the `MyConv1dBlock`.

In [1]:
from deeplay.blocks.base import BaseBlock
from deeplay.external.layer import Layer

import torch
import torch.nn as nn

class MyConv1dBlock(BaseBlock):
    def __init__(
        self, 
        in_channels, 
        out_channels, 
        kernel_size=3, 
        stride=1, 
        padding=0, 
        dilation=1, 
        groups=1, 
        bias=True,
        order=None,  # Declared explicitly so it can be type-annotated; forwarded to BaseBlock.
        **kwargs,
    ):
        
        # Save the input parameters.
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.groups = groups
        self.bias = bias

        # Create the layer.
        layer = Layer(
            nn.Conv1d, 
            in_channels=in_channels, 
            out_channels=out_channels, 
            kernel_size=kernel_size, 
            stride=stride, 
            padding=padding, 
            dilation=dilation, 
            groups=groups, 
            bias=bias,
        )
        
        # Send the layers and modules to the parent class.
        super().__init__(order=order, layer=layer, **kwargs)

You can now instantiate this block and print its architecture.

In [2]:
block = MyConv1dBlock(in_channels=10, out_channels=4)

print(block)

MyConv1dBlock(
  (layer): Layer[Conv1d](in_channels=10, out_channels=4, kernel_size=3, stride=1, padding=0, dilation=1, groups=1, bias=True)
)


### 2. Add Annotations

It is important to add annotations to the class and methods to ensure that the
user knows what to expect. This is also useful for the IDE to provide 
autocomplete.
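
As a small illustration of what tools can read from annotations, the sketch below applies the standard-library `typing.get_type_hints` and `inspect.signature` to a hypothetical `AnnotatedExample` class. The class is not part of deeplay and exists only for this example. Editors and type checkers rely on similar information.

```python
import inspect
from typing import List, Optional, get_type_hints


class AnnotatedExample:
    """Hypothetical class used only to show how annotations are exposed."""

    # Class-level annotations describe the attributes.
    in_channels: int
    out_channels: int

    def __init__(
        self,
        in_channels: int,
        out_channels: int,
        order: Optional[List[str]] = None,
    ) -> None:
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.order = order


# Attribute annotations declared in the class body.
attribute_hints = get_type_hints(AnnotatedExample)
print(attribute_hints)

# Parameter names and defaults taken from the constructor signature.
signature = inspect.signature(AnnotatedExample.__init__)
print(list(signature.parameters))
```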

In [3]:
from typing import List, Optional
from torch.nn.common_types import _size_1_t

from deeplay.module import DeeplayModule

class MyConv1dBlock(BaseBlock):

    # Annotate the attributes.
    in_channels: int
    out_channels: int
    kernel_size: _size_1_t
    stride: _size_1_t
    padding: _size_1_t
    dilation: _size_1_t
    groups: int
    bias: bool

    # Also annotate layer.
    layer: Layer 

    def __init__(
        self, 
        in_channels: int,
        out_channels: int,
        kernel_size: _size_1_t = 3,
        stride: _size_1_t = 1,
        padding: _size_1_t = 0,
        dilation: _size_1_t = 1, 
        groups: int = 1,
        bias: bool = True,
        order: Optional[List[str]] = None,
        **kwargs: DeeplayModule,
    ) -> None:
        
        # Save the input parameters.
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.groups = groups
        self.bias = bias

        # Create the layer.
        layer = Layer(
            nn.Conv1d, 
            in_channels=in_channels, 
            out_channels=out_channels, 
            kernel_size=kernel_size, 
            stride=stride, 
            padding=padding, 
            dilation=dilation, 
            groups=groups, 
            bias=bias,
        )
        
        # Send the layers and modules to the parent class.
        super().__init__(order=order, layer=layer, **kwargs)

### 3. Document the Block

The next step is to document the block. This should include a description of 
the block, the input and output shapes, and the arguments that can be passed to
the block.

In [4]:
class MyConv1dBlock(BaseBlock):
    """A block for 1D convolutional operations.

    This block performs a 1D convolutional operation on the input tensor.

    Parameters
    ----------
    in_channels : int
        The number of input channels.
    out_channels : int
        The number of output channels.
    kernel_size : int
        The size of the convolutional kernel.
    stride : int
        The stride of the convolutional operation.
    padding : int
        The padding of the convolutional operation.
    dilation : int
        The dilation of the convolutional operation.
    groups : int    
        The number of groups for the convolutional operation.
    bias : bool
        Whether to include a bias term in the convolutional operation.
    order : List[str]
        The order of the layers in the block. If None, the order is inferred 
        from the order of keyword arguments, with `layer` always being the 
        first layer.
    **kwargs
        Additional modules to include in the block. The keys should be the 
        names of the modules and the values should be the modules.
    
    Attributes
    ----------
    in_channels : int
        The number of input channels.
    out_channels : int
        The number of output channels.
    kernel_size : int
        The size of the convolutional kernel.
    stride : int
        The stride of the convolutional operation.
    padding : int
        The padding of the convolutional operation.
    dilation : int
        The dilation of the convolutional operation.
    groups : int    
        The number of groups for the convolutional operation.
    bias : bool
        Whether to include a bias term in the convolutional operation.
    order : List[str]
        The order of the layers in the block.
    layer : Layer
        The layer that performs the convolutional operation.
    
    Input
    -----
    x : torch.Tensor (batch_size, in_channels, Any)
        The input tensor to the block.
    
    Output
    ------
    y : torch.Tensor (batch_size, out_channels, Any)
        The output tensor from the block.

    Evaluation
    ----------
    See :func:`~SequentialBlock.forward` for details.

    Examples
    --------
    >>> block = MyConv1dBlock(in_channels=3, out_channels=6, kernel_size=3)
    >>> block.build()
    MyConv1dBlock(
      (layer):  Conv1d(3, 6, kernel_size=(3,), stride=(1,))
    )

    """

    # Annotate the attributes.
    in_channels: int
    out_channels: int
    kernel_size: _size_1_t
    stride: _size_1_t
    padding: _size_1_t
    dilation: _size_1_t
    groups: int
    bias: bool

    # Also annotate layer.
    layer: Layer 

    def __init__(
        self, 
        in_channels: int,
        out_channels: int,
        kernel_size: _size_1_t = 3,
        stride: _size_1_t = 1,
        padding: _size_1_t = 0,
        dilation: _size_1_t = 1, 
        groups: int = 1,
        bias: bool = True,
        order: Optional[List[str]] = None,
        **kwargs: DeeplayModule,
    ) -> None:
        
        # Save the input parameters.
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.groups = groups
        self.bias = bias

        # Create the layer.
        layer = Layer(
            nn.Conv1d, 
            in_channels=in_channels, 
            out_channels=out_channels, 
            kernel_size=kernel_size, 
            stride=stride, 
            padding=padding, 
            dilation=dilation, 
            groups=groups, 
            bias=bias,
        )
        
        # Send the layers and modules to the parent class.
        super().__init__(order=order, layer=layer, **kwargs)

### 4. Add Auxiliary Methods

There are several special methods that improve the usability of the block.
These don't need to be documented extensively since they are not meant to be used directly by the user.

These include:

- `.call_with_dummy_data()`

    This method creates dummy data and calls the forward method. It is used to initialize any lazy layers and to run all callbacks that are defined to run on the forward pass.

    This method is called immediately before the build phase, and only if the user doesn't provide a dummy input.

    A batch size of 2 is recommended so that batch normalization works correctly. Any spatial or temporal dimensions should be set to at least 12 to reduce the risk of stride or padding errors on small inputs.

- `.get_default_activation()` (optional)

    This method should return the default activation function for the block. This will be used if the user calls `.activated()` without specifying an activation function. Default is `nn.ReLU`.

- `.get_default_normalization()`

    This method should return the default normalization function for the block. This will be used if the user calls `.normalized()` without specifying a normalization function. 

- `.get_default_merge()` (optional)

    This method should return the default merge function for the block. This will be used if the user calls `.shortcut()` without specifying a merge function. Default is `ops.Add`.

- `.get_default_shortcut()` (optional)

    This method should return the default shortcut function for the block. This will be used if the user calls `.shortcut()` without specifying a shortcut function. Default is `nn.Identity`. 

    Override this when the shortcut connection needs a projection, for example when the input and output shapes differ so that the merge function cannot be applied directly.
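
The fallback behavior of the `get_default_*` methods can be sketched in plain Python. `ToyBaseBlock`, `ToyTanhBlock`, and the string placeholders below are hypothetical stand-ins, and deeplay's own `.activated()` may differ in signature and behavior. The sketch only shows the pattern: a user-facing method takes an optional argument and, when it is omitted, asks the block for its default.

```python
from typing import Dict, List, Optional


class ToyBaseBlock:
    """Hypothetical base class; not deeplay's BaseBlock."""

    def __init__(self) -> None:
        self.order: List[str] = ["layer"]
        self.modules: Dict[str, str] = {"layer": "conv"}

    def get_default_activation(self) -> str:
        # Base-class default, used unless a subclass overrides it.
        return "relu"

    def activated(self, activation: Optional[str] = None) -> "ToyBaseBlock":
        # Fall back to the block's default when nothing is passed.
        if activation is None:
            activation = self.get_default_activation()
        self.modules["activation"] = activation
        if "activation" not in self.order:
            self.order.append("activation")
        return self  # Returning self allows chained calls.


class ToyTanhBlock(ToyBaseBlock):
    def get_default_activation(self) -> str:
        # A subclass changes the default by overriding the hook.
        return "tanh"


default_block = ToyBaseBlock().activated()
custom_default = ToyTanhBlock().activated()
explicit = ToyTanhBlock().activated("gelu")

print(default_block.modules["activation"])   # relu
print(custom_default.modules["activation"])  # tanh
print(explicit.modules["activation"])        # gelu
```

Keeping the defaults in small overridable methods means a new block type only has to state what differs from the base class.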

In [5]:
from deeplay.module import DeeplayModule
from deeplay.ops.merge import MergeOp


class MyConv1dBlock(BaseBlock):
    """A block for 1D convolutional operations.

    This block performs a 1D convolutional operation on the input tensor.

    Parameters
    ----------
    in_channels : int
        The number of input channels.
    out_channels : int
        The number of output channels.
    kernel_size : int
        The size of the convolutional kernel.
    stride : int
        The stride of the convolutional operation.
    padding : int
        The padding of the convolutional operation.
    dilation : int
        The dilation of the convolutional operation.
    groups : int    
        The number of groups for the convolutional operation.
    bias : bool
        Whether to include a bias term in the convolutional operation.
    order : List[str]
        The order of the layers in the block. If None, the order is
        inferred from the order of keyword arguments, with `layer`
        always being the first layer.
    **kwargs
        Additional modules to include in the block. The keys should be
        the names of the modules and the values should be the modules.
    
    Attributes
    ----------
    in_channels : int
        The number of input channels.
    out_channels : int
        The number of output channels.
    kernel_size : int
        The size of the convolutional kernel.
    stride : int
        The stride of the convolutional operation.
    padding : int
        The padding of the convolutional operation.
    dilation : int
        The dilation of the convolutional operation.
    groups : int    
        The number of groups for the convolutional operation.
    bias : bool
        Whether to include a bias term in the convolutional operation.
    order : List[str]
        The order of the layers in the block.
    layer : Layer
        The layer that performs the convolutional operation.
    
    Input
    -----
    x : torch.Tensor (batch_size, in_channels, Any)
        The input tensor to the block.
    
    Output
    ------
    y : torch.Tensor (batch_size, out_channels, Any)
        The output tensor from the block.

    Evaluation
    ----------
    See :func:`~SequentialBlock.forward` for details.

    Examples
    --------
    >>> block = MyConv1dBlock(in_channels=3, out_channels=6, kernel_size=3)
    >>> block.build()
    MyConv1dBlock(
      (layer):  Conv1d(3, 6, kernel_size=(3,), stride=(1,))
    )

    """
    
    # Annotate the attributes.
    in_channels: int
    out_channels: int
    kernel_size: _size_1_t
    stride: _size_1_t
    padding: _size_1_t
    dilation: _size_1_t
    groups: int
    bias: bool

    # Also annotate layer.
    layer: Layer 

    def __init__(
        self, 
        in_channels: int,
        out_channels: int,
        kernel_size: _size_1_t = 3,
        stride: _size_1_t = 1,
        padding: _size_1_t = 0,
        dilation: _size_1_t = 1, 
        groups: int = 1,
        bias: bool = True,
        order: Optional[List[str]] = None,
        **kwargs: DeeplayModule,
    ) -> None:
        
        # Save the input parameters.
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.groups = groups
        self.bias = bias

        # Create the layer.
        layer = Layer(
            nn.Conv1d, 
            in_channels=in_channels, 
            out_channels=out_channels, 
            kernel_size=kernel_size, 
            stride=stride, 
            padding=padding, 
            dilation=dilation, 
            groups=groups, 
            bias=bias,
        )
        
        # Send the layers and modules to the parent class.
        super().__init__(order=order, layer=layer, **kwargs)

    def call_with_dummy_data(self) -> None:
        x = torch.randn(2, self.in_channels, 16)
        self(x)

    def get_default_activation(self) -> DeeplayModule:
        return Layer(nn.ReLU)

    def get_default_normalization(self) -> DeeplayModule:
        # This assumes that the normalization is applied after Layer.
        # If it is before, num_features should be in_channels.
        # This will be automatically handled during the build process.
        return Layer(nn.BatchNorm1d, num_features=self.out_channels)
    
    def get_default_merge(self) -> MergeOp:
        from deeplay.ops.merge import Add
        return Add()
    
    def get_default_shortcut(self) -> DeeplayModule:
        return MyConv1dBlock(
            self.in_channels, self.out_channels, kernel_size=1, 
            stride=self.stride, padding=0,
        )
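
The advice on dummy input sizes can be checked with the output-length formula from the PyTorch `Conv1d` documentation. The helper `conv1d_output_length` below is a hypothetical name introduced for illustration; it is plain Python, assumes integer padding, and does not call deeplay or PyTorch.

```python
def conv1d_output_length(
    length: int,
    kernel_size: int = 3,
    stride: int = 1,
    padding: int = 0,
    dilation: int = 1,
) -> int:
    """Output length of a 1D convolution (formula from the PyTorch Conv1d docs)."""
    numerator = length + 2 * padding - dilation * (kernel_size - 1) - 1
    return numerator // stride + 1


# Dummy length 16 with the block's defaults (kernel_size=3, padding=0).
main_length = conv1d_output_length(16)

# The default shortcut above uses kernel_size=1 and padding=0.
shortcut_length = conv1d_output_length(16, kernel_size=1)

# With padding=1, a kernel of size 3 preserves the length.
padded_length = conv1d_output_length(16, padding=1)

# A short input with a dilated kernel leaves no valid output positions.
too_short = conv1d_output_length(8, kernel_size=3, dilation=4)

print(main_length, shortcut_length, padded_length, too_short)
```

A result of zero or less means the effective kernel is larger than the input, which is the kind of error a sufficiently long dummy input helps avoid. Under these assumptions, the main branch and the default shortcut produce the same length only when the main branch preserves it (for example, `padding=1` with `kernel_size=3`), so an element-wise merge would likely need the padding chosen accordingly.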