# Overview of Deeplay Internal Structure

This notebook is a deep dive into the internals of the Deeplay library. It is intended for developers who want to understand how the library works and how to extend it.

## The `DeeplayModule` Class

At the core of Deeplay is the `DeeplayModule` class. This class is a subclass of `torch.nn.Module` and is responsible for managing the configurations applied by the user and for building the model based on these configurations.

### The Lifecycle of a `DeeplayModule` Object

Let's start by understanding the lifecycle of a `DeeplayModule` object. This is managed by the Deeplay metaclass `ExtendedConstructorMeta`. This metaclass is responsible for creating the `DeeplayModule` class and managing its configuration. Let's look at the `.__call__()` method of the `ExtendedConstructorMeta` metaclass.

```python
class ExtendedConstructorMeta(type):

    ...

    def __call__(cls: Type[T], *args, **kwargs) -> T:
        """Construct an instance of a class whose metaclass is Meta."""

        # If the object is being constructed from a checkpoint, we instead
        # load the class from the pickled state and build it using the checkpoint.
        if "__from_ckpt_application" in kwargs:
            assert "__build_args" in kwargs, "Missing __build_args in kwargs"
            assert "__build_kwargs" in kwargs, "Missing __build_kwargs in kwargs"

            _args = kwargs.pop("__build_args")
            _kwargs = kwargs.pop("__build_kwargs")

            app = dill.loads(kwargs["__from_ckpt_application"])
            app.build(*_args, **_kwargs)
            return app

        # Otherwise, we construct the object as usual.
        obj = cls.__new__(cls, *args, **kwargs)

        # We store the actual arguments used to construct the object.
        object.__setattr__(
            obj,
            "_actual_init_args",
            {
                "args": args,
                "kwargs": kwargs,
            },
        )
        object.__setattr__(obj, "_config_tape", [])
        object.__setattr__(obj, "_is_calling_stateful_method", False)

        # First, we call the __pre_init__ method of the class.
        cls.__pre_init__(obj, *args, **kwargs)

        # Next, we construct the class. The not_top_level context manager is used to
        # keep track of where in the object hierarchy we currently are.
        with not_top_level(cls, obj):
            obj.__construct__()
            obj.__post_init__()

        return obj
```

### Breaking Down the Lifecycle

The method is pretty long, so let's break it down into smaller parts.

#### 1. If the Object Is Being Constructed from a Checkpoint, Load and Return It.

The method first checks if the object is being constructed from a checkpoint. If it is, it loads the object from the checkpoint and returns it.

```python
if "__from_ckpt_application" in kwargs:
    assert "__build_args" in kwargs, "Missing __build_args in kwargs"
    assert "__build_kwargs" in kwargs, "Missing __build_kwargs in kwargs"

    _args = kwargs.pop("__build_args")
    _kwargs = kwargs.pop("__build_kwargs")

    app = dill.loads(kwargs["__from_ckpt_application"])
    app.build(*_args, **_kwargs)
    return app
```

#### 2. Construct the Object with the `.__new__()` Method

Next, it constructs the object as usual. It creates the object using the `.__new__()` method of the class and sets some internal attributes.

```python
obj = cls.__new__(cls, *args, **kwargs)
object.__setattr__(
    obj,
    "_actual_init_args",
    {
        "args": args,
        "kwargs": kwargs,
    },
)
object.__setattr__(obj, "_config_tape", [])
object.__setattr__(obj, "_is_calling_stateful_method", False)
```

These attributes are:
- `_actual_init_args`: The actual arguments used to construct the object. This is used to create new copies of the object.
- `_config_tape`: A list of configurations applied to the object by the user (more on this later). This is also used to create new copies of the object.
- `_is_calling_stateful_method`: A flag that is used to check if the object is currently calling a stateful method. This is used to check if something should be added to the `_config_tape` or not.

#### 3. Call `.__pre_init__()` Method

Next, it calls the `.__pre_init__()` method of the class. This method is used to perform any pre-initialization steps. For most cases, subclasses do not need to override this method.

```python
cls.__pre_init__(obj, *args, **kwargs)
```

#### 4. Construct the Object

Next, it constructs the object. This is done by calling the `.__construct__()` method of the object. This method actually calls the `.__init__()` method of the object and sets up the model. More on the `.__construct__()` method later; for now, suffice it to say that this is where deeper initialization of the object happens, recursively constructing the children of the object.

After constructing the object, it calls the `.__post_init__()` method of the object. This method is used to perform any post-initialization steps. This does nothing by default.

**NOTE:** Both the `.__construct__()` and `.__post_init__()` methods are called within the `not_top_level` context manager, whereas `.__pre_init__()` is called before it is entered. This context manager keeps track of where in the object hierarchy we currently are. We'll cover this in more detail later, but its primary function is to help decide the priority of configurations applied to the object. Configurations applied at the top level (that is, called directly by the user) are given higher priority than configurations applied while constructing the object, and the deeper we go, the lower the priority of the configurations.

```python
with not_top_level(cls, obj):
    obj.__construct__()
    obj.__post_init__()
```

#### 5. Return the Object

Finally, it returns the object.

**NOTE:** The main reason this is implemented as a metaclass instead of using the `.__new__()` and `.__init__()` methods is to guarantee that the exact arguments used to construct the object are stored, not just the arguments passed up through `super().__init__()` calls. This is important for creating new copies of the object.

Moreover, the arguments passed to the `.__init__()` method may not be the same as the arguments passed to the `.__new__()` method. This is because the configurations applied by the user may change the arguments passed to the `.__init__()` method between the `.__pre_init__()` and `.__construct__()` calls.
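
To see why the metaclass matters, consider the following minimal sketch. `RecordingMeta`, `Base`, and `Doubled` are hypothetical names used only for illustration. The subclass transforms its argument before calling `super().__init__()`, so the base class never sees what the caller actually wrote, but the metaclass does.

```python
class RecordingMeta(type):
    """Hypothetical metaclass that records the exact constructor arguments."""

    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls)
        # Store the arguments exactly as the caller wrote them.
        object.__setattr__(
            obj, "_actual_init_args", {"args": args, "kwargs": kwargs}
        )
        obj.__init__(*args, **kwargs)
        return obj


class Base(metaclass=RecordingMeta):
    def __init__(self, features):
        self.features = features


class Doubled(Base):
    def __init__(self, features, scale=2):
        # The parent only ever sees the transformed value.
        super().__init__(features * scale)


layer = Doubled(8, scale=3)
print(layer.features)           # 24, the value seen by Base.__init__
print(layer._actual_init_args)  # {'args': (8,), 'kwargs': {'scale': 3}}
```

Recreating the object from `layer._actual_init_args` reproduces the original call, which would not be possible from `layer.features` alone.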

## The `.__construct__()` Method

The `.__construct__()` method of the `DeeplayModule` class is where the actual initialization of the object happens. This is where the `.__init__()` method of the object is called and the model is set up. The core idea is that the `.__construct__()` method should restore the state of the object to how it was immediately after the `.__pre_init__()` method was called, then find the correct arguments to pass to the `.__init__()` method based on the actual arguments passed to the `.__new__()` method and the configurations applied by the user.

```python
def __construct__(self):
    with not_top_level(ExtendedConstructorMeta, self):  # (1)
        # Reset construction.
        self._modules.clear()  # (2)
        self._user_config.remove_derived_configurations(self.tags)  # (3)

        self.is_constructing = True  # (4)

        args, kwargs = self.get_init_args()  # (5)
        getattr(self, self._init_method)(*(args + self._args), **kwargs)  # (6)

        self._run_hooks("after_init")  # (7)
        self.is_constructing = False  # (8)
        self.__post_init__()  # (9)
```

**(1)** This is the same `not_top_level` context manager we saw earlier. This is used to keep track of where in the object hierarchy we currently are.

**(2)** This removes any children of the object. These will only be added during the `.__init__()` method, so they should always be removed.

**(3)** Here we encounter two new terms: _derived configurations_ and _tags_.

#### Tags

Tags are tuples of strings used to identify a module in the hierarchy. These generally correspond to the names of the modules in the hierarchy. For example, `("block", "layer")` would correspond to a module named `block.layer`. A module can have multiple tags if it exists in multiple places. Tags are used both to identify the module in the hierarchy and to apply configurations to it. It's important to refer to modules by their tags rather than by the objects themselves, since a module may be cleared and re-initialized multiple times during the lifecycle of the object.

Tags are always relative to the root module (which we have yet to encounter). The root module is the base of the hierarchy and is the only module that is not a child of any other module. A module may exist in multiple places in the hierarchy, but must always have the same root module. Every `DeeplayModule` object keeps track of the current root.
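
As an illustration of how tags relate to the hierarchy, the sketch below walks a toy tree and collects every path from the root to a given node. `Node` and `find_tags` are hypothetical helpers, not part of Deeplay.

```python
class Node:
    """Hypothetical stand-in for a module; it only tracks named children."""

    def __init__(self, **children):
        self.children = dict(children)


def find_tags(root, target, prefix=()):
    """Return every path (a tuple of names) leading from `root` to `target`."""
    tags = [prefix] if root is target else []
    for name, child in root.children.items():
        tags.extend(find_tags(child, target, prefix + (name,)))
    return tags


layer = Node()
shared = Node()
block = Node(layer=layer, norm=shared)
root = Node(block=block, head=shared)

print(find_tags(root, layer))   # [('block', 'layer')]
print(find_tags(root, shared))  # [('block', 'norm'), ('head',)]
```

The `shared` node appears in two places, so it has two tags, both relative to the same root.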

#### Derived Configurations

Derived configurations are configurations not explicitly applied by the user. 
For example, if the `.__init__()` method of a module calls `self.child.configure("foo", 1)`, then the configuration `"foo"` is derived. This is because the user did not explicitly apply the configuration, but it was applied by the module itself. Since the configuration is applied during the `.__init__()` method, it should be removed before the `.__init__()` method is called again.

Deeplay uses the `not_top_level` context manager to decide if a configuration is derived or not. The `not_top_level` context manager stores the tags of the currently constructing module in the `ExtendedConstructorMeta` class. Every time a configuration is added, it also stores these tags as the `source` of the configuration. 

When deciding if a configuration is derived or not, Deeplay checks if the `source` of the configuration is a parent of the target of the configuration. If it is, then the configuration is NOT derived. If the source is a child of the target, or the target is the same as the source, then the configuration is derived and should be removed before the `.__init__()` method is called.
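
The rule above can be written as a prefix check on tag tuples. This is a sketch under assumptions: `is_ancestor` and `is_derived` are hypothetical helpers, a `None` source is taken to mean "applied by the user", and a source in an unrelated branch is treated as not derived, a case the text does not specify.

```python
def is_ancestor(a, b):
    """Return True if tag tuple `a` is a strict prefix of tag tuple `b`."""
    return len(a) < len(b) and b[: len(a)] == a


def is_derived(source, target):
    """Derived means the source is the target itself or one of its descendants."""
    if source is None:  # assumption: None marks a user-applied configuration
        return False
    return source == target or is_ancestor(target, source)


target = ("block", "layer")
print(is_derived(None, target))                      # False
print(is_derived(("block",), target))                # False, source is a parent
print(is_derived(("block", "layer"), target))        # True, same module
print(is_derived(("block", "layer", "act"), target))  # True, source is a child
```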


**(4)** Next, we set the `is_constructing` flag to `True`. This flag indicates that the object is currently being constructed, and it is used to prevent certain configurations from being applied during construction.

**(5)** This is where the actual arguments to pass to the `.__init__()` method are determined. This is done by calling the `.get_init_args()` method. This method is responsible for finding the correct arguments to pass to the `.__init__()` method based on the actual arguments passed to the `.__new__()` method and the configurations applied by the user. Each class can override this method to customize how the arguments are determined.

**(6)** Finally, we call the `.__init__()` method of the object with the correct arguments. The `_init_method` attribute is used to determine the name of the `.__init__()` method to call. Most of the time, this is just `"__init__"`, but it can be overridden by subclasses to call a different method. The reason for this is to make Deeplay play nicer with editors. It allows the class to define a dummy `.__init__()` method that gives the types and names of the arguments, while the actual initialization logic is in a different method. This allows the editor to provide better autocompletion and type checking.
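
A minimal sketch of this pattern follows. The names `Base`, `Block`, `construct`, and `_actual_init` are hypothetical; only the `_init_method` attribute and the `getattr` lookup come from the code above.

```python
class Base:
    # Name of the method that performs the real initialization.
    _init_method = "__init__"

    def construct(self, *args, **kwargs):
        # Look up the initializer by name, as in step (6).
        getattr(self, self._init_method)(*args, **kwargs)


class Block(Base):
    _init_method = "_actual_init"

    def __init__(self, in_features: int, out_features: int):
        """Signature-only stub so that editors can show argument names and types."""

    def _actual_init(self, *args, **kwargs):
        # Hypothetical real initializer with looser argument handling.
        self.shape = tuple(args) + tuple(kwargs.values())


block = Block(3, 4)  # without the metaclass, this runs only the stub
block.construct(3, out_features=4)
print(block.shape)  # (3, 4)
```

The editor reads the typed `__init__` signature, while construction is routed to `_actual_init`.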

**(7)** After the `.__init__()` method is called, we run the `after_init` hooks. Hooks are used to run code at specific points in the lifecycle of the object. The `after_init` hook is run after the `.__init__()` method is called. We'll cover hooks in more detail later.

**(8)** We set the `is_constructing` flag to `False` to indicate that the object is no longer being constructed.

**(9)** Finally, we call the `__post_init__` method of the object. This method is used to perform any post-initialization steps. This does nothing by default.
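
The reset-then-initialize idea behind steps (2), (5), and (6) can be illustrated with a toy class. This is only a sketch: `Rebuildable`, `make`, and the dictionary-based `_user_config` are hypothetical simplifications, and the merge performed in `get_init_args` is an assumption about how configurations override constructor arguments.

```python
class Rebuildable:
    """Hypothetical class that mirrors only the reset-then-init idea."""

    def __init__(self, width=1):
        self.children["linear"] = f"Linear({width})"

    def get_init_args(self):
        # Start from the constructor arguments, then let configurations override.
        kwargs = dict(self._actual_kwargs)
        kwargs.update(self._user_config)
        return (), kwargs

    def construct(self):
        self.children = {}  # drop the children created by the previous run
        args, kwargs = self.get_init_args()
        self.__init__(*args, **kwargs)


def make(cls, **kwargs):
    """Stand-in for the metaclass: record the arguments, then construct."""
    obj = cls.__new__(cls)
    obj._actual_kwargs = kwargs
    obj._user_config = {}
    obj.construct()
    return obj


module = make(Rebuildable, width=4)
print(module.children)  # {'linear': 'Linear(4)'}

module._user_config["width"] = 8  # simulate a user configuration
module.construct()
print(module.children)  # {'linear': 'Linear(8)'}
```

Because the children are cleared on every run, constructing twice never leaves stale submodules behind.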

## The `Config` Object

For each hierarchy of modules, there is a corresponding `Config` object, which lives on the root module. 

It is a dictionary-like object that stores the configurations applied to the modules in the hierarchy. The keys are tags and the name of the configurable (for example, `("block", "layer", "foo")`). The values are lists of `ConfigItem` or `DetachedConfigItem` objects. 

`ConfigItem` objects store the `source` of the configuration and the `value` of the configuration. The `source` is the tags of the module that was constructing when the configuration was applied. The `value` is the value of the configuration.

`DetachedConfigItem` objects are in practice very similar to `ConfigItem`s, and should be ephemeral. They are used to store configurations that are applied by an object that is not part of the same hierarchy. As such, the `tags` of the `source` do not make sense. Instead, the `source` is temporarily set to the object itself. This is okay, because all `DetachedConfigItem`s become `ConfigItem`s after the `.__construct__()` method is called.

**NOTE:** No `DetachedConfigItem` should exist after the `.__construct__()` method is called.

The following is an example where a `DetachedConfigItem` is created:

```python
class Module(DeeplayModule):
    def __init__(self):
        child = LinearBlock(10, 10)

        # Here, child is not attached to the hierarchy yet, so we don't have tags for it.
        child.configure("activation", nn.ReLU())

        # Here, the child is attached. This changes the root_module of `child` and we
        # can now get the tags of the child. The DetachedConfigItem is converted to a ConfigItem.
        self.child = child
```

Taking the example of `("block", "layer", "foo")`, the `Config` object would look something like this:

```python
{
    ("block", "layer", "foo"): [
        ConfigItem(source=None, value=1),
        ConfigItem(source=("block", "layer"), value=2),
    ]
}
```

A `None` source means that the configuration was applied by the user. When deciding which item to use as the actual value, the item with the highest priority is used. The priority is determined by the source of the item. The source closest to the root module has the highest priority. If two items have the same source, the item applied later has higher priority. A `None` source has the highest priority.
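
The priority rule can be sketched as a sort key. `Item` and `resolve` are hypothetical stand-ins, and two choices here are assumptions: closeness to the root is measured by the length of the source tuple, and ties are broken by application order for any two sources of equal depth, not only identical ones.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Item:
    """Hypothetical stand-in for ConfigItem."""

    source: Optional[Tuple[str, ...]]
    value: Any


def resolve(items):
    """Pick the winning item: shallowest source first, then latest applied."""

    def rank(indexed):
        index, item = indexed
        depth = -1 if item.source is None else len(item.source)
        # Smaller depth wins; among equal depths, the larger index wins.
        return (depth, -index)

    return min(enumerate(items), key=rank)[1]


items = [
    Item(source=("block", "layer"), value=2),
    Item(source=None, value=1),
    Item(source=("block",), value=3),
    Item(source=("block",), value=4),
]
print(resolve(items).value)      # 1, the user-applied value
print(resolve(items[2:]).value)  # 4, the later of two equal sources
```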

### Hooks

Since modules may be reconstructed at any point, it is important that any state-altering methods are re-run after the module is reconstructed. This is where hooks come in. Hooks are used to run code at specific points in the lifecycle of the object.

To register a method as a hook, you can use any of the following decorators:

```python
# Does not create a hook, but adds the method to the config tape, which is replayed
# when model.new() is called.
@stateful 

# Runs the method after the __init__ method is called (and adds to config tape).
@after_init

# Runs the method before the build method is called (and adds to config tape).
@before_build

# Runs the method after the build method is called (and adds to config tape).
@after_build
```
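
The following toy example shows why re-running hooks matters. It is not Deeplay's implementation: the `after_init` decorator below is a hypothetical re-creation that runs the call immediately and remembers it, which may differ from what the real decorator does, and `construct` stands in for `.__construct__()`.

```python
import functools


def after_init(method):
    """Run the call now and remember it so that it can be re-run later."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self._hooks.append((method, args, kwargs))
        return method(self, *args, **kwargs)

    return wrapper


class Module:
    """Hypothetical module used only for this illustration."""

    def __init__(self):
        self._hooks = []
        self.construct()

    def construct(self):
        # Reconstruction wipes the state and then replays the recorded hooks.
        self.labels = []
        for method, args, kwargs in self._hooks:
            method(self, *args, **kwargs)

    @after_init
    def add_label(self, label):
        self.labels.append(label)


module = Module()
module.add_label("frozen")
module.construct()  # simulate a reconstruction
print(module.labels)  # ['frozen']
```

Without the replay in `construct`, the label would be lost as soon as the module was reconstructed.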

## The Config Tape

The config tape is a list of method calls that are replayed when the `.new()` method is called. This method should create a new, identical but detached object. To do so, we first create the object with the exact same input arguments (as stored by the metaclass) and then run the same stateful methods, in the same order, with the same arguments.

**NOTE:** One may imagine that one could simply pass the same configuration object to the new object, but this is far from simple. It is not guaranteed that the configuration object is serializable, and even if it is, it may contain cyclic references and other issues that are hard to resolve.
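
The record-and-replay idea can be sketched as follows. `Recorder` is a hypothetical class and this `stateful` decorator is a simplified re-creation. In particular, it omits the `_is_calling_stateful_method` guard, so nested stateful calls would be recorded twice.

```python
import functools


def stateful(method):
    """Record each call on the instance's tape before running it."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self._config_tape.append((method.__name__, args, kwargs))
        return method(self, *args, **kwargs)

    return wrapper


class Recorder:
    """Hypothetical module-like class; not the Deeplay implementation."""

    def __init__(self, size):
        self._init_args = (size,)
        self._config_tape = []
        self.size = size
        self.options = {}

    @stateful
    def set_option(self, name, value):
        self.options[name] = value

    def new(self):
        # Rebuild from the original arguments, then replay the tape in order.
        clone = type(self)(*self._init_args)
        for name, args, kwargs in list(self._config_tape):
            getattr(clone, name)(*args, **kwargs)
        return clone


original = Recorder(4)
original.set_option("activation", "relu")
original.set_option("dropout", 0.1)
copy = original.new()
print(copy.options)      # {'activation': 'relu', 'dropout': 0.1}
print(copy is original)  # False
```

Replaying goes through the decorated methods, so the copy ends up with its own identical tape and can itself be copied.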

## Checkpointing

Since Deeplay modules require an additional `build` step before the weights are created, the default checkpointing system of `lightning` does not work.

We have solved this by storing the state of the `Application` object immediately before building as a hyperparameter in the checkpoint. This state is then restored when the model is loaded from the checkpoint, and the `build` method is called with the same arguments as in the original run before the weights are loaded.
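
The order of operations can be sketched without `lightning`. Everything here is hypothetical: `TinyApp` is a toy application, the checkpoint is a plain dictionary, and `copy.deepcopy` stands in for the `dill` serialization used by Deeplay.

```python
import copy


class TinyApp:
    """Hypothetical application whose weights exist only after build()."""

    def __init__(self, width):
        self.width = width
        self.weights = None

    def build(self, fill=0.0):
        self.weights = [fill] * self.width
        return self


app = TinyApp(3)

# Snapshot the unbuilt application together with the build arguments.
checkpoint = {
    "application": copy.deepcopy(app),  # stand-in for dill.dumps(app)
    "build_args": (),
    "build_kwargs": {"fill": 0.0},
}

app.build(*checkpoint["build_args"], **checkpoint["build_kwargs"])
app.weights[0] = 1.5  # stands in for training
checkpoint["state"] = list(app.weights)

# Restore: take the unbuilt snapshot, rebuild it, and only then load the weights.
restored = copy.deepcopy(checkpoint["application"])
restored.build(*checkpoint["build_args"], **checkpoint["build_kwargs"])
restored.weights[:] = checkpoint["state"]
print(restored.weights)  # [1.5, 0.0, 0.0]
```

Loading the weights before calling `build` would fail here, since the unbuilt snapshot has no weights to load into.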