# Introduction for Developers

This notebook presents a minimal example of how to implement a neural network and an application with `deeplay`.
Specifically, it implements the classes for a multilayer perceptron and a classifier.
Then, it combines them to demonstrate how they can be used for a simple classification task.
Finally, it upgrades these classes by adding functionality that improves the user experience when working in an IDE.

## Minimal Multilayer Perceptron

Here, we implement the minimal class `SimpleMLP`. It directly extends `dl.DeeplayModule`, which is the base class for all modules in `deeplay`.

It represents a multilayer perceptron with a given number of inputs (`in_features`, an integer), a series of hidden layers (`hidden_features`, a list with the number of neurons in each hidden layer), and a given number of outputs (`out_features`, an integer).

The constructor initializes the MLP by creating a sequence of linear and ReLU activation layers.

The `forward` method defines the data flow through the network, sequentially passing the input through each linear-activation block and returning the final output.
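
To make the constructor's pairing of layer sizes concrete, here is a minimal plain-Python sketch (no `deeplay` required) of how `zip` matches each layer's input size with its output size. The helper name `layer_shapes` is an illustrative assumption and is not part of `deeplay`.

```python
def layer_shapes(in_features, hidden_features, out_features):
    """Return the (inputs, outputs) pair of every linear layer in the MLP."""
    inputs = [in_features, *hidden_features]    # sizes feeding each layer
    outputs = [*hidden_features, out_features]  # sizes produced by each layer
    return list(zip(inputs, outputs))


shapes = layer_shapes(2, [32, 32], 2)
print(shapes)  # [(2, 32), (32, 32), (32, 2)]
```

With two hidden layers this yields three linear layers, one more than the length of `hidden_features`.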

In [None]:
import deeplay as dl
import torch.nn as nn

class SimpleMLP(dl.DeeplayModule):

    def __init__(self, in_features, hidden_features, out_features):
        super().__init__()

        self.in_features = in_features
        self.hidden_features = hidden_features
        self.out_features = out_features
        
        self.blocks = dl.LayerList()
        for inputs_layer, outputs_layer in zip([in_features, *hidden_features], 
                                               [*hidden_features, out_features]):
            self.blocks.append(
                dl.LayerActivationBlock(
                    dl.Layer(nn.Linear, inputs_layer, outputs_layer),
                    dl.Layer(nn.ReLU)
                )
            )

    def forward(self, x):
        for block in self.blocks:
            x = block(x)

        return x

We can now create an instance of `SimpleMLP` in various ways, for example:

```python
mlp = SimpleMLP(2, [32, 32], 2)
```

or more explicitly:

```python
mlp = SimpleMLP(
    in_features=2, 
    hidden_features=[32, 32],
    out_features=2
)
```

In [8]:
mlp = SimpleMLP(2, [32, 32], 2)

print(mlp)

SimpleMLP(
  (blocks): LayerList(
    (0): LayerActivationBlock(
      (layer): Layer[Linear](in_features=2, out_features=32)
      (activation): Layer[ReLU]()
    )
    (1): LayerActivationBlock(
      (layer): Layer[Linear](in_features=32, out_features=32)
      (activation): Layer[ReLU]()
    )
    (2): LayerActivationBlock(
      (layer): Layer[Linear](in_features=32, out_features=2)
      (activation): Layer[ReLU]()
    )
  )
)


## Minimal Classifier

Next, we implement the application `SimpleClassifier`. It extends the `deeplay` class `dl.Application`.

In [9]:
class SimpleClassifier(dl.Application):

    def __init__(self, model, **kwargs):
        self.model = model
        super().__init__(**kwargs)
        
        
    def forward(self, x):
        return self.model(x)

We can now create an instance of this application, using the `mlp` defined above as `model` and setting `loss` to `nn.CrossEntropyLoss()` and `optimizer` to `dl.Adam(lr=1e-3)`. We also keep track of a metric by setting `metrics` to `[tm.Accuracy("multiclass", num_classes=2)]`.

Since `nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and applies the softmax internally, we need to replace the ReLU activation of the output block with `nn.Identity`.
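
To see why a final ReLU gets in the way, the following plain-Python sketch computes the softmax cross-entropy of a single sample from raw scores, with and without a ReLU applied first. The helpers `cross_entropy` and `relu` are illustrative stand-ins written for this example, not the PyTorch implementations.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy of one sample, computed from raw scores."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def relu(values):
    """Clamp negative scores to zero, as a final ReLU would."""
    return [max(0.0, v) for v in values]


logits = [-1.0, -4.0]  # both scores negative, class 0 clearly preferred
loss_raw = cross_entropy(logits, target=0)
loss_relu = cross_entropy(relu(logits), target=0)

print(f"raw logits:  {loss_raw:.3f}")   # about 0.049
print(f"after ReLU:  {loss_relu:.3f}")  # about 0.693, i.e. ln(2)
```

When both scores are negative, the ReLU maps them to the same value, so the preference for class 0 is lost and the loss is stuck at ln(2). An identity output activation keeps the scores intact.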

In [10]:
import torchmetrics as tm

classifier = SimpleClassifier(
    model=mlp, 
    loss=nn.CrossEntropyLoss(), 
    optimizer=dl.Adam(lr=1e-3), 
    metrics=[tm.Accuracy("multiclass", num_classes=2)]
)
classifier.model.blocks[-1].activation.configure(nn.Identity)

classifier.build()

print(classifier)

SimpleClassifier(
  (model): SimpleMLP(
    (blocks): LayerList(
      (0): LayerActivationBlock(
        (layer): Linear(in_features=2, out_features=32, bias=True)
        (activation): ReLU()
      )
      (1): LayerActivationBlock(
        (layer): Linear(in_features=32, out_features=32, bias=True)
        (activation): ReLU()
      )
      (2): LayerActivationBlock(
        (layer): Linear(in_features=32, out_features=2, bias=True)
        (activation): Identity()
      )
    )
  )
  (loss): CrossEntropyLoss()
  (optimizer): Adam[Adam](lr=0.001, params=<generator object Module.parameters at 0x7ff099b838b0>)
  (train_metrics): MetricCollection(
    (MulticlassAccuracy): MulticlassAccuracy(),
    prefix=train
  )
  (val_metrics): MetricCollection(
    (MulticlassAccuracy): MulticlassAccuracy(),
    prefix=val
  )
  (test_metrics): MetricCollection(
    (MulticlassAccuracy): MulticlassAccuracy(),
    prefix=test
  )
)


**Notes**

Instead of `classifier.build()`, which builds the module in place, it is also possible to use `new_classifier = classifier.create()`, which clones the classifier and builds the clone.

Instead of `classifier.model.blocks[-1].activation.configure(nn.Identity)`, it would also be possible to use `classifier.model.output.activation.configure(nn.Identity)`, which is easier to read, once `SimpleMLP` provides the `output` shorthand introduced in the improvements below.

## Example

We'll now use `classifier` for the simple task of determining whether the sum of two numbers is larger or smaller than 0.
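
As a plain-Python sketch of this task, the snippet below draws pairs of normally distributed numbers, labels each pair by the sign of its sum, and works out how many batches an 80/20 split with a batch size of 16 produces. It uses the standard `random` module instead of `torch`, and the variable names are chosen for illustration only.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

num_samples = 100
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(num_samples)]
labels = [int(a + b > 0) for a, b in data]  # 1 if the sum is positive, else 0

n_train = num_samples * 80 // 100  # 80 training samples
n_val = num_samples - n_train      # 20 validation samples

batch_size = 16
# The last batch may be smaller, so the count is rounded up.
train_batches = math.ceil(n_train / batch_size)
val_batches = math.ceil(n_val / batch_size)

print(train_batches, val_batches)  # 5 2
```

These counts are the ones to expect in the progress bars of the training log: 5 training batches and 2 validation batches per epoch.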

In [11]:
from torch import randn
from torch.utils.data import TensorDataset, random_split, DataLoader

num_samples = 100
data = randn(num_samples, 2)
labels = (data.sum(dim=1) > 0).long()

dataset = TensorDataset(data, labels)
train, val = random_split(dataset, [0.8, 0.2])

train_dataloader = DataLoader(train, batch_size=16, shuffle=True)
val_dataloader = DataLoader(val, batch_size=16, shuffle=False)

trainer = dl.Trainer(max_epochs=10)

trainer.fit(classifier, train_dataloader, val_dataloader)

trainer.test(classifier, val_dataloader)

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Missing logger folder: /Users/giovannivolpe/Documents/GitHub/deeplay/tutorials/developers/lightning_logs

  | Name          | Type             | Params
---------------------------------------------------
0 | model         | SimpleMLP        | 1.2 K 
1 | loss          | CrossEntropyLoss | 0     
2 | train_metrics | MetricCollection | 0     
3 | val_metrics   | MetricCollection | 0     
4 | test_metrics  | MetricCollection | 0     
5 | optimizer     | Adam             | 0     
---------------------------------------------------
1.2 K     Trainable params
0         Non-trainable params
1.2 K     Total params
0.005     Total estimated model params size (MB)


Sanity Checking DataLoader 0: 100%|██████████| 2/2 [00:00<00:00, 50.91it/s]

/Users/giovannivolpe/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=7` in the `DataLoader` to improve performance.


                                                                           

/Users/giovannivolpe/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=7` in the `DataLoader` to improve performance.
/Users/giovannivolpe/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (5) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


Epoch 9: 100%|██████████| 5/5 [00:00<00:00, 50.71it/s, v_num=0, train_loss_step=0.501, trainMulticlassAccuracy_step=1.000, val_loss_step=0.560, valMulticlassAccuracy_step=0.750, val_loss_epoch=0.513, valMulticlassAccuracy_epoch=0.900, train_loss_epoch=0.496, trainMulticlassAccuracy_epoch=0.962] 

`Trainer.fit` stopped: `max_epochs=10` reached.


Epoch 9: 100%|██████████| 5/5 [00:00<00:00, 45.62it/s, v_num=0, train_loss_step=0.501, trainMulticlassAccuracy_step=1.000, val_loss_step=0.560, valMulticlassAccuracy_step=0.750, val_loss_epoch=0.513, valMulticlassAccuracy_epoch=0.900, train_loss_epoch=0.496, trainMulticlassAccuracy_epoch=0.962]


/Users/giovannivolpe/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:441: The 'test_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=7` in the `DataLoader` to improve performance.


Testing DataLoader 0: 100%|██████████| 2/2 [00:00<00:00, 73.47it/s] 
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        Test metric                 DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
testMulticlassAccuracy_epoch     0.8999999761581421
      test_loss_epoch            0.5125035047531128
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss_epoch': 0.5125035047531128,
  'testMulticlassAccuracy_epoch': 0.8999999761581421}]
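
The `1.2 K` parameter count in the summary table above can be checked by hand: each linear layer has one weight per input-output pair plus one bias per output. The sketch below does the arithmetic in plain Python; the size estimate assumes 4 bytes per parameter (32-bit floats), and the helper name `linear_params` is illustrative.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


shapes = [(2, 32), (32, 32), (32, 2)]  # layers of SimpleMLP(2, [32, 32], 2)
total = sum(linear_params(n_in, n_out) for n_in, n_out in shapes)
size_mb = total * 4 / 1e6  # assumes 4 bytes per parameter (float32)

print(total)              # 1218, displayed as "1.2 K"
print(round(size_mb, 3))  # 0.005
```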

## Quality-of-Life Improvements for IDE

### Improvements to `SimpleMLP`

We'll start by documenting the class with a docstring:

```python
#---snip---
class SimpleMLP(dl.DeeplayModule):
    """Multi-layer perceptron module.

    Also commonly known as a fully-connected neural network, or a dense neural network.

    Configurables
    -------------
    - in_features (int): Number of input features. If None, the input shape is inferred in the first forward pass. (Default: None)
    - hidden_features (list[int]): Number of hidden units in each layer.
    - out_features (int): Number of output features. (Default: 1)
    - blocks (template-like): Specification for the blocks of the MLP. (Default: "layer" >> "activation" >> "normalization" >> "dropout")
        - layer (template-like): Specification for the layer of the block. (Default: nn.Linear)
        - activation (template-like): Specification for the activation of the block. (Default: nn.ReLU)
        - normalization (template-like): Specification for the normalization of the block. (Default: nn.Identity)
    - out_activation (template-like): Specification for the output activation of the MLP. (Default: nn.Identity)

    Evaluation
    ----------
    >>> for block in mlp.blocks:
    >>>    x = block(x)
    >>> return x

    Examples
    --------
    >>> mlp = SimpleMLP(28 * 28, [128, 128], 10)
    >>> mlp.build()

    Return Values
    -------------
    The forward method returns the processed tensor.

    """

    #---snip---
```

We'll then add type annotations for the class attributes and the constructor parameters:

```python
#---snip---
from typing import Optional, Sequence, Type

class SimpleMLP(dl.DeeplayModule):
    #---snip---

    in_features: Optional[int]
    hidden_features: Sequence[Optional[int]]
    out_features: int
    blocks: dl.LayerList[dl.LayerActivationBlock]

    #---snip---

    def __init__(
        self,
        in_features: Optional[int],
        hidden_features: Sequence[Optional[int]],
        out_features: int,
    ):
        #---snip---
```

We'll then add some checks on the input parameters:

```python
#---snip---
from typing import Optional, Sequence, Type

class SimpleMLP(dl.DeeplayModule):
    #---snip---

    def __init__(
        #---snip---
    ):
        #---snip---
        
        if out_features <= 0:
            raise ValueError(
                f"Number of output features must be positive, got {out_features}"
            )

        if in_features is not None and in_features <= 0:
            raise ValueError(f"in_features must be positive, got {in_features}")

        # Skip None entries, which the Optional[int] annotation allows.
        if any(h is not None and h <= 0 for h in hidden_features):
            raise ValueError(
                f"all hidden_features must be positive, got {hidden_features}"
            )
        
    #---snip---
```
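
The same checks can be tried out in isolation. Below is a minimal standalone sketch that mirrors them in a free function; the name `validate_features` is hypothetical, and skipping `None` entries is an assumption based on the `Optional[int]` annotations.

```python
from typing import Optional, Sequence


def validate_features(
    in_features: Optional[int],
    hidden_features: Sequence[Optional[int]],
    out_features: int,
) -> None:
    """Raise ValueError if any given feature count is not positive."""
    if out_features <= 0:
        raise ValueError(
            f"Number of output features must be positive, got {out_features}"
        )

    if in_features is not None and in_features <= 0:
        raise ValueError(f"in_features must be positive, got {in_features}")

    # None entries are skipped rather than compared with 0.
    if any(h is not None and h <= 0 for h in hidden_features):
        raise ValueError(
            f"all hidden_features must be positive, got {hidden_features}"
        )


validate_features(2, [32, 32], 2)  # valid sizes: returns silently

try:
    validate_features(2, [32, 0], 2)
except ValueError as error:
    print(error)  # all hidden_features must be positive, got [32, 0]
```

Failing early in the constructor gives a clearer error than a shape mismatch raised later, when the layers are built or first called.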

We'll then add some shorthands, together with the corresponding documentation, for convenience:

```python
#---snip---

class SimpleMLP(dl.DeeplayModule):
    """Multi-layer perceptron module.
    ---snip---

    Shorthands
    ----------
    - `input`: Equivalent to `.blocks[0]`.
    - `hidden`: Equivalent to `.blocks[:-1]`.
    - `output`: Equivalent to `.blocks[-1]`.
    - `layer`: Equivalent to `.blocks.layer`.
    - `activation`: Equivalent to `.blocks.activation`.
    - `normalization`: Equivalent to `.blocks.normalization`.

    ---snip---
    """

    #---snip---

    @property
    def input(self):
        """Return the input layer of the network. Equivalent to `.blocks[0]`."""
        return self.blocks[0]

    @property
    def hidden(self):
        """Return the hidden layers of the network. Equivalent to `.blocks[:-1]`"""
        return self.blocks[:-1]

    @property
    def output(self):
        """Return the last layer of the network. Equivalent to `.blocks[-1]`."""
        return self.blocks[-1]

    @property
    def layer(self) -> dl.LayerList[dl.Layer]:
        """Return the layers of the network. Equivalent to `.blocks.layer`."""
        return self.blocks.layer

    @property
    def activation(self) -> dl.LayerList[dl.Layer]:
        """Return the activations of the network. Equivalent to `.blocks.activation`."""
        return self.blocks.activation

    @property
    def normalization(self) -> dl.LayerList[dl.Layer]:
        """Return the normalizations of the network. Equivalent to `.blocks.normalization`."""
        return self.blocks.normalization

    #---snip---
```
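
To illustrate what the index-based shorthands select, here is a toy plain-Python sketch. The class `BlockStack` is a hypothetical stand-in that stores strings in a regular list; it only mimics the indexing and is not how `deeplay` implements `LayerList`.

```python
class BlockStack:
    """Hypothetical stand-in for a module holding an ordered list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    @property
    def input(self):
        """First block. Equivalent to `.blocks[0]`."""
        return self.blocks[0]

    @property
    def hidden(self):
        """All blocks except the last. Equivalent to `.blocks[:-1]`."""
        return self.blocks[:-1]

    @property
    def output(self):
        """Last block. Equivalent to `.blocks[-1]`."""
        return self.blocks[-1]


stack = BlockStack(["block0", "block1", "block2"])
print(stack.input)   # block0
print(stack.hidden)  # ['block0', 'block1']
print(stack.output)  # block2
```

Note that, with the slice `[:-1]`, `hidden` includes the first block, so `input` and `hidden` overlap.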

Finally, we can add typed overloads of `configure`, so that the IDE can suggest the available arguments:

```python
#---snip---
from typing import overload, List, Literal, Any

class SimpleMLP(dl.DeeplayModule):
    #---snip---

    @overload
    def configure(
        self,
        /,
        in_features: int | None = None,
        hidden_features: List[int] | None = None,
        out_features: int | None = None,
        out_activation: Type[nn.Module] | nn.Module | None = None,
    ) -> None:
        ...

    @overload
    def configure(
        self,
        name: Literal["blocks"],
        index: int | slice | List[int | slice] | None = None,
        order: Optional[Sequence[str]] = None,
        layer: Optional[Type[nn.Module]] = None,
        activation: Optional[Type[nn.Module]] = None,
        normalization: Optional[Type[nn.Module]] = None,
        **kwargs: Any,
    ) -> None:
        ...

    configure = dl.DeeplayModule.configure
```
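
The pattern above relies on how `typing.overload` works: the decorated signatures exist only for type checkers and IDEs, while the final assignment binds the name to the real implementation. The sketch below reproduces this with toy classes; `Base`, `Child`, and the `last_call` attribute are hypothetical names used only for illustration.

```python
from typing import Any, Optional, overload


class Base:
    def configure(self, *args: Any, **kwargs: Any) -> None:
        """Runtime implementation: record the arguments it received."""
        self.last_call = (args, kwargs)


class Child(Base):
    # The overloads below only describe the accepted call signatures.
    @overload
    def configure(self, /, size: Optional[int] = None) -> None:
        ...

    @overload
    def configure(
        self, name: str, index: Optional[int] = None, **kwargs: Any
    ) -> None:
        ...

    # Rebinding the name makes the inherited method the one that actually runs.
    configure = Base.configure


child = Child()
child.configure(size=3)
print(child.last_call)  # ((), {'size': 3})

child.configure("blocks", 0, layer="linear")
print(child.last_call)  # (('blocks', 0), {'layer': 'linear'})
```

Without the final assignment, calling `configure` would reach the placeholder left by `@overload`, which raises `NotImplementedError` at runtime.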

In [13]:
import deeplay as dl
import torch.nn as nn
from typing import Optional, Sequence, Type, overload, List, Literal, Any


class SimpleMLP(dl.DeeplayModule):
    """Multi-layer perceptron module.

    Also commonly known as a fully-connected neural network, or a dense neural network.

    Configurables
    -------------
    - in_features (int): Number of input features. If None, the input shape is inferred in the first forward pass. (Default: None)
    - hidden_features (list[int]): Number of hidden units in each layer.
    - out_features (int): Number of output features. (Default: 1)
    - blocks (template-like): Specification for the blocks of the MLP. (Default: "layer" >> "activation" >> "normalization" >> "dropout")
        - layer (template-like): Specification for the layer of the block. (Default: nn.Linear)
        - activation (template-like): Specification for the activation of the block. (Default: nn.ReLU)
        - normalization (template-like): Specification for the normalization of the block. (Default: nn.Identity)

    Evaluation
    ----------
    >>> for block in mlp.blocks:
    >>>    x = block(x)
    >>> return x

    Examples
    --------
    >>> mlp = SimpleMLP(28 * 28, [128, 128], 10)
    >>> mlp.build()

    Shorthands
    ----------
    - `input`: Equivalent to `.blocks[0]`.
    - `hidden`: Equivalent to `.blocks[:-1]`.
    - `output`: Equivalent to `.blocks[-1]`.
    - `layer`: Equivalent to `.blocks.layer`.
    - `activation`: Equivalent to `.blocks.activation`.
    - `normalization`: Equivalent to `.blocks.normalization`.    

    Return Values
    -------------
    The forward method returns the processed tensor.

    """

    in_features: Optional[int]
    hidden_features: Sequence[Optional[int]]
    out_features: int
    blocks: dl.LayerList[dl.LayerActivationBlock]

    @property
    def input(self):
        """Return the input layer of the network. Equivalent to `.blocks[0]`."""
        return self.blocks[0]

    @property
    def hidden(self):
        """Return the hidden layers of the network. Equivalent to `.blocks[:-1]`"""
        return self.blocks[:-1]

    @property
    def output(self):
        """Return the last layer of the network. Equivalent to `.blocks[-1]`."""
        return self.blocks[-1]

    @property
    def layer(self) -> dl.LayerList[dl.Layer]:
        """Return the layers of the network. Equivalent to `.blocks.layer`."""
        return self.blocks.layer

    @property
    def activation(self) -> dl.LayerList[dl.Layer]:
        """Return the activations of the network. Equivalent to `.blocks.activation`."""
        return self.blocks.activation

    @property
    def normalization(self) -> dl.LayerList[dl.Layer]:
        """Return the normalizations of the network. Equivalent to `.blocks.normalization`."""
        return self.blocks.normalization

    def __init__(
        self,
        in_features: Optional[int],
        hidden_features: Sequence[Optional[int]],
        out_features: int,
    ):
        super().__init__()
        
        self.in_features = in_features
        self.hidden_features = hidden_features
        self.out_features = out_features

        if out_features <= 0:
            raise ValueError(
                f"Number of output features must be positive, got {out_features}"
            )

        if in_features is not None and in_features <= 0:
            raise ValueError(f"in_features must be positive, got {in_features}")

        # Skip None entries, which the Optional[int] annotation allows.
        if any(h is not None and h <= 0 for h in hidden_features):
            raise ValueError(
                f"all hidden_features must be positive, got {hidden_features}"
            )

        self.blocks = dl.LayerList()
        for inputs_layer, outputs_layer in zip([in_features, *hidden_features], 
                                               [*hidden_features, out_features]):
            self.blocks.append(
                dl.LayerActivationBlock(
                    dl.Layer(nn.Linear, inputs_layer, outputs_layer),
                    dl.Layer(nn.ReLU)
                )
            )

    def forward(self, x):
        for block in self.blocks:
            x = block(x)

        return x

    @overload
    def configure(
        self,
        /,
        in_features: int | None = None,
        hidden_features: List[int] | None = None,
        out_features: int | None = None,
        out_activation: Type[nn.Module] | nn.Module | None = None,
    ) -> None:
        ...

    @overload
    def configure(
        self,
        name: Literal["blocks"],
        index: int | slice | List[int | slice] | None = None,
        order: Optional[Sequence[str]] = None,
        layer: Optional[Type[nn.Module]] = None,
        activation: Optional[Type[nn.Module]] = None,
        normalization: Optional[Type[nn.Module]] = None,
        **kwargs: Any,
    ) -> None:
        ...

    configure = dl.DeeplayModule.configure


### Improvements to `SimpleClassifier`