# layers

This module contains many layer classes that we might be interested in using in our models. These layers complement the default [PyTorch layers](https://pytorch.org/docs/stable/nn.html), which we can also use in our models.

In [1]:
from fastai.docs import *

In [4]:
??get_mnist

## AdaptiveConcatPool2d

In [4]:
from fastai.gen_doc.nbdoc import *
from fastai.layers import * 

In [7]:
data = get_mnist()

The [`AdaptiveConcatPool2d`](/layers.html#AdaptiveConcatPool2d) object applies both adaptive average pooling and adaptive max pooling to its input and concatenates the two results. Keeping the information from both pooling methods can improve performance, as the experiments below suggest. The technique is called `adaptive` because we choose the output dimensions we want and the layer works out the pooling window from the input's dimensions, instead of us having to fix the input size to obtain a given output size.
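
To make the idea concrete, here is a minimal sketch in plain PyTorch of what concatenating the two pooling results can look like. The class name `ConcatPoolSketch` is hypothetical and this is not the fastai source; it assumes the two results are joined along the channel dimension, so the layer outputs twice as many channels as it receives.

```python
import torch
import torch.nn as nn

class ConcatPoolSketch(nn.Module):
    "Hypothetical stand-in: concatenates adaptive max and average pooling."
    def __init__(self, sz=1):
        super().__init__()
        self.mp = nn.AdaptiveMaxPool2d(sz)
        self.ap = nn.AdaptiveAvgPool2d(sz)

    def forward(self, x):
        # Assumption: the two pooled tensors are joined along dim 1 (channels)
        return torch.cat([self.mp(x), self.ap(x)], dim=1)

x = torch.randn(4, 16, 7, 7)        # batch of 4, 16 channels, 7x7 feature maps
pooled = ConcatPoolSketch(1)(x)
print(pooled.shape)                 # torch.Size([4, 32, 1, 1])
```

Because the channel count doubles under this assumption, any layer that follows would need to expect twice as many input features.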

Let's try training with Adaptive Max Pooling first, then with Adaptive Average Pooling, and finally with the concatenation of both to see how they compare.

We will first define a [`simple_cnn`](/layers.html#simple_cnn) using [Adaptive Max Pooling](https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveMaxPool2d).

In [6]:
def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv2d_relu(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(nn.AdaptiveMaxPool2d(1), Flatten()))
    return nn.Sequential(*layers)

In [7]:
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)

Total time: 00:27
epoch  train loss  valid loss  accuracy
0      0.083610    0.076819    0.976938  (00:05)
1      0.046277    0.051196    0.984789  (00:05)
2      0.029490    0.037488    0.987733  (00:05)
3      0.027417    0.031515    0.986261  (00:05)
4      0.021957    0.030983    0.987733  (00:05)



Now let's try with [Adaptive Average Pooling](https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveAvgPool2d).

In [57]:
def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv2d_relu(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(nn.AdaptiveAvgPool2d(1), Flatten()))
    return nn.Sequential(*layers)

In [58]:
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)

Total time: 00:27
epoch  train loss  valid loss  accuracy
0      0.154510    0.136999    0.948970  (00:05)
1      0.106303    0.090356    0.969578  (00:05)
2      0.076875    0.066036    0.978901  (00:05)
3      0.059066    0.043986    0.982826  (00:05)
4      0.051543    0.039275    0.986752  (00:05)



Finally, we will try the concatenation of both, [`AdaptiveConcatPool2d`](/layers.html#AdaptiveConcatPool2d). In this run it gives the best result of the three: validation accuracy rises to 0.9926 (from 0.9877 with max pooling and 0.9868 with average pooling) and validation loss falls to 0.0175.

In [60]:
def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv2d_relu(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(AdaptiveConcatPool2d(1), Flatten()))
    return nn.Sequential(*layers)

In [61]:
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)

Total time: 00:27
epoch  train loss  valid loss  accuracy
0      0.083039    0.062286    0.978410  (00:05)
1      0.050831    0.043394    0.987733  (00:05)
2      0.031640    0.027480    0.991659  (00:05)
3      0.025538    0.025393    0.991168  (00:05)
4      0.025177    0.017456    0.992640  (00:05)



## Lambda

Lambda allows us to define functions and use them as layers in our networks inside a [Sequential](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential) object. 

So, for example, say we want to apply a [log_softmax](https://pytorch.org/docs/stable/nn.html#torch.nn.functional.log_softmax) to our outputs and need to change the shape of our output batches first. We can add a layer that applies this change by calling:

`Lambda(lambda x: x.view(x.size(0),-1))`

This is actually how we built the [`Flatten`](/layers.html#Flatten) layer, which you can inspect by calling `??Flatten`.
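
As a rough picture of how a function can be turned into a layer, the sketch below wraps a callable in a small `nn.Module`. The class name `LambdaSketch` is hypothetical and the code is an illustration under that assumption, not the fastai source.

```python
import torch
import torch.nn as nn

class LambdaSketch(nn.Module):
    "Hypothetical stand-in: wraps a plain function so it can sit inside nn.Sequential."
    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)

# Flatten everything except the batch dimension
flatten = LambdaSketch(lambda x: x.view(x.size(0), -1))

head = nn.Sequential(nn.AdaptiveAvgPool2d(1), flatten)
feats = torch.randn(8, 10, 4, 4)
flat = head(feats)
print(flat.shape)   # torch.Size([8, 10])
```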

Let's see an example of how the shape of our output can change when we add this layer.

In [62]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break

torch.Size([64, 10, 1, 1])


In [61]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Lambda(lambda x: x.view(x.size(0),-1))
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break

torch.Size([64, 10])


We can combine these two final layers ([AdaptiveAvgPool2d](https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveAvgPool2d) and [`Flatten`](/layers.html#Flatten)) by using [`PoolFlatten`](/layers.html#PoolFlatten).

In [68]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    PoolFlatten()
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break

torch.Size([64, 10])


Another use of [`Lambda`](/layers.html#Lambda) is to resize batches with [`ResizeBatch`](/layers.html#ResizeBatch) when a layer expects an input of a different shape than the output of the previous one. Let's see an example:

In [85]:
a = torch.tensor([[1., -1.], [1., -1.]])
print(a)

tensor([[ 1., -1.],
        [ 1., -1.]])


In [86]:
out = ResizeBatch(4)
print(out(a))

tensor([[ 1., -1.,  1., -1.]])
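
The output above has shape `(1, 4)`: the four values are kept and regrouped into rows of length 4. A minimal sketch of the same reshaping with a plain `view` call follows; it assumes `ResizeBatch(4)` behaves like `x.view(-1, 4)`, which matches the output shown but is not taken from the fastai source.

```python
import torch

a = torch.tensor([[1., -1.], [1., -1.]])   # shape (2, 2)
resized = a.view(-1, 4)                    # -1 lets PyTorch infer the leading dimension
print(resized.shape)                       # torch.Size([1, 4])
```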


In [13]:
show_doc(AdaptiveConcatPool2d)

## <a id=AdaptiveConcatPool2d></a>`class` `AdaptiveConcatPool2d`
> `AdaptiveConcatPool2d`(`sz`:`Optional`\[`int`\]=`None`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


Layer that concats `AdaptiveAvgPool2d` and `AdaptiveMaxPool2d` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L60">[source]</a>

[`AdaptiveConcatPool2d`](/layers.html#AdaptiveConcatPool2d)

In [14]:
show_doc(AdaptiveConcatPool2d.forward)

#### <a id=forward></a>`forward`
> `forward`(`x`)


Should be overridden by all subclasses.

.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the :class:`Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them. <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L67">[source]</a>

`AdaptiveConcatPool2d.forward`

In [15]:
show_doc(CrossEntropyFlat)

## <a id=CrossEntropyFlat></a>`class` `CrossEntropyFlat`
> `CrossEntropyFlat`(`weight`=`None`, `size_average`=`None`, `ignore_index`=`-100`, `reduce`=`None`, `reduction`=`'elementwise_mean'`) :: [`CrossEntropyLoss`](https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss)


Same as `nn.CrossEntropyLoss`, but flattens input and target <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L93">[source]</a>

[`CrossEntropyFlat`](/layers.html#CrossEntropyFlat)

In [16]:
show_doc(CrossEntropyFlat.forward)

#### <a id=forward></a>`forward`
> `forward`(`input`:`Tensor`, `target`:`Tensor`) -> `Rank0Tensor`


Should be overridden by all subclasses.

.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the :class:`Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them. <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L95">[source]</a>

`CrossEntropyFlat.forward`
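
As a rough illustration of what "flattens input and target" can mean, the sketch below collapses any trailing dimensions before handing the tensors to `nn.CrossEntropyLoss`. The class name `FlatCrossEntropySketch` and the exact reshaping are assumptions made for this example, not the library's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlatCrossEntropySketch(nn.CrossEntropyLoss):
    "Hypothetical stand-in: cross entropy over inputs with extra trailing dims."
    def forward(self, input, target):
        n, c = input.shape[:2]
        # input (n, c, d1, d2, ...) -> (n, c, d1*d2*...); target (n, d1, d2, ...) -> (n, d1*d2*...)
        return super().forward(input.reshape(n, c, -1), target.reshape(n, -1))

torch.manual_seed(0)
logits = torch.randn(2, 5, 3, 3)            # e.g. per-pixel scores for 5 classes
labels = torch.randint(0, 5, (2, 3, 3))     # one class index per pixel
loss = FlatCrossEntropySketch()(logits, labels)
reference = F.cross_entropy(logits, labels) # same loss computed on the unflattened tensors
print(loss.dim())                           # 0
```

With the default mean reduction, flattening the trailing dimensions does not change the value of the loss; it only changes the shapes passed to the underlying criterion.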

In [17]:
show_doc(Debugger)

## <a id=Debugger></a>`class` `Debugger`
> `Debugger`() :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


A module to debug inside a model <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L69">[source]</a>

[`Debugger`](/layers.html#Debugger)

In [18]:
show_doc(Debugger.forward)

#### <a id=forward></a>`forward`
> `forward`(`x`:`Tensor`) -> `Tensor`


Should be overridden by all subclasses.

.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the :class:`Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them. <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L71">[source]</a>

`Debugger.forward`
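
A debugging module is typically placed between two layers so that the activations flowing through that point can be inspected. As a non-interactive stand-in for that idea (not the fastai implementation), the hypothetical `ShapeRecorder` below passes its input through unchanged and records its shape.

```python
import torch
import torch.nn as nn

class ShapeRecorder(nn.Module):
    "Hypothetical pass-through layer that remembers the shapes it has seen."
    def __init__(self):
        super().__init__()
        self.shapes = []

    def forward(self, x):
        self.shapes.append(tuple(x.shape))  # inspect here; a breakpoint would also work
        return x                            # leave the activations untouched

probe = ShapeRecorder()
net = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), probe, nn.ReLU())
out = net(torch.randn(2, 3, 28, 28))
print(probe.shapes)   # [(2, 16, 14, 14)]
```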

In [19]:
show_doc(Flatten)

#### <a id=Flatten></a>`Flatten`
> `Flatten`() -> `Tensor`


Flattens `x` to a single dimension per batch item, often used at the end of a model <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L21">[source]</a>

[`Flatten`](/layers.html#Flatten)

In [20]:
show_doc(Lambda)

## <a id=Lambda></a>`class` `Lambda`
> `Lambda`(`func`:`LambdaFunc`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


An easy way to create a pytorch layer for a simple `func` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L8">[source]</a>

[`Lambda`](/layers.html#Lambda)

In [21]:
show_doc(Lambda.forward)

#### <a id=forward></a>`forward`
> `forward`(`x`)


Should be overridden by all subclasses.

.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the :class:`Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them. <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L15">[source]</a>

`Lambda.forward`

In [22]:
show_doc(PoolFlatten)

#### <a id=PoolFlatten></a>`PoolFlatten`
> `PoolFlatten`() -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)


Apply `nn.AdaptiveAvgPool2d` to `x` and then flatten the result <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L25">[source]</a>

[`PoolFlatten`](/layers.html#PoolFlatten)

In [23]:
show_doc(ResizeBatch)

#### <a id=ResizeBatch></a>`ResizeBatch`
> `ResizeBatch`(`size`:`int`) -> `Tensor`


Layer that resizes x to `size`, good for connecting mismatched layers <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L17">[source]</a>

[`ResizeBatch`](/layers.html#ResizeBatch)

In [24]:
show_doc(StdUpsample)

## <a id=StdUpsample></a>`class` `StdUpsample`
> `StdUpsample`(`n_in`:`int`, `n_out`:`int`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


Standard upsample module <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L75">[source]</a>

[`StdUpsample`](/layers.html#StdUpsample)

In [25]:
show_doc(StdUpsample.forward)

#### <a id=forward></a>`forward`
> `forward`(`x`:`Tensor`) -> `Tensor`


Should be overridden by all subclasses.

.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the :class:`Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them. <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L82">[source]</a>

`StdUpsample.forward`

In [26]:
show_doc(bn_drop_lin)

#### <a id=bn_drop_lin></a>`bn_drop_lin`
> `bn_drop_lin`(`n_in`:`int`, `n_out`:`int`, `bn`:`bool`=`True`, `p`:`float`=`0.0`, `actn`:`Optional`\[[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)\]=`None`)


`n_in`->bn->dropout->linear(`n_in`,`n_out`)->`actn` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L29">[source]</a>

[`bn_drop_lin`](/layers.html#bn_drop_lin)
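
The arrow notation above can be read as a recipe for a small block of layers. The sketch below builds such a block in plain PyTorch under the assumption that batch norm is skipped when `bn` is `False` and dropout is skipped when `p` is `0`; the function name `bn_drop_lin_sketch` is hypothetical.

```python
import torch
import torch.nn as nn

def bn_drop_lin_sketch(n_in, n_out, bn=True, p=0.0, actn=None):
    "Hypothetical sketch: batch norm -> dropout -> linear -> optional activation."
    layers = [nn.BatchNorm1d(n_in)] if bn else []
    if p != 0:
        layers.append(nn.Dropout(p))
    layers.append(nn.Linear(n_in, n_out))
    if actn is not None:
        layers.append(actn)
    return layers

block = nn.Sequential(*bn_drop_lin_sketch(32, 10, bn=True, p=0.5, actn=nn.ReLU()))
y = block(torch.randn(4, 32))
print(y.shape)   # torch.Size([4, 10])
```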

In [27]:
show_doc(conv2d)

#### <a id=conv2d></a>`conv2d`
> `conv2d`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`3`, `stride`:`int`=`1`, `padding`:`int`=`None`, `bias`=`False`) -> [`Conv2d`](https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d)


Create `nn.Conv2d` layer: `ni` inputs, `nf` outputs, `ks` kernel size. `padding` defaults to `ks//2` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L37">[source]</a>

[`conv2d`](/layers.html#conv2d)
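
Defaulting the padding to half the kernel size is a common choice because, for odd kernel sizes and `stride=1`, it keeps the spatial size of the feature map unchanged. A quick check in plain PyTorch (the helper name `same_pad_conv` is hypothetical):

```python
import torch
import torch.nn as nn

def same_pad_conv(ni, nf, ks=3, stride=1):
    "Hypothetical helper: nn.Conv2d with padding set to ks//2."
    return nn.Conv2d(ni, nf, kernel_size=ks, stride=stride, padding=ks // 2, bias=False)

x = torch.randn(1, 3, 28, 28)
for ks in (1, 3, 5):
    # The spatial size stays 28x28 for each odd kernel size
    print(ks, same_pad_conv(3, 8, ks)(x).shape)
```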

In [28]:
show_doc(conv2d_relu)

#### <a id=conv2d_relu></a>`conv2d_relu`
> `conv2d_relu`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`3`, `stride`:`int`=`1`, `padding`:`int`=`None`, `bn`:`bool`=`False`) -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)


Create a [`conv2d`](/layers.html#conv2d) layer with `nn.ReLU` activation and optional(`bn`) `nn.BatchNorm2d` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L49">[source]</a>

[`conv2d_relu`](/layers.html#conv2d_relu)

In [29]:
show_doc(conv2d_trans)

#### <a id=conv2d_trans></a>`conv2d_trans`
> `conv2d_trans`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`2`, `stride`:`int`=`2`, `padding`:`int`=`0`) -> [`ConvTranspose2d`](https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d)


Create `nn.ConvTranspose2d` layer: `ni` inputs, `nf` outputs, `ks` kernel size. `padding` defaults to 0 <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L56">[source]</a>

[`conv2d_trans`](/layers.html#conv2d_trans)
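
With the defaults shown above (`ks=2`, `stride=2`, `padding=0`), a transposed convolution doubles the height and width of its input, since the output size is `(H - 1) * stride - 2 * padding + ks`. A short check using `nn.ConvTranspose2d` directly:

```python
import torch
import torch.nn as nn

up = nn.ConvTranspose2d(16, 8, kernel_size=2, stride=2, padding=0)
x = torch.randn(1, 16, 7, 7)
y = up(x)
print(y.shape)   # torch.Size([1, 8, 14, 14]), since (7 - 1) * 2 - 0 + 2 = 14
```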

In [30]:
show_doc(conv_layer)

#### <a id=conv_layer></a>`conv_layer`
> `conv_layer`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`3`, `stride`:`int`=`1`) -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)


Create Conv2d->BatchNorm2d->LeakyReLU layer: `ni` input filters, `nf` output filters, `ks` kernel size, `stride` stride <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L42">[source]</a>

[`conv_layer`](/layers.html#conv_layer)

In [31]:
show_doc(get_embedding)

#### <a id=get_embedding></a>`get_embedding`
> `get_embedding`(`ni`:`int`, `nf`:`int`) -> [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


Creates an embedding layer <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L115">[source]</a>

[`get_embedding`](/layers.html#get_embedding)

In [32]:
show_doc(simple_cnn)

#### <a id=simple_cnn></a>`simple_cnn`
> `simple_cnn`(`actns`:`Collection`\[`int`\], `kernel_szs`:`Collection`\[`int`\]=`None`, `strides`:`Collection`\[`int`\]=`None`) -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)


CNN with [`conv2d_relu`](/layers.html#conv2d_relu) layers defined by `actns`, `kernel_szs` and `strides` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L99">[source]</a>

[`simple_cnn`](/layers.html#simple_cnn)

In [33]:
show_doc(std_upsample_head)

#### <a id=std_upsample_head></a>`std_upsample_head`
> `std_upsample_head`(`c`, `nfs`:`Collection`\[`int`\]) -> [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


Creates a sequence of upsample layers <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L85">[source]</a>

[`std_upsample_head`](/layers.html#std_upsample_head)

In [34]:
show_doc(trunc_normal_)

#### <a id=trunc_normal_></a>`trunc_normal_`
> `trunc_normal_`(`x`:`Tensor`, `mean`:`float`=`0.0`, `std`:`float`=`1.0`) -> `Tensor`


Truncated normal initialization <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L110">[source]</a>

[`trunc_normal_`](/layers.html#trunc_normal_)
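
One simple way to approximate a truncated normal, sketched below, is to draw standard normal samples and fold them into the interval (-2, 2) with a floating-point remainder before scaling and shifting. This is an assumption about the general technique, not a statement of how the library implements it, and the name `trunc_normal_sketch_` is hypothetical.

```python
import torch

def trunc_normal_sketch_(x, mean=0.0, std=1.0):
    "Hypothetical sketch: fill `x` in place with values within 2 std of `mean`."
    # fmod_(2) keeps the sign and maps each sample into (-2, 2)
    return x.normal_().fmod_(2).mul_(std).add_(mean)

torch.manual_seed(0)
w = trunc_normal_sketch_(torch.empty(1000), mean=0.0, std=0.01)
print(bool((w.abs() <= 0.02).all()))   # True
```

Folding is not the same as rejection sampling: it wraps tail samples back into the interval rather than redrawing them, so the resulting distribution only approximates a true truncated normal.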