# layers

This module contains layer classes that we may want to use in our models. These layers complement the default [PyTorch layers](https://pytorch.org/docs/stable/nn.html), which we can also use as predefined layers.

In [29]:
from fastai import *
from fastai.docs import *

In [14]:
show_doc(AdaptiveConcatPool2d)

## <a id=AdaptiveConcatPool2d></a>`class` `AdaptiveConcatPool2d`
> `AdaptiveConcatPool2d`(`sz`:`Optional`\[`int`\]=`None`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


Layer that concats `AdaptiveAvgPool2d` and `AdaptiveMaxPool2d` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L60">[source]</a>

In [11]:
from fastai.gen_doc.nbdoc import *
from fastai.layers import * 

The [`AdaptiveConcatPool2d`](/layers.html#AdaptiveConcatPool2d) object applies adaptive average pooling and adaptive max pooling and concatenates the results. We use it because it gives the model the information from both methods, which can improve performance. The technique is called `adaptive` because it lets us choose the output dimensions we want, instead of choosing the input's dimensions to fit a desired output size.
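
To make the idea concrete, here is a minimal sketch of concatenated pooling written in plain PyTorch. The class name `ConcatPoolSketch` is hypothetical, and both the default output size and the order of the two pooled tensors are assumptions; the library's own layer may differ in detail.

```python
import torch
import torch.nn as nn

class ConcatPoolSketch(nn.Module):
    "Illustrative stand-in: concatenates adaptive max and adaptive average pooling."
    def __init__(self, sz=None):
        super().__init__()
        sz = sz or 1  # assumed default: pool every channel down to 1x1
        self.ap = nn.AdaptiveAvgPool2d(sz)
        self.mp = nn.AdaptiveMaxPool2d(sz)

    def forward(self, x):
        # Concatenate along the channel dimension, so the channel count doubles.
        return torch.cat([self.mp(x), self.ap(x)], dim=1)

x = torch.randn(4, 16, 7, 7)
pooled = ConcatPoolSketch(1)(x)
print(pooled.shape)  # torch.Size([4, 32, 1, 1])
```

Because the channel count doubles, whatever layer follows must expect twice as many input features as it would after a single pooling layer.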

Let's try training with Adaptive Max Pooling first, then with Adaptive Average Pooling, and finally with the concatenation of both to see how they compare.

We will first define a [`simple_cnn`](/layers.html#simple_cnn) using [Adaptive Max Pooling](https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveMaxPool2d).

In [6]:
data = get_mnist()

In [18]:
def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv2d_relu(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(nn.AdaptiveMaxPool2d(1), Flatten()))
    return nn.Sequential(*layers)

In [19]:
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)


Total time: 00:27
epoch  train loss  valid loss  accuracy
0      0.103762    0.091423    0.968106  (00:05)
1      0.063399    0.072915    0.974975  (00:05)
2      0.046143    0.046770    0.983317  (00:05)
3      0.033929    0.036166    0.984298  (00:05)
4      0.027374    0.029849    0.989205  (00:05)



Now let's try with [Adaptive Average Pooling](https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveAvgPool2d).

In [7]:
def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv2d_relu(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(nn.AdaptiveAvgPool2d(1), Flatten()))
    return nn.Sequential(*layers)

In [8]:
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)


Total time: 00:27
epoch  train loss  valid loss  accuracy
0      0.137460    0.110767    0.962218  (00:05)
1      0.105729    0.080629    0.972522  (00:05)
2      0.081463    0.060069    0.979392  (00:05)
3      0.062050    0.055416    0.981354  (00:05)
4      0.050456    0.035258    0.988224  (00:05)



Finally, we will try the concatenation of both, [`AdaptiveConcatPool2d`](/layers.html#AdaptiveConcatPool2d). In this run it gives the best result of the three: accuracy rises to 0.992 (from 0.989 and 0.988) and validation loss falls to 0.021 (from 0.030 and 0.035).

In [9]:
def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv2d_relu(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(AdaptiveConcatPool2d(1), Flatten()))
    return nn.Sequential(*layers)

In [10]:
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)


Total time: 00:27
epoch  train loss  valid loss  accuracy
0      0.079376    0.060103    0.979882  (00:05)
1      0.045153    0.045692    0.983317  (00:05)
2      0.034682    0.030597    0.987733  (00:05)
3      0.030254    0.026142    0.989205  (00:05)
4      0.027349    0.020678    0.992149  (00:05)



In [14]:
show_doc(Lambda, doc_string=False)


Lambda allows us to define functions and use them as layers in our networks inside a [Sequential](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential) object. 

So, for example, say we want to apply [log_softmax](https://pytorch.org/docs/stable/nn.html#torch.nn.functional.log_softmax) to our outputs and need to change the shape of our output batches first. We can add a layer that applies the necessary change in shape by calling:

`Lambda(lambda x: x.view(x.size(0),-1))`

Let's see an example of how the shape of our output can change when we add this layer.

In [62]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

model.cuda()

for xb, yb in data.train_dl:
    out = (model(*[xb]))
    print(out.size())
    break

torch.Size([64, 10, 1, 1])


In [37]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Lambda(lambda x: x.view(x.size(0),-1))
)

model.cuda()

for xb, yb in data.train_dl:
    out = (model(*[xb]))
    print(out.size())
    break

torch.Size([64, 10])
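
As a rough illustration of how such a layer can work, the sketch below wraps a plain function in an `nn.Module` so that it can sit inside `nn.Sequential`. The class name `LambdaSketch` is hypothetical and this is not necessarily how the library implements `Lambda`; it runs on the CPU with random data instead of the MNIST batches used above.

```python
import torch
import torch.nn as nn

class LambdaSketch(nn.Module):
    "Illustrative stand-in: turns a plain function into a layer."
    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)

head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    # Keep the batch dimension and collapse everything else.
    LambdaSketch(lambda x: x.view(x.size(0), -1)),
)
out = head(torch.randn(64, 10, 4, 4))
print(out.shape)  # torch.Size([64, 10])
```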


In [19]:
show_doc(Flatten)

#### <a id=Flatten></a>`Flatten`
> `Flatten`() -> `Tensor`


Flattens `x` to a single dimension, often used at the end of a model <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L21">[source]</a>

The function we built above is actually implemented in our library as [`Flatten`](/layers.html#Flatten). We can see that it returns the same size when we run it.

In [23]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Flatten(),
)

model.cuda()

for xb, yb in data.train_dl:
    out = (model(*[xb]))
    print(out.size())
    break

torch.Size([64, 10])


In [24]:
show_doc(PoolFlatten)


We can combine these two final layers ([AdaptiveAvgPool2d](https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveAvgPool2d) and [`Flatten`](/layers.html#Flatten)) by using [`PoolFlatten`](/layers.html#PoolFlatten).

In [23]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    PoolFlatten()
)

model.cuda()

for xb, yb in data.train_dl:
    out = (model(*[xb]))
    print(out.size())
    break

torch.Size([64, 10])
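
For comparison, the same pool-then-flatten head can be assembled from PyTorch's own modules. This sketch assumes a PyTorch version that provides `nn.Flatten`; the helper name `pool_flatten_sketch` is hypothetical, and the example uses a single convolution and random CPU data to stay short.

```python
import torch
import torch.nn as nn

def pool_flatten_sketch():
    "Illustrative stand-in: adaptive average pooling to 1x1, then flatten to (batch, channels)."
    return nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten())

model = nn.Sequential(
    nn.Conv2d(3, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    pool_flatten_sketch(),
)
out = model(torch.randn(64, 3, 28, 28))
print(out.shape)  # torch.Size([64, 10])
```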


In [20]:
show_doc(ResizeBatch)

#### <a id=ResizeBatch></a>`ResizeBatch`
> `ResizeBatch`(`size`:`int`) -> `Tensor`


Layer that resizes x to `size`, good for connecting mismatched layers <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L17">[source]</a>

Another use of the `Lambda` layer is resizing batches with [`ResizeBatch`](/layers.html#ResizeBatch) when a layer expects an input of a different shape than the previous layer produces. Let's see an example:

In [85]:
a = torch.tensor([[1., -1.], [1., -1.]])
print(a)

tensor([[ 1., -1.],
        [ 1., -1.]])


In [86]:
out = ResizeBatch(4)
print(out(a))

tensor([[ 1., -1.,  1., -1.]])
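
One way to get this behaviour is a layer that calls `view` with an inferred leading dimension. The sketch below assumes exactly that, which is consistent with the output shown above but is not confirmed to be the library's implementation; `ResizeBatchSketch` is a hypothetical name.

```python
import torch
import torch.nn as nn

class ResizeBatchSketch(nn.Module):
    "Illustrative stand-in: reshapes its input to (-1, *size)."
    def __init__(self, *size):
        super().__init__()
        self.size = size

    def forward(self, x):
        # -1 lets PyTorch infer the leading dimension from the element count.
        return x.view((-1,) + self.size)

a = torch.tensor([[1., -1.], [1., -1.]])
out = ResizeBatchSketch(4)(a)
print(out.shape)  # torch.Size([1, 4])
```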


In [21]:
show_doc(StdUpsample)

## <a id=StdUpsample></a>`class` `StdUpsample`
> `StdUpsample`(`n_in`:`int`, `n_out`:`int`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)


Increases the dimensionality of our data by applying a transposed convolution layer to the input and with batchnorm and a RELU activation <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L75">[source]</a>

In [22]:
show_doc(CrossEntropyFlat, doc_string=False)

## <a id=CrossEntropyFlat></a>`class` `CrossEntropyFlat`
> `CrossEntropyFlat`(`weight`=`None`, `size_average`=`None`, `ignore_index`=`-100`, `reduce`=`None`, `reduction`=`'elementwise_mean'`) :: [`CrossEntropyLoss`](https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss)
<a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L93">[source]</a>

Same as [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss), but flattens input and target. It is used to calculate cross entropy on arrays (which PyTorch will not let us do with its [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss) function). An example use case is image segmentation models, where the output is an image (or an array of pixels).

The parameters are the same as [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss): `weight` rescales each class, `size_average` sets whether the losses are averaged across the elements in a batch or summed, `ignore_index` gives the target value to ignore, `reduce` sets whether a loss is returned per batch element instead of being reduced, and `reduction` specifies which type of reduction (if any) to apply to the output.
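
The flattening step can be sketched as follows. This is an illustration under assumptions, not the library's code: the class name `CrossEntropyFlatSketch` is hypothetical, and it assumes the class dimension comes second, as in a typical segmentation output of shape `(batch, classes, height, width)`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossEntropyFlatSketch(nn.CrossEntropyLoss):
    "Illustrative stand-in: flattens everything after the class dimension."
    def forward(self, input, target):
        n, c = input.shape[:2]
        # (n, c, h, w) -> (n, c, h*w) and (n, h, w) -> (n, h*w)
        return super().forward(input.reshape(n, c, -1), target.reshape(n, -1))

torch.manual_seed(0)
logits = torch.randn(2, 3, 4, 4)         # batch of 2, 3 classes, 4x4 "image"
target = torch.randint(0, 3, (2, 4, 4))  # one class index per pixel
loss = CrossEntropyFlatSketch()(logits, target)

# Reference value: treat every pixel as a separate sample.
per_pixel = F.cross_entropy(logits.permute(0, 2, 3, 1).reshape(-1, 3), target.reshape(-1))
print(torch.allclose(loss, per_pixel))  # True
```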

In [34]:
show_doc(Debugger)


The `Debugger` module allows us to peek inside a network while it's training and see in detail what is going on. We can see inputs, outputs and sizes at any point in the network.

In [None]:
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    Debugger(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

model.cuda()

learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)
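
A non-interactive variant of the same idea is a pass-through layer that reports what flows through it. The sketch below swaps the debugger for simple shape logging; `ShapeLogger` is a hypothetical name, not part of the library, and the example runs on the CPU with random data.

```python
import torch
import torch.nn as nn

class ShapeLogger(nn.Module):
    "Hypothetical pass-through layer: records and prints the shape of its input."
    def __init__(self):
        super().__init__()
        self.seen = []

    def forward(self, x):
        self.seen.append(tuple(x.shape))
        print("activation shape:", tuple(x.shape))
        return x  # the data itself is left unchanged

logger = ShapeLogger()
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    logger,
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)
out = model(torch.randn(8, 3, 28, 28))
```

Since the layer returns its input untouched, it can be inserted at any point in a model and removed again without affecting training.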

In [24]:
show_doc(bn_drop_lin, doc_string=False)

#### <a id=bn_drop_lin></a>`bn_drop_lin`
> `bn_drop_lin`(`n_in`:`int`, `n_out`:`int`, `bn`:`bool`=`True`, `p`:`float`=`0.0`, `actn`:`Optional`\[[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)\]=`None`)
<a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L29">[source]</a>


The [`bn_drop_lin`](/layers.html#bn_drop_lin) function returns a sequence of [batch normalization](https://arxiv.org/abs/1502.03167), [dropout](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf) and a linear layer. This custom layer is usually used at the end of a model. 

`n_in` is the size of the input, `n_out` the size of the output, `bn` sets whether we want batch norm or not, `p` is how much dropout to apply, and `actn` is an optional parameter to add an activation function at the end.
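
The way these parameters combine can be sketched as below. The function name `bn_drop_lin_sketch` is hypothetical, and wrapping the layers in `nn.Sequential` and skipping dropout when `p` is zero are assumptions made for the illustration.

```python
import torch
import torch.nn as nn

def bn_drop_lin_sketch(n_in, n_out, bn=True, p=0.0, actn=None):
    "Illustrative stand-in: optional batch norm, optional dropout, linear, optional activation."
    layers = [nn.BatchNorm1d(n_in)] if bn else []
    if p != 0:
        layers.append(nn.Dropout(p))
    layers.append(nn.Linear(n_in, n_out))
    if actn is not None:
        layers.append(actn)
    return nn.Sequential(*layers)

head = bn_drop_lin_sketch(32, 10, bn=True, p=0.5, actn=nn.ReLU())
out = head(torch.randn(8, 32))
print(out.shape)  # torch.Size([8, 10])
```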

In [25]:
show_doc(conv2d)

#### <a id=conv2d></a>`conv2d`
> `conv2d`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`3`, `stride`:`int`=`1`, `padding`:`int`=`None`, `bias`=`False`) -> [`Conv2d`](https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d)


Create `nn.Conv2d` layer: `ni` inputs, `nf` outputs, `ks` kernel size. `padding` defaults to `ks//2` <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L37">[source]</a>
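
Padding by half the kernel size means that, for an odd kernel, a stride of 1 keeps the spatial size and a stride of 2 halves it (rounding up). The helper below is a hypothetical illustration that applies the standard convolution output-size formula, ignoring dilation.

```python
def conv_out_size(n, ks=3, stride=1, padding=None):
    "Spatial output size of a convolution over an input of size n."
    if padding is None:
        padding = ks // 2  # half the kernel size, as in the default described above
    return (n + 2 * padding - ks) // stride + 1

print(conv_out_size(28, ks=3, stride=1))  # 28
print(conv_out_size(28, ks=3, stride=2))  # 14
print(conv_out_size(14, ks=3, stride=2))  # 7
print(conv_out_size(7, ks=3, stride=2))   # 4
```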

In [26]:
show_doc(conv2d_relu)

#### <a id=conv2d_relu></a>`conv2d_relu`
> `conv2d_relu`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`3`, `stride`:`int`=`1`, `padding`:`int`=`None`, `bn`:`bool`=`False`) -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)


Create a [`conv2d`](/layers.html#conv2d) layer with `nn.ReLU` activation and optional(`bn`) `nn.BatchNorm2d`: `ni` input, `nf` out filters, `ks` kernel, `stride`:stride, `padding`:padding, `bn`: batch normalization <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L49">[source]</a>

In [27]:
show_doc(conv2d_trans)

#### <a id=conv2d_trans></a>`conv2d_trans`
> `conv2d_trans`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`2`, `stride`:`int`=`2`, `padding`:`int`=`0`) -> [`ConvTranspose2d`](https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d)


Create `nn.ConvTranspose2d` layer: `ni` inputs, `nf` outputs, `ks` kernel size, `stride`: stride. `padding` defaults to 0 <a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L56">[source]</a>

In [28]:
show_doc(conv_layer, doc_string=False)

#### <a id=conv_layer></a>`conv_layer`
> `conv_layer`(`ni`:`int`, `nf`:`int`, `ks`:`int`=`3`, `stride`:`int`=`1`) -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)
<a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L42">[source]</a>

The [`conv_layer`](/layers.html#conv_layer) function returns a sequence of [nn.Conv2d](https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d), [BatchNorm2d](https://arxiv.org/abs/1502.03167) and a [leaky RELU](https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf) activation function.

`ni` is the size of the input, `nf` the size of the output, `ks` the kernel size, and `stride` the stride with which we want to apply the convolutions.
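
A block of this kind could look like the sketch below. The function name `conv_layer_sketch` is hypothetical, and the padding of half the kernel size, the missing bias and the negative slope of 0.1 are assumptions chosen for the illustration.

```python
import torch
import torch.nn as nn

def conv_layer_sketch(ni, nf, ks=3, stride=1):
    "Illustrative stand-in: convolution, batch norm, then leaky ReLU."
    return nn.Sequential(
        nn.Conv2d(ni, nf, kernel_size=ks, stride=stride, padding=ks // 2, bias=False),
        nn.BatchNorm2d(nf),
        nn.LeakyReLU(negative_slope=0.1),  # slope value is an assumption
    )

out = conv_layer_sketch(3, 16, ks=3, stride=2)(torch.randn(4, 3, 28, 28))
print(out.shape)  # torch.Size([4, 16, 14, 14])
```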

In [29]:
show_doc(get_embedding, doc_string=False)

#### <a id=get_embedding></a>`get_embedding`
> `get_embedding`(`ni`:`int`, `nf`:`int`) -> [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)
<a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L115">[source]</a>

Creates an [embedding layer](https://arxiv.org/abs/1711.09160) with input size `ni` and output size `nf`.
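
As a small illustration of what an embedding layer with these sizes does, the sketch below uses PyTorch's `nn.Embedding` directly. Any custom weight initialization the library may apply is left out, and the sizes are arbitrary example values.

```python
import torch
import torch.nn as nn

# ni=10 possible categories, each mapped to a vector of nf=4 learned features.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

idx = torch.tensor([[0, 3, 9], [2, 2, 7]])  # integer category indices
vectors = emb(idx)
print(vectors.shape)  # torch.Size([2, 3, 4])
```

Equal indices always map to the same vector, which is what lets the model learn one representation per category.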

In [30]:
show_doc(simple_cnn)

#### <a id=simple_cnn></a>`simple_cnn`
> `simple_cnn`(`actns`:`Collection`\[`int`\], `kernel_szs`:`Collection`\[`int`\]=`None`, `strides`:`Collection`\[`int`\]=`None`) -> [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)


CNN with [`conv2d_relu`](/layers.html#conv2d_relu) layers defined by `actns`, `kernel_szs` and `strides` 

In [31]:
show_doc(std_upsample_head, doc_string=False)

#### <a id=std_upsample_head></a>`std_upsample_head`
> `std_upsample_head`(`c`, `nfs`:`Collection`\[`int`\]) -> [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)
<a href="https://github.com/fastai/fastai/blob/master/fastai/layers.py#L85">[source]</a>

Creates a sequence of upsample layers with a RELU at the beginning and a [nn.ConvTranspose2d](https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d) at the end.

`nfs` is a list with the input and output sizes of each upsample layer and `c` is the output size of the final 2D Transpose Convolutional layer.
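
The structure described here can be sketched as follows. All names are hypothetical, and the sketch assumes that consecutive entries of `nfs` give the input and output sizes of each upsample block and that every transposed convolution uses a kernel size and stride of 2, so each one doubles the height and width.

```python
import torch
import torch.nn as nn

def upsample_block(n_in, n_out):
    "Hypothetical block: transposed conv that doubles height and width, then ReLU and batch norm."
    return nn.Sequential(
        nn.ConvTranspose2d(n_in, n_out, kernel_size=2, stride=2),
        nn.ReLU(),
        nn.BatchNorm2d(n_out),
    )

def upsample_head_sketch(c, nfs):
    "Illustrative stand-in: ReLU, one upsample block per pair in nfs, final transposed conv to c channels."
    blocks = [upsample_block(nfs[i], nfs[i + 1]) for i in range(len(nfs) - 1)]
    final = nn.ConvTranspose2d(nfs[-1], c, kernel_size=2, stride=2)
    return nn.Sequential(nn.ReLU(), *blocks, final)

head = upsample_head_sketch(c=2, nfs=[64, 32, 16])
out = head(torch.randn(1, 64, 7, 7))
print(out.shape)  # torch.Size([1, 2, 56, 56])
```

With two upsample blocks and the final layer, the 7x7 input is doubled three times, giving 56x56.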

In [None]:
show_doc(trunc_normal_)