# Keras Convolutional Layers


A complete rundown of the Keras convolutional layers! These notes focus on the layers and arguments that I have not used before or that I find confusing.


* Reference
   * Keras > Documentation > LAYERS > [Convolutional Layers](https://keras.io/layers/convolutional/)


※ All input/output shapes described below assume `data_format='channels_last'`.


Arguments


* `filters`: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
* `kernel_size`
   * 1D: An integer or tuple/list of a single integer, specifying the length of the 1D convolution window.
   * 2D: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
* `padding`: One of "valid", "causal" or "same" (case-insensitive).
   * `valid`: **No padding**
   * `same`: **Zero padding** (with `strides=1`, the output has the same length as the original input.)
   * `causal`: **Causal (dilated) convolutions**, e.g. `output[t]` does not depend on `input[t + 1:]`.
      * For 1D convolution only.
      * Through zero padding, the output keeps the same length as the input.
* `dilation_rate`
   * 1D: An integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. 
   * 2D: An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution.
   * Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
* `initializer`: omitted
* `regularizer`: omitted
* `constraint`: omitted
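
To make the `causal` option concrete, here is a minimal pure-Python sketch of a causal (optionally dilated) 1D convolution over a single channel. The helper name `causal_conv1d` is hypothetical and not part of Keras. It only illustrates the idea that padding `(kernel_size - 1) * dilation_rate` zeros on the left keeps the output as long as the input and stops `output[t]` from seeing `input[t + 1:]`.

```python
def causal_conv1d(x, kernel, dilation_rate=1):
    """Single-channel causal 1D convolution (illustrative helper, not a Keras API)."""
    k = len(kernel)
    # Pad only on the left so that no output step can see future inputs.
    pad = (k - 1) * dilation_rate
    padded = [0.0] * pad + list(x)
    return [
        sum(kernel[j] * padded[t + j * dilation_rate] for j in range(k))
        for t in range(len(x))
    ]


x = [1.0, 2.0, 3.0, 4.0]
print(causal_conv1d(x, kernel=[1.0, 1.0, 1.0]))  # [1.0, 3.0, 6.0, 9.0]

# Changing a future input leaves the earlier outputs untouched.
x_future = [1.0, 2.0, 3.0, 100.0]
print(causal_conv1d(x_future, kernel=[1.0, 1.0, 1.0])[:3])  # [1.0, 3.0, 6.0]
```

The output has one value per input step, which matches the `(?, 16, 4)` shape that `padding='causal'` produces for a 16-step input.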

In [1]:
from keras import layers

In [1]:
inputs = layers.Input(shape=(16, 3))
conv = layers.Conv1D(4, 3, padding='causal')(inputs)

conv

Using TensorFlow backend.


<tf.Tensor 'conv1d_1/add:0' shape=(?, 16, 4) dtype=float32>

## Convolution


Output volume (for the 2D case):


$$
{ W }_{ out }=\left\lfloor \frac { { W }_{ in }-F+2P }{ S } \right\rfloor +1\quad ,\quad { H }_{ out }=\left\lfloor \frac { { H }_{ in }-F+2P }{ S } \right\rfloor +1\quad ,\quad { D }_{ out }=K
$$


* $W$ : width (columns)
* $H$ : height (rows)
* $D$ : depth (channel)
* $F$ : the size of kernels (filters)
* $K$ : the number of kernels (filters)
* $S$ : stride
* $P$ : padding
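
The formula above can be checked with a few lines of Python. This is a small sketch, assuming $P$ is the number of zeros added on each side of an axis and that the division rounds down; `conv_output_size` is a hypothetical helper name, not a Keras function.

```python
def conv_output_size(size_in, kernel_size, stride=1, padding=0):
    """Output size along one spatial axis: floor((in - F + 2P) / S) + 1."""
    return (size_in - kernel_size + 2 * padding) // stride + 1


print(conv_output_size(64, 3))             # 62: no padding ('valid')
print(conv_output_size(64, 3, padding=1))  # 64: P = 1 keeps the size for F = 3, S = 1
print(conv_output_size(64, 3, stride=2))   # 31: a stride of 2 roughly halves the size
```

The output depth does not appear in the helper because it is simply the number of filters $K$.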


### Conv1D


```python
keras.layers.Conv1D(filters,
                    kernel_size,
                    strides=1, padding='valid', data_format='channels_last',
                    dilation_rate=1, activation=None, use_bias=True,
                    kernel_initializer='glorot_uniform', bias_initializer='zeros',
                    kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                    kernel_constraint=None, bias_constraint=None)
```


1D convolution layer (e.g. temporal convolution).


Input shape: `(batch, steps, channels)`


Output shape: `(batch, new_steps, filters)`


### Conv2D


```python
keras.layers.Conv2D(filters,
                    kernel_size,
                    strides=(1, 1), padding='valid', data_format=None,
                    dilation_rate=(1, 1), activation=None, use_bias=True,
                    kernel_initializer='glorot_uniform', bias_initializer='zeros',
                    kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                    kernel_constraint=None, bias_constraint=None)
```


2D convolution layer (e.g. spatial convolution over images).


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, new_rows, new_cols, filters)`


### Conv3D


```python
keras.layers.Conv3D(filters,
                    kernel_size,
                    strides=(1, 1, 1), padding='valid', data_format=None,
                    dilation_rate=(1, 1, 1), activation=None, use_bias=True,
                    kernel_initializer='glorot_uniform', bias_initializer='zeros',
                    kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                    kernel_constraint=None, bias_constraint=None)
```


3D convolution layer (e.g. spatial convolution over volumes).


Input shape: `(batch, conv_dim1, conv_dim2, conv_dim3, channels)`


Output shape: `(batch, new_conv_dim1, new_conv_dim2, new_conv_dim3, filters)`

## Depthwise Separable Convolution


Separable convolutions consist in first performing a **depthwise spatial convolution** (which acts on each input channel separately) followed by a **pointwise convolution** which mixes together the resulting output channels.


Arguments


* `depth_multiplier`: The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to `filters_in * depth_multiplier`.
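
One practical reason to use separable convolutions is the smaller parameter count. The sketch below compares a regular 2D convolution with a depthwise separable one. The helper names are hypothetical, and the count assumes a single bias vector applied after the pointwise step, which is my understanding of how the Keras layer is parameterized.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Weights of a regular 2D convolution."""
    kh, kw = kernel_size
    return kh * kw * in_channels * filters + (filters if use_bias else 0)


def separable_conv2d_params(kernel_size, in_channels, filters,
                            depth_multiplier=1, use_bias=True):
    """Weights of a depthwise convolution followed by a 1x1 pointwise convolution."""
    kh, kw = kernel_size
    depthwise_channels = in_channels * depth_multiplier
    depthwise = kh * kw * depthwise_channels   # one spatial kernel per depthwise channel
    pointwise = depthwise_channels * filters   # 1x1 convolution that mixes the channels
    return depthwise + pointwise + (filters if use_bias else 0)


print(conv2d_params((3, 3), 32, 64))                                # 18496
print(separable_conv2d_params((3, 3), 32, 64))                      # 2400
print(separable_conv2d_params((3, 3), 32, 64, depth_multiplier=2))  # 4736
```

With `depth_multiplier=2` the depthwise step emits `32 * 2 = 64` channels, so both the depthwise and the pointwise kernels grow accordingly.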


### SeparableConv1D


```python
keras.layers.SeparableConv1D(filters,
                             kernel_size,
                             strides=1, padding='valid', data_format='channels_last',
                             dilation_rate=1, depth_multiplier=1, activation=None, use_bias=True,
                             depthwise_initializer='glorot_uniform', pointwise_initializer='glorot_uniform', bias_initializer='zeros',
                             depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                             depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None)
```


Depthwise separable 1D convolution with pointwise convolution.


Input shape: `(batch, steps, channels)`


Output shape: `(batch, new_steps, filters)`


### SeparableConv2D


```python
keras.layers.SeparableConv2D(filters,
                             kernel_size,
                             strides=(1, 1), padding='valid', data_format=None,
                             dilation_rate=(1, 1), depth_multiplier=1, activation=None, use_bias=True,
                             depthwise_initializer='glorot_uniform', pointwise_initializer='glorot_uniform', bias_initializer='zeros',
                             depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                             depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None)
```


Depthwise separable 2D convolution with pointwise convolution.


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, new_rows, new_cols, filters)`


### DepthwiseConv2D


```python
keras.layers.DepthwiseConv2D(kernel_size,
                             strides=(1, 1), padding='valid', data_format=None,
                             depth_multiplier=1, activation=None, use_bias=True,
                             depthwise_initializer='glorot_uniform', bias_initializer='zeros',
                             depthwise_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                             depthwise_constraint=None, bias_constraint=None)
```


Depthwise 2D convolution.


A depthwise convolution performs just the first step of a depthwise separable convolution (the step that acts on each input channel separately).


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, new_rows, new_cols, channels * depth_multiplier)`

In [2]:
inputs = layers.Input(shape=(64, 64, 3))
conv = layers.DepthwiseConv2D(3, padding='same')(inputs)

conv

<tf.Tensor 'depthwise_conv2d_1/BiasAdd:0' shape=(?, 64, 64, 3) dtype=float32>

## Transposed Convolution


The need for transposed convolutions generally arises from the desire to use a transformation going in **the opposite direction of a normal convolution**. Unlike an up-sampling layer, a transposed convolution has weights, so it can be trained.


* Reference
   * [Up-sampling with Transposed Convolution](https://towardsdatascience.com/up-sampling-with-transposed-convolution-9ae4f2df52d0)
   * [Is the Transposed Convolution layer and Convolution layer the same thing? Experimenting with concepts using PyTorch.](https://towardsdatascience.com/is-the-transposed-convolution-layer-and-convolution-layer-the-same-thing-8655b751c3a1)


Output volume (for the 2D case):


$$
{ W }_{ out }=S({ W }_{ in }-1)+F-2P\quad ,\quad { H }_{ out }=S({ H }_{ in }-1)+F-2P\quad ,\quad { D }_{ out }=K
$$


* $W$ : width (columns)
* $H$ : height (rows)
* $D$ : depth (channel)
* $F$ : the size of kernels (filters)
* $K$ : the number of kernels (filters)
* $S$ : stride
* $P$ : padding
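
As a quick sanity check, the sketch below evaluates this formula and confirms that it undoes the size change of a regular convolution with the same settings. Both helper names are hypothetical; $P$ is assumed to be the per-side padding of the corresponding forward convolution.

```python
def conv_output_size(size_in, kernel_size, stride=1, padding=0):
    """Regular convolution: floor((in - F + 2P) / S) + 1."""
    return (size_in - kernel_size + 2 * padding) // stride + 1


def conv_transpose_output_size(size_in, kernel_size, stride=1, padding=0):
    """Transposed convolution: S * (in - 1) + F - 2P."""
    return stride * (size_in - 1) + kernel_size - 2 * padding


up = conv_transpose_output_size(32, kernel_size=4, stride=2, padding=1)
print(up)                                                        # 64
print(conv_output_size(up, kernel_size=4, stride=2, padding=1))  # 32
```

Only the spatial size is inverted here. The values are not recovered, because a transposed convolution is not the mathematical inverse of a convolution.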


Arguments


* `output_padding`
   * 2D: An integer or tuple/list of 2 integers, specifying the amount of padding along the height and width of the output tensor.
   * 3D: An integer or tuple/list of 3 integers, specifying the amount of padding along the depth, height, and width.
   * Can be a single integer to specify the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to None (default), the output shape is inferred.
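
The reason `output_padding` exists, and why it must stay below the stride, becomes clearer with numbers: a strided convolution maps several input sizes to the same output size, so the transposed direction is ambiguous. The sketch below assumes the explicit `output_padding` is simply added to the transposed-convolution size. The helper names are hypothetical, and this does not try to reproduce how Keras infers the shape when `output_padding=None`.

```python
def conv_output_size(size_in, kernel_size, stride=1, padding=0):
    """Regular convolution: floor((in - F + 2P) / S) + 1."""
    return (size_in - kernel_size + 2 * padding) // stride + 1


def conv_transpose_output_size(size_in, kernel_size, stride=1, padding=0,
                               output_padding=0):
    """Transposed convolution size with an explicit output padding (assumed additive)."""
    if not 0 <= output_padding < stride:
        raise ValueError("output_padding must be lower than the stride")
    return stride * (size_in - 1) + kernel_size - 2 * padding + output_padding


# Two different input sizes collapse to the same output size when S = 2.
print(conv_output_size(7, 3, stride=2, padding=1))  # 4
print(conv_output_size(8, 3, stride=2, padding=1))  # 4

# output_padding selects which of the two sizes the transposed layer restores.
print(conv_transpose_output_size(4, 3, stride=2, padding=1))                    # 7
print(conv_transpose_output_size(4, 3, stride=2, padding=1, output_padding=1))  # 8
```

With a stride of `S`, exactly `S` consecutive input sizes share one output size, which is why values from `0` to `S - 1` are enough to disambiguate.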


### Conv2DTranspose


```python
keras.layers.Conv2DTranspose(filters,
                             kernel_size,
                             strides=(1, 1), padding='valid', output_padding=None, data_format=None,
                             dilation_rate=(1, 1), activation=None, use_bias=True,
                             kernel_initializer='glorot_uniform', bias_initializer='zeros',
                             kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                             kernel_constraint=None, bias_constraint=None)
```


Transposed convolution layer (sometimes called Deconvolution).


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, new_rows, new_cols, filters)`


### Conv3DTranspose


```python
keras.layers.Conv3DTranspose(filters,
                             kernel_size,
                             strides=(1, 1, 1), padding='valid', output_padding=None, data_format=None,
                             activation=None, use_bias=True,
                             kernel_initializer='glorot_uniform', bias_initializer='zeros',
                             kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                             kernel_constraint=None, bias_constraint=None)
```


Transposed convolution layer (sometimes called Deconvolution).


Input shape: `(batch, depth, rows, cols, channels)`


Output shape: `(batch, new_depth, new_rows, new_cols, filters)`

## Cropping


Arguments


* `cropping`
   * 1D: int or tuple of int (length 2).
   * 2D: int, or tuple of 2 ints, or tuple of 2 tuples of 2 ints.
   * 3D: int, or tuple of 3 ints, or tuple of 3 tuples of 2 ints.
   * How many units should be trimmed off at the beginning and end of the cropping dimension. If a single int is provided, the same value will be used for both.
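
The three accepted forms of the 2D argument can be confusing, so here is a small sketch that expands each form to `((top, bottom), (left, right))` and computes the cropped size. `normalize_cropping_2d` and `cropped_shape_2d` are hypothetical helpers for illustration, not Keras functions.

```python
def normalize_cropping_2d(cropping):
    """Expand an int, (h, w), or ((top, bottom), (left, right)) to the full form."""
    if isinstance(cropping, int):
        return ((cropping, cropping), (cropping, cropping))
    height, width = cropping
    if isinstance(height, int):
        height = (height, height)  # symmetric crop along the rows
    if isinstance(width, int):
        width = (width, width)     # symmetric crop along the columns
    return (tuple(height), tuple(width))


def cropped_shape_2d(rows, cols, cropping):
    """Spatial shape that remains after cropping."""
    (top, bottom), (left, right) = normalize_cropping_2d(cropping)
    return (rows - top - bottom, cols - left - right)


print(cropped_shape_2d(64, 64, 2))                 # (60, 60)
print(cropped_shape_2d(64, 64, (1, 2)))            # (62, 60)
print(cropped_shape_2d(64, 64, ((1, 2), (3, 4))))  # (61, 57)
```

Note that `(1, 2)` means "1 from both ends of the rows and 2 from both ends of the columns", not "1 at the start and 2 at the end". The latter reading only applies to the 1D layer.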


### Cropping1D


```python
keras.layers.Cropping1D(cropping=(1, 1))
```


Cropping layer for 1D input (e.g. temporal sequence).


Input shape: `(batch, steps, channels)`


Output shape: `(batch, cropped_steps, channels)`


### Cropping2D


```python
keras.layers.Cropping2D(cropping=((0, 0), (0, 0)), data_format=None)
```


Cropping layer for 2D input (e.g. picture).


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, cropped_rows, cropped_cols, channels)`


### Cropping3D


```python
keras.layers.Cropping3D(cropping=((1, 1), (1, 1), (1, 1)), data_format=None)
```


Cropping layer for 3D data (e.g. spatial or spatio-temporal).


Input shape: `(batch, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop, channels)`


Output shape: `(batch, first_cropped_axis, second_cropped_axis, third_cropped_axis, channels)`

In [3]:
inputs = layers.Input(shape=(16, 3))
crop = layers.Cropping1D(cropping=3)(inputs)

crop

<tf.Tensor 'cropping1d_1/strided_slice:0' shape=(?, 10, 3) dtype=float32>

In [4]:
inputs = layers.Input(shape=(64, 3))
crop = layers.Cropping1D(cropping=(3, 3))(inputs)

crop

<tf.Tensor 'cropping1d_2/strided_slice:0' shape=(?, 58, 3) dtype=float32>

In [5]:
inputs = layers.Input(shape=(64, 3))
crop = layers.Cropping1D(cropping=(1, 2))(inputs)

crop

<tf.Tensor 'cropping1d_3/strided_slice:0' shape=(?, 61, 3) dtype=float32>

In [6]:
inputs = layers.Input(shape=(64, 64, 3))
crop = layers.Cropping2D(cropping=(1, 2))(inputs)

crop

<tf.Tensor 'cropping2d_1/strided_slice:0' shape=(?, 62, 60, 3) dtype=float32>

In [7]:
inputs = layers.Input(shape=(64, 64, 3))
crop = layers.Cropping2D(cropping=((1, 2), (3, 4)))(inputs)

crop

<tf.Tensor 'cropping2d_2/strided_slice:0' shape=(?, 61, 57, 3) dtype=float32>

In [8]:
inputs = layers.Input(shape=(64, 64, 64, 3))
crop = layers.Cropping3D(cropping=(1, 2, 3))(inputs)

crop

<tf.Tensor 'cropping3d_1/strided_slice:0' shape=(?, 62, 60, 58, 3) dtype=float32>

## Up Sampling


These layers up-sample by interpolation.


Arguments


* `size`: Upsampling factors.
   * 1D: integer.
   * 2D: int, or tuple of 2 integers. The upsampling factors for rows and columns.
   * 3D: int, or tuple of 3 integers. The upsampling factors for dim1, dim2 and dim3.


### UpSampling1D


```python
keras.layers.UpSampling1D(size=2)
```


Upsampling layer for 1D inputs.


Repeats each temporal step `size` times along the time axis.
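
The repetition can be sketched in plain Python on a single-channel sequence. `upsample_1d` is a hypothetical helper that only mirrors the behaviour described above.

```python
def upsample_1d(steps, size=2):
    """Repeat every time step `size` times (illustrative, single channel)."""
    return [step for step in steps for _ in range(size)]


print(upsample_1d([1, 2, 3], size=2))  # [1, 1, 2, 2, 3, 3]
```

The output length is `len(steps) * size`, so a 64-step input becomes 128 steps with `size=2`.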


Input shape: `(batch, steps, channels)`


Output shape: `(batch, upsampled_steps, channels)`


### UpSampling2D


```python
keras.layers.UpSampling2D(size=(2, 2), data_format=None, interpolation='nearest')
```


Upsampling layer for 2D inputs.


Repeats the rows and columns of the data by `size[0]` and `size[1]` respectively.


Arguments


* `interpolation`: A string, one of `'nearest'` or `'bilinear'`.


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, upsampled_rows, upsampled_cols, channels)`
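
For `interpolation='nearest'`, the 2D case amounts to repeating rows and columns. Below is a minimal sketch on a nested list that stands in for a single-channel image. `upsample_2d_nearest` is a hypothetical helper, and bilinear interpolation is not covered.

```python
def upsample_2d_nearest(image, size=(2, 2)):
    """Repeat rows size[0] times and columns size[1] times."""
    row_factor, col_factor = size
    out = []
    for row in image:
        # Repeat each value along the column axis first.
        wide_row = [value for value in row for _ in range(col_factor)]
        # Then repeat the widened row along the row axis.
        out.extend(list(wide_row) for _ in range(row_factor))
    return out


image = [[1, 2],
         [3, 4]]
for row in upsample_2d_nearest(image, size=(1, 2)):
    print(row)
# [1, 1, 2, 2]
# [3, 3, 4, 4]
```

With `size=(1, 2)` only the columns double, which matches how a `(64, 64)` input becomes `(64, 128)`.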


### UpSampling3D


```python
keras.layers.UpSampling3D(size=(2, 2, 2), data_format=None)
```


Upsampling layer for 3D inputs.


Input shape: `(batch, dim1, dim2, dim3, channels)`


Output shape: `(batch, upsampled_dim1, upsampled_dim2, upsampled_dim3, channels)`

In [9]:
inputs = layers.Input(shape=(64, 3))
up = layers.UpSampling1D(size=2)(inputs)

up

<tf.Tensor 'up_sampling1d_1/concat:0' shape=(?, 128, 3) dtype=float32>

In [10]:
inputs = layers.Input(shape=(64, 64, 3))
up = layers.UpSampling2D(size=2)(inputs)

up

<tf.Tensor 'up_sampling2d_1/ResizeNearestNeighbor:0' shape=(?, 128, 128, 3) dtype=float32>

In [11]:
inputs = layers.Input(shape=(64, 64, 3))
up = layers.UpSampling2D(size=(1, 2))(inputs)

up

<tf.Tensor 'up_sampling2d_2/ResizeNearestNeighbor:0' shape=(?, 64, 128, 3) dtype=float32>

In [12]:
inputs = layers.Input(shape=(64, 64, 64, 3))
up = layers.UpSampling3D(size=(1, 2, 3))(inputs)

up

<tf.Tensor 'up_sampling3d_1/concat_2:0' shape=(?, 64, 128, 192, 3) dtype=float32>

## Padding


Arguments


* `padding`
   * 1D: int, or tuple of int (length 2), or dictionary.
   * 2D: int, or tuple of 2 ints, or tuple of 2 tuples of 2 ints.
   * 3D: int, or tuple of 3 ints, or tuple of 3 tuples of 2 ints.
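
The padding forms follow the same pattern as cropping, except that sizes grow instead of shrink. Here is a short sketch with hypothetical helpers (`zero_pad_1d`, `padded_shape_2d`) that are not part of Keras. The dictionary form of the 1D argument is not covered.

```python
def zero_pad_1d(steps, padding=1):
    """Pad a single-channel sequence with zeros: int or (left_pad, right_pad)."""
    left, right = (padding, padding) if isinstance(padding, int) else padding
    return [0] * left + list(steps) + [0] * right


def padded_shape_2d(rows, cols, padding):
    """Spatial shape after padding: int, (h, w), or ((top, bottom), (left, right))."""
    if isinstance(padding, int):
        padding = (padding, padding)
    height, width = padding
    if isinstance(height, int):
        height = (height, height)  # symmetric padding along the rows
    if isinstance(width, int):
        width = (width, width)     # symmetric padding along the columns
    return (rows + sum(height), cols + sum(width))


print(zero_pad_1d([1, 2, 3], padding=(1, 2)))     # [0, 1, 2, 3, 0, 0]
print(padded_shape_2d(64, 64, 2))                 # (68, 68)
print(padded_shape_2d(64, 64, (1, 2)))            # (66, 68)
print(padded_shape_2d(64, 64, ((1, 2), (3, 4))))  # (67, 71)
```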


### ZeroPadding1D


```python
keras.layers.ZeroPadding1D(padding=1)
```


Zero-padding layer for 1D input (e.g. temporal sequence).


Input shape: `(batch, steps, channels)`


Output shape: `(batch, padded_steps, channels)`


### ZeroPadding2D


```python
keras.layers.ZeroPadding2D(padding=(1, 1), data_format=None)
```


Zero-padding layer for 2D input (e.g. picture).


Input shape: `(batch, rows, cols, channels)`


Output shape: `(batch, padded_rows, padded_cols, channels)`


### ZeroPadding3D


```python
keras.layers.ZeroPadding3D(padding=(1, 1, 1), data_format=None)
```


Zero-padding layer for 3D data (spatial or spatio-temporal).


Input shape: `(batch, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad, channels)`


Output shape: `(batch, first_padded_axis, second_padded_axis, third_padded_axis, channels)`

In [13]:
inputs = layers.Input(shape=(64, 3))
pad = layers.ZeroPadding1D(padding=2)(inputs)

pad

<tf.Tensor 'zero_padding1d_1/Pad:0' shape=(?, 68, 3) dtype=float32>

In [14]:
inputs = layers.Input(shape=(64, 64, 3))
pad = layers.ZeroPadding2D(padding=2)(inputs)

pad

<tf.Tensor 'zero_padding2d_1/Pad:0' shape=(?, 68, 68, 3) dtype=float32>

In [15]:
inputs = layers.Input(shape=(64, 64, 3))
pad = layers.ZeroPadding2D(padding=(1, 2))(inputs)

pad

<tf.Tensor 'zero_padding2d_2/Pad:0' shape=(?, 66, 68, 3) dtype=float32>

In [16]:
inputs = layers.Input(shape=(64, 64, 3))
pad = layers.ZeroPadding2D(padding=((1, 2), (3, 4)))(inputs)

pad

<tf.Tensor 'zero_padding2d_3/Pad:0' shape=(?, 67, 71, 3) dtype=float32>

In [17]:
inputs = layers.Input(shape=(64, 64, 64, 3))
pad = layers.ZeroPadding3D(padding=2)(inputs)

pad

<tf.Tensor 'zero_padding3d_1/Pad:0' shape=(?, 68, 68, 68, 3) dtype=float32>