In [15]:
import tensorflow as tf
import numpy as np

## tf.keras.layers.Conv1D
* reference:
   
    https://blog.csdn.net/yinizhilianlove/article/details/127129520

    https://blog.csdn.net/weixin_39910711/article/details/124678538
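* Conv1D: a standard one-dimensional convolution, commonly used for text. Number of kernel parameters = input channels × kernel size (e.g. 3) × number of filters; with the default `use_bias=True`, add one bias per filter.
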
```
    1D convolution layer (e.g. temporal convolution).

    This layer creates a convolution kernel that is convolved with the layer
    input over a single spatial (or temporal) dimension to produce a tensor of
    outputs. If `use_bias` is True, a bias vector is created and added to the
    outputs. Finally, if `activation` is not `None`, it is applied to the
    outputs as well.

    Args:
        filters: int, the dimension of the output space (the number of filters
            in the convolution).
        kernel_size: int or tuple/list of 1 integer, specifying the size of the
            convolution window.
        strides: int or tuple/list of 1 integer, specifying the stride length
            of the convolution. `strides > 1` is incompatible with
            `dilation_rate > 1`.
        padding: string, `"valid"`, `"same"` or `"causal"` (case-insensitive).
            `"valid"` means no padding. `"same"` results in padding evenly to
            the left/right or up/down of the input. When `padding="same"` and
            `strides=1`, the output has the same size as the input.
            `"causal"` results in causal (dilated) convolutions, e.g. `output[t]`
            does not depend on `input[t+1:]`. Useful when modeling temporal data
            where the model should not violate the temporal order.
            See [WaveNet: A Generative Model for Raw Audio, section 2.1](
            https://arxiv.org/abs/1609.03499).
        data_format: string, either `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs. `"channels_last"`
            corresponds to inputs with shape `(batch, steps, features)`
            while `"channels_first"` corresponds to inputs with shape
            `(batch, features, steps)`. It defaults to the `image_data_format`
            value found in your Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be `"channels_last"`.
        dilation_rate: int or tuple/list of 1 integer, specifying the dilation
            rate to use for dilated convolution.
        groups: A positive int specifying the number of groups in which the
            input is split along the channel axis. Each group is convolved
            separately with `filters // groups` filters. The output is the
            concatenation of all the `groups` results along the channel axis.
            Input channels and `filters` must both be divisible by `groups`.
        activation: Activation function. If `None`, no activation is applied.
        use_bias: bool, if `True`, bias will be added to the output.
        kernel_initializer: Initializer for the convolution kernel. If `None`,
            the default initializer (`"glorot_uniform"`) will be used.
        bias_initializer: Initializer for the bias vector. If `None`, the
            default initializer (`"zeros"`) will be used.
        kernel_regularizer: Optional regularizer for the convolution kernel.
        bias_regularizer: Optional regularizer for the bias vector.
        activity_regularizer: Optional regularizer function for the output.
        kernel_constraint: Optional projection function to be applied to the
            kernel after being updated by an `Optimizer` (e.g. used to implement
            norm constraints or value constraints for layer weights). The
            function must take as input the unprojected variable and must return
            the projected variable (which must have the same shape). Constraints
            are not safe to use when doing asynchronous distributed training.
        bias_constraint: Optional projection function to be applied to the
            bias after being updated by an `Optimizer`.

    Input shape:
    - If `data_format="channels_last"`:
        A 3D tensor with shape: `(batch_shape, steps, channels)`
    - If `data_format="channels_first"`:
        A 3D tensor with shape: `(batch_shape, channels, steps)`

    Output shape:
    - If `data_format="channels_last"`:
        A 3D tensor with shape: `(batch_shape, new_steps, filters)`
    - If `data_format="channels_first"`:
        A 3D tensor with shape: `(batch_shape, filters, new_steps)`

    Returns:
        A 3D tensor representing `activation(conv1d(inputs, kernel) + bias)`.

    Raises:
        ValueError: when both `strides > 1` and `dilation_rate > 1`.

```
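
The parameter-count rule from the notes above and the `padding` modes described in the docstring come down to a little arithmetic. The sketch below is plain Python; `conv1d_param_count` and `conv1d_output_length` are hypothetical helper names introduced here for illustration, not Keras functions, and the count assumes `groups=1`.

```python
import math

def conv1d_param_count(in_channels, kernel_size, filters, use_bias=True):
    """Weights in a Conv1D layer with groups=1: kernel plus optional bias."""
    kernel_params = in_channels * kernel_size * filters
    return kernel_params + (filters if use_bias else 0)

def conv1d_output_length(steps, kernel_size, strides=1, dilation_rate=1, padding="valid"):
    """Number of output steps for each padding mode."""
    # A dilated kernel covers dilation_rate * (kernel_size - 1) + 1 input steps.
    effective_kernel = dilation_rate * (kernel_size - 1) + 1
    padding = padding.lower()
    if padding == "valid":
        return (steps - effective_kernel) // strides + 1
    if padding in ("same", "causal"):
        # "same" pads both sides, "causal" pads only the left; the length is identical.
        return math.ceil(steps / strides)
    raise ValueError(f"unknown padding: {padding}")

print(conv1d_param_count(32, 3, 32))                                    # 32*3*32 + 32 = 3104
print(conv1d_output_length(11, 3, dilation_rate=3, padding="causal"))   # 11
print(conv1d_output_length(11, 3, dilation_rate=3, padding="valid"))    # 11 - 7 + 1 = 5
```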

In [16]:
x = np.random.rand(4, 11, 32)  # (batch, steps, channels)
conv1d = tf.keras.layers.Conv1D(filters=32, kernel_size=3, dilation_rate=3, padding="causal")
y = conv1d(x)
print(y.shape)  # causal padding keeps all 11 time steps

(4, 11, 32)
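
The `"causal"` setting used in the cell above pads only on the left, so `output[t]` never sees `input[t+1:]`. A minimal single-channel NumPy sketch makes that property visible by perturbing future inputs and checking that earlier outputs stay put. `causal_conv1d` is a hypothetical helper written for this note, not the Keras implementation.

```python
import numpy as np

def causal_conv1d(x, kernel, dilation_rate=1):
    """Single-channel causal dilated convolution (cross-correlation, no kernel flip)."""
    k = len(kernel)
    pad = dilation_rate * (k - 1)            # zeros are added on the left only
    xp = np.concatenate([np.zeros(pad), x])
    out = np.empty(len(x))
    for t in range(len(x)):
        # Padded positions t, t+d, ..., t+pad map to original steps t-pad, ..., t.
        taps = xp[t : t + pad + 1 : dilation_rate]
        out[t] = np.dot(taps, kernel)
    return out

rng = np.random.default_rng(0)
x = rng.random(11)
kernel = rng.random(3)

y = causal_conv1d(x, kernel, dilation_rate=3)

x_future_changed = x.copy()
x_future_changed[6:] += 1.0                  # modify steps 6..10 only
y_changed = causal_conv1d(x_future_changed, kernel, dilation_rate=3)

print(y.shape)                               # (11,)
print(np.allclose(y[:6], y_changed[:6]))     # True: steps 0..5 never saw the change
```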


In [17]:
# https://zhuanlan.zhihu.com/p/648890779?utm_id=0
inputs = tf.keras.layers.Input(shape=(10, 5), name="inputs")  # symbolic input: 10 steps, 5 features per step


## tf.keras.layers.Conv2D
    This layer creates a convolution kernel that is convolved with the layer
    input over two spatial dimensions (height and width) to produce a tensor of
    outputs. If `use_bias` is True, a bias vector is created and added to the
    outputs. Finally, if `activation` is not `None`, it is applied to the
    outputs as well.

    Args:
        filters: int, the dimension of the output space (the number of filters
            in the convolution).
        kernel_size: int or tuple/list of 2 integers, specifying the size of the
            convolution window.
        strides: int or tuple/list of 2 integers, specifying the stride length
            of the convolution. `strides > 1` is incompatible with
            `dilation_rate > 1`.
        padding: string, either `"valid"` or `"same"` (case-insensitive).
            `"valid"` means no padding. `"same"` results in padding evenly to
            the left/right or up/down of the input. When `padding="same"` and
            `strides=1`, the output has the same size as the input.
        data_format: string, either `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs. `"channels_last"`
            corresponds to inputs with shape
            `(batch_size, height, width, channels)`
            while `"channels_first"` corresponds to inputs with shape
            `(batch_size, channels, height, width)`. It defaults to the
            `image_data_format` value found in your Keras config file at
            `~/.keras/keras.json`. If you never set it, then it will be
            `"channels_last"`.
        dilation_rate: int or tuple/list of 2 integers, specifying the dilation
            rate to use for dilated convolution.
        groups: A positive int specifying the number of groups in which the
            input is split along the channel axis. Each group is convolved
            separately with `filters // groups` filters. The output is the
            concatenation of all the `groups` results along the channel axis.
            Input channels and `filters` must both be divisible by `groups`.
        activation: Activation function. If `None`, no activation is applied.
        use_bias: bool, if `True`, bias will be added to the output.
        kernel_initializer: Initializer for the convolution kernel. If `None`,
            the default initializer (`"glorot_uniform"`) will be used.
        bias_initializer: Initializer for the bias vector. If `None`, the
            default initializer (`"zeros"`) will be used.
        kernel_regularizer: Optional regularizer for the convolution kernel.
        bias_regularizer: Optional regularizer for the bias vector.
        activity_regularizer: Optional regularizer function for the output.
        kernel_constraint: Optional projection function to be applied to the
            kernel after being updated by an `Optimizer` (e.g. used to implement
            norm constraints or value constraints for layer weights). The
            function must take as input the unprojected variable and must return
            the projected variable (which must have the same shape). Constraints
            are not safe to use when doing asynchronous distributed training.
        bias_constraint: Optional projection function to be applied to the
            bias after being updated by an `Optimizer`.

    Input shape:
    - If `data_format="channels_last"`:
        A 4D tensor with shape: `(batch_size, height, width, channels)`
    - If `data_format="channels_first"`:
        A 4D tensor with shape: `(batch_size, channels, height, width)`

    Output shape:
    - If `data_format="channels_last"`:
        A 4D tensor with shape: `(batch_size, new_height, new_width, filters)`
    - If `data_format="channels_first"`:
        A 4D tensor with shape: `(batch_size, filters, new_height, new_width)`

    Returns:
        A 4D tensor representing `activation(conv2d(inputs, kernel) + bias)`.

    Raises:
        ValueError: when both `strides > 1` and `dilation_rate > 1`.
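
The `groups` argument described above splits the input channels, so each filter only sees `in_channels // groups` channels and the kernel shrinks accordingly. A small arithmetic sketch, assuming a square kernel; `conv2d_param_count` is a hypothetical helper name, not a Keras function.

```python
def conv2d_param_count(in_channels, kernel_size, filters, groups=1, use_bias=True):
    """Parameter count of a Conv2D layer with a square kernel."""
    if in_channels % groups or filters % groups:
        raise ValueError("in_channels and filters must both be divisible by groups")
    # Each filter spans kernel_size x kernel_size x (in_channels // groups) weights.
    kernel_params = kernel_size * kernel_size * (in_channels // groups) * filters
    return kernel_params + (filters if use_bias else 0)

print(conv2d_param_count(128, 3, 32))             # 9*128*32 + 32 = 36896
print(conv2d_param_count(128, 3, 32, groups=4))   # 9*32*32 + 32 = 9248
```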


In [18]:
x = np.random.rand(4, 10, 10, 128)
y = tf.keras.layers.Conv2D(32, 3, activation='relu')(x)
print(tf.shape(y))

tf.Tensor([ 4  8  8 32], shape=(4,), dtype=int32)
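
To see where the `(4, 8, 8, 32)` shape comes from, the expression `activation(conv2d(inputs, kernel) + bias)` can be written out in NumPy for the simplest case: `channels_last`, stride 1, `"valid"` padding, ReLU activation. This is an illustrative sketch with a hypothetical helper `conv2d_valid_relu` and randomly drawn weights, not the Keras implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid_relu(x, kernel, bias):
    """channels_last 2D convolution, stride 1, no padding, followed by ReLU."""
    kh, kw, _, _ = kernel.shape
    # windows has shape (batch, new_height, new_width, channels, kh, kw)
    windows = sliding_window_view(x, (kh, kw), axis=(1, 2))
    # Sum over the window (i, j) and the input channels c for every filter f.
    y = np.einsum("bhwcij,ijcf->bhwf", windows, kernel) + bias
    return np.maximum(y, 0.0)

rng = np.random.default_rng(0)
x = rng.random((4, 10, 10, 128))
kernel = rng.standard_normal((3, 3, 128, 32))   # (kh, kw, in_channels, filters)
bias = np.zeros(32)

y = conv2d_valid_relu(x, kernel, bias)
print(y.shape)   # (4, 8, 8, 32), since 10 - 3 + 1 = 8
```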


## tf.keras.layers.Conv3D
    Args:
        filters: int, the dimension of the output space (the number of filters
            in the convolution).
        kernel_size: int or tuple/list of 3 integers, specifying the size of the
            convolution window.
        strides: int or tuple/list of 3 integers, specifying the stride length
            of the convolution. `strides > 1` is incompatible with
            `dilation_rate > 1`.
        padding: string, either `"valid"` or `"same"` (case-insensitive).
            `"valid"` means no padding. `"same"` results in padding evenly to
            the left/right or up/down of the input. When `padding="same"` and
            `strides=1`, the output has the same size as the input.
        data_format: string, either `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs. `"channels_last"`
            corresponds to inputs with shape
            `(batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)`
            while `"channels_first"` corresponds to inputs with shape
            `(batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)`.
            It defaults to the `image_data_format` value found in your Keras
            config file at `~/.keras/keras.json`. If you never set it, then it
            will be `"channels_last"`.
        dilation_rate: int or tuple/list of 3 integers, specifying the dilation
            rate to use for dilated convolution.
        groups: A positive int specifying the number of groups in which the
            input is split along the channel axis. Each group is convolved
            separately with `filters // groups` filters. The output is the
            concatenation of all the `groups` results along the channel axis.
            Input channels and `filters` must both be divisible by `groups`.
        activation: Activation function. If `None`, no activation is applied.
        use_bias: bool, if `True`, bias will be added to the output.
        kernel_initializer: Initializer for the convolution kernel. If `None`,
            the default initializer (`"glorot_uniform"`) will be used.
        bias_initializer: Initializer for the bias vector. If `None`, the
            default initializer (`"zeros"`) will be used.
        kernel_regularizer: Optional regularizer for the convolution kernel.
        bias_regularizer: Optional regularizer for the bias vector.
        activity_regularizer: Optional regularizer function for the output.
        kernel_constraint: Optional projection function to be applied to the
            kernel after being updated by an `Optimizer` (e.g. used to implement
            norm constraints or value constraints for layer weights). The
            function must take as input the unprojected variable and must return
            the projected variable (which must have the same shape). Constraints
            are not safe to use when doing asynchronous distributed training.
        bias_constraint: Optional projection function to be applied to the
            bias after being updated by an `Optimizer`.

    Input shape:
    - If `data_format="channels_last"`:
        5D tensor with shape:
        `(batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)`
    - If `data_format="channels_first"`:
        5D tensor with shape:
        `(batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)`

    Output shape:
    - If `data_format="channels_last"`:
        5D tensor with shape:
        `(batch_size, new_spatial_dim1, new_spatial_dim2, new_spatial_dim3,
        filters)`
    - If `data_format="channels_first"`:
        5D tensor with shape:
        `(batch_size, filters, new_spatial_dim1, new_spatial_dim2,
        new_spatial_dim3)`

    Returns:
        A 5D tensor representing `activation(conv3d(inputs, kernel) + bias)`.

    Raises:
        ValueError: when both `strides > 1` and `dilation_rate > 1`.

In [19]:
x = np.random.rand(4,10,10,10,32)
y = tf.keras.layers.Conv3D(filters=12,kernel_size=3,strides=2)(x)
print(y.shape)

(4, 4, 4, 4, 12)
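
The shape printed above follows from the `"valid"` rule `floor((n - k) / s) + 1` applied to each of the three spatial dimensions, with the channel axis replaced by `filters`. A short sketch, assuming `channels_last` and `dilation_rate=1`; `conv_valid_output_shape` is a hypothetical helper, not part of Keras.

```python
def conv_valid_output_shape(input_shape, filters, kernel_size, strides):
    """Output shape of an N-D convolution with "valid" padding (channels_last)."""
    batch, *spatial, _channels = input_shape
    new_spatial = [(n - kernel_size) // strides + 1 for n in spatial]
    return (batch, *new_spatial, filters)

print(conv_valid_output_shape((4, 10, 10, 10, 32), filters=12, kernel_size=3, strides=2))
# (4, 4, 4, 4, 12), since (10 - 3) // 2 + 1 = 4
```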


## tf.keras.layers.Conv2DTranspose
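* 2D transposed convolution layer, often loosely called a deconvolution layer. It is not the inverse of convolution. However, with the same kernel size (and the same stride and padding), if its input has the size of a convolution's output, then the transposed convolution's output has the size of that convolution's input (always for stride 1; for larger strides only when the stride divides `input + 2p - k` evenly).
* Video: https://www.bilibili.com/video/BV1mh411J7U4/?spm_id_from=333.337.search-card.all.click&vd_source=c1c07e231635072798fd6984a0d3876a
  1. Insert s-1 rows and columns of zeros between neighbouring elements of the input feature map (s = stride)
  2. Pad k-p-1 rows and columns of zeros around the input feature map (k = kernel size, p = padding)
  3. Flip the kernel weights vertically and horizontally
  4. Apply an ordinary convolution (padding 0, stride 1)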
* Hout = (Hin-1)\*stride[0] - 2\*padding[0] + kernel_size[0]
* Wout = (Win-1)\*stride[1] - 2\*padding[1] + kernel_size[1]
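
The four steps above can be sketched in NumPy for a single channel and a square kernel, and checked against the more direct "scatter" view of a transposed convolution, in which every input element stamps a scaled copy of the kernel onto the output. Both function names are hypothetical, and `padding` here is the integer `p` from the formulas, not the Keras `"valid"`/`"same"` strings.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_transpose2d_steps(x, kernel, stride=1, padding=0):
    """Transposed convolution via the four steps: dilate, pad, flip, convolve."""
    k = kernel.shape[0]                       # square kernel assumed
    h, w = x.shape
    # Step 1: insert stride-1 zeros between neighbouring input elements.
    dilated = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1))
    dilated[::stride, ::stride] = x
    # Step 2: pad k - p - 1 zeros on every side.
    padded = np.pad(dilated, k - padding - 1)
    # Step 3: flip the kernel vertically and horizontally.
    flipped = kernel[::-1, ::-1]
    # Step 4: ordinary convolution (cross-correlation), stride 1, no padding.
    windows = sliding_window_view(padded, (k, k))
    return np.einsum("hwij,ij->hw", windows, flipped)

def conv_transpose2d_scatter(x, kernel, stride=1, padding=0):
    """Reference: each input element adds x[i, j] * kernel into the output."""
    k = kernel.shape[0]
    h, w = x.shape
    full = np.zeros(((h - 1) * stride + k, (w - 1) * stride + k))
    for i in range(h):
        for j in range(w):
            full[i * stride : i * stride + k, j * stride : j * stride + k] += x[i, j] * kernel
    # Crop `padding` rows and columns from every side.
    return full[padding : full.shape[0] - padding, padding : full.shape[1] - padding]

rng = np.random.default_rng(0)
x = np.arange(9, dtype=float).reshape(3, 3)
kernel = rng.random((3, 3))

out = conv_transpose2d_steps(x, kernel, stride=2, padding=1)
print(out.shape)   # (5, 5): (3 - 1) * 2 - 2 * 1 + 3 = 5
print(np.allclose(out, conv_transpose2d_scatter(x, kernel, stride=2, padding=1)))   # True
```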

In [21]:
x = np.random.rand(2, 10, 10, 3)
y = tf.keras.layers.Conv2D(filters=10, kernel_size=3, strides=1)(x)           # (2, 8, 8, 10)
z = tf.keras.layers.Conv2DTranspose(filters=3, kernel_size=3, strides=1)(y)   # back to the size of x
print(z.shape)

(2, 10, 10, 3)


## tf.keras.layers.MaxPool2D
    tf.keras.layers.MaxPool2D(
        pool_size=(2, 2),
        strides=None,
        padding='valid',
        data_format=None,
        name=None,
        **kwargs
    )
    
    Downsamples the input along its spatial dimensions (height and width)
    by taking the maximum value over an input window
    (of size defined by `pool_size`) for each channel of the input.
    The window is shifted by `strides` along each dimension.

    The resulting output when using the `"valid"` padding option has a spatial
    shape (number of rows or columns) of:
    `output_shape = math.floor((input_shape - pool_size) / strides) + 1`
    (when `input_shape >= pool_size`)

    The resulting output shape when using the `"same"` padding option is:
    `output_shape = math.floor((input_shape - 1) / strides) + 1`

    Args:
        pool_size: int or tuple of 2 integers, factors by which to downscale
            (dim1, dim2). If only one integer is specified, the same
            window length will be used for all dimensions.
        strides: int or tuple of 2 integers, or None. Strides values. If None,
            it will default to `pool_size`. If only one int is specified, the
            same stride size will be used for all dimensions.
        padding: string, either `"valid"` or `"same"` (case-insensitive).
            `"valid"` means no padding. `"same"` results in padding evenly to
            the left/right or up/down of the input such that, when `strides=1`,
            the output has the same height/width dimension as the input.
        data_format: string, either `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs. `"channels_last"`
            corresponds to inputs with shape `(batch, height, width, channels)`
            while `"channels_first"` corresponds to inputs with shape
            `(batch, channels, height, width)`. It defaults to the
            `image_data_format` value found in your Keras config file at
            `~/.keras/keras.json`. If you never set it, then it will be
            `"channels_last"`.

    Input shape:
    - If `data_format="channels_last"`:
        4D tensor with shape `(batch_size, height, width, channels)`.
    - If `data_format="channels_first"`:
        4D tensor with shape `(batch_size, channels, height, width)`.

    Output shape:
    - If `data_format="channels_last"`:
        4D tensor with shape
        `(batch_size, pooled_height, pooled_width, channels)`.
    - If `data_format="channels_first"`:
        4D tensor with shape
        `(batch_size, channels, pooled_height, pooled_width)`.

In [25]:
x = np.arange(9).reshape(1,3,3,1)
y = tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=1)(x)
y

<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[4.],
         [5.]],

        [[7.],
         [8.]]]], dtype=float32)>
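
The result above can be reproduced in NumPy by sliding a 2×2 window over the 3×3 input and taking the maximum of each window. This is a minimal sketch for `channels_last` input with `"valid"` padding; `max_pool2d_valid` is a hypothetical helper, not the Keras implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_pool2d_valid(x, pool_size=2, strides=None):
    """channels_last max pooling with "valid" padding and a square window."""
    strides = pool_size if strides is None else strides
    # windows has shape (batch, H - pool + 1, W - pool + 1, channels, pool, pool)
    windows = sliding_window_view(x, (pool_size, pool_size), axis=(1, 2))
    # Keep every `strides`-th window, then reduce over the two window axes.
    return windows[:, ::strides, ::strides].max(axis=(-2, -1))

x = np.arange(9).reshape(1, 3, 3, 1)
y = max_pool2d_valid(x, pool_size=2, strides=1)
print(y.shape)        # (1, 2, 2, 1)
print(y[0, :, :, 0])
# [[4 5]
#  [7 8]]
```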

## tf.keras.layers.AveragePooling2D
    tf.keras.layers.AveragePooling2D(
        pool_size,
        strides=None,
        padding='valid',
        data_format=None,
        name=None,
        **kwargs
    )
    Downsamples the input along its spatial dimensions (height and width)
    by taking the average value over an input window
    (of size defined by `pool_size`) for each channel of the input.
    The window is shifted by `strides` along each dimension.

    The resulting output when using the `"valid"` padding option has a spatial
    shape (number of rows or columns) of:
    `output_shape = math.floor((input_shape - pool_size) / strides) + 1`
    (when `input_shape >= pool_size`)

    The resulting output shape when using the `"same"` padding option is:
    `output_shape = math.floor((input_shape - 1) / strides) + 1`

    Args:
        pool_size: int or tuple of 2 integers, factors by which to downscale
            (dim1, dim2). If only one integer is specified, the same
            window length will be used for all dimensions.
        strides: int or tuple of 2 integers, or None. Strides values. If None,
            it will default to `pool_size`. If only one int is specified, the
            same stride size will be used for all dimensions.
        padding: string, either `"valid"` or `"same"` (case-insensitive).
            `"valid"` means no padding. `"same"` results in padding evenly to
            the left/right or up/down of the input such that, when `strides=1`,
            the output has the same height/width dimension as the input.
        data_format: string, either `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs. `"channels_last"`
            corresponds to inputs with shape `(batch, height, width, channels)`
            while `"channels_first"` corresponds to inputs with shape
            `(batch, channels, height, width)`. It defaults to the
            `image_data_format` value found in your Keras config file at
            `~/.keras/keras.json`. If you never set it, then it will be
            `"channels_last"`.

    Input shape:
    - If `data_format="channels_last"`:
        4D tensor with shape `(batch_size, height, width, channels)`.
    - If `data_format="channels_first"`:
        4D tensor with shape `(batch_size, channels, height, width)`.

    Output shape:
    - If `data_format="channels_last"`:
        4D tensor with shape
        `(batch_size, pooled_height, pooled_width, channels)`.
    - If `data_format="channels_first"`:
        4D tensor with shape
        `(batch_size, channels, pooled_height, pooled_width)`.
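
The two output-size formulas quoted above apply per spatial dimension and are the same for max and average pooling. A small sketch that evaluates them; `pool_output_size` is a hypothetical helper name, and `strides=None` falls back to `pool_size` as the docstring describes.

```python
import math

def pool_output_size(input_size, pool_size, strides=None, padding="valid"):
    """Pooled length along one spatial dimension."""
    strides = pool_size if strides is None else strides
    padding = padding.lower()
    if padding == "valid":
        return math.floor((input_size - pool_size) / strides) + 1
    if padding == "same":
        return math.floor((input_size - 1) / strides) + 1
    raise ValueError(f"unknown padding: {padding}")

print(pool_output_size(3, 2, strides=1))                   # 2
print(pool_output_size(3, 2, strides=1, padding="same"))   # 3
print(pool_output_size(10, 2))                             # 5
print(pool_output_size(10, 3, padding="same"))             # 4
```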

In [28]:
x = np.random.rand(2,3,3,1)
y = tf.keras.layers.AveragePooling2D(pool_size=2, strides=1)(x)
y

<tf.Tensor: shape=(2, 2, 2, 1), dtype=float32, numpy=
array([[[[0.6579932 ],
         [0.60644126]],

        [[0.54774666],
         [0.5166114 ]]],


       [[[0.3283871 ],
         [0.6308077 ]],

        [[0.35316724],
         [0.42285788]]]], dtype=float32)>

## tf.keras.layers.GlobalMaxPool2D
    tf.keras.layers.GlobalMaxPool2D(
        data_format=None, keepdims=False, **kwargs
    )
    Args:
        
        data_format: string, either `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs. `"channels_last"`
            corresponds to inputs with shape `(batch, height, width, channels)`
            while `"channels_first"` corresponds to inputs with shape
            `(batch, channels, height, width)`. It defaults to the
            `image_data_format` value found in your Keras config file at
            `~/.keras/keras.json`. If you never set it, then it will be
            `"channels_last"`.
        keepdims: A boolean, whether to keep the spatial dimensions or not.
            If `keepdims` is `False` (default), the rank of the tensor is
            reduced for spatial dimensions. If `keepdims` is `True`, the
            spatial dimensions are retained with length 1.
            The behavior is the same as for `tf.reduce_max` or `np.max`.

    Input shape:

    - If `data_format='channels_last'`:
        4D tensor with shape:
        `(batch_size, height, width, channels)`
    - If `data_format='channels_first'`:
        4D tensor with shape:
        `(batch_size, channels, height, width)`

    Output shape:

    - If `keepdims=False`:
        2D tensor with shape `(batch_size, channels)`.
    - If `keepdims=True`:
        - If `data_format="channels_last"`:
            4D tensor with shape `(batch_size, 1, 1, channels)`
        - If `data_format="channels_first"`:
            4D tensor with shape `(batch_size, channels, 1, 1)`

In [31]:
x = np.arange(16).reshape(1,2,2,4)
y = tf.keras.layers.GlobalMaxPool2D()(x)
y

<tf.Tensor: shape=(1, 4), dtype=float32, numpy=array([[12., 13., 14., 15.]], dtype=float32)>

In [32]:
x

array([[[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7]],

        [[ 8,  9, 10, 11],
         [12, 13, 14, 15]]]])

In [33]:
tf.keras.layers.GlobalMaxPooling2D()(x)

<tf.Tensor: shape=(1, 4), dtype=float32, numpy=array([[12., 13., 14., 15.]], dtype=float32)>
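
For `channels_last` input, global max pooling amounts to a maximum over the height and width axes, which a plain NumPy reduction makes explicit. This is a sketch of the equivalent computation on the same `x`, not the layer's actual code path.

```python
import numpy as np

x = np.arange(16).reshape(1, 2, 2, 4)

# Reduce over height (axis 1) and width (axis 2); the channel axis is kept.
pooled = x.max(axis=(1, 2))
print(pooled)          # [[12 13 14 15]]

# With keepdims=True the spatial axes stay in place with length 1.
pooled_keep = x.max(axis=(1, 2), keepdims=True)
print(pooled_keep.shape)   # (1, 1, 1, 4)
```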