In [1]:
import tensorflow as tf



# SeparableConv1D Layer
Conv2D is the traditional convolution: a filter slides across an image, with or without padding, at a given stride.<br>

SeparableConv2D, by contrast, is a variant of the traditional convolution designed to be cheaper to compute. It performs a depthwise spatial convolution followed by a pointwise convolution that mixes the resulting output channels. MobileNet, for example, uses this operation to speed up its convolutions. SeparableConv1D applies the same idea along a single spatial dimension (the steps of a sequence).<br>

## Spatial Separable Convolutions
The spatial separable convolution is so named because it deals primarily with the spatial dimensions of an image and kernel: the width and the height. (The other dimension, the “depth” dimension, is the number of channels of each image).<br>

A spatial separable convolution simply divides a kernel into two smaller kernels.

<center><img src="assets/sconv.png"></img></center>

Now, instead of doing one convolution with 9 multiplications, we do two convolutions with 3 multiplications each (6 in total) to achieve the same effect. With fewer multiplications, the computational cost goes down and the network can run faster.

<center><img src="assets/sconv2.png"></img></center>
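As a minimal sketch of the idea above, the snippet below factors a 3x3 kernel into a 3x1 column and a 1x3 row and checks that two small convolutions give the same result as one full convolution. The Sobel kernel, the toy 5x5 image, and the helper `conv2d_valid` are assumptions chosen for illustration; they are not taken from the figures.

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation on nested lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Assumed example: the Sobel kernel is the outer product of a column and a row.
column = [[1], [2], [1]]   # 3x1 kernel
row = [[-1, 0, 1]]         # 1x3 kernel
sobel = [[c[0] * r for r in row[0]] for c in column]  # 3x3 kernel

# Small deterministic 5x5 "image" for the comparison.
image = [[(3 * i + 5 * j) % 7 for j in range(5)] for i in range(5)]

full = conv2d_valid(image, sobel)                            # 9 multiplications per output
separable = conv2d_valid(conv2d_valid(image, column), row)   # 3 + 3 per output

print(full == separable)
```

The equality holds only because this particular kernel can be written as an outer product; most 3x3 kernels cannot be factored this way.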

## Depthwise Separable Convolutions
Unlike spatial separable convolutions, depthwise separable convolutions work with kernels that cannot be “factored” into two smaller kernels. Hence, they are more commonly used. This is the type of separable convolution implemented by keras.layers.SeparableConv2D and by tf.layers.separable_conv2d (the TensorFlow 1.x API).<br>

The depthwise separable convolution is so named because it deals not just with the spatial dimensions but also with the depth dimension, that is, the number of channels. An input image may have 3 channels: RGB. After a few convolutions, an image may have many more channels. You can imagine each channel as a particular interpretation of that image; for example, the “red” channel interprets the “redness” of each pixel, the “blue” channel interprets the “blueness” of each pixel, and the “green” channel interprets the “greenness” of each pixel. An image with 64 channels has 64 different interpretations of that image.<br>

![Gif](assets/sconv.gif)
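To make the two stages concrete for the 1D case, here is a rough pure-Python sketch of a depthwise convolution followed by a pointwise convolution. It assumes `channels_last` data, "valid" padding, stride 1, `depth_multiplier=1`, and no bias; the helper names and the toy values are hypothetical and are not how Keras implements the layer internally.

```python
def depthwise_conv1d(x, depthwise_kernel):
    """x: [steps][channels]; depthwise_kernel: [kernel_size][channels].

    Each channel is filtered independently by its own kernel column.
    """
    k = len(depthwise_kernel)
    channels = len(x[0])
    return [
        [sum(x[t + i][c] * depthwise_kernel[i][c] for i in range(k))
         for c in range(channels)]
        for t in range(len(x) - k + 1)
    ]


def pointwise_conv1d(x, pointwise_kernel):
    """x: [steps][channels]; pointwise_kernel: [channels][filters].

    A kernel-size-1 convolution that mixes the channels at each step.
    """
    filters = len(pointwise_kernel[0])
    return [
        [sum(step[c] * pointwise_kernel[c][f] for c in range(len(step)))
         for f in range(filters)]
        for step in x
    ]


x = [[1, 10], [2, 20], [3, 30], [4, 40]]   # 4 steps, 2 channels
dw = [[1, 0], [1, 1], [1, 0]]              # kernel_size=3, one column per channel
pw = [[1, 0, 1], [0, 1, 1]]                # 2 channels -> 3 filters

depthwise_out = depthwise_conv1d(x, dw)    # [[6, 20], [9, 30]]
output = pointwise_conv1d(depthwise_out, pw)
print(output)                              # [[6, 20, 26], [9, 30, 39]]
```

The depthwise stage never mixes channels; only the pointwise stage combines them, which is where the `filters` output channels come from.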

In [None]:
tf.keras.layers.SeparableConv1D(
    filters=1,
    kernel_size=1,
    strides=1,
    padding="valid",
    data_format=None,
    dilation_rate=1,
    depth_multiplier=1,
    activation=None,
    use_bias=True,
    depthwise_initializer="glorot_uniform",
    pointwise_initializer="glorot_uniform",
    bias_initializer="zeros",
    depthwise_regularizer=None,
    pointwise_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    depthwise_constraint=None,
    pointwise_constraint=None,
    bias_constraint=None,
)


## Arguments
- **filters:** Integer, the dimensionality of the output space (i.e. the number of filters in the convolution).
- **kernel_size:** A single integer specifying the spatial dimensions of the filters.
- **strides:** A single integer specifying the strides of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
- **padding:** One of "valid", "same", or "causal" (case-insensitive). "valid" means no padding. "same" pads the input with zeros evenly on the left and right so that, with strides=1, the output has the same length as the input. "causal" results in causal (dilated) convolutions, e.g. output[t] does not depend on input[t+1:].
- **data_format:** A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch_size, length, channels) while channels_first corresponds to inputs with shape (batch_size, channels, length).
- **dilation_rate:** A single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
- **depth_multiplier:** The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to num_filters_in * depth_multiplier.
- **activation:** Activation function to use. If you don't specify anything, no activation is applied (see keras.activations).
- **use_bias:** Boolean, whether the layer uses a bias.
- **depthwise_initializer:** An initializer for the depthwise convolution kernel (see keras.initializers). If None, then the default initializer ('glorot_uniform') will be used.
- **pointwise_initializer:** An initializer for the pointwise convolution kernel (see keras.initializers). If None, then the default initializer ('glorot_uniform') will be used.
- **bias_initializer:** An initializer for the bias vector. If None, the default initializer ('zeros') will be used (see keras.initializers).
- **depthwise_regularizer:** Optional regularizer for the depthwise convolution kernel (see keras.regularizers).
- **pointwise_regularizer:** Optional regularizer for the pointwise convolution kernel (see keras.regularizers).
- **bias_regularizer:** Optional regularizer for the bias vector (see keras.regularizers).
- **activity_regularizer:** Optional regularizer function for the output (see keras.regularizers).
- **depthwise_constraint:** Optional projection function to be applied to the depthwise kernel after being updated by an Optimizer (e.g. used for norm constraints or value constraints for layer weights). The function must take as input the unprojected variable and must return the projected variable (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training (see keras.constraints).
- **pointwise_constraint:** Optional projection function to be applied to the pointwise kernel after being updated by an Optimizer (see keras.constraints).
- **bias_constraint:** Optional projection function to be applied to the bias after being updated by an Optimizer (see keras.constraints).
- **trainable:** Boolean, if True the weights of this layer will be marked as trainable (and listed in layer.trainable_weights).
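The effect of `filters`, `kernel_size`, `depth_multiplier`, and `use_bias` on the number of weights can be estimated with a small calculation. The sketch below assumes the depthwise kernel has shape (kernel_size, in_channels, depth_multiplier) and the pointwise kernel has shape (1, in_channels * depth_multiplier, filters); the function names and the 64-to-128 channel example are illustrative only.

```python
def separable_conv1d_params(in_channels, filters, kernel_size,
                            depth_multiplier=1, use_bias=True):
    """Weight count of a depthwise separable 1D convolution."""
    depthwise = kernel_size * in_channels * depth_multiplier
    pointwise = in_channels * depth_multiplier * filters
    bias = filters if use_bias else 0
    return depthwise + pointwise + bias


def conv1d_params(in_channels, filters, kernel_size, use_bias=True):
    """Weight count of a standard 1D convolution, for comparison."""
    bias = filters if use_bias else 0
    return kernel_size * in_channels * filters + bias


separable = separable_conv1d_params(in_channels=64, filters=128, kernel_size=3)
standard = conv1d_params(in_channels=64, filters=128, kernel_size=3)
doubled = separable_conv1d_params(in_channels=64, filters=128, kernel_size=3,
                                  depth_multiplier=2)

print(separable, standard, doubled)  # 8512 24704 16896
```

In this example the separable layer needs roughly a third of the weights of the standard convolution, and raising `depth_multiplier` to 2 about doubles its size.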

### Input Shape
3D tensor with shape: (batch_size, channels, steps) if data_format='channels_first' or 3D tensor with shape: (batch_size, steps, channels) if data_format='channels_last'.

### Output Shape
3D tensor with shape: (batch_size, filters, new_steps) if data_format='channels_first' or 3D tensor with shape: (batch_size, new_steps, filters) if data_format='channels_last'. new_steps may differ from steps because of padding or strides.
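How `new_steps` depends on padding, strides, and dilation can be sketched with the usual convolution output-length arithmetic. The helper `new_steps` below is a hypothetical function written for illustration, not part of the Keras API, and it assumes the input is at least as long as the dilated kernel.

```python
def new_steps(steps, kernel_size, strides=1, padding="valid", dilation_rate=1):
    """Estimate the output length of a 1D convolution along the steps axis."""
    padding = padding.lower()
    if padding == "valid":
        # A dilated kernel spans dilation_rate * (kernel_size - 1) + 1 input steps.
        effective_kernel = dilation_rate * (kernel_size - 1) + 1
        length = steps - effective_kernel + 1
    elif padding in ("same", "causal"):
        # Zero padding keeps the length unchanged before striding.
        length = steps
    else:
        raise ValueError(f"Unknown padding: {padding!r}")
    # Ceiling division accounts for the stride.
    return (length + strides - 1) // strides


print(new_steps(10, 3))                            # 8
print(new_steps(10, 3, strides=2))                 # 4
print(new_steps(10, 3, padding="same"))            # 10
print(new_steps(10, 3, strides=2, padding="same")) # 5
print(new_steps(10, 3, dilation_rate=2))           # 6
```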