# Causal convolutions
Recurrent neural networks are not the only way of effectively representing sequences: convolutions can also do the job. In particular, we can use _causal convolutions_: convolutional filters applied to the sequence in a left-to-right fashion, emitting a representation at each step. They are _causal_ in that their output at time $t$ is conditional only on input up to $t-1$: this is necessary to ensure that they do not have access to the elements of the sequence we are trying to predict.
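
To make the causality constraint concrete, here is a minimal pure-Python sketch of a one-dimensional causal convolution. The function name `causal_conv1d` and the weight ordering are illustrative assumptions, not Spotlight's implementation; the sketch follows the convention stated above, where the output at step $t$ sees inputs only up to $t-1$, and achieves this by padding the sequence with zeros on the left.

```python
def causal_conv1d(x, weights):
    """Illustrative causal convolution (hypothetical helper, not a library API).

    The output at step t is a weighted sum of x[t-1], x[t-2], ..., x[t-k],
    where k = len(weights). Inputs before the start of the sequence are zero.
    """
    k = len(weights)
    # Left-pad with k zeros so every step has a full window of past inputs.
    padded = [0.0] * k + list(x)
    out = []
    for t in range(len(x)):
        # padded[t:t + k] holds x[t-k], ..., x[t-1]; x[t] itself is excluded.
        window = padded[t:t + k]
        # weights[0] multiplies the most recent past input, x[t-1].
        out.append(sum(w * v for w, v in zip(weights, reversed(window))))
    return out


sequence = [1, 2, 3, 4]
filter_weights = [0.5, 0.25]
outputs = causal_conv1d(sequence, filter_weights)

# Changing the input at step 2 or later must not change the outputs
# at steps 0..2, because those outputs only look at earlier inputs.
perturbed = causal_conv1d([1, 2, 99, 99], filter_weights)
```

Because of the left padding, the output has the same length as the input, and the first output is computed from padding alone.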


Like LSTMs, causal convolutions can model sequences with long-term dependencies. This is achieved in two ways. The first is stacking convolutional layers: with padding, every convolutional layer preserves the shape of the input, so layers can be stacked freely. The second is _dilation_: the insertion of gaps into the convolutional filters (otherwise known as _atrous_ convolutions).
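
The effect of stacking and dilation on how far back the network can see can be worked out with a little arithmetic. The sketch below assumes every layer uses the same kernel width and stride 1; `receptive_field` is a hypothetical helper written for illustration. Each layer with kernel width $k$ and dilation $d$ extends the receptive field by $(k - 1) \cdot d$ steps.

```python
def receptive_field(kernel_width, dilations):
    """Number of input steps that influence a single output of the top layer.

    Assumes stride 1 and one layer per entry in `dilations`
    (hypothetical helper for illustration).
    """
    return 1 + sum((kernel_width - 1) * d for d in dilations)


# Four stacked layers of width-2 filters without dilation.
plain = receptive_field(2, [1, 1, 1, 1])

# The same four layers with dilation doubling at each layer.
dilated = receptive_field(2, [1, 2, 4, 8])

print(plain, dilated)
```

With the same number of layers and parameters, doubling the dilation at each layer grows the receptive field exponentially with depth rather than linearly.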

The [WaveNet](https://arxiv.org/pdf/1609.03499.pdf) paper uses dilated causal convolutions to model audio:

---

<img src="https://storage.googleapis.com/deepmind-live-cms-alt/documents/BlogPost-Fig2-Anim-160908-r01.gif" alt="Causal convolutions" style="width: 600px;"/>

---

Using convolutional rather than recurrent networks for representing sequences has a couple of advantages, as described in [this blog post](https://medium.com/@TalPerry/convolutional-methods-for-text-d5260fd5675f):

1. Parallelization: RNNs need to process inputs in a sequential fashion, one time-step at a time. In contrast, a CNN can perform convolutions across the entire sequence in parallel.
2. Convolutional representations are less likely to be bottlenecked by the fixed size of the RNN representation, or by the distance between the hidden output and the input in long sequences. Using convolutional networks, the distance between the output and the input is determined by the depth of the network, and is independent of the length of the sequence (see section 1 of [Neural Machine Translation in Linear Time](https://arxiv.org/pdf/1610.10099.pdf)).

## Causal convolutions in Spotlight
Spotlight implements causal convolution models as part of its [sequence models](https://maciejkula.github.io/spotlight/sequence/sequence.html) package, alongside more traditional recurrent and pooling models.