# TCN model implemented in PyTorch

[Wiese et al., Quant GANs: Deep Generation of Financial Time Series, 2019](https://arxiv.org/abs/1907.06673)

For both the generator and the discriminator we used TCNs with skip connections. Inside the TCN architecture temporal blocks were used as block modules. A temporal block consists of two dilated causal convolutions and two PReLUs (He et al., 2015) as activation functions. The primary benefit of using temporal blocks is to make the TCN more expressive by increasing the number of non-linear operations in each block module. A complete definition is given below.

**Definition B.1 (Temporal block)**. Let $N_I, N_H, N_O \in \mathbb{N}$ denote the input, hidden and output dimensions and let $D, K \in \mathbb{N}$ denote the dilation and the kernel size. Furthermore, let $w_1, w_2$ be two dilated causal convolutional layers with arguments $(N_I, N_H, K, D)$ and $(N_H, N_O, K, D)$ respectively, and let $\phi_1, \phi_2 : \mathbb{R} \to \mathbb{R}$ be two PReLUs. The function $f : \mathbb{R}^{N_I \times (2D(K-1)+1)} \to \mathbb{R}^{N_O}$ defined by
$$f(X) = \phi_2 \circ w_2 \circ \phi_1 \circ w_1(X)$$
is called a temporal block with arguments $(N_I, N_H, N_O, K, D)$.
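
To make the input width $2D(K-1)+1$ in Definition B.1 concrete, the following pure-Python sketch builds a single-channel temporal block and checks which inputs one output value depends on. It is an illustration only: the helper names (`causal_dilated_conv`, `prelu`, `temporal_block`, `block_receptive_field`), the fixed PReLU slope of 0.25 and the example weights are assumptions, not part of the paper or of the PyTorch code in this notebook.

```python
def causal_dilated_conv(x, weights, dilation):
    """Single-channel dilated causal convolution; inputs before t = 0 count as zero."""
    K = len(weights)
    y = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(weights):
            s = t - (K - 1 - k) * dilation  # only the current and past time steps
            if s >= 0:
                total += w * x[s]
        y.append(total)
    return y


def prelu(x, slope=0.25):
    """PReLU with a fixed slope (assumed here; the slope is learnable in practice)."""
    return [v if v >= 0 else slope * v for v in x]


def temporal_block(x, w1, w2, dilation):
    """f = phi_2 o w_2 o phi_1 o w_1 for one input and one output channel."""
    hidden = prelu(causal_dilated_conv(x, w1, dilation))
    return prelu(causal_dilated_conv(hidden, w2, dilation))


def block_receptive_field(kernel_size, dilation):
    """Number of input steps one output of a temporal block can see."""
    return 2 * dilation * (kernel_size - 1) + 1


K, D = 2, 2
rf = block_receptive_field(K, D)  # 2 * 2 * (2 - 1) + 1 = 5
w1, w2 = [0.5, 1.0], [-1.0, 2.0]  # arbitrary example weights
x = [float((3 * i) % 7 - 3) for i in range(16)]
t = 10
base = temporal_block(x, w1, w2, D)[t]


def output_after_bump(index):
    """Output at time t after adding 1.0 to x[index]."""
    bumped = list(x)
    bumped[index] += 1.0
    return temporal_block(bumped, w1, w2, D)[t]


same_far = output_after_bump(t - rf) == base          # just outside the window
changed_edge = output_after_bump(t - rf + 1) != base  # oldest step inside the window
same_future = output_after_bump(t + 1) == base        # causality: the future is ignored
print(rf, same_far, changed_edge, same_future)  # prints: 5 True True True
```

With $K = 2$ and $D = 2$, the output at step 10 reacts to a change at step 6 but not at step 5 or step 11, which matches a window of five past-and-present inputs.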

The TCN architecture used for the generator and the discriminator in the pure TCN and C-SVNN models is shown in Table 3. Table 4 lists the input, hidden and output dimensions of the different models. Here, G abbreviates the generator and D the discriminator. Note that for all models, except the generator of the C-SVNN, the hidden dimension was set to eighty. The kernel size of each temporal block, except the first one, was two. Each TCN modeled a receptive field size (RFS) of 127.
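
The receptive field size of 127 can be recovered from the block arguments. With stride 1, each dilated causal convolution with kernel size $K$ and dilation $D$ widens the receptive field by $D(K-1)$, and every temporal block contains two such convolutions. A small sketch of that arithmetic, where `BLOCKS` and `tcn_receptive_field` are hypothetical names introduced for illustration:

```python
# (kernel_size, dilation) of temporal blocks 1 to 7, as listed in Table 3.
BLOCKS = [(1, 1), (2, 1), (2, 2), (2, 4), (2, 8), (2, 16), (2, 32)]


def tcn_receptive_field(blocks):
    """Receptive field of stacked temporal blocks, each holding two convolutions."""
    rfs = 1  # a single output step sees at least itself
    for kernel_size, dilation in blocks:
        rfs += 2 * dilation * (kernel_size - 1)
    return rfs


rfs = tcn_receptive_field(BLOCKS)
print(rfs)  # prints: 127
```

The first block and the final 1 x 1 convolution have kernel size one and therefore add nothing, so the total is $1 + 2(1 + 2 + 4 + 8 + 16 + 32) = 127$.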

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-ik58{background-color:#2f2f2f;border-color:inherit;text-align:left;vertical-align:top}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<h3>Table 3</h3>
<table class="tg">
<thead>
  <tr>
    <th class="tg-ik58">Module Name</th>
    <th class="tg-ik58">Arguments</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0pky">Temporal block 1</td>
    <td class="tg-0pky">(N<sub>I</sub>, N<sub>H</sub>, N<sub>H</sub>, 1, 1)</td>
  </tr>
  <tr>
    <td class="tg-0pky">Temporal block 2</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>H</sub>, N<sub>H</sub>, 2, 1)</td>
  </tr>
  <tr>
    <td class="tg-0pky">Temporal block 3</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>H</sub>, N<sub>H</sub>, 2, 2)</td>
  </tr>
  <tr>
    <td class="tg-0pky">Temporal block 4</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>H</sub>, N<sub>H</sub>, 2, 4)</td>
  </tr>
  <tr>
    <td class="tg-0pky">Temporal block 5</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>H</sub>, N<sub>H</sub>, 2, 8)</td>
  </tr>
  <tr>
    <td class="tg-0pky">Temporal block 6</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>H</sub>, N<sub>H</sub>, 2, 16)</td>
  </tr>
  <tr>
    <td class="tg-0pky">Temporal block 7</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>H</sub>, N<sub>H</sub>, 2, 32)</td>
  </tr>
  <tr>
    <td class="tg-0pky">1 x 1 Convolution</td>
    <td class="tg-0pky">(N<sub>H</sub>, N<sub>O</sub>, 1, 1)</td>
  </tr>
</tbody>
</table>

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;background-color:#2f2f2f;}
.tg .tg-0lax{text-align:left;vertical-align:top}
</style>
<h3>Table 4</h3>
<table class="tg">
<thead>
  <tr>
    <th class="tg-0lax">Models</th>
    <th class="tg-0lax">Pure TCN-G</th>
    <th class="tg-0lax">Pure TCN-D</th>
    <th class="tg-0lax">C-SVNN-G</th>
    <th class="tg-0lax">C-SVNN-D</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0lax">N<sub>I</sub></td>
    <td class="tg-0lax">3</td>
    <td class="tg-0lax">1</td>
    <td class="tg-0lax">3</td>
    <td class="tg-0lax">1</td>
  </tr>
  <tr>
    <td class="tg-0lax">N<sub>H</sub></td>
    <td class="tg-0lax">80</td>
    <td class="tg-0lax">80</td>
    <td class="tg-0lax">50</td>
    <td class="tg-0lax">80</td>
  </tr>
  <tr>
    <td class="tg-0lax">N<sub>O</sub></td>
    <td class="tg-0lax">1</td>
    <td class="tg-0lax">1</td>
    <td class="tg-0lax">2</td>
    <td class="tg-0lax">1</td>
  </tr>
</tbody>
</table>

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm


class Chomp1d(nn.Module):
    """Drops the last `chomp_size` time steps so that a symmetrically padded convolution becomes causal."""

    def __init__(self, chomp_size):
        super().__init__()
        self.chomp_size = chomp_size

    def forward(self, x):
        # x has shape (batch, channels, time). chomp_size must be positive,
        # because x[:, :, :-0] would return an empty tensor.
        return x[:, :, :-self.chomp_size].contiguous()


class TemporalBlock(nn.Module):
    """Creates a temporal block: two weight-normalised dilated causal convolutions,
    each followed by an activation and dropout, plus a residual connection.

    Note: Definition B.1 uses PReLU activations, whereas this implementation uses ReLU.

    Args:
        n_inputs (int): number of input channels.
        n_outputs (int): number of output channels of both convolutions.
        kernel_size (int): kernel size along the temporal axis.
        stride (int): stride of both convolutions.
        dilation (int): dilation along the temporal axis.
        padding (int): zero padding added to both sides by each convolution; the trailing
            `padding` time steps are chomped off again to keep the block causal.
        dropout (float): dropout rate.

    Returns:
        tuple: (out, relu(out + res)), i.e. the output before the residual connection
        (used as skip connection) and the block output after the residual connection.
    """

    def __init__(self, n_inputs, n_outputs, kernel_size, stride, dilation, padding, dropout=0.2):
        super().__init__()
        self.conv1 = weight_norm(nn.Conv1d(n_inputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation))
        self.chomp1 = Chomp1d(padding)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(dropout)

        self.conv2 = weight_norm(nn.Conv1d(n_outputs, n_outputs, kernel_size, stride=stride, padding=padding, dilation=dilation))
        self.chomp2 = Chomp1d(padding)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout(dropout)

        # Without padding there is nothing to chomp, so the Chomp1d layers are left out.
        if padding == 0:
            self.net = nn.Sequential(self.conv1, self.relu1, self.dropout1, self.conv2, self.relu2, self.dropout2)
        else:
            self.net = nn.Sequential(self.conv1, self.chomp1, self.relu1, self.dropout1, self.conv2, self.chomp2, self.relu2, self.dropout2)

        # A 1x1 convolution matches the channel count of the residual branch when needed.
        self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
        self.relu = nn.ReLU()
        self.init_weights()

    def init_weights(self):
        self.conv1.weight.data.normal_(0, 0.5)
        self.conv2.weight.data.normal_(0, 0.5)
        if self.downsample is not None:
            self.downsample.weight.data.normal_(0, 0.5)

    def forward(self, x):
        out = self.net(x)
        res = x if self.downsample is None else self.downsample(x)
        return out, self.relu(out + res)


# Hidden channel width N_H shared by all temporal blocks.
# Table 4 lists 80 for the pure TCN; this notebook uses 20.
pp = 20


class Generator(nn.Module):
    """Generator: 3-to-1 causal temporal convolutional network with skip connections.
       This network uses 1D convolutions in order to model multiple timeseries co-dependency.
       Maps an input of shape (batch, 3, time) to an output of shape (batch, 1, time).
    """

    def __init__(self):
        super().__init__()
        self.tcn = nn.ModuleList([TemporalBlock(3, pp, kernel_size=1, stride=1, dilation=1, padding=0),
                                 *[TemporalBlock(pp, pp, kernel_size=2, stride=1, dilation=i, padding=i) for i in [1, 2, 4, 8, 16, 32]]])
        self.last = nn.Conv1d(pp, 1, kernel_size=1, stride=1, dilation=1)

    def forward(self, x):
        # Collect the pre-residual output of every block as a skip connection.
        skip_layers = []
        for layer in self.tcn:
            skip, x = layer(x)
            skip_layers.append(skip)
        # Add the summed skip connections to the last block output, then apply the 1x1 convolution.
        x = self.last(x + sum(skip_layers))
        return x


class Discriminator(nn.Module):
    """Discriminator: 1-to-1 causal temporal convolutional network with skip connections.
       This network uses 1D convolutions in order to model multiple timeseries co-dependency.

    Args:
        seq_len (int): length of the input sequences; fixes the size of the final linear layer.
        conv_dropout (float): currently unused.
    """

    def __init__(self, seq_len, conv_dropout=0.05):
        super().__init__()
        self.tcn = nn.ModuleList([TemporalBlock(1, pp, kernel_size=1, stride=1, dilation=1, padding=0),
                                 *[TemporalBlock(pp, pp, kernel_size=2, stride=1, dilation=i, padding=i) for i in [1, 2, 4, 8, 16, 32]]])
        self.last = nn.Conv1d(pp, 1, kernel_size=1, dilation=1)
        # Maps the (batch, 1, seq_len) feature sequence to one probability per sample.
        self.to_prob = nn.Sequential(nn.Linear(seq_len, 1), nn.Sigmoid())

    def forward(self, x):
        skip_layers = []
        for layer in self.tcn:
            skip, x = layer(x)
            skip_layers.append(skip)
        x = self.last(x + sum(skip_layers))
        return self.to_prob(x).squeeze()
```
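
The blocks above pass `padding=i` together with `dilation=i` and then chomp `padding` steps off the right end. The sketch below checks, in plain Python, that this keeps the sequence length unchanged. It relies on the output-length formula from the PyTorch `Conv1d` documentation; `conv1d_out_len` and `causal_padding` are hypothetical helper names, and `seq_len = 127` is just an example value.

```python
def conv1d_out_len(length, kernel_size, dilation, padding, stride=1):
    """Output length of a 1D convolution with symmetric zero padding."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def causal_padding(kernel_size, dilation):
    """Padding that yields a length-preserving causal convolution once chomped from the right."""
    return dilation * (kernel_size - 1)


seq_len = 127  # example length; any positive length behaves the same way
lengths = {}
for dilation in [1, 2, 4, 8, 16, 32]:
    padding = causal_padding(kernel_size=2, dilation=dilation)  # equals `dilation` for kernel size 2
    padded_len = conv1d_out_len(seq_len, 2, dilation, padding)  # seq_len + padding
    lengths[dilation] = padded_len - padding                    # length after Chomp1d(padding)

print(lengths)
```

Every entry equals `seq_len`, so the skip outputs of all blocks have the same shape and can be summed. Note that `padding=i` coincides with `dilation * (kernel_size - 1)` only because the kernel size is two; a larger kernel would need correspondingly more padding.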