# The Encoder-Decoder Architecture

In machine translation, the input and output sequences generally differ in length and are not aligned token by token. A common way to handle this is the encoder-decoder architecture. The encoder maps the variable-length input sequence to an encoded representation, while the decoder acts as a conditional language model: it takes the encoded input together with the leftward context of the target sequence (the target tokens seen so far) and predicts the next token.

## Encoder

We define a base class that only fixes the interface: the encoder takes a variable-length sequence `X` as input. Concrete encoders subclass it and implement `forward`.

In [3]:
from torch import nn
from d2l import torch as d2l

In [4]:
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, X):
        raise NotImplementedError
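
To make the interface concrete, here is a minimal sketch of one possible subclass. `ToyGRUEncoder` and its sizes are hypothetical names chosen for illustration, not part of the original notebook, and the base class is repeated only so that the cell runs on its own. The point is that the same module accepts batches of different sequence lengths.

```python
import torch
from torch import nn


class Encoder(nn.Module):
    """Base interface, repeated here so the sketch runs on its own."""

    def forward(self, X):
        raise NotImplementedError


class ToyGRUEncoder(Encoder):
    """Hypothetical encoder: embeds token ids and runs a GRU over them."""

    def __init__(self, vocab_size, embed_size, num_hiddens):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRU(embed_size, num_hiddens, batch_first=True)

    def forward(self, X):
        # X: (batch_size, num_steps) integer token ids
        embs = self.embedding(X)
        # outputs: (batch_size, num_steps, num_hiddens)
        # state:   (1, batch_size, num_hiddens), the final hidden state
        outputs, state = self.rnn(embs)
        return outputs, state


encoder = ToyGRUEncoder(vocab_size=10, embed_size=8, num_hiddens=16)
short_batch = torch.randint(0, 10, (2, 5))  # 2 sequences of 5 tokens
long_batch = torch.randint(0, 10, (2, 9))   # 2 sequences of 9 tokens
short_outputs, short_state = encoder(short_batch)
long_outputs, long_state = encoder(long_batch)
```

In this sketch the per-step outputs grow with the input length, while the final state keeps a fixed shape regardless of how long the input was.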

## Decoder

The decoder base class adds one more method, `init_state`, which prepares the decoder's initial state from the encoder's outputs. Its `forward` method then takes the decoder input `X` together with that state.

In [5]:
class Decoder(nn.Module):
    def __init__(self):
        super().__init__()

    def init_state(self, enc_all_outputs, *args):
        raise NotImplementedError
    
    def forward(self, X, state):
        raise NotImplementedError
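
As a rough illustration of how `init_state` and `forward` fit together, the sketch below implements a hypothetical `ToyGRUDecoder`. It assumes the encoder returns a tuple `(outputs, final_hidden_state)`; that convention, the class name, and the sizes are assumptions made for this example. A pair of zero tensors stands in for real encoder outputs.

```python
import torch
from torch import nn


class Decoder(nn.Module):
    """Base interface, repeated here so the sketch runs on its own."""

    def init_state(self, enc_all_outputs, *args):
        raise NotImplementedError

    def forward(self, X, state):
        raise NotImplementedError


class ToyGRUDecoder(Decoder):
    """Hypothetical decoder: a GRU language model seeded by the encoder."""

    def __init__(self, vocab_size, embed_size, num_hiddens):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRU(embed_size, num_hiddens, batch_first=True)
        self.dense = nn.Linear(num_hiddens, vocab_size)

    def init_state(self, enc_all_outputs, *args):
        # Assumption: the encoder returned (outputs, final_hidden_state).
        return enc_all_outputs[1]

    def forward(self, X, state):
        # X: (batch_size, num_steps) target token ids seen so far
        embs = self.embedding(X)
        outputs, state = self.rnn(embs, state)
        # One score per vocabulary entry at every target position
        return self.dense(outputs), state


# Stand-in for an encoder's return value: 2 sequences, 7 source steps.
enc_all_outputs = (torch.zeros(2, 7, 16), torch.zeros(1, 2, 16))
decoder = ToyGRUDecoder(vocab_size=10, embed_size=8, num_hiddens=16)
state = decoder.init_state(enc_all_outputs)
dec_X = torch.randint(0, 10, (2, 4))  # 2 target prefixes of 4 tokens
logits, new_state = decoder(dec_X, state)
```

Because the GRU reads the target tokens left to right, the scores at each position depend only on the encoder state and the tokens up to that position.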


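## Putting the Encoder and Decoder Together

The combined model runs the encoder on the source sequence, uses the encoder's outputs to initialize the decoder state, and then runs the decoder on the target-side input. Only the decoder's first return value is passed on.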

In [6]:
class EncoderDecoder(d2l.Classifier):

    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

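    def forward(self, enc_X, dec_X, *args):
        # Encode the source sequence
        enc_all_outputs = self.encoder(enc_X, *args)
        # Build the decoder's initial state from the encoder outputs
        dec_state = self.decoder.init_state(enc_all_outputs, *args)

        # The decoder returns a tuple; keep only its first element
        return self.decoder(dec_X, dec_state)[0]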
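
The following sketch shows the same wiring end to end on random token ids. To keep it free of the `d2l` dependency, it subclasses `nn.Module` rather than `d2l.Classifier`, so it is a stand-in and not the class above. `TinyEncoder`, `TinyDecoder`, `PlainEncoderDecoder`, and all sizes are hypothetical.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    def __init__(self, vocab_size, num_hiddens):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, num_hiddens)
        self.rnn = nn.GRU(num_hiddens, num_hiddens, batch_first=True)

    def forward(self, X, *args):
        # Returns (per-step outputs, final hidden state)
        return self.rnn(self.embedding(X))


class TinyDecoder(nn.Module):
    def __init__(self, vocab_size, num_hiddens):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, num_hiddens)
        self.rnn = nn.GRU(num_hiddens, num_hiddens, batch_first=True)
        self.dense = nn.Linear(num_hiddens, vocab_size)

    def init_state(self, enc_all_outputs, *args):
        # Start decoding from the encoder's final hidden state
        return enc_all_outputs[1]

    def forward(self, X, state):
        outputs, state = self.rnn(self.embedding(X), state)
        return self.dense(outputs), state


class PlainEncoderDecoder(nn.Module):
    """Same wiring as EncoderDecoder, built on nn.Module for this sketch."""

    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, enc_X, dec_X, *args):
        enc_all_outputs = self.encoder(enc_X, *args)
        dec_state = self.decoder.init_state(enc_all_outputs, *args)
        return self.decoder(dec_X, dec_state)[0]


# Source and target vocabularies (12 and 15) and lengths (9 and 6) differ.
model = PlainEncoderDecoder(TinyEncoder(12, 16), TinyDecoder(15, 16))
enc_X = torch.randint(0, 12, (3, 9))  # 3 source sequences of 9 tokens
dec_X = torch.randint(0, 15, (3, 6))  # 3 target prefixes of 6 tokens
logits = model(enc_X, dec_X)          # shape: (3, 6, 15)
```

The output has one row of target-vocabulary scores per target position, independent of the source length, which is what lets the two sequences have different lengths.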
