I have looked into [Transformer](https://github.com/unit8co/darts/blame/master/darts/models/forecasting/transformer_model.py) and found some errors.

First, in lines 167 and 170:
```python
src = self.encoder(src) * math.sqrt(self.input_size)
tgt = self.encoder(tgt) * math.sqrt(self.input_size)
```
I don't think we need to multiply the inputs (`src` or `tgt`) by `math.sqrt(self.input_size)`, because `torch.nn.MultiheadAttention` already takes care of this normalization.
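For reference, here is a minimal, self-contained sketch of the API in question (the dimensions are made up for illustration; this is not the darts code): `nn.MultiheadAttention` performs scaled dot-product attention internally, without the caller applying any scaling of their own.

```python
import torch
import torch.nn as nn

# Minimal sketch, not the darts code; dimensions are made up for illustration.
# nn.MultiheadAttention computes scaled dot-product attention internally
# (the attention scores are divided by sqrt(head_dim) inside the module).
d_model, n_heads, seq_len, batch = 16, 4, 10, 2
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=n_heads)

x = torch.randn(seq_len, batch, d_model)  # (seq, batch, features) layout
out, weights = attn(x, x, x)              # no manual sqrt(d_model) factor here
print(out.shape)      # torch.Size([10, 2, 16])
print(weights.shape)  # torch.Size([2, 10, 10])
```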
Second, in lines 173-174:
```python
x = self.transformer(src=src, tgt=tgt)
```
There is no `tgt_mask` in this call. To use teacher forcing at the training stage, the user must feed a `tgt_mask` to the forward function (specifically the square subsequent mask defined below). Otherwise, decoder inputs before time t can see future decoder inputs (e.g., t+1, t+2, ...), which do not exist at the inference stage.
This is `generate_square_subsequent_mask` from the PyTorch docs:

```python
@staticmethod
def generate_square_subsequent_mask(sz: int) -> Tensor:
    r"""Generate a square mask for the sequence. The masked positions are
    filled with float('-inf'). Unmasked positions are filled with float(0.0).
    """
    return torch.triu(torch.full((sz, sz), float('-inf')), diagonal=1)
```
I'm not sure whether these things are actual errors, but in my opinion the current implementation is not correct.
Thank you!