
distillation test breakage demonstration #4526

Closed

Conversation

@spencerp (Contributor) commented May 3, 2022

Minimum change required to break TestBartDistillation::test_narrow_distillation_losses.

@spencerp (Contributor, Author) commented May 3, 2022

The breakage is a result of changing the order of random operations (the ones that happen during module initialization) given a fixed random seed; a minimal standalone sketch of the effect follows the two snippets below. Thanks @EricMichaelSmith for helping debug!

Doesn't break test (ModuleList order different, initialization order preserved):

        self.norm2 = torch.nn.LayerNorm(embedding_size, eps=LAYER_NORM_EPS)
        self.norm3 = torch.nn.LayerNorm(embedding_size, eps=LAYER_NORM_EPS)

        encoder_attention = self.swappables.encoder_attention(
            opt=self.opt, n_heads=n_heads, dim=embedding_size, dropout=attention_dropout
        )  # type: ignore
        ffn = self.swappables.feedforward(
            opt=self.opt,
            dim=embedding_size,
            dim_hidden=ffn_size,
            relu_dropout=relu_dropout,
            activation=activation,
        )  # type: ignore

        self.ffn = ffn
        self.encoder_attention = encoder_attention

Breaks test (ModuleList order preserved, initialization order different):

        ffn = self.swappables.feedforward(
            opt=self.opt,
            dim=embedding_size,
            dim_hidden=ffn_size,
            relu_dropout=relu_dropout,
            activation=activation,
        )  # type: ignore
        encoder_attention = self.swappables.encoder_attention(
            opt=self.opt, n_heads=n_heads, dim=embedding_size, dropout=attention_dropout
        )  # type: ignore

        self.encoder_attention = encoder_attention
        self.ffn = ffn

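For reference, here is a minimal standalone sketch (not code from this PR) of why construction order matters under a fixed seed: every torch.nn module draws its initial parameters from the global RNG when it is constructed, so swapping which submodule is built first changes which random values each one receives, even when the attributes are assigned (and hence registered) in the same order. The build() helper and the nn.Linear stand-ins below are hypothetical placeholders for the swappable encoder_attention/feedforward modules.

    import torch
    from torch import nn


    def build(attention_first: bool) -> nn.Module:
        """Build a toy container with two submodules under a fixed seed."""
        torch.manual_seed(42)
        if attention_first:
            attn = nn.Linear(4, 4)  # stand-in for encoder_attention
            ffn = nn.Linear(4, 4)   # stand-in for feedforward
        else:
            ffn = nn.Linear(4, 4)
            attn = nn.Linear(4, 4)
        module = nn.Module()
        # Attribute (registration) order is identical in both branches.
        module.encoder_attention = attn
        module.ffn = ffn
        return module


    a = build(attention_first=True)
    b = build(attention_first=False)
    # Same seed and same attribute order, but different construction order,
    # so the parameter values each submodule received differ.
    print(torch.equal(a.ffn.weight, b.ffn.weight))  # False

This mirrors the snippets above: the second variant keeps the registration order but swaps which swappable consumes the first random draws, which is enough to change the initialized weights and break the fixed-seed distillation test.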
@spencerp closed this May 3, 2022
@spencerp deleted the dist-demo-not-to-merge branch May 3, 2022 19:13