When I tried to reproduce the RoBERTa experiment, I found that applying LoRA to the model's output layer, as described in the paper, triggers an error in the `input_hook` closure defined by `save_input_hook` in `kfac.py`:
```python
def input_hook(_module: nn.Module, pos_args: tuple[t.Tensor]) -> None:
    if not _hooks_enabled or _input_hooks_disabled:
        return
    # Select the first positional argument given to this layer (the input
    # activation), then the last token in the token sequence [:, -1]. `a`
    # should be a [batch, l_in] tensor.
    a: Float[Tensor, "batch l_in"] = pos_args[0].detach().clone()[:, -1]
    if has_bias:
        a = t.hstack((a, t.ones_like(a[:, :1])))
    assert a.dim() == 2
```
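Presumably the failure is a shape mismatch: RoBERTa's classification head receives a pooled 2-D `[batch, hidden]` activation rather than a 3-D `[batch, seq, hidden]` one, so `pos_args[0][:, -1]` drops a dimension and the subsequent indexing and `assert a.dim() == 2` fail. Below is a minimal sketch of a possible guard, assuming that diagnosis; `_hooks_enabled`, `_input_hooks_disabled`, and `has_bias` are closure variables from the surrounding `save_input_hook`, as in the snippet above. This is only an illustration of the shape issue, not the authors' fix:

```python
import torch as t
from torch import nn, Tensor


def input_hook(_module: nn.Module, pos_args: tuple[Tensor, ...]) -> None:
    if not _hooks_enabled or _input_hooks_disabled:
        return
    a = pos_args[0].detach().clone()
    # A classification head receives a pooled [batch, l_in] activation,
    # so only select the last token when a sequence dimension is present.
    if a.dim() == 3:
        a = a[:, -1]
    if has_bias:
        # Append a constant 1 column to absorb the bias term.
        a = t.hstack((a, t.ones_like(a[:, :1])))
    assert a.dim() == 2
```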
Moreover, the prior-variance hyperparameter used for RoBERTa is not reported. Could you release the code used for the paper's RoBERTa experiments?