Hyperparameters for Reproducing Evaluation Results #6

Open
StevenLau6 opened this issue Sep 30, 2022 · 0 comments

Hi @luyang-huang96, thanks so much for posting the code.
Tables 3 and 4 in your paper show that the encoder variants (SINKHORN and LSH) bring strong performance gains.

  1. To reproduce these results, I would like to know whether you use hybrid attention in the encoder (i.e., how you set the input parameter encoder_not_hybrid).

  2. If you do use hybrid attention in the encoder (encoder_not_hybrid is False), I would like to know how you set args.sw, args.encoder_linear, and args.encoder_kernel_linear.
     If I instead use SINKHORN for all encoder layers (encoder_not_hybrid is True), my results show its performance cannot compete with LED/BigBird when using inputs of the same length.

elif args.sinkhorn:
    if args.encoder_not_hybrid:
        self.layers.extend(
            [self.build_sinkhorn_layer(args, self.padding_idx)
             for i in range(args.encoder_layers)]
        )
    else:
        self.layers.extend(
            [self.build_sparse_encoder_layer(args, self._window[i], self.padding_idx)
             if i % 2 == 0
             else self.build_sinkhorn_layer(args, self.padding_idx)
             for i in range(args.encoder_layers)]
        )
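To make sure I am reading the branching correctly: the snippet above appears to alternate sparse (sliding-window) layers and Sinkhorn layers when encoder_not_hybrid is False, and to use Sinkhorn everywhere otherwise. A minimal self-contained sketch of that layer schedule (the function name and string labels here are illustrative only; the real code builds fairseq encoder layer objects):

```python
def hybrid_layer_schedule(encoder_layers: int, encoder_not_hybrid: bool) -> list[str]:
    """Return the per-layer attention type implied by the branching above."""
    if encoder_not_hybrid:
        # All layers use Sinkhorn attention.
        return ["sinkhorn"] * encoder_layers
    # Even-indexed layers use sparse (sliding-window) attention,
    # odd-indexed layers use Sinkhorn attention.
    return ["sparse" if i % 2 == 0 else "sinkhorn"
            for i in range(encoder_layers)]

print(hybrid_layer_schedule(6, False))
# ['sparse', 'sinkhorn', 'sparse', 'sinkhorn', 'sparse', 'sinkhorn']
```

Is this alternating schedule the configuration used for the reported results, or did you use a different mix of layer types?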

@StevenLau6 StevenLau6 changed the title Hyperparameter for Reproducing Evaluation Results Hyperparameters for Reproducing Evaluation Results Sep 30, 2022