[BUG] Dimension error within SwitchGate #5

Closed
Masaaki-75 opened this issue Jun 9, 2024 · 2 comments
Labels: bug, no-issue-activity

Masaaki-75 commented Jun 9, 2024

Describe the bug
Shape mismatch is found in the computation of auxiliary loss values:

load = gate_scores.sum(0) # Sum over all examples
importance = gate_scores.sum(1) # Sum over all experts
# Aux loss is mean squared difference between load and importance
loss = ((load - importance) ** 2).mean()

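where load has shape [seq_len, num_experts] and importance has shape [batch_size, num_experts], since gate_scores is 3-D ([batch_size, seq_len, num_experts]) for a batched sequence input. The two shapes cannot be broadcast against each other unless batch_size is 1 or equals seq_len. Testing the SwitchGate class on its own with an input of batch_size > 1 raises an error like this: RuntimeError: The size of tensor a (64) must match the size of tensor b (2) at non-singleton dimension 0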
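The mismatch can be seen without the gate itself. The following is a minimal sketch that uses a random tensor as a stand-in for gate_scores; the shape [batch_size, seq_len, num_experts] is an assumption inferred from the error message and the reproduction input, not taken from the SwitchGate source.

```python
import torch

# Hypothetical stand-in for the gate output. The shape
# [batch_size=2, seq_len=64, num_experts=3] is an assumption.
gate_scores = torch.rand(2, 64, 3)

load = gate_scores.sum(0)        # removes the batch axis    -> [64, 3]
importance = gate_scores.sum(1)  # removes the sequence axis -> [2, 3]

try:
    # Same expression as in the report; 64 and 2 cannot be broadcast.
    loss = ((load - importance) ** 2).mean()
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(tuple(load.shape), tuple(importance.shape))
print(error_message)
```

With batch_size = 1 the subtraction would broadcast silently instead of failing, which may explain why the problem only shows up for larger batches.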

To Reproduce
Simply run a sample with batch_size > 1:

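import torch  # SwitchGate is assumed to be imported from this repository

gate = SwitchGate(dim=16, num_experts=3)
x = torch.randn((2, 64, 16)).float()  # (batch_size=2, seq_len=64, dim=16)
y, loss = gate(x, use_aux_loss=True)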
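No fix was posted in this thread. For readers who need a shape-consistent auxiliary loss in the meantime, one option is to reduce both quantities to a per-expert vector, as in the load-balancing loss N * sum_i(f_i * P_i) described in the Switch Transformer paper. The sketch below is an illustration under that assumption, not the repository's implementation; the function name balance_loss and its input convention (softmax router probabilities with experts on the last axis) are hypothetical.

```python
import torch
import torch.nn.functional as F


def balance_loss(router_probs: torch.Tensor) -> torch.Tensor:
    """Top-1 load-balancing loss: num_experts * sum_i(f_i * P_i).

    router_probs: softmax router output of shape [..., num_experts].
    This helper is hypothetical and not part of the SwitchGate class.
    """
    num_experts = router_probs.shape[-1]
    # Merge all leading axes (batch, sequence) into one token axis.
    probs = router_probs.reshape(-1, num_experts)   # [tokens, num_experts]
    assignment = probs.argmax(dim=-1)               # top-1 expert per token
    # f_i: fraction of tokens routed to expert i (carries no gradient).
    load = F.one_hot(assignment, num_classes=num_experts).float().mean(dim=0)
    # P_i: mean router probability assigned to expert i.
    importance = probs.mean(dim=0)
    # Both vectors have shape [num_experts], so the product is well defined.
    return num_experts * (load * importance).sum()


logits = torch.randn(2, 64, 3)  # (batch_size, seq_len, num_experts)
loss = balance_loss(F.softmax(logits, dim=-1))
print(loss.shape, loss.item())
```

Because both terms are reduced over every token before they are combined, the result is a scalar for any batch size. The scaling coefficient usually applied to this loss is left out here.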

Masaaki-75 added the bug label on Jun 9, 2024

github-actions bot commented Jun 9, 2024

Hello there, thank you for opening an issue! 🙏🏻 The team was notified and they will get back to you asap.

github-actions bot commented Sep 7, 2024

Stale issue message

github-actions bot closed this as not planned on Sep 14, 2024