problem with Adafactor #469

Closed
pavelbatyr opened this issue Oct 7, 2022 · 2 comments

Comments

@pavelbatyr

Adafactor fails with a shape mismatch in `_approx_sq_grad` when the optimizer step runs (here via `scaler.step(optimizer)`):

  File "[...]/utils.py", line 835, in train_fn
    scaler.step(optimizer)
  File "/home/jovyan/.imgenv-batyr-hpo-0/lib/python3.7/site-packages/torch/cuda/amp/grad_scaler.py", line 338, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
  File "/home/jovyan/.imgenv-batyr-hpo-0/lib/python3.7/site-packages/torch/cuda/amp/grad_scaler.py", line 285, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
  File "/home/jovyan/.imgenv-batyr-hpo-0/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/jovyan/.imgenv-batyr-hpo-0/lib/python3.7/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/user/conda/lib/python3.7/site-packages/torch_optimizer/adafactor.py", line 192, in step
    exp_avg_sq_row, exp_avg_sq_col, update
  File "/home/user/conda/lib/python3.7/site-packages/torch_optimizer/adafactor.py", line 116, in _approx_sq_grad
    (exp_avg_sq_row / exp_avg_sq_row.mean(dim=-1))
RuntimeError: The size of tensor a (7) must match the size of tensor b (3) at non-singleton dimension 2

@pavelbatyr
Author

Oh, this is the same as #405.

@ionutmodo

This was solved in #405.
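
For anyone landing here from a search: the shape mismatch in the traceback can be reproduced in isolation. The sketch below is not the library's code and does not claim to show what #405 changed. It only mirrors the one expression visible in the traceback, `exp_avg_sq_row / exp_avg_sq_row.mean(dim=-1)`. The gradient shape `(64, 3, 7, 7)` and the helper `row_factor` are assumptions, picked because they give the same sizes (7 vs 3 at dimension 2) as the reported error. Passing `keepdim=True` is one plausible way to make the division broadcast correctly.

```python
import torch

# Hypothetical gradient shape (assumption): a 4-D weight such as a 7x7 conv
# over 3 input channels. The thread does not state the actual shape.
grad = torch.randn(64, 3, 7, 7)

# Row statistic of the squared gradient: reduce over the last dimension.
exp_avg_sq_row = grad.pow(2).mean(dim=-1)  # shape (64, 3, 7)


def row_factor(row: torch.Tensor, keepdim: bool) -> torch.Tensor:
    """Hypothetical helper mirroring the expression from the traceback."""
    return row / row.mean(dim=-1, keepdim=keepdim)


# Without keepdim the mean has shape (64, 3). Broadcasting aligns trailing
# dimensions, so (64, 3, 7) is matched against (1, 64, 3) and fails.
try:
    row_factor(exp_avg_sq_row, keepdim=False)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# With keepdim=True the mean has shape (64, 3, 1) and broadcasts cleanly.
fixed = row_factor(exp_avg_sq_row, keepdim=True)
print(fixed.shape)

# A 2-D parameter hides the problem: the row statistic is 1-D, its mean is
# a scalar, and a scalar broadcasts against anything.
ok_2d = row_factor(torch.rand(5), keepdim=False)
print(ok_2d.shape)
```

This also suggests why the error only shows up for parameters with more than two dimensions, such as convolution weights, while plain linear layers train without complaint.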
