
Fix starspace #5003

Merged
merged 2 commits into main from fix_starspace on Apr 11, 2023

Conversation

klshuster (Contributor) commented:

Patch description
Training the current starspace models fails with an in-place Variable modification error. This stems from a known issue with using max_norm in nn.Embedding: pytorch/pytorch#26596

I have been unable to track down the root cause (it has something to do with accessing the embedding weights directly?), but simply removing max_norm allows us to pass tests.
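
For context, here is a minimal standalone sketch of how max_norm can trigger this error (illustrative only, not from this codebase; emb and proj are made-up names). With max_norm set, the embedding's forward pass renormalizes the weight tensor in place, which invalidates the weight saved by any earlier differentiable op:

    import torch
    import torch.nn as nn

    emb = nn.Embedding(10, 4, max_norm=1.0)
    proj = torch.randn(4, 4, requires_grad=True)

    a = emb.weight @ proj       # matmul saves emb.weight for backward
    b = emb(torch.tensor([1]))  # forward renormalizes emb.weight in place
    (a.sum() + b.sum()).backward()
    # RuntimeError: one of the variables needed for gradient computation
    # has been modified by an inplace operation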

Testing steps
To confirm that the issue was in the embedding, I wrapped the forward call with

    with torch.autograd.set_detect_anomaly(True):
        xe, ye = self.model(xs, ys, negs)

which produces this traceback:

  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/starspace.py", line 415, in predict
    xe, ye = self.model(xs, ys, negs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/modules.py", line 54, in forward
    c_emb = self.encoder2(c)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/ParlAI/parlai/agents/starspace/modules.py", line 75, in forward
    xs_emb = self.lt(xs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
  File "/private/home/kshuster/.conda/envs/parlai_py39_pyt113/lib/python3.9/site-packages/torch/fx/traceback.py", line 57, in format_stack
    return traceback.format_stack()
 (Triggered internally at /opt/conda/conda-bld/pytorch_1666643016022/work/torch/csrc/autograd/python_anomaly_mode.cpp:114.)

You can see that the error originates from xs_emb = self.lt(xs).

Interestingly, the error seems to come about only when embedding the candidate vectors -- NOT the input -- so there might be something going on there too.

Anyway, I believe tests pass after this change.

cc @jaseweston for whether max_norm is required
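
If max_norm does turn out to be necessary, the nn.Embedding docs suggest cloning the weight before any direct differentiable use, so the in-place renorm no longer touches a saved tensor. A hypothetical sketch, reusing the names from the repro above:

    a = emb.weight.clone() @ proj  # matmul saves the clone, not the live weight
    b = emb(torch.tensor([1]))     # in-place renorm no longer breaks backward
    (a.sum() + b.sum()).backward() # succeeds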

@mojtaba-komeili (Contributor) left a comment:


Let's merge it for now to pass the tests.

@mojtaba-komeili merged commit 49ecfa9 into main on Apr 11, 2023
7 of 8 checks passed
@mojtaba-komeili deleted the fix_starspace branch on April 11, 2023 at 15:16