How to configure BERT TokenEmbedder so that it gets gradients? #3756

Closed · aagohary opened this issue Feb 10, 2020 · 6 comments

aagohary commented Feb 10, 2020

I see that the ELMo token embedder has an argument called requires_grad, but that is not the case for BERT (PretrainedTransformerEmbedder).
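For context, the ELMo embedder exposes that flag directly in the config; a minimal sketch looks roughly like this (the file paths are placeholders, not real URLs):

token_embedders: {
  elmo: {
    type: 'elmo_token_embedder',
    // placeholder paths: point these at real ELMo options/weights files
    options_file: '<elmo-options.json>',
    weight_file: '<elmo-weights.hdf5>',
    // the flag in question: unfreezes the ELMo weights so they get gradients
    requires_grad: true
  }
}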

@aagohary aagohary changed the title How to configure BERT TokenEmbedder so that it get gradients? How to configure BERT TokenEmbedder so that it gets gradients? Feb 10, 2020
matt-gardner (Member) commented Feb 10, 2020

Are you saying that it's not getting updated? What exactly are you trying to do that isn't working?

aagohary (Author) commented Feb 10, 2020

I am trying to use allennlp-interpret and I get an empty "instance_1" output (no grad_input_1). I guess it has to do with the gradients at the embedding layer. I get proper outputs with a GloVe-based embedder as long as I set trainable to true.
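For reference, a GloVe-style embedder that propagates gradients would be configured roughly along these lines (the pretrained_file path and embedding_dim are illustrative):

token_embedders: {
  tokens: {
    type: 'embedding',
    embedding_dim: 100,  // must match the dimension of the GloVe vectors used
    pretrained_file: '<path-to-glove-vectors.txt.gz>',  // placeholder path
    trainable: true  // lets gradients update the embedding weights
  }
}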

matt-gardner (Member) commented Feb 10, 2020

Can you give more details about what you're doing? In particular, it'd be good if you could replicate what we do in the demo (which uses PretrainedTransformerEmbedder) in your setup, to help isolate the issue. Also see this issue, which has some more info: #3679.
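Roughly, a pretrained_transformer setup looks like the following (the model name is just an example, and the exact demo config may differ); the token indexer and token embedder both need to use the transformer:

token_indexers: {
  tokens: {
    type: 'pretrained_transformer',
    model_name: 'bert-base-uncased'  // example model name
  }
},
token_embedders: {
  tokens: {
    type: 'pretrained_transformer',
    model_name: 'bert-base-uncased'
  }
}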

aagohary (Author) commented Feb 10, 2020

Thanks. I just added requires_grad: true to the token_embedders config:

token_embedders: {
  bert: {
    type: 'bert-pretrained',
    pretrained_model: 'bert-base-uncased',
    requires_grad: true
  }
}

and that solved the problem with allennlp-interpret. I was confused because I did not see requires_grad as a parameter in PretrainedTransformerEmbedder.

@aagohary aagohary closed this Feb 10, 2020
matt-gardner (Member) commented Feb 10, 2020

I was confused about why you needed to pass that in, but it seems the default for that parameter is False. Oh! You're using bert-pretrained instead of pretrained_transformer as your embedder type. That's the problem: the default for bert-pretrained may have been False, but that's not true for pretrained_transformer. The bert-pretrained embedder is also going away soon (it's already gone from master).
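Something along these lines should work with pretrained_transformer (the key name and model are only examples); no requires_grad flag is needed there:

token_embedders: {
  bert: {
    type: 'pretrained_transformer',
    model_name: 'bert-base-uncased'
    // no requires_grad needed; these weights are not frozen by default
  }
}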

aagohary (Author) commented Feb 10, 2020

I see. Thanks.
