Adversarial/amnesic heads #9992

Open
eritain opened this issue Feb 4, 2021 · 1 comment
Labels: Feature request

eritain commented Feb 4, 2021

🚀 Feature request

Task heads that backpropagate deliberately reversed gradients to the encoder, along with a flag requesting this behavior when constructing a task head.

Motivation

Transfer learning experiments lend themselves to questions about the extent to which two tasks rely on the same information about a word or sentence, and to experiments probing whether and how word encodings contain or correspond to syntax trees, lemmas, frequencies, and other objects of linguistic and psycholinguistic study.

A difficulty is that a pretrained model, without fine-tuning, may already encode certain information too thoroughly and accessibly for intermediate training to make much of a difference. For example, BERT's masked language modeling objective produces word encodings in which syntax information is readily accessible. Intermediate training on a syntax task requires training a task head to extract this information, of course, but it will result in very little reorganization of the encoder itself.

Adversarial training, such as the amnesic probing of Elazar et al. (2020), can avoid this pitfall: intermediate training can aim to burn particular information out of the encodings, and then measure how much this impairs trainability of the target task. Strictly reversing the sense of the training data won't do it, though; getting all the answers exactly wrong requires just as much domain knowledge as getting them all right. And randomizing the labels on the training data may just produce a feckless task head, one that discards useful information passed to it from the encoder, rather than affecting the encoder itself.

Ideally, then, the task head would be trained toward correctly reproducing gold-standard labels, but would flip all its gradients before backpropagating them to the shared encoder, thereby training it not to produce precisely the signals the task head found most informative. The following work by Cory Shain illustrates flipping gradients in this way (applied there not to shared-encoder transfer learning, but to developing encoders that disentangle semantics from syntax):

https://docs.google.com/presentation/d/1E89yZ8jXXeSARDLmlksOCJo83QZdNbd7phBrR_dRogg/edit#slide=id.g79452223cd_0_19
https://github.com/coryshain/synsemnet
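
For concreteness, here is a minimal sketch of what the gradient flip could look like in PyTorch, using a gradient reversal function in the style of domain-adversarial training. The `AmnesicClassificationHead` name and its interface are hypothetical illustrations, not an existing transformers API:

```python
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; multiplies the gradient by -lambd on the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip (and optionally scale) the gradient flowing back to the encoder;
        # the second return value is the gradient w.r.t. lambd (none needed).
        return -ctx.lambd * grad_output, None


class AmnesicClassificationHead(nn.Module):
    """Ordinary linear classification head, except the encoder behind it sees reversed gradients."""

    def __init__(self, hidden_size, num_labels, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):
        # The head is trained toward the gold labels as usual; only the signal
        # backpropagated into the shared encoder is negated.
        reversed_states = GradientReversal.apply(hidden_states, self.lambd)
        return self.classifier(reversed_states)
```

Training the head on gold labels through such a layer pushes the shared encoder away from exposing exactly the signals the head finds most useful, which is the behavior the proposed flag would request.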

Your contribution

I am deeply unfamiliar with PyTorch, unfortunately, and utterly ignorant of TensorFlow, so I can't offer much.

LysandreJik added the Feature request label Feb 4, 2021

LysandreJik (Member) commented:
Interesting thread, thank you for posting it! You could also post it on the forums to reach more users!
