This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Setting requires_grad = True for optimizer parameters #5106

Open
nelson-liu opened this issue Apr 9, 2021 · 4 comments

Comments

@nelson-liu
Contributor

Is your feature request related to a problem? Please describe.

I'd like the ability to set requires_grad=True in the optimizer parameter groups. For instance:

    ...
    "text_field_embedder": {
      "token_embedders": {
        "tokens": {
          "type": "pretrained_transformer",
          "model_name": transformer_model,
          "max_length": 512,
          "train_parameters": false,
        }
      }
    },
    ....
    "optimizer": {
       ...
      "parameter_groups": [
        # This turns on grad for the attention query bias vectors and the intermediate MLP bias vectors.
        # Since we set train_parameters to false in the token_embedder, these are the only weights that will be updated
        # in the token_embedder.
        [["^_text_field_embedder.token_embedder_tokens.transformer_model.*attention.self.query.bias$"], {"requires_grad": true}],
        [["^_text_field_embedder.token_embedder_tokens.transformer_model.*intermediate.dense.bias$"], {"requires_grad": true}]
      ]
    },

In this config, I set the token embedder's train_parameters to false, so its parameters are not trainable. However, I want to train a subset of those parameters (selected by the regexes). The intended outcome is that the token embedder's parameters are frozen overall (since train_parameters = false), except for the subset matched by the regexes, which remains trainable.

The current behavior is that these groups are simply ignored: the non-trainable parameters are never passed to the optimizer in the first place, so the regexes match nothing, and accordingly those parameters can't have their requires_grad value changed.

(I realize I could do this by setting train_parameters = true and then writing a regex that selects all the parameters not matched by the regexes above, setting {"requires_grad": false} on those. However, that regex would be borderline unmaintainable and certainly not very readable.)
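The freeze-then-selectively-unfreeze behavior being requested can be sketched in plain PyTorch (this is not AllenNLP's internal implementation; the model and regex patterns are illustrative stand-ins):

```python
import re
import torch
from torch import nn

# A toy two-layer model standing in for the pretrained transformer embedder.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Step 1: freeze everything (analogous to train_parameters = false).
for p in model.parameters():
    p.requires_grad = False

# Step 2: selectively unfreeze parameters whose names match a regex
# (analogous to the proposed [regex, {"requires_grad": true}] groups).
patterns = [r"^0\.bias$", r"^1\.bias$"]
for name, p in model.named_parameters():
    if any(re.search(pat, name) for pat in patterns):
        p.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the bias parameters remain trainable
```

The point of the feature request is that AllenNLP's parameter_groups can't express step 2 today, because frozen parameters never reach the optimizer where the regexes are applied.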

@epwalsh
Member

epwalsh commented Apr 9, 2021

Also see #5073 (comment)

@epwalsh epwalsh self-assigned this Apr 9, 2021
@dirkgr
Member

dirkgr commented Apr 9, 2021

@nelson-liu, can we fix this by passing everything to the optimizer, but making sure that requires_grad is set properly for all of them? I don't like that we have two different ways of preventing parameters from being updated. We should have just one.
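A minimal sketch of this single-mechanism approach (assuming plain PyTorch semantics, not AllenNLP internals): hand all parameters to the optimizer and rely solely on requires_grad. A frozen parameter accumulates no gradient, so optimizer.step() leaves it untouched:

```python
import torch
from torch import nn

# Toy model: freeze the weight, leave the bias trainable.
model = nn.Linear(3, 1)
model.weight.requires_grad = False
before = model.weight.clone()

# Pass *every* parameter to the optimizer, frozen or not.
opt = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.ones(2, 3)).sum()
loss.backward()
opt.step()

# The frozen weight never received a gradient, so it is unchanged;
# the bias did receive one and was updated.
assert torch.equal(model.weight, before)
assert model.bias.grad is not None
```

Since the optimizer skips parameters whose .grad is None, this makes requires_grad the single source of truth for what gets updated, which is the unification being proposed.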

@github-actions

@epwalsh this is just a friendly ping to make sure you haven't forgotten about this issue 😜

@epwalsh epwalsh removed their assignment Apr 26, 2021
@github-actions

github-actions bot commented May 5, 2021

This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇
