
Enable other tunable parameters for create_optimizer in optimization.py #10143

Closed · 1 task done
01-vyom opened this issue Jul 18, 2021 · 4 comments
Labels: models:official (models that come under the official repository), type:feature

Comments


01-vyom commented Jul 18, 2021

Prerequisites

  • I checked to make sure that this feature has not been requested already.

1. The entire URL of the file you are using

# From optimization.py in tensorflow/models. The module-level imports it
# relies on are:
#   from absl import logging
#   import tensorflow as tf
#   import tensorflow_addons.optimizers as tfa_optimizers
# WarmUp and AdamWeightDecay are defined earlier in the same module.
def create_optimizer(init_lr,
                     num_train_steps,
                     num_warmup_steps,
                     end_lr=0.0,
                     optimizer_type='adamw',
                     beta_1=0.9):
  """Creates an optimizer with learning rate schedule."""
  # Implements linear decay of the learning rate.
  lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
      initial_learning_rate=init_lr,
      decay_steps=num_train_steps,
      end_learning_rate=end_lr)
  if num_warmup_steps:
    # Optionally wrap the decay schedule with a linear warmup phase.
    lr_schedule = WarmUp(
        initial_learning_rate=init_lr,
        decay_schedule_fn=lr_schedule,
        warmup_steps=num_warmup_steps)
  if optimizer_type == 'adamw':
    logging.info('using Adamw optimizer')
    optimizer = AdamWeightDecay(
        learning_rate=lr_schedule,
        weight_decay_rate=0.01,
        beta_1=beta_1,
        beta_2=0.999,
        epsilon=1e-6,
        exclude_from_weight_decay=['LayerNorm', 'layer_norm', 'bias'])
  elif optimizer_type == 'lamb':
    logging.info('using Lamb optimizer')
    optimizer = tfa_optimizers.LAMB(
        learning_rate=lr_schedule,
        weight_decay_rate=0.01,
        beta_1=beta_1,
        beta_2=0.999,
        epsilon=1e-6,
        exclude_from_weight_decay=['LayerNorm', 'layer_norm', 'bias'])
  else:
    raise ValueError('Unsupported optimizer type: ', optimizer_type)
  return optimizer

2. Describe the feature you request

Similar to beta_1, which is already a tunable parameter for creating optimizers, we could also expose beta_2, epsilon, weight_decay_rate, and exclude_from_weight_decay as tunable parameters by passing them as arguments to create_optimizer, as sketched below.
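
For illustration, a minimal sketch of what that could look like. The defaults mirror the values currently hard-coded inside the function; the names and defaults here are a suggestion only, not a final API:

# Hypothetical extension of create_optimizer; every new argument simply
# replaces a value that is currently hard-coded inside the function.
def create_optimizer(init_lr,
                     num_train_steps,
                     num_warmup_steps,
                     end_lr=0.0,
                     optimizer_type='adamw',
                     beta_1=0.9,
                     beta_2=0.999,
                     epsilon=1e-6,
                     weight_decay_rate=0.01,
                     exclude_from_weight_decay=('LayerNorm', 'layer_norm', 'bias')):
  """Creates an optimizer with a learning rate schedule and tunable hyperparameters."""
  lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
      initial_learning_rate=init_lr,
      decay_steps=num_train_steps,
      end_learning_rate=end_lr)
  if num_warmup_steps:
    lr_schedule = WarmUp(
        initial_learning_rate=init_lr,
        decay_schedule_fn=lr_schedule,
        warmup_steps=num_warmup_steps)
  # Hyperparameters shared by both optimizer types, now taken from arguments.
  common_kwargs = dict(
      learning_rate=lr_schedule,
      weight_decay_rate=weight_decay_rate,
      beta_1=beta_1,
      beta_2=beta_2,
      epsilon=epsilon,
      exclude_from_weight_decay=list(exclude_from_weight_decay))
  if optimizer_type == 'adamw':
    logging.info('using Adamw optimizer')
    return AdamWeightDecay(**common_kwargs)
  elif optimizer_type == 'lamb':
    logging.info('using Lamb optimizer')
    return tfa_optimizers.LAMB(**common_kwargs)
  raise ValueError(f'Unsupported optimizer type: {optimizer_type}')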

3. Additional context

I was recently trying to fine-tune a Hugging Face RoBERTa model. While doing so, I wanted to add a scheduler as well as AdamW with custom parameters, which is how I came across these methods.

4. Are you willing to contribute it? (Yes or No)

Yes

saberkun (Member) commented:

Hi, we are using the configurable optimization package now: https://github.com/tensorflow/models/tree/master/official/modeling/optimization
All parameters should be configurable there. It is not used in the legacy bert/ example.
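
A rough sketch of the factory-based usage from that package. The nested config keys and the example values below are assumptions drawn from the linked package and may differ between versions, so check the linked code for the current schema:

from official.modeling import optimization

# Illustrative configuration: optimizer, learning rate schedule, and warmup
# are each selected by a 'type' key with a matching sub-config.
opt_config = optimization.OptimizationConfig({
    'optimizer': {
        'type': 'adamw',
        'adamw': {
            'weight_decay_rate': 0.01,
            'beta_1': 0.9,
            'beta_2': 0.999,
            'epsilon': 1e-6,
            'exclude_from_weight_decay': ['LayerNorm', 'layer_norm', 'bias'],
        },
    },
    'learning_rate': {
        'type': 'polynomial',
        'polynomial': {
            'initial_learning_rate': 3e-5,
            'decay_steps': 10000,
            'end_learning_rate': 0.0,
        },
    },
    'warmup': {
        'type': 'linear',
        'linear': {'warmup_steps': 1000},
    },
})

# Build the learning rate schedule and the optimizer from the config.
opt_factory = optimization.OptimizerFactory(opt_config)
optimizer = opt_factory.build_optimizer(opt_factory.build_learning_rate())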


01-vyom commented Jul 21, 2021

Nice!! I was following this guide for fine-tuning TF BERT, and the official NLP optimizer was used to set up the optimizer there. So, if this configurable optimization is ready, where can I find some examples of it? Or could the above-mentioned fine-tuning example be changed to use the configurable optimization? Let me know what you think.

saberkun (Member) commented:

Here is the unit test: https://github.com/tensorflow/models/blob/master/official/modeling/optimization/optimizer_factory_test.py
Here is a use case: https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L67

The BERT tutorial was limited to a particular use case as a demo. Yes, we will consider updating the Colab, which is a bit old already.


01-vyom commented Jul 22, 2021

Thank you for the resources!! Looking forward to the update in the Colab example. Let me know if I can contribute in any way.

This issue can be closed.

01-vyom closed this as completed on Jul 23, 2021