Difficult to understand the behavior of lr_scheduler when using gradient_accumulation #1160

Closed
kingnobro opened this issue Mar 7, 2023 · 0 comments · Fixed by #1187

kingnobro commented Mar 7, 2023

System Info

- `Accelerate` version: 0.16.0
- Platform: Linux-5.4.0-91-generic-x86_64-with-glibc2.31
- Python version: 3.9.15
- Numpy version: 1.23.5
- PyTorch version (GPU?): 1.12.1 (True)
- `Accelerate` default config:
        Not found

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

Add the following code after this line: link

if accelerator.is_local_main_process:
    accelerator.print('step', step, 'lr', lr_scheduler.get_last_lr()[0])

I ran gradient_accumulation.py with the following command from the documentation.

accelerate launch ./gradient_accumulation.py --gradient_accumulation_steps 5
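
For reference, here is a self-contained sketch of the kind of loop the example runs, with the logging line in place. The toy model, data, and hyperparameter values below are placeholders of mine, not the contents of the example script:

import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator
from transformers import get_linear_schedule_with_warmup

gradient_accumulation_steps = 5
num_epochs = 3

accelerator = Accelerator(gradient_accumulation_steps=gradient_accumulation_steps)

# Toy stand-ins for the model and dataset used by the example script.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
dataset = TensorDataset(torch.randn(64, 4), torch.randn(64, 2))
train_dataloader = DataLoader(dataset, batch_size=8)

# The schedule is sized by the number of batches, as in the example.
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,
    num_training_steps=len(train_dataloader) * num_epochs,
)

model, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, train_dataloader, lr_scheduler
)

for epoch in range(num_epochs):
    for step, batch in enumerate(train_dataloader):
        with accelerator.accumulate(model):
            inputs, targets = batch
            loss = torch.nn.functional.mse_loss(model(inputs), targets)
            accelerator.backward(loss)
            optimizer.step()
            lr_scheduler.step()
            optimizer.zero_grad()
        # the logging line from the reproduction above
        accelerator.print('step', step, 'lr', lr_scheduler.get_last_lr()[0])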

Expected behavior

Expected: the learning rate decays to 0 by the end of training.

However, the logged learning rate does not reach 0:

Epoch 3
step 222 lr 1.8779661016949152e-05
step 223 lr 1.8779661016949152e-05
step 224 lr 1.8745762711864407e-05
step 225 lr 1.8745762711864407e-05
step 226 lr 1.8745762711864407e-05
step 227 lr 1.8745762711864407e-05
step 228 lr 1.8745762711864407e-05
step 229 lr 1.8711864406779663e-05  # the last line of the log
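
Looking at the log, the learning rate only changes once every 5 batches, which suggests the scheduler prepared by accelerator.prepare is only stepped on iterations where the optimizer actually updates (i.e. once per gradient_accumulation_steps batches), while the schedule itself is sized for every batch, so it never runs down to 0. If that reading is correct (I am not sure it is), one possible workaround would be to size the schedule by the number of optimizer updates instead of the number of batches:

# Possible workaround, assuming the scheduler only steps on optimizer-update
# iterations. `train_dataloader`, `num_epochs`, `gradient_accumulation_steps`,
# and `optimizer` are the variables already defined in the example script;
# warmup is omitted here for simplicity.
import math
from transformers import get_linear_schedule_with_warmup

num_update_steps_per_epoch = math.ceil(len(train_dataloader) / gradient_accumulation_steps)

lr_scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,
    num_training_steps=num_update_steps_per_epoch * num_epochs,
)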