
Ranger21 has undocumented required arguments #214

Closed
Vectorrent opened this issue Nov 22, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@Vectorrent (Contributor)

Describe the bug

If you try to use the Ranger21 optimizer (with default settings), you'll get an error:

  File "/src/aigen/aigen/train.py", line 145, in select_optimizer
    optimizer = Ranger21(
TypeError: Ranger21.__init__() missing 1 required positional argument: 'num_iterations'

"num_iterations" is a required positional argument that is not documented anywhere.
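For anyone hitting the same error: `num_iterations` turns out to be the total number of optimizer steps in the run (see the maintainer's reply below in the thread). A minimal sketch of computing it, assuming a typical epoch-based loop — the variable names here are illustrative, not from the issue:

```python
import math

# Hypothetical training setup (illustrative values, not from the issue).
dataset_size = 10_000
batch_size = 32
num_epochs = 3

# num_iterations is the total number of optimizer steps in the run:
# steps per epoch (rounded up for the final partial batch) times epochs.
steps_per_epoch = math.ceil(dataset_size / batch_size)
num_iterations = steps_per_epoch * num_epochs

print(num_iterations)  # 939

# The value would then be passed explicitly when constructing the optimizer,
# e.g. (sketch, assuming the pytorch-optimizer Ranger21 API):
# optimizer = Ranger21(model.parameters(), num_iterations=num_iterations, lr=1e-3)
```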

To Reproduce

  • transformers version: 4.35.2
  • Platform: Linux-6.5.9-arch2-1-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.19.4
  • Safetensors version: 0.4.0
  • Accelerate version: 0.24.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.1.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: True
  • Using distributed or parallel set-up in script?: False

Expected behavior

I would expect num_iterations to be clearly documented, and to have a default value.

Additional context

I was originally going to fix this myself and open a pull request. However, looking into the Ranger21 code, I don't actually know how "num_iterations" was intended to be used. There are two other undocumented arguments: "num_warm_up_iterations" and "num_warm_down_iterations". I don't understand why this wasn't left to the scheduler, and indeed, whether I set num_iterations to 1 or to my run's total step count, the model does not learn at all: loss remains flat and weights do not update.

This is not a major issue and I'm going to use a different optimizer for now. Just wanted to make sure maintainers were aware.

@Vectorrent Vectorrent added the bug Something isn't working label Nov 22, 2023
@kozistr (Owner)

kozistr commented Nov 30, 2023

Thanks for reporting. I missed it :(

num_iterations is the total number of training steps!

> how "num_iterations", "num_warm_up_iterations", "num_warm_down_iterations" was intended to be used?

It's because the Ranger21 optimizer schedules the learning rate with its own recipes (internal schedulers). Here's the Ranger21 source code.

Ranger21 performs a linear LR warmup followed by an explore-exploit LR schedule.
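To illustrate the shape of such a schedule, here's a rough self-contained sketch of a linear warmup / hold / linear warm-down curve. This is an assumption about the general shape only, not Ranger21's actual implementation — the function name, signature, and formulas here are all hypothetical:

```python
def scheduled_lr(step, base_lr, num_iterations, num_warm_up, num_warm_down, min_lr=0.0):
    """Sketch of a warmup / explore / warm-down LR schedule.

    Illustrative only: this approximates the shape of Ranger21's internal
    schedule as described in the thread, not its exact code.
    """
    if step < num_warm_up:
        # Linear warmup: ramp from near zero up to base_lr.
        return base_lr * (step + 1) / num_warm_up
    warm_down_start = num_iterations - num_warm_down
    if step >= warm_down_start:
        # Linear warm-down: decay from base_lr toward min_lr.
        progress = (step - warm_down_start) / num_warm_down
        return base_lr + (min_lr - base_lr) * progress
    # Exploration phase: hold the base learning rate.
    return base_lr

# Example: 100 total steps, 10 warmup, 20 warm-down.
lrs = [scheduled_lr(s, 1e-3, 100, 10, 20) for s in range(100)]
```

This makes clear why num_iterations must be known up front: the warm-down phase is anchored to the end of training.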

> I don't understand why this wasn't left to the scheduler

I agree with you. It'd be better to be able to use whatever LR scheduler we want, rather than one hard-coded into the optimizer.

When I have time, I'll try to refactor out the internal LR schedulers and reimplement them as PyTorch-compatible LR scheduler classes or something similar!

Anyway, thanks for the report and the suggestion.

@kozistr (Owner)

kozistr commented Nov 30, 2023

Here are the previous Ranger21-related issues.

Hope they help you!

@kozistr (Owner)

kozistr commented Dec 10, 2023

I just documented the missing parameters. You can check them in the latest documentation!

Thank you

@kozistr kozistr closed this as completed Dec 10, 2023