Hi @geronimi73
Thanks for the issue. Yes, this is expected: under the hood, the GaLore optimizers fall back to the native optimizer if you don't pass any GaLore kwargs.
Yes, @jiaweizzhao told me the same; that's why I closed the issue. I would have noticed this myself if I had looked at the code more carefully. Thank you for your answer.
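For reference, a sketch of how GaLore-specific kwargs could be passed so that the fallback described above does not kick in. This is illustrative only: the argument names (`optim_args`, `optim_target_modules`) come from my reading of the transformers GaLore integration docs, and the specific values are assumptions, not taken from this thread.

```python
# Illustrative config fragment; treat exact names/values as assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="galore_adamw_layerwise",
    # Without optim_args, the optimizer may fall back to plain AdamW.
    optim_args="rank=128, update_proj_gap=200, scale=0.25",
    optim_target_modules=["attn", "mlp"],
)
```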
System Info
Hey everyone,
I noticed something in the current implementation of GaLore (layerwise): the same optimizer is hooked onto `galore_params` and `non_galore_params`. Is this on purpose?
transformers/src/transformers/trainer.py
Lines 1296 to 1301 in 76a33a1
The official implementation hooks `GaLoreAdamW8bit` to `galore_params` and `bnb.optim.Adam8bit` to all others: https://github.com/jiaweizzhao/GaLore/blob/864eeb361dc96c1932c3fa429ad0119aaed8e617/torchrun_main.py#L339-L342
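To make the distinction concrete, here is a minimal sketch of the param-group split that the official repo performs before attaching the two different optimizers. The helper name and the keyword filter are hypothetical; only the idea (2-D weights matching target modules get GaLore's low-rank updates, everything else gets the plain 8-bit optimizer) reflects the linked code.

```python
# Hypothetical sketch of the official GaLore-style parameter split;
# split_param_groups and target_keywords are illustrative names.
import torch.nn as nn

def split_param_groups(model: nn.Module, target_keywords=("attn", "mlp")):
    """Split parameters into GaLore-projected and regular groups."""
    galore_params, regular_params = [], []
    for name, p in model.named_parameters():
        if p.ndim == 2 and any(k in name for k in target_keywords):
            galore_params.append(p)   # low-rank projected updates
        else:
            regular_params.append(p)  # plain optimizer updates
    return galore_params, regular_params
```

The issue being reported is that the layerwise integration in `trainer.py` attaches the *same* optimizer to both groups instead of two different ones.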
Who can help?
@younesbelkada
Information
Tasks
`examples` folder (such as GLUE/SQuAD, ...)

Reproduction
Any script that uses `galore_*_layerwise`.
Expected behavior
Not sure.