
Can we just use the sloth gradient checkpointing by uncommenting this line? #30

Open
vkaul11 opened this issue May 21, 2024 · 4 comments

Comments


vkaul11 commented May 21, 2024

I was not clear about how to use the code.
https://github.com/jzhang38/EasyContext/blob/main/train.py#L28
Can we enable the sloth gradient checkpointing by uncommenting this line?

@jzhang38 (Owner)

Yes, you can. It will produce the same loss, but it does not enable a larger batch size in my experiments.
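
Assuming the line in question enables unsloth-style offloaded gradient checkpointing (the activation saved for the backward pass is parked on CPU and copied back when the block is recomputed), here is a minimal, self-contained sketch of that idea. It is illustrative only and is not the code from train.py or sloth_activation.py; all names are made up.

```python
import torch

class OffloadedCheckpoint(torch.autograd.Function):
    """Checkpointing that parks the saved activation on CPU until backward."""

    @staticmethod
    def forward(ctx, run_fn, hidden_states):
        # Keep only a CPU copy of the block input instead of the GPU activation.
        ctx.run_fn = run_fn
        ctx.device = hidden_states.device
        ctx.saved_cpu = hidden_states.detach().to("cpu", non_blocking=True)
        with torch.no_grad():
            return run_fn(hidden_states)

    @staticmethod
    def backward(ctx, grad_output):
        # Move the saved input back, recompute the block with grad enabled,
        # then backpropagate the incoming gradient through the recomputation.
        x = ctx.saved_cpu.to(ctx.device).detach().requires_grad_(True)
        with torch.enable_grad():
            out = ctx.run_fn(x)
        torch.autograd.backward(out, grad_output)
        return None, x.grad  # no gradient for run_fn itself

# Tiny usage example (on CPU the offload round-trip is simply a no-op):
layer = torch.nn.Linear(8, 8)
x = torch.randn(2, 8, requires_grad=True)
y = OffloadedCheckpoint.apply(layer, x)
y.sum().backward()
print(x.grad.shape)  # torch.Size([2, 8])
```

This matches the owner's observation: memory for saved activations moves off the GPU, but the loss is unchanged because the forward computation itself is identical.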


vkaul11 commented May 21, 2024

I am getting this error when I do this, though. Any idea why?
File "/workspace/cookbook-internal/recipes/common/peft.py", line 89, in load_train_model
model = prepare_model_for_kbit_training(model)
File "/usr/local/lib/python3.10/dist-packages/peft/utils/other.py", line 137, in prepare_model_for_kbit_training
model.gradient_checkpointing_enable(**gc_enable_kwargs)
File "/workspace/cookbook-internal/recipes/common/sloth_activation.py", line 63, in new_gradient_checkpointing_enable
assert gradient_checkpointing_kwargs == None
AssertionError
Maybe using QLoRA instead of LoRA complicates things?
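
From the stack trace this looks like a signature mismatch rather than something QLoRA-specific: peft's prepare_model_for_kbit_training ends up calling model.gradient_checkpointing_enable with a gradient_checkpointing_kwargs dict (the **gc_enable_kwargs call in the trace), while the patched new_gradient_checkpointing_enable asserts that this argument is None. A tiny, hypothetical repro of that interaction; the dummy model and the patch body are assumptions, not the cookbook's code.

```python
from types import MethodType
import torch.nn as nn

class DummyModel(nn.Module):
    # Stand-in for a transformers model; the real method lives on PreTrainedModel.
    def gradient_checkpointing_enable(self, gradient_checkpointing_kwargs=None):
        pass

def new_gradient_checkpointing_enable(self, gradient_checkpointing_kwargs=None):
    # Mirrors the assert reported at sloth_activation.py line 63.
    assert gradient_checkpointing_kwargs == None
    # ... the real patch would install offloaded checkpointing here ...

model = DummyModel()
model.gradient_checkpointing_enable = MethodType(new_gradient_checkpointing_enable, model)

# prepare_model_for_kbit_training ends in model.gradient_checkpointing_enable(**gc_enable_kwargs)
# per the trace; once a kwargs dict is forwarded, even an empty one, the
# `== None` check fails:
model.gradient_checkpointing_enable(gradient_checkpointing_kwargs={"use_reentrant": True})
# -> AssertionError, matching the stack trace above
```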


vkaul11 commented May 21, 2024

I need it to reduce the memory footprint, not to increase the batch size.


vkaul11 commented May 21, 2024

A question: the line "assert gradient_checkpointing_kwargs == None" is what throws the error. Do I need to set gradient_checkpointing_kwargs to something, or should I comment out this line?
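
Neither route is confirmed by the maintainers, so treat the following as a hedged sketch of the two options the question describes; the function name mirrors the traceback and everything else is an assumption.

```python
# Option A: call the patched method with no kwargs so the assert sees None.
# This only helps if nothing upstream (here, peft's prepare_model_for_kbit_training)
# injects a gradient_checkpointing_kwargs dict of its own.
# model.gradient_checkpointing_enable()

# Option B (hypothetical edit to sloth_activation.py): relax the patched method so
# it tolerates forwarded kwargs instead of asserting they are None.
import warnings

def new_gradient_checkpointing_enable(self, gradient_checkpointing_kwargs=None):
    if gradient_checkpointing_kwargs:
        # The offloaded-checkpoint patch does not consume flags such as
        # use_reentrant, so accept and ignore them rather than failing.
        warnings.warn(f"Ignoring gradient_checkpointing_kwargs: {gradient_checkpointing_kwargs}")
    # ... install the offloaded gradient checkpointing exactly as before ...
```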
