Finetuning with weights in bfloat16 #100

Merged: 8 commits merged into main from adapter-bfloat on Apr 6, 2023
Conversation

@awaelchli (Member) commented:

The memory consumption is now ~20GB, compared to ~38GB before.
The training iteration speed-up is ~1.5x-2x.
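
For context, a minimal sketch of the idea (not the code in this PR; the model below is a small stand-in, not lit-llama): storing the frozen weights directly in bfloat16 uses 2 bytes per parameter instead of float32's 4, which is where most of the saving comes from.

```python
# Hedged sketch, not the PR's actual code: keep the model weights in bfloat16.
import torch
import torch.nn as nn

# Stand-in model; in the PR this would be the LLaMA model with adapter weights.
model = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096))

# Cast all parameters and buffers to bfloat16 (2 bytes each instead of 4).
model = model.to(dtype=torch.bfloat16)

# Freeze the base weights; adapter-style finetuning trains only a small set of
# extra parameters, so the optimizer state stays small as well.
for p in model.parameters():
    p.requires_grad = False

print(next(model.parameters()).dtype)  # torch.bfloat16
```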

@lantiga (Collaborator) left a review:

A few suggested changes

Review comments on README.md (outdated, resolved)
@awaelchli changed the title from "Adapter-finetuning with weights in bfloat16" to "Finetuning with weights in bfloat16" on Apr 5, 2023
Co-authored-by: Luca Antiga <luca@lightning.ai>
@lantiga (Collaborator) commented Apr 5, 2023:

Does the memory fit on a 3090, even in the LoRA case?

@awaelchli (Member, Author) commented:

It's tight but yeah it could work:

[screenshot: GPU memory usage]

I can double-check on our 3090 machine.

@lantiga (Collaborator) commented Apr 5, 2023:

oh ahah, that's tight!

@lantiga (Collaborator) commented Apr 5, 2023:

better check IMO

@awaelchli (Member, Author) commented:

Attempted to test it, but got held up by this problem: #101

@awaelchli (Member, Author) commented:

The finetuning fits on the 3090:

[screenshot: GPU memory usage on the 3090]

I had to install PyTorch nightly to get around issue #101, though. Perhaps we should hold off on merging this? Or we could say this requires PyTorch nightly, or investigate whether changing the implementation can avoid the error.
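
(One way to do a check like this, purely as an illustration rather than how the screenshot above was produced, is to read PyTorch's peak-allocation counter after a few iterations and compare it against the 3090's 24 GB:)

```python
import torch

torch.cuda.reset_peak_memory_stats()

# ... run a few finetuning iterations here ...

peak_gib = torch.cuda.max_memory_allocated() / (1024 ** 3)
print(f"peak allocated: {peak_gib:.1f} GiB")  # needs to stay under the 3090's 24 GB
```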

@lantiga (Collaborator) commented Apr 5, 2023:

Great, I would merge and say that the 3090 requires nightly; then we'll investigate the complex issue again.

@lantiga (Collaborator) commented Apr 6, 2023:

Amazing, let’s merge!

@lantiga merged commit 8a13cbf into main on Apr 6, 2023
@lantiga deleted the adapter-bfloat branch on Apr 6, 2023 at 06:42
timothylimyl referenced this pull request in timothylimyl/lit-llama-qa May 21, 2023
Co-authored-by: Luca Antiga <luca@lightning.ai>