Add PagedAdamW32bit #900

Merged 1 commit on Oct 29, 2023

Conversation

xzuyn
Contributor

@xzuyn xzuyn commented Oct 27, 2023

No description provided.

@kohya-ss kohya-ss changed the base branch from main to dev October 29, 2023 06:01
@kohya-ss
Owner

Thank you for this! I didn't know about PagedAdamW32bit.

@kohya-ss kohya-ss merged commit a9ed4ed into kohya-ss:dev Oct 29, 2023
1 check passed
@oliverban

This is great, but isn't the 32bit part still expensive in terms of memory? Or is this different for images? I really like this way of optimization but if we can get a PagedAdamW8Bit, wouldn't that be even better? I have read about it, but that discussion was about LLMs.

@xzuyn
Contributor Author

xzuyn commented Nov 18, 2023

> This is great, but isn't the 32bit part still expensive in terms of memory?

In terms of memory usage it's more expensive than using 8bit, but since it's paged, that memory isn't eating into GPU VRAM, only system RAM. So if you have the RAM for it, 32bit gives you higher precision. No idea how much that translates into real-world quality, though.
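To put rough numbers on the 32bit vs 8bit comparison above, here is a minimal back-of-the-envelope sketch. It assumes AdamW keeps two state values per parameter (the first and second moment estimates) and ignores the small extra metadata that 8-bit quantization stores. The function names and the parameter count are hypothetical, chosen only for illustration; where the state ends up (VRAM or system RAM) is a separate question from how large it is.

```python
# Rough estimate of AdamW optimizer-state size at different precisions.
# Assumptions (not from the PR): two state values per parameter, and no
# accounting for quantization metadata or allocator overhead.

def adamw_state_bytes(num_params: int, state_bits: int) -> int:
    """Bytes needed for AdamW's two moment buffers at the given precision."""
    states_per_param = 2  # first moment and second moment
    return num_params * states_per_param * state_bits // 8


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    num_params = 860_000_000  # hypothetical parameter count, illustration only
    for bits in (32, 8):
        size = to_gib(adamw_state_bytes(num_params, bits))
        print(f"{bits}-bit optimizer state: {size:.2f} GiB")
```

Under these assumptions the 32-bit state is four times the size of the 8-bit state, which is the amount of memory that paging has to find room for.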

> I really like this way of optimization but if we can get a PagedAdamW8Bit, wouldn't that be even better?

It's already implemented. I added PagedAdamW32bit here since only PagedAdamW8bit was added.

There may also be a 16bit paged version named just PagedAdamW, but that may simply be an alias for PagedAdamW32bit or PagedAdamW8bit. I'll have to look into this later.

edit: From what I can tell, PagedAdamW behaves like AdamW, so I will open another PR shortly after I confirm it runs.
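For anyone curious how the three paged variants mentioned above could be selected by name, here is a minimal sketch. It is not the code from this PR: `PAGED_ADAMW_CLASSES`, `resolve_paged_adamw_name`, and `build_paged_adamw` are hypothetical helpers, and the only assumption about bitsandbytes is that `bitsandbytes.optim` exposes classes named `PagedAdamW`, `PagedAdamW8bit`, and `PagedAdamW32bit`.

```python
# Hypothetical helper: map a case-insensitive optimizer name to a paged
# AdamW class in bitsandbytes. Not the actual sd-scripts implementation.

PAGED_ADAMW_CLASSES = {
    "pagedadamw": "PagedAdamW",
    "pagedadamw8bit": "PagedAdamW8bit",
    "pagedadamw32bit": "PagedAdamW32bit",
}


def resolve_paged_adamw_name(optimizer_type: str) -> str:
    """Return the bitsandbytes class name for a user-supplied optimizer type."""
    key = optimizer_type.strip().lower()
    if key not in PAGED_ADAMW_CLASSES:
        raise ValueError(f"Unknown paged AdamW variant: {optimizer_type!r}")
    return PAGED_ADAMW_CLASSES[key]


def build_paged_adamw(optimizer_type: str, params, **kwargs):
    """Instantiate the optimizer; requires bitsandbytes to be installed."""
    import bitsandbytes as bnb  # imported lazily so name lookup works without it

    optimizer_class = getattr(bnb.optim, resolve_paged_adamw_name(optimizer_type))
    return optimizer_class(params, **kwargs)
```

A call such as `build_paged_adamw("PagedAdamW32bit", model.parameters(), lr=1e-4)` would then create the optimizer, assuming `model` is a PyTorch module on a supported GPU.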

@oliverban

Great news! I have 64 GB of system RAM, so it should easily fit! Thanks for looking into this! <3

wkpark pushed a commit to wkpark/sd-scripts that referenced this pull request Feb 27, 2024