
Mixed Precision Merging #316

Open
sais-github opened this issue May 10, 2024 · 1 comment
Comments

@sais-github

I'm curious whether the models being merged are converted to the configured dtype before merging, or whether only the resulting merge is converted.
Also, is there any loss when merging an fp16 model with a bf16 model?

@cg123
Collaborator

cg123 commented May 26, 2024

When you specify a dtype, everything is converted to that dtype before merging, yeah.

Merging between fp16 and bf16 is potentially lossy, as each data type can represent values the other cannot. This generally isn't going to be a meaningful difference, though. If you're working with something super numerically sensitive, I'd recommend upcasting to fp32. Personally, I haven't yet found a use case that needs that kind of precision.
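The asymmetry between the two 16-bit formats can be demonstrated without any ML libraries: fp16 has 10 mantissa bits but a narrow exponent range (max finite value 65504), while bf16 keeps fp32's 8 exponent bits but has only 7 mantissa bits. Here is a minimal pure-Python sketch, simulating bf16 by truncating the low 16 bits of the fp32 encoding (real conversions use round-to-nearest-even, but truncation is close enough to illustrate the mantissa loss):

```python
import struct

def to_bf16(x: float) -> float:
    """Simulate bfloat16: keep only the top 16 bits of the float32 encoding.
    (Hardware uses round-to-nearest-even; truncation is a close approximation
    that illustrates the 7-bit mantissa.)"""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def to_fp16(x: float) -> float:
    """Round-trip through IEEE 754 half precision (struct's 'e' format)."""
    return struct.unpack(">e", struct.pack(">e", x))[0]

# fp16 resolves 1 + 2^-10 exactly; bf16, with 7 mantissa bits, cannot.
print(to_fp16(1.0 + 2**-10))   # 1.0009765625 survives
print(to_bf16(1.0 + 2**-10))   # collapses to 1.0

# bf16 handles large magnitudes; fp16 overflows past 65504.
print(to_bf16(100000.0))       # 99840.0 -- coarse, but in range
try:
    to_fp16(100000.0)
except (OverflowError, struct.error):
    print("100000.0 does not fit in fp16")
```

This is also why upcasting to fp32 sidesteps both failure modes: every fp16 and every bf16 value is exactly representable in fp32, so the conversion into the merge dtype itself loses nothing.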
