Dreambooth: crash after saving a checkpoint if fp16 output is enabled #1566
Labels
bug
Something isn't working
Comments
timh
added a commit
to timh/diffusers
that referenced
this issue
Dec 6, 2022
… checkpoint to avoid crash when running fp16
Great catch! Will take a look at the PR!
timh
added a commit
to timh/diffusers
that referenced
this issue
Dec 8, 2022
…ions of accelerate. part of fix for huggingface#1566
tcapelle
pushed a commit
to tcapelle/diffusers
that referenced
this issue
Dec 12, 2022
… checkpoint to avoid crash when running fp16 (huggingface#1618)
* dreambooth: fix huggingface#1566: maintain fp32 wrapper when saving a checkpoint to avoid crash when running fp16
* dreambooth: guard against passing keep_fp32_wrapper arg to older versions of accelerate. part of fix for huggingface#1566
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/dreambooth/train_dreambooth.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Describe the bug
If (accelerate is configured with fp16, or --mixed_precision=fp16 is specified on the command line) AND --save_steps is specified on the command line, Dreambooth crashes after writing a checkpoint:
Reproduction
Repro at command line (with some irrelevant fetching output snipped):
Accelerate config:
(PR to fix is incoming)
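The commits referenced in this issue describe the fix as two parts: keep the fp32 wrapper when unwrapping the model to save a checkpoint, and only pass the keep_fp32_wrapper argument when the installed accelerate accepts it. A minimal sketch of that guard follows. It is not the code from the PR: the helper name keep_fp32_wrapper_kwargs and the two unwrap_* functions are hypothetical stand-ins for accelerator.unwrap_model in newer and older accelerate versions, used so the example runs without accelerate installed.

```python
import inspect
from typing import Any, Callable, Dict


def keep_fp32_wrapper_kwargs(unwrap_fn: Callable[..., Any]) -> Dict[str, bool]:
    """Return {"keep_fp32_wrapper": True} only if unwrap_fn accepts that keyword.

    Hypothetical helper: it inspects the callable's signature so that older
    versions, which lack the parameter, are called without it.
    """
    params = inspect.signature(unwrap_fn).parameters
    return {"keep_fp32_wrapper": True} if "keep_fp32_wrapper" in params else {}


# Stand-in for an unwrap function that supports the keyword (assumed shape).
def unwrap_new(model, keep_fp32_wrapper=False):
    return ("unwrapped", model, keep_fp32_wrapper)


# Stand-in for an older unwrap function that does not know the keyword.
def unwrap_old(model):
    return ("unwrapped", model, None)


# The same call site works against both versions.
new_result = unwrap_new("unet", **keep_fp32_wrapper_kwargs(unwrap_new))
old_result = unwrap_old("unet", **keep_fp32_wrapper_kwargs(unwrap_old))
```

In a training script the same pattern would presumably be applied at the point where the model is unwrapped for checkpointing, by passing the bound accelerator.unwrap_model method to the helper and unpacking the result into the call.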
Logs
No response
System Info