Add Option to use Target Model in LCM-LoRA Scripts#6537

Open
dg845 wants to merge 4 commits into huggingface:main from dg845:lcm-lora-scripts-enable-target-model

Conversation

Collaborator

@dg845 dg845 commented Jan 11, 2024

What does this PR do?

This PR enables a target model to be optionally used in the LCM-LoRA distillation scripts via the --use_target_model argument.

Follow up to #6505.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten
@patil-suraj
@sayakpaul
@jon-chuang
@shuminghu

# Checks if the accelerator has performed an optimization step behind the scenes
if accelerator.sync_gradients:
    # 12. If using a target model, update its parameters via EMA.
    update_ema(target_unet.parameters(), unet.parameters(), args.ema_decay)
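For context on what such an update does, here is a minimal, hypothetical sketch of an EMA parameter update; the actual `update_ema` helper in the script may differ in its details:

```python
import torch

@torch.no_grad()
def update_ema(target_params, source_params, decay):
    # Hypothetical minimal EMA sketch:
    # target <- decay * target + (1 - decay) * source
    for target, source in zip(target_params, source_params):
        target.mul_(decay).add_(source, alpha=1.0 - decay)
```

With a decay close to 1.0, the target parameters change slowly, which is the smoothing effect being discussed as a way to stabilize distillation.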

Are you sure this works? I had errors with the LoRA parameters.


I have a workaround in the issue


Both are LoRA weights, so they should work?

Collaborator Author

@dg845 dg845 Jan 18, 2024


I haven't been able to test this fully yet; it's possible that this runs into the errors mentioned in #6505 (comment).

Collaborator Author


I believe the current implementation is correctly updating the LoRA parameters.

# ----Latent Consistency Distillation (LCD) Specific Arguments----
parser.add_argument(
    "--use_target_model",
    action="store_true",

Should this default to false so existing users are not surprised?

Collaborator Author


The target model will be used only if the --use_target_model flag is specified (so existing script calls should work as before).
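To illustrate the point about defaults with a standalone sketch (not the script itself): with `action="store_true"`, argparse sets the value to False unless the flag is passed, so existing invocations keep the old behavior.

```python
import argparse

parser = argparse.ArgumentParser()
# store_true flags default to False, so the feature is strictly opt-in.
parser.add_argument("--use_target_model", action="store_true")

print(parser.parse_args([]).use_target_model)                      # False
print(parser.parse_args(["--use_target_model"]).use_target_model)  # True
```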


Thanks. My bad.

Contributor

@patil-suraj patil-suraj left a comment


The code looks good to me. Could you explain the reasoning behind this? Do you have any experiments that demonstrate the use of this?

Collaborator Author

dg845 commented Jan 18, 2024

#6505 (comment) reports seeing a lot of training instability when training using the current LCM-LoRA script. @jon-chuang, would you be willing to share more details about the training instability?

@dg845 dg845 changed the title [WIP] Add Option to use Target Model in LCM-LoRA Scripts Add Option to use Target Model in LCM-LoRA Scripts Jan 21, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@jon-chuang

I don't have conclusive evidence that this fix will mitigate the issue, but what I observed was some divergent results on LCM-LoRA training runs.

Anyway, I think it's a zero-cost opt-in feature that may produce better results for some users.

I will definitely try the EMA once it is merged and can report on further results.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot added the stale Issues that haven't received updates label Feb 23, 2024
@sayakpaul
Member

@patil-suraj a gentle ping.

@yiyixuxu yiyixuxu added training and removed stale Issues that haven't received updates labels Feb 23, 2024
@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot added the stale Issues that haven't received updates label Mar 19, 2024

Labels

stale Issues that haven't received updates training

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants