
Conversation

@andrehuang
Contributor

Just adding a clone operation. The original code will cause backprop errors when training the controlnet.

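Roughly, the problematic pattern and the kind of fix look like this (a simplified sketch; the variable names follow the UNet forward pass, see the diff for the exact change):

```python
# Before: the ControlNet residual is added in place. If autograd saved
# down_block_res_sample for the backward pass (e.g. the UNet is trainable),
# this mutation makes backprop fail.
down_block_res_sample += down_block_additional_residual

# After: an out-of-place add (or a clone first) leaves the saved tensor intact.
down_block_res_sample = down_block_res_sample + down_block_additional_residual
```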
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 7, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten
Contributor

Yes, this makes sense indeed! cc @williamberman can you check?

@williamberman
Contributor

hey!

hmm, in all honesty I'm not super familiar with the restrictions around how modifying tensors in place affects errors during backpropagation.

My understanding from some (quick) googling is that when a tensor variable stored for the backward pass has the underlying tensor it points to modified, torch throws an error. It isn't immediately clear to me whether torch always throws this error or whether it requires some extra config.

While training the controlnet, I'm not getting the in-place modification error. Assuming torch throws the error by default, I'd rather not introduce an unnecessary copy.

Am I off base here about whether torch always throws the error or not?

@andrehuang
Contributor Author

Hi William,

Thanks for your feedback! Your understanding of in-place tensor modification is correct. But note that PyTorch throws the error only when the modified tensor is needed for gradient computation during backprop.
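For example, this is the default behavior, with no extra config:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x.exp()          # autograd saves y itself: exp's backward is grad_out * y
y += 1               # the in-place edit bumps y's version counter
y.sum().backward()   # RuntimeError: one of the variables needed for gradient
                     # computation has been modified by an inplace operation
```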

In the vanilla case, down_block_res_sample does not require gradient computation, so you don't see the error. However, as soon as you also want to fine-tune the UNet (either full fine-tuning or LoRA), PyTorch will throw the error.
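Here is a toy version of that situation (hypothetical stand-in modules, not the actual diffusers code):

```python
import torch
import torch.nn as nn

down_block = nn.Conv2d(4, 4, 1)  # stand-in for a UNet down block
mid_block = nn.Conv2d(4, 4, 1)   # stand-in for the UNet mid block
controlnet_residual = torch.randn(1, 4, 8, 8, requires_grad=True)

def run(train_unet):
    down_block.requires_grad_(train_unet)
    mid_block.requires_grad_(train_unet)
    res = down_block(torch.randn(1, 4, 8, 8))
    sample = mid_block(res)      # if mid_block is trainable, autograd saves res
    res += controlnet_residual   # in-place edit of the (possibly saved) tensor
    return (sample + res).sum()

run(train_unet=False).backward()  # fine: nothing saved res for the backward pass
run(train_unet=True).backward()   # RuntimeError: ... modified by an inplace operation
```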

So I think it makes sense to add a copy here in case anyone wants to train the ControlNet and the UNet together.

@jfischoff

FWIW I got this error as well, not during training, but when using 'Null-Text Inversion', which requires backprop.

@patrickvonplaten
Contributor

@williamberman can you double check here? I'm in favor of going ahead with this PR, as it has no drawbacks as far as I can see.

Contributor

@williamberman left a comment

lgtm!

@williamberman merged commit e2d9a9b into huggingface:main on Mar 14, 2023
w4ffl35 pushed a commit to w4ffl35/diffusers that referenced this pull request Apr 14, 2023
…huggingface#2586)

* fix the in-place modification in unet condition when using controlnet, which will cause backprop errors when training

* add clone to mid block

* fix-copies

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024