Possible bug in ControlNet code #126

Tomsen1410 · 2024-04-23T22:06:35Z

Tomsen1410
Apr 23, 2024

Hey,

I have a question regarding the PixArt ControlNet code.

The paper suggests adding the input directly to the conditioning signal in the trainable network copy. However, in the code, you first forward the input through the first frozen block and only then add it to the conditioning.

The first trainable copy therefore does not receive the input as expected, but the already modified and slightly encoded input from the first frozen layer. This seems unintuitive. Is it possible that this is wrong behaviour or did I overlook something?
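To make the two orderings in this question concrete, here is a minimal toy sketch in plain Python. It is not the PixArt code: `frozen_block`, `copy_input_as_in_paper`, and `copy_input_as_in_code` are hypothetical names, and the block is an arbitrary stand-in transform. It only illustrates what the first trainable copy would receive under each ordering as described in this post.

```python
from typing import List

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    assert len(a) == len(b)
    return [x + y for x, y in zip(a, b)]


def frozen_block(x: Vector) -> Vector:
    # Hypothetical stand-in for the first frozen transformer block.
    return [2.0 * v + 1.0 for v in x]


def copy_input_as_in_paper(x: Vector, c: Vector) -> Vector:
    # Ordering attributed to the paper in this thread:
    # the first trainable copy sees the raw input plus the condition.
    return add(x, c)


def copy_input_as_in_code(x: Vector, c: Vector) -> Vector:
    # Ordering attributed to the code in this thread:
    # the input first passes through the first frozen block,
    # and only then is the condition added.
    return add(frozen_block(x), c)


x = [1.0, 2.0]   # toy input tokens
c = [0.5, 0.5]   # toy conditioning signal

print(copy_input_as_in_paper(x, c))  # [1.5, 2.5]
print(copy_input_as_in_code(x, c))   # [3.5, 5.5]
```

In the second ordering the trainable copy never sees the unmodified input, which is the discrepancy this question is about.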

sanshibayuan · 2024-04-24T09:21:33Z

sanshibayuan
Apr 24, 2024

Did you fix this? I also found that the ControlNet results do not match those in the paper. Could this be the cause? Did you test it?

0 replies

lawrence-cj · 2024-04-24T10:01:20Z

lawrence-cj
Apr 24, 2024
Maintainer

This issue may cause problems when the input is an image rather than an HED edge map. We have fixed the bug in the ControlNet in PixArt-Sigma. It will be released soon. Stay tuned!

https://github.com/PixArt-alpha/PixArt-sigma

2 replies

Feynman1999 May 24, 2024

Will inputting an image (not an HED map) cause a bug? If I modify the code logic myself and add the input to the condition at the beginning, is that okay?

lawrence-cj May 24, 2024
Maintainer

If your input is an image, the training may be unstable.

> If I modify the code logic myself and add input to the condition at the beginning, is that okay?

Yes. It will definitely help for tasks with image inputs.

Feynman1999 · 2024-05-24T08:41:56Z

Feynman1999
May 24, 2024

> Hey,
>
> I have a question regarding the PixArt ControlNet code.
>
> The paper suggests to add the input directly to the conditioning signal in the trainable network copy. However in the code, you first forward the input through the first frozen block and only then add it to the conditioning.
>
> The first trainable copy therefore does not receive the input as expected, but the already modified and slightly encoded input from the first frozen layer. This seems unintuitive. Is it possible that this is wrong behaviour or did I overlook something?

I have found the same problem as you: there is a slight difference between the process illustrated in the paper and the actual code. My thinking is that, since the copied block is trainable, the difference may not be significant?

0 replies

Feynman1999 · 2024-05-24T08:56:17Z

Feynman1999
May 24, 2024

I am currently trying to use PixArt-alpha to train a ControlNet on my own dataset, with a training resolution of 1024, for an image restoration task (the condition is an image). If there is any progress, I will post an update here.

1 reply

lawrence-cj May 24, 2024
Maintainer

You can try fixing the bug in your own code base. Our new ControlNet version fixes this bug.
