

eps for GroupNorm #5

Closed
Asthestarsfalll opened this issue Jul 22, 2022 · 5 comments

Comments

@Asthestarsfalll
Contributor

Asthestarsfalll commented Jul 22, 2022

Great work!
The parameter `eps` in GroupNorm is initialized to 1e-5 by default.
However, GroupNorm in TensorFlow differs slightly: it is initialized with 1e-6.
It may not have any influence on training results, but could you change this (for every GroupNorm in the code) so the two align?
Since I want to convert trained models from torch or tf to MegEngine, the smaller the error, the better.
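The effect of the eps mismatch can be seen directly in the normalization formula y = (x - mean) / sqrt(var + eps). Below is a minimal NumPy sketch (not the actual MegEngine/PyTorch code, and without the affine weight/bias) comparing the two defaults:

```python
import numpy as np

def group_norm(x, num_groups, eps):
    # x: (N, C, H, W); plain group normalization, no affine parameters
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, -1)
    mean = g.mean(axis=-1, keepdims=True)
    var = g.var(axis=-1, keepdims=True)
    y = (g - mean) / np.sqrt(var + eps)
    return y.reshape(n, c, h, w)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 8, 8)).astype(np.float32)

# eps=1e-5 (PyTorch default) vs eps=1e-6 (TensorFlow DDPM code)
diff = np.abs(group_norm(x, 2, 1e-5) - group_norm(x, 2, 1e-6)).max()
print(diff)  # tiny per layer, but it accumulates across the whole network
```

The per-layer difference is small, which is why it only shows up when comparing converted checkpoints end to end.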

@ChaiByte
Contributor

ChaiByte commented Jul 22, 2022

Thanks for your attention! The DDPM model was initially based on a PyTorch implementation, and I'm glad to hear that you are willing to convert the original pre-trained model to MegEngine. Here is some information that might be helpful:

  • I only checked that the forward process is consistent with the PyTorch version I referenced; I'm not sure whether all details of the original TensorFlow version are implemented.
  • Other converted ckpts: https://github.com/pesser/pytorch_diffusion

In my opinion, conversion scripts are also important for users to understand where converted pre-trained models come from. So I suggest you upload them to this repo, which could encourage more users to join us.

Btw, I'm not sure yet how to develop this library in the future; I hope it will help more people understand the implementation of diffusion models. (OpenAI's improved/guided codebases are great but lack readability.)

@ChaiByte
Contributor

While developing this repo, I wrote some notes in Chinese to deepen my own understanding of diffusion models. Here is a post: https://meg.chai.ac.cn/ddpm-megengine/ You are welcome to read it and give me some advice.

@Asthestarsfalll
Contributor Author

Asthestarsfalll commented Jul 24, 2022

I'm willing to upload my conversion code, but it didn't work well after converting.
The error between the MegEngine and PyTorch implementations was high with the same input.
This is because the convolution padding in Downsample differs: the PyTorch implementation uses asymmetric padding.
After I modified the MegEngine implementation, here is the result:

class DownSample(M.Module):
    """"A downsampling layer with an optional convolution.

    Args:
        in_ch: channels in the inputs and outputs.
        use_conv: if ``True``, apply convolution to do downsampling; otherwise use pooling.
    """""

    def __init__(self, in_ch, with_conv=True):
        super().__init__()
        self.with_conv = with_conv
        if with_conv:
            self.main = M.Conv2d(in_ch, in_ch, 3, stride=2)
        else:
            self.main = M.AvgPool2d(2, stride=2)

    def _initialize(self):
        for module in self.modules():
            if isinstance(module, M.Conv2d):
                init.xavier_uniform_(module.weight)
                init.zeros_(module.bias)

    def forward(self, x, temb):  # unused temb param kept only for a consistent interface
        if self.with_conv:
            x = F.nn.pad(x, [*[(0, 0)
                         for i in range(x.ndim - 2)], (0, 1), (0, 1)])
        return self.main(x)

[screenshot: error comparison between MegEngine and PyTorch outputs after the fix]
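The padding mismatch above can be illustrated with a toy 1-D convolution (a NumPy sketch, not the actual model code): asymmetric (0, 1) padding and symmetric (1, 1) padding give the same output length here, but different border values, which is exactly why the converted weights disagreed.

```python
import numpy as np

def conv1d(x, k, stride):
    # naive valid-mode strided 1-D convolution (cross-correlation)
    return np.array([x[i:i + len(k)] @ k
                     for i in range(0, len(x) - len(k) + 1, stride)])

x = np.arange(1.0, 9.0)  # length-8 input
k = np.ones(3)           # 3-tap kernel, stride 2

asym = conv1d(np.pad(x, (0, 1)), k, stride=2)  # TF/DDPM-style (0, 1) padding
sym = conv1d(np.pad(x, (1, 1)), k, stride=2)   # symmetric padding=1

print(asym)  # [ 6. 12. 18. 15.]
print(sym)   # [ 3.  9. 15. 21.]
```

Both outputs have length 4, so a shape check alone would not catch the bug; only comparing values reveals it.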

Btw, I'm also a beginner in ddpm, your blog helps me a lot!

@ChaiByte
Contributor

Got it. I'm not available at the moment; I will check the padding mode and #6 after my day off.

@ChaiByte
Contributor

The initial eps value has been updated and I will close this issue now to keep tracking the same thing in one issue.

Feel free to reopen it if you have any questions or suggestions.
