eps for GroupNorm #5
Thanks for following this project! The DDPM model was based on a PyTorch implementation at first, and I'm glad to hear that you are willing to convert the original pre-trained model to MegEngine. Here is some information that might be helpful:
In my opinion, the conversion scripts are also important for users to understand where the converted pre-trained models come from. So I suggest you upload them to this repo, which could encourage more users to join us. Btw, I'm not sure yet how to develop this library in the future; I hope it will help more people understand the implementation of diffusion models. (OpenAI's improved/guided codebase is great, but lacks readability.)
While developing this repo, I wrote some notes in Chinese for myself to understand diffusion models better. Here is a post: https://meg.chai.ac.cn/ddpm-megengine/ Welcome to read it and give me some advice.
I'm willing to upload my conversion code, but the converted model doesn't work well:

```python
import megengine.functional as F
import megengine.module as M
import megengine.module.init as init


class DownSample(M.Module):
    """A downsampling layer with an optional convolution.

    Args:
        in_ch: channels in the inputs and outputs.
        with_conv: if ``True``, apply convolution to do downsampling;
            otherwise use average pooling.
    """

    def __init__(self, in_ch, with_conv=True):
        super().__init__()
        self.with_conv = with_conv
        if with_conv:
            self.main = M.Conv2d(in_ch, in_ch, 3, stride=2)
        else:
            self.main = M.AvgPool2d(2, stride=2)

    def _initialize(self):
        for module in self.modules():
            if isinstance(module, M.Conv2d):
                init.xavier_uniform_(module.weight)
                init.zeros_(module.bias)

    def forward(self, x, temb):  # unused temb param kept for a uniform interface
        if self.with_conv:
            # pad only on the bottom/right to match TF 'SAME' padding
            x = F.nn.pad(x, [*[(0, 0) for _ in range(x.ndim - 2)], (0, 1), (0, 1)])
        return self.main(x)
```

Btw, I'm also a beginner in DDPM; your blog helps me a lot!
Got it. I'm not available at the moment; I will check the padding mode and #6 after my day off.
The initial eps value has been updated, and I will close this issue now to keep tracking the same thing in one issue. Feel free to reopen it if you have any questions or suggestions.
Great work!
The parameter `eps` in GroupNorm is initialized to 1e-5 by default.
However, GroupNorm in the original TensorFlow implementation is slightly different: it is initialized with 1e-6.
It probably has no influence on training results, but could you change this (for all GroupNorm layers in the code) for alignment?
Since I want to convert the trained model from PyTorch or TensorFlow to MegEngine, the smaller the conversion error, the better.
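The size of the discrepancy can be estimated with a plain NumPy reference implementation of group normalization (no affine parameters; shapes and group count here are illustrative): the per-element difference between `eps=1e-5` and `eps=1e-6` is tiny, but nonzero, and it compounds across the many GroupNorm layers of a UNet when verifying a converted checkpoint.

```python
import numpy as np


def group_norm(x, num_groups, eps):
    # x: (N, C, H, W). Reference group normalization in NumPy, no affine terms.
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)


rng = np.random.default_rng(0)
x = rng.standard_normal((1, 32, 8, 8)).astype(np.float32)

y_default = group_norm(x, num_groups=32, eps=1e-5)  # PyTorch-style default
y_tf = group_norm(x, num_groups=32, eps=1e-6)       # TF DDPM value

diff = np.abs(y_default - y_tf).max()
print(diff)  # small but nonzero per layer
```

So aligning `eps` everywhere removes one systematic source of error when diffing outputs of the original and converted models layer by layer.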