Question About LayerNorm Implementation #35

Guaishou74851 · 2022-06-11T03:35:09Z

Hi, your work is really inspiring and interesting!
I am wondering why you re-implemented LayerNorm (2D) instead of using a built-in PyTorch module such as nn.GroupNorm(1, channels).
Are there any differences between them (e.g. in final performance or in behavior)?

mayorx · 2022-06-11T19:12:10Z

Hi, Guaishou74851,
Thanks for your attention to NAFNet.
You can regard the code as a re-implementation of https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py#L138-L143, which behaves differently from nn.GroupNorm(1, channels). We re-implemented it to reduce GPU memory usage.
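To make the behavioral difference concrete, here is a minimal sketch (not the NAFNet or ConvNeXt source). It assumes the channels-first LayerNorm normalizes over the channel dimension only, separately at each spatial position, whereas nn.GroupNorm(1, channels) normalizes over channels and spatial positions together, once per sample. The helper name `channels_first_layer_norm`, the tensor sizes, and eps=1e-6 are illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def channels_first_layer_norm(x, weight, bias, eps=1e-6):
    """Hypothetical helper: LayerNorm over dim 1 (channels) of an (N, C, H, W) tensor."""
    # Statistics are computed per (n, h, w) position, across the C channels only.
    mean = x.mean(dim=1, keepdim=True)
    var = (x - mean).pow(2).mean(dim=1, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return weight[None, :, None, None] * x_hat + bias[None, :, None, None]


torch.manual_seed(0)
n, c, h, w = 2, 8, 4, 4  # illustrative sizes
x = torch.randn(n, c, h, w)
weight = torch.ones(c)
bias = torch.zeros(c)

with torch.no_grad():
    ln_out = channels_first_layer_norm(x, weight, bias)

    # Reference: move channels last, apply the built-in layer norm, move them back.
    ref = F.layer_norm(x.permute(0, 2, 3, 1), (c,), weight, bias, 1e-6)
    ref = ref.permute(0, 3, 1, 2)

    # GroupNorm with one group: statistics are computed per sample over (C, H, W).
    gn = nn.GroupNorm(1, c, eps=1e-6)
    gn_out = gn(x)

same_as_permuted_layer_norm = torch.allclose(ln_out, ref, atol=1e-5)
same_as_group_norm = torch.allclose(ln_out, gn_out, atol=1e-5)

# Largest absolute per-position channel mean of each output.
ln_pos_mean = ln_out.mean(dim=1).abs().max().item()
gn_pos_mean = gn_out.mean(dim=1).abs().max().item()
# Largest absolute per-sample mean of the GroupNorm output.
gn_sample_mean = gn_out.mean(dim=(1, 2, 3)).abs().max().item()

print("matches permuted F.layer_norm:", same_as_permuted_layer_norm)
print("matches nn.GroupNorm(1, C):  ", same_as_group_norm)
```

On random input the two outputs generally disagree: the channel-wise version has zero mean at every pixel, while the GroupNorm output has zero mean only when averaged over the whole sample.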

Luciennnnnnn · 2023-08-06T06:57:43Z

@mayorx Hi, I observed different optimization behavior between your implementation and ConvNeXt's. How does that happen?

Guaishou74851 closed this as completed Jun 12, 2022
