
FastGAN grid artifacts #17

Closed · JCBrouwer opened this issue Dec 7, 2021 · 7 comments

Comments

@JCBrouwer commented Dec 7, 2021

I've been noticing quite a lot of griddy/repetitive patterns in the outputs when training at high resolution with FastGAN.

Will the change from today help address those? Or are these inherent to the skip-excitation layers? (the grids do seem to be ~32x32, which is what is skipped to the 512x512 block). Alternatively, would you happen to know ways that these patterns could be reduced?
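For reference, here's roughly how I understand the skip-excitation connections to work: a low-resolution feature map gets pooled into per-channel gates that scale a higher-resolution feature map. This is a paraphrased sketch, not the repo's code; the class name and layer choices below are my own.

```python
import torch
import torch.nn as nn

class SkipExcitation(nn.Module):
    """Sketch of a FastGAN-style skip-excitation block: a small feature map
    is pooled and turned into per-channel sigmoid gates that scale a larger
    feature map (e.g. the 32x32 activations gating the 512x512 block)."""
    def __init__(self, ch_small, ch_big):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(4),                        # squeeze the small map to 4x4
            nn.Conv2d(ch_small, ch_big, 4, 1, 0, bias=False),
            nn.SiLU(),
            nn.Conv2d(ch_big, ch_big, 1, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, feat_small, feat_big):
        # broadcast the 1x1 channel gates over the high-resolution map
        return feat_big * self.gate(feat_small)

# e.g. gate the 512x512 features with the 32x32 features
se = SkipExcitation(ch_small=256, ch_big=64)
out = se(torch.randn(1, 256, 32, 32), torch.randn(1, 64, 512, 512))
```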

Example training grid with repetitive grid patterns (5000 image dataset after 919 kimg):

[Image: training grid with repetitive grid patterns]

Example training grid with repetitive grid patterns and mode collapse (4000 image dataset after 680 kimg of finetuning from the above checkpoint):

[Image: training grid with repetitive grid patterns and mode collapse]

@xl-sr (Contributor) commented Dec 7, 2021

very cool samples!

I also noticed that these patterns sometimes occur on higher resolutions, and I think you're probably right that the skip-excitation layers cause this...
And yes, today's changes might fix this, as we now introduce noise at every layer, which should break up the repetitive patterns.
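Roughly speaking, the per-layer noise injection looks something like this (a simplified sketch for illustration, not the exact code that was added):

```python
import torch
import torch.nn as nn

class NoiseInjection(nn.Module):
    """Sketch of per-layer noise injection: add Gaussian noise with a learned
    scale to each feature map, so the generator doesn't have to synthesize
    high-frequency texture purely from the deterministic upsampling path."""
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(1))

    def forward(self, feat):
        b, _, h, w = feat.shape
        noise = torch.randn(b, 1, h, w, device=feat.device, dtype=feat.dtype)
        return feat + self.weight * noise

# applied after the convolutions in every up-block
x = NoiseInjection()(torch.randn(2, 64, 128, 128))
```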

If not, you can try two things:

  • Use StyleGAN2. With the changes I added for the release, StyleGAN2 is working really well too in most cases.
  • Don't use the skip-excitation layers; the gain they yield is not that substantial tbh.

@JCBrouwer (Author) commented:
Alright, thanks for the quick reply! I'll fire up some runs with StyleGAN then.

By 'don't use the skip-excitation layers,' do you mean essentially making the FastGAN generator an nn.Sequential of UpBlockComp()'s? Is this a configuration you've tested?

@xl-sr (Contributor) commented Dec 7, 2021

Yes, that's exactly what I mean :) I did some testing and adding skip-excitation layers usually gave some minor improvements, so I kept them and didn't bother.
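Something along these lines, roughly (a sketch only; the import path, the UpBlockComp signature, and the channel widths are placeholders, not tested code from the repo):

```python
import torch.nn as nn
from pg_modules.networks_fastgan import UpBlockComp  # assumed import path, adjust to your copy

def make_plain_generator(ngf=64):
    # A stack of UpBlockComp up-blocks only, with no skip-excitation (SEBlock)
    # connections between low- and high-resolution feature maps.
    # Channel widths are placeholders, not the repo's defaults.
    return nn.Sequential(
        UpBlockComp(ngf * 16, ngf * 8),   #   4 ->   8
        UpBlockComp(ngf * 8,  ngf * 8),   #   8 ->  16
        UpBlockComp(ngf * 8,  ngf * 4),   #  16 ->  32
        UpBlockComp(ngf * 4,  ngf * 2),   #  32 ->  64
        UpBlockComp(ngf * 2,  ngf),       #  64 -> 128
        UpBlockComp(ngf,      ngf // 2),  # 128 -> 256
    )
```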

@xl-sr (Contributor) commented Dec 7, 2021

gonna close this for now, feel free to reopen it or to post updates :)

xl-sr closed this as completed Dec 7, 2021
@JCBrouwer (Author) commented:
I've given both StyleGAN2 and the sequential FastGAN a try, but haven't been happy with the results from either.

Here's the StyleGAN2 backend after 1000 kimg, with serious mode collapse since the beginning of training:

[Image: training grid of Projected GAN with StyleGAN2 backend; serious mode collapse is visible in repeated structural patterns throughout the grid]

The sequential FastGAN had even worse grid/diagonal artifacts than regular FastGAN (comparable to or slightly worse than the separable discriminator version?) and wasn't able to get off the ground (~300 kimg):

[Image: training grid with sequential FastGAN, griddy diagonal artifacts]

For reference, here's the same dataset trained with essentially the CIFAR10 config of stylegan2-ada-pytorch (no PL reg, lower gamma, no style mixing), finetuned from the FFHQ checkpoint for 1000 kimg. There's also a little bit of mode collapse visible, but much less than in the projected GAN version.

[Image: stylegan2-ada-pytorch training grid, best quality overall with minor structural mode collapse]

I'd say that overall the regular projected FastGAN config is the most successful, but it can't quite reach the same artifact-free quality as stylegan2-ada-pytorch.

@woctezuma commented Dec 12, 2021

Yes, the CIFAR config is basically set up to:

  • disable path length regularization
  • disable style mixing
  • disable residual skip connections

```python
if cfg == 'cifar':
    args.loss_kwargs.pl_weight = 0          # disable path length regularization
    args.loss_kwargs.style_mixing_prob = 0  # disable style mixing
    args.D_kwargs.architecture = 'orig'     # disable residual skip connections
```

cf. https://github.com/NVlabs/stylegan2-ada-pytorch/blob/d4b2afe9c27e3c305b721bc886d2cb5229458eba/train.py#L195-L198

and with a slightly different config:

```python
cfg_specs = {
    'paper512':  dict(ref_gpus=8,  kimg=25000,  mb=64, mbstd=8,  fmaps=1,
                      lrate=0.0025, gamma=0.5,  ema=20,  ramp=None, map=8),
    'cifar':     dict(ref_gpus=2,  kimg=100000, mb=64, mbstd=32, fmaps=1,
                      lrate=0.0025, gamma=0.01, ema=500, ramp=0.05, map=2),
}
```

cf. https://github.com/NVlabs/stylegan2-ada-pytorch/blob/d4b2afe9c27e3c305b721bc886d2cb5229458eba/train.py#L160
and https://github.com/NVlabs/stylegan2-ada-pytorch/blob/d4b2afe9c27e3c305b721bc886d2cb5229458eba/train.py#L158
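Summarized as a plain dict, the differences of the 'cifar' config relative to 'paper512' per the snippets above come down to roughly this (illustrative only, not code from either repo):

```python
# Illustrative summary of what the 'cifar' config changes relative to
# 'paper512', based on the snippets above (not code from either repo).
cifar_vs_paper512 = dict(
    pl_weight=0,            # path length regularization disabled
    style_mixing_prob=0,    # style mixing disabled
    D_architecture='orig',  # residual skip connections in D disabled
    gamma=0.01,             # vs. 0.5
    ema=500,                # vs. 20
    ramp=0.05,              # vs. None
    mbstd=32,               # vs. 8
    map=2,                  # mapping network depth, vs. 8
)
```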

@xl-sr (Contributor) commented Dec 13, 2021

very interesting, thanks for the insights!

Generally, we have seen that the current PG config might not be optimal for higher resolutions. As you can see in the results, the FIDs at higher resolution, e.g. for Pokemon, are generally a bit higher (of course the task is also harder). This is likely because the training resolution of the feature network is different; it is usually 224/256.

As a check, have you tried training on 256?
