Training issues #2

Closed
Gummygamer opened this issue Feb 21, 2021 · 8 comments

@Gummygamer

Hello, I admire your work here. I love GANs and Pokémon, I've read your paper on AEGAN, and I have some questions. How should I properly resume training? Is it enough to save the generator weights, or does that make it too hard for the discriminators' training to converge? Should I save all four networks? Just the generator and the encoder? Any other variables? And how should I change the architecture to support 48x48 or 32x32 datasets to make training faster?

@RileyLazarou
Owner

Hey André, thanks for your interest. If you're looking to stop and then resume training, you should save the weights for the generator, the encoder, the image discriminator, and the latent-vector discriminator (although the AEGAN recovers pretty well when resuming with a brand-new encoder and discriminators; this may be a useful regularization technique to look into).
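
Roughly, checkpointing looks like this. This is just a minimal PyTorch-style sketch with placeholder names, not the code in this repo (if you're working in Keras/TensorFlow instead, calling `save_weights()`/`load_weights()` on each of the four models is the analogous approach):

```python
import torch

# Placeholder names for the four AEGAN networks; the attributes/classes
# in this repo may be organized differently.
def save_checkpoint(path, generator, encoder, disc_image, disc_latent, epoch):
    torch.save({
        "epoch": epoch,
        "generator": generator.state_dict(),
        "encoder": encoder.state_dict(),
        "disc_image": disc_image.state_dict(),
        "disc_latent": disc_latent.state_dict(),
    }, path)

def load_checkpoint(path, generator, encoder, disc_image, disc_latent):
    ckpt = torch.load(path, map_location="cpu")
    generator.load_state_dict(ckpt["generator"])
    encoder.load_state_dict(ckpt["encoder"])
    disc_image.load_state_dict(ckpt["disc_image"])
    disc_latent.load_state_dict(ckpt["disc_latent"])
    return ckpt["epoch"]
```

If you want training to pick up exactly where it left off, save the optimizer state dicts alongside the model weights as well.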

As for changing the architecture, you'll have to tweak the generator, encoder, and discriminator classes. GANs are a lot of trial and error, and I tuned this one to work with 96x96 images, so you'll have to experiment with different layer shapes to see what works for you.
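
To give a sense of the kind of tweaking involved, here's a generic DCGAN-style generator parameterized by output size. It's illustrative only, not the actual classes in this repo, and the encoder and both discriminators would need matching changes:

```python
import torch.nn as nn

# Illustrative sketch, not this repo's architecture. A 96x96 output comes from a
# 6x6 base feature map with 4 upsampling blocks; 48x48 drops one block; 32x32
# uses a 4x4 base with 3 blocks.
def make_generator(latent_dim=64, base=6, n_upsamples=4, channels=3):
    layers = [
        nn.Linear(latent_dim, 256 * base * base),
        nn.Unflatten(1, (256, base, base)),
    ]
    ch = 256
    for _ in range(n_upsamples):
        layers += [
            nn.ConvTranspose2d(ch, ch // 2, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(ch // 2),
            nn.ReLU(inplace=True),
        ]
        ch //= 2
    layers += [nn.Conv2d(ch, channels, kernel_size=3, padding=1), nn.Tanh()]
    return nn.Sequential(*layers)

# 96x96: make_generator(base=6, n_upsamples=4)
# 48x48: make_generator(base=6, n_upsamples=3)
# 32x32: make_generator(base=4, n_upsamples=3)
```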

@Gummygamer
Author

Thanks a lot for the answers. I'll keep saving all the weights and maybe explore different hyperparameters. I'm currently training it on what seems to be a slightly more complete sprite dataset than the one you used to generate Pokémon.

@Gummygamer
Author

Gummygamer commented Feb 28, 2021

Just a little thing for you: https://www.youtube.com/watch?v=JpTDWyAQTiA
I've had moderate success in training. One little trick that seemed to help was, for around the first 1000 epochs, resuming with only the generator's weights loaded, and after that loading all four networks each time.

One remaining problem seems to be data preprocessing. I believe you preprocessed the original dataset linked in your Towards Data Science article (a dataset which, by the way, was recently expanded). Did you just batch-paint the backgrounds white? Did you exclude any images? Did you include females and shinies (those don't look useful)? I've read you mention 600 unique images expanded to 1600. I'm well aware that reproducing the exact same model is kind of pointless, but I'm aiming to understand the details of the process so that I can apply the methods to other image-generation tasks in the future. As I mentioned, the linked sprite source was expanded, so I might end up training on a bigger, though quite similar, dataset than you used.

In my attempts I've observed the G loss slowly increasing and Dx decreasing to near zero, which made me fear maxed-out or even degrading training.

Gummygamer reopened this Feb 28, 2021
@RileyLazarou
Owner

First of all, love the video 😄

It's been a while, but as best I can recall, these are the steps I took in data preprocessing (in no particular order):

  • Only include the front sprites (though including the backs and making it a conditional GAN might help generalization).
  • Remove the alpha channel, but first use it to turn all transparent pixels white (leaving it in might help the GAN learn to differentiate between white pixels that are part of the Pokémon and white pixels that are background). A rough sketch of this step follows the list.
  • I did include the shinies. They're hand-crafted palette swaps, not hue rotations (as I did in data augmentation), so I figured there was value in keeping them.
  • I did include the females. There probably wasn't much value in this, though, since they're so similar to the male sprites.
  • I discarded any sprite that didn't originally have 4 channels. I think this was only a handful of greyscale sprites, like the question mark sprite.
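
The alpha-to-white step looks roughly like this (a quick PIL sketch with placeholder directory names, not the exact script I ran back then):

```python
from pathlib import Path
from PIL import Image

# Flatten RGBA sprites onto a white background and drop the alpha channel.
# "sprites_raw"/"sprites_clean" are placeholder directory names.
def flatten_to_white(src_dir="sprites_raw", dst_dir="sprites_clean"):
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.png")):
        img = Image.open(path)
        if img.mode != "RGBA":   # skip the handful of sprites without 4 channels
            continue
        white = Image.new("RGBA", img.size, (255, 255, 255, 255))
        Image.alpha_composite(white, img).convert("RGB").save(dst / path.name)
```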

Actually, I just realized I still have the formatted sprites. Here you go!

> I've observed the G loss slowly increasing and Dx decreasing to near zero, which made me fear a maxed out or even degrading training.

Yeah, that'll happen. Exploring alternative losses might be a good path forward, and adjusting your learning rate might help too.

Happy hunting!

[image: gen 3474]

@Gummygamer
Author

Gummygamer commented Feb 28, 2021

Thanks for watching my video, and thanks for the clarification. I'll have a look at the files. I believe the alpha channel was detrimental to my experiments: the images were sometimes read as having black backgrounds and sometimes white backgrounds, which was reflected in the reconstructed and generated images. I didn't include shinies, back sprites, or females; in some of the experiments I also removed eggs, the question mark, and images that gave CRC warnings from ImageMagick. One idea for a conditional GAN is dividing by body shapes:

https://bulbapedia.bulbagarden.net/wiki/List_of_Pokémon_by_shape

Right now I'm training it on the bigger, newer version of the dataset, with females and shinies, but with the white-background preprocessing and with the eggs and the question mark removed.

Do you have any more philosophical thoughts about the limits of AI? The most striking thing about this instance of AEGAN, to me, is that it suggests we have at least a very rough version of what a mathematical description of visual artistic creation would look like. Do you believe this field can (and should) go much further, or will we hit a wall as Moore's Law dies out, we fail to build feasibly bigger quantum computers, and we run into environmental and health issues?

@RileyLazarou
Owner

I appreciate the enthusiasm, but I don't think I'm qualified to say much on those topics. I wouldn't call GANs a mathematical description of artistic creation, though; rather, they're just effective tools for approximating probability distributions in pixel space.

@Gummygamer
Author

Gummygamer commented Mar 5, 2021

Just updating: I've had quite a bit of success training it on the updated version of the dataset. Along the way, I discarded the 3D-model pictures and ended up with 2583 sprites.

[image: cre 0001]

As in your experiments, it seems good at creating Unown variants. Besides that, in more than one training run, I've seen great reconstructions of Arceus.

@RileyLazarou
Owner

Looks great, I'm glad it's working for you! Unown and Arceus are overrepresented in the dataset: Unown because all its forms look similar, and Arceus because it has a different palette for each type. To avoid biasing the GAN, it might be worthwhile to remove most of the Arceus sprites.
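
Something like this would do it, assuming the filenames contain the species name (which may not match how your dataset is organized):

```python
import random
from pathlib import Path

# Keep only a few randomly chosen Arceus sprites to reduce overrepresentation.
# Assumes filenames like "arceus-fire.png"; adjust the keyword to your naming.
def thin_out(sprite_dir, keyword="arceus", keep=3, seed=0):
    matches = [p for p in Path(sprite_dir).glob("*.png") if keyword in p.name.lower()]
    random.seed(seed)
    for p in random.sample(matches, max(len(matches) - keep, 0)):
        p.unlink()   # or move them aside instead of deleting
```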
