Improve GAN #46

Open · 2 tasks
raspstephan opened this issue Apr 23, 2021 · 9 comments

@raspstephan (Owner)

Here is a new thread for discussing how to proceed with the GAN experiments. For the first results, see #29.

Current status:

  • Working on precip-to-precip TIGGE to MRMS
  • Reproduced the GAN fairly closely from the Leinonen paper; for differences, see below.
  • Output is GAN-like (i.e. generative) and the discriminator seems to do what it should, but the output doesn't look realistic yet.

Goal:

  • Create precip-to-precip GAN that produces realistic-looking output

Comments/observations:

  • It's an open question whether our forecast-to-observation setup changes things a lot. Hence @annavaughan's suggestion: work with a pure super-resolution setup until we have something realistic-looking.
  • Data preprocessing seems to matter A LOT, specifically the weighted sampling and the log-transform. My first job is to check again exactly whether everything is correct and make sure we pick a setup that does what we think it should! Check log-transform and weighted sampling #45
  • From what I tested, there are some differences to the original Leinonen paper and to the Oxford/ECMWF work:
    • I was using spectral norm in G and D. Leinonen only in D. Oxford not at all.
    • I pretrained G using MSE for one epoch. Leinonen does not do this. Not sure about Oxford.
    • I added an L1 loss for the G. BUT because the G loss was so crazy, it shouldn't have any impact.
    • Leinonen uses L2 regularization for G. I am not doing this, not sure about Oxford.
    • Architecture is slightly different. Compared to Leinonen, I am concatenating only a single noise channel. We and Oxford are not using the two-path structure of the discriminator.
    • We are upscaling the low-res input and concatenating it to the high-res image at the start of D; Oxford and Leinonen use a two-path structure before concatenating (see the sketch after this list).
    • We are not using any output activation in G; Oxford uses ReLU, Leinonen uses a sigmoid.
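To make the last two differences concrete, here is a minimal sketch of our concatenate-at-input D variant, assuming PyTorch; the channel counts, layer sizes, and scale factor are placeholders, not the repo's actual architecture. The `use_sn` flag toggles spectral norm so the D-only vs. G-and-D variants can be compared:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcatInputDiscriminator(nn.Module):
    """Current variant: upsample the low-res input and concatenate it
    with the high-res image at the very start of D (no two-path structure)."""

    def __init__(self, base_ch=64, scale=4, use_sn=True):
        super().__init__()
        sn = nn.utils.spectral_norm if use_sn else (lambda m: m)
        self.scale = scale
        # 2 input channels: 1 high-res precip + 1 upsampled low-res precip
        self.net = nn.Sequential(
            sn(nn.Conv2d(2, base_ch, 3, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            sn(nn.Conv2d(base_ch, 2 * base_ch, 3, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            sn(nn.Linear(2 * base_ch, 1)),
        )

    def forward(self, hi_res, lo_res):
        # Upsample the conditioning field to the high-res grid, then concat
        lo_up = F.interpolate(lo_res, scale_factor=self.scale, mode="nearest")
        return self.net(torch.cat([hi_res, lo_up], dim=1))
```

The two-path alternative would instead run `hi_res` and `lo_res` through separate conv stacks before concatenating the feature maps. The G output-activation variants are simply `nn.Identity()`, `nn.ReLU()`, or `nn.Sigmoid()` as the last layer of G.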

Next steps:

  • Test best setup so far with pure SR setup
  • Test sensitivity of GAN results (visually at first) to several of the options above (see the sketch after this list)
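For the sensitivity tests, one cheap way to keep the runs organised is a small config grid over the options above; a sketch with hypothetical option names (the actual launch call is repo-specific, so this just prints the run names):

```python
from itertools import product

# Hypothetical ablation grid over the options listed above
grid = {
    "spectral_norm": ["G_and_D", "D_only", "none"],
    "g_output_act": ["none", "relu", "sigmoid"],
    "pretrain_g_mse": [True, False],
}

for values in product(*grid.values()):
    cfg = dict(zip(grid.keys(), values))
    run_name = "_".join(f"{k}={v}" for k, v in cfg.items())
    print(run_name)  # in practice: launch one training run per config
```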

@annavaughan @HirtM I will focus on this over the next week. Feel free to also run experiments. If you do, could you maybe just quickly announce this in this thread, so that we make sure we are not doing duplicate work. I have been experimenting in notebooks in the Experiments folder. Feel free to copy this setup.

@raspstephan added this to In progress in NWP downscaling on Apr 23, 2021
@raspstephan (Owner, Author)

First hint: without the log transform, the GAN does not work. This suggests that data preprocessing is super important.
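For reference, a minimal sketch of the preprocessing in question, assuming the log transform is a shifted log(x + ε) that maps zero precip to zero, and that the weighted sampling draws patches in proportion to their mean precip; ε, the floor, and all names here are hypothetical, not the repo's actual values:

```python
import numpy as np
import torch
from torch.utils.data import WeightedRandomSampler

EPS = 0.01  # hypothetical offset; the actual value in the repo may differ

def log_transform(precip):
    """Compress the heavy precipitation tail; maps 0 -> 0."""
    return np.log(precip + EPS) - np.log(EPS)

def inverse_log_transform(x):
    """Exact inverse of log_transform."""
    return np.exp(x + np.log(EPS)) - EPS

def make_sampler(patch_mean_precip, floor=0.02):
    """Oversample rainy patches; the floor keeps dry patches in play."""
    weights = torch.as_tensor(patch_mean_precip, dtype=torch.float) + floor
    return WeightedRandomSampler(weights, num_samples=len(weights),
                                 replacement=True)
```

A cheap sanity check for #45 is asserting that `inverse_log_transform(log_transform(x))` round-trips on sample data, and that exactly the same transform (same ε) is applied to TIGGE and MRMS.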

@raspstephan (Owner, Author)

I am training our current setup in the pure super-resolution configuration on my branch stephans_gan, in notebook 07. Will update later on how it went.

@raspstephan (Owner, Author)

Really weird: Using the pure SR option with the same setup that produced somewhat reasonable results before does not work. The losses explode and the image looks crap.

[image]

This emphasizes that I need to check the log-transform/preprocessing.

@annavaughan (Collaborator)

@raspstephan I'm going to look at this now. Is the current best version the one in notebook 07?

@raspstephan (Owner, Author)

Hi Anna, here's a quick update. 07 contains the close-to-Leinonen setup with the pure super-resolution approach. THIS DOES NOT WORK!? 08 contains the same setup with regular TIGGE to MRMS, which "works". I just tested this again in 09 and 10 with the new weighted sampling (not yet fully committed because it's still running). SAME outcome: TIGGE to MRMS "works", pure SR does not. I have no idea why. Maybe we could have a quick meeting tomorrow to look at this together?

@annavaughan (Collaborator)

Meeting tomorrow would be great; I'm free between 12pm and 2:30pm Munich time if something in that window works? I'm very confused why the pure SR doesn't work - I'm looking at the code now and it seems fine.

@raspstephan (Owner, Author)

Great, here's a summary of my very confusing findings. BTW @annavaughan, I am working in the Experiments subdirectory of notebooks.

TIGGE-->MRMS sort of works, also with the new sampling method (notebook 09). The loss implodes.
[images]

Pure SR (so MRMS coarse-->MRMS fine) with the same setup does NOT work (notebook 10).
[images]

Testing a different discriminator architecture with two heads, more similar to Leinonen, produces a really weird discriminator loss of almost exactly 10 (!?). This is caused by the gradient penalty. No idea why.
[images]
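One hedged guess about the suspiciously exact 10: the standard WGAN-GP penalty is λ·E[(‖∇ D(x̂)‖₂ − 1)²] with λ = 10, so if the gradient norm at the interpolates collapses towards 0 while the Wasserstein terms stay near zero, the D loss sits at almost exactly λ = 10. A minimal sketch of the standard penalty for comparison, assuming PyTorch and a conditional D taking (image, condition); the names are hypothetical:

```python
import torch

def gradient_penalty(D, real, fake, cond, lambda_gp=10.0):
    """WGAN-GP: lambda * E[(||grad_xhat D(xhat)||_2 - 1)^2]."""
    # Random interpolates between real and fake samples
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    d_out = D(x_hat, cond)
    grads = torch.autograd.grad(outputs=d_out, inputs=x_hat,
                                grad_outputs=torch.ones_like(d_out),
                                create_graph=True)[0]
    # Per-sample L2 norm of the gradient
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()
```

Logging `grad_norm.mean()` next to the losses would confirm or rule this out.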

We need to figure out why the pure SR setup behaves so differently from the TIGGE-->MRMS setup. This makes no sense to me at the moment...

@raspstephan (Owner, Author)

@annavaughan Update on my experiments with a lower learning rate. It started out encouraging but then wasn't so great after all.

First up, I ran the TIGGE-MRMS setup that had previously "worked" with a learning rate of 1e-4, now with 1e-5 (notebook 09). In the end the results look pretty similar to the original learning rate: maybe a little less filamenty, but still not realistic enough.
[images]

Then I ran the pure super-resolution setup with 1e-5 (notebook 10). Unfortunately, I didn't save the training logs, but after about 8 epochs the losses again exploded and the images became increasingly unrealistic.
[image]

This makes me wonder whether the LR was really at fault after all. Maybe it just delayed the inevitable... So no solution yet. How did you get on with your Leinonen setup? Happy to talk again tomorrow to debug.

@raspstephan (Owner, Author)

@annavaughan So my MNIST test was a failure. I guess this tells us at least that it's not the data (which we already suspected). But then WHAT THE HELL IS IT!?!?!? It must be something in the networks or the training that we are both doing wrong?? Well, that is, if your MNIST experiments also turn out to fail. Keep me posted :)

[images]
