Improve GAN #46

Open · 2 tasks
raspstephan opened this issue Apr 23, 2021 · 9 comments

@raspstephan (Owner)

Here is a new thread for discussing how to proceed with the GAN experiments. For the first results, see #29.

Current status:

  • Working on precip-to-precip TIGGE to MRMS
  • Reproduced the GAN fairly closely from the Leinonen paper; for differences, see below.
  • Output is GAN-like (i.e. generative) and the discriminator seems to do what it should, but the output doesn't look realistic yet.

Goal:

  • Create precip-to-precip GAN that produces realistic-looking output

Comments/observations:

  • It's an open question whether our forecast-to-observation setup changes things a lot. Hence @annavaughan's suggestion: work with a pure super-resolution setup until we have something realistic-looking.
  • Data preprocessing seems to matter A LOT, specifically the weighted sampling and the log-transform. My first job is to check again exactly whether everything is correct and make sure we pick a setup that does what we think it should! Check log-transform and weighted sampling #45
  • From what I tested, there are some differences to the original Leinonen paper and to the Oxford/ECMWF work:
    • I was using spectral norm in G and D. Leinonen only in D. Oxford not at all.
    • I pretrained G using MSE for one epoch. Leinonen does not do this. Not sure about Oxford.
    • I added an L1 loss for the G. BUT because the G loss was so crazy, it shouldn't have any impact.
    • Leinonen uses L2 regularization for G. I am not doing this, not sure about Oxford.
    • Architecture is slightly different. Compared to Leinonen, I am concatenating only a single noise channel. We and Oxford are not using the two-path structure of the discriminator.
    • We are upscaling the low-res input and concatenating it to the high-res image at the start of D; Oxford and Leinonen use a two-path structure before concatenating (see the sketch after this list).
    • We are not using any output activation in G; Oxford uses ReLU, Leinonen uses a sigmoid.
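To make the last two differences concrete, here is a minimal sketch of our concatenate-at-input D variant, assuming PyTorch; the channel counts, layer sizes, and scale factor are placeholders, not the repo's actual architecture. The `use_sn` flag toggles spectral norm so the D-only vs. G-and-D variants can be compared:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcatInputDiscriminator(nn.Module):
    """Current variant: upsample the low-res input and concatenate it
    with the high-res image at the very start of D (no two-path structure)."""

    def __init__(self, base_ch=64, scale=4, use_sn=True):
        super().__init__()
        sn = nn.utils.spectral_norm if use_sn else (lambda m: m)
        self.scale = scale
        # 2 input channels: 1 high-res precip + 1 upsampled low-res precip
        self.net = nn.Sequential(
            sn(nn.Conv2d(2, base_ch, 3, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            sn(nn.Conv2d(base_ch, 2 * base_ch, 3, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            sn(nn.Linear(2 * base_ch, 1)),
        )

    def forward(self, hi_res, lo_res):
        # Upsample the conditioning field to the high-res grid, then concat
        lo_up = F.interpolate(lo_res, scale_factor=self.scale, mode="nearest")
        return self.net(torch.cat([hi_res, lo_up], dim=1))
```

The two-path alternative would instead run `hi_res` and `lo_res` through separate conv stacks before concatenating the feature maps. The G output-activation variants are simply `nn.Identity()`, `nn.ReLU()`, or `nn.Sigmoid()` as the last layer of G.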

Next steps:

  • Test best setup so far with pure SR setup
  • Test sensitivity of GAN results (visually at first) to several of the options above (see the sketch after this list)
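For the sensitivity tests, one cheap way to keep the runs organised is a small config grid over the options above; a sketch with hypothetical option names (the actual launch call is repo-specific, so this just prints the run names):

```python
from itertools import product

# Hypothetical ablation grid over the options listed above
grid = {
    "spectral_norm": ["G_and_D", "D_only", "none"],
    "g_output_act": ["none", "relu", "sigmoid"],
    "pretrain_g_mse": [True, False],
}

for values in product(*grid.values()):
    cfg = dict(zip(grid.keys(), values))
    run_name = "_".join(f"{k}={v}" for k, v in cfg.items())
    print(run_name)  # in practice: launch one training run per config
```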

@annavaughan @HirtM I will focus on this over the next week. Feel free to also run experiments. If you do, could you maybe just quickly announce this in this thread, so that we make sure we are not doing duplicate work. I have been experimenting in notebooks in the Experiments folder. Feel free to copy this setup.

@raspstephan added this to In progress in NWP downscaling on Apr 23, 2021
@raspstephan (Owner, Author)

First hint: without the log transform, the GAN does not work. This suggests that data preprocessing is super important.
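For reference, a minimal sketch of the preprocessing in question, assuming the log transform is a shifted log(x + ε) that maps zero precip to zero, and that the weighted sampling draws patches in proportion to their mean precip; ε, the floor, and all names here are hypothetical, not the repo's actual values:

```python
import numpy as np
import torch
from torch.utils.data import WeightedRandomSampler

EPS = 0.01  # hypothetical offset; the actual value in the repo may differ

def log_transform(precip):
    """Compress the heavy precipitation tail; maps 0 -> 0."""
    return np.log(precip + EPS) - np.log(EPS)

def inverse_log_transform(x):
    """Exact inverse of log_transform."""
    return np.exp(x + np.log(EPS)) - EPS

def make_sampler(patch_mean_precip, floor=0.02):
    """Oversample rainy patches; the floor keeps dry patches in play."""
    weights = torch.as_tensor(patch_mean_precip, dtype=torch.float) + floor
    return WeightedRandomSampler(weights, num_samples=len(weights),
                                 replacement=True)
```

A cheap sanity check for #45 is asserting that `inverse_log_transform(log_transform(x))` round-trips on sample data, and that exactly the same transform (same ε) is applied to TIGGE and MRMS.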

@raspstephan (Owner, Author)

I am training our current setup in the pure super-resolution configuration on my branch stephans_gan, in notebook 07. Will update later on how it went.

@raspstephan (Owner, Author)

Really weird: Using the pure SR option with the same setup that produced somewhat reasonable results before does not work. The losses explode and the image looks crap.

[image]

This emphasizes that I need to check the log-transform/preprocessing.

@annavaughan (Collaborator)

@raspstephan I'm going to look at this now. Is the current best version the one in notebook 07?

@raspstephan (Owner, Author)

Hi Anna, here's a quick update. 07 contains the close-to-Leinonen setup with the pure super-resolution approach. THIS DOES NOT WORK!? 08 contains the same setup with regular TIGGE to MRMS, which "works". I just tested this again in 09 and 10 with the new weighted sampling (not yet fully committed because it's still running). SAME outcome: TIGGE to MRMS "works", pure SR does not. I have no idea why. Maybe we could have a quick meeting tomorrow to look at this together?

@annavaughan (Collaborator)

Meeting tomorrow would be great; I'm free between 12pm and 2:30pm Munich time if something in that window works? I'm very confused why the pure SR doesn't work - I'm looking at the code now and it seems fine.

@raspstephan (Owner, Author)

Great, here's a summary of my very confusing findings. BTW @annavaughan, I am working in the Experiments subdirectory of notebooks.

TIGGE-->MRMS sort of works, also with the new sampling method (notebook 09). The loss implodes.
[images]

Pure SR (so MRMS coarse-->MRMS fine) with the same setup does NOT work (notebook 10).
[images]

Testing a different discriminator architecture with two heads, more similar to Leinonen, produces a really weird discriminator loss of almost exactly 10 (!?). This is caused by the gradient penalty. No idea why.
[images]
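One hedged guess about the suspiciously exact 10: the standard WGAN-GP penalty is λ·E[(‖∇ D(x̂)‖₂ − 1)²] with λ = 10, so if the gradient norm at the interpolates collapses towards 0 while the Wasserstein terms stay near zero, the D loss sits at almost exactly λ = 10. A minimal sketch of the standard penalty for comparison, assuming PyTorch and a conditional D taking (image, condition); the names are hypothetical:

```python
import torch

def gradient_penalty(D, real, fake, cond, lambda_gp=10.0):
    """WGAN-GP: lambda * E[(||grad_xhat D(xhat)||_2 - 1)^2]."""
    # Random interpolates between real and fake samples
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    d_out = D(x_hat, cond)
    grads = torch.autograd.grad(outputs=d_out, inputs=x_hat,
                                grad_outputs=torch.ones_like(d_out),
                                create_graph=True)[0]
    # Per-sample L2 norm of the gradient
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()
```

Logging `grad_norm.mean()` next to the losses would confirm or rule this out.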

We need to figure out why the pure SR setup behaves so differently from the TIGGE-->MRMS setup. This makes no sense to me at the moment...

@raspstephan (Owner, Author)

@annavaughan Update on my experiments with a lower learning rate. It started out encouraging but then wasn't so great after all.

First up, I ran the TIGGE-MRMS setup that had previously "worked" with a learning rate of 1e-4, now with 1e-5 (notebook 09). In the end the results look pretty similar to the original learning rate: maybe a little less filamenty, but still not realistic enough.
[images]

Then I ran the pure super-resolution setup with 1e-5 (notebook 10). Unfortunately, I didn't save the training logs, but after about 8 epochs the losses again exploded and the images became increasingly unrealistic.
[image]

This makes me wonder whether the LR was really at fault after all. Maybe it just delayed the inevitable... So no solution yet. How did you get on with your Leinonen setup? Happy to talk again tomorrow to debug.

@raspstephan (Owner, Author)

@annavaughan So my MNIST test was a failure. I guess this tells us at least that it's not the data (which we already suspected). But then WHAT THE HELL IS IT!?!?!? It must be something in the networks or the training that we are both doing wrong?? Well, that is, if your MNIST experiments also turn out to fail. Keep me posted :)

[images]
