
Question: monet2photo training loss #30

Closed
filmo opened this issue May 17, 2017 · 5 comments

filmo commented May 17, 2017

I'm trying to train the monet2photo. My command line was:

python train.py --dataroot ./datasets/monet2photo --name monet2photo --model cycle_gan --gpu_ids 0,1 --batchSize 8 --identity 0.5

The paper discussed using a batch size of 1, but I increased it to 8 to more fully occupy the GPUs. I think this is the only difference between what was described in the paper and my settings, but I may be wrong.
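(For anyone reading later: the --identity 0.5 flag adds an identity-mapping term on top of the adversarial and cycle-consistency losses, weighted by the lambda values below. A minimal sketch of how such a term is typically computed; the function and variable names here are illustrative, not the repo's exact code.)

```python
import torch.nn as nn

def identity_term(G_A2B, G_B2A, real_A, real_B,
                  lambda_idt=0.5, lambda_A=10.0, lambda_B=10.0):
    """Identity loss: each generator should leave images from its target
    domain unchanged, e.g. G_A2B(photo) should stay close to the photo."""
    l1 = nn.L1Loss()
    idt_B = G_A2B(real_B)  # run a real B image through the A->B generator
    idt_A = G_B2A(real_A)  # run a real A image through the B->A generator
    return (l1(idt_B, real_B) * lambda_B + l1(idt_A, real_A) * lambda_A) * lambda_idt
```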

------------ Options -------------
align_data: False
batchSize: 8
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
dataroot: ./datasets/monet2photo
display_freq: 100
display_id: 1
display_winsize: 256
fineSize: 256
gpu_ids: [0, 1]
identity: 0.5
input_nc: 3
isTrain: True
lambda_A: 10.0
lambda_B: 10.0
loadSize: 286
lr: 0.0002
max_dataset_size: inf
model: cycle_gan
nThreads: 2
n_layers_D: 3
name: monet2photo
ndf: 64
ngf: 64
niter: 100
niter_decay: 100
no_flip: False
no_html: False
no_lsgan: False
norm: instance
output_nc: 3
phase: train
pool_size: 50
print_freq: 100
save_epoch_freq: 5
save_latest_freq: 5000
serial_batches: False
use_dropout: False
which_direction: AtoB
which_epoch: latest
which_model_netD: basic
which_model_netG: resnet_9blocks
-------------- End ----------------
UnalignedDataLoader
#training images = 6287
cycle_gan

I'm training on two GTX 1070s.

I'm about 80 epochs in (~40 hours on my setup), and it seems like I'm oscillating between generated 'photos' that look okay-ish and 'photos' that look pretty 'meh', more like the original painting.

My loss declined pretty rapidly for the first 20 or so epochs, but now seems to be relatively stable with occasional crazy spikes:

[plot: overall training loss]

I think it's improving slightly with each epoch based on the images, and there seems to be a slight downward trend in the loss, but I also might just be kidding myself because I've been staring at it for a while. In other words, I'm not certain that what it's generating at epoch 80 is really that much better than epoch 30. Here's the most recent detailed loss curve.

[plot: recent detailed loss curve]

Question: Is this expected behavior (more or less), or should I be concerned that I've plateaued and/or used the wrong settings? At 100 epochs the learning rate is set to start decreasing based on the default settings. Given that it's taking about 30 minutes per epoch, and thus about 61 more hours to complete 200 epochs, I'm wondering if I should "keep on going" or "abort" and fix some settings.
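(For context on that last point: with niter: 100 and niter_decay: 100, the default schedule keeps the learning rate constant for the first 100 epochs and then decays it roughly linearly to zero over the next 100. A minimal sketch of that rule as a standard PyTorch scheduler; the helper name is illustrative, not the repo's exact code.)

```python
import torch

def linear_decay_scheduler(optimizer, niter=100, niter_decay=100):
    # LR multiplier stays at 1.0 for the first `niter` epochs, then falls
    # linearly toward 0 over the following `niter_decay` epochs.
    def lr_lambda(epoch):
        return 1.0 - max(0, epoch + 1 - niter) / float(niter_decay + 1)
    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)
```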


filmo commented May 25, 2017

I"m not sure how representative this is, but here's my final loss for the discriminators and generators.
There's visible oscillation from about 20 epochs to 100 epochs for Generator B as well as D_B.

Once the learning rate started decaying at epoch 100, the G_B loss slowly increased and G_A seemed to converge in the 0.35 to 0.40 range. Both discriminators stopped oscillating and their losses gradually decreased.

Perhaps this will be useful to someone doing the same. I used instance norm; perhaps I should have used batch norm since I was running batch_size = 8.
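(A rough sketch of what that swap looks like, assuming a small helper rather than the repo's exact builder. The relevant difference is that instance norm uses per-sample statistics, so it is insensitive to batch size, while batch norm shares statistics across the batch and could behave differently at batchSize 8.)

```python
import functools
import torch.nn as nn

def get_norm_layer(norm_type='instance'):
    # Instance norm: normalizes each image on its own.
    # Batch norm: normalizes using statistics pooled over the whole batch.
    if norm_type == 'instance':
        return functools.partial(nn.InstanceNorm2d, affine=False, track_running_stats=False)
    if norm_type == 'batch':
        return functools.partial(nn.BatchNorm2d, affine=True, track_running_stats=True)
    raise ValueError('unknown norm type: %s' % norm_type)
```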

[plot: final generator and discriminator losses, 2017-05-24, monet2photo]

junyanz (Owner) commented Jun 9, 2017

The losses are not so interpretable, as G and D are optimizing a minimax game. The plots you posted here look quite typical to me (except for the spike). I would mainly focus on the quality of the images.
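(Concretely, since no_lsgan is False in the options above, the adversarial term is the least-squares GAN loss, and D and G pull the same discriminator predictions toward opposite targets. A rough sketch with illustrative names, not the repo's exact code:)

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()  # least-squares GAN criterion

def loss_D(D, real, fake):
    # D is trained to score real images as 1 and generated images as 0 ...
    pred_real, pred_fake = D(real), D(fake.detach())
    return 0.5 * (mse(pred_real, torch.ones_like(pred_real)) +
                  mse(pred_fake, torch.zeros_like(pred_fake)))

def loss_G_adv(D, fake):
    # ... while G is trained to make D score the same fakes as 1, so a drop
    # in one loss tends to push the other back up rather than to zero.
    pred_fake = D(fake)
    return mse(pred_fake, torch.ones_like(pred_fake))
```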

@filmo filmo closed this as completed Jun 19, 2017
@John1231983

@filmo: Have you found the reason for the spike in the loss and how to solve it?


filmo commented Oct 20, 2019

No, I didn't end up exploring it further.

@John1231983

Thanks. Let's follow this issue to see how it gets solved:
#807
