Commit bed2aa4: Code cleaning

HsinYingLee committed Jul 22, 2018
1 parent 54d28e9

Showing 7 changed files with 576 additions and 527 deletions.
13 changes: 10 additions & 3 deletions README.md
@@ -5,7 +5,7 @@
# Diverse Image-to-Image Translation via Disentangled Representations
[[Project Page]]()[[Paper]]()

-Pytorch implementation for our diverse image-to-image translation method. With the proposed disentangled representation aproach, we are able to produce diverse translation results without paired training images.
+Pytorch implementation for our diverse image-to-image translation method. With the proposed disentangled representation approach, we are able to produce diverse translation results without paired training images.

Contact: Hsin-Ying Lee (hlee246@ucmerced.edu) and Hung-Yu Tseng (htseng6@ucmerced.edu)

@@ -49,7 +49,7 @@ link for photo <-> portrait
## Training Examples
- Yosemite summer <-> winter translation
```
-python3 train.py --dataroot ../datasets/yosemite --concat 1 --name yosemite
+python3 train.py --dataroot ../datasets/yosemite --name yosemite
tensorboard --logdir ../logs/yosemite
```
Results and saved models can be found at `../results/yosemite`.
@@ -65,6 +65,13 @@ Results and saved models can be found at `../results/photo2portrait`.
- Download a pre-trained model
- Generate results in domain B from domain A
```
-python3 test.py --dataroot ../datasets/yosemite --a2b 1 --name yosemite --concat 1 --resume ../models/example.pth
+python3 test.py --dataroot ../datasets/yosemite --a2b 1 --name yosemite --resume ../models/example.pth
```
Results can be found at `../outputs/yosemite`.

## Training options and tips
- Due to the use of adaptive pooling in the attribute encoders, our model supports various input sizes. For example, here are the results of Grayscale -> RGB translation using 340x340 images. A minimal encoder sketch follows this list.
<img src='' width="1000px"/>
- We provide two different methods for combining the content representation and the attribute vector. One is simple concatenation; the other is feature-wise transformation. In our experience, if the translation involves less shape variation (e.g. Winter2Summer), simple concatenation produces better results. On the other hand, for translations with shape variation (e.g. cat2dog, celeb2portrait), feature-wise transformation should be used (i.e. set --concat 0) in order to generate diverse results. A sketch of the concatenation step follows this list.
- In our experience, using a multiscale discriminator also consistently produces better results.
- We also provide an option for spectral normalization (https://arxiv.org/abs/1802.05957). We use the code from the master branch of PyTorch, since PyTorch 0.5.0 is not yet stable. However, while spectral normalization significantly stabilizes training, we have not observed consistent quality improvements. We encourage everyone to play around with various settings and explore better configurations. A minimal spectral-norm usage sketch follows this list.
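On the adaptive-pooling point above, here is a minimal sketch of why `nn.AdaptiveAvgPool2d` makes an attribute encoder input-size agnostic. This is a toy module, not the repository's actual encoder; the layer widths and `nz=8` are illustrative only:
```
import torch
import torch.nn as nn

class AttrEncoderSketch(nn.Module):
  """Toy attribute encoder: convolutions followed by adaptive pooling,
  so any input resolution collapses to a fixed-size attribute vector."""
  def __init__(self, nz=8):
    super().__init__()
    self.conv = nn.Sequential(
      nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
      nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True))
    self.pool = nn.AdaptiveAvgPool2d(1)  # 1x1 output regardless of input size
    self.fc = nn.Linear(128, nz)

  def forward(self, x):
    h = self.pool(self.conv(x))            # (N, 128, 1, 1)
    return self.fc(h.view(x.size(0), -1))  # (N, nz)

enc = AttrEncoderSketch()
for size in (216, 256, 340):               # different resolutions, same output
  print(size, enc(torch.randn(1, 3, size, size)).shape)  # torch.Size([1, 8])
```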
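For the `--concat 1` mode, the combination step amounts to tiling the attribute vector over the spatial grid of the content feature map and concatenating along the channel axis. The tensor shapes below are illustrative assumptions, not the repository's actual generator code:
```
import torch

def inject_concat(content, z):
  # Tile z over the content feature map's spatial grid,
  # then concatenate along the channel dimension.
  n, _, h, w = content.shape
  z_img = z.view(n, -1, 1, 1).expand(n, z.size(1), h, w)
  return torch.cat([content, z_img], dim=1)

content = torch.randn(2, 256, 54, 54)   # content feature map (shape assumed)
z = torch.randn(2, 8)                   # attribute vector
print(inject_concat(content, z).shape)  # torch.Size([2, 264, 54, 54])
```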
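For spectral normalization, current PyTorch releases expose it as `torch.nn.utils.spectral_norm`. A minimal sketch of wrapping a discriminator's weight layers (the layer sizes are illustrative, not the repository's discriminator):
```
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Constrain each layer's spectral norm to ~1, bounding the
# discriminator's Lipschitz constant and stabilizing training.
disc = nn.Sequential(
  spectral_norm(nn.Conv2d(3, 64, 4, stride=2, padding=1)),
  nn.LeakyReLU(0.2, inplace=True),
  spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)),
  nn.LeakyReLU(0.2, inplace=True),
  spectral_norm(nn.Conv2d(128, 1, 4)),
)
```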
1 change: 0 additions & 1 deletion src/dataset.py
@@ -58,7 +58,6 @@ def __init__(self, opts):
    transforms = [Resize(opts.resize_size, Image.BICUBIC)]
    if opts.phase == 'train':
      transforms.append(RandomCrop(opts.crop_size))
-     #transforms.append(CenterCrop(opts.crop_size))
    else:
      transforms.append(CenterCrop(opts.crop_size))
    if not opts.no_flip:
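For context, the pipeline this hunk touches composes torchvision transforms roughly as follows. This is a sketch: the crop logic mirrors the hunk, while the flip transform and the trailing `ToTensor`/`Normalize` steps are assumptions, not shown above:
```
from PIL import Image
from torchvision.transforms import (CenterCrop, Compose, Normalize, RandomCrop,
                                    RandomHorizontalFlip, Resize, ToTensor)

def build_transforms(opts):
  # Resize first, then crop randomly for training or centrally otherwise.
  tfs = [Resize(opts.resize_size, Image.BICUBIC)]
  if opts.phase == 'train':
    tfs.append(RandomCrop(opts.crop_size))
  else:
    tfs.append(CenterCrop(opts.crop_size))
  if not opts.no_flip:
    tfs.append(RandomHorizontalFlip())
  # Assumed tail: convert to tensor and scale to [-1, 1].
  tfs += [ToTensor(), Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
  return Compose(tfs)
```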
6 changes: 1 addition & 5 deletions src/model.py
@@ -92,7 +92,6 @@ def test_forward(self, image_a, image_b, random_z=False, a2b=True, idx=0):
    self.z_content_a, self.z_content_b = self.enc_c.forward(image_a, image_b)
    if random_z:
      self.z_random = self.get_z_random(image_a.size(0), self.nz, 'gauss')
-     #np.save('/home/ym41608/z/{}'.format(idx), self.z_random.cpu().numpy())
    if a2b:
      image = self.gen.forward_b(self.z_content_a, self.z_random)
    else:
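For reference, the `get_z_random` call above samples the random attribute vector from a prior; conceptually it reduces to something like this sketch (the repository's actual implementation may differ, e.g. in device placement):
```
import torch

def get_z_random(batch_size, nz, random_type='gauss'):
  # 'gauss' draws the attribute vector from a standard normal, N(0, I).
  return torch.randn(batch_size, nz)
```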
@@ -245,8 +244,6 @@ def backward_D(self, netD, real, fake):
      out_real = nn.functional.sigmoid(out_b)
      all0 = torch.zeros_like(out_fake).cuda(self.gpu)
      all1 = torch.ones_like(out_real).cuda(self.gpu)
-     #all1 = torch.ones((out_real.size(0))).cuda(self.gpu)
-     #all0 = torch.zeros((out_fake.size(0))).cuda(self.gpu)
      ad_fake_loss = nn.functional.binary_cross_entropy(out_fake, all0)
      ad_true_loss = nn.functional.binary_cross_entropy(out_real, all1)
      loss_D += ad_true_loss + ad_fake_loss
@@ -350,7 +347,6 @@ def backward_G_GAN(self, fake, netD=None):
    for out_a in outs_fake:
      outputs_fake = nn.functional.sigmoid(out_a)
      all_ones = torch.ones_like(outputs_fake).cuda(self.gpu)
-     #all_ones = Variable(torch.ones((outputs_fake.size(0))).cuda(self.gpu))
      loss_G += nn.functional.binary_cross_entropy(outputs_fake, all_ones)
    return loss_G
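Stripped of the multiscale loop and GPU placement, the two hunks above implement the standard BCE GAN objectives. A self-contained sketch of the pattern, not the repository's exact code:
```
import torch
import torch.nn.functional as F

def discriminator_loss(real_logits, fake_logits):
  # Push the discriminator's real outputs toward 1 and fake outputs toward 0.
  real, fake = torch.sigmoid(real_logits), torch.sigmoid(fake_logits)
  return (F.binary_cross_entropy(real, torch.ones_like(real)) +
          F.binary_cross_entropy(fake, torch.zeros_like(fake)))

def generator_loss(fake_logits):
  # Push the discriminator's outputs on generated images toward 1.
  fake = torch.sigmoid(fake_logits)
  return F.binary_cross_entropy(fake, torch.ones_like(fake))
```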

@@ -435,7 +431,7 @@ def save(self, filename, ep, total_it):
    torch.save(state, filename)
    return

-  def assemble_outputs3(self):
+  def assemble_outputs(self):
    images_a = self.normalize_image(self.real_A_encoded)
    images_b = self.normalize_image(self.real_B_encoded)
    images_a1 = self.normalize_image(self.fake_A_encoded)
