Option to use AdaIN
eridgd committed Dec 22, 2017
1 parent b20f693 commit 0f3d820
Showing 2 changed files with 6 additions and 3 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -62,14 +62,15 @@ Other args to take note of:
* `--concat` Append the style image to the stylized output
* `--noise` Generate textures from random noise image instead of webcam
* `--random` Load a new random image every # of frames
* `--adain` Use [Adaptive Instance Normalization](https://arxiv.org/abs/1703.06868) as transfer op instead of WCT

There are also a couple of keyboard shortcuts:

* `r` Load random image from style folder
* `w` Write frame to a .png
* `c` Toggle color preservation
* `s` Toggle [style swap](#style-swap) (only applied on layer relu5_1)
* `a` Toggle Adaptive Instance Normalization instead of WCT
* `a` Toggle AdaIN as transform instead of WCT
* `q` Quit cleanly and close streams

`stylize.py` will stylize content images and does not require OpenCV. The options are the same as for the webcam script with the addition of `--content-path`, which can be a single image file or folder, and `--out-path` to specify the output folder. Each style in `--style-path` will be applied to each content image.
@@ -155,7 +156,7 @@ For example:

## Notes

* This repo is based on [my implementation](https://github.com/eridgd/AdaIN-TF/) of [Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization](https://arxiv.org/abs/1703.06868) by Huang et al.
* This repo is based on [my implementation](https://github.com/eridgd/AdaIN-TF/) of [Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization](https://arxiv.org/abs/1703.06868) by Huang et al. The [AdaIN op](https://github.com/eridgd/WCT-TF/blob/master/ops.py#L276) is included here as an alternative transform to WCT (a minimal sketch of the idea is shown after this list). It generally requires multiple stylization passes to achieve a comparable effect.
* The stylization pipeline can be hooked up with decoders in any order. For instance, to reproduce the (sub-optimal) reversed fine-to-coarse pipeline in figure 5(d) of the original paper, use the option `--relu-targets relu1_1 relu2_1 relu3_1 relu4_1 relu5_1` in webcam.py/stylize.py.
* `coral.py` implements [CORrelation ALignment (CORAL)](https://arxiv.org/abs/1612.01939) to transfer colors from the content image to the style image in order to preserve colors in the stylized output (see the CORAL sketch below). The default method uses NumPy; there is also a commented-out PyTorch version that is slightly faster.
* WCT involves two tf.svd ops, which as of TF r1.4 have a GPU implementation. However, the GPU version appears to be 2-4x slower than the CPU one, so the ops are explicitly executed on `/cpu:0` in ops.py (see the device-pinning sketch below). [See here](https://github.com/tensorflow/tensorflow/issues/13603) for an interesting discussion of the issue.
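
The note above points to the AdaIN op; as a rough illustration of what that transform does (not the repo's actual ops.py code — the function name and the H x W x C feature layout are assumptions for this example), here is a minimal NumPy sketch: normalize the content features channel-wise, then rescale and shift them with the style features' channel-wise statistics.

```python
import numpy as np

def adain_sketch(content_feat, style_feat, eps=1e-5):
    """Minimal AdaIN sketch (Huang et al. 2017): match the channel-wise
    mean/std of the content features to those of the style features.
    Inputs are H x W x C float arrays; illustrative only."""
    c_mean = content_feat.mean(axis=(0, 1), keepdims=True)
    c_std = content_feat.std(axis=(0, 1), keepdims=True) + eps
    s_mean = style_feat.mean(axis=(0, 1), keepdims=True)
    s_std = style_feat.std(axis=(0, 1), keepdims=True)
    # Normalize away the content statistics, then impose the style statistics
    return s_std * (content_feat - c_mean) / c_std + s_mean
```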
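
Similarly, a sketch of the CORAL idea behind coral.py: re-color one image so its per-channel mean and covariance match another's (for color preservation, the style image is matched to the content image). The function name, the eigendecomposition-based matrix square root, and the H x W x 3 float layout are assumptions for illustration, not the repo's actual code.

```python
import numpy as np

def coral_sketch(source_img, target_img, eps=1e-5):
    """Re-color source_img (H x W x 3, floats in [0, 1]) so its channel mean
    and covariance match target_img. Illustrative sketch, not coral.py."""
    src = source_img.reshape(-1, 3).T   # 3 x N pixel matrices
    tgt = target_img.reshape(-1, 3).T
    mu_s, mu_t = src.mean(1, keepdims=True), tgt.mean(1, keepdims=True)
    cov_s = np.cov(src) + eps * np.eye(3)
    cov_t = np.cov(tgt) + eps * np.eye(3)

    def sqrtm(m):
        # Symmetric PSD matrix square root via eigendecomposition
        vals, vecs = np.linalg.eigh(m)
        return vecs @ np.diag(np.sqrt(np.maximum(vals, 0))) @ vecs.T

    whiten = np.linalg.inv(sqrtm(cov_s))   # strip source color correlations
    recolor = sqrtm(cov_t)                 # impose target color correlations
    out = recolor @ whiten @ (src - mu_s) + mu_t
    return np.clip(out.T.reshape(source_img.shape), 0, 1)
```

Under these assumptions, something like `coral_sketch(style_img, content_img)` would re-color the style image with the content image's palette before stylization.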
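
Finally, a small sketch of the CPU pinning described in the last bullet. The shapes and the toy covariance matrix are made up for illustration, but `tf.device('/cpu:0')` and `tf.svd` are the TF 1.x mechanisms the note refers to.

```python
import tensorflow as tf  # TF 1.x API, as used by this repo

# Toy illustration of forcing tf.svd onto the CPU (shapes are arbitrary).
feats = tf.random_normal([512, 1024])                     # fake flattened features
cov = tf.matmul(feats, feats, transpose_b=True) / 1024.0  # 512 x 512 covariance
with tf.device('/cpu:0'):  # GPU tf.svd is reportedly 2-4x slower as of r1.4
    s, u, v = tf.svd(cov)
```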
4 changes: 3 additions & 1 deletion model.py
@@ -142,7 +142,9 @@ def build_model(self,
if self.mode != 'train':
    with tf.name_scope('wct_'+relu_target):
        if relu_target == 'relu5_1':
            # Apply style swap on relu5_1 encodings if self.swap5 flag is set. Otherwise perform WCT.
            # Apply style swap on relu5_1 encodings if self.swap5 flag is set
            # Use AdaIN as transfer op instead of WCT if self.use_adain is set
            # Otherwise perform WCT
            decoder_input = tf.case([(self.swap5, lambda: wct_style_swap(content_encoded,
                                                                         style_encoded_tensor,
                                                                         self.ss_alpha,
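
To make the branching priority in the comments above concrete, here is a standalone toy example (not the repo's model.py code) of `tf.case` with a list of (predicate, fn) pairs and a default: the first true predicate wins, so style swap takes precedence over AdaIN, and WCT is the fallback.

```python
import tensorflow as tf  # TF 1.x API

swap5 = tf.placeholder(tf.bool, shape=[], name='swap5')
use_adain = tf.placeholder(tf.bool, shape=[], name='use_adain')

# Mirrors the priority sketched above: style swap, then AdaIN, else WCT
transform = tf.case(
    [(swap5, lambda: tf.constant('style swap')),
     (use_adain, lambda: tf.constant('adain'))],
    default=lambda: tf.constant('wct'))

with tf.Session() as sess:
    print(sess.run(transform, {swap5: False, use_adain: True}))  # b'adain'
    print(sess.run(transform, {swap5: True, use_adain: True}))   # b'style swap'
```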
