
activation_fn in refiner and discriminator is default None. #30

Open
shimacos37 opened this issue Oct 29, 2018 · 1 comment

Comments

@shimacos37

In layers.py

def conv2d(inputs, num_outputs, kernel_size, stride,
           layer_dict={}, activation_fn=None,
           #weights_initializer=tf.random_normal_initializer(0, 0.001),
           weights_initializer=tf.contrib.layers.xavier_initializer(),
           scope=None, name="", **kargv):
  outputs = slim.conv2d(
      inputs, num_outputs, kernel_size,
      stride, activation_fn=activation_fn, 
      weights_initializer=weights_initializer,
      biases_initializer=tf.zeros_initializer(dtype=tf.float32), scope=scope, **kargv)
  if name:
    scope = "{}/{}".format(name, scope)
  _update_dict(layer_dict, scope, outputs)
  return outputs

and in model.py

  def _build_refiner(self, layer):
    with tf.variable_scope("refiner") as sc:
      layer = conv2d(layer, 64, 3, 1, scope="conv_1")
      layer = repeat(layer, 4, resnet_block, scope="resnet")
      layer = conv2d(layer, 1, 1, 1, 
                     activation_fn=None, scope="conv_2")
      output = tanh(layer, name="tanh")
      self.refiner_vars = tf.contrib.framework.get_variables(sc)
    return output 

  def _build_discrim(self, layer, name, reuse=False):
    with tf.variable_scope("discriminator", reuse=reuse) as sc:
      layer = conv2d(layer, 96, 3, 2, scope="conv_1", name=name)
      layer = conv2d(layer, 64, 3, 2, scope="conv_2", name=name)
      layer = max_pool2d(layer, 3, 1, scope="max_1", name=name)
      layer = conv2d(layer, 32, 3, 1, scope="conv_3", name=name)
      layer = conv2d(layer, 32, 1, 1, scope="conv_4", name=name)
      logits = conv2d(layer, 2, 1, 1, scope="conv_5", name=name)
      output = tf.nn.softmax(logits, name="softmax")
      self.discrim_vars = tf.contrib.framework.get_variables(sc)
    return output, logits

The activation is None in most of the convolution layers.
Is this intentional? Without any nonlinearity the stacked convolutions collapse into a single linear mapping, and I don't think gradients propagate properly.
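As a minimal sketch of what I mean (tf.nn.relu is just my assumption; neither the paper nor this code specifies which nonlinearity to use), the conv2d wrapper already accepts activation_fn, so only the call sites would need to change, e.g. in the discriminator:

  def _build_discrim(self, layer, name, reuse=False):
    with tf.variable_scope("discriminator", reuse=reuse) as sc:
      # Hypothetical: add ReLU to the hidden conv layers only.
      layer = conv2d(layer, 96, 3, 2, activation_fn=tf.nn.relu, scope="conv_1", name=name)
      layer = conv2d(layer, 64, 3, 2, activation_fn=tf.nn.relu, scope="conv_2", name=name)
      layer = max_pool2d(layer, 3, 1, scope="max_1", name=name)
      layer = conv2d(layer, 32, 3, 1, activation_fn=tf.nn.relu, scope="conv_3", name=name)
      layer = conv2d(layer, 32, 1, 1, activation_fn=tf.nn.relu, scope="conv_4", name=name)
      # Keep the logits linear; softmax is applied below.
      logits = conv2d(layer, 2, 1, 1, activation_fn=None, scope="conv_5", name=name)
      output = tf.nn.softmax(logits, name="softmax")
      self.discrim_vars = tf.contrib.framework.get_variables(sc)
    return output, logits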

@NoNamesLeft4Me commented Nov 28, 2018

@shimacos37 Have you tried adding any activation functions? Apple's CVPR paper does not mention activation functions, so I guess that is why activation_fn defaults to None here. However, I cannot reproduce Apple's performance with this code. Most importantly, I find the scale of the losses is way off compared to Apple's experiments: in Apple's ML journal article on this work they show losses below 3, but my experiments have losses around 100, which is similar to what @carpedm20 shows here. Maybe the activation functions are the missing piece.
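For what it's worth, here is a minimal sketch of one thing to try (my assumption, not a confirmed fix): slim.conv2d itself defaults activation_fn to tf.nn.relu, so changing the wrapper's default in layers.py from None to ReLU would add a nonlinearity everywhere that does not pass activation_fn explicitly. The refiner's output conv already passes activation_fn=None, but the discriminator's logits layer (conv_5) would then also get a ReLU, so it would need an explicit activation_fn=None as well.

import tensorflow as tf
slim = tf.contrib.slim

def conv2d(inputs, num_outputs, kernel_size, stride,
           layer_dict={}, activation_fn=tf.nn.relu,  # was None
           weights_initializer=tf.contrib.layers.xavier_initializer(),
           scope=None, name="", **kargv):
  # Same wrapper as in layers.py; only the default activation changes.
  outputs = slim.conv2d(
      inputs, num_outputs, kernel_size, stride,
      activation_fn=activation_fn,
      weights_initializer=weights_initializer,
      biases_initializer=tf.zeros_initializer(dtype=tf.float32),
      scope=scope, **kargv)
  if name:
    scope = "{}/{}".format(name, scope)
  _update_dict(layer_dict, scope, outputs)
  return outputs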
