Why does this reimplementation use bias=False in all conv layers? #1

Closed
dypromise opened this issue Jul 4, 2019 · 3 comments

Comments

@dypromise

Hi, bluestyle97!
Thanks for your nice PyTorch reimplementation! It is much faster than the official version. But I found some differences: 1. the convolutions have no bias; 2. the target label is multiplied by a random coefficient. Could you explain these? I am confused about why you did this. Thank you very much!

@bluestyle97 (Owner)

Hi, thanks for your comment!
For the first question, the bias only shifts the mean of a convolutional layer's output. So the bias can be dropped whenever the convolution (nn.Conv2d) is immediately followed by a batch-normalization layer: batch norm subtracts the mean, which removes any constant shift the bias would have introduced. You can refer to the following links for more explanation:
https://discuss.pytorch.org/t/any-purpose-to-set-bias-false-in-densenet-torchvision/22067
kuangliu/pytorch-cifar#52
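A quick way to convince yourself (a minimal sketch, not code from this repo): build the same convolution with and without a bias, pass both outputs through a BatchNorm layer in training mode, and the results are identical, because the per-channel bias is absorbed by the mean subtraction.

```python
import torch
import torch.nn as nn

# Same convolution twice, differing only in the bias term.
conv_b = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
conv_nb = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
conv_nb.weight.data.copy_(conv_b.weight.data)

bn = nn.BatchNorm2d(8)  # train mode: normalizes with batch statistics

x = torch.randn(4, 3, 16, 16)
out_b = bn(conv_b(x))    # conv with bias -> BN
out_nb = bn(conv_nb(x))  # conv without bias -> BN
print(torch.allclose(out_b, out_nb, atol=1e-5))  # True: the bias had no effect
```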

For the second question, STGAN is built directly on AttGAN, and AttGAN uses a mechanism to control the attribute-manipulation intensity by making the target attribute vector lie uniformly in [-1, 1] during training. You can read the AttGAN paper and its implementation for more details:
https://arxiv.org/abs/1711.10678
https://github.com/elvisyjlin/AttGAN-PyTorch
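For reference, a rough sketch of that scaling in PyTorch, modeled on the AttGAN-PyTorch code linked above (the function name and exact details here are illustrative, not this repo's API). With thres_int = 0.5, the scaled targets lie uniformly in [-1, 1]:

```python
import torch

def randomize_target_atts(att_b: torch.Tensor, thres_int: float = 0.5) -> torch.Tensor:
    """Map binary attributes {0, 1} to {-1, +1} and scale them by a random
    coefficient so the training targets cover a continuous intensity range."""
    signed = att_b * 2 - 1                  # {0, 1} -> {-1, +1}
    rand = torch.rand_like(att_b)           # uniform in [0, 1)
    return signed * rand * (2 * thres_int)  # uniform in (-2*thres_int, 2*thres_int)

att_b = torch.tensor([[1., 0., 1.], [0., 1., 1.]])  # batch of 2, 3 attributes
print(randomize_target_atts(att_b))
```

At test time the magnitude of each entry can be set directly, which is what makes the manipulation intensity controllable.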

@dypromise (Author)

Wow!! Thank you very much, it helps me A LOT!!!

dypromise reopened this Jul 5, 2019
@dypromise (Author)

Hi, I noticed another difference: your version doesn't use inject layers and uses only 3 STU layers. I modified it to use inject layers in the decoder together with 4 shortcut layers and found that it is difficult to converge. Did you try this? If so, could you give me some suggestions on training? My parameters are as follows:
```yaml
exp_name: stgan
model_name: stgan
mode: train
cuda: true
ngpu: 4

data:
  dataset: celeba
  data_root: /dockerdata/home/rpf/rpf/xmmtyding/celeba_data/crop384/img_crop_celeba_png/
  att_list_file: /dockerdata/home/rpf/rpf/xmmtyding/celeba_data/crop384/new_list_attr_celeba_addhair.txt
  crop_size: 384
  image_size: 384

model:
  g_conv_dim: 48
  d_conv_dim: 48
  d_fc_dim: 512
  g_layers: 5
  d_layers: 5
  shortcut_layers: 4
  stu_kernel_size: 3
  use_stu: true
  one_more_conv: true
  attrs: [Bangs, Black_Hair, Blond_Hair, Brown_Hair, Bushy_Eyebrows, Eyeglasses, Male, Mouth_Slightly_Open, Mustache, No_Beard, Pale_Skin, Young, HairLength]
  checkpoint: ~

training:
  batch_size: 64
  beta1: 0.5
  beta2: 0.5
  g_lr: 0.0008
  d_lr: 0.0008
  n_critic: 5
  thres_int: 0.5
  lambda_gp: 10
  lambda1: 1
  lambda2: 10
  lambda3: 100
  max_iters: 1000000
  lr_decay_iters: 800000

steps:
  summary_step: 10
  sample_step: 2500
  checkpoint_step: 2500
```
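(In case it helps anyone reproducing this: the dump above is a YAML config, so a minimal sketch for reading it back, assuming PyYAML and a hypothetical file name stgan.yaml, would be:)

```python
import yaml  # PyYAML

# Load the training configuration shown above; the file name is hypothetical.
with open('stgan.yaml') as f:
    cfg = yaml.safe_load(f)

print(cfg['training']['thres_int'])     # 0.5
print(cfg['model']['shortcut_layers'])  # 4
```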
