Why does this reimplementation use bias=False in all conv layers? #1

Closed
dypromise opened this issue Jul 4, 2019 · 3 comments

Comments

@dypromise

Hi, bluestyle97!
Thanks for your nice PyTorch reimplementation! It is much faster than the official version. But I found some differences: 1. the convolutions have no bias; 2. the target label is multiplied by a random coefficient. Could you explain these? I am confused about why you did this. Thank you very much!

@bluestyle97 (Owner)

Hi, thanks for your comment!
For the first question, the bias only shifts the mean of a convolutional layer's output. So the bias can be dropped whenever the convolution (nn.Conv2d) is immediately followed by a batch-normalization layer: batch norm subtracts the mean, which removes any constant shift the bias would have introduced. You can refer to the following links for more explanation:
https://discuss.pytorch.org/t/any-purpose-to-set-bias-false-in-densenet-torchvision/22067
kuangliu/pytorch-cifar#52
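A quick way to convince yourself (a minimal sketch, not code from this repo): build the same convolution with and without a bias, pass both outputs through a BatchNorm layer in training mode, and the results are identical, because the per-channel bias is absorbed by the mean subtraction.

```python
import torch
import torch.nn as nn

# Same convolution twice, differing only in the bias term.
conv_b = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
conv_nb = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
conv_nb.weight.data.copy_(conv_b.weight.data)

bn = nn.BatchNorm2d(8)  # train mode: normalizes with batch statistics

x = torch.randn(4, 3, 16, 16)
out_b = bn(conv_b(x))    # conv with bias -> BN
out_nb = bn(conv_nb(x))  # conv without bias -> BN
print(torch.allclose(out_b, out_nb, atol=1e-5))  # True: the bias had no effect
```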

For the second question, STGAN is built directly on AttGAN, and AttGAN uses a mechanism to control the attribute-manipulation intensity by making the target attribute vector lie uniformly in [-1, 1] during training. You can read the AttGAN paper and its implementation for more details:
https://arxiv.org/abs/1711.10678
https://github.com/elvisyjlin/AttGAN-PyTorch
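For reference, a rough sketch of that scaling in PyTorch, modeled on the AttGAN-PyTorch code linked above (the function name and exact details here are illustrative, not this repo's API). With thres_int = 0.5, the scaled targets lie uniformly in [-1, 1]:

```python
import torch

def randomize_target_atts(att_b: torch.Tensor, thres_int: float = 0.5) -> torch.Tensor:
    """Map binary attributes {0, 1} to {-1, +1} and scale them by a random
    coefficient so the training targets cover a continuous intensity range."""
    signed = att_b * 2 - 1                  # {0, 1} -> {-1, +1}
    rand = torch.rand_like(att_b)           # uniform in [0, 1)
    return signed * rand * (2 * thres_int)  # uniform in (-2*thres_int, 2*thres_int)

att_b = torch.tensor([[1., 0., 1.], [0., 1., 1.]])  # batch of 2, 3 attributes
print(randomize_target_atts(att_b))
```

At test time the magnitude of each entry can be set directly, which is what makes the manipulation intensity controllable.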

@dypromise (Author)

Wow!! Thank you very much, it helps me A LOT!!!

dypromise reopened this Jul 5, 2019
@dypromise (Author)

Hi, I noticed another difference: your version doesn't use inject layers and uses only 3 STU layers. I modified it to use inject layers in the decoder together with 4 shortcut layers and found that it is difficult to converge. Did you try this? If so, could you give me some suggestions on training? My parameters are as follows:
```yaml
exp_name: stgan
model_name: stgan
mode: train
cuda: true
ngpu: 4

data:
  dataset: celeba
  data_root: /dockerdata/home/rpf/rpf/xmmtyding/celeba_data/crop384/img_crop_celeba_png/
  att_list_file: /dockerdata/home/rpf/rpf/xmmtyding/celeba_data/crop384/new_list_attr_celeba_addhair.txt
  crop_size: 384
  image_size: 384

model:
  g_conv_dim: 48
  d_conv_dim: 48
  d_fc_dim: 512
  g_layers: 5
  d_layers: 5
  shortcut_layers: 4
  stu_kernel_size: 3
  use_stu: true
  one_more_conv: true
  attrs: [Bangs, Black_Hair, Blond_Hair, Brown_Hair, Bushy_Eyebrows, Eyeglasses, Male, Mouth_Slightly_Open, Mustache, No_Beard, Pale_Skin, Young, HairLength]
  checkpoint: ~

training:
  batch_size: 64
  beta1: 0.5
  beta2: 0.5
  g_lr: 0.0008
  d_lr: 0.0008
  n_critic: 5
  thres_int: 0.5
  lambda_gp: 10
  lambda1: 1
  lambda2: 10
  lambda3: 100
  max_iters: 1000000
  lr_decay_iters: 800000

steps:
  summary_step: 10
  sample_step: 2500
  checkpoint_step: 2500
```
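(In case it helps anyone reproducing this: the dump above is a YAML config, so a minimal sketch for reading it back, assuming PyYAML and a hypothetical file name stgan.yaml, would be:)

```python
import yaml  # PyYAML

# Load the training configuration shown above; the file name is hypothetical.
with open('stgan.yaml') as f:
    cfg = yaml.safe_load(f)

print(cfg['training']['thres_int'])     # 0.5
print(cfg['model']['shortcut_layers'])  # 4
```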
