Weight (lambda) value for image level adaptation and hyperparameter request #17

Open
sehyun03 opened this issue Feb 2, 2019 · 1 comment

Comments

sehyun03 commented Feb 2, 2019

Hi, I found that the weight for the image-level adaptation loss in "train.prototxt" is set to 1.0, which is not consistent with your paper (where all lambda values are set to 0.1).

layer {
  name: "da_conv_loss"
  type: "SoftmaxWithLoss"
  bottom: "da_score_ss"
  bottom: "da_label_ss_resize"
  top: "da_conv_loss"
  loss_param {
    ignore_label: 255
    normalize: 1
  }
  propagate_down: 1
  propagate_down: 0
  loss_weight: 1
}
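
For what it's worth, this is what I would have expected the layer to look like if the paper's lambda were applied here (my own guess, not taken from the released files):

layer {
  name: "da_conv_loss"
  type: "SoftmaxWithLoss"
  bottom: "da_score_ss"
  bottom: "da_label_ss_resize"
  top: "da_conv_loss"
  loss_param {
    ignore_label: 255
    normalize: 1
  }
  propagate_down: 1
  propagate_down: 0
  loss_weight: 0.1  # lambda = 0.1 as reported in the paper
}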

Also "lr_mult" for instance level domain classifier have 10 times more value than other conv of fc.

layer {
  name: "dc_ip3"
  type: "InnerProduct"
  bottom: "dc_ip2"
  top: "dc_ip3"
  param {
    lr_mult: 10
  }
  param {
    lr_mult: 20
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      # std: 0.3
      std: 0.05
    }
    bias_filler {
      type: "constant"
    }
  }
}

Could you provide the exact "loss_weight", "lr_mult", and "gradient_scaler_param" hyperparameters you used in your paper?
It would be appreciated to get the hyperparameters for each setting (image-level DA, image + instance-level DA, image + instance-level DA + consistency loss) and each dataset (Sim10k -> Cityscapes, Cityscapes -> Foggy Cityscapes, KITTI <-> Cityscapes). Thank you.

JeromeMutgeert commented Feb 19, 2019

Hi,

I am trying to get familiar with the code too, and I came to similar questions. I think you can find your lambda in the GradientScaler layers that implement the GRLs. They scale the gradient by a factor of -0.1, which effectively results in the right gradient, at least for the Faster R-CNN part of the network. The DA part has a loss of (L_img + L_ins + L_cst), without the minus sign and without the factor lambda. I think this is desirable for training the adversarial (DA) part.
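
To spell out my reading, the gradient that reaches the shared Faster R-CNN features is roughly

grad(theta_shared) ∝ grad(L_det) - 0.1 * grad(L_img + L_ins)

so the GRL supplies both the adversarial minus sign and the paper's lambda = 0.1, while the domain classifiers themselves are trained on the unscaled losses.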

However, what I just said is not consistent with the rest of the code, because L_ins has both a gradient scaling factor of -0.1 in the GRL and a loss weight of 0.1 at the dc_loss output. I think the latter factor of 0.1 is cancelled out by the learning-rate multipliers of the layers in between. But in any case, when its gradient is mixed with the Faster R-CNN loss, it seems to be weighed in with a factor of only 0.01.
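
To make the arithmetic explicit: the lr_mult of 10/20 roughly cancels the 0.1 loss weight for the updates of the domain-classifier parameters themselves (10 × 0.1 = 1), but the gradient that crosses the GRL back into the shared features picks up both factors: 0.1 (loss_weight) × 0.1 (GRL scale) = 0.01, with the sign reversed.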

As for L_cst, I have not found any code for it in this repository. I think you will need to use the Caffe2 implementation for that; see #4.
