Weight (lambda) value for image level adaptation and hyperparameter request #17

Open
sehyun03 opened this issue Feb 2, 2019 · 1 comment

Comments

sehyun03 commented Feb 2, 2019

Hi, I found that the weight for the image-level adaptation loss in "train.prototxt" is set to 1.0, which is not consistent with your paper (where all lambda values are set to 0.1).

layer {
  name: "da_conv_loss"
  type: "SoftmaxWithLoss"
  bottom: "da_score_ss"
  bottom: "da_label_ss_resize"
  top: "da_conv_loss"
  loss_param {
    ignore_label: 255
    normalize: 1
  }
  propagate_down: 1
  propagate_down: 0
  loss_weight: 1
}
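
For what it's worth, this is what I would have expected the layer to look like if the paper's lambda were applied here (my own guess, not taken from the released files):

layer {
  name: "da_conv_loss"
  type: "SoftmaxWithLoss"
  bottom: "da_score_ss"
  bottom: "da_label_ss_resize"
  top: "da_conv_loss"
  loss_param {
    ignore_label: 255
    normalize: 1
  }
  propagate_down: 1
  propagate_down: 0
  loss_weight: 0.1  # lambda = 0.1 as reported in the paper
}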

Also "lr_mult" for instance level domain classifier have 10 times more value than other conv of fc.

layer {
  name: "dc_ip3"
  type: "InnerProduct"
  bottom: "dc_ip2"
  top: "dc_ip3"
  param {
    lr_mult: 10
  }
  param {
    lr_mult: 20
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      # std: 0.3
      std: 0.05
    }
    bias_filler {
      type: "constant"
    }
  }
}

Could you provide the exact "loss_weight", "lr_mult", and "gradient_scaler_param" hyperparameters you used in your paper?
It would be appreciated to get the hyperparameters for each setting (image-level DA, image + instance-level DA, image + instance-level DA + consistency loss) and each dataset (Sim10k -> Cityscapes, Cityscapes -> Foggy Cityscapes, KITTI <-> Cityscapes). Thank you.

JeromeMutgeert commented Feb 19, 2019

Hi,

I am trying to get familiar with the code too, and I came to similar questions. I think you can find your lambda in the GradientScaler layers that implement the GRLs. They scale the gradient by a factor of -0.1, which effectively results in the right gradient, at least for the Faster R-CNN part of the network. The DA part has a loss of (L_img + L_ins + L_cst), without the minus sign and without the factor lambda. I think this is desirable for training the adversarial (DA) part.
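
To spell out my reading, the gradient that reaches the shared Faster R-CNN features is roughly

grad(theta_shared) ∝ grad(L_det) - 0.1 * grad(L_img + L_ins)

so the GRL supplies both the adversarial minus sign and the paper's lambda = 0.1, while the domain classifiers themselves are trained on the unscaled losses.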

However, what I just said is not consistent with the rest of the code, because L_ins has both a gradient scaling factor of -0.1 in the GRL and a loss weight of 0.1 at the dc_loss output. I think the latter factor of 0.1 is cancelled out by the learning-rate multipliers of the layers in between. But in any case, when its gradient is mixed with the Faster R-CNN loss, it seems to be weighed in with a factor of only 0.01.
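
To make the arithmetic explicit: the lr_mult of 10/20 roughly cancels the 0.1 loss weight for the updates of the domain-classifier parameters themselves (10 × 0.1 = 1), but the gradient that crosses the GRL back into the shared features picks up both factors: 0.1 (loss_weight) × 0.1 (GRL scale) = 0.01, with the sign reversed.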

As for L_cst, I have not found any code for it in this repository. I think you will need to use the Caffe2 implementation for that; see #4.
