Loss always stays around 9.3 #7

Closed
taoyunuo opened this issue Jul 27, 2017 · 8 comments
@taoyunuo commented Jul 27, 2017

The loss always stays around 9.3 and does not go down.
I set the learning rate to 0.01 and 0.06, but the loss did not converge.
How does the training network need to be modified?
How should the training parameters be set?
Hope to get your help. Thanks!
@wdwen

@zhly0 commented Jul 28, 2017

I have the same issue, and I think many people do as well. Maybe it is due to the parameter settings.

@recordcode

You guys didn't train on MS-Celeb and used another dataset, right?

@zhly0 commented Jul 29, 2017

@recordcode You mean it is related to the dataset?

@taoyunuo (Author)

I set the learning rate to 0.06 and the loss started to converge. I use RGB face images as input; I don't use an LMDB as input.
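For reference, a minimal Caffe solver sketch along these lines (only base_lr reflects what I actually used; the net file name, step schedule, and iteration counts are illustrative assumptions, not the repo's released settings):

# solver.prototxt (sketch) -- everything except base_lr is an illustrative guess
net: "sphereface_train_test.prototxt"   # hypothetical net definition file
base_lr: 0.06          # base learning rate of 0.06, which converged for me
lr_policy: "multistep"
gamma: 0.1
stepvalue: 16000
stepvalue: 24000
max_iter: 28000
momentum: 0.9
weight_decay: 0.0005
snapshot: 2000
snapshot_prefix: "sphereface_model"
solver_mode: GPU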

@wy1iu (Owner) commented Aug 9, 2017

We updated the repo to fix some bugs. The pipeline should now run without modification. The SphereFace-20 model (described in the paper) is also released.

@happynear

The loss computed after the margin_inner_product layer is always bigger than that of a normal inner_product layer. A small trick used in large-margin softmax is to switch to a traditional inner_product layer at testing time. Sample prototxt:

############### A-Softmax Loss ##############
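# TRAIN phase: A-Softmax margin layer (type QUADRUPLE, i.e. m = 4); lambda is annealed from base toward lambda_min as training proceeds.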
layer {
  name: "fc6"
  type: "MarginInnerProduct"
  bottom: "fc5"
  bottom: "label"
  top: "fc6"
  top: "lambda"
  param {
    name: "fc6"
    lr_mult: 1
    decay_mult: 1
  }
  margin_inner_product_param {
    num_output: 10572
    type: QUADRUPLE
    weight_filler {
      type: "xavier"
    }
    base: 1000
    gamma: 0.12
    power: 1
    lambda_min: 5
    iteration: 0
  }
  include {
    phase: TRAIN
  }
}
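# TEST phase: same weights (shared via the param name fc6), but type SINGLE (m = 1) with lambda fixed at 0, which is effectively a plain inner product on the normalized weights (no margin), so the test loss and accuracy are comparable to a standard softmax classifier.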
layer {
  name: "fc6"
  type: "MarginInnerProduct"
  bottom: "fc5"
  bottom: "label"
  top: "fc6"
  top: "lambda"
  param {
    name: "fc6"
    lr_mult: 0
  }
  margin_inner_product_param {
    num_output: 10572
    type: SINGLE
    base: 0
    gamma: 1
    iteration: 0
    lambda_min: 0
    weight_filler {
      type: "msra"
    }
  }
  include {
    phase: TEST
  }
}
layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "fc6"
  bottom: "label"
  top: "softmax_loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc6"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}

Then you can observe the accuracy and loss in the testing phase to check whether the network is training normally.
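A side note on the TRAIN-phase parameters above, in case it helps with the original question: if I read the margin_inner_product layer correctly, lambda is annealed as lambda = max(lambda_min, base * (1 + gamma * iter)^(-power)), so with base: 1000, gamma: 0.12, power: 1 and lambda_min: 5 the layer starts out behaving almost like plain softmax (lambda ≈ 1000) and only gradually phases in the angular margin as lambda decays toward 5. Also note that a loss stuck around 9.3 is roughly ln(10572) ≈ 9.27, i.e. the loss of a softmax over the 10572 classes in the prototxt above that is still guessing uniformly, which usually just means training never got started (for example, the learning rate is too high or lambda decays too fast).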

@inlmouse commented Aug 9, 2017

@happynear You are right. This trick was also mentioned in the large-margin softmax (L-Softmax) work.

@wy1iu closed this as completed Aug 10, 2017
@wy1iu (Owner) commented Aug 10, 2017

If there is still something wrong with your loss, you can reopen this issue or open a new one for further discussion.
