Implementation questions #4

bkj · 2017-07-22T01:48:13Z

Are you able to explain a couple of bits about the implementation of the margin_inner_product_layer?

What is lambda_? It looks like it's a constant that decreases w/ the iterations, but doesn't seem to be mentioned in the paper. ** Edit: Looks like you mix x'w and margin(x'w) via (margin(x'w) + lambda_ * x'w) / (1 + lambda_) where lambda_ decreases exponentially w/ iterations. Is that right? **
What is the type parameter? (e.g. SINGLE, DOUBLE, TRIPLE, QUADRUPLE) I'm guessing this is how you set the value of m from the paper? ** Edit: I gather these are ways of implementing the margin for m={1,2,3,4}. Any particular reason why you implemented it this way? Numerical stability? **

Thanks


melgor · 2017-07-24T13:18:12Z

I have gone through the code, so I can answer your questions:

This idea is not mentioned in SphereFace (unfortunately), but it is explained in Large-Margin Softmax. The idea is to start with pure Softmax and, at every iteration, increase the weight of A-Softmax and lower the weight of Softmax, until the Softmax weight reaches 0 and only pure A-Softmax remains.
Your guess is right. I think the implementation looks like that for speed. A much cleaner version of Large-Margin Softmax (the older version of A-Softmax; the main difference is that A-Softmax normalizes the weights) is here. There the functions just take 'margin' as an argument and calculate the values depending on it, with no hard-coded values etc. It is also much cleaner because it looks like pure numpy code. But it is also much slower (even 10x).
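
The two points above can be sketched in a few lines of plain Python. This is only an illustration, not the layer's actual code: the function names are hypothetical, the weight vector is assumed to be normalized (so x'w = ||x|| cos(theta)), and the schedule that produces lambda_ is left out, since its exact form lives in the layer itself. The margin function is the one from the Large-Margin Softmax and SphereFace papers, psi(theta) = (-1)^k cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m].

```python
import math


def cos_m_theta(cos_t, m):
    """cos(m*theta) written as a polynomial in cos(theta), for m = 1..4.

    These closed forms are the kind of hard-coded expressions that a
    SINGLE/DOUBLE/TRIPLE/QUADRUPLE switch can select between.
    """
    if m == 1:
        return cos_t
    if m == 2:
        return 2 * cos_t ** 2 - 1
    if m == 3:
        return 4 * cos_t ** 3 - 3 * cos_t
    if m == 4:
        return 8 * cos_t ** 4 - 8 * cos_t ** 2 + 1
    raise ValueError("only m in {1, 2, 3, 4} is covered by this sketch")


def margin_cos(cos_t, m):
    """psi(theta) = (-1)^k * cos(m*theta) - 2k, theta in [k*pi/m, (k+1)*pi/m]."""
    cos_t = max(-1.0, min(1.0, cos_t))        # guard acos against rounding
    theta = math.acos(cos_t)
    k = min(int(theta * m / math.pi), m - 1)  # index of the interval theta is in
    return (-1) ** k * cos_m_theta(cos_t, m) - 2 * k


def blended_logit(x_norm, cos_t, m, lam):
    """Target-class logit: (margin(x'w) + lam * x'w) / (1 + lam).

    A large lam gives (almost) the plain softmax logit; lam = 0 gives the
    pure margin logit.
    """
    plain = x_norm * cos_t
    margin = x_norm * margin_cos(cos_t, m)
    return (margin + lam * plain) / (1.0 + lam)
```

For example, with ||x|| = 2, cos(theta) = 0.5 and m = 4, `blended_logit` moves from roughly the plain logit 1.0 (very large lam) down to the margin logit -3.0 (lam = 0) as lam shrinks.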

wy1iu · 2017-07-31T00:59:18Z

Thanks @melgor for answering the questions. :)

wy1iu closed this as completed Jul 31, 2017
