A re-implementation of A-Softmax in TensorFlow



This is a quick re-implementation of the A-Softmax loss proposed in the paper SphereFace: Deep Hypersphere Embedding for Face Recognition. Please cite the paper if this code helps your work.


  1. I used TensorFlow 1.4.
  2. I followed the paper authors' Caffe implementation, sphereface.
  3. l is \lambda in the paper; it balances the modified logits against the original logits.
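
As a rough illustration of how l mixes the two kinds of logits, here is a minimal NumPy sketch. It is not the code in Loss_ASoftmax.py; NumPy stands in for TensorFlow to keep it short. It assumes the formulation from the paper: the target-class logit becomes (l * ||x|| cos(theta) + ||x|| psi(theta)) / (1 + l), with psi(theta) = (-1)^k cos(m * theta) - 2k for theta in [k * pi / m, (k + 1) * pi / m]. The names psi and asoftmax_logits, the argument shapes, and the assumption of non-zero feature vectors are all hypothetical.

```python
import numpy as np


def psi(theta, m):
    """Monotonic angle function: (-1)^k * cos(m*theta) - 2k on [k*pi/m, (k+1)*pi/m]."""
    # k is the index of the interval that theta falls into, limited to 0..m-1.
    k = np.clip(np.floor(theta * m / np.pi), 0, m - 1)
    sign = np.where(k % 2 == 0, 1.0, -1.0)
    return sign * np.cos(m * theta) - 2.0 * k


def asoftmax_logits(x, w, labels, m=4, l=1.0):
    """Return logits whose target-class entry blends original and margin logits.

    x: (batch, dim) features, assumed non-zero; w: (dim, classes) weights;
    labels: (batch,) integer class ids. Shapes are assumptions of this sketch.
    """
    # Normalise each class weight vector so that logit_j = ||x|| * cos(theta_j).
    w_unit = w / np.linalg.norm(w, axis=0, keepdims=True)
    logits = x @ w_unit
    x_norm = np.linalg.norm(x, axis=1)

    rows = np.arange(x.shape[0])
    target = logits[rows, labels]
    # Clip guards arccos against rounding slightly outside [-1, 1].
    theta = np.arccos(np.clip(target / x_norm, -1.0, 1.0))
    modified = x_norm * psi(theta, m)

    out = logits.copy()
    # l weights the original logit; larger l means a weaker margin effect.
    out[rows, labels] = (l * target + modified) / (1.0 + l)
    return out
```

With m = 1 the function returns the ordinary normalised-weight logits, and as l grows the output approaches them for any m. The result would then be passed to a standard softmax cross-entropy loss.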

Visualization of MNIST results

Set l = 1

  • original softmax, 97.6758%

  • m = 1, 98.0469%

  • m = 2, 98.3887%

  • m = 4, 98.6523%

On Face Recognition

My observation is that the hyper-parameters used in the Caffe implementation do not work well in TensorFlow. A-Softmax generally improves accuracy by about 2% on LFW when trained on CASIA. The best accuracy I got is about 98.X%. Tuning the hyper-parameters to match the accuracy of the Caffe implementation seems to be quite tricky.