
About DeepID2 #14

Open
denghuiru opened this issue Dec 15, 2015 · 36 comments

Comments

@denghuiru

Hi,
I have been trying to implement the DeepID model, going from DeepID1 to DeepID2. I have successfully reproduced your DeepID1 model, but my DeepID2 model is always incorrect: it does not converge and its results are wrong. So I wonder, are you still researching the DeepID2 model, and have you made any new progress?

@happynear
Owner

Recently I have been working on face alignment. A friend of mine said he successfully trained a DeepID2 model by setting the loss_weight of the contrastive loss to a very low value, say 1e-5. You may try that.

@denghuiru
Author

@happynear
Thank you very much! I will try it again!!

@denghuiru
Author

@happynear
By the way, the margin in the contrastive loss has a great effect on the loss values. How should this value be set? Also, could your friend share his net structure with me?
Thanks very much!

@happynear
Owner

As described in the DeepID2 paper, using same-label pairs only can already get accuracy very close to the best performance. In that case, we can generate pairs of images with the same label and set the margin to 0.
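A minimal numpy sketch of what that setup means (the feature vectors here are invented for illustration): with only same-label pairs and margin 0, the loss reduces to a pure attraction term, and any dissimilar pair would contribute nothing.

```python
import numpy as np

def contrastive_loss(f1, f2, similar, margin=0.0):
    """Contrastive loss for one pair of feature vectors.

    similar=1 pulls the pair together (0.5 * d^2); similar=0 pushes
    them apart up to the margin (0.5 * max(margin - d, 0)^2).
    """
    d = np.linalg.norm(f1 - f2)
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(margin - d, 0.0) ** 2

f1 = np.array([1.0, 2.0, 3.0])
f2 = np.array([1.0, 2.0, 5.0])

# Same-label pair: d = 2, so the loss is 0.5 * 2^2 = 2.0.
print(contrastive_loss(f1, f2, similar=1))            # 2.0
# With margin 0, a dissimilar pair contributes nothing at all.
print(contrastive_loss(f1, f2, similar=0, margin=0))  # 0.0
```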

@denghuiru
Author

@happynear
I have another question, does your friend construct the DeepID2 model just the same as you provided, if not, can you please tell me the difference or key points.Besides, I also use webface to train the model, but I modify the data_layer.cpp to generate the pair images lmdb datasets(image1 image2 label1 label2) and then use slice layer to slice the pair image to image1 and image2 and run two convnets, almost same as your model, does this manner right or not? if not, how do you generate the pair images?

@happynear
Owner

  • His net is the same as mine, and is described in the CASIA-WebFace paper.
  • I use a MATLAB script to generate the list of image pairs. The scripts are in the ./dataset folder of this repo.

@denghuiru
Author

@happynear
Thank you very much! I will do this process again!

@denghuiru
Author

@happynear
Hi, the WebFace dataset is very unbalanced: it has 10,575 persons, but the number of images per person is small. Do you have any strategies to balance the generated image pairs, for example so that the probability of a pair being the same person or different persons is roughly equal?
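One way to state that balancing idea as a sketch (the `images_by_id` structure here is a hypothetical stand-in for the WebFace file lists, not anything from the repo):

```python
import random

def sample_pair(images_by_id, p_same=0.5, rng=random):
    """Sample one training pair; same-identity with probability p_same.

    images_by_id maps identity -> list of image paths. Only identities
    with at least 2 images can yield a same-identity pair.
    """
    if rng.random() < p_same:
        ident = rng.choice(
            [i for i, imgs in images_by_id.items() if len(imgs) >= 2])
        img1, img2 = rng.sample(images_by_id[ident], 2)
        return img1, img2, ident, ident
    id1, id2 = rng.sample(list(images_by_id), 2)
    return rng.choice(images_by_id[id1]), rng.choice(images_by_id[id2]), id1, id2

toy = {"A": ["a1.jpg", "a2.jpg"], "B": ["b1.jpg"], "C": ["c1.jpg", "c2.jpg"]}
img1, img2, l1, l2 = sample_pair(toy, p_same=1.0)
print(l1 == l2)  # True: p_same=1.0 forces a same-identity pair
```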

@happynear
Owner

Face++ has done an experiment on this. The conclusion is to delete all identities with fewer than 10 images.

Paper: http://arxiv.org/pdf/1501.04690.pdf .
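That filtering step is straightforward; a sketch (the toy dataset below is invented):

```python
def filter_small_identities(images_by_id, min_images=10):
    """Drop identities with fewer than min_images images,
    as suggested by the Face++ paper's experiment."""
    return {ident: imgs for ident, imgs in images_by_id.items()
            if len(imgs) >= min_images}

toy = {"A": ["a%d.jpg" % k for k in range(12)],  # 12 images: kept
       "B": ["b1.jpg", "b2.jpg"]}                # 2 images: dropped
print(sorted(filter_small_identities(toy)))      # ['A']
```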

@denghuiru
Author

@happynear
Hi, I found a problem when generating image pairs with your MATLAB scripts. I want to generate only same-label pairs, so I set same_r=1, but then the corresponding train.txt is not generated. If I set same_r=0.5, it works. How can I solve this? My dataset is WebFace.

@happynear
Owner

This is really messy code. Just try a few more times...

@happynear
Owner

In fact, for same-pair generation, I suggest you use a full permutation, i.e. generate all possible pairs. Since there are not many images per identity, the final number will not be too large.

Moreover, you can first extract features with a pre-trained network, mine only the hard pairs, and then fine-tune the pre-trained network with DeepID2.
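The full-permutation suggestion can be sketched with itertools (the toy identities below are invented):

```python
from itertools import combinations

def all_same_pairs(images_by_id):
    """Enumerate every unordered same-identity pair (full permutation).

    For an identity with n images this yields n*(n-1)/2 pairs, which
    stays manageable when each identity has few images.
    """
    pairs = []
    for ident, imgs in images_by_id.items():
        for a, b in combinations(imgs, 2):
            pairs.append((a, b, ident))
    return pairs

toy = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2"]}
pairs = all_same_pairs(toy)
print(len(pairs))  # C(3,2) + C(2,2) = 3 + 1 = 4
```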

@denghuiru
Author

@happynear
I am not very clear about fine-tuning the pre-trained model with DeepID2. Do you mean that I can use any pre-trained model as the initialization of DeepID2?

@happynear
Owner

Sure, the DeepID1 and DeepID2 models can share the same trained parameters.

@denghuiru
Author

@happynear
I found a problem: I can generate two train.txt files with all same-label pairs and then use Caffe to convert them to LMDB datasets. However, if the training process picks two images at random rather than in order, the two images will almost always belong to different classes. How do you deal with this? Thank you!

@dfdf

dfdf commented Dec 17, 2015

About DeepID2: in your implementation you use

layer {
  name: "simloss"
  type: "ContrastiveLoss"
  loss_weight: 0.00001
  contrastive_loss_param {
    margin: 0
  }
  bottom: "pool5"
  bottom: "pool5_p"
  bottom: "label1"
  bottom: "label2"
  top: "simloss"
}

You use 4 blobs as bottom, but according to the Caffe docs the contrastive loss only takes 3 (the third one is the similarity; its value should be 0 if the pair differs and 1 if it is similar). So, as I read it, your similarity parameter is based only on label1, which I think is totally wrong. Is this correct? Or did you change something else in the code?

Thank you, and by the way good work!

@denghuiru
Author

@dfdf
The stock contrastive loss can indeed accept only 3 blobs as input. However, DeepID2 uses identification and verification information simultaneously to train the net, so it needs each image's label; we cannot just feed the 0/1 similarity of each pair to the contrastive loss. In order to use the pair labels, the contrastive loss layer must be modified to accept four blobs as input. For this modification, refer to the author's Windows Caffe.
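Conceptually, the four-bottom modification amounts to computing the 0/1 similarity inside the layer from the two label blobs. A numpy sketch of such a forward pass (not the actual Caffe code; all values below are invented):

```python
import numpy as np

def contrastive_loss_from_labels(feat1, feat2, label1, label2, margin=0.0):
    """Batch contrastive loss where similarity is derived in-layer.

    Instead of a precomputed 0/1 similarity blob, the layer compares
    the two label blobs: a pair is 'similar' iff label1 == label2.
    """
    sim = (label1 == label2).astype(np.float64)  # 1 if same identity
    d = np.linalg.norm(feat1 - feat2, axis=1)    # per-pair distance
    loss = sim * d ** 2 + (1 - sim) * np.maximum(margin - d, 0) ** 2
    return 0.5 * loss.mean()

feat1 = np.array([[0.0, 0.0], [1.0, 0.0]])
feat2 = np.array([[3.0, 4.0], [1.0, 0.0]])
label1 = np.array([7, 9])
label2 = np.array([7, 3])  # first pair same identity, second different

# Pair 1: similar, d = 5 -> 25. Pair 2: dissimilar, margin 0 -> 0.
# Loss = 0.5 * mean([25, 0]) = 6.25.
print(contrastive_loss_from_labels(feat1, feat2, label1, label2))  # 6.25
```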

@happynear
Owner

@dfdf
As @denghuiru said, I modified the contrastive loss layer to get a more elegant implementation. The modified layer is in my Windows Caffe repository: https://github.com/happynear/caffe-windows/blob/master/src/caffe/layers/contrastive_loss_layer.cpp .

@denghuiru
Author

@happynear
Hi, I recently followed your suggestions for DeepID2: using only same-label pairs, setting the loss weight to 1e-5, and using a small dataset with only 9000 pairs.
1. During training, the contrastive loss becomes very small and contributes almost nothing to the total loss. Is this right?
2. Besides, what accuracy did your friend get with the DeepID2 model, and how does he calculate the face verification accuracy?
3. In DeepID2, how can I check whether the two losses (contrastive loss and softmax loss) are adjusting the parameters together? I keep thinking the parameter updates are going wrong. Thank you!

@dfdf

dfdf commented Dec 21, 2015

@denghuiru What values are you using in the solver? Thank you.

@denghuiru
Author

@dfdf
The author has provided the solver in caffe_proto, and I also use his version!

@dfdf

dfdf commented Dec 23, 2015

@denghuiru Did you get any improvements doing what you suggested? Were you able to make the network converge using a valid contrastive loss?

@denghuiru
Author

@dfdf
I am just following the author's suggestions for this model and have not gotten any impressive results yet. I think fully implementing this model will still take a long time! However, if you make any progress, please contact me anytime!

@dfdf

dfdf commented Dec 28, 2015

@denghuiru Yeah, sure. I am still trying to get the network to converge during training, but if I make any progress I will post here.

@denghuiru
Author

@dfdf
Thank you very much. I am also trying to get the network to converge, but have not made much progress!

@dfdf

dfdf commented Dec 29, 2015

@denghuiru

I got some ideas from a discussion on the caffe-users list.

I will try the following: I created 3 different prototxt files and will run them, changing only the contrastive loss weight and the learning rate:

1. Learning rate: 0.01, contrastive loss weight: 0
2. Learning rate: 0.001, contrastive loss weight: 0.00032
3. Learning rate: 0.0001, contrastive loss weight: 0.006

I will train for 30~50k iterations before changing the parameters. It will take a while to train, but I will post here what I achieve.

If you make any progress in the meantime, feel free to let me know. Thanks.

@denghuiru
Author

@dfdf
At present, I have generated the dataset following the author's scripts and run the model. The learning rate is 0.01 and the contrastive loss weight is 0.00001. However, I found that the softmax loss stays at about 9.0 and doesn't decrease. I am still trying to find out why this happens. Have you met this problem?

@dfdf

dfdf commented Jan 4, 2016

By the way, if we manage to train DeepID2 (my network is still training now), how do we get the output vector for one image? Should we combine dropout5 and dropout5_p? Or do we choose one of them as the DeepID output?

@denghuiru
Author

@dfdf
Once you get the trained model, you can extract the dropout5 feature only; it is 320-dimensional. Then do face verification with it.
By the way, would you share your model with me? Does your model converge correctly?
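A sketch of the verification step on top of the extracted features, assuming cosine similarity against a hand-picked threshold (both are common choices for this step, not necessarily what was used here; the 4-d vectors stand in for the real 320-d features):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(feat_a, feat_b, threshold=0.5):
    """Declare 'same person' when cosine similarity exceeds threshold."""
    return cosine_similarity(feat_a, feat_b) >= threshold

# Toy 4-d stand-ins for the 320-d dropout5 features.
same = verify(np.array([1.0, 0.0, 1.0, 0.0]), np.array([2.0, 0.0, 2.0, 0.0]))
diff = verify(np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0, 0.0]))
print(same, diff)  # True False
```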

@dfdf

dfdf commented Jan 11, 2016

@happynear I found a problem with your Caffe version. I am using your Windows version (caffe repository), but it seems you don't have the latest version of the contrastive loss; for example, this bug is not fixed in your version:
nickcarlevaris/caffe@7e2fceb

Maybe that is one of the reasons the network has trouble converging.

@happynear
Owner

@dfdf That is another type of contrastive loss. Would you like to merge it and try it yourself? I am looking forward to your results.

@dfdf

dfdf commented Jan 12, 2016

I tried to merge it into the Windows version, but I am not very experienced with compilers. I will try on another machine running Ubuntu.

@dfdf

dfdf commented Jan 12, 2016

In this thread: BVLC/caffe#2308, they report that the implemented equation is different from the one in LeCun's paper (the same one used in the DeepID2 paper). As they suggest, that could be one of the reasons affecting the network's training speed (or keeping it from converging at all).
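As far as I understand the referenced fix, the disputed piece is the dissimilar-pair term: the original Caffe code used max(margin - d^2, 0), while the Hadsell/Chopra/LeCun paper uses max(margin - d, 0)^2, where d is the Euclidean distance. A sketch of both variants (check the linked commit and thread for the authoritative forms):

```python
def dissimilar_term_legacy(d, margin):
    """Original Caffe form for a dissimilar pair: max(margin - d^2, 0)."""
    return max(margin - d ** 2, 0.0)

def dissimilar_term_paper(d, margin):
    """Hadsell/Chopra/LeCun form: max(margin - d, 0)^2."""
    return max(margin - d, 0.0) ** 2

d, margin = 0.5, 1.0
# The two forms give different losses (and gradients) for the same pair.
print(dissimilar_term_legacy(d, margin))  # 1 - 0.25 = 0.75
print(dissimilar_term_paper(d, margin))   # (1 - 0.5)^2 = 0.25
```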

@cheer37

cheer37 commented Mar 16, 2016

Hi all.
This thread is interesting.
I am also struggling with a Siamese network using softmax + contrastive loss.
I am getting the contrastive loss as 1.#QNAN (NaN).
Have you ever met this situation? I set up the parameters like you: margin = 0, loss_weight = 1e-5.
My net is the CASIA model.
Please help me,
and also, if you make any progress, please let me know too.
Thanks.

@ghost

ghost commented Nov 1, 2016

@happynear, is your model consistent with the DeepID2 paper (Deep Learning Face Representation by Joint Identification-Verification)? I find that you have twice as many convolution layers as the paper.

@happynear
Owner

No, my implementation is based on the Webface paper.
