Question about model pretrain method? #13
Hi @ZhangYuef, thanks for your attention. As you can see, the pretraining simply uses a softmax classification loss, with every identity treated as a unique class. I didn't pay much attention to it or tune it, so I don't remember the exact hyperparameter values, but it should be somewhere around: epoch: 40. You may tune it a bit and obtain reasonable results.
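For anyone unfamiliar with this setup, "softmax classification with every identity as a unique class" is ordinary cross-entropy over per-identity logits. A minimal NumPy sketch of that loss (all names here are hypothetical, not taken from the MAR codebase):

```python
import numpy as np

def softmax_id_loss(features, weights, labels):
    """Cross-entropy over identity classes: each person ID is one class.

    features: (batch, dim) embeddings from the backbone
    weights:  (dim, num_identities) final classification FC layer
    labels:   (batch,) integer identity labels
    """
    logits = features @ weights                      # (batch, num_identities)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 10))                         # 10 identities
y = np.array([0, 3, 3, 7])
loss = softmax_id_loss(feats, W, y)
```

With zero features the logits are uniform, so the loss reduces to log(num_identities) — a quick sanity check that the implementation is right.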
Hi @KovenYu, thanks for sharing.
Hi @moodom, thank you for your attention. Did you try using the provided pretrained model, and is it working?
Hi @KovenYu. I used the provided pretrained model and got a good result. But when I used the LAL loss as described in the paper and removed the unit-norm constraint to train a pretrained model, then used that pretrained model in the second stage of training, rank-1 could only reach about 56. I tried adjusting the LR and WD; the results were the same. I measured the average parameters of the provided pretrained model's FC layer and the Euclidean distances between the FC layer's column vectors. The results are as follows:
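The diagnostic described here (mean FC-layer parameter value and pairwise Euclidean distances between FC column vectors) can be computed with a few lines of NumPy. A sketch under the assumption that the FC weight is stored as a `(dim, num_classes)` matrix (the layout and names below are illustrative, not from the repo):

```python
import numpy as np

def fc_column_stats(W):
    """W: (dim, num_classes) FC weight matrix.

    Returns the mean parameter value and the mean pairwise
    Euclidean distance between the column vectors of W.
    """
    mean_param = W.mean()
    cols = W.T                                   # (num_classes, dim)
    # pairwise squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2<a,b>
    sq = (cols ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * cols @ cols.T
    d = np.sqrt(np.clip(d2, 0, None))            # clip guards tiny negatives
    iu = np.triu_indices(len(cols), k=1)         # upper triangle, no diagonal
    return mean_param, d[iu].mean()

rng = np.random.default_rng(0)
mean_p, mean_dist = fc_column_stats(rng.normal(size=(16, 5)))
```

Comparing these two numbers between the provided checkpoint and a self-pretrained one can show whether the class weight vectors collapsed toward each other, which would explain a large rank-1 gap.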
@moodom thank you for your detailed description!
Thank you for sharing the code. I set the corresponding parameters according to your description and wanted to redo the loss_al pre-training. However, with the pre-training weights obtained this way, the second stage of training produced a large number of NaNs. The following is my pre-training code: https://github.com/pzhren/Papers/blob/master/%E7%9B%AE%E6%A0%87%E6%A3%80%E6%B5%8B%E4%B8%8Ere-id%E4%BB%BB%E5%8A%A1/MAR-master/src/pretrain.py#L6
The following are the hyperparameter settings during pre-training:
------------------------------------------------------- options --------------------------------------------------------
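A common way to chase NaNs like the ones described above is to fail fast on the first non-finite gradient and to bound the gradient norm before each update. A generic sketch of both guards (plain NumPy gradient descent; the function names and thresholds are hypothetical, not from the MAR code):

```python
import numpy as np

def clip_grad_norm(grad, max_norm):
    """Rescale the gradient if its L2 norm exceeds max_norm —
    a standard guard against exploding gradients that turn into NaNs."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

def step(params, grad, lr=0.01, max_norm=5.0):
    """One SGD update that refuses to proceed on a non-finite gradient."""
    if not np.all(np.isfinite(grad)):
        raise FloatingPointError("non-finite gradient; lower the LR or inspect the loss")
    grad = clip_grad_norm(grad, max_norm)
    return params - lr * grad

p = np.zeros(3)
p = step(p, np.array([300.0, 400.0, 0.0]))   # norm 500 → rescaled to norm 5
```

Catching the first bad step this way usually points at whether the NaNs originate in the pretraining loss itself or only appear once the second-stage losses are switched on.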
Thanks for sharing your code.
I find that the paper mentions in Section 4.2 that the model is first pretrained using only $$L_{AL}$$.
However, I don't know how to pretrain the model with the current code. I need some more detailed instructions, e.g. how many epochs should I pretrain the model for?
Thanks >.<