Training and Fine-Tuning LightCNN #110
Comments
The configurations of the solver and train_val are right for training Light CNN, and they are also suitable for fine-tuning on your own dataset.
As described in your paper, "The learning rate is set to 1e-3 initially and reduced to 5e-5 gradually". Could you please tell me the specific parameters to achieve this in Caffe, such as lr_policy, gamma, stepsize, max_iter, etc.? Thanks.
@jiangxuehan I believe there isn't a single right answer to your question; the learning rate decay depends on your training database. In general, we reduce the learning rate when the training cost stops decreasing after some iterations, so the "best" way is to watch your training cost and reduce the learning rate after X iterations. In Caffe you can specify the steps at which to decrease the learning rate in the solver, for example:
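A multistep solver along these lines is a sketch only: the stepvalues and gamma below are illustrative, chosen so that 1e-3 decays to roughly 5e-5 over 5M iterations, not the authors' exact values:

```
net: "train_val.prototxt"
base_lr: 0.001            # 1e-3 initially, as in the paper
lr_policy: "multistep"    # multiply the LR by gamma at each stepvalue
gamma: 0.473              # 0.001 * 0.473^4 is roughly 5e-5 after the last step
stepvalue: 1000000
stepvalue: 2000000
stepvalue: 3000000
stepvalue: 4000000
max_iter: 5000000
momentum: 0.9
weight_decay: 0.0005
snapshot: 100000
snapshot_prefix: "lightcnn"
solver_mode: GPU
```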
The multistep policy multiplies the learning rate by gamma at each stepvalue (which you can choose by watching the training cost).
If you won't run all 5,000,000 iterations, just adjust the values accordingly.
@AlfredXiangWu I'm training on a Tesla K40 and at iteration 1100 my training loss is 11.3229. It's taking very long; is this normal? I normalized 5M images of MS-Celeb (clean list) using the paper's specification and used the solver from this issue.
@TheusStremens I think that is normal when training the Light CNN. @jiangxuehan You can follow the configuration @TheusStremens mentioned; it is similar to mine.
@TheusStremens @AlfredXiangWu Thanks for your replies; I will follow a similar configuration to train this model. BTW, the loss of Light CNN drops slowly in the first several thousand iterations, @TheusStremens.
Hi guys, after 7 days of training the cost has barely moved, and it's only at iteration 20K. At this rate it will reach iteration 100K in 5 weeks and iteration 1M (1/5 of the max iteration count) in a year. @AlfredXiangWu is this normal? How long did your training take? Can you tell me the number of iterations at the end of your training?
@TheusStremens Do you mean you trained the Light CNN for about a week and reached only 20k iterations? That is abnormal. I set the max iteration to 4,000,000 and it takes about 1 week on a Titan X.
I removed iter_size: 60 from the solver and the speed went up. But now I have a convergence problem like #36: my loss is 87.3365 at the beginning. Changing the batch_size to 80 apparently resolved the convergence problem, but the speed is still abnormal (1/4 of your speed). Did you use iter_size in your training, @AlfredXiangWu? I'll try different batch_size settings.
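For anyone hitting the same slowdown: iter_size in the Caffe solver accumulates gradients over that many mini-batches before each weight update, so the effective batch size is batch_size times iter_size, and every reported iteration runs iter_size forward/backward passes. A sketch of the trade-off (values illustrative):

```
# solver.prototxt: effective batch = batch_size (from train_val) * iter_size.
# iter_size: 60 with batch_size: 8 gives an effective batch of 480, but each
# reported iteration is ~60x slower in wall-clock time.
iter_size: 1
# Prefer raising batch_size in the data layer of train_val.prototxt
# (e.g. batch_size: 80) if GPU memory allows.
```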
The convergence problem hasn't gone away; it just happened again at iteration 8980. I'm applying the normalization correctly, with the same base_lr and the same architecture, so I can't figure out what is causing it.
The solver I used for training is above. Clipping gradients may help solve your problem. If not, I think you can fine-tune the Light CNN on your own dataset from the pre-trained model.
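In Caffe, gradient clipping is a single line in the solver (the threshold below is illustrative):

```
# solver.prototxt: rescale any gradient whose L2 norm exceeds the threshold
clip_gradients: 10
```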
@AlfredXiangWu @TheusStremens I tried to lower my learning rate |
@xionglei181818 Did you use the clean list of MS-Celeb-1M? Why did you use only 390,000 pictures when MS-Celeb-1M has 5M+? What learning rate did you use? In my case, shuffling the training data solved the convergence problem. After 700K iterations the loss dropped to 3. Now I'm at 1.8M iterations, with loss = 1 and acc = 89%.
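If you load images with an ImageData layer, shuffling is one flag in the TRAIN data layer (a sketch, assuming that layer type and a hypothetical list file):

```
image_data_param {
  source: "train_list.txt"  # hypothetical image/label list
  batch_size: 80
  shuffle: true             # reshuffle the list every epoch
}
```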
@TheusStremens I used the clean list of MS-Celeb-1M and took 50 images from each category, so after screening I got 61,332 categories, about 390,000 images. 1. I used the learning rate provided by @AlfredXiangWu. 2. I also tried another set of parameters. Under both sets, after running 200,000 iterations the loss stayed around 11.0. Reducing the learning rate to 0.0001 gave the same result. Have you observed this phenomenon? Thank you.
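As a sanity check: a loss pinned near 11.0 with 61,332 classes is exactly chance level. A softmax that predicts uniformly over C classes gives a cross-entropy of ln(C), and ln(61332) is about 11.02 (likewise, the 11.2 reported below with 79,056 classes matches ln(79056), about 11.28). A flat loss around that value means the network hasn't started separating classes at all, which points to the learning rate, initialization, or data shuffling rather than insufficient training time:

```python
import math

# Cross-entropy of a uniform softmax over C classes is ln(C);
# a training loss stuck at this value means chance-level predictions.
for c in (61332, 79056):
    print(c, round(math.log(c), 2))  # 61332 -> 11.02, 79056 -> 11.28
```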
@xionglei181818 |
I trained on MS-Celeb-1M with the solver config provided by @AlfredXiangWu. It took 9 days on a Titan X for 3,500,000 iterations. The performance of my model on LFW is not as good as model C.
@AlfredXiangWu @TheusStremens @lyuchuny3 Can you share your train_test.prototxt and your solver.prototxt? I'm training Light CNN with the clean list; after screening I got 79,056 categories, about 4,920,000 images. But after running 400,000 iterations the loss is still 11.2. Can you give me a hand?
How many iterations (and what batch size) are needed to achieve the results of model B trained on the CASIA-WebFace dataset?
@ctgushiwei
your_net_train_val.prototxt:
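A minimal sketch of the part that matters, assuming a hypothetical 10,000-identity dataset (the bottom blob name should match the MFM output of fc1 in your own train_val):

```
# fc2 is the classification layer; its num_output must equal the number
# of identities in your own training list.
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "eltwise_fc1"   # assumed name; use the MFM output of fc1 in your net
  top: "fc2"
  param { lr_mult: 10 decay_mult: 1 }   # illustrative: learn the new
  param { lr_mult: 20 decay_mult: 0 }   # classifier faster than the trunk
  inner_product_param {
    num_output: 10000     # hypothetical identity count -- change to yours
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
```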
remember to change the num_output value in fc2 |
@TheusStremens Firstly, thank you very much for your answer! Secondly, did you test your model on LFW, and does the accuracy reach 98%?
@TheusStremens Hello, when fine-tuning the LightCNN I got the error "Cannot copy param 0 weights from layer 'conv1'; shape mismatch. Source param shape is 96 1 5 5 (2400); target param shape is 96 3 5 5 (7200). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer". Could you please help me?
@honghuCode Check whether you are loading RGB images; LightCNN works with grayscale images.
@TheusStremens I used the following code to convert the image to grayscale and resize it to 128x128:

```python
mat = cv2.resize(mat, (128, 128))
im_gray = cv2.cvtColor(mat, cv2.COLOR_BGR2GRAY)
```

The following is my train_test_bak.prototxt (the layer definitions were garbled in the original post and are omitted here).
@honghuCode Add is_color: false in the data layer. Caffe loads images as 3 channels, even if they are grayscale on disk, unless you set this parameter to false.
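A sketch of a TRAIN data layer with the flag set (file names and sizes illustrative):

```
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  image_data_param {
    source: "train_list.txt"  # hypothetical image/label list
    batch_size: 80
    is_color: false           # load as single-channel grayscale
    new_height: 128
    new_width: 128
  }
}
```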
@TheusStremens Thank you very much, you solved my problem.
First, congratulations and thank you for your work. It's very exciting to see that it's possible to make a light CNN without millions (or billions) of parameters and achieve state-of-the-art accuracy.
I intend to do two experiments (varying the type of activations, cost functions, solver types, number of neurons, ...) using the model C architecture: one training a new CNN on my database, and another fine-tuning model C on my database. I made the following solver.prototxt and train_val.prototxt:
Could you tell me if this solver and train_val are similar to those you used for the final training of model C?
And for fine-tuning, can I use the same solver used in training and just freeze layers in the train_val, or is another solver necessary for fine-tuning?
Thanks
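On the freezing question: a common pattern is to keep the same solver (often with a lower base_lr) and zero the lr_mult of the layers to be kept fixed in the train_val. A sketch for one conv layer (the shape values are illustrative):

```
# lr_mult: 0 freezes a layer's parameters during fine-tuning;
# repeat for every layer you want to keep fixed.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 0 }   # freeze the weights
  param { lr_mult: 0 }   # freeze the bias
  convolution_param {
    num_output: 96
    kernel_size: 5
    stride: 1
    pad: 2
  }
}
```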