
train_val prototxt #6

Closed
ShervinAr opened this issue May 22, 2017 · 11 comments

@ShervinAr

ShervinAr commented May 22, 2017

Hello, could you please provide the train_val.prototxt file?
I would like to do some fine-tuning, but it seems the number of blobs in the batch norm layers differs between the training and deploy models.

@shicai
Owner

shicai commented May 23, 2017

Except for the data layers and the loss/accuracy layers, the main body of the training and deploy prototxt files should be the same. Please check your training prototxt files.
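As an illustration, a minimal sketch of what such a shared layer body could look like (the parameter values here are illustrative assumptions, not taken from the repository):

```protobuf
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # In BVLC Caffe the three internal blobs (mean, variance,
  # moving-average factor) are not configured in the prototxt,
  # so this definition can be identical in train_val and deploy.
  batch_norm_param {
    use_global_stats: true  # typically false during training
  }
}
```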

@ShervinAr
Author

Thanks for your reply. Is it possible for you to share your train_val.prototxt file?

@shicai
Owner

shicai commented May 23, 2017

They are actually the same.
So if you have any trouble, would you please provide your error information?

@ShervinAr
Author

ShervinAr commented May 23, 2017

There seems to be an inconsistency between the batch normalization layer definitions in mobilenet_deploy.prototxt and those in the provided Caffe model. For example, for the conv1/bn layer, I get the following error message:

ERROR: Check failed: target_blobs.size() == source_layer.blobs_size() (5 vs. 3) Incompatible number of blobs for layer conv1/bn

I could make the error go away by renaming ALL the batch normalization layers, but I would like to use exactly the same model you have provided.

@shicai
Owner

shicai commented May 23, 2017

Do you use the official Caffe? Its batch_norm_layer has only 3 blobs (the running mean, the running variance, and the moving-average factor).
Please take a look at: https://github.com/BVLC/caffe/blob/master/src/caffe/layers/batch_norm_layer.cpp#L25
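For reference, here is a rough Python paraphrase of how BVLC Caffe uses those 3 blobs at inference time (use_global_stats mode); this is a sketch of the layer's logic, not the actual implementation:

```python
import math

def bvlc_batchnorm_forward(x, mean_blob, var_blob, factor_blob, eps=1e-5):
    """Normalize per-channel inputs x using the 3 stored blobs.

    blobs_[0] = accumulated mean, blobs_[1] = accumulated variance,
    blobs_[2] = moving-average factor used to rescale the first two.
    x is a list of channels, each a list of values.
    """
    scale = 0.0 if factor_blob == 0 else 1.0 / factor_blob
    mean = [m * scale for m in mean_blob]
    var = [v * scale for v in var_blob]
    return [[(xi - mean[c]) / math.sqrt(var[c] + eps) for xi in chan]
            for c, chan in enumerate(x)]
```

In BVLC models the learned scale (gamma) and bias (beta) live in a separate Scale layer, which is why BatchNorm itself stores only 3 blobs, whereas nvCaffe can fold gamma and beta into two extra blobs of the same layer.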

@ShervinAr
Author

ShervinAr commented May 23, 2017

I am using nvCaffe and the NVIDIA DIGITS environment for fine-tuning.
The corresponding batch norm layer has five blobs:

this->blobs_[0].reset(new Blob<Dtype>(sz));  // scale
this->blobs_[1].reset(new Blob<Dtype>(sz));  // bias
this->blobs_[2].reset(new Blob<Dtype>(sz));  // mean
this->blobs_[3].reset(new Blob<Dtype>(sz));  // variance
...
this->blobs_[4].reset(new Blob<Dtype>(sz));

Do you have any suggestions on how to work around this problem?

@shicai
Owner

shicai commented May 23, 2017

please take a look at https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/batch_norm_layer.cpp#L25

  scale_bias_ = false;
  scale_bias_ = param.scale_bias(); // by default = false;
  if (param.has_scale_filler() || param.has_bias_filler()) { // implicit set
    scale_bias_ = true;
  }

  if (this->blobs_.size() > 0) {
    LOG(INFO) << "Skipping parameter initialization";
  } else {
    if (scale_bias_)
      this->blobs_.resize(5);
    else
      this->blobs_.resize(3);

Please confirm that you set scale_bias to false and have no scale_filler or bias_filler in your batch norm layers.
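That decision logic from the nvCaffe snippet above can be paraphrased in Python (a sketch for clarity, not an actual API):

```python
def nvcaffe_batchnorm_blob_count(scale_bias=False,
                                 has_scale_filler=False,
                                 has_bias_filler=False):
    # In nvCaffe, a scale_filler or bias_filler implicitly turns
    # scale_bias on, so the layer then stores 5 blobs (mean, variance,
    # factor, scale, bias) instead of the 3 that BVLC Caffe uses.
    if has_scale_filler or has_bias_filler:
        scale_bias = True
    return 5 if scale_bias else 3
```

So an nvCaffe net whose prototxt implies scale_bias expects 5 blobs per batch norm layer and cannot load weights that store only 3, which matches the "5 vs. 3" error above.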

@ShervinAr
Author

Many thanks for your reply. As I am very new to Caffe, could you please tell me exactly how to do that?

@shicai
Owner

shicai commented May 23, 2017

Please update your Caffe to the newest version from https://github.com/NVIDIA/caffe/
and the default parameter settings should work for you.

@ShervinAr
Author

ShervinAr commented May 23, 2017

Thanks for your reply. I have upgraded Caffe but still face the same problem. Any help on how to resolve this would be greatly appreciated. The configuration of the required libraries is as follows:
-- ******************* Caffe Configuration Summary *******************
-- General:
-- Version : 0.16.1
-- Git : v0.16.1-6-g5a06f0e
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- Release CXX flags : -O3 -DNDEBUG -fPIC -Wall -std=c++11 -Wno-sign-compare -Wno-uninitialized
-- Debug CXX flags : -g -DDEBUG -fPIC -Wall -std=c++11 -Wno-sign-compare -Wno-uninitialized
-- Build type : Release

-- BUILD_SHARED_LIBS : ON
-- BUILD_python : ON
-- BUILD_matlab : OFF
-- BUILD_docs : ON
-- CPU_ONLY : OFF
-- USE_OPENCV : ON
-- USE_LEVELDB : ON
-- USE_LMDB : ON
-- ALLOW_LMDB_NOLOCK : OFF
-- TEST_FP16 : OFF

-- Dependencies:
-- BLAS : Yes (Atlas)
-- Boost : Yes (ver. 1.54)
-- glog : Yes
-- gflags : Yes
-- protobuf : Yes (ver. 2.5.0)
-- lmdb : Yes (ver. 0.9.16)
-- LevelDB : Yes (ver. 1.15)
-- Snappy : Yes (ver. 1.1.0)
-- OpenCV : Yes (ver. 2.4.8)
-- CUDA : Yes (ver. 8.0)

-- NVIDIA CUDA:
-- Target GPU(s) : Auto
-- GPU arch(s) : sm_52
-- cuDNN : Yes (ver. 6.0)
-- NCCL : Not found
-- NVML : /usr/lib/nvidia-361/libnvidia-ml.so

-- Python:
-- Interpreter : /usr/bin/python2.7 (ver. 2.7.6)
-- Libraries : /usr/lib/x86_64-linux-gnu/libpython2.7.so (ver 2.7.6)
-- NumPy : /usr/lib/python2.7/dist-packages/numpy/core/include (ver 1.8.2)

-- Documentaion:
-- Doxygen : No
-- config_file :

@shicai
Owner

shicai commented May 24, 2017

Would you please share a link to your train_val.prototxt with me?
I need to take a good look at it.
