SanghyukChun/caffe


Caffe development version by SanghyukChun

This is a development repository for Caffe, maintained by @SanghyukChun.

I am currently working on implementing batch normalization (BN) based on PR#1965.

File changes for implementation of batch normalization

Implementation of BN

In PR#1965, there are two unresolved problems:

  1. In the PR, shuffling is implemented only for encoded data.
  2. The mean/variance used for inference is not implemented; therefore, we cannot classify test data with batch size 1.

In this dev repository, I resolve these problems as follows:

  1. Instead of using the shuffling code from PR#1965, I use the shuffle param of ImageDataLayer. Shuffling is still not implemented for every data layer, but I decided to use ImageDataLayer because its shuffling is uniform and it is not much slower than DataLayer.
  2. I apply a moving-average strategy during the training phase and use the accumulated statistics for inference.
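The moving-average strategy in point 2 can be sketched as follows. This is my reconstruction of the general technique, not the exact code in this repository (the real implementation lives in the C++/CUDA BN layer); the class and parameter names here are illustrative only. During training, running estimates of the per-channel mean and variance are updated with an exponential moving average; at inference time the stored estimates are used instead of batch statistics, so even a batch of size 1 can be normalized.

```python
# Hypothetical sketch of BN inference via moving averages; names are
# illustrative, not from this repository's C++ implementation.
import numpy as np

class BatchNormSketch:
    def __init__(self, channels, momentum=0.9, eps=1e-5):
        self.momentum = momentum                 # decay of the moving average
        self.eps = eps                           # numerical-stability constant
        self.running_mean = np.zeros(channels)   # accumulated mean estimate
        self.running_var = np.ones(channels)     # accumulated variance estimate

    def forward(self, x, training):
        # x has shape (batch, channels)
        if training:
            mean = x.mean(axis=0)
            var = x.var(axis=0)
            # Exponential moving-average update of the running statistics.
            self.running_mean = (self.momentum * self.running_mean
                                 + (1 - self.momentum) * mean)
            self.running_var = (self.momentum * self.running_var
                                + (1 - self.momentum) * var)
        else:
            # Inference: use the accumulated statistics, so the batch
            # statistics are never needed and batch size 1 works.
            mean, var = self.running_mean, self.running_var
        return (x - mean) / np.sqrt(var + self.eps)
```

At inference, the normalization no longer depends on the current batch, which is exactly what makes single-example classification possible.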

Some approaches I am currently considering:

  1. Shuffling via random skip, as mentioned in a comment on PR#1965. This would make shuffling available to every type of data layer. However, my experiments suggest that shuffling is not a critical issue for BN networks.
  2. Implementing a completely independent batch normalization inference module, like lim6060's. However, my experiments suggest that the inference rule is not a critical issue for BN networks.
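For reference, the two shuffling options above correspond to existing fields in upstream Caffe's layer parameters (a hedged sketch; field names are from the upstream proto definitions and may differ in this dev branch):

```
# Option used in this repository: ImageDataLayer with uniform shuffling.
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  image_data_param {
    source: "train.txt"   # placeholder list of image/label pairs
    batch_size: 32
    shuffle: true          # uniform shuffle of the image list each epoch
  }
}

# Approach under consideration: DataLayer with a random skip, which
# randomizes the starting point in the database rather than shuffling.
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "train_lmdb"   # placeholder database path
    batch_size: 32
    rand_skip: 1000        # skip up to this many entries at startup
  }
}
```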

License and Citation

Caffe is released under the BSD 2-Clause license. The BVLC reference models are released for unrestricted use.

Please cite Caffe in your publications if it helps your research:

@article{jia2014caffe,
  Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
  Journal = {arXiv preprint arXiv:1408.5093},
  Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},
  Year = {2014}
}

About

Caffe: a fast open framework for deep learning.


Languages

  • C++ 80.1%
  • Python 8.5%
  • Cuda 4.8%
  • CMake 3.0%
  • Protocol Buffer 1.5%
  • MATLAB 1.0%
  • Other 1.1%