Caffe: a fast open framework for deep learning.
Latest commit f7801e9 Aug 22, 2018
Caffe

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors.

NVCaffe

NVIDIA Caffe (NVIDIA Corporation ©2017) is an NVIDIA-maintained fork of BVLC Caffe tuned for NVIDIA GPUs, particularly in multi-GPU configurations. Here are the major features:

  • 16-bit (half-precision) floating-point training and inference support.
  • Mixed-precision support. Data can be stored and/or computed in 64-, 32- or 16-bit formats. Precision can be set per layer (the forward and backward passes may use different precisions), or for the whole Net.
  • Layer-wise Adaptive Rate Control (LARC) and adaptive global gradient scaler for better accuracy, especially in 16-bit training.
  • Integration with cuDNN v7.
  • Automatic selection of the best cuDNN convolution algorithm.
  • Integration with NCCL v2.2 (or higher) for improved multi-GPU scaling.
  • Optimized GPU memory management for data and parameter storage, I/O buffers and the workspace of convolutional layers.
  • Parallel data parser, transformer and image reader for improved I/O performance.
  • Parallel backpropagation and gradient reduction on multi-GPU systems.
  • Fast solver implementations with fused CUDA kernels for weight and history updates.
  • Multi-GPU test phase for even memory load across multiple GPUs.
  • Backward compatibility with BVLC Caffe and NVCaffe 0.15 and higher.
  • Extended set of optimized models (including 16-bit floating-point examples).
  • Experimental (no official support): multi-node training (since v0.17.1; requires NCCL 2.2 and OpenMPI 2).
  • Experimental (no official support): TRTLayer (since v0.17.1; can be used as an inference plugin).
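
The gradient scaler in the feature list addresses a general property of 16-bit arithmetic: very small gradients round to zero in half precision. The sketch below shows only that general motivation, using the Python standard library; it is not NVCaffe's implementation, and the names `to_half`, `SCALE` and `tiny_grad` as well as the scale value are assumptions chosen for the example.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (binary16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


SCALE = 1024.0    # hypothetical scale factor, chosen only for illustration
tiny_grad = 1e-8  # a gradient smaller than the smallest positive binary16 value

unscaled = to_half(tiny_grad)        # rounds to zero in half precision
scaled = to_half(tiny_grad * SCALE)  # stays non-zero (a subnormal binary16 value)
recovered = scaled / SCALE           # undo the scaling in higher precision

print("unscaled in FP16:", unscaled)
print("scaled in FP16:  ", scaled)
print("recovered value: ", recovered)
```

Multiplying by a scale factor before the 16-bit step and dividing afterwards in higher precision keeps the small gradient from vanishing. How NVCaffe chooses and adapts its scale is not described here.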

License and Citation

Caffe is released under the BSD 2-Clause license. The BVLC reference models are released for unrestricted use.

Please cite Caffe in your publications if it helps your research:

@article{jia2014caffe,
  Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
  Journal = {arXiv preprint arXiv:1408.5093},
  Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},
  Year = {2014}
}

Contributions

Please read and sign the enclosed agreement, NVIDIA_CLA_v1.0.1.docx, and attach it to your PR.

Useful notes

The libturbojpeg library has been used since 0.16.5. Its package has a packaging bug. To work around it, run the following commands (required for Makefile builds, optional for CMake):

sudo apt-get install libturbojpeg
sudo ln -s /usr/lib/x86_64-linux-gnu/libturbojpeg.so.0.1.0 /usr/lib/x86_64-linux-gnu/libturbojpeg.so
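
Before creating the symlink, it can help to check whether it is needed. The sketch below is a hypothetical helper, not part of NVCaffe: it reports whether the versioned library file is present while the unversioned `libturbojpeg.so` name is missing. The function name `needs_symlink` and its defaults are assumptions; the directory and file names are taken from the commands above.

```python
import os


def needs_symlink(lib_dir: str,
                  versioned: str = "libturbojpeg.so.0.1.0",
                  link: str = "libturbojpeg.so") -> bool:
    """Return True if the versioned library exists but the plain .so name does not."""
    has_versioned = os.path.exists(os.path.join(lib_dir, versioned))
    # lexists also reports a dangling symlink, which `ln -s` would refuse to overwrite
    has_link = os.path.lexists(os.path.join(lib_dir, link))
    return has_versioned and not has_link


# Directory taken from the commands above; adjust it for other architectures.
print(needs_symlink("/usr/lib/x86_64-linux-gnu"))
```

If it prints `True`, the `ln -s` command above applies. If it prints `False`, either the link already exists or the library is installed somewhere else.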