This repository has been archived by the owner on Oct 30, 2019. It is now read-only.

Commit 6f9402d: ResNet training in Torch
colesbury committed Feb 4, 2016 (0 parents)
Showing 22 changed files with 1,896 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
gen/
libnccl.so
model_best.t7
27 changes: 27 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,27 @@
# Contributing to fb.resnet.torch
We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and includes sufficient instructions to reproduce the issue.

## Coding Style
* Use three spaces for indentation rather than tabs
* 80 character line length

## License
By contributing to fb.resnet.torch, you agree that your contributions will be
licensed under its BSD license.
92 changes: 92 additions & 0 deletions INSTALL.md
@@ -0,0 +1,92 @@
Torch ResNet Installation
=========================

This is the suggested way to install the Torch ResNet dependencies on [Ubuntu 14.04+](http://www.ubuntu.com/):
* NVIDIA CUDA 7.0+
* NVIDIA cuDNN v4
* Torch
* ImageNet dataset

## Requirements
* NVIDIA GPU with compute capability 3.5 or above

## Install CUDA
1. Install the `build-essential` package:
```bash
sudo apt-get install build-essential
```

2. If you are using a virtual machine (e.g., an Amazon EC2 instance), also install the `linux-generic` package:
```bash
sudo apt-get update
sudo apt-get install linux-generic
```

3. Download the CUDA .deb file for Linux Ubuntu 14.04 64-bit from: https://developer.nvidia.com/cuda-downloads.
The file will be named something like `cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb`

4. Install CUDA from the .deb file:
```bash
sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
echo "export PATH=/usr/local/cuda/bin/:\$PATH; export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:\$LD_LIBRARY_PATH; " >>~/.bashrc && source ~/.bashrc
```

5. Restart your computer

## Install cuDNN v4
1. Download cuDNN v4 from https://developer.nvidia.com/cuDNN (requires registration).
The file will be named something like `cudnn-7.0-linux-x64-v4.0-rc.tgz`.

2. Extract the file to `/usr/local/cuda`:
```bash
tar -xvf cudnn-7.0-linux-x64-v4.0-rc.tgz
sudo cp cuda/include/*.h /usr/local/cuda/include
sudo cp cuda/lib64/*.so* /usr/local/cuda/lib64
```

## Install Torch
1. Install the Torch dependencies:
```bash
curl -sk https://raw.githubusercontent.com/torch/ezinstall/master/install-deps | bash -e
```

2. Install Torch in a local folder:
```bash
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; ./install.sh
```

To uninstall Torch later, remove the install folder: `rm -rf ~/torch`

## Install the Torch cuDNN v4 bindings
```bash
git clone -b R4 https://github.com/soumith/cudnn.torch.git
cd cudnn.torch; luarocks make
```

## Download the ImageNet dataset
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset has 1000 categories and 1.2 million images. The images do not need to be preprocessed or packaged in any database, but the validation images need to be moved into appropriate subfolders.

1. Download the images from http://image-net.org/download-images

2. Extract the training data:
```bash
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME; do
   mkdir -p "${NAME%.tar}"
   tar -xvf "${NAME}" -C "${NAME%.tar}"
   rm -f "${NAME}"
done
cd ..
```
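
The per-archive loop above relies on bash suffix stripping: `${NAME%.tar}` removes the trailing `.tar` to derive each class folder name. A minimal illustration:

```bash
# ${NAME%.tar} strips the shortest trailing match of ".tar",
# turning an archive name into its per-class folder name.
NAME="n01440764.tar"
echo "${NAME%.tar}"    # prints n01440764
```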

3. Extract the validation data and move images to subfolders:
```bash
mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar
wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash
```
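
The `valprep.sh` script groups the flat validation images into one folder per class. A rough sketch of the idea on a single mock image, with an illustrative (not the actual) class ID:

```bash
# Sketch only: valprep.sh does this for all 50,000 validation images,
# using the official image-to-class mapping. The class ID below is
# illustrative, not the real mapping for this image.
mkdir -p val_demo && cd val_demo
touch ILSVRC2012_val_00000001.JPEG
CLASS="n01440764"                 # hypothetical mapping entry
mkdir -p "$CLASS"
mv ILSVRC2012_val_00000001.JPEG "$CLASS/"
ls "$CLASS"                       # prints ILSVRC2012_val_00000001.JPEG
```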

## Download Torch ResNet
```bash
git clone https://github.com/facebook/fb.resnet.torch.git
cd fb.resnet.torch
```
30 changes: 30 additions & 0 deletions LICENSE
@@ -0,0 +1,30 @@
BSD License

For fb.resnet.torch software

Copyright (c) 2016, Facebook, Inc. All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name Facebook nor the names of its contributors may be used to
endorse or promote products derived from this software without specific
prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
33 changes: 33 additions & 0 deletions PATENTS
@@ -0,0 +1,33 @@
Additional Grant of Patent Rights Version 2

"Software" means the fb.resnet.torch software distributed by Facebook, Inc.

Facebook, Inc. ("Facebook") hereby grants to each recipient of the Software
("you") a perpetual, worldwide, royalty-free, non-exclusive, irrevocable
(subject to the termination provision below) license under any Necessary
Claims, to make, have made, use, sell, offer to sell, import, and otherwise
transfer the Software. For avoidance of doubt, no license is granted under
Facebook’s rights in any patent claims that are infringed by (i) modifications
to the Software made by you or any third party or (ii) the Software in
combination with any software or other technology.

The license granted hereunder will terminate, automatically and without notice,
if you (or any of your subsidiaries, corporate affiliates or agents) initiate
directly or indirectly, or take a direct financial interest in, any Patent
Assertion: (i) against Facebook or any of its subsidiaries or corporate
affiliates, (ii) against any party if such Patent Assertion arises in whole or
in part from any software, technology, product or service of Facebook or any of
its subsidiaries or corporate affiliates, or (iii) against any party relating
to the Software. Notwithstanding the foregoing, if Facebook or any of its
subsidiaries or corporate affiliates files a lawsuit alleging patent
infringement against you in the first instance, and you respond by filing a
patent infringement counterclaim in that lawsuit against that party that is
unrelated to the Software, the license granted hereunder will not terminate
under section (i) of this paragraph due to such counterclaim.

A "Necessary Claim" is a claim of a patent owned by Facebook that is
necessarily infringed by the Software standing alone.

A "Patent Assertion" is any lawsuit or other action alleging direct, indirect,
or contributory infringement or inducement to infringe any patent, including a
cross-claim or counterclaim.
56 changes: 56 additions & 0 deletions README.md
@@ -0,0 +1,56 @@
ResNet training in Torch
============================

This implements training of residual networks from [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385) by Kaiming He et al.


## Requirements
See the [installation instructions](INSTALL.md) for a step-by-step guide.
- Install [Torch](http://torch.ch/docs/getting-started.html) on a machine with CUDA GPU
- Install [cuDNN v4](https://developer.nvidia.com/cudnn) and the Torch [cuDNN bindings](https://github.com/soumith/cudnn.torch/tree/R4)
- Download the [ImageNet](http://image-net.org/download-images) dataset and [move validation images](https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset) to labeled subfolders

## Training
See the [training recipes](TRAINING.md) for additional examples.

The training scripts come with several options, which can be listed with the `--help` flag.
```bash
th main.lua --help
```

To run the training, simply run `main.lua`. By default, the script trains ResNet-34 on ImageNet with one GPU and two data-loader threads.
```bash
th main.lua -data [imagenet-folder with train and val folders]
```
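
The `-data` folder is expected to follow the layout produced by the [installation instructions](INSTALL.md): one subfolder per class under both `train` and `val`. Roughly (class and file names illustrative):

```
imagenet/
  train/
    n01440764/
      n01440764_10026.JPEG
      ...
  val/
    n01440764/
      ILSVRC2012_val_00000293.JPEG
      ...
```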

To train ResNet-50 on 4 GPUs:
```bash
th main.lua -depth 50 -batchSize 256 -nGPU 4 -nThreads 8 -shareGradInput -data [imagenet-folder]
```

## Trained models

Trained ResNet 18, 34, 50, and 101 models are [available for download](pretrained). We include instructions for [using a custom dataset](pretrained/README.md#fine-tuning-on-a-custom-dataset) and for [extracting image features](pretrained/README.md#extracting-image-features) using a pre-trained model.

The trained models achieve lower error rates than the [original ResNet models](https://github.com/KaimingHe/deep-residual-networks).

#### Single-crop validation error rate

| Network | Top-1 error | Top-5 error |
| ------------- | ----------- | ----------- |
| ResNet-18 | 30.41 | 10.76 |
| ResNet-34 | 26.73 | 8.74 |
| ResNet-50 | 24.01 | 7.02 |
| ResNet-101 | **22.44** | **6.21** |

## Notes

This implementation differs from the ResNet paper in a few ways:

**Scale augmentation**: We use the [scale and aspect ratio augmentation](datasets/transforms.lua#L130) from [Going Deeper with Convolutions](http://arxiv.org/abs/1409.4842), instead of [scale augmentation](datasets/transforms.lua#L113) used in the ResNet paper. We find this gives a better validation error.

**Color augmentation**: We use the photometric distortions from [Andrew Howard](http://arxiv.org/abs/1312.5402) in addition to the [AlexNet](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)-style color augmentation used in the ResNet paper.

**Weight decay**: We apply weight decay to all weights and biases instead of just the weights of the convolution layers.

**Strided convolution**: When using the bottleneck architecture, we use stride 2 in the 3x3 convolution, instead of the first 1x1 convolution.
66 changes: 66 additions & 0 deletions TRAINING.md
@@ -0,0 +1,66 @@
Training recipes
----------------

### CIFAR-10

To train ResNet-20 on CIFAR-10 with 2 GPUs:

```bash
th main.lua -dataset cifar10 -nGPU 2 -batchSize 128 -depth 20
```

To train ResNet-110 instead, just change the `-depth` flag:

```bash
th main.lua -dataset cifar10 -nGPU 2 -batchSize 128 -depth 110
```

To fit ResNet-1202 on two GPUs, you will need to use the [`-shareGradInput`](#sharegradinput) flag:

```bash
th main.lua -dataset cifar10 -nGPU 2 -batchSize 128 -depth 1202 -shareGradInput
```
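
The CIFAR depths follow the `6n+2` pattern from the ResNet paper: three stages of `n` residual blocks with two 3x3 layers each, plus the initial convolution and the final classifier. Recovering `n` for the depths above:

```bash
# depth = 6n + 2 for the CIFAR variants; recover n from a given depth.
for DEPTH in 20 110 1202; do
   echo "$DEPTH -> n=$(( (DEPTH - 2) / 6 ))"
done
# prints:
# 20 -> n=3
# 110 -> n=18
# 1202 -> n=200
```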

### ImageNet

See the [installation instructions](INSTALL.md#download-the-imagenet-dataset) for ImageNet data setup.

To train ResNet-18 on ImageNet with 4 GPUs and 8 data loading threads:

```bash
th main.lua -depth 18 -nGPU 4 -nThreads 8 -batchSize 256 -data [imagenet-folder]
```

To train ResNet-34 instead, just change the `-depth` flag:

```bash
th main.lua -depth 34 -nGPU 4 -nThreads 8 -batchSize 256 -data [imagenet-folder]
```

To train ResNet-50 on 4 GPUs, you will need to use the [`-shareGradInput`](#sharegradinput) flag:

```bash
th main.lua -depth 50 -nGPU 4 -nThreads 8 -batchSize 256 -shareGradInput -data [imagenet-folder]
```

To train ResNet-101 or ResNet-152 with batch size 256, you may need 8 GPUs:

```bash
th main.lua -depth 152 -nGPU 8 -nThreads 12 -batchSize 256 -shareGradInput -data [imagenet-folder]
```
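
Assuming the global batch is split evenly across GPUs by the data-parallel container (an assumption about the setup, not something documented here), the per-GPU batch shrinks as `-nGPU` grows:

```bash
# Per-GPU batch size under even data-parallel splitting (assumption).
BATCH=256
NGPU=8
echo $(( BATCH / NGPU ))   # prints 32
```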

## Useful flags

For a complete list of flags, run `th main.lua --help`.

### shareGradInput

The `-shareGradInput` flag enables sharing of `gradInput` tensors between modules of the same type. This reduces
memory usage. It works correctly with the included ResNet models, but may not work for other network architectures. See
[models/init.lua](models/init.lua#L37-L55) for the implementation.

### shortcutType

The `-shortcutType` flag selects the type of shortcut connection. The [ResNet paper](http://arxiv.org/abs/1512.03385) describes three different shortcut types:
- `A`: identity shortcut with zero-padding for increasing dimensions. This is used for all CIFAR-10 experiments.
- `B`: identity shortcut with 1x1 convolutions for increasing dimensions. This is used for most ImageNet experiments.
- `C`: 1x1 convolutions for all shortcut connections.
