Permalink
Please sign in to comment.
Browse files
drop performance + hardware page and switch to sheet
simpler to read and update
- Loading branch information...
Showing
with
5 additions
and 77 deletions.
- +5 −4 docs/index.md
- +0 −73 docs/performance_hardware.md
| @@ -1,73 +0,0 @@ | ||
| ---- | ||
| -title: Performance and Hardware Configuration | ||
| ---- | ||
| - | ||
| -# Performance and Hardware Configuration | ||
| - | ||
| -To measure performance on different NVIDIA GPUs we use CaffeNet, the Caffe reference ImageNet model. | ||
| - | ||
| -For training, each time point is 20 iterations/minibatches of 256 images for 5,120 images total. For testing, a 50,000 image validation set is classified. | ||
| - | ||
| -**Acknowledgements**: BAIR members are very grateful to NVIDIA for providing several GPUs to conduct this research. | ||
| - | ||
| -## NVIDIA K40 | ||
| - | ||
| -Performance is best with ECC off and boost clock enabled. While ECC makes a negligible difference in speed, disabling it frees ~1 GB of GPU memory. | ||
| - | ||
| -Best settings with ECC off and maximum clock speed in standard Caffe: | ||
| - | ||
| -* Training is 26.5 secs / 20 iterations (5,120 images) | ||
| -* Testing is 100 secs / validation set (50,000 images) | ||
| - | ||
| -Best settings with Caffe + [cuDNN acceleration](http://nvidia.com/cudnn): | ||
| - | ||
| -* Training is 19.2 secs / 20 iterations (5,120 images) | ||
| -* Testing is 60.7 secs / validation set (50,000 images) | ||
| - | ||
| -Other settings: | ||
| - | ||
| -* ECC on, max speed: training 26.7 secs / 20 iterations, test 101 secs / validation set | ||
| -* ECC on, default speed: training 31 secs / 20 iterations, test 117 secs / validation set | ||
| -* ECC off, default speed: training 31 secs / 20 iterations, test 118 secs / validation set | ||
| - | ||
| -### K40 configuration tips | ||
| - | ||
| -For maximum K40 performance, turn off ECC and boost the clock speed (at your own risk). | ||
| - | ||
| -To turn off ECC, do | ||
| - | ||
| - sudo nvidia-smi -i 0 --ecc-config=0 # repeat with -i x for each GPU ID | ||
| - | ||
| -then reboot. | ||
| - | ||
| -Set the "persistence" mode of the GPU settings by | ||
| - | ||
| - sudo nvidia-smi -pm 1 | ||
| - | ||
| -and then set the clock speed with | ||
| - | ||
| - sudo nvidia-smi -i 0 -ac 3004,875 # repeat with -i x for each GPU ID | ||
| - | ||
| -but note that this configuration resets across driver reloading / rebooting. Include these commands in a boot script to initialize these settings. For a simple fix, add these commands to `/etc/rc.local` (on Ubuntu). | ||
| - | ||
| -## NVIDIA Titan | ||
| - | ||
| -Training: 26.26 secs / 20 iterations (5,120 images). | ||
| -Testing: 100 secs / validation set (50,000 images). | ||
| - | ||
| -cuDNN Training: 20.25 secs / 20 iterations (5,120 images). | ||
| -cuDNN Testing: 66.3 secs / validation set (50,000 images). | ||
| - | ||
| - | ||
| -## NVIDIA K20 | ||
| - | ||
| -Training: 36.0 secs / 20 iterations (5,120 images). | ||
| -Testing: 133 secs / validation set (50,000 images). | ||
| - | ||
| -## NVIDIA GTX 770 | ||
| - | ||
| -Training: 33.0 secs / 20 iterations (5,120 images). | ||
| -Testing: 129 secs / validation set (50,000 images). | ||
| - | ||
| -cuDNN Training: 24.3 secs / 20 iterations (5,120 images). | ||
| -cuDNN Testing: 104 secs / validation set (50,000 images). |
0 comments on commit
3562698