-
Merge pull request #5687 from BVLC/readme_list_branches
List branches in readme
-
Merge pull request #5530 from willyd/nccl-py3
Explicit std::string to bp::object conversion
-
Merge pull request #5527 from willyd/nccl-py3
Added support for python 3 and NCCL
-
Merge pull request #5474 from willcrichton/master
Fixed memory leaks in cudnn conv and relu
-
Merge pull request #5408 from cypof/multi_infer
Init test network on all GPUs
-
Merge pull request #5455 from cypof/remove_shared_parallel
Remove missed legacy parallel code
-
Merge pull request #5393 from jfolz/master
Multi GPU training from Python can use any solver
-
Merge pull request #5215 from cypof/fix_restore
Restore can be invoked on rank > 0
-
Merge pull request #5153 from cypof/docker
Docker refresh: simplified & update to 16.04, cuda8, cudnn5, nccl
-
Merge pull request #5075 from tsocha/master
Fix mkl issue #4836
-
cypof committed Nov 23, 2016
-
cypof committed Jan 6, 2017
-
Logging from python, e.g. for lower log level on multi-GPU workers
cypof committed Nov 22, 2016
-
- Parallelize batches among GPUs and tree-reduce the gradients
- The effective batch size scales with the number of devices
- Batch size is multiplied by the number of devices
- Split batches between GPUs, and tree-reduce the gradients
- Detect machine topology (twin-GPU boards, P2P connectivity)
- Track device in syncedmem (thanks @thatguymike)
- Insert a callback in the solver for minimal code change
- Accept list for gpu flag of caffe tool, e.g. '-gpu 0,1' or '-gpu all'.
  Run on default GPU if no ID given.
- Add multi-GPU solver test
- Deterministic architecture for reproducible runs
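The tree reduction and batch-size scaling described in this commit can be illustrated with a small CPU-only sketch. It is not the Caffe implementation: `tree_reduce` and `effective_batch_size` are hypothetical names, gradients are modelled as plain Python lists, and the pairing order is an assumption (the real code chooses pairs from the detected machine topology).

```python
def effective_batch_size(per_device_batch, num_devices):
    """Each device processes its own batch, so the total grows with device count."""
    return per_device_batch * num_devices


def tree_reduce(grads):
    """Sum per-device gradients pairwise, in rounds, into device 0.

    `grads` is a list with one gradient vector (list of numbers) per device.
    Round 1 adds device 1 into 0, 3 into 2, ...; round 2 adds 2 into 0, and so on.
    The pairing order here is illustrative only.
    """
    if not grads:
        raise ValueError("need at least one device")
    bufs = [list(g) for g in grads]  # copy so the caller's data is untouched
    n = len(bufs)
    stride = 1
    while stride < n:
        # Every receiver i pulls from its partner at i + stride.
        for i in range(0, n - stride, 2 * stride):
            src = bufs[i + stride]
            dst = bufs[i]
            for k, value in enumerate(src):
                dst[k] += value
        stride *= 2
    return bufs[0]
```

With four devices the sum completes in two rounds instead of three sequential additions, which is the motivation for reducing along a tree.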
-
Allocate host memory through cudaMallocHost
thanks to discussion by @thatguymike and @flx42
-
Add DataReader for parallel training with one DB session
- Make sure each solver accesses a different subset of the data
- Sequential reading of DB for performance
- Prefetch a configurable amount of data to host memory
- Distribute data to solvers in round-robin way for determinism
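As a rough illustration of the round-robin distribution, the sketch below reads records once, in order, and deals them to per-solver queues. `distribute_round_robin` is a hypothetical helper, not the DataReader API; the background reading thread and the prefetch limit are left out to keep the example short.

```python
from collections import deque


def distribute_round_robin(records, num_solvers):
    """Deal sequentially read records to solvers in a fixed rotation.

    Record i always goes to solver i % num_solvers, so every solver sees a
    disjoint subset and the assignment is the same on every run.
    """
    if num_solvers < 1:
        raise ValueError("need at least one solver")
    queues = [deque() for _ in range(num_solvers)]
    for index, record in enumerate(records):  # single sequential pass
        queues[index % num_solvers].append(record)
    return queues
```

Because the assignment depends only on the record's position, rerunning with the same database and solver count gives each solver the same data.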
-
Change the way threads are started and stopped
- Interrupt the thread before waiting on join
- Provide a method for looping threads to exit on demand
- CHECK if start and stop succeed instead of returning an error
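The start/stop pattern in this commit can be sketched in Python as an analogy. The original code is C++; the class and method names below are hypothetical, a `threading.Event` stands in for thread interruption, and raised exceptions stand in for CHECK failures.

```python
import threading


class LoopingThread:
    """Worker whose loop polls a stop flag so it can exit on demand."""

    def __init__(self):
        self._stop_requested = threading.Event()
        self._thread = None
        self.iterations = 0

    def is_started(self):
        return self._thread is not None and self._thread.is_alive()

    def must_stop(self):
        """Polled by the loop body to learn that it should exit."""
        return self._stop_requested.is_set()

    def start(self):
        # Fail loudly instead of returning an error code.
        if self.is_started():
            raise RuntimeError("thread already started")
        self._stop_requested.clear()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        while not self.must_stop():
            self.iterations += 1
            # Sleep briefly, but wake immediately if a stop is requested.
            self._stop_requested.wait(0.001)

    def stop(self):
        if self._thread is None:
            raise RuntimeError("thread was never started")
        self._stop_requested.set()  # signal first ...
        self._thread.join()         # ... then wait, so join cannot hang
        self._thread = None
```

Setting the flag before calling `join` mirrors the first bullet: a loop that is never told to stop would make the join wait indefinitely.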
-