Implementation and evaluation of a data parallel, synchronous SGD to train CNNs on image classification datasets

fluegelk/ATPC-Data-parallel-Training-of-Neural-Networks

Data Parallel Training of Neural Networks

Implementation and evaluation of Synchronous SGD with All-Reduce, a data parallel algorithm to train CNNs on image classification datasets.
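
As a rough illustration of the idea (not the code in this repository), synchronous SGD with all-reduce can be sketched in a single process using only the Python standard library. Every simulated worker computes a gradient on its own data shard, the gradients are averaged across workers, and each worker applies the same averaged update to its model replica. The toy linear model, the helper names (`local_gradient`, `all_reduce_mean`, `train`), and the hyperparameters are assumptions made for this sketch. In a real MPI setup the averaging step would be a collective call such as mpi4py's `Comm.Allreduce` followed by a division by the number of ranks.

```python
import random


def local_gradient(w, b, shard):
    """Gradient of the mean squared error of y ~ w*x + b on one worker's shard."""
    n = len(shard)
    gw = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    gb = sum(2 * (w * x + b - y) for x, y in shard) / n
    return [gw, gb]


def all_reduce_mean(per_worker):
    """Stand-in for an MPI all-reduce: every worker gets the element-wise mean."""
    k = len(per_worker)
    mean = [sum(vals) / k for vals in zip(*per_worker)]
    return [list(mean) for _ in per_worker]


def train(shards, steps=500, lr=0.1):
    # Each worker holds its own replica [w, b]; all start from the same values.
    replicas = [[0.0, 0.0] for _ in shards]
    for _ in range(steps):
        # 1. Every worker computes a gradient on its local shard only.
        grads = [local_gradient(w, b, shard)
                 for (w, b), shard in zip(replicas, shards)]
        # 2. Synchronisation point: gradients are averaged across all workers.
        reduced = all_reduce_mean(grads)
        # 3. Every worker applies the identical update, so replicas stay in sync.
        for params, g in zip(replicas, reduced):
            params[0] -= lr * g[0]
            params[1] -= lr * g[1]
    return replicas


# Toy data: y = 3x + 1, split evenly across 4 simulated workers.
rng = random.Random(0)
data = [(x, 3.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(64))]
shards = [data[i::4] for i in range(4)]
replicas = train(shards)
print(replicas[0])
```

Because the shards have equal size, the averaged gradient equals the gradient over the full batch, so the result matches single-worker SGD with a batch four times as large.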

Implemented with Python3, PyTorch, and MPI4Py.
Created as part of the seminar Advanced Topics in Parallel Computing 2019 (http://www.scc.kit.edu/en/teaching/11673.php).

The "implementation" subdirectory contains a separate Readme on how to run the code.

The "evaluation" subdirectory contains several R scripts for evaluating the output generated by the training. Note that a "device" column and a "machine" column currently have to be added manually to the results__*__summary files.
