Tarantella is an open-source, distributed Deep Learning framework built on top of TensorFlow, providing scalable Deep Neural Network training on CPU and GPU compute clusters.
Tarantella offers an easy-to-use data parallel solution for speeding up the training of TensorFlow models. It provides full support for the TensorFlow Keras and Dataset APIs, allowing users to efficiently harness large numbers of compute resources without requiring any knowledge of parallel computing.
Tarantella is designed to meet the following goals:
- ease of use
- synchronous training scheme
- seamless integration with existing Keras models
- support for GPU and CPU systems
- strong scalability
To get started, you only need to add two lines of code to enable data parallel training for your Keras model.
Take a look at the highlighted lines in the following code snippet:
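As a minimal sketch, the two added lines are importing Tarantella and wrapping the Keras model with `tnt.Model`; the model architecture, optimizer, and `train_dataset` below are illustrative placeholders, not part of Tarantella itself:

```python
import tensorflow as tf
from tensorflow import keras

import tarantella as tnt               # added line 1: import Tarantella

# Build a standard Keras model as usual (architecture is illustrative)
model = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    keras.layers.Dense(10, activation="softmax"),
])
model = tnt.Model(model)               # added line 2: enable data parallel training

# Compile and train exactly as with plain Keras; Tarantella distributes
# the dataset across ranks and synchronizes gradients during training.
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_dataset, epochs=3)     # train_dataset: an assumed tf.data.Dataset
```

Everything else — compilation, callbacks, evaluation — keeps the familiar Keras workflow.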
That's it!
All the necessary steps to distribute training and datasets will now be automatically handled by Tarantella. A full version of the above example can be found here.
Now simply run distributed training by executing one of the following commands:
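A sketch of typical invocations, assuming Tarantella's `tarantella` launcher is installed and on your `PATH`; the script name `model.py`, the hostfile name, and the exact flags are illustrative, so consult the technical docs for the authoritative options:

```shell
# Run on the local machine, using the locally available devices
tarantella -- model.py

# Run across the machines listed in a hostfile
tarantella --hostfile hostfile -- model.py
```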
Detailed instructions and configuration options are provided in the technical docs.
To build Tarantella from source, check out the installation guide.
Tarantella relies on the following dependencies:
- TensorFlow (version 2.4 or later)
- GaspiCxx (version 1.2.0)
- GPI-2 (version 1.5.0 or later)
Tarantella is licensed under the GPL-3.0 License. See LICENSE for details.