Merge pull request #636 from mv1388/AMP-native-docu-update
Native AMP docu update
mv1388 committed Oct 31, 2020
2 parents 5d94645 + 96051c9 commit c09a176
Showing 3 changed files with 26 additions and 20 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -163,7 +163,7 @@ custom AMP `GradScaler` initialization parameters, these should be provided as a
`use_amp={'init_scale': 2.**16, 'growth_factor': 2.0, ...}` to the TrainLoop.
All AMP initializations and training related steps are then handled automatically by the TrainLoop.
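
For illustration, a minimal sketch of a TrainLoop call that passes custom `GradScaler` options through `use_amp`; the model, data loaders, optimizer and criterion are assumed to be defined beforehand, as in the examples below:

```python
from aitoolbox.torchtrain.train_loop import *

# The dict entries are GradScaler initialization parameters; TrainLoop uses them
# when it sets up mixed precision training.
tl = TrainLoop(model,
               train_loader, val_loader, test_loader,
               optimizer, criterion,
               use_amp={'init_scale': 2.**16, 'growth_factor': 2.0})
model = tl.fit(num_epochs=10)
```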

You can read more about different details in the
You can read more about different AMP details in the
[PyTorch AMP documentation](https://pytorch.org/docs/stable/notes/amp_examples.html).

### Single-GPU mixed precision training
@@ -188,6 +188,8 @@ All the user has to do is set accordingly the `use_amp` parameter of the TrainLo
parameter to `'ddp'`.
Under the hood, TrainLoop will initialize the model and the optimizer for AMP and start training using
the DistributedDataParallel approach.

Example of multi-GPU AMP setup:
```python
from aitoolbox.torchtrain.train_loop import *

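# What follows is only a sketch of the multi-GPU AMP call, mirroring the
# DistributedDataParallel example in the torchtrain docs further below;
# the model, data loaders, optimizer and criterion are assumed to be
# defined above this point.
import torch

tl = TrainLoop(model,
               train_loader, val_loader, test_loader,
               optimizer, criterion,
               gpu_mode='ddp',
               use_amp=True)
model = tl.fit(num_epochs=10,
               num_nodes=1, node_rank=0, num_gpus=torch.cuda.device_count())
```
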
2 changes: 1 addition & 1 deletion docs/source/torchtrain.rst
@@ -16,5 +16,5 @@ results are either stored only locally or if desired also automatically synced t
torchtrain/schedulers
torchtrain/multi_loss_opti
torchtrain/parallel
torchtrain/apex_training
torchtrain/amp_training
torchtrain/advanced
@@ -1,23 +1,25 @@
APEX Mixed Precision Training
=============================
Automatic Mixed Precision Training
==================================

All the TrainLoop versions also support training with **Automatic Mixed Precision** (*AMP*) using
the `Nvidia apex <https://github.com/NVIDIA/apex>`_ extension. To use this feature the user first has to install
the Nvidia apex library (`installation instructions <https://github.com/NVIDIA/apex#linux>`_).
All the TrainLoop versions also support training with Automatic Mixed Precision (*AMP*). In the past this required
using the `Nvidia apex <https://github.com/NVIDIA/apex>`_ extension, but from *PyTorch 1.6* onwards AMP functionality
is built into core PyTorch and no separate installation is needed.
The current version of AIToolbox already supports the use of the built-in PyTorch AMP.

The user only has to set the TrainLoop parameter ``use_amp`` to ``use_amp=True`` in order to use the default
AMP initialization and start training the model in the mixed precision mode. If the user wants to specify custom
AMP initialization parameters, these should be provided as a dict parameter ``use_amp={'opt_level': 'O1'}`` to
the TrainLoop. All AMP initializations and training related steps are then handled automatically by the TrainLoop.
AMP initialization and start training the model in the mixed precision mode. If the user wants to specify
custom AMP ``GradScaler`` initialization parameters, these should be provided as a dict parameter
``use_amp={'init_scale': 2.**16, 'growth_factor': 2.0, ...}`` to the TrainLoop.
All AMP initializations and training related steps are then handled automatically by the TrainLoop.
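
For context, this is roughly what those automated steps look like when written by hand with the native
PyTorch AMP API; the sketch below assumes a generic ``model``, ``criterion``, ``optimizer`` and
``train_loader`` (with tensors already on a CUDA device) rather than any AIToolbox-specific objects.

.. code-block:: python

    from torch.cuda.amp import GradScaler, autocast

    scaler = GradScaler()
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        # Run the forward pass and loss computation in mixed precision
        with autocast():
            loss = criterion(model(inputs), targets)
        # Scale the loss before backprop, then step the optimizer and update the scaler
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()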

You can read more about different AMP optimization levels in the
`official Nvidia apex documentation <https://nvidia.github.io/apex/amp.html#opt-levels-and-properties>`_.
You can read more about different AMP details in the
`PyTorch AMP documentation <https://pytorch.org/docs/stable/notes/amp_examples.html>`_.


Single-GPU mixed precision training
-----------------------------------

Example of single-GPU APEX setup:
Example of single-GPU AMP setup:

.. code-block:: python
@@ -36,14 +38,14 @@ Example of single-GPU APEX setup:
tl = TrainLoop(model,
train_loader, val_loader, test_loader,
optimizer, criterion,
use_amp={'opt_level': 'O1'})
use_amp=True)
model = tl.fit(num_epochs=10)
Check out a full
`Apex AMP training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/apex_amp_training/apex_single_GPU_training.py#L83>`_.
`AMP single-GPU training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/amp_training/single_GPU_training.py>`_.


Multi-GPU DDP mixed precision training
@@ -53,7 +55,9 @@ When training in the multi-GPU setting, the setup is mostly the same as in the s
All the user has to do is set the ``use_amp`` parameter of the TrainLoop accordingly and switch its ``gpu_mode``
parameter to ``'ddp'``.
Under the hood, TrainLoop will initialize the model and the optimizer for AMP and start training using the
DistributedDataParallel approach (DDP is currently only multi-GPU training setup supported by Apex AMP).
DistributedDataParallel approach.

Example of multi-GPU AMP setup:

.. code-block:: python
@@ -73,12 +77,12 @@ DistributedDataParallel approach (DDP is currently only multi-GPU training setup
train_loader, val_loader, test_loader,
optimizer, criterion,
gpu_mode='ddp',
use_amp={'opt_level': 'O1'})
use_amp=True)
model = tl.fit(num_epochs=10,
num_nodes=1, node_rank=0, num_gpus=torch.cuda.device_count())
Check out a full
`Apex AMP DistributedDataParallel training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/apex_amp_training/apex_mutli_GPU_training.py#L86>`_.
`AMP multi-GPU DistributedDataParallel training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/amp_training/mutli_GPU_training.py>`_.
