Merge pull request #636 from mv1388/AMP-native-docu-update
Native AMP docu update
mv1388 committed Oct 31, 2020
2 parents 5d94645 + 96051c9 commit c09a176
Showing 3 changed files with 26 additions and 20 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -163,7 +163,7 @@ custom AMP `GradScaler` initialization parameters, these should be provided as a
`use_amp={'init_scale': 2.**16, 'growth_factor': 2.0, ...}` to the TrainLoop.
All AMP initializations and training related steps are then handled automatically by the TrainLoop.
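
For illustration, a minimal sketch of a TrainLoop call that passes custom `GradScaler` options through `use_amp`; the model, data loaders, optimizer and criterion are assumed to be defined beforehand, as in the examples below:

```python
from aitoolbox.torchtrain.train_loop import *

# The dict entries are GradScaler initialization parameters; TrainLoop uses them
# when it sets up mixed precision training.
tl = TrainLoop(model,
               train_loader, val_loader, test_loader,
               optimizer, criterion,
               use_amp={'init_scale': 2.**16, 'growth_factor': 2.0})
model = tl.fit(num_epochs=10)
```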

You can read more about different details in the
You can read more about different AMP details in the
[PyTorch AMP documentation](https://pytorch.org/docs/stable/notes/amp_examples.html).

### Single-GPU mixed precision training
@@ -188,6 +188,8 @@ All the user has to do is set accordingly the `use_amp` parameter of the TrainLo
parameter to `'ddp'`.
Under the hood, TrainLoop will initialize the model and the optimizer for AMP and start training using
the DistributedDataParallel approach.

Example of multi-GPU AMP setup:
```python
from aitoolbox.torchtrain.train_loop import *

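# What follows is only a sketch of the multi-GPU AMP call, mirroring the
# DistributedDataParallel example in the torchtrain docs further below;
# the model, data loaders, optimizer and criterion are assumed to be
# defined above this point.
import torch

tl = TrainLoop(model,
               train_loader, val_loader, test_loader,
               optimizer, criterion,
               gpu_mode='ddp',
               use_amp=True)
model = tl.fit(num_epochs=10,
               num_nodes=1, node_rank=0, num_gpus=torch.cuda.device_count())
```
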
2 changes: 1 addition & 1 deletion docs/source/torchtrain.rst
@@ -16,5 +16,5 @@ results are either stored only locally or if desired also automatically synced t
torchtrain/schedulers
torchtrain/multi_loss_opti
torchtrain/parallel
torchtrain/apex_training
torchtrain/amp_training
torchtrain/advanced
@@ -1,23 +1,25 @@
APEX Mixed Precision Training
=============================
Automatic Mixed Precision Training
==================================

All the TrainLoop versions also support training with **Automatic Mixed Precision** (*AMP*) using
the `Nvidia apex <https://github.com/NVIDIA/apex>`_ extension. To use this feature the user first has to install
the Nvidia apex library (`installation instructions <https://github.com/NVIDIA/apex#linux>`_).
All the TrainLoop versions also support training with Automatic Mixed Precision (*AMP*). In the past this required
using the `Nvidia apex <https://github.com/NVIDIA/apex>`_ extension, but from *PyTorch 1.6* onwards AMP functionality
is built into core PyTorch and no separate installation is needed.
The current version of AIToolbox already supports the use of the built-in PyTorch AMP.

The user only has to set the TrainLoop parameter ``use_amp`` to ``use_amp=True`` in order to use the default
AMP initialization and start training the model in the mixed precision mode. If the user wants to specify custom
AMP initialization parameters, these should be provided as a dict parameter ``use_amp={'opt_level': 'O1'}`` to
the TrainLoop. All AMP initializations and training related steps are then handled automatically by the TrainLoop.
AMP initialization and start training the model in the mixed precision mode. If the user wants to specify
custom AMP ``GradScaler`` initialization parameters, these should be provided as a dict parameter
``use_amp={'init_scale': 2.**16, 'growth_factor': 2.0, ...}`` to the TrainLoop.
All AMP initializations and training related steps are then handled automatically by the TrainLoop.
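
For context, this is roughly what those automated steps look like when written by hand with the native
PyTorch AMP API; the sketch below assumes a generic ``model``, ``criterion``, ``optimizer`` and
``train_loader`` (with tensors already on a CUDA device) rather than any AIToolbox-specific objects.

.. code-block:: python

    from torch.cuda.amp import GradScaler, autocast

    scaler = GradScaler()
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        # Run the forward pass and loss computation in mixed precision
        with autocast():
            loss = criterion(model(inputs), targets)
        # Scale the loss before backprop, then step the optimizer and update the scaler
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()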

You can read more about different AMP optimization levels in the
`official Nvidia apex documentation <https://nvidia.github.io/apex/amp.html#opt-levels-and-properties>`_.
You can read more about different AMP details in the
`PyTorch AMP documentation <https://pytorch.org/docs/stable/notes/amp_examples.html>`_.


Single-GPU mixed precision training
-----------------------------------

Example of single-GPU APEX setup:
Example of single-GPU AMP setup:

.. code-block:: python
@@ -36,14 +38,14 @@ Example of single-GPU APEX setup:
tl = TrainLoop(model,
train_loader, val_loader, test_loader,
optimizer, criterion,
use_amp={'opt_level': 'O1'})
use_amp=True)
model = tl.fit(num_epochs=10)
Check out a full
`Apex AMP training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/apex_amp_training/apex_single_GPU_training.py#L83>`_.
`AMP single-GPU training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/amp_training/single_GPU_training.py>`_.


Multi-GPU DDP mixed precision training
@@ -53,7 +55,9 @@ When training in the multi-GPU setting, the setup is mostly the same as in the s
All the user has to do is set the ``use_amp`` parameter of the TrainLoop accordingly and switch its ``gpu_mode``
parameter to ``'ddp'``.
Under the hood, TrainLoop will initialize the model and the optimizer for AMP and start training using the
DistributedDataParallel approach (DDP is currently only multi-GPU training setup supported by Apex AMP).
DistributedDataParallel approach.

Example of multi-GPU AMP setup:

.. code-block:: python
@@ -73,12 +77,12 @@ DistributedDataParallel approach (DDP is currently only multi-GPU training setup
train_loader, val_loader, test_loader,
optimizer, criterion,
gpu_mode='ddp',
use_amp={'opt_level': 'O1'})
use_amp=True)
model = tl.fit(num_epochs=10,
num_nodes=1, node_rank=0, num_gpus=torch.cuda.device_count())
Check out a full
`Apex AMP DistributedDataParallel training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/apex_amp_training/apex_mutli_GPU_training.py#L86>`_.
`AMP multi-GPU DistributedDataParallel training example
<https://github.com/mv1388/aitoolbox/blob/master/examples/amp_training/mutli_GPU_training.py>`_.
