Fix missing parameter in docs
vtantia committed Nov 5, 2021
1 parent 68ff8f1 commit 4f90d23
Showing 1 changed file with 4 additions and 4 deletions.
docs/source/tutorials/slowmo_ddp.rst (8 changes: 4 additions & 4 deletions)
@@ -12,9 +12,9 @@ the same.
 If you have code that is setup to use Distributed Data Parallel, using SlowMo Distributed Data Parallel
 is simply replacing the DDP call with a call to
 ``fairscale.experimental.nn.data_parallel.SlowMoDistributedDataParallel``, adding a
-``model.perform_slowmo(optimizer)`` call after ``optimizer.step()``, and moving the ``model.zero_grad()``
-to be after ``optimizer.step()``, as follows. The different points at which ``use_slowmo`` is used
-below help demonstrate these changes:
+``model.perform_slowmo(optimizer)`` call after ``optimizer.step()``, and moving the
+``model.zero_grad(set_to_none=True)`` to be after ``optimizer.step()``, as follows.
+The different points at which ``use_slowmo`` is used below help demonstrate these changes:
 
 .. code-block:: python
@@ -57,7 +57,7 @@ below help demonstrate these changes:
         loss.backward()
         optimizer.step()
         if use_slowmo:
-            model.zero_grad()
+            model.zero_grad(set_to_none=True)
             model.perform_slowmo(optimizer) # SlowMoDDP specific
 
 In the example above, when using SlowMoDDP, we are reducing the total communication between
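For reference, a minimal sketch of the full training loop that the patched docs describe, assuming the distributed process group is already initialized; ``MyModel``, ``loss_fn``, and ``dataloader`` are placeholder names, not part of the docs:

.. code-block:: python

    import torch
    from fairscale.experimental.nn.data_parallel import SlowMoDistributedDataParallel

    # SlowMoDistributedDataParallel replaces the usual torch DDP wrapper.
    model = SlowMoDistributedDataParallel(MyModel().cuda())
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for batch, target in dataloader:
        loss = loss_fn(model(batch), target)
        loss.backward()
        optimizer.step()
        model.zero_grad(set_to_none=True)  # moved to after optimizer.step()
        model.perform_slowmo(optimizer)    # SlowMoDDP-specific call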
