Trainer: auto default (#16847)
carmocca committed Feb 23, 2023
1 parent d486f94 commit 0130273
Showing 21 changed files with 336 additions and 192 deletions.
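
In practice, this change makes the bare ``Trainer()`` constructor select the accelerator, devices, and strategy automatically. A minimal sketch of the new equivalence, based on the documentation updated below (the explicit form is shown only for illustration):

.. code-block:: python

    from pytorch_lightning import Trainer

    # with this change, the bare constructor...
    trainer = Trainer()
    # ...is equivalent to spelling out the auto defaults explicitly
    trainer = Trainer(accelerator="auto", devices="auto", strategy="auto")
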
39 changes: 20 additions & 19 deletions docs/source-pytorch/accelerators/gpu_basic.rst
@@ -14,30 +14,31 @@ A Graphics Processing Unit (GPU) is a specialized hardware accelerator designed

----

Train on 1 GPU
--------------

Make sure you're running on a machine with at least one GPU. There's no need to specify any NVIDIA flags
as Lightning will do it for you.

.. testcode::
:skipif: torch.cuda.device_count() < 1

trainer = Trainer(accelerator="gpu", devices=1)

----------------


.. _multi_gpu:

Train on multiple GPUs
----------------------
Train on GPUs
-------------

To use multiple GPUs, set the number of devices in the Trainer or the index of the GPUs.
The Trainer will run on all available GPUs by default. Make sure you're running on a machine with at least one GPU.
There's no need to specify any NVIDIA flags as Lightning will do it for you.

.. code::
.. code-block:: python
# run on as many GPUs as available by default
trainer = Trainer(accelerator="auto", devices="auto", strategy="auto")
# equivalent to
trainer = Trainer()
trainer = Trainer(accelerator="gpu", devices=4)
# run on one GPU
trainer = Trainer(accelerator="gpu", devices=1)
# run on multiple GPUs
trainer = Trainer(accelerator="gpu", devices=8)
# choose the number of devices automatically
trainer = Trainer(accelerator="gpu", devices="auto")
.. note::
Setting ``accelerator="gpu"`` will also automatically choose the "mps" device on Apple silicon GPUs.
If you want to avoid this, you can set ``accelerator="cuda"`` instead.
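
As an illustrative sketch (not part of this diff), requesting a specific GPU backend instead of the generic ``"gpu"`` alias:

.. code-block:: python

    from pytorch_lightning import Trainer

    # "gpu" resolves to whichever GPU backend the machine has (CUDA or MPS);
    # "cuda" requests NVIDIA GPUs explicitly and will never select MPS
    trainer = Trainer(accelerator="cuda", devices=1)
    # likewise, "mps" requests the Apple silicon backend explicitly
    trainer = Trainer(accelerator="mps", devices=1)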

Choosing GPU devices
^^^^^^^^^^^^^^^^^^^^
40 changes: 16 additions & 24 deletions docs/source-pytorch/accelerators/hpu_basic.rst
@@ -25,25 +25,30 @@ For more information, check out `Gaudi Architecture <https://docs.habana.ai/en/l

----

Run on 1 Gaudi
--------------
Run on Gaudi
------------

To enable PyTorch Lightning to utilize the HPU accelerator, simply provide the ``accelerator="hpu"`` parameter to the Trainer class.

.. code-block:: python
trainer = Trainer(accelerator="hpu", devices=1)
----
# run on as many Gaudi devices as available by default
trainer = Trainer(accelerator="auto", devices="auto", strategy="auto")
# equivalent to
trainer = Trainer()
Run on multiple Gaudis
----------------------
The ``devices=8`` and ``accelerator="hpu"`` parameters to the Trainer class enables the Habana accelerator for distributed training with 8 Gaudis.
It uses :class:`~pytorch_lightning.strategies.hpu_parallel.HPUParallelStrategy` internally which is based on DDP strategy with the addition of Habana's collective communication library (HCCL) to support scale-up within a node and scale-out across multiple nodes.
# run on one Gaudi device
trainer = Trainer(accelerator="hpu", devices=1)
# run on multiple Gaudi devices
trainer = Trainer(accelerator="hpu", devices=8)
# choose the number of devices automatically
trainer = Trainer(accelerator="hpu", devices="auto")
.. code-block:: python
trainer = Trainer(devices=8, accelerator="hpu")
The ``devices>1`` parameter with HPUs enables the Habana accelerator for distributed training.
It uses :class:`~pytorch_lightning.strategies.hpu_parallel.HPUParallelStrategy` internally which is based on DDP
strategy with the addition of Habana's collective communication library (HCCL) to support scale-up within a node and
scale-out across multiple nodes.
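
As a hedged sketch (not part of this diff, and assuming a machine with Gaudi devices and the Habana integration installed), the same configuration with the strategy spelled out via the class referenced above:

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.strategies.hpu_parallel import HPUParallelStrategy

    # explicit form of what devices>1 with accelerator="hpu" selects automatically
    trainer = Trainer(accelerator="hpu", devices=8, strategy=HPUParallelStrategy())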

----

@@ -81,19 +86,6 @@ On Node 2:
----

Select Gaudis automatically
---------------------------

Lightning can automatically detect the number of Gaudi devices to run on. This setting is enabled by default if the devices argument is missing.

.. code-block:: python
# equivalent
trainer = Trainer(accelerator="hpu")
trainer = Trainer(accelerator="hpu", devices="auto")
----

How to access HPUs
------------------

25 changes: 14 additions & 11 deletions docs/source-pytorch/accelerators/ipu_basic.rst
@@ -24,23 +24,26 @@ See the `Graphcore Glossary <https://docs.graphcore.ai/projects/graphcore-glossa

----

Run on 1 IPU
------------
To use a single IPU, set the accelerator and devices argument.
Run on IPU
----------

.. code-block:: python
trainer = pl.Trainer(accelerator="ipu", devices=1)
----
To enable PyTorch Lightning to utilize the IPU accelerator, simply provide the ``accelerator="ipu"`` parameter to the Trainer class.

Run on multiple IPUs
--------------------
To use multiple IPUs set the devices to a number that is a power of 2 (i.e: 2, 4, 8, 16, ...)

.. code-block:: python
trainer = pl.Trainer(accelerator="ipu", devices=8)
# run on as many IPUs as available by default
trainer = Trainer(accelerator="auto", devices="auto", strategy="auto")
# equivalent to
trainer = Trainer()
# run on one IPU
trainer = Trainer(accelerator="ipu", devices=1)
# run on multiple IPUs
trainer = Trainer(accelerator="ipu", devices=8)
# choose the number of devices automatically
trainer = Trainer(accelerator="ipu", devices="auto")
----

40 changes: 15 additions & 25 deletions docs/source-pytorch/accelerators/tpu_basic.rst
@@ -32,36 +32,26 @@ some subset of those 2048 cores.

----

Run on 1 TPU core
-----------------
Enable the following Trainer arguments to run on 1 TPU.

.. code::
trainer = Trainer(accelerator="tpu", devices=1)
----

Run on multiple TPU cores
-------------------------
For multiple TPU cores, change the value of the devices flag.

.. code::
trainer = Trainer(accelerator="tpu", devices=8)
----

Run on a specific TPU core
--------------------------
Run on TPU cores
----------------

To run on a specific core, specify the index of the TPU core.
To run on different cores, modify the ``devices`` argument.

.. code-block:: python
trainer = pl.Trainer(accelerator="tpu", devices=[5])
# run on as many TPUs as available by default
trainer = Trainer(accelerator="auto", devices="auto", strategy="auto")
# equivalent to
trainer = Trainer()
This example runs on the 5th core, not on five cores.
# run on one TPU core
trainer = Trainer(accelerator="tpu", devices=1)
# run on multiple TPU cores
trainer = Trainer(accelerator="tpu", devices=8)
# run on the 5th core
trainer = Trainer(accelerator="tpu", devices=[5])
# choose the number of cores automatically
trainer = Trainer(accelerator="tpu", devices="auto")
----

4 changes: 2 additions & 2 deletions docs/source-pytorch/common/trainer.rst
@@ -200,7 +200,7 @@ as well as custom accelerator instances.
# Training with GPU Accelerator using the DistributedDataParallel strategy
trainer = Trainer(devices=4, accelerator="gpu", strategy="ddp")
.. note:: The ``"auto"`` option recognizes the machine you are on, and selects the respective ``Accelerator``.
.. note:: The ``"auto"`` option recognizes the machine you are on, and selects the appropriate ``Accelerator``.

.. code-block:: python
@@ -417,7 +417,7 @@ Number of devices to train on (``int``), which devices to train on (``list`` or

.. code-block:: python
# If your machine has GPUs, it will use all the available GPUs for training
# Use whatever hardware your machine has available
trainer = Trainer(devices="auto", accelerator="auto")
# Training with CPU Accelerator using 1 process
4 changes: 4 additions & 0 deletions src/lightning/pytorch/CHANGELOG.md
@@ -52,6 +52,10 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Changed


- The `Trainer` now chooses `accelerator="auto", strategy="auto", devices="auto"` as defaults ([#16847](https://github.com/Lightning-AI/lightning/pull/16847))


- "Native" suffix removal ([#16490](https://github.com/Lightning-AI/lightning/pull/16490))
* `strategy="fsdp_native"` is now `strategy="fsdp"`
* `strategy="fsdp_native_full_shard_offload"` is now `strategy="fsdp_cpu_offload"`
