Deprecate log_gpu_memory, gpu_metrics, and util funcs in favor of DeviceStatsMonitor callback #9921

Merged · 13 commits, Oct 14, 2021
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -341,7 +341,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Deprecated passing `progress_bar_refresh_rate` to the `Trainer` constructor in favor of adding the `ProgressBar` callback with `refresh_rate` directly to the list of callbacks, or passing `enable_progress_bar=False` to disable the progress bar ([#9616](https://github.com/PyTorchLightning/pytorch-lightning/pull/9616))


- Deprecate `LightningDistributed` and move the broadcast logic to `DDPPlugin` and `DDPSpawnPlugin` directly ([#9691](https://github.com/PyTorchLightning/pytorch-lightning/pull/9691))
- Deprecated `LightningDistributed` and move the broadcast logic to `DDPPlugin` and `DDPSpawnPlugin` directly ([#9691](https://github.com/PyTorchLightning/pytorch-lightning/pull/9691))


- Deprecated passing `stochastic_weight_avg` from the `Trainer` constructor in favor of adding the `StochasticWeightAveraging` callback directly to the list of callbacks ([#8989](https://github.com/PyTorchLightning/pytorch-lightning/pull/8989))
@@ -362,6 +362,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Deprecated passing `weights_summary` to the `Trainer` constructor in favor of adding the `ModelSummary` callback with `max_depth` directly to the list of callbacks ([#9699](https://github.com/PyTorchLightning/pytorch-lightning/pull/9699))


- Deprecated `log_gpu_memory`, `gpu_metrics`, and util funcs in favor of `DeviceStatsMonitor` callback ([#9921](https://github.com/PyTorchLightning/pytorch-lightning/pull/9921))


- Deprecated `GPUStatsMonitor` and `XLAStatsMonitor` in favor of `DeviceStatsMonitor` callback ([#9924](https://github.com/PyTorchLightning/pytorch-lightning/pull/9924))

### Removed
32 changes: 1 addition & 31 deletions docs/source/common/trainer.rst
@@ -528,7 +528,7 @@ Example::
checkpoint_callback
^^^^^^^^^^^^^^^^^^^

Deprecated: This has been deprecated in v1.5 and will be removed in v.17. Please use ``enable_checkpointing`` instead.
Deprecated: This has been deprecated in v1.5 and will be removed in v1.7. Please use ``enable_checkpointing`` instead.

default_root_dir
^^^^^^^^^^^^^^^^
@@ -838,36 +838,6 @@ How often to add logging rows (does not write to disk)
See Also:
- :doc:`logging <../extensions/logging>`

log_gpu_memory
^^^^^^^^^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/log_gpu_memory.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/log_gpu_memory.mp4"></video>

|

Options:

- None
- 'min_max'
- 'all'

.. testcode::

# default used by the Trainer
trainer = Trainer(log_gpu_memory=None)

# log all the GPUs (on master node only)
trainer = Trainer(log_gpu_memory="all")

# log only the min and max memory on the master node
trainer = Trainer(log_gpu_memory="min_max")

.. note:: Might slow performance because it uses the output of ``nvidia-smi``.
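For context on what the removed ``'min_max'`` option did: it reduced the per-GPU memory map to its extremes before logging. A minimal stdlib-only sketch of that reduction (the helper name ``min_max_profile`` is ours for illustration, not a Lightning API):

```python
from typing import Dict


def min_max_profile(memory_map: Dict[int, float]) -> Dict[str, float]:
    """Reduce a {device_id: MiB used} map to its extremes, mirroring the
    behaviour of the deprecated log_gpu_memory='min_max' mode."""
    usages = list(memory_map.values())
    return {"min_gpu_mem": min(usages), "max_gpu_mem": max(usages)}


print(min_max_profile({0: 1024.0, 1: 2048.0, 2: 512.0}))
# {'min_gpu_mem': 512.0, 'max_gpu_mem': 2048.0}
```

With the deprecation, this aggregation is no longer the Trainer's job; device statistics are collected by the ``DeviceStatsMonitor`` callback instead.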

logger
^^^^^^

@@ -29,6 +29,11 @@
class LoggerConnector:
def __init__(self, trainer: "pl.Trainer", log_gpu_memory: Optional[str] = None) -> None:
self.trainer = trainer
if log_gpu_memory is not None:
rank_zero_deprecation(
"Setting `log_gpu_memory` with the trainer flag is deprecated in v1.5 and will be removed in v1.7. "
"Please monitor GPU stats with the `DeviceStatsMonitor` callback directly instead."
)
self.log_gpu_memory = log_gpu_memory
self.eval_loop_results: List[_OUT_DICT] = []
self._val_log_step: int = 0
@@ -222,6 +227,7 @@ def update_train_step_metrics(self) -> None:
if self.trainer.fit_loop._should_accumulate() and self.trainer.lightning_module.automatic_optimization:
return

# TODO: remove this call in v1.7
self._log_gpus_metrics()

# when metrics should be logged
@@ -239,6 +245,11 @@ def update_train_epoch_metrics(self) -> None:
self.trainer._results.reset(metrics=True)

def _log_gpus_metrics(self) -> None:
"""
.. deprecated:: v1.5
This function was deprecated in v1.5 in favor of
`pytorch_lightning.accelerators.gpu._get_nvidia_gpu_stats` and will be removed in v1.7.
"""
for key, mem in self.gpus_metrics.items():
if self.log_gpu_memory == "min_max":
self.trainer.lightning_module.log(key, mem, prog_bar=False, logger=True)
@@ -309,6 +320,14 @@ def metrics(self) -> _METRICS:

@property
def gpus_metrics(self) -> Dict[str, float]:
"""
.. deprecated:: v1.5
Will be removed in v1.7.
"""
        rank_zero_deprecation(
            "The property `LoggerConnector.gpus_metrics` was deprecated in v1.5"
            " and will be removed in v1.7. Use the `DeviceStatsMonitor` callback instead."
        )
if self.trainer._device_type == DeviceType.GPU and self.log_gpu_memory:
mem_map = memory.get_memory_profile(self.log_gpu_memory)
self._gpus_metrics.update(mem_map)
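The `rank_zero_deprecation` calls added above follow a common pattern in distributed training code: warn once on the rank-0 process so a multi-GPU run does not print one warning per worker. A simplified stdlib sketch of that pattern (Lightning's real helper lives elsewhere in `pytorch_lightning.utilities`; this version is illustrative only, and the `LOCAL_RANK` check is an assumption about how rank is detected):

```python
import os
import warnings


def rank_zero_deprecation(message: str) -> None:
    """Emit a DeprecationWarning, but only on the rank-0 process, so that
    multi-process (e.g. DDP) runs do not repeat the warning per worker."""
    if int(os.environ.get("LOCAL_RANK", "0")) == 0:
        warnings.warn(message, DeprecationWarning, stacklevel=2)


# Demonstrate that the warning fires exactly once on rank 0.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    rank_zero_deprecation(
        "Setting `log_gpu_memory` with the trainer flag is deprecated in v1.5"
        " and will be removed in v1.7."
    )

print(len(caught), caught[0].category.__name__)
```

The corresponding tests in this PR use `pytest.deprecated_call(match=...)` to assert that exactly this warning is raised.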
6 changes: 5 additions & 1 deletion pytorch_lightning/trainer/trainer.py
@@ -134,7 +134,7 @@ def __init__(
auto_select_gpus: bool = False,
tpu_cores: Optional[Union[List[int], str, int]] = None,
ipus: Optional[int] = None,
log_gpu_memory: Optional[str] = None,
    log_gpu_memory: Optional[str] = None,  # TODO: remove in v1.7
progress_bar_refresh_rate: Optional[int] = None, # TODO: remove in v1.7
enable_progress_bar: bool = True,
overfit_batches: Union[int, float] = 0.0,
@@ -277,6 +277,10 @@ def __init__(

log_gpu_memory: None, 'min_max', 'all'. Might slow performance.

.. deprecated:: v1.5
        Deprecated in v1.5.0 and will be removed in v1.7.0.
Please use the ``DeviceStatsMonitor`` callback directly instead.

log_every_n_steps: How often to log within steps (defaults to every 50 steps).

prepare_data_per_node: If True, each LOCAL_RANK=0 will call prepare data.
14 changes: 12 additions & 2 deletions pytorch_lightning/utilities/memory.py
@@ -96,7 +96,12 @@ def garbage_collection_cuda() -> None:


def get_memory_profile(mode: str) -> Dict[str, float]:
"""Get a profile of the current memory usage.
r"""
.. deprecated:: v1.5
This function was deprecated in v1.5 in favor of
`pytorch_lightning.accelerators.gpu._get_nvidia_gpu_stats` and will be removed in v1.7.

Get a profile of the current memory usage.

Args:
mode: There are two modes:
@@ -124,7 +129,12 @@ def get_memory_profile(mode: str) -> Dict[str, float]:


def get_gpu_memory_map() -> Dict[str, float]:
"""Get the current gpu usage.
r"""
.. deprecated:: v1.5
This function was deprecated in v1.5 in favor of
`pytorch_lightning.accelerators.gpu._get_nvidia_gpu_stats` and will be removed in v1.7.

Get the current gpu usage.

Return:
A dictionary in which the keys are device ids as integers and
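The deprecated `get_gpu_memory_map` shells out to `nvidia-smi --query-gpu=memory.used --format=csv,nounits,noheader` and parses one MiB value per line. A self-contained sketch of just the parsing step, fed a canned sample string so no GPU is needed (the key format approximates the original's; `parse_gpu_memory_map` is our name, not Lightning's):

```python
from typing import Dict


def parse_gpu_memory_map(nvidia_smi_csv: str) -> Dict[str, float]:
    """Parse the noheader/nounits CSV output of nvidia-smi (one 'memory.used'
    value in MiB per line, one line per GPU) into a metrics dict, loosely
    mirroring the deprecated get_gpu_memory_map()."""
    memory_map = {}
    for gpu_id, line in enumerate(nvidia_smi_csv.strip().splitlines()):
        memory_map[f"gpu_id: {gpu_id}/memory.used (MB)"] = float(line.strip())
    return memory_map


sample = "1024\n2048\n"  # canned output for a hypothetical two-GPU node
print(parse_gpu_memory_map(sample))
# {'gpu_id: 0/memory.used (MB)': 1024.0, 'gpu_id: 1/memory.used (MB)': 2048.0}
```

After this PR, the supported path is the `DeviceStatsMonitor` callback, which gathers device statistics through the accelerator rather than having the Trainer parse `nvidia-smi` itself.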
11 changes: 11 additions & 0 deletions tests/deprecated_api/test_remove_1-7.py
@@ -21,6 +21,7 @@
from pytorch_lightning.callbacks.gpu_stats_monitor import GPUStatsMonitor
from pytorch_lightning.callbacks.xla_stats_monitor import XLAStatsMonitor
from pytorch_lightning.loggers import LoggerCollection, TestTubeLogger
from pytorch_lightning.trainer.connectors.logger_connector import LoggerConnector
from tests.deprecated_api import _soft_unimport_module
from tests.helpers import BoringModel
from tests.helpers.datamodules import MNISTDataModule
@@ -370,6 +371,16 @@ def test_v1_7_0_weights_summary_trainer(tmpdir):
t.weights_summary = "blah"


def test_v1_7_0_trainer_log_gpu_memory(tmpdir):
with pytest.deprecated_call(
match="Setting `log_gpu_memory` with the trainer flag is deprecated in v1.5 and will be removed"
):
trainer = Trainer(log_gpu_memory="min_max")
with pytest.deprecated_call(match="The property `LoggerConnector.gpus_metrics` was deprecated in v1.5"):
lg = LoggerConnector(trainer)
_ = lg.gpus_metrics


@RunIf(min_gpus=1)
def test_v1_7_0_deprecate_gpu_stats_monitor(tmpdir):
with pytest.deprecated_call(match="The `GPUStatsMonitor` callback was deprecated in v1.5"):