Only track dev debugger events if enabled #7875

ananthsub · 2021-06-08T04:57:12Z

Is this debugger used only for tests? Should we look at removing this dependency and rebuilding event tracking later on?

Not even. It was used in the past for debugging, but we found it was risky to rely on it for testing.
I am not sure of its utility right now, and it could be deprecated.

These are all the usages of the dev debugger in tests.

tests/callbacks/test_early_stopping.py:111: assert len(trainer.dev_debugger.early_stopping_history) == expected_count
tests/callbacks/test_stochastic_weight_avg.py:100: assert trainer.dev_debugger.count_events(
tests/plugins/test_rpc_sequential_plugin.py:46: assert len(trainer.dev_debugger.pbar_added_metrics) > 0
tests/plugins/test_rpc_sequential_plugin.py:91: assert len(trainer.dev_debugger.pbar_added_metrics) > 0
tests/models/test_amp.py:216: assert trainer.dev_debugger.count_events('AMP') == 0
tests/models/test_amp.py:249: assert trainer.dev_debugger.count_events('AMP') == 10
tests/checkpointing/test_model_checkpoint.py:143: lr_scheduler_debug = trainer.dev_debugger.saved_lr_scheduler_updates
tests/checkpointing/test_model_checkpoint.py:249: lr_scheduler_debug = trainer.dev_debugger.saved_lr_scheduler_updates
tests/checkpointing/test_model_checkpoint.py:867: assert len(trainer.dev_debugger.checkpoint_callback_history) == 3
tests/checkpointing/test_model_checkpoint.py:954: assert trainer.dev_debugger.checkpoint_callback_history[-1]['epoch'] == len(monitor) - 1
tests/checkpointing/test_checkpoint_callback_frequency.py:38: assert len(trainer.dev_debugger.checkpoint_callback_history) == 0
tests/checkpointing/test_checkpoint_callback_frequency.py:47: assert len(trainer.dev_debugger.checkpoint_callback_history) == 0
tests/trainer/test_dataloaders.py:527: assert len(trainer.dev_debugger.num_seen_val_check_batches) == num_val_dataloaders
tests/trainer/test_dataloaders.py:528: for dataloader_idx, num_batches in trainer.dev_debugger.num_seen_val_check_batches.items():
tests/trainer/test_dataloaders.py:532: assert len(trainer.dev_debugger.num_seen_test_check_batches) == num_test_dataloaders
tests/trainer/test_dataloaders.py:533: for dataloader_idx, num_batches in trainer.dev_debugger.num_seen_test_check_batches.items():
tests/trainer/test_dataloaders.py:594: assert trainer.dev_debugger.num_seen_sanity_check_batches == trainer.num_sanity_val_steps * num_val_dataloaders
tests/trainer/test_dataloaders.py:1282: assert len(trainer.dev_debugger.val_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1283: assert len(trainer.dev_debugger.test_dataloader_calls) == 0
tests/trainer/test_dataloaders.py:1284: assert len(trainer.dev_debugger.train_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1287: calls = trainer.dev_debugger.dataloader_sequence_calls
tests/trainer/test_dataloaders.py:1315: assert len(trainer.dev_debugger.val_dataloader_calls) == 10
tests/trainer/test_dataloaders.py:1316: assert len(trainer.dev_debugger.test_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1317: assert len(trainer.dev_debugger.train_dataloader_calls) == 3
tests/trainer/test_dataloaders.py:1320: calls = trainer.dev_debugger.dataloader_sequence_calls
tests/trainer/test_dataloaders.py:1357: assert len(trainer.dev_debugger.val_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1358: assert len(trainer.dev_debugger.test_dataloader_calls) == 0
tests/trainer/test_dataloaders.py:1359: assert len(trainer.dev_debugger.train_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1362: calls = trainer.dev_debugger.dataloader_sequence_calls
tests/trainer/test_dataloaders.py:1389: assert len(trainer.dev_debugger.val_dataloader_calls) == 4
tests/trainer/test_dataloaders.py:1390: assert len(trainer.dev_debugger.train_dataloader_calls) == 3
tests/trainer/test_dataloaders.py:1391: assert len(trainer.dev_debugger.test_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1394: calls = trainer.dev_debugger.dataloader_sequence_calls
tests/trainer/test_dataloaders.py:1438: assert len(trainer.dev_debugger.val_dataloader_calls) == 4
tests/trainer/test_dataloaders.py:1439: assert len(trainer.dev_debugger.train_dataloader_calls) == 3
tests/trainer/test_dataloaders.py:1440: assert len(trainer.dev_debugger.test_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1443: calls = trainer.dev_debugger.dataloader_sequence_calls
tests/trainer/test_dataloaders.py:1490: assert len(trainer.dev_debugger.val_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1491: assert len(trainer.dev_debugger.test_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1492: assert len(trainer.dev_debugger.train_dataloader_calls) == 1
tests/trainer/test_dataloaders.py:1495: calls = trainer.dev_debugger.dataloader_sequence_calls
tests/trainer/flags/test_fast_dev_run.py:102: assert len(trainer.dev_debugger.checkpoint_callback_history) == 0
tests/trainer/flags/test_fast_dev_run.py:106: assert len(trainer.dev_debugger.early_stopping_history) == 0
tests/trainer/optimization/test_manual_optimization.py:98: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * num_manual_backward_calls
tests/trainer/optimization/test_manual_optimization.py:130: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * num_manual_backward_calls
tests/trainer/optimization/test_manual_optimization.py:161: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * num_manual_backward_calls
tests/trainer/optimization/test_manual_optimization.py:187: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * num_manual_backward_calls
tests/trainer/optimization/test_manual_optimization.py:225: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * num_manual_backward_calls
tests/trainer/optimization/test_manual_optimization.py:502: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * num_manual_backward_calls
tests/trainer/optimization/test_manual_optimization.py:577: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * 2
tests/trainer/optimization/test_manual_optimization.py:636: assert trainer.dev_debugger.count_events('backward_call') == limit_train_batches * 2

Most could be replaced with progress tracking. Others could have the test changed to track manually.
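As a rough illustration of what "track manually" could look like, the sketch below counts backward calls with a spy from the standard `unittest.mock` module instead of asking the dev debugger. `ToyModule` and `run_toy_loop` are hypothetical stand-ins invented for this example, not Lightning classes; only the variable names `limit_train_batches` and `num_manual_backward_calls` come from the assertions listed above.

```python
from unittest import mock


class ToyModule:
    """Hypothetical stand-in for a module that uses manual optimization."""

    def manual_backward(self, loss: float) -> None:
        # A real module would run backpropagation here.
        pass

    def training_step(self, batch: float) -> None:
        # Two manual backward calls per batch, as in a two-optimizer setup.
        self.manual_backward(batch * 2.0)
        self.manual_backward(batch * 3.0)


def run_toy_loop(module: ToyModule, num_batches: int) -> None:
    """Hypothetical stand-in for the training loop."""
    for batch_idx in range(num_batches):
        module.training_step(float(batch_idx))


limit_train_batches = 4
num_manual_backward_calls = 2

module = ToyModule()
# The spy wraps the original method, so behavior is unchanged while calls are counted.
with mock.patch.object(
    module, "manual_backward", wraps=module.manual_backward
) as backward_spy:
    run_toy_loop(module, limit_train_batches)

assert backward_spy.call_count == limit_train_batches * num_manual_backward_calls
```

The test owns the counter here, so nothing inside the library has to record events on the test's behalf.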

Agree, the dev debugger should not exist at all. We want to rely on the existing built-in, well-tested unit testing framework.

@@ -184,6 +184,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed global step update when the epoch is skipped ([#7677](https://github.com/PyTorchLightning/pytorch-lightning/pull/7677))


+- Fixed dev debugger memory growing due to tracking events even when disabled ([#7875](https://github.com/PyTorchLightning/pytorch-lightning/pull/7875))


 - Fixed training loop total batch counter when accumulate grad batches was enabled ([#7692](https://github.com/PyTorchLightning/pytorch-lightning/pull/7692))


@@ -51,6 +51,7 @@ def __init__(self, trainer):
             self.test_dataloader_calls = []
             self.dataloader_sequence_calls = []
+        @enabled_only
         def track_event(
             self,
             evt_type: str,
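The diff shows only that `track_event` gains an `@enabled_only` decorator; the decorator's body is not part of this page. The following is a minimal sketch of how such a guard could work, assuming the debugger exposes a boolean `enabled` attribute. `ToyDebugger` is a hypothetical class written for this example, and the way `enabled` is set here is an assumption, not the library's mechanism.

```python
from functools import wraps
from typing import Any, Callable, Dict, List


def enabled_only(fn: Callable) -> Callable:
    """Call ``fn`` only when ``self.enabled`` is truthy; otherwise do nothing."""

    @wraps(fn)
    def wrapped_fn(self, *args: Any, **kwargs: Any) -> Any:
        if self.enabled:
            return fn(self, *args, **kwargs)
        return None

    return wrapped_fn


class ToyDebugger:
    """Hypothetical debugger; ``enabled`` is a plain attribute (an assumption)."""

    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled
        self.events: List[Dict[str, Any]] = []

    @enabled_only
    def track_event(self, evt_type: str, **extra: Any) -> None:
        # Without the guard, this list would grow on every call.
        self.events.append({"event": evt_type, **extra})

    def count_events(self, evt_type: str) -> int:
        return sum(1 for event in self.events if event["event"] == evt_type)


disabled_debugger = ToyDebugger(enabled=False)
enabled_debugger = ToyDebugger(enabled=True)
for step in range(10):
    disabled_debugger.track_event("AMP", step=step)
    enabled_debugger.track_event("AMP", step=step)

assert disabled_debugger.events == []  # nothing is stored when disabled
assert enabled_debugger.count_events("AMP") == 10
```

With the guard in place, a disabled debugger stores nothing, which is consistent with the changelog entry about memory growing while tracking was disabled.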

ananthsub Jun 8, 2021

tchaton Jun 8, 2021

carmocca Jun 8, 2021

awaelchli Jun 8, 2021
