
Fix test_pytorch_lightning.py #4305

Conversation

Alnusjaponica
Collaborator

@Alnusjaponica Alnusjaponica commented Jan 5, 2023

Motivation

Resolve #3418 and #4116

Description of the changes

Refactor deprecated features

  • trainer.training_type_plugin was removed in v1.8 (PR#11239). The attribute training_type_plugin was simply renamed to strategy, so the code was refactored accordingly.

  • The optional Trainer argument accelerator no longer accepts ddp_cpu. Instead, we can pass ddp to strategy and cpu to accelerator. Likewise, num_processes will be removed, so the number of processes is passed to devices instead.

  • AcceleratorConnector.distributed_backend was removed, but AcceleratorConnector.is_distributed is available instead, so the code was refactored accordingly.

  • callback.on_init_start() was removed in v1.8 (Issue#10894, PR#10940, PR#14867).
    Although no exactly equivalent alternative is provided, this confirmation can be moved elsewhere. strategy.setup_environment seems like the natural place for the check, but implementing it as a method of Strategy would affect users' code. It is therefore more reasonable to implement it in callback.setup() or callback.on_fit_start().

@github-actions github-actions bot added the optuna.integration Related to the `optuna.integration` submodule. This is automatically labeled by github-actions. label Jan 5, 2023
@grburgess

Out of curiosity, does everything work with PL 1.8 on this branch w.r.t. the pruner?

@codecov-commenter

codecov-commenter commented Jan 5, 2023

Codecov Report

Merging #4305 (7e88045) into master (7bc4bb1) will decrease coverage by 0.04%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #4305      +/-   ##
==========================================
- Coverage   90.42%   90.38%   -0.04%     
==========================================
  Files         172      172              
  Lines       13660    13668       +8     
==========================================
+ Hits        12352    12354       +2     
- Misses       1308     1314       +6     
Impacted Files Coverage Δ
optuna/integration/pytorch_lightning.py 0.00% <0.00%> (ø)
optuna/trial/_trial.py 95.26% <0.00%> (-1.19%) ⬇️
optuna/study/study.py 94.96% <0.00%> (+0.77%) ⬆️
optuna/integration/botorch.py 97.55% <0.00%> (+0.81%) ⬆️


@Alnusjaponica
Collaborator Author

Thanks for the question. That is exactly what we aim to do in this PR. @grburgess

@grburgess

Thanks! I checked out this branch to try it (which I should have done in the first place), and indeed the issue seems fixed.

@Alnusjaponica Alnusjaponica marked this pull request as ready for review January 6, 2023 06:09
@Alnusjaponica Alnusjaponica marked this pull request as draft January 6, 2023 06:09
@Alnusjaponica
Collaborator Author

Thank you for trying the patch. Please note that the problem with distributed processing still remains for now.

@Alnusjaponica
Collaborator Author

Alnusjaponica commented Jan 12, 2023

Problem

When DDP is used and optuna.TrialPruned is raised from a child process, PyTorch Lightning tries to resynchronize the processes to recover from the "error" and ultimately treats it as a DeadlockDetectedException, which terminates the whole process. For more details, see reconciliate_processes. Fixing this would require changing an environment variable and a private variable.

Alternative

Stop supporting DDP in order to fix the daily integration tests (#4116) and to support PyTorch Lightning versions after v1.6.

@nzw0301
Member

nzw0301 commented Jan 12, 2023

When I tried to fix this issue before, I also thought that manually raising TrialPruned in the main process, as in the CatBoost callback, would be another approach:
https://github.com/optuna/optuna-examples/blob/main/catboost/catboost_pruning.py#L57.
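The pattern suggested above can be sketched as follows: instead of raising optuna.TrialPruned inside a callback (which may execute in a DDP child process), record a flag during training and raise from the objective in the main process, mirroring the CatBoost pruning example. All names below are illustrative stand-ins, not real Optuna or PyTorch Lightning APIs:

```python
class TrialPruned(Exception):
    """Stand-in for optuna.TrialPruned."""


class RecordingPruningCallback:
    """Records the pruning decision instead of raising immediately."""

    def __init__(self) -> None:
        self.should_prune = False

    def on_validation_end(self, prune_signal: bool) -> None:
        # In a DDP child process we must NOT raise; just record the decision
        # so that all ranks stay synchronized and no deadlock is detected.
        self.should_prune = self.should_prune or prune_signal


def objective(callback: RecordingPruningCallback) -> float:
    # ... trainer.fit(model) would run here, invoking the callback ...
    if callback.should_prune:  # checked safely in the main process
        raise TrialPruned()
    return 0.0


cb = RecordingPruningCallback()
cb.on_validation_end(prune_signal=True)  # simulate a pruning decision
try:
    objective(cb)
except TrialPruned:
    pass  # the study would record the trial as pruned here
```

Because the exception is raised only after fit() returns, every DDP rank finishes training normally and Lightning's deadlock detection never fires.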

@Alnusjaponica
Collaborator Author

Alnusjaponica commented Jan 13, 2023

@nzw0301
Thank you for your advice. I'll check whether the suggested approach works in this case.
If possible, I am going to apply the change to support DDP after #4322.

@github-actions
Contributor

This pull request has not seen any recent activity.

@github-actions github-actions bot added the stale Exempt from stale bot labeling. label Jan 26, 2023
@github-actions
Contributor

github-actions bot commented Feb 9, 2023

This pull request was closed automatically because it had not seen any recent activity. If you want to discuss it, you can reopen it freely.

@github-actions github-actions bot closed this Feb 9, 2023
@Alnusjaponica Alnusjaponica deleted the fix-checks-integration-pytorch-lightning branch February 16, 2023 00:45
Successfully merging this pull request may close these issues.

Support latest PyTorch lightning
4 participants