Pytorch Lightning run is marked as finished after .fit loop #3132

Open
Michael-Tanzer opened this issue Apr 12, 2024 · 5 comments
Labels: area / integrations, area / SDK-storage, phase / ready-to-go, type / bug

@Michael-Tanzer

🐛 Bug

When using the PyTorch Lightning Aim logger, a run is marked as finished after the fit loop, ignoring the test loop and any metrics logged there.

Expected behavior

The logger should mark the run as finished only on exit, after the test loop and any additional logging.

Environment

  • Aim Version: 3.19.2
  • Python version: 3.10.8
  • Lightning version: 2.0.1
  • pip version: 22.3.1
  • OS: Linux
@Michael-Tanzer Michael-Tanzer added help wanted Extra attention is needed type / bug Issue type: something isn't working labels Apr 12, 2024
mihran113 (Contributor) commented Apr 16, 2024

Hey @Michael-Tanzer! Thanks for the report.
The run is closed because PyTorch Lightning calls the logger's .finalize() method. However, when the test loop starts and the trainer logs additional metrics, the aim.Run is reopened.
I've tested this on our example (https://github.com/aimhubio/aim/blob/main/examples/pytorch_lightning_track.py), and the test loss is tracked successfully.
Can you please double-check if the test metrics are tracked?

@Michael-Tanzer (Author)

Hi, I'm glad it works on this example, but there is another ticket describing pretty much the same issue. Could it be related to the fact that I am using a remote server? My current workaround is to disable finalize and finalize the run manually later.
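
For readers who want to see the shape of the workaround described above, here is a minimal sketch of the "ignore the framework's finalize, close manually at exit" pattern. It deliberately uses a hypothetical stand-in logger (`StandInLogger`) rather than the real Aim classes, so the names `StandInLogger`, `DeferredFinalizeLogger`, and `close` are assumptions made for illustration only. With Aim, the base class would presumably be the Lightning logger shipped with Aim, and the details of closing the run may differ.

```python
class StandInLogger:
    """Hypothetical stand-in for a tracking logger (not the Aim API)."""

    def __init__(self):
        self.closed = False
        self.status = None
        self.metrics = []

    def log_metrics(self, metrics, step=None):
        # A finished run refuses further metrics in this toy model.
        if self.closed:
            raise RuntimeError("run is already marked as finished")
        self.metrics.append((step, dict(metrics)))

    def finalize(self, status):
        # Marks the run as finished, as a trainer would do after .fit().
        self.closed = True
        self.status = status


class DeferredFinalizeLogger(StandInLogger):
    """Ignores trainer-driven finalize(); the caller closes the run explicitly."""

    def __init__(self):
        super().__init__()
        self._pending_status = "success"

    def finalize(self, status):
        # Remember the reported status, but keep the run open.
        self._pending_status = status

    def close(self):
        # Perform the real finalization once all loops are done.
        super().finalize(self._pending_status)


if __name__ == "__main__":
    logger = DeferredFinalizeLogger()
    logger.log_metrics({"train_loss": 0.5}, step=0)
    logger.finalize("success")                      # what a trainer calls after fit
    logger.log_metrics({"test_loss": 0.4}, step=1)  # still accepted
    logger.close()                                  # manual finalize at exit
    print(logger.closed, logger.status, len(logger.metrics))
```

The subclass only changes when finalization happens, not what it does, so the run still ends with the status the trainer reported.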

@Michael-Tanzer (Author)

#3097

mihran113 (Contributor) commented Apr 16, 2024

Oh, yeah, remote tracking is actually causing this.
I've just opened a PR which should address that:
#3134
We'll release a patch version today or tomorrow which will include the fix for this issue.

@Michael-Tanzer (Author)

Thank you! This is awesome news! I will close this issue then.

@mihran113 mihran113 self-assigned this Apr 17, 2024
@mihran113 mihran113 added area / integrations Issue area: integrations with other tools and libs phase / ready-to-go Issue phase: issues that are merged and will be included in the upcoming release area / SDK-storage Issue area: SDK and storage related issues and removed help wanted Extra attention is needed labels Apr 17, 2024
@mihran113 mihran113 added this to the v3.19.x milestone Apr 17, 2024
@mihran113 mihran113 reopened this Apr 17, 2024
Project status: Patch-issues
2 participants