Pytorch Lightning run is marked as finished after .fit loop #3132

Michael-Tanzer · 2024-04-12T16:16:57Z

🐛 Bug

When using the PyTorch Lightning Aim logger, a run is marked as finished after the fit loop, ignoring the test loop and any metrics logged there.

Expected behavior

The logger should mark the run as finished only on exit, after the test loop and any additional logging.

Environment

Aim Version: 3.19.2
Python version: 3.10.8
Lightning version: 2.0.1
pip version: 22.3.1
OS: Linux

mihran113 · 2024-04-16T14:36:45Z

Hey @Michael-Tanzer! Thanks for the report.
The run is being closed because PyTorch Lightning calls the .finalize() method on the logger. But when the test loop starts and the trainer logs any additional metrics, the aim.Run will be reopened.
I've tested it on our example (https://github.com/aimhubio/aim/blob/main/examples/pytorch_lightning_track.py) and the test loss is tracked successfully.
Can you please double-check if the test metrics are tracked?
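
For readers following along, the close-then-reopen behavior described above can be pictured with a toy model. This is only a sketch under assumptions, not Aim's actual implementation: `ToyRun`, `ToyLogger`, and the `storage` dict are hypothetical names invented for illustration. The idea is that the logger remembers the run hash, drops its run handle on finalize, and lazily reopens the same run the next time a metric is logged.

```python
import uuid


class ToyRun:
    """Hypothetical stand-in for a tracked run; `storage` plays the role of the repo."""

    def __init__(self, storage, run_hash=None):
        # Reuse the given hash to reopen an existing record, otherwise start a new one.
        self.hash = run_hash or uuid.uuid4().hex
        self.record = storage.setdefault(self.hash, {"metrics": [], "active": True})
        self.record["active"] = True

    def track(self, value, name):
        self.record["metrics"].append((name, value))

    def close(self):
        self.record["active"] = False


class ToyLogger:
    """Hypothetical logger that closes its run on finalize and reopens it lazily."""

    def __init__(self, storage):
        self._storage = storage
        self._run = None
        self._run_hash = None

    @property
    def experiment(self):
        # Lazily (re)open the run; the remembered hash keeps it the same run.
        if self._run is None:
            self._run = ToyRun(self._storage, self._run_hash)
            self._run_hash = self._run.hash
        return self._run

    def log_metrics(self, metrics):
        for name, value in metrics.items():
            self.experiment.track(value, name)

    def finalize(self):
        # What the trainer triggers at the end of fit: close and drop the handle.
        if self._run is not None:
            self._run.close()
            self._run = None


storage = {}
logger = ToyLogger(storage)
logger.log_metrics({"train_loss": 0.5})  # fit loop
logger.finalize()                        # run is marked as finished here
logger.log_metrics({"test_loss": 0.4})   # test loop reopens the same run
```

In this toy model the test metric lands in the same record as the training metric, which matches the behavior reported for the local example.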

Michael-Tanzer · 2024-04-16T14:39:13Z

Hi, I'm glad it's working on this example, but there is also another ticket with pretty much the same issue. Could it be related to the fact that I am using a remote server? My current fix is to disable finalize and later finalize the run manually.
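
The comment does not say how finalize was disabled, so the following is only one possible shape for such a workaround, sketched under assumptions. `DeferredFinalizeMixin` and `finalize_now` are hypothetical names, and `StubLogger` is a stand-in used so the sketch runs without Aim or Lightning installed. The mixin swallows the finalize() call issued by the trainer and forwards it only when asked explicitly.

```python
class DeferredFinalizeMixin:
    """Swallow the trainer's finalize() call; close the run only on request."""

    _deferred_status = "success"

    def finalize(self, status="success"):
        # Called by the trainer at the end of fit: remember the status, keep the run open.
        self._deferred_status = status

    def finalize_now(self):
        # Hypothetical helper: call once after every loop (fit, test, ...) is done.
        super().finalize(self._deferred_status)


class StubLogger:
    """Stand-in for a real logger class, used only to make the sketch runnable."""

    def __init__(self):
        self.finished = False
        self.metrics = {}

    def log_metrics(self, metrics, step=None):
        self.metrics.update(metrics)

    def finalize(self, status="success"):
        self.finished = True


class DeferredStubLogger(DeferredFinalizeMixin, StubLogger):
    pass


logger = DeferredStubLogger()
logger.log_metrics({"train_loss": 0.5})
logger.finalize("success")              # what the trainer would do after fit
logger.log_metrics({"test_loss": 0.4})  # test loop still logs to an open run
still_open = not logger.finished
logger.finalize_now()                   # manual finalize once testing is done
```

With the real logger the same mixin would presumably be combined with the Aim logger class in place of `StubLogger`, assuming that class exposes a `finalize(status)` method; that combination is untested here.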

Michael-Tanzer · 2024-04-16T14:42:26Z

#3097

mihran113 · 2024-04-16T15:04:14Z

Oh, yeah, remote tracking is actually causing this.
I've just opened a PR which should address that:
#3134
We'll release a patch version today or tomorrow which will include the fix for this issue.

Michael-Tanzer · 2024-04-16T15:37:20Z

Thank you! This is awesome news! I will close this issue then.

Michael-Tanzer added the "help wanted" and "type / bug" labels Apr 12, 2024

Michael-Tanzer closed this as completed Apr 16, 2024

mihran113 self-assigned this Apr 17, 2024

mihran113 added this to the v3.19.x milestone Apr 17, 2024

mihran113 reopened this Apr 17, 2024

mihran113 mentioned this issue Apr 17, 2024

[fix] Resolve issue with new runs after tracking queue shutdown #3134

Merged
