BUG: Neptune log error for multiple dataloaders #643
Comments
Hi @stonelazy, I'm Prince Canuma, a Data Scientist and DevRel at Neptune.ai. The error you are facing happens when you try to log to a path that already exists as a namespace (for example, logging 'loss' after 'loss/dataloader_idx_0' has already been created).
Please let me know if this solves your problem 😃 👍
Thanks for getting back on this, but as a Lightning user I'm not invoking the Neptune logging calls myself.
I have added this as additional context in the original post as well. If we have multiple dataloaders, then every parameter that gets logged has the name of the dataloader appended.
But according to Neptune, 'loss' is now invalid once you have already logged 'loss/dataloader_1' (I guess)? If so, the two of you are contradicting each other.
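For illustration, a minimal sketch of this clash outside of Lightning, using the neptune.new client directly; the project name and API token below are placeholders, not values from this issue:

```python
import neptune.new as neptune

# Placeholder credentials; any writable project shows the same behaviour.
run = neptune.init(project="<WORKSPACE/PROJECT>", api_token="<YOUR_API_TOKEN>")

# This creates the namespace 'loss' containing the field 'dataloader_idx_0'.
run["loss/dataloader_idx_0"].log(0.2)

# Logging to 'loss' itself now fails, because that path already exists as a
# namespace rather than a single field -- which is the error reported in this issue.
run["loss"].log(0.3)
```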
Alright, prince.canuma@neptune.ai, I will reach out through Intercom.
Correct ✅. You can't log to 'loss' once 'loss/dataloader_1' has already been logged; that path is now a namespace.
Hi @stonelazy, is your problem solved, or is it still not working properly?
The error is reproducible in this notebook file. Please have a look at it.
Thank you very much!
I will submit a ticket to the engineering team to fix it.
But a workaround would be changing:
```python
metrics = self._add_prefix(metrics)
metrics_key = self.METRICS_KEY
if self._base_namespace:
    metrics_key = f'{self._base_namespace}/{metrics_key}'
for key, val in metrics.items():
    # `step` is ignored because Neptune expects strictly increasing step values
    # which Lightning does not always guarantee.
    self.experiment[f'{metrics_key}/{key}'].log(val)
```
to something like:
```python
metrics = self._add_prefix(metrics)
metrics_key = self.METRICS_KEY
if self._base_namespace:
    metrics_key = f'{self._base_namespace}/{metrics_key}'
for key, val in metrics.items():
    # `step` is ignored because Neptune expects strictly increasing step values
    # which Lightning does not always guarantee.
    # Divert the bare 'loss' key into its own sub-field so it cannot clash
    # with the 'loss/dataloader_idx_*' namespace.
    if key == 'loss' and isinstance(val, (int, float)):
        key = 'loss/loss'
    self.experiment[f'{metrics_key}/{key}'].log(val)
```
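While the fix is pending, one way to apply that change without editing the installed package would be to subclass the logger and override the method; this is a hedged sketch that assumes the snippet above comes from NeptuneLogger.log_metrics in the neptune.new integration, and the class name PatchedNeptuneLogger is just an illustration:

```python
from neptune.new.integrations.pytorch_lightning import NeptuneLogger


class PatchedNeptuneLogger(NeptuneLogger):
    def log_metrics(self, metrics, step=None):
        metrics = self._add_prefix(metrics)
        metrics_key = self.METRICS_KEY
        if self._base_namespace:
            metrics_key = f'{self._base_namespace}/{metrics_key}'
        for key, val in metrics.items():
            # Same rename as in the patched loop above: keep the bare 'loss'
            # out of the 'loss/dataloader_idx_*' namespace.
            if key == 'loss' and isinstance(val, (int, float)):
                key = 'loss/loss'
            self.experiment[f'{metrics_key}/{key}'].log(val)
```

An instance of this class would then be passed to the Trainer in place of the stock logger; only the way the bare 'loss' key is written to Neptune changes.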
I have created a Jira ticket. The team will try to inform you once the issue is resolved.
Hi Sudharsan,
Regarding your multiple dataloaders issue, we found the following solution. In the code you submitted, we noticed you wanted to log a metric named "loss" after you had already created the "loss/dataloader_idx_0" namespace. The fix is to simply rename the aggregate "loss" to something else, like "loss_global"; this fixes the problem.
Link to the Colab solution: https://colab.research.google.com/drive/1APHu9qYVukdxBHmZBQFD35m1PAuDpLZ4
Let me know if this helps.
Kind regards,
A slight correction: 'we' are not creating the metric named 'loss/dataloader_idx_0'; it is created by PyTorch Lightning when there are multiple dataloaders, and this issue was raised to address that very concern. I understand that changing the name of the metric would work, but it would only be a workaround.
Hey @stonelazy, I checked the Colab that you initially pasted as reproduction info. Here is a run that I made: https://app.neptune.ai/o/common/org/pytorch-lightning-integration/e/PTL-29/all
Here is what I did:
The error was fixed by making sure that you log val_loss to a separate namespace. Yes, PTL creates the 'loss/dataloader_idx_*' names itself when there are multiple dataloaders. I will pass this info to the product team; for the time being, I recommend adjusting the loss names a bit. Please let me know what you think.
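As an illustration of that renaming, a minimal sketch inside a LightningModule with several validation dataloaders; the toy module, the aggregation in validation_epoch_end, and the name 'loss_global' are examples, not code from this issue:

```python
import torch
import pytorch_lightning as pl


class MyModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def validation_step(self, batch, batch_idx, dataloader_idx=0):
        x, y = batch
        val_loss = torch.nn.functional.mse_loss(self.layer(x), y)
        # With several val dataloaders these entries become
        # 'val_loss/dataloader_idx_0', 'val_loss/dataloader_idx_1', ...
        self.log('val_loss', val_loss)
        return val_loss

    def validation_epoch_end(self, outputs):
        # Log the aggregate under a different name ('loss_global') so it does not
        # clash with the 'val_loss/...' namespace that now exists in Neptune.
        # `outputs` is a list per dataloader because there are multiple val dataloaders.
        flat = [loss for per_loader in outputs for loss in per_loader]
        self.log('loss_global', torch.stack(flat).mean())
```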
Appreciate your reply.
Sure, thanks.
Describe the bug
An error gets thrown while logging a metric value.
I am using the PyTorch Lightning integration with Neptune. The error gets thrown only with the latest Neptune client:
from neptune.new.integrations.pytorch_lightning import NeptuneLogger
Reproduction
https://colab.research.google.com/drive/13rRlztjGRQrv6Y3W-d21Dotoj8L2UtoZ?usp=sharing
Expected behavior
The experiment should keep running without any error.
Traceback
The following trace results from invoking self.logger.log_metrics. If the value of attr is None, it passes the if condition and I do not face any error; the issue occurs in the else branch, inside neptune.new.handler.Handler.log, with self._path = "val_loss".
Environment
The output of pip list:
The operating system you're using: Ubuntu
The output of python --version: Python 3.8.10
Additional context
The value gets logged for all the other metrics; the error is thrown only for this particular 'val_loss' key.
This happens only after migrating to the new Neptune client; it works fine with the previous version.
The error is thrown only when there is more than one validation dataloader.
EDIT:
If we have multiple dataloaders, then every parameter that gets logged has the name of the dataloader appended.
Example: suppose my log call is self.log('loss', 0.2). It gets logged once per dataloader, with the dataloader index in the log name and its corresponding value: loss/dataloader_0 = 0.2, loss/dataloader_1 = 0.4, and so on for every dataloader.
Since my metric to monitor is 'loss', PTL also expects the exact string 'loss' to be logged, otherwise it throws the error below.
But according to Neptune, 'loss' is now invalid once you have already logged 'loss/dataloader_1' (I guess)? If so, the two of you are contradicting each other.
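To make the setup concrete, a hedged reproduction sketch with two validation dataloaders; the toy model and data are illustrative, the NeptuneLogger credentials are placeholders, and the Trainer.fit keyword arguments assume a recent (1.4-style) PyTorch Lightning:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from neptune.new.integrations.pytorch_lightning import NeptuneLogger


class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.log('loss', loss)
        return loss

    def validation_step(self, batch, batch_idx, dataloader_idx=0):
        x, y = batch
        # With two val dataloaders this is logged as
        # 'val_loss/dataloader_idx_0' and 'val_loss/dataloader_idx_1'.
        self.log('val_loss', torch.nn.functional.mse_loss(self.layer(x), y))

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


def toy_loader():
    # Small random dataset, just enough to drive one training/validation pass.
    return DataLoader(TensorDataset(torch.randn(8, 4), torch.randn(8, 1)), batch_size=4)


neptune_logger = NeptuneLogger(api_key="<YOUR_API_TOKEN>", project="<WORKSPACE/PROJECT>")
trainer = pl.Trainer(logger=neptune_logger, max_epochs=1)
trainer.fit(ToyModel(), train_dataloaders=toy_loader(),
            val_dataloaders=[toy_loader(), toy_loader()])
```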