
[Q] Log image per epoch and have slider per epoch #3455

Open
azinonos opened this issue Apr 1, 2022 · 29 comments
Labels
a:app Area: Frontend/Backend c:media

Comments

@azinonos

azinonos commented Apr 1, 2022

Hello,

In my training loop I am logging an image generated from the data at every epoch. I am using PyTorch Lightning and Weights & Biases. The call I'm using looks like this:

fig = ...  # compute fig
self.logger.experiment.log({"heatmap": [wandb.Image(fig)]})

However, when I log it to wandb, I get a slider that shows not epochs but seemingly random steps. For example, the first step is 14, the second is 19, the third is 24, etc. So even though I can move the slider to see the progress, I never really know which epoch each image corresponds to.

[Screenshot: media panel slider showing steps 14, 19, 24, …]

Alternatively, I tried to log multiple images (which is an uglier way) by putting the current epoch number as part of the title:

fig = ...  # compute fig
self.logger.experiment.log({f"heatmap_e{self.current_epoch}": [wandb.Image(fig)]})

This logs multiple images keyed by epoch number, but displays them in a seemingly arbitrary order. If I use the sorting functionality, I still don't get the images in the correct order, just a different arbitrary arrangement.

[Screenshot: per-epoch heatmap images displayed out of order]

So to summarise:
Question/Issue 1) How can I log an image every epoch and have the slider show epochs instead of arbitrary steps?
Question/Issue 2) How can I sort multiple images in the correct order instead of arbitrarily?

@ramit-wandb
Contributor

Hi @azinonos,

Images (or for that matter any data logged through wandb.log) are ordered on the basis of step, an internal variable that lets us order the logged data meaningfully. step is incremented on each call to wandb.log.

To answer your question, there are two ways to manage the step number:

  1. Collect all your metrics and send them over in a single log call:

     wandb.log({
       'metric_1' : ...,
       'metric_2' : ...
     })

     instead of

     wandb.log({'metric_1' : ...})
     wandb.log({'metric_2' : ...})

  2. Manually manage your step value using wandb.log({...}, step=STEP). Please note that the value of step must increase monotonically, otherwise calls to log may be ignored (see the sketch below).
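For illustration, a minimal sketch of the second option, assuming a bare wandb run (make_figure and the project name are hypothetical):

import wandb

run = wandb.init(project="step-demo")  # hypothetical project name
for epoch in range(10):
    fig = make_figure(epoch)  # hypothetical helper returning a matplotlib figure
    # Pin this call to the epoch; step must never decrease across calls.
    run.log({"heatmap": wandb.Image(fig)}, step=epoch)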

As for your second question, the images are always ordered in the order in which they are logged to W&B. There is no way to rearrange this order currently.

Thanks,
Ramit

@azinonos
Author

azinonos commented Apr 5, 2022

Hi Ramit,

Thanks for the response. For the second solution, when you set a manual step with step=STEP, does this affect the step counter globally, or just this specific metric?

Regarding the logging order: I do log them monotonically in a specific order, but they end up displayed in a random order - that's the problem.

@ramit-wandb
Contributor

Hey @azinonos,

When you set a manual step, it only defines the step value for that particular call to wandb.log. Its effect does not extend beyond that.

Would it be possible for you to share a minimal script or colab link reproducing the issue where images are not in order? It will help me understand why these images are not ordered correctly for you.

Thanks,
Ramit

@azinonos
Author

azinonos commented Apr 6, 2022

Hi Ramit,
That's great (about the manual step), I'll try that. I'll set up a colab reproducing the issue in the next few days and send you a link, thanks.

@ramit-wandb
Contributor

Hi @azinonos,

Just wanted to follow up here, were you able to set up a reproduction of the error you were seeing with unordered images? It will help me figure out the underlying cause of the bug here.

Thanks,
Ramit

@azinonos
Author

azinonos commented Apr 20, 2022

Hi Ramit,

I've just set up a demo. If you run this, you can see that the logged figures appear in a jumbled order. If you try to sort them, they are sorted as if the integer were a string rather than a number (e.g. 1, 11, 12, 13, …, 2, 21).

https://colab.research.google.com/drive/1QobUjeysZE2DHurQoKTBnBd79CcrsegT?usp=sharing

Also, setting step=STEP does not seem to work for me; I get the warning below and wandb does not plot anything at all:
wandb: WARNING Step must only increase in log calls. Step 126 < 2727; dropping {'conf_matrix/covariance_embedding_heatmap': [<wandb.sdk.data_types.Image object at 0x7f9054f8b2d0>]}.

@ramit-wandb
Contributor

Hi @azinonos,

Thanks for the reproduction! I have requested access so that I can look into this. As for the warning you see, this is what I meant by step having to increase monotonically. If you have logged a value at step 2727, you can only log values at steps greater than or equal to 2727.
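To illustrate with the numbers from your warning (a hypothetical snippet, assuming a run is already initialised):

wandb.log({"loss": 0.1}, step=2727)                  # accepted
wandb.log({"heatmap": wandb.Image(fig)}, step=126)   # dropped: 126 < 2727
wandb.log({"heatmap": wandb.Image(fig)}, step=2728)  # accepted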

Thanks,
Ramit

@azinonos
Author

Hi @ramit-wandb, I've given you access to the Google Colab.

Regarding the warning: I thought the step value was local, in which case I could log a new monotonically increasing value (0, 1, 2, etc.) every time I log the image. Getting this warning means the step value is actually global: wandb has already logged in other parts of the program and incremented the counter to a high value (e.g. 2727).

If this is indeed the case, how could I use it to log this specific image no_epochs times without getting this warning?

@ramit-wandb
Contributor

Hey @azinonos,

I looked through your colab. It looks like you are logging each image with a different key (image_{i}), which is why there is no specific order to them. Logging them all under the same key, like image, should order them by timestamp.
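If you do want distinct keys, one possible workaround for the string-style sorting you saw is to zero-pad the epoch in the key so lexicographic order matches numeric order (a sketch, not a confirmed fix for the UI):

# Zero-pad the epoch so string sorting matches numeric order:
# heatmap_e0001, heatmap_e0002, ..., heatmap_e0011
self.logger.experiment.log(
    {f"heatmap_e{self.current_epoch:04d}": [wandb.Image(fig)]}
)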

Thanks,
Ramit

@exalate-issue-sync

Ramit Goolry commented:
Hi @azinonos,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Weights & Biases

@azinonos
Author

Hi @ramit-wandb ,

If I log them under the same key, then I have the initial problem of a slider that shows steps rather than epochs. The reason I'm using different keys is so that I can put them in an order I control.

@ramit-wandb
Contributor

Ah, I see what you mean. Currently there is no way to swap step out for another metric, since we use step internally to track when a metric is logged. If you really need to see epoch under your media, for now I suggest making only one call to wandb.log per epoch, so that step = epoch for all steps. This way you will be able to see the epochs as needed.
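A minimal sketch of that one-log-per-epoch pattern (train_one_epoch and make_heatmap are hypothetical helpers):

num_epochs = 10
for epoch in range(num_epochs):
    train_loss = train_one_epoch()  # hypothetical helper
    fig = make_heatmap()            # hypothetical helper
    # Exactly one wandb.log call per epoch, so the internal step equals the epoch.
    wandb.log({
        "loss": train_loss,
        "heatmap": wandb.Image(fig),
    })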

Thanks,
Ramit

@exalate-issue-sync

Ramit Goolry commented:
Hi @azinonos,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Weights & Biases

@exalate-issue-sync

Ramit Goolry commented:
Hi @azinonos, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!

@azinonos
Author

Hi @ramit-wandb, sorry for the late response. To be honest, this hasn't really been resolved; the issues remain that a) you can't use a custom x-axis (e.g. epochs) for the media slider, and b) when you log images independently, you can't sort them in a specific order. These would be features for wandb to implement.

@ramit-wandb ramit-wandb reopened this May 31, 2022
@ramit-wandb
Contributor

Hi @azinonos,

I have created a feature request to allow for custom sliders for a given media object. I'll keep you updated on the status of its progress!

Thanks,
Ramit

@anthonyhu

Hey both! I'm having the same issue logging images with wandb (0.13.10) and pytorch lightning (1.8.6).

The "step" on the slider of the logged images are not aligned with the other metrics which use self.global_step. Like @azinonos mentioned, they are a bit arbitrary even when explicitly passing the step argument during logging as in self.logger.log_metrics({key: wandb.Image(img)}, step=self.global_step). They depend on an internal step count of the logger if I understood correctly from @ramit-wandb?

Would it be possible for the slider to display the step specified as an argument, instead of the internal step count from the logger?

@ramit-wandb
Contributor

@anthonyhu if you have defined the step value manually, that should override the internal step value in Lightning (see here).

Could you provide some more details about what you are seeing: specifically, how you are setting step and what appears in our UI (a link to your project would be great)?

Thanks,
Ramit

@anthonyhu

I think it might be more of a UI issue. When I try to modify the x-axis in the wandb project:
[Screenshot: selecting trainer/global_step as the x-axis in the wandb project]

It's correctly reflected on the scalar logs: (here at step 15)
[Screenshot: scalar chart with x-axis trainer/global_step, shown at step 15]

but it does not affect the wandb media. The slider remains "step" (hovering over it reads: "this increments every time you call wandb.log in your script") and not "trainer/global_step".

[Screenshot: media panel slider still labeled "step"]

Is there a way for the slider to become what I've selected for the x-axis?

@ramit-wandb
Contributor

I see what you mean. Could you share a link to your project? There is currently no way to change the axis of the media panel, though that is something we have roadmapped for the future.

trainer/global_step is different from Step, and so they can not be used interchangeably.

@anthonyhu

Thank you for the answer! I cannot share the project as it is a private one.

@ramit-wandb
Contributor

I can understand. In that case I will be bumping up the priority for this request in our internal system. I'll reach out once I have some updates for you regarding this.

Thanks,
Ramit

@anthonyhu

Thank you so much 🙏

@kptkin kptkin added c:media a:app Area: Frontend/Backend labels Mar 8, 2023
@pme0

pme0 commented Mar 23, 2023

(quoting @anthonyhu's comment above about the media slider not following the selected x-axis)

🙌

Very keen to have this feature as well!

When logging the loss every k training steps, with the actual step S = k, 2k, 3k, ..., I can change the x-axis variable to S for the loss in the web interface, but not for the media, which seems to be logged against wandb's internal step 1, 2, 3, .... The mismatch between the two makes it hard to find the media corresponding to a particular actual step.

I can't find a neat way around it because the actual step at which validation occurs and media is logged (at the end of each epoch) is never a round number that I can easily associate with wandb's internal step (dataset/batch sizes don't give a round number of steps per epoch).

@thawro

thawro commented Mar 29, 2023

A workaround for the PyTorch Lightning use case (see the example below):

  1. Add a dict attribute to the class that subclasses LightningModule.
  2. Whenever you would call self.logger.log_something(something), update that dict instead.
  3. Make only one logging call, with the dict attribute.

Example:

import pytorch_lightning as pl
from pytorch_lightning import LightningModule


class ExampleLitModule(LightningModule):
    def __init__(self):
        super().__init__()
        self.metrics = {}  # collects everything to log for the current epoch

    def on_validation_epoch_end(self):  # the hook takes no extra arguments
        metrics = get_some_metrics()  # your code
        self.metrics.update(metrics)
        # Single log call per epoch, keyed to the epoch number
        self.logger.experiment.log(self.metrics, step=self.current_epoch)


class ExampleCallbackLogger(pl.Callback):
    def on_validation_epoch_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule):
        some_figure = get_some_figure()  # your code
        pl_module.metrics["some_figure"] = some_figure

Using this approach, all metrics and figures for an epoch are saved under the same step (the current epoch, since the log call passes step=self.current_epoch), so the media slider lines up with epochs.
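For completeness, a hypothetical wiring of the two pieces (the project name and max_epochs are illustrative):

from pytorch_lightning.loggers import WandbLogger

trainer = pl.Trainer(
    logger=WandbLogger(project="step-demo"),  # hypothetical project name
    callbacks=[ExampleCallbackLogger()],
    max_epochs=10,
)
trainer.fit(ExampleLitModule())  # plus your dataloaders/datamodule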

@anthonyhu

Thank you for the workaround!

@josiahls

josiahls commented Apr 5, 2023

@ramit-wandb I also have a similar issue related to logging images with a custom step. I have:

wandb_context.define_metric(f"{root_name}*", step_metric="global_step")
# ...
wandb_context.log({f"{root_name}/loss": loss.item(), "global_step": step})
# ... 
wandb_context.log(
    {
        f"{root_name}/target_heatmap": [
            wandb_context.Image(target_heatmap, caption="Target heatmap")
        ],
        "global_step": step,
    }
)

Both calls pass the exact same "global_step": step value, but only {root_name}/loss uses the global step in the UI. {root_name}/target_heatmap uses the internal step by the time it reaches the UI.

It's important to note that this is only a small excerpt: all of the non-image metrics track global_step, while all of the image metrics/logs track the internal step.

Version:

Name: wandb
Version: 0.14.0

