Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--samples_per_plugin "images=0" is documented to show all images, but shows no images at all #5550

Open
bersbersbers opened this issue Feb 2, 2022 · 2 comments

Comments

@bersbersbers
Copy link

The TB 2.8.0 help says I can pass --samples_per_plugin "images=0" to show all images, but that actually shows none at all.

tensorboard --help

--samples_per_plugin SAMPLES_PER_PLUGIN.
An optional comma separated list of plugin_name=num_samples pairs to explicitly specify how many samples to keep per tag for that plugin. For unspecified plugins, TensorBoard randomly downsamples logged summaries to reasonable values to prevent out-of-memory errors for long running jobs. This flag allows fine control over that downsampling. Note that 0 means keep all samples of that type. For instance "scalars=500,images=0" keeps 500 scalars and all images. Most users should not need to set this flag.

import tensorflow as tf

with tf.summary.create_file_writer("tmp").as_default():
    tf.summary.image("Random image", tf.random.normal((1, 28, 28, 1)), step=0)
tensorboard --logdir tmp # shows the image
tensorboard --logdir tmp --samples_per_plugin "images=1" # shows the image
tensorboard --logdir tmp --samples_per_plugin "images=0" # does not show the image - this is the bug.

image

@sanatmpa1 sanatmpa1 self-assigned this Feb 2, 2022
@japie1235813
Copy link
Contributor

Thanks for filing the bug. Have verified that the bug is reproducible.

@nfelt
Copy link
Collaborator

nfelt commented Feb 4, 2022

Hmm, I think this issue arises because in the new DataProvider world, there isn't really a way to specify "no downsampling at all" since it's a required parameter and intentionally doesn't special-case 0:

downsample: Integer number of steps to which to downsample the
results (e.g., `1000`). See `DataProvider` class docstring
for details about this parameter. Required.
So the documentation has gotten stale and out of sync with the implementation.

To truly fix this, we'd need to revisit that constraint and relax it to allow None there (presumably with implementation-defined behavior as to how many points it then returns).

It's also worth noting that both the Python-backed read path (using EventMultiplexer) and the RustBoard read path both require being given a "samples per plugin" configuration up-front, before they read the event files, which allows them to control how large a buffer they retain. So even if we relax the DataProvider contract so you can ask for downsample=None or "as many points as are available", it won't really be "all the data" unless the read path was correctly configured to buffer unlimited data for that plugin.

That said, that side of things does appear to still be working, since EventMultiplexer should have continued to support it, and RustBoard special-cases it with some logic that maps 0 to all:

sample_hint_pairs = [
"%s=%s" % (k, "all" if v == 0 else v)
for k, v in self._samples_per_plugin.items()
]
samples_per_plugin = ",".join(sample_hint_pairs)

/// Set explicit series sampling
///
/// A comma separated list of `plugin_name=num_samples` pairs to explicitly specify how many
/// samples to keep per tag for the specified plugin. For unspecified plugins, series are
/// randomly downsampled to reasonable values to prevent out-of-memory errors in long-running
/// jobs. Each `num_samples` may be the special token `all` to retain all data without
/// downsampling. For instance, `--samples_per_plugin=scalars=500,images=all,audio=0` keeps 500
/// events in each scalar series, all of the images, and none of the audio.
#[clap(long, default_value = "", setting(clap::ArgSettings::AllowEmptyValues))]
samples_per_plugin: PluginSamplingHint,

So if we were to restore this feature, probably the best thing would be something like:

  1. Update the DataProvider contract, and all the implementations, to allow downsample=None.
  2. Update the CLI flag to specify that 0 really means 0 now, but you can pass the special token all to mean no limit (same behavior as RustBoard)
  3. Update all the plumbing to respect this (and to convert all on the command line into None).

Honestly, though, that might not be worth it given that there's an easy, if slightly ugly, workaround of just passing some arbitrarily large number. So it might make more sense to just fix the docs to suggest that instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants