
[TF 2.0] tf.summary should be easier to use with graphs #26409

Open
nfelt opened this issue Mar 6, 2019 · 6 comments
Assignees
Labels
comp:tensorboard Tensorboard related issues TF 2.7 Issues related to TF 2.7.0 type:feature Feature requests

Comments

@nfelt
Contributor

nfelt commented Mar 6, 2019

This feature request tracks improving the usability of tf.summary in TF 2.0 when used with graphs, specifically with tf.function and legacy graph mode.

Currently there are a number of interrelated limitations that make using tf.summary somewhat awkward and error-prone outside of eager mode:

  • In legacy graph mode, the writer must be configured in advance

    • a writer resource handle must be created via create_file_writer() before any graph construction happens, or all summary-writing functions become no-ops
    • note that resource initialization itself can be deferred (by calling writer.init() later), but all options to initialization, in particular the logdir, still must be passed earlier to create_file_writer()
    • this means it's not possible to create a graph first, and then at execution time decide on the logdir it should emit summaries to; you need to define the logdir and then define the graph
  • In tf.functions it's a similar story

    • the default writer is captured at graph construction time, aka the first function execution
    • if later executions of the function use the same trace, they will still reflect the default writer from the first execution, rather than picking up any new default writer that may exist
    • again, this means that you can't easily change the logdir in use at execution time; you would need to force the function to be re-traced to pick up the new default writer
  • tf.function has the additional complication that it can't own any state

    • this means any writer created inside the function must be assigned a reference outside the function in order to still exist when the function is actually executed (since otherwise it's automatically deleted at the end of the trace and no longer exists at graph execution time)
    • so it's not currently possible for a tf.function to have its own internal-only summary writer (akin to how they cannot currently have function local tf.Variables)
    • furthermore, any mistakes here tend to generate opaque "Resource not found" errors that don't really communicate what the issue might be
    • the safest approach is to always create the writer outside the tf.function and then just "re-enter" it within the tf.function via with writer.as_default(), and make sure the writer object exists as long as the tf.function is being used
  • The step and recording condition (tf.summary.record_if()) have milder but similar issues

    • they are also captured at graph definition time
    • however, they have a better workaround: set them to be either a tf.Variable or a placeholder (tf.compat.v1.placeholder for legacy graph mode, or a function argument for tf.function) and then set that value when executing the graph
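The safest pattern described above (writer created outside the tf.function and re-entered via with writer.as_default(), with the step and recording condition held in tf.Variables) can be sketched roughly like this; the names logdir, train_step, and the "loss" tag are illustrative, not part of any TF API:

```python
import tempfile
import tensorflow as tf

# Sketch of the workaround: the writer, step, and recording condition are
# created *outside* the tf.function, so the traced graph only captures
# references that are read at execution time rather than baked-in values.
logdir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(logdir)  # logdir is fixed here
step = tf.Variable(0, dtype=tf.int64)
record = tf.Variable(True)

@tf.function
def train_step(x):
    # Re-enter the externally owned writer instead of creating one inside
    # the function (which would be deleted at the end of the trace).
    with writer.as_default(), tf.summary.record_if(record):
        tf.summary.scalar("loss", x, step=step)

train_step(tf.constant(0.5))
step.assign_add(1)    # mutable state changes without re-tracing
record.assign(False)  # later executions skip summary writing
train_step(tf.constant(0.25))
writer.flush()
```

Because step and record are variables, their current values are read each time the traced function executes, which sidesteps the graph-capture issues for those two pieces of state; the logdir itself, however, is still fixed at create_file_writer() time, as the issue describes.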
@nfelt nfelt added comp:tensorboard Tensorboard related issues type:feature Feature requests TF 2.0 Issues relating to TensorFlow 2.0 labels Mar 6, 2019
@nfelt nfelt self-assigned this Mar 6, 2019
@shashvatshahi1998
Contributor

@nfelt Can you please guide me? I want to work on this.

@janosh
Contributor

janosh commented Apr 23, 2019

Unlike tf.summary.FileWriter, TF 2.0's tf.summary.create_file_writer doesn't accept a Graph object. Is there a different way of visualizing a custom model's graph in TensorBoard with TF 2.0? #1961 doesn't clear things up for me, since it's only concerned with Keras models, which seem to take care of passing the graph to TensorBoard automatically.

Say I have a top-level @tf.function-decorated function that repeatedly constructs a Bayesian neural network and evaluates its performance for different choices of parameters, where would I add tf.summary.trace_on(graph=True, profiler=False) and tf.summary.trace_export(name="bnn", step=0) and/or @tf.function or similar to ensure TensorBoard has access to the model's graph?

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def dense(X, weights, biases, activation):
    # helper assumed by the snippet; its definition wasn't shown
    return activation(tf.matmul(X, weights) + biases)

def build_bnn(weights_list, biases_list, activation=tf.nn.relu):
    def model(X):
        net = X
        for (weights, biases) in zip(weights_list[:-1], biases_list[:-1]):
            net = dense(net, weights, biases, activation)
        # final linear layer
        net = tf.matmul(net, weights_list[-1]) + biases_list[-1]
        pred = net[:, 0]
        std_dev = net[:, 1]
        scale = tf.nn.softplus(std_dev) + 1e-6  # ensure scale is positive
        return tfd.Normal(loc=pred, scale=scale)

    return model

If I try to insert tf.summary.trace_on(graph=True, profiler=False) anywhere in build_bnn I get a warning

W0423 13:13:04.606260 4439250368 tf_logging.py:161] Cannot enable trace inside a tf.function.
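One placement that avoids this warning, as a sketch: enable tracing in eager mode before the first call to the tf.function, then export after that call. The model_fn below is a trivial stand-in for the BNN above, and logdir is illustrative:

```python
import tempfile
import tensorflow as tf

# Sketch: trace_on() must run in eager mode, outside any tf.function
# (calling it inside one produces the "Cannot enable trace" warning).
logdir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(logdir)

@tf.function
def model_fn(x):  # stand-in for the BNN above
    return tf.nn.relu(x) * 2.0

tf.summary.trace_on(graph=True, profiler=False)
model_fn(tf.constant([1.0, -1.0]))  # first call traces the graph
with writer.as_default():
    tf.summary.trace_export(name="bnn", step=0)
writer.flush()
```

trace_export() writes to the current default writer, which is why it is wrapped in with writer.as_default() here.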

@timtody

timtody commented Oct 22, 2019

Any progress on this?

@kumariko

@nfelt We are checking to see if you still need help on this issue. Could you please check with the latest released TF 2.7 and let us know if the issue still persists in newer versions? Thanks!

@kumariko kumariko self-assigned this Dec 13, 2021
@kumariko kumariko added the stat:awaiting response Status - Awaiting response from author label Dec 13, 2021
@nfelt
Contributor Author

nfelt commented Dec 13, 2021

This is pretty much all still relevant in current TF, yes.

@kumariko kumariko removed their assignment Dec 14, 2021
@kumariko kumariko added TF 2.7 Issues related to TF 2.7.0 and removed stat:awaiting response Status - Awaiting response from author TF 2.0 Issues relating to TensorFlow 2.0 labels Dec 14, 2021
@yatbear
Member

yatbear commented Sep 29, 2022

Update: this is still relevant in current TF (tensorflow/tensorboard#5866).


6 participants