Passing multiple logdirs doesn't work (and raises warning about "Conflict for name") #179

colinmorris · 2017-07-04T23:56:27Z

These work:

tensorboard --logdir logs/foo
tensorboard --logdir logs/bar
tensorboard --logdir a:logs/foo,b:logs/bar

But when I run this:

tensorboard --logdir logs/foo,logs/bar

I get a warning like this repeatedly printed:

WARNING:tensorflow:Conflict for name .: old path /home/colin/src/myproject/logs/foo, new path /home/colin/src/myproject/logs/bar

In tensorboard, only one run is shown.

I installed tensorflow-tensorboard using pip, and the version is 0.1.2

(I tried debugging, but the warning gets raised in a separate daemon thread, and I wasn't able to figure out how to get pdb to attach to it)

The text was updated successfully, but these errors were encountered:

teamdandelion · 2017-07-06T02:54:34Z

This is a design issue and not just a simple technical bug. Here's why:

If we have path/to/logdir/run1, path/to/logdir/run2, and you pass --logdir=path/to/logdir, you want the run names for run1 and run2 to be "run1" and "run2", naturally enough. But suppose there are events in path/to/logdir too. What should we call that run? Borrowing from unix conventions, we called it ".".

However, if you pass two unnamed logdirs, they both get auto-assigned "." and that creates the conflict.

Possibly we should fix this by requiring explicit names via the colon any time multiple logdirs are passed.

Going to mark this "contributiosn welcome". @colinmorris if you come up with a solution you like, feel free to send a pull request.

wchargin · 2017-07-06T20:42:12Z

@colinmorris you can attach pdb as follows:

diff --git a/tensorboard/backend/event_processing/event_multiplexer.py b/tensorboard/backend/event_processing/event_multiplexer.py
index 8a71f82..66e576a 100644
--- a/tensorboard/backend/event_processing/event_multiplexer.py
+++ b/tensorboard/backend/event_processing/event_multiplexer.py
@@ -124,6 +124,7 @@ class EventMultiplexer(object):
         if name in self._paths and self._paths[name] != path:
           # TODO(danmane) - Make it impossible to overwrite an old path with
           # a new path (just give the new path a distinct name)
+          import pdb; pdb.set_trace()  # XXX BREAKPOINT
           tf.logging.warning('Conflict for name %s: old path %s, new path %s',
                              name, self._paths[name], path)
         tf.logging.info('Constructing EventAccumulator for %s', path)

(You must invoke with bazel-bin/tensorboard/tensorboard --logdir /tmp/a,/tmp/b rather than bazel run tensorboard -- --logdir /tmp/a,/tmp/b.)

wchargin · 2017-07-06T20:44:42Z

For what it's worth, I used to run into this all the time, too: I would bazel run tensorboard -- --logdir /tmp/mnist,/tmp/histograms_demo,/tmp/scalars_demo,/tmp/text,/tmp/audio_demo for testing on comprehensive data sets, but this caused the collision as you describe. I worked around it by instead symlinking all those directories into a single parent directory. When you pass this as the unique logdir, everything works fine, because the root cause that @dandelionmane describes does not apply.

That said, I do recognize that this is just a workaround, and would be happy to see a clean solution to this problem. :-)

lucasb-eyer · 2017-11-26T12:58:09Z

I would love this flag to be more unix-y, such that I can run something like tensorboard --logdir experiment{1,2,3} super_new_run{5,8} and have them show up like that. (For all reasonable shells, that command translates to tensorboard --logdir experiment1 experiment2 experiment3 super_new_run5 super_new_run8)

The current syntax gets extremely tiring when I have a folder with many runs and need to start tensorboard with few individual runs to compare (which I need to do because tensorboard loads runs so slowly). The alternative with a folder with symlinks is just as tiring for my use-case.

SPP3000 · 2019-08-06T12:50:07Z

Is there a new way of get around this issue in tensorflow2.0-beta?

wchargin · 2019-08-06T15:50:58Z

I would love this flag to be more unix-y […]

I would love that, too.

Is there a new way of get around this issue in tensorflow2.0-beta?

Nothing new, no, though if you’re willing to add a shell function to
your .bashrc you can always automate the tiring symlink-directory
behavior:

# usage: multitb LOGDIR [LOGDIR...]
# runs TensorBoard with a comma-separated list of all provided logdirs
multitb() (
    set -eu
    if [ $# -eq 0 ]; then
        printf >&2 'fatal: provide at least one logdir\n'
        return 1
    fi
    tmpdir="$(mktemp -d)"
    for arg; do
        case "${arg}" in
            /*) ln -s "${arg}" "${tmpdir}/" ;;
            *) ln -s "${PWD}/${arg}" "${tmpdir}/" ;;
        esac
    done
    exit_code=0
    \command ls -l "${tmpdir}"
    printf 'tensorboard --logdir %s\n' "${tmpdir}"
    tensorboard --logdir "${tmpdir}" || exit_code=$?
    # This really should be 'find -H "${tmpdir}" -type l -delete` to
    # account for logdirs whose names start with `.`, which should
    # behave correctly on any POSIX system, but it is much more clear
    # that the version below cannot under any circumstances delete the
    # underlying data (e.g., with non-POSIX-compliant `find`(1)).
    rm "${tmpdir}"/*
    rmdir "${tmpdir}"
    return "${exit_code}"
)

Then running multitb experiment{1,2,3} super_new_run{5,8} actually
invokes TensorBoard with a fresh logdir that has the desired experiments
linked in.

This should work in any POSIX shell (e.g., bash, zsh).

guicho271828 · 2019-12-27T07:11:50Z

I would love this flag to be more unix-y, such that I can run something like tensorboard --logdir experiment{1,2,3} super_new_run{5,8}

so far I do $(echo experiment{1,2,3} super_new_run{5,8} | sed s/ /,/g ) ... stupid but works.

If we have path/to/logdir/run1, path/to/logdir/run2, and you pass --logdir=path/to/logdir, you want the run names for run1 and run2 to be "run1" and "run2", naturally enough. But suppose there are events in path/to/logdir too. What should we call that run? Borrowing from unix conventions, we called it ".".

I don't understand ... why they are named based on the dirnames, not filenames, like event... file?

wchargin · 2019-12-27T07:54:32Z

so far I do $(echo experiment{1,2,3} super_new_run{5,8} | sed s/ /,/g ) ... stupid but works.

Works, as long as your paths don’t have any funny characters in them.
This isn’t safe in general.

why they are named based on the dirnames, not filenames, like
event... file?

Two reasons:

Often, one logical run comprises many event files. This may happen
if your job is interrupted (e.g., preempted and then restarted while
running on a cluster, or auto-restarting after a crash) or in some
distributed training topologies. In this case, the data from all
event files in a directory is, roughly speaking, concatenated as if
there were only one file.
At the API layer, the user always specifies a directory name, not a
file name. Invoking tf.summary.create_file_writer("logs") creates
a directory called logs, not a normal file.

guicho271828 · 2019-12-27T15:14:34Z

Often, one logical run comprises many event files.

I see, that seems legit.

so, for the first example in this thread tensorboard --logdir logs/foo,logs/bar above, the working answer at the moment would be "use tensorboard --logdir logs", or putting the log files into one layer deeper, like logs/bar/log and logs/foo/log and use tensorboard --logdir logs/foo,logs/bar. I was accidentally using the second approach.

wchargin · 2019-12-27T17:02:01Z

If you have directories logs/foo and logs/bar, then just point
TensorBoard at --logdir logs.

The names logs/foo and logs/bar cannot be the names of event files,
because event file names always contain the string tfevents. But if
you somehow had event files logs/foo.tfevents.123 and
logs/bar.tfevents.123, then yes, you would need to put them into
subdirectories (say, logs/foo and logs/bar) and again point
TensorBoard’s --logdir at a common ancestor directory (logs).

guicho271828 · 2019-12-28T01:02:32Z

the point is, the original author may have millions of other directories under logs/ (e.g. logs/baz1 ... logs/baz1000), and want to specifically point to logs/foo and logs/bar. Assuming a specific usage and defaults to "include all" is generally not a nice idea for a widely used software as tensorboard.

I am not misunderstanding the event files structure. I am assuming the original author has files like logs/foo/xxx.tfevents....

shwang · 2020-06-06T00:19:00Z

Here's a simple way to make this work that avoids the overlapping run names issue.

We could add another flag --log_dir_multi [log_dir1] [log_dir2] ....

It errors if the --log_dir option is also present. To process this flag, Tensorboard first makes a temporary directory $gather_dir. Then Tensorboard makes a symlinks all the log dirs in $gather_dir/[log_dir#]. Finally Tensorboard proceeds as if we ran tensorboard --log_dir $gather_dir.

The run names will look as if you ran tensorboard --log_dir . but only with the selected log dirs.

shwang · 2020-06-06T00:23:11Z

I'd be happy to try implementing this if I get a go-ahead for this design from a maintainer.

Toeplitz · 2020-11-25T08:40:31Z

The following case should also be considered:

--logdir /local/folder,gs://bucket/path

A combination of local folders and training data in GCP buckets.

teamdandelion added core:backend stat:contributions welcome labels Jul 6, 2017

lucasb-eyer mentioned this issue Nov 26, 2017

TensorBoard: Infer first argument #543

Closed

wchargin mentioned this issue Sep 6, 2019

Remove special treatment for commas and colons in logdir specifiers #2615

Closed

wchargin mentioned this issue Jan 31, 2020

Support ignoring certain directories under the logdir #70

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Passing multiple logdirs doesn't work (and raises warning about "Conflict for name") #179

Passing multiple logdirs doesn't work (and raises warning about "Conflict for name") #179

colinmorris commented Jul 4, 2017

teamdandelion commented Jul 6, 2017

wchargin commented Jul 6, 2017

wchargin commented Jul 6, 2017

lucasb-eyer commented Nov 26, 2017

SPP3000 commented Aug 6, 2019

wchargin commented Aug 6, 2019 •

edited

Loading

guicho271828 commented Dec 27, 2019

wchargin commented Dec 27, 2019

guicho271828 commented Dec 27, 2019

wchargin commented Dec 27, 2019

guicho271828 commented Dec 28, 2019

shwang commented Jun 6, 2020

shwang commented Jun 6, 2020

Toeplitz commented Nov 25, 2020 •

edited

Loading

Passing multiple logdirs doesn't work (and raises warning about "Conflict for name") #179

Passing multiple logdirs doesn't work (and raises warning about "Conflict for name") #179

Comments

colinmorris commented Jul 4, 2017

teamdandelion commented Jul 6, 2017

wchargin commented Jul 6, 2017

wchargin commented Jul 6, 2017

lucasb-eyer commented Nov 26, 2017

SPP3000 commented Aug 6, 2019

wchargin commented Aug 6, 2019 • edited Loading

guicho271828 commented Dec 27, 2019

wchargin commented Dec 27, 2019

guicho271828 commented Dec 27, 2019

wchargin commented Dec 27, 2019

guicho271828 commented Dec 28, 2019

shwang commented Jun 6, 2020

shwang commented Jun 6, 2020

Toeplitz commented Nov 25, 2020 • edited Loading

wchargin commented Aug 6, 2019 •

edited

Loading

Toeplitz commented Nov 25, 2020 •

edited

Loading