TensorBoard.dev fails when more than one model graph in the same run. #5188
Comments
Thanks for the report and sorry you are having trouble! I'm able to see your experiment, but I only see the graph. Looking in our logs it doesn't look like there are any failures associated with this experiment. It sounds like you are saying that the upload also fails when you try to upload using a local logdir. Is it possible for you to share this logdir with us so we can try to reproduce the error on our end? If not, can you try running tensorboard locally on the logdir, and let us know if it works?
Hi @bileschi. First of all, regarding my comment on the images and TensorBoard.dev, I have seen that the development has been paused (#3585 (comment)). You can see the tfevents files I am pointing to in the issue folder here: https://github.com/rcruzgar/github_uploads/tree/master/issue
I have noticed that pointing to the subdirectory successfully generates a TensorBoard.dev experiment (https://tensorboard.dev/experiment/22i1FrDGRt2fwPo0lr51dA/):
However, when I only point to issue as
nothing is uploaded (https://tensorboard.dev/experiment/yVnKXlEgTVaRHYPjXUqYCw/). Isn't it possible to provide subfolders to tensorboard.dev? Cheers,
Yeah, pointing at the root folder should work. Thanks for sharing your logdir. Let me check it out and see if I can replicate on my end.
Ok, I can replicate your issue. Digging in to see if I can find the root cause. I suspect it has something to do with the uploader tripping over the images even though it's supposed to ignore them.
In the meantime, as a workaround, you can specify the limited allowlist of plugins you want to support (sorry no images) like so:
FYI to others on the TensorBoard team: I noticed that the upload fails when the plugins setting includes graphs, like so
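The original command for the allowlist workaround did not survive in this copy of the thread. As a minimal sketch only, assuming the logdir is the local issue folder and that scalars is the only plugin wanted, the upload could be restricted with the uploader's --plugins flag. The function name upload_with_plugins and the DRY_RUN switch are hypothetical helpers added for illustration; by default the script only prints the command.

```shell
#!/bin/sh
# Sketch of the plugin-allowlist workaround. The logdir "issue" and the plugin
# list "scalars" are placeholder assumptions, not the commenter's exact command.
# By default (DRY_RUN=1) the command is only printed; set DRY_RUN=0 to run it.

upload_with_plugins() {
  # $1 = logdir, $2 = comma-separated plugin names to upload
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "tensorboard dev upload --logdir $1 --plugins $2"
  else
    tensorboard dev upload --logdir "$1" --plugins "$2"
  fi
}

# Leaving "graphs" out of the list avoids the failing graph upload.
upload_with_plugins issue scalars
```

Listing graphs in the plugin list would presumably reproduce the failure described in this comment, so the sketch leaves it out.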
Ok, narrowing down what's going on here: it appears that the event file in the root dir contains more than one graph. Local TensorBoard handles this cleanly by only displaying the most recent one. TensorBoard.dev, however, throws an error. I found this by exploring the event file using the
Allow me to update the name of this bug to reflect the underlying issue. In the meantime, if you don't need to view the graphs, you can use the workaround by specifying
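The tool used to explore the event file is cut off in this copy of the thread, so the following is not necessarily what was used. One possible way to look inside an event file is TensorBoard's --inspect mode, which prints a per-category summary of the events it finds. The event file name below is a made-up placeholder, inspect_event_file is a hypothetical helper, and the exact output format should be checked locally.

```shell
#!/bin/sh
# Sketch: summarize the contents of one event file with TensorBoard's
# inspect mode. The path used below is a hypothetical placeholder.
# By default (DRY_RUN=1) the command is only printed; set DRY_RUN=0 to run it.

inspect_event_file() {
  # $1 = path to a single tfevents file
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "tensorboard --inspect --event_file $1"
  else
    tensorboard --inspect --event_file "$1"
  fi
}

inspect_event_file issue/events.out.tfevents.EXAMPLE
```

If the graph section of the summary reports more than one step, that would be consistent with the multiple-graph situation described above.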
Thank you, your solution seems to work for one experiment, even specifying a log from an AWS S3 bucket:
Just to mention that the --one_shot flag doesn't work when using data from S3 as the logdir. I can do this without trouble, even visualizing the images:
(I have uploaded another log to https://github.com/rcruzgar/github_uploads as issue2) Using data from S3 as well:
The problem comes when I want to use tensorboard dev upload with data from S3 and two logdirs. I have tried the following options, with no success:
Doesn't upload anything: https://tensorboard.dev/experiment/1u6OwPkjRdG4jbgfe2Glug/. It seems that --logdir with more than one specified logdir doesn't work. So I tried with --logdir_spec:
But it seems that dev upload doesn't recognize --logdir_spec:
Do you have any recommendation?
Awesome, I'm glad we are making a little progress here. And thanks for clearly walking through your analysis. It sounds like we have two TensorBoard.dev issues here. I have created two new issues to track those independently.
I'm not sure what we can do to solve your immediate problem, where you would like to view two separate S3 logdirs on the same TensorBoard.dev view. Probably best in the short term to do the download-locally-and-upload-from-there dance. Note that we are unable to support S3 as cleanly as we support other filesystems, due to some peculiarities in their behavior (see #4786, #4255, pull/38203). TensorBoard's ability to read from S3 is a community-supported contribution (we don't ourselves test it or verify it works).
Thank you, @bileschi. I will download it locally for the moment. Cheers.
This should now be fixed. Can you please test again to make sure it works, @rcruzgar? If there is still a problem, please re-open this issue. Thanks!
Hi @bileschi, it works now without specifying the --plugins flag. Thanks!
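The download-then-upload approach suggested here could be sketched as follows, assuming the AWS CLI is installed and configured. The staging directory ./tb_logs is a hypothetical name, the bucket paths and experiment names are the placeholders from the original report, and run is a small helper added for illustration that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: copy two S3 logdirs into one local directory, then upload that
# directory so both appear as runs in a single TensorBoard.dev experiment.
# LOCAL_ROOT is a hypothetical staging directory; bucket paths are placeholders.

run() {
  # Print the command by default; execute it only when DRY_RUN=0.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

LOCAL_ROOT="./tb_logs"

# Each S3 subfolder is copied into its own subdirectory of LOCAL_ROOT.
run aws s3 sync s3://bucket_name/subfolder1/ "$LOCAL_ROOT/NAME_EXP1"
run aws s3 sync s3://bucket_name/subfolder2/ "$LOCAL_ROOT/NAME_EXP2"

# Upload the combined local logdir once and exit.
run tensorboard dev upload --logdir "$LOCAL_ROOT" --one_shot
```

Because the upload reads from the local filesystem, this sidesteps both the S3 reader and the missing --logdir_spec support in dev upload.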
Environment information
Issue description
Hi!
I have successfully created Tensorboard dashboards using data stored in AWS S3 Buckets this way:
AWS_REGION=eu-central-1 tensorboard --logdir s3://bucket_name/subfolder/
Even for several experiments, using the --logdir_spec flag like this:
AWS_REGION=eu-central-1 tensorboard --logdir_spec=NAME_EXP1:s3://bucket_name/subfolder1/,NAME_EXP2:s3://bucket_name/subfolder2/
I have tried to create a Tensorboard.dev to be shared with people, at least with one experiment, although ideally I would like to compare two at the same time:
AWS_REGION=eu-central-1 tensorboard dev upload --logdir s3://bucket_name/subfolder/
However, I get the following error message:
Only the model graph has been uploaded, but neither the scalars nor the images.
I assume that the AWS S3 credentials are properly set, because it works for normal Tensorboard.
If I make a "normal tensorboard", without the dev upload option, I get what I want:
EDIT: I have to say that tensorboard dev upload --logdir doesn't work with local data either, producing the same error.
Do you have any idea of what could be happening and how to solve this?
Thank you!
Best regards,
Rubén.