Projector plugin hangs with "Fetching sprite image..." #3840

Closed
davidsoergel opened this issue Jul 15, 2020 · 11 comments

@davidsoergel
Member

Projector plugin hangs with "Fetching sprite image...", using our standard test demodir. I don't know the cause yet, or how long this has been an issue.

[Screenshot taken 2020-07-14 at 8:29:25 PM]

@psybuzz
Contributor

psybuzz commented Jul 15, 2020

Using the team's demo logdir, I can reproduce this error using an older version from ~July 8th.

The sprite image request fails with a 400, and surprisingly it takes 10+ minutes before the network error appears. For some reason, fetching any run or tensor data takes a long time.

image

@psybuzz
Contributor

psybuzz commented Jul 15, 2020

The projector plugin backend responds with:
.../demodir/__debugger_data__/sprite.png" does not exist or is directory
Still investigating, but adding @hfiller and @caisq for projector + debugger expertise.

The demo logdir contains both projector data and debugger v1 summaries (inferred from the __debugger_data__ run appearing in the network logs). I can confirm that:

  • running the latest TB against the demo logdir stalls when Projector tries to load the __debugger_data__ run
  • running the latest TB in a separate logdir with only projector data works fine

Conclusion

  • Either the Projector backend is incorrectly trying to access Debugger v1 runs, or Debugger v1 runs are confusing the Projector backend
  • I still need to check how far back this has been happening

@psybuzz
Contributor

psybuzz commented Jul 16, 2020

Confirmed that this issue is present at least as far back as June 8, so it's not a recent regression.

@psybuzz
Contributor

psybuzz commented Jul 16, 2020

Manually bisected to the change #3653 (on June 1), which was not included in the recent TB 2.2.2 release (on May 28).

@hfiller is the PR above something we can revert safely, or is there a clear fix?

@hfiller
Contributor

hfiller commented Jul 16, 2020

This PR can safely be reverted.

psybuzz added a commit that referenced this issue Jul 16, 2020
…runs" (#3850)

This reverts commit 12f6234.

The changes in [1] allowed TensorBoard to use the root directory's
projector config and apply it to each subdirectory.

If the root dir's config specifies a "sprite" that does not exist in every
subdirectory, it causes the Projector frontend to hang when loading a run.
See #3840 for details on
the issue.

Confirmed that this fixes the issue by running TB on a directory with:

- ./projector_config.pbtxt
```
embeddings {
  tensor_name: "EMNIST_Letters:0"
  metadata_path: "metadata.tsv"
  sprite {
    image_path: "sprite.png"
    single_image_dim: 28
    single_image_dim: 28
  }
}
```
- alongside a subdirectory `profile_demo_new` that contains a `.profile-empty` event file but no `sprite.png` file

[1] #3653
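
One way to check a logdir for the condition described in the commit message above is to list the subdirectories that lack the sprite file named in the root config. This is only a sketch: `runs_missing_sprite` is a hypothetical helper, not part of TensorBoard, and it assumes that each run is an immediate subdirectory of the logdir and that the sprite path is a plain file name such as `sprite.png`.

```python
from pathlib import Path


def runs_missing_sprite(logdir, sprite_name="sprite.png"):
    """Return names of immediate subdirectories that have no sprite file.

    Hypothetical helper for illustration; `sprite_name` is assumed to match
    the `image_path` value in the root projector_config.pbtxt.
    """
    root = Path(logdir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and not (child / sprite_name).is_file()
    )
```

Calling `runs_missing_sprite("path/to/logdir")` on the directory layout from the commit message should report `profile_demo_new`, since that subdirectory has no `sprite.png`.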
@juliagong

Hi! I'm having this issue in tensorboard (~28,000 images and embeddings to be displayed in the projector). I'm using tensorboard version 1.15.0 and am wondering what the issue could be? The loading of tensors and metadata is fast as usual, and then it hangs on "Fetching sprite image...". This didn't happen when I had only 128 images and embeddings. Thanks!

@luuzk

luuzk commented Mar 29, 2021

@juliagong Same here, and the same issue occurs even for very small (~500 KB) sprite images. There seems to be a problem with large numbers of images (e.g., 16,000 in my case). A fix would be nice!

@pindinagesh pindinagesh self-assigned this Jan 19, 2022
@pindinagesh

@davidsoergel

Could you please have a look at this similar issue and let us know if it helps? Thanks

@pindinagesh

@davidsoergel

Closing this issue due to inactivity. Please feel free to reopen if this still exists. Thanks

@sgbaird

sgbaird commented Dec 18, 2022

@JZL

JZL commented May 15, 2023

This could be entirely unrelated to the above, but after digging into it myself, it looks like a limit on maximum image size in Google Chrome, which causes a cascading failure when the web framework tries to load the sprite.

I didn't find the exact pixel limit (this says 16384x16384), but in the Network tab of Chrome DevTools you could see it happily downloading the multi-MB jpg, while the Preview tab showed the sprite as an invalid image. The way to be certain is to download the image and try to load it locally in Chrome to see if it fails.

When I made the image smaller in terms of pixels (not even in terms of file size), it worked.
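
A rough way to estimate whether a sprite sheet would run into such a limit is to compute its side length from the number of thumbnails. The sketch below assumes the thumbnails are packed into a square grid, and it treats the 16384 figure mentioned above as a placeholder limit rather than a verified browser constant. Both function names are hypothetical.

```python
import math


def sprite_side_px(num_images: int, thumb_px: int) -> int:
    """Side length (pixels) of a square-grid sprite holding num_images thumbnails."""
    # Smallest grid size g such that g * g >= num_images (integer math only).
    grid = math.isqrt(num_images - 1) + 1
    return grid * thumb_px


def largest_thumb_px(num_images: int, max_side_px: int = 16384) -> int:
    """Largest thumbnail size (pixels) that keeps the sprite within max_side_px.

    The default of 16384 is the unverified figure quoted in the thread.
    """
    grid = math.isqrt(num_images - 1) + 1
    return max_side_px // grid
```

For example, `sprite_side_px(28000, 28)` gives 4704, and `largest_thumb_px(16000)` gives 129, so under these assumptions 16,000 thumbnails of 128 px would sit just under the quoted limit, while anything larger per thumbnail would exceed it.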

It could be nice to have a folder of images and dynamically load the image on click/hover, so I can use higher res individual images, but I don't see how to.
