Dealing with many large event logs #1002
Comments
TensorBoard currently loads all runs into memory even if they aren't initially being displayed, so that when you select or deselect runs it can show that data in the UI immediately rather than having to crawl through the files at that point. I'm a bit surprised that you are running into memory issues, though, because TensorBoard should keep only a fixed-size sample of the loaded data in memory, not the full size of the original log directory. In the case of images, it should keep only 10 images per tag and per run:

The other thing is that with 100GB of logs in the directory, TensorBoard will take a very long time to load them (even if they fit in memory), since loading happens on a single thread and that's a lot of data to process.

I agree it'd be useful if it were smarter about prioritizing runs to load, but for now a workaround could be to create a new log directory and add to it a symlink for each run you want to show. E.g. if you have a log directory with runs like

Also, we're working on adding support for a SQLite DB backend, which should hopefully open up possibilities for loading run data only when needed and avoiding so much memory consumption, but it will still be a little while before that's ready for general use. |
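The symlink workaround described in the comment above can be sketched in Python as follows. This is only an illustration, not part of TensorBoard: the function name `link_selected_runs` and its parameters are hypothetical, and it assumes each run is a top-level subdirectory of the original log directory and that the platform allows creating symlinks.

```python
import os
from pathlib import Path


def link_selected_runs(source_logdir, view_logdir, run_names):
    """Populate view_logdir with one symlink per selected run directory.

    Hypothetical helper: assumes each name in run_names is a top-level
    subdirectory of source_logdir.
    """
    source = Path(source_logdir).resolve()
    view = Path(view_logdir)
    view.mkdir(parents=True, exist_ok=True)

    linked = []
    for name in run_names:
        target = source / name
        if not target.is_dir():
            raise FileNotFoundError(f"run directory not found: {target}")
        link = view / name
        if link.is_symlink():
            # Replace a stale link left over from an earlier selection.
            link.unlink()
        os.symlink(target, link, target_is_directory=True)
        linked.append(link)
    return linked
```

After calling, for example, `link_selected_runs("all_runs", "selected_runs", ["run_a"])` (directory and run names are placeholders), pointing `tensorboard --logdir` at the new directory should load only the linked runs, per the workaround suggested above. No event files are copied, so the extra disk usage is negligible.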
Thank you @nfelt for the information. For the experimental SQLite DB backend that you mentioned, is there an example showing how to create and load the DB given large event files? |
@gweidner We don't really have an example yet, but it should be possible to populate a sqlite DB using either the loader.cc tool (built from TF source) or write TF code that uses |
Any news? |
+1. Is this still experimental? |
Any Updates on the ETA? |
@jart from PRs I got a feeling that you're working on adding support for a SQLite DB backend. |
It seems like a very useful feature set. Is there any way we can help expedite this with contributions? |
Hi folks, thanks for your continued interest and sorry there hasn't been much news. The SQLite DB backend work ran into difficulties and has been on hold for a while, but we're still very much aware that working with large logdirs is a pain point and we're working on more flexible modes of run selection and data management to address this. |
Any updates? |
I found a workaround in my case which might be useful for others: switch from Chrome/Chromium to Mozilla Firefox. |
Adding my data point: on a 13-inch 2018 MacBook Pro, loading TensorBoard from a remote server with many large event logs (a few hundred MB in total), Chrome hangs for a while every time I click on the TensorBoard page. Then I followed this suggestion and switched to Safari. Woo! It is just amazing and so responsive! |
@nfelt is there any progress on that issue? |
Any updates on this issue? SQLite backend, anybody? |
I am training a CNN and I use TensorBoard to visualize the training process and results.
Because I create lots of image summaries while training, the event log files are often about 7GB each. When I point TensorBoard to my runs directory, it seems to load all runs into memory even though none are activated in the UI. All log files in the runs directory total about 100GB, so loading everything into main memory (32GB on my system) doesn't work. Is there a way to load log files only once the runs are activated (on demand)? Am I missing something?
Thank you in advance.