Allow grouping of multiple runs #376

Open
Timeroot opened this issue Aug 17, 2017 · 8 comments

Comments

@Timeroot

Is there a way to group multiple runs and display, for example, the mean or median of their success metrics? When running an experiment to show that, say, model A performs better than model B on average, it would be handy to run 5 instances of model A and 5 instances of model B and plot the results as two curves instead of 10. Based on some Googling, there seem to be other people who would like this as well, and I can't find an option to do anything like that currently.

@wchargin
Contributor

Interesting suggestion. You're correct that we don't currently have any machinery to do this.

To get you unblocked, it shouldn't be too hard to manually create aggregate runs: read in the data from each of your run files, and emit summaries that contain the mean/median/whatever of your individual runs, saving these to a brand new run.
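A minimal sketch of the aggregation step in that workaround, assuming the scalar values of one tag have already been loaded from each run into a list of `(step, value)` pairs. The function name `aggregate_runs`, the run names, and the data layout are hypothetical. Reading the event files and writing the result to a new run are left out.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_runs(runs):
    """Combine several runs of one scalar tag into per-step mean and median.

    `runs` maps a run name to a list of (step, value) pairs. This layout is
    an assumption for illustration, not a TensorBoard data structure.
    """
    values_by_step = defaultdict(list)
    for points in runs.values():
        for step, value in points:
            values_by_step[step].append(value)

    # One row per step, in increasing step order.
    return [
        {"step": step, "mean": mean(vals), "median": median(vals), "n": len(vals)}
        for step, vals in sorted(values_by_step.items())
    ]


# Made-up values for three trials of the same model.
runs = {
    "model_a_trial_1": [(0, 1.0), (1, 0.5)],
    "model_a_trial_2": [(0, 3.0), (1, 1.5)],
    "model_a_trial_3": [(0, 2.0), (1, 4.0)],
}
aggregated = aggregate_runs(runs)
```

Each row of `aggregated` could then be written out as one scalar point per statistic in the new aggregate run. Steps that are missing from some runs are averaged over the runs that do have them, which is why the row also records `n`.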

A somewhat related, but certainly distinct, request is in #300.

This isn't currently high on our priorities, but we'd be open to reviewing PRs for it!

@donamin

donamin commented Sep 20, 2017

I also think this feature would be very useful for TensorBoard. Usually you have to run each setting about 10 times to make sure that it is working. It is also the best way to report the results of each method in papers.

@sanchom

sanchom commented Oct 13, 2017

A paper that uses visualizations like this: https://arxiv.org/pdf/1709.06560.pdf

I'd like to help if no one else is working on it yet.

One thing to sort out is how to let the user indicate what part of the run name should be averaged over. For example, if the runs are named like this:

lstm_dropout-0.5_trial-0001
lstm_dropout-0.5_trial-0002
lstm_dropout-0.5_trial-0003
lstm_dropout-0.25_trial-0001
lstm_dropout-0.25_trial-0002
lstm_dropout-0.25_trial-0003

How should the user say that they'd like averaging over the trial field?
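One possible answer, sketched under assumptions: the user supplies a regular expression for the part of the run name to average over, and runs whose names match once that part is removed fall into the same group. The pattern `_trial-\d+$` and the helper `group_runs` are hypothetical, chosen to fit the example names above.

```python
import re
from collections import defaultdict

# Hypothetical pattern: the field to average over is a "_trial-<digits>" suffix.
TRIAL_FIELD = re.compile(r"_trial-\d+$")


def group_runs(run_names, pattern=TRIAL_FIELD):
    """Group run names that differ only in the part matched by `pattern`."""
    groups = defaultdict(list)
    for name in run_names:
        # Removing the matched field gives the name shared by the whole group.
        groups[pattern.sub("", name)].append(name)
    return dict(groups)


run_names = [
    "lstm_dropout-0.5_trial-0001",
    "lstm_dropout-0.5_trial-0002",
    "lstm_dropout-0.5_trial-0003",
    "lstm_dropout-0.25_trial-0001",
    "lstm_dropout-0.25_trial-0002",
    "lstm_dropout-0.25_trial-0003",
]
groups = group_runs(run_names)
```

Here `groups` has two keys, `lstm_dropout-0.5` and `lstm_dropout-0.25`, each holding its three trials.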

I think in addition to the run selector, we'd need a field for run aggregation. Then, we could use a distribution view to show the mean and standard error at each step. This wouldn't make sense to show in a relative or wall time view though.
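As a rough illustration of the statistic such a view would plot at each step, here is a sketch that computes the mean and the standard error of the mean from the values of all trials at one step. The function name and the sample numbers are made up for the example.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for one step's values."""
    n = len(values)
    if n < 2:
        # The sample standard deviation is undefined for a single value;
        # reporting 0.0 here is a convention chosen for this sketch.
        return mean(values), 0.0
    return mean(values), stdev(values) / sqrt(n)


# Values of one scalar at the same step across four trials (made-up numbers).
trial_values = [2.0, 4.0, 4.0, 6.0]
m, sem = mean_and_standard_error(trial_values)
```

Because the values are matched by step, this only works on a step axis, which is consistent with the point that a relative or wall time view would not make sense.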

@chihuahua
Member

chihuahua commented Oct 20, 2017

The upcoming custom-scalars plugin lets you create plots with custom run-tag combos ... as well as organize charts with any layout you want.
#664

Feedback very much appreciated. Also, if you want to contribute, there are still many features to build.

@Spenhouet

Spenhouet commented Apr 18, 2018

At the moment, if two events.out.tfevents.... files are in one folder, only the first shows up in TensorBoard.
I would like all such files in one folder to produce an average for every scalar.

@srinivr

srinivr commented Jul 26, 2018

@chihuahua custom-scalars looks great! Is there any chance the TensorBoard team will be able to work on what's being requested here? Averaging over several runs will be a popular request, at least among RL researchers, and will help results be reported fairly.

@balloch

balloch commented Aug 8, 2018

Also wondering if there are any updates on this. It would be incredibly beneficial to the research community to have tools that make it easier to show bounds on results. It would also increase accountability for doing this regularly in research.

@Spenhouet

I have been using a custom implementation to solve this problem for quite some time.
Since there is still no built-in function for this, I decided to clean up my own solution and release it on GitHub as a standalone tool:

https://github.com/Spenhouet/tensorboard-aggregator

This tool aggregates multiple TensorBoard runs by the max, min, mean, median and standard deviation of all scalars. The resulting aggregations are saved as new TensorBoard summaries.
The aggregations can also be saved to .csv files.


8 participants