Support hparams logging to tensorboard #984

Merged

Conversation

@timothe-chaumont (Contributor) commented Jul 25, 2022

Description

  • I created an HParam data class, in the same way as the existing Figure and Image ones. It can take any number of distinct hyperparameters and metrics as input (see the sketch after this list).

  • In the .write() method of each of HumanOutputFormat, JSONOutputFormat and CSVOutputFormat, I raise a FormatUnsupportedError when an HParam value is given, e.g.

            elif isinstance(value, HParam):
                raise FormatUnsupportedError(["stdout", "log"], "hparam")
  • I added a case in TensorBoardOutputFormat to log the hparams values to tensorboard:
            if isinstance(value, HParam):
                # we don't use `self.writer.add_hparams` to have control over the log_dir
                # (`hparams` is the summary-building helper from `torch.utils.tensorboard.summary`)
                exp, ssi, sei = hparams(value.hparam_dict, metric_dict=value.metric_dict)
                self.writer.file_writer.add_summary(exp)
                self.writer.file_writer.add_summary(ssi)
                self.writer.file_writer.add_summary(sei)

As described in the code comment, I did not use self.writer.add_hparams(hparam_dict, metric_dict), provided by pytorch, but reused some of the code from that method. With add_hparams, the hparams could not be saved in the same run folder as the other logs, so metrics from the SCALARS tab could not appear in the HPARAMS tab.

  • I created one parametrized test, test_report_hparam_to_unsupported_format_raises_error(), in the same way as for the other data classes. I did not create a test_report_hparam_to_tensorboard test, as hparams logs are not seen by EventAccumulator (cf. Stack Overflow and the EventAccumulator implementation).

  • Finally, in the tensorboard section of the documentation, I added an example of a callback that uses this new code to log hyperparameters to tensorboard.
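
For reference, a minimal sketch of what the HParam data class mentioned above could look like (a hedged sketch following this description; the actual implementation may differ in its details):

    from typing import Any, Mapping


    class HParam:
        """
        Hyperparameter data class, storing hyperparameters and metrics in two dictionaries.

        :param hparam_dict: key-value pairs of hyperparameters to log
        :param metric_dict: key-value pairs of metrics to log
            (if empty, nothing is displayed in the HPARAMS tab, see below)
        """

        def __init__(self, hparam_dict: Mapping[str, Any], metric_dict: Mapping[str, float]) -> None:
            self.hparam_dict = hparam_dict
            self.metric_dict = metric_dict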

Additional information - choices made

  • As hyperparameters are key-value pairs, adding support for the csv, json, or human output formats could be a future development.

  • It is not required to pass a metric_dict to hparams() or writer.add_hparams(), but if we don't, nothing is displayed in the HPARAMS tab. I have added a warning to alert the user about that.

  • When adding metrics to metric_dict, users have two choices:

    • adding custom metrics -> only one value per run for each metric, and no plot;
    • using metrics that are already logged in the scalar section (passing a dummy value of 0) -> the user can visualize the plot of that metric in the HPARAMS tab, and the last value of that plot is displayed (cf. image below).

    It is not ideal to display the last value - rather than, for example, the best value (issue discussed here) - but I used the second option in the documentation example as I found it more relevant and intuitive.

  • In the example, I put the logic in _on_training_start(), as we only have to log the hyperparameters & metrics once, but I could also have added it in on_step(), as I did here (see the sketch after this list).
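
A hedged sketch of such a callback, in the spirit of the documentation example (the hyperparameter and metric names are illustrative, not prescribed):

    from stable_baselines3.common.callbacks import BaseCallback
    from stable_baselines3.common.logger import HParam


    class HParamCallback(BaseCallback):
        """Log hyperparameters and metrics once, at training start."""

        def _on_training_start(self) -> None:
            hparam_dict = {
                "algorithm": self.model.__class__.__name__,
                "learning rate": self.model.learning_rate,
                "gamma": self.model.gamma,
            }
            # Metrics already logged in the SCALARS tab: the dummy value 0 is
            # overwritten as soon as real values are recorded, and their plots
            # then also appear in the HPARAMS tab.
            metric_dict = {
                "rollout/ep_len_mean": 0,
                "train/value_loss": 0.0,
            }
            # Exclude the formats that raise FormatUnsupportedError for HParam
            self.logger.record(
                "hparams",
                HParam(hparam_dict, metric_dict),
                exclude=("stdout", "log", "json", "csv"),
            )

        def _on_step(self) -> bool:
            return True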

[Screenshot: tensorboard HPARAMS tab]

Motivation and Context

closes #428

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

@araffin araffin self-requested a review July 29, 2022 18:13
@araffin araffin self-assigned this Jul 29, 2022
@@ -389,6 +414,13 @@ def write(self, key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, T
            if isinstance(value, Image):
                self.writer.add_image(key, value.image, step, dataformats=value.dataformats)

            if isinstance(value, HParam):
                # we don't use `self.writer.add_hparams` to have control over the log_dir
                exp, ssi, sei = hparams(value.hparam_dict, metric_dict=value.metric_dict)
Member commented:

please use explicit names, what is ssi? sei?

@timothe-chaumont (Contributor, Author) replied Aug 17, 2022:

It's done:

    experiment, session_start_info, session_end_info = hparams(value.hparam_dict, metric_dict=value.metric_dict)

The content and meaning of those variables are described in the hparams docstring. (The initial short names were the ones used in the pytorch SummaryWriter class.)

@@ -296,6 +297,19 @@ def test_report_figure_to_unsupported_format_raises_error(tmp_path, unsupported_
    writer.close()


@pytest.mark.parametrize("unsupported_format", ["stdout", "log", "json", "csv"])
Member commented:

is the new feature also tested somewhere?
currently only the failure case is tested?

@timothe-chaumont (Contributor, Author) replied Aug 17, 2022:

I tested the new feature in practice, but not programmatically. It is not as easy as for the other types of logs (e.g. images):

I did not create a test_report_hparam_to_tensorboard test, as hparams logs are not seen by EventAccumulator (cf. Stack Overflow and the EventAccumulator implementation).

Two solutions are proposed on that Stack Overflow page, but one requires an external library and the other is not very clean.

Member replied:

more than reading the logged hparams, we should at least run the logger.
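
For illustration, a smoke test along those lines might look like the following (a hedged sketch, assuming the make_output_format helper and the write() signature used by the existing logger tests):

    import pytest

    from stable_baselines3.common.logger import HParam, make_output_format


    def test_report_hparam_to_tensorboard(tmp_path):
        # Only run the logger and check that nothing raises: the logged hparams
        # are not read back, since EventAccumulator does not expose them.
        pytest.importorskip("torch.utils.tensorboard")
        writer = make_output_format("tensorboard", tmp_path)
        hparam = HParam({"learning rate": 3e-4}, {"train/value_loss": 0.0})
        writer.write({"hparam": hparam}, key_excluded={"hparam": ()})
        writer.close()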

@araffin araffin changed the title Support hparams logging to tensorboad Support hparams logging to tensorboard Aug 16, 2022
@araffin (Member) left a comment:

LGTM, thanks =)

Successfully merging this pull request may close these issues:

Also log hyperparameters to the tensorboard