Add global_step parameter to SummaryWriter.add_hparams #109572

Closed

Conversation

ringohoffman
Contributor

@ringohoffman ringohoffman commented Sep 19, 2023

Fixes #37738, where hparam metrics can only be plotted at step 0. This adds a global_step parameter to SummaryWriter.add_hparams and is essentially a resubmission of #50653.

before:
Screenshot 2023-09-18 at 8 09 13 PM

after:
Screenshot 2023-09-18 at 7 56 52 PM

@ngimel @J0Nreynolds @ezyang @albanD
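
For readers who want to see the change in use, here is a minimal sketch of logging hparam metrics at successive steps. It assumes a writer object whose `add_hparams` accepts `run_name` and `global_step` keyword arguments, as `torch.utils.tensorboard.SummaryWriter` is expected to after this PR. The helper name `log_hparams_per_step`, the default run name, and the metric keys are hypothetical and only for illustration.

```python
from typing import Any, Iterable, Mapping


def log_hparams_per_step(
    writer: Any,
    hparams: Mapping[str, Any],
    metrics_by_step: Iterable[Mapping[str, float]],
    run_name: str = "hparams",  # hypothetical default, not from the PR
) -> int:
    """Log one set of hyperparameters with fresh metric values per step.

    `writer` is assumed to expose
    add_hparams(hparam_dict, metric_dict, run_name=..., global_step=...).
    Returns the number of steps logged.
    """
    count = 0
    for step, metrics in enumerate(metrics_by_step):
        # Passing the same run_name on every call is meant to keep all steps
        # in one sub-run rather than one timestamped directory per call.
        writer.add_hparams(
            dict(hparams),
            dict(metrics),
            run_name=run_name,
            global_step=step,
        )
        count += 1
    return count
```

With PyTorch and TensorBoard installed, the helper would presumably be called as `log_hparams_per_step(SummaryWriter("runs/demo"), {"lr": 0.01}, [{"hparam/loss": 0.9}, {"hparam/loss": 0.5}])`, which should place the two loss values at steps 0 and 1 instead of both at step 0.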

Fixes pytorch#37738 where all hparam metrics plots are plotted at step 0
@pytorch-bot

pytorch-bot bot commented Sep 19, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109572

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 27e800b with merge base a6d34c6:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

soulitzer added the triaged label (This issue has been looked at a team member, and triaged and prioritized into an appropriate module) Sep 20, 2023
@ezyang
Contributor

ezyang commented Sep 21, 2023

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Sep 21, 2023
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@ringohoffman
Contributor Author

@ezyang Can you add a release notes tag?

ezyang added the release notes: visualization (release notes category) and topic: new features (topic category) labels Sep 21, 2023
@ezyang
Contributor

ezyang commented Sep 21, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-focal-rocm5.6-py3.8 / test (default, 3, 3, linux.rocm.gpu)

Details for Dev Infra team Raised by workflow job

@ringohoffman
Contributor Author

Can the 1 failing job be retried? The failure looks unrelated.

=================================== FAILURES ===================================
____ TestCommonCUDA.test_python_ref__refs__conversions_float_cuda_complex64 ____
Traceback (most recent call last):
  File "test_ops.py", line 470, in test_python_ref
    self._ref_test_helper(lambda: TorchRefsMode(strict=True), device, dtype, op)
  File "test_ops.py", line 453, in _ref_test_helper
    self.assertTrue(ref_distance <= torch_distance, msg=msg)
  File "/opt/conda/envs/py_3.8/lib/python3.8/unittest/case.py", line 765, in assertTrue
    raise self.failureException(msg)
AssertionError: tensor(False, device='cuda:0') is not true : Reference result was farther (201.09133911132812) from the precise computation than the torch result was (0.0)!

@ezyang
Contributor

ezyang commented Sep 22, 2023

@pytorchbot merge -f "spurious fail"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Labels
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
open source
release notes: visualization (release notes category)
topic: new features (topic category)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
Development

Successfully merging this pull request may close these issues.

Allow user to update metrics in Tensorboard SummaryWriter.add_hparam
5 participants