-
Notifications
You must be signed in to change notification settings - Fork 207
Merge plugin/0.2.1 code into master #345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* remove the Record Window End event if the duration is larger than 1 day to workaround the high resolution clock bug in pytorch profiler * remove debug print
update doc and tooltip
* add notes of python version in case of circular import bug * update wording
… is available in Tensorboard 2.0 (#323)
* workaround for negative gpu metrics from input json file * refine
This commit makes `walk`-ing directories follow symlinks when searching for run data (on local filesystems, where it's supported!). This makes the plugin's search behavior consistent with that of tensorboard itself; using symlink trees to organize runs is one of the recommendations made in the tensorboard docs to have fine-grained control over the naming of runs and the location of the data [1]: > TensorBoard walks log directories recursively; for finer-grained > control, prefer using a symlink tree. A unit test is added to validate the new behavior. [1] https://github.com/tensorflow/tensorboard/blob/master/README.md#logdir--logdir_spec-legacy-mode
* return loading run status to frontend to fix a couple of bugs * Add message when logdir has no runs (#343) * Add message when logdir has no runs * Correct Typography import * fix test failure * remove dead code Co-authored-by: TomWildenhain-Microsoft <67606533+TomWildenhain-Microsoft@users.noreply.github.com>
| For example, a kernel with only one thread per block can’t fully utilize each SM. | ||
|
|
||
| * Est. Achieved Occupancy: The bigger, the better. The definition of occupancy is [here](https://docs.nvidia.com/gameworks/content/developertools/desktop/analysis/report/cudaexperiments/kernellevel/achievedoccupancy.htm). | ||
| * Est. Achieved Occupancy: For most cases such as memory bandwidth bounded kernels, the higher the better. [Reference](http://developer.download.nvidia.com/GTC/PDF/GTC2012/PresentationPDF/S0514-GTC2012-GPU-Performance-Analysis.pdf). The definition of occupancy is [here](https://docs.nvidia.com/gameworks/content/developertools/desktop/analysis/report/cudaexperiments/kernellevel/achievedoccupancy.htm). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The improvements in performance does not normally scale linearly after some point, but not sure how to best express this in a clear way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we will consider more clear description in next release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to fix a couple of linter warnings so I changed this as well to "a higher value often translates to better performance, especially when the starting value is very low." Sounds OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is ok
| on_trace_ready=torch.profiler.tensorboard_trace_handler('./result', worker_name='worker0'), | ||
| record_shapes=True, | ||
| profile_memory=True, | ||
| profile_memory=True, # This will take 1 to 2 minutes. Setting it to False could greatly speedup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we also add a similar comment to with_stack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do in next release.
|
@gdankel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
merge 0.2.1 bug fixes into master