use nvtx push pop in timeline #30567

Merged
merged 8 commits into from
Jan 20, 2021

Conversation

wanghuancoder
Contributor

@wanghuancoder wanghuancoder commented Jan 19, 2021

PR types

Others

PR changes

Others

Describe

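Background

A *.qdrep file generated by nsys is the timeline natively supported by NVIDIA. It is precise and can make up for some shortcomings of the profiler timeline.
However, the kernel names in an ordinary *.qdrep file are hard to read, so it is difficult to tell from a kernel name which Paddle OP it belongs to, or to map it to a location in the model code.
If OP names could be tagged onto the timeline, and if calling an API from the model could add markers to the timeline, performance debugging would be much easier.

Main features

Use NVTX to improve what the NVIDIA timeline can show and make performance debugging easier.
This PR mainly uses two NVTX APIs, nvtxRangePush and nvtxRangePop. One push plus one pop places a "tag" on the timeline that marks where the code is executing.
This helps a lot in mapping kernels in the NVIDIA timeline to our OPs and in analyzing how a problematic kernel relates to the model code.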

Four internal APIs added

  • core.nvprof_nvtx_push: calls nvtxRangePush.
  • core.nvprof_nvtx_pop: calls nvtxRangePop.
  • core.nvprof_enable_record_event: adds nvtxRangePush to the constructor of every Profiler RecordEvent and nvtxRangePop to every destructor.
  • core.nvprof_disable_record_event: stops hooking RecordEvent events.
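Because every push needs a matching pop, one convenient way to use the first two APIs is to wrap them in a context manager. The sketch below is not part of this PR: `nvtx_range` is a hypothetical helper, and `FakeCore` is a stand-in that only records calls so the example runs without Paddle or a GPU. With a real build, the `core` module that exposes `nvprof_nvtx_push` and `nvprof_nvtx_pop` would be passed in instead.

```python
from contextlib import contextmanager


class FakeCore:
    """Hypothetical stand-in for Paddle's `core` module; it only records calls."""

    def __init__(self):
        self.calls = []

    def nvprof_nvtx_push(self, name):
        self.calls.append(("push", name))

    def nvprof_nvtx_pop(self):
        self.calls.append(("pop",))


@contextmanager
def nvtx_range(core, name):
    """Open an NVTX range on entry and close it on exit, even if the body raises."""
    core.nvprof_nvtx_push(name)
    try:
        yield
    finally:
        core.nvprof_nvtx_pop()


fake = FakeCore()
with nvtx_range(fake, "forward"):
    with nvtx_range(fake, "matmul"):
        pass  # the work to be tagged would run here
```

Nested `with` blocks produce nested ranges, and the `finally` clause keeps pushes and pops balanced when the tagged code raises.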

Usage demo

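import sys
from paddle.fluid import core  # assumed import path for `core`

# train_loader is the model's own data loader, defined elsewhere.
for iter_id, data in enumerate(train_loader):
    if iter_id == 100:
        # Start capturing and open the range for the first profiled iteration.
        core.nvprof_start()
        core.nvprof_enable_record_event()
        core.nvprof_nvtx_push(str(iter_id))
    if iter_id == 110:
        # Close the last range, stop capturing, and exit.
        core.nvprof_nvtx_pop()
        core.nvprof_stop()
        sys.exit()
    if 100 < iter_id < 110:
        # Close the previous iteration's range and open a new one.
        core.nvprof_nvtx_pop()
        core.nvprof_nvtx_push(str(iter_id))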
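To check that the pushes and pops in the demo loop balance, the same control flow can be run against a recording stub. This is only a sketch: `RecordingCore` and `run_demo` are hypothetical names, the stub does no profiling, and `run_demo` returns where the demo calls `sys.exit()`.

```python
class RecordingCore:
    """Hypothetical stub that logs the calls the demo makes on `core`."""

    def __init__(self):
        self.calls = []

    def nvprof_start(self):
        self.calls.append("start")

    def nvprof_stop(self):
        self.calls.append("stop")

    def nvprof_enable_record_event(self):
        self.calls.append("enable_record_event")

    def nvprof_nvtx_push(self, name):
        self.calls.append("push:" + name)

    def nvprof_nvtx_pop(self):
        self.calls.append("pop")


def run_demo(core, num_iters, first=100, last=110):
    """Replay the demo's control flow over `num_iters` iterations."""
    for iter_id in range(num_iters):
        if iter_id == first:
            core.nvprof_start()
            core.nvprof_enable_record_event()
            core.nvprof_nvtx_push(str(iter_id))
        if iter_id == last:
            core.nvprof_nvtx_pop()
            core.nvprof_stop()
            return  # the demo calls sys.exit() at this point
        if first < iter_id < last:
            core.nvprof_nvtx_pop()
            core.nvprof_nvtx_push(str(iter_id))


rec = RecordingCore()
run_demo(rec, 200)
pushes = [c for c in rec.calls if c.startswith("push:")]
pops = [c for c in rec.calls if c == "pop"]
```

Each of iterations 100 through 109 opens one range, and each range is closed at the start of the following iteration, so `pushes` and `pops` end up the same length.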

When running, use a command like the following:

nsys profile -t cuda,nvtx -o my_report --capture-range=cudaProfilerApi --stop-on-range-end=true python train.py

The resulting timeline looks like this:

image

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@@ -131,6 +131,7 @@ struct RecordEvent {
~RecordEvent();

bool is_enabled_{false};
bool is_pushed{false};
Contributor

Suggested change
- bool is_pushed{false};
+ bool is_pushed_{false};

Contributor Author

Done, thx!
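The PR does not spell out why RecordEvent needs a per-object pushed flag. One plausible reading, offered here as an assumption, is that an event constructed before the hook is enabled must not pop a range it never pushed. The Python model below illustrates that idea only. `RecordEventModel`, `Hook`, and `FakeNvtx` are hypothetical names, and `close()` stands in for the C++ destructor.

```python
class FakeNvtx:
    """Hypothetical stand-in that records NVTX range calls."""

    def __init__(self):
        self.calls = []

    def range_push(self, name):
        self.calls.append(("push", name))

    def range_pop(self):
        self.calls.append(("pop",))


class Hook:
    """Models the switch toggled by nvprof_enable/disable_record_event."""

    enabled = False


class RecordEventModel:
    """Pushes a range on construction if the hook is on; pops only its own push."""

    def __init__(self, name, nvtx, hook):
        self._nvtx = nvtx
        self._is_pushed = False
        if hook.enabled:
            nvtx.range_push(name)
            self._is_pushed = True

    def close(self):
        # Stand-in for the destructor: pop only if this object pushed.
        if self._is_pushed:
            self._nvtx.range_pop()
            self._is_pushed = False


nvtx = FakeNvtx()
hook = Hook()
early = RecordEventModel("created_before_enable", nvtx, hook)
hook.enabled = True
late = RecordEventModel("created_after_enable", nvtx, hook)
late.close()
early.close()  # no pop is issued, because `early` never pushed
```

Without the flag, closing `early` after the hook was enabled would issue a pop with no matching push.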

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef _WIN32
Contributor

Not important, but I wonder why GetNvtxDsoHandle handles WIN32, __APPLE__, etc., while only _WIN32 is checked here.

Contributor Author

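The main reason is that on Windows I could not find the paths corresponding to nvToolsExt.h and libnvToolsExt.so, so I do not know how to include the header or load the DLL. In particular, #include <nvToolsExt.h> fails to compile in the Windows CI pipeline.
This feature is mainly used within our team on Linux, so I gave up on Windows support.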

Contributor

@zhiqiu zhiqiu left a comment

LGTM
