
Conversation


@JewelRoam JewelRoam commented Aug 27, 2025

Introduce a GPU timer:

Integrated the KernelBench evaluation mechanism: when testing on a CUDA device (--device cuda), the script now uses the torch.cuda.Event-based time_execution_with_cuda_event function for timing.
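The event-based timing pattern might look roughly like the sketch below. The actual time_execution_with_cuda_event implementation lives in KernelBench; the function body and the CPU fallback here are illustrative assumptions, not the PR's exact code.

```python
import time


def time_execution_with_cuda_event_sketch(fn, trials=5):
    """Sketch of torch.cuda.Event-based timing (hypothetical body);
    falls back to wall-clock timing when torch/CUDA is unavailable."""
    try:
        import torch
        use_cuda = torch.cuda.is_available()
    except ImportError:
        use_cuda = False

    elapsed_ms = []
    for _ in range(trials):
        if use_cuda:
            start = torch.cuda.Event(enable_timing=True)
            end = torch.cuda.Event(enable_timing=True)
            start.record()
            fn()
            end.record()
            torch.cuda.synchronize()  # wait for queued kernels to finish
            elapsed_ms.append(start.elapsed_time(end))  # milliseconds
        else:
            t0 = time.perf_counter()
            fn()
            elapsed_ms.append((time.perf_counter() - t0) * 1000.0)
    return elapsed_ms
```

Recording events on the stream and synchronizing before reading elapsed_time is what makes GPU timing accurate: kernel launches are asynchronous, so a plain wall-clock timer would mostly measure launch overhead.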

For CPU tests, the original naive_timer mechanism is retained, with GPU-style trial-by-trial detailed output added so that the log format is consistent across both modes.
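A minimal sketch of that trial-by-trial CPU timing, assuming a per-trial log line (the naive_timer name comes from the PR; the body and log format here are illustrative):

```python
import time


def naive_timer(fn, trials=5):
    """Wall-clock CPU timer printing one line per trial,
    mirroring the GPU mode's per-trial log style (assumed format)."""
    results = []
    for i in range(trials):
        t0 = time.perf_counter()
        fn()
        elapsed_ms = (time.perf_counter() - t0) * 1000.0
        results.append(elapsed_ms)
        print(f"[trial {i + 1}/{trials}] {elapsed_ms:.3f} ms")
    return results
```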

Add multi-trial timing:

The number of timing trials (--trials) is now a command-line argument (default 5), so users can configure it to match their evaluation needs, from quick debugging to in-depth analysis.
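The flag wiring could be as simple as the following argparse sketch (flag names --device, --trials, and --output-dir are from the PR description; choices and help strings are assumptions):

```python
import argparse

parser = argparse.ArgumentParser(description="performance evaluation (sketch)")
parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu",
                    help="device to run timing on")
parser.add_argument("--trials", type=int, default=5,
                    help="number of timing trials (default: 5)")
parser.add_argument("--output-dir", type=str, default=None,
                    help="directory for structured JSON output")

# parse an example command line instead of sys.argv
args = parser.parse_args(["--device", "cuda", "--trials", "10"])
```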

Add structured export:

The --output-dir command-line argument stores the performance data measured for each model/compiler configuration combination in a dedicated JSON file for later analysis.
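One way such an export could be structured is sketched below; the record layout, file-naming scheme, and the export_results helper are assumptions for illustration, not the PR's exact schema:

```python
import json
import os
import tempfile


def export_results(output_dir, config, timings_ms):
    """Write one JSON record per model/compiler configuration
    (hypothetical schema: config dict, raw timings, mean)."""
    os.makedirs(output_dir, exist_ok=True)
    record = {
        "config": config,  # e.g. model name, compiler, device
        "trials": len(timings_ms),
        "timings_ms": timings_ms,
        "mean_ms": sum(timings_ms) / len(timings_ms),
    }
    fname = f"{config['model']}_{config['compiler']}.json"
    path = os.path.join(output_dir, fname)
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
    return path


# usage: one file per configuration, ready for later aggregation
out_path = export_results(tempfile.mkdtemp(),
                          {"model": "resnet50", "compiler": "cinn"},
                          [1.2, 1.1, 1.3])
```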

@paddle-bot

paddle-bot bot commented Aug 27, 2025

Thanks for your contribution!

@JewelRoam JewelRoam closed this Aug 28, 2025
@JewelRoam JewelRoam deleted the feature branch August 28, 2025 07:43
@JewelRoam JewelRoam restored the feature branch August 28, 2025 07:43
@JewelRoam JewelRoam reopened this Aug 28, 2025
@lixinqi lixinqi merged commit 6f7b3a7 into PaddlePaddle:develop Aug 29, 2025
3 checks passed
JewelRoam added a commit to JewelRoam/GraphNet that referenced this pull request Oct 29, 2025
* CONTRIBUTE_TUTORIAL_cn.md

* Handle big int tensors by converting to sparse COO

* Update utils

* Update utils

* Update utils

* Update utils

* Update utils

* Update paddle test compiler

* Add compilation_duration display

* resolve conflict

* Update test compiler

* feat: add stuctured json output in test compiler

* revert paddle test compiler

* rename 16:36

* rename performance_eval.py

* rename performance_eval.py

* Optimize configuration record in .json

* Optimize configuration record in .json

* Optimize configuration record in .json

* Optimize configuration record in .json

* Correct warmup

* Update

* Update

* Update
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
