[Roofline] Add fma (non-tensorcore) peak flops for CUDA #13419

tkonolige · 2022-11-17T18:04:24Z

Compute peak flops using fma instructions on CUDA targets. Supports arbitrary datatypes.
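For readers unfamiliar with the approach, here is a minimal sketch of the arithmetic behind an FMA-based peak estimate. It is an illustration only, not the code in this PR. It assumes the benchmark times a kernel that issues nothing but back-to-back FMAs and counts each FMA as two floating-point operations (one multiply, one add) per vector lane. The name `fma_peak_flops` and all of its parameters are hypothetical.

```python
def fma_peak_flops(
    num_threads: int,
    fmas_per_thread: int,
    elapsed_s: float,
    vector_lanes: int = 1,
) -> float:
    """Estimate peak FLOP/s from a timed run of FMA instructions.

    Hypothetical helper for illustration; not part of TVM's API.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    # One FMA = one multiply + one add, so it counts as 2 FLOPs per lane.
    total_flops = 2 * num_threads * fmas_per_thread * vector_lanes
    return total_flops / elapsed_s


# Example: 1024 threads each issue 1000 scalar FMAs in half a second.
estimate = fma_peak_flops(num_threads=1024, fmas_per_thread=1000, elapsed_s=0.5)
print(f"{estimate:.3e} FLOP/s")
```

The datatype does not appear in the formula. Under this assumption it only changes how many lanes one instruction processes and how fast the hardware executes it, so the same count could apply to any datatype.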

tvm-bot · 2022-11-17T18:04:28Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

No users to tag found in teams: roofline (see #10317 for details)
Built docs for commit 5b5f025 can be found here.

Generated by tvm-bot

Commit 5b5f025: [Roofline] Add fma (non-tensorcore) peak flops for CUDA

tkonolige requested a review from AndrewZhaoLuo November 17, 2022 18:04

AndrewZhaoLuo approved these changes Nov 21, 2022

AndrewZhaoLuo merged commit b419c4b into apache:main Nov 21, 2022

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022

Commit 557464b: [Roofline] Add fma (non-tensorcore) peak flops for CUDA (apache#13419)

leandron mentioned this pull request Feb 1, 2023

TVM v0.11.0 Release Candidate Notes #13899 (Closed)
