[Roofline] Add fma (non-tensorcore) peak flops for CUDA #13419

Merged: 1 commit merged into apache:main on Nov 21, 2022

tkonolige
Contributor

Compute peak FLOPS using FMA (non-tensor-core) instructions on CUDA targets. Supports arbitrary datatypes.
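For readers unfamiliar with how an FMA-based peak is usually derived, here is a minimal sketch of the arithmetic only. It is not the code from this PR: the function names `fma_flops` and `peak_fma_flops` and their parameters are hypothetical, and it assumes the benchmark kernel executes nothing but FMA instructions and that each FMA counts as two floating-point operations.

```python
from typing import Sequence


def fma_flops(num_threads: int, fmas_per_thread: int, elapsed_seconds: float) -> float:
    """Achieved FLOP/s for a benchmark made only of FMA instructions.

    Hypothetical helper for illustration; not part of TVM.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Assumption: one fused multiply-add = 1 multiply + 1 add = 2 FLOPs.
    total_flops = 2.0 * num_threads * fmas_per_thread
    return total_flops / elapsed_seconds


def peak_fma_flops(
    timings: Sequence[float], num_threads: int, fmas_per_thread: int
) -> float:
    """Estimate peak FLOP/s from repeated runs by using the fastest run."""
    if not timings:
        raise ValueError("timings must not be empty")
    # The shortest run is the one least disturbed by noise, so it gives
    # the highest (closest to peak) throughput estimate.
    return fma_flops(num_threads, fmas_per_thread, min(timings))
```

The datatype does not appear in the formula: under these assumptions it only changes which FMA instruction the kernel issues and therefore the measured time.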

Collaborator

tvm-bot commented Nov 17, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: roofline. See #10317 for details.
  • Built docs for commit 5b5f025 can be found here.

Generated by tvm-bot

@AndrewZhaoLuo AndrewZhaoLuo merged commit b419c4b into apache:main Nov 21, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
3 participants