Support larger hidden size in Attention Cuda kernel #7002

Merged: gh-yewang merged 5 commits into master from wangye/hidden_size on Mar 15, 2021
gh-yewang (Contributor) commented Mar 13, 2021

- Parity confirmed by comparing results against the CPU implementation.

- Perf test (number of heads = 32, seq_len = 128):

| head_size | hidden_size | latency (ms) |
|-----------|-------------|--------------|
| 32        | 1024        | 0.582        |
| 48        | 1536        | 0.802        |
| 64        | 2048        | 1.110        |
| 65        | 2080        | 1.142        |
| 80        | 2560        | 1.332        |
| 96        | 3072        | 1.669        |
| 108       | 3456        | 1.930        |
| 128       | 4096        | 2.314        |
| 144       | 4608        | 2.697        |
| 160       | 5120        | 2.956        |

The new kernels apply to the rows from head_size = 65 onward.
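For anyone reading the numbers, here is a small sketch (not part of the PR) that checks the perf rows for consistency. It assumes hidden_size = number of heads × head_size, which every row satisfies, and computes a latency normalized to 1024 hidden units. The helper names `check_hidden_sizes` and `latency_per_1k_hidden` are hypothetical and exist only for this illustration. On these rows the normalized latency stays roughly between 0.53 and 0.60 ms, so the measurements scale close to linearly with hidden size, including the rows that use the new kernels.

```python
# Rows copied from the perf test above: (head_size, hidden_size, latency_ms).
NUM_HEADS = 32

ROWS = [
    (32, 1024, 0.582),
    (48, 1536, 0.802),
    (64, 2048, 1.110),
    (65, 2080, 1.142),
    (80, 2560, 1.332),
    (96, 3072, 1.669),
    (108, 3456, 1.930),
    (128, 4096, 2.314),
    (144, 4608, 2.697),
    (160, 5120, 2.956),
]


def check_hidden_sizes(rows, num_heads=NUM_HEADS):
    """Return True if every row satisfies hidden_size == num_heads * head_size."""
    return all(hidden == num_heads * head for head, hidden, _ in rows)


def latency_per_1k_hidden(rows):
    """Return (hidden_size, latency scaled to 1024 hidden units in ms) per row.

    This normalization is an illustrative metric, not one used in the PR.
    """
    return [(hidden, latency * 1024 / hidden) for _, hidden, latency in rows]


if __name__ == "__main__":
    print("hidden sizes consistent:", check_hidden_sizes(ROWS))
    for hidden, normalized in latency_per_1k_hidden(ROWS):
        print(f"{hidden:5d}  {normalized:.3f} ms per 1024 hidden units")
```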

gh-yewang requested a review from a team as a code owner March 13, 2021 00:08
gh-yewang marked this pull request as draft March 13, 2021 00:08
gh-yewang changed the title from "(WIP)Support larger hidden size in Attention Cuda kernel" to "Support larger hidden size in Attention Cuda kernel" Mar 13, 2021
gh-yewang changed the title from "Support larger hidden size in Attention Cuda kernel" to "(WIP)Support larger hidden size in Attention Cuda kernel" Mar 13, 2021
gh-yewang requested a review from tianleiwu March 13, 2021 02:04
gh-yewang changed the title from "(WIP)Support larger hidden size in Attention Cuda kernel" to "Support larger hidden size in Attention Cuda kernel" Mar 13, 2021
gh-yewang marked this pull request as ready for review March 13, 2021 02:04
Comment thread onnxruntime/contrib_ops/cuda/bert/attention_transpose.cu Outdated
Comment thread onnxruntime/contrib_ops/cuda/bert/attention_transpose.cu
Comment thread onnxruntime/contrib_ops/cuda/bert/attention_past.cu Outdated
Comment thread onnxruntime/contrib_ops/cpu/bert/attention.cc Outdated
gh-yewang requested a review from tianleiwu March 15, 2021 21:03
gh-yewang merged commit 4e670f7 into master Mar 15, 2021
gh-yewang deleted the wangye/hidden_size branch March 15, 2021 22:46