Fix SkipLayerNorm for 2D input #17014

Merged
tianleiwu merged 4 commits into main from tlwu/fix_sln_element_size on Aug 8, 2023
tianleiwu commented Aug 4, 2023

Description

Fix an obvious bug:
(1) In packing mode, the input for SLN (SkipLayerNorm) has two dimensions (introduced by #15283): [token_count, hidden_size]. The current code, element_count = input_dims[0] * sequence_length * hidden_size, assumes a 3D input. For a 2D input it evaluates to element_count = token_count * hidden_size * hidden_size, which causes an invalid memory write in the CUDA kernel and crashes ORT.
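
To make the size mismatch concrete, here is a minimal sketch, not the actual ORT code. The helper names `ElementCount` and `BuggyElementCount` are hypothetical, and the sketch assumes that `sequence_length` is read from `input_dims[1]`, which would explain the extra `hidden_size` factor for a 2D input.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from ORT): element count as the product of all
// dimensions, which is correct for both 2D and 3D inputs.
int64_t ElementCount(const std::vector<int64_t>& dims) {
  int64_t count = 1;
  for (int64_t d : dims) count *= d;
  return count;
}

// Illustration of the faulty pattern: the formula assumes a 3D shape
// [batch, sequence_length, hidden_size]. Requires at least two dimensions.
// ASSUMPTION: sequence_length is taken from dims[1].
int64_t BuggyElementCount(const std::vector<int64_t>& dims) {
  const int64_t hidden_size = dims.back();
  const int64_t sequence_length = dims[1];  // equals hidden_size when dims is 2D
  return dims[0] * sequence_length * hidden_size;
}
```

For a packed input of shape [8, 64], `ElementCount` gives 8 * 64 = 512, while `BuggyElementCount` gives 8 * 64 * 64 = 32768, so a kernel launched with the latter would write far past the end of a 512-element buffer. For a 3D shape such as [2, 4, 64] the two agree.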

It also fixes two minor issues:
(2) A potential integer overflow in static_cast<int>(element_count).
(3) Dead code after return LaunchSkipLayerNormKernel that can never run.
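
Item (2) concerns narrowing a 64-bit element count to int. One way to guard such a cast is sketched below. The helper name `FitsInInt` is hypothetical, and this is not necessarily how the PR addresses the issue.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from ORT): true when a 64-bit element count can be
// converted to int without overflow.
bool FitsInInt(int64_t element_count) {
  return element_count >= 0 &&
         element_count <= std::numeric_limits<int>::max();
}
```

A caller would check `FitsInInt(element_count)` before `static_cast<int>(element_count)` and return an error status otherwise, rather than passing a wrapped value to the kernel launch.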

Motivation and Context

tianleiwu marked this pull request as draft August 4, 2023 23:48
tianleiwu marked this pull request as ready for review August 5, 2023 04:05
tianleiwu requested review from wangyems and yufenglee August 5, 2023 04:05
wangyems previously approved these changes Aug 7, 2023
tianleiwu merged commit fb11c67 into main Aug 8, 2023
tianleiwu deleted the tlwu/fix_sln_element_size branch August 8, 2023 21:04
jchen351 pushed a commit that referenced this pull request Aug 12, 2023
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
2 participants