Commit

fix debug build
tianleiwu committed May 1, 2024
1 parent 38b03b7 commit 93a6f63
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion onnxruntime/contrib_ops/cuda/sparse/sparse_attention.cc
@@ -176,7 +176,6 @@ Status SparseAttention<T>::ComputeInternal(OpKernelContext* context) const {
output_shape[2] = static_cast<int64_t>(parameters.hidden_size);
Tensor* output = context->Output(0, output_shape);

- assert(parameters.past_kv_format == AttentionQkvFormat::Q_K_V_BNSH);
std::vector<int64_t> present_dims = {
parameters.batch_size, parameters.kv_num_heads, parameters.max_sequence_length, parameters.head_size};
TensorShape present_shape(present_dims);