
@YuhanXu (Contributor) commented Sep 23, 2025

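Fixes CUDA error(700): 'cudaErrorIllegalAddress' in cache_kernel.
The entries of int* batch_id_per_token default to -1.
In the statement const uint32_t ori_bi = batch_id_per_token[token_idx];, that -1 is converted to uint32_t, which produces an out-of-bounds index and triggers the error.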
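To make the failure mode concrete, here is a minimal host-side C++ sketch of the conversion described above. The helper names as_unsigned_index and index_in_bounds are hypothetical and exist only for illustration; they are not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: mimics reading a signed batch id into a uint32_t,
// as the statement `const uint32_t ori_bi = batch_id_per_token[token_idx];` does.
inline std::uint32_t as_unsigned_index(int batch_id) {
    // Conversion to an unsigned type is modular: -1 becomes 2^32 - 1.
    return static_cast<std::uint32_t>(batch_id);
}

// Hypothetical helper: checks whether an index is valid for a per-batch
// array with num_batches entries.
inline bool index_in_bounds(std::uint32_t idx, std::size_t num_batches) {
    return idx < num_batches;
}
```

With a default entry of -1, as_unsigned_index(-1) yields 4294967295, which is far beyond any realistic batch count, so a later lookup indexed by it would read outside the array.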

@YuhanXu YuhanXu changed the title FIX] Fix CUDA error(700): 'cudaErrorIllegalAddress' in CascadeAppendW… [FIX] Fix CUDA error(700): 'cudaErrorIllegalAddress' in CascadeAppendW… Sep 23, 2025
@gongshaotian (Collaborator) commented Sep 23, 2025

I suggest adding the [CUDAGraph] tag to the PR title.

@gongshaotian (Collaborator) previously approved these changes Sep 23, 2025

LGTM

@YuhanXu YuhanXu changed the title [FIX] Fix CUDA error(700): 'cudaErrorIllegalAddress' in CascadeAppendW… [CUDAGraph] [FIX] Fix CUDA error(700): 'cudaErrorIllegalAddress' in CascadeAppendW… Sep 23, 2025
const uint32_t qkv_bias = bias % hidden_size;
const uint32_t hi = qkv_bias / head_size;
const uint32_t h_bias = qkv_bias % head_size;
const uint32_t ori_bi = batch_id_per_token[token_idx];
Collaborator


Could this be changed to an int type?

@YuhanXu (Contributor, Author) Sep 23, 2025

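In general that would not be appropriate, because uint32 can hold larger positive values than int32.
Looking at this kernel alone, though, ori_bi is not assigned to any other uint32 afterwards, so the change is safe here.
I'll make the change and try it.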

…riteCacheKVQKV cache_kernel(). Continue when batch_id_per_token[token_idx] is default value -1.
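The commit above describes skipping tokens whose batch id is still the default -1. The following is a host-side C++ sketch of that guard, not the kernel code from the PR: the function gather_seq_lens and the seq_lens array are hypothetical stand-ins for the per-batch lookups the kernel performs, and only batch_id_per_token, token_idx, and ori_bi are names taken from the discussion.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the kernel's per-token loop.
// Assumption: padding tokens carry the default batch id -1.
inline std::vector<int> gather_seq_lens(const std::vector<int>& batch_id_per_token,
                                        const std::vector<int>& seq_lens) {
    std::vector<int> out(batch_id_per_token.size(), 0);
    for (std::size_t token_idx = 0; token_idx < batch_id_per_token.size(); ++token_idx) {
        // Read the id as a signed int so the sentinel stays recognizable.
        const int ori_bi = batch_id_per_token[token_idx];
        if (ori_bi < 0) {
            continue;  // skip padding tokens instead of indexing with a wrapped value
        }
        out[token_idx] = seq_lens[static_cast<std::size_t>(ori_bi)];
    }
    return out;
}
```

Keeping ori_bi signed makes the sentinel check a plain comparison against zero. With the original uint32_t declaration, the same guard would have to compare against the wrapped value instead.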

paddle-bot bot commented Sep 23, 2025

Thanks for your contribution!

@gongshaotian (Collaborator) left a comment

LGTM

@gongshaotian gongshaotian merged commit 44010ce into PaddlePaddle:develop Sep 24, 2025
16 of 17 checks passed
@YuhanXu YuhanXu deleted the rope_cache_kernel_bugfix branch October 15, 2025 06:24
