Skip to content

Feature or Bug: Is OGA support the feature all DeviceSpan<int32_t>& next_tokens are model.pad_token_id? #1297

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
buyuliushen opened this issue Mar 4, 2025 · 2 comments
Assignees

Comments

@buyuliushen
Copy link

buyuliushen commented Mar 4, 2025

Describe the bug

Is OGA support the feature all DeviceSpan<int32_t>& next_tokens are model.pad_token_id?

If the answer is Yes, there maybe bug in the following code. if the answer is No, do you have plan to support this feature? or how do you deal with this scene?

If all DeviceSpan<int32_t>& next_tokens are model.pad_token_id:
in the following code, after the end of while loop, token_index will be -1, and input_sequence_lengths[b] will be 0;

size_t token_index = new_kv_length;
while (token_index-- > 0) {
auto next_token = const_cast<DeviceSpan<int32_t>&>(next_tokens).CpuSpan()[b * new_kv_length + token_index];
if (next_token != model_.config_->model.pad_token_id)
break;
}
input_sequence_lengths[b] = static_cast<int>(token_index + 1);
}

then go to the following code, token_index = -1; then in L55, the first parameter of logits_raw.subspan(),
(vocab_index * seq_length + token_index * vocab_size) * element_size maybe = (0 * seq_length + (-1 ) * vocab_size) * element_size = -1 * vocab_size * element_size, it's a negative value.

size_t token_index = input_sequence_lengths[batch_index] - 1;
for (int beam_index = 0; beam_index < num_beams; beam_index++) {
auto target = logits_last_tokens.subspan(vocab_index * element_size, vocab_size * element_size);
auto source = logits_raw.subspan((vocab_index * seq_length + token_index * vocab_size) * element_size, vocab_size * element_size);
target.CopyFrom(source);
vocab_index += vocab_size;
}

in the subspan(), the parameter datatype is size_t, it's unsigned, so if the first parameter is a negative value, it maybe very large as to size_t, then L417 will report error and lead to crash.

using size_type = size_t;
Image

Image
@buyuliushen buyuliushen changed the title Feature or Bug: Is OGA support the feature all '''DeviceSpan<int32_t>& next_tokens''' are '''model.pad_token_id'''? Feature or Bug: Is OGA support the feature all DeviceSpan<int32_t>& next_tokens are model.pad_token_id? Mar 4, 2025
@RyanUnderhill
Copy link
Contributor

@aciddelgado You probably know the most about this new continuous decoding logic, we should support this edge case. I can add a unit test but what do you think the best behavior is when it happens?

@buyuliushen
Copy link
Author

Hi @RyanUnderhill @aciddelgado , any update about this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants