[lora] allow int64 values for LoRA ID to avoid overflow#1574

Merged
AlpinDale merged 1 commit into main from int64-lora-id on Nov 4, 2025

@AlpinDale


No description provided.

Signed-off-by: AlpinDale <alpindale@gmail.com>
@AlpinDale AlpinDale merged commit 253aec1 into main Nov 4, 2025
1 check passed
@AlpinDale AlpinDale deleted the int64-lora-id branch November 4, 2025 04:53

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the dtype of request_lora_mapping from np.int32 to np.int64 in both gpu_input_batch.py and tpu_input_batch.py to prevent potential overflows with LoRA IDs. While this change is necessary to support larger IDs, it introduces a risk of type mismatch with downstream components like CUDA or TPU kernels that might still expect 32-bit integers. I've added critical comments highlighting the need to ensure all consumers of this array are updated to prevent potential data corruption or crashes.
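To illustrate the overflow this change guards against, here is a minimal sketch in plain NumPy. It assumes a LoRA ID just above the int32 maximum (the PR does not say how such IDs arise), and `store_lora_id` is a hypothetical helper written for this comment, not code from the repository.

```python
import numpy as np

INT32_MAX = np.iinfo(np.int32).max  # 2147483647


def store_lora_id(lora_id: int, dtype) -> int:
    """Hypothetical helper: store an ID in a one-slot array of `dtype` and read it back.

    The value is built as int64 first and then cast, which mimics a C-style
    narrowing conversion rather than NumPy's checked scalar assignment.
    """
    slot = np.array([lora_id], dtype=np.int64).astype(dtype)
    return int(slot[0])


# Assumed example ID: six past the largest value int32 can hold.
big_id = INT32_MAX + 6

as_int64 = store_lora_id(big_id, np.int64)  # preserved
as_int32 = store_lora_id(big_id, np.int32)  # wraps around to a negative number

print(as_int64, as_int32)
```

With a 64-bit buffer the ID survives the round trip; with a 32-bit buffer it wraps to a negative value, which would no longer match the adapter it was meant to identify.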


  # lora related
- self.request_lora_mapping = np.zeros((self.max_num_reqs,), dtype=np.int32)
+ self.request_lora_mapping = np.zeros((self.max_num_reqs,), dtype=np.int64)


critical

This change to np.int64 can cause a critical type mismatch. If downstream consumers of this array (e.g., CUDA kernels) still expect np.int32, it can lead to silent data corruption or crashes due to incorrect memory interpretation. It is crucial that all components using request_lora_mapping are also updated to handle 64-bit integers.
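One way to make such a mismatch fail loudly rather than silently is a checked narrowing step at the boundary to any consumer that still requires 32-bit indices. The sketch below is only an illustration: `to_int32_checked` is a hypothetical name, and nothing in this PR confirms that any downstream consumer actually needs it.

```python
import numpy as np


def to_int32_checked(mapping: np.ndarray) -> np.ndarray:
    """Hypothetical guard: narrow an integer array to int32, refusing lossy casts."""
    info = np.iinfo(np.int32)
    # An empty array has no min/max, so only range-check when there is data.
    if mapping.size and (mapping.min() < info.min or mapping.max() > info.max):
        raise OverflowError("LoRA ID does not fit in int32; consumer must accept int64")
    return mapping.astype(np.int32)


# Small IDs narrow without loss.
small = to_int32_checked(np.array([0, 1, 7], dtype=np.int64))

# An ID above the int32 range is rejected instead of being truncated.
try:
    to_int32_checked(np.array([2**40], dtype=np.int64))
    rejected = False
except OverflowError:
    rejected = True
```

The check costs one pass over an array of at most `max_num_reqs` entries, so it should be cheap relative to the risk it removes.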


  # lora related
- self.request_lora_mapping = np.zeros((self.max_num_reqs,), dtype=np.int32)
+ self.request_lora_mapping = np.zeros((self.max_num_reqs,), dtype=np.int64)


critical

This change to np.int64 can cause a critical type mismatch. If downstream consumers of this array (e.g., TPU kernels) still expect np.int32, it can lead to silent data corruption or crashes due to incorrect memory interpretation. It is crucial that all components using request_lora_mapping are also updated to handle 64-bit integers.
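The "incorrect memory interpretation" mentioned above can be reproduced on the host with NumPy alone. This sketch reinterprets the bytes of a 64-bit array as 32-bit integers, which is roughly what a consumer hard-coded for int32 would see if it were handed the raw buffer. The little-endian layout is fixed explicitly so the result does not depend on the machine; the sample IDs are made up.

```python
import numpy as np

# Three made-up LoRA IDs stored as little-endian 64-bit integers.
mapping64 = np.array([1, 2, 3], dtype="<i8")

# Reinterpret the same bytes as little-endian 32-bit integers (no copy, no conversion).
misread = mapping64.view("<i4")

print(misread.tolist())
```

Each 64-bit ID is split into a low and a high 32-bit word, so a reader expecting three int32 entries would see `1, 0, 2` instead of `1, 2, 3`.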
