Conversation

Contributor

@cauyxy cauyxy commented Aug 17, 2023

Improves memory management on machines with low CPU memory. Previously, the state variable was not explicitly deleted after use, which could get the process killed when loading a large model on a machine with little CPU memory.
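To illustrate the idea, here is a minimal sketch rather than the actual vLLM loader code. `Shard`, `load_all`, and `explicit_del` are hypothetical names, and the immediate release shown relies on CPython reference counting. Without the explicit `del`, the previous shard stays referenced by `state` until the next assignment finishes, so two shards can be resident at once.

```python
import weakref


class Shard(dict):
    """Hypothetical stand-in for one checkpoint shard.

    A dict subclass is used because plain dicts cannot be weakly referenced.
    """


def load_all(shard_loaders, params, explicit_del=True):
    """Copy every shard into `params`.

    Returns, for each shard, how many earlier shards were still alive
    right before that shard was read.
    """
    seen = []          # weak references to the shards loaded so far
    alive_counts = []
    for load in shard_loaders:
        alive_counts.append(sum(ref() is not None for ref in seen))
        state = load()
        seen.append(weakref.ref(state))
        for name in state:
            # Copy into the preallocated buffer instead of keeping the shard's data.
            params[name][:] = state[name]
        if explicit_del:
            # Drop the last reference now, before the next shard is read.
            del state
    return alive_counts
```

With `explicit_del=True`, no earlier shard is alive when the next one is read; with `explicit_del=False`, one earlier shard lingers on every iteration after the first.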

cauyxy added 5 commits August 17, 2023 20:20
Improve memory management on machines with low CPU memory. Previously, the state variable was not explicitly deleted, so the process could be killed when CPU memory is low.
Member

@zhuohan123 zhuohan123 left a comment

LGTM! Thanks for the fix!

@zhuohan123 zhuohan123 merged commit 73b3de7 into vllm-project:main Aug 17, 2023
randxie pushed a commit to randxie/vllm that referenced this pull request Aug 29, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yma11 pushed a commit to yma11/vllm that referenced this pull request Feb 25, 2025
When the input is 2D, we unsqueeze it to 3D to meet HPUFusedRMSNorm requirements
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
…llm-project#784)

### What this PR does / why we need it?
In the w8a8 quantization code of `fused_experts`, the output of almost
every operator is assigned a new variable name. To save NPU memory, we
then have to manually `del` these variables to end their lifecycle, which
fills the code with `del` statements and looks inelegant.
Therefore, I plan to name the output of most operators
`hidden_states`, so that each assignment ends the lifecycle of the
previous `hidden_states`.
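
The difference between the two naming styles can be sketched as follows. This is an illustrative example only, not the `fused_experts` code: `Buf`, `op`, `named_outputs`, and `rebinding` are hypothetical stand-ins, and the behavior shown relies on CPython reference counting. Each call to `op` records how many buffers are alive when it starts.

```python
import weakref

# Tracks which buffers are currently alive (hypothetical bookkeeping for the demo).
live = weakref.WeakSet()


class Buf:
    """Hypothetical stand-in for a device tensor."""

    def __init__(self, label):
        self.label = label
        live.add(self)


def op(x, label, log):
    """Hypothetical operator: logs the live buffer count, then returns a new buffer."""
    log.append(len(live))
    return Buf(label)


def named_outputs(x, log):
    # Every intermediate keeps its own name, so all of them stay alive
    # until the function returns (unless each one is deleted by hand).
    gate = op(x, "gate", log)
    act = op(gate, "act", log)
    out = op(act, "out", log)
    return out


def rebinding(x, log):
    # Reusing one name drops the previous intermediate as soon as
    # the new result is assigned.
    hidden_states = op(x, "gate", log)
    hidden_states = op(hidden_states, "act", log)
    hidden_states = op(hidden_states, "out", log)
    return hidden_states
```

In this sketch the live count grows with every operator in `named_outputs`, while in `rebinding` it stays at the input plus one intermediate.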

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

Signed-off-by: ApsarasX <apsarax@outlook.com>
2 participants