[Feature] Graph mode for deepseek v2/v3 #347

Closed
SidaoY wants to merge 37 commits into vllm-project:v0.7.3-dev from SidaoY:graph_mode_v073

Conversation

SidaoY (Contributor) commented Mar 18, 2025

Graph mode for deepseek v2/v3

SidaoY force-pushed the graph_mode_v073 branch 7 times, most recently from f030e15 to 8026c32, on March 18, 2025 09:10
Signed-off-by: SidaoY <1024863041@qq.com>
SidaoY force-pushed the graph_mode_v073 branch 5 times, most recently from cc64a96 to 3453767, on March 19, 2025 01:30
Signed-off-by: SidaoY <1024863041@qq.com>
MengqingCao and others added 4 commits March 19, 2025 01:53
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: SidaoY <1024863041@qq.com>
SidaoY (Contributor, Author) commented Mar 19, 2025

CI failed due to network issues.

linfeng-yuan and others added 6 commits March 20, 2025 14:34
… larger bs

Signed-off-by: linfeng-yuan <1102311262@qq.com>
Signed-off-by: linfeng-yuan <1102311262@qq.com>
Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
Signed-off-by: linfeng-yuan <1102311262@qq.com>
feat: support torchrun for multinode dp
ganyi1996ppo (Collaborator) commented:

@SidaoY Please also file a PR against the master branch.

Signed-off-by: mengwei805 <mengwei25@huawei.com>
Yizhou Liu and others added 6 commits March 27, 2025 15:25
Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
Signed-off-by: libaokui <libaokui@huawei.com>
* Add support for pd separation

* make patch for pd separation based on vllm-ascend

---------

Co-authored-by: q30056305 <qianzihui@huawei.com>
Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
… models, change rope for q

Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
…ronment variable

Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
…te torch.compile

Signed-off-by: Yizhou Liu <liuyizhou5@h-partners.com>
tt545571022 commented Apr 9, 2025

I tried to run example/offline_inference_npu.py with graph mode using the 029f304 revision, but I encountered the following error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/_utils/error_code.py", line 43, in wapper
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 322, in __call__
[rank0]:     return self._get_compiled_gm(gm, example_inputs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 358, in _get_compiled_gm
[rank0]:     return _GmRunner(self._gen_compiled_gm(gm, example_inputs))
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 376, in _gen_compiled_gm
[rank0]:     concrete_graph: ConcreteGraphBase = _NpuGraphConverter(
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 152, in run
[rank0]:     super().run(*args, **kwargs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch/fx/interpreter.py", line 146, in run
[rank0]:     self.env[node] = self.run_node(node)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 146, in run_node
[rank0]:     return super().run_node(n)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch/fx/interpreter.py", line 203, in run_node
[rank0]:     return getattr(self, n.op)(n.target, args, kwargs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 119, in inner
[rank0]:     result = f(self, target, args, kwargs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 206, in call_function
[rank0]:     return self._wrap('call_function')(target, args, kwargs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/npu_fx_compiler.py", line 193, in inner
[rank0]:     npu_outputs = self._graph.parse_node(target, args_npu, kwargs_npu, meta_outputs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/_ge_concrete_graph/continguous_utils.py", line 152, in wrapper
[rank0]:     return func(self, target, args_new, kwargs_new, meta_outputs)
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/torch_npu/dynamo/torchair/_ge_concrete_graph/fx2ge_converter.py", line 838, in parse_node
[rank0]:     raise RuntimeError(f"Unsupported torch op {target} by ge")
[rank0]: RuntimeError: Unsupported torch op auto_functionalized by ge

[rank0]: While executing %auto_functionalized : [num_users=2] = call_function[target=torch.ops.higher_order.auto_functionalized](args = (atb._npu_rotary_embedding.default,), kwargs = {positions: %arg16_1, query: %view_2, key: %view_3, head_size: 64, cos_sin_cache: %arg15_1, is_neox_style: False})
[rank0]: Original traceback:
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm/model_executor/models/deepseek_v2.py", line 677, in forward
[rank0]:     hidden_states = self.model(input_ids, positions, kv_caches,
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm/model_executor/models/deepseek_v2.py", line 633, in forward
[rank0]:     hidden_states, residual = layer(positions, hidden_states,
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm/model_executor/models/deepseek_v2.py", line 550, in forward
[rank0]:     hidden_states = self.self_attn(
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm/model_executor/models/deepseek_v2.py", line 469, in forward
[rank0]:     return self.mla_attn(hidden_states_or_q_c, kv_c_normed, k_pe, kv_cache,
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm_ascend/ops/attention.py", line 65, in attention_forward
[rank0]:     return self.impl.forward(self, query, key, value, self_kv_cache,
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm_ascend/attention.py", line 1115, in forward
[rank0]:     q_pe, k_pe = self.rotary_emb(attn_metadata.input_positions,
[rank0]:   File "/root/anaconda3/envs/asd-graph/lib/python3.10/site-packages/vllm_ascend/ops/rotary_embedding.py", line 77, in rope_deepseek_forward_oot
[rank0]:     torch_npu._npu_rotary_embedding(

How should I solve this problem?

Environment:
Hardware: 910B2C
Output of `pip list | grep vllm`:
vllm 0.7.3+empty
vllm_ascend 0.1.dev97+g029f304
Docker image: quay.io/ascend/vllm-ascend:v0.7.3rc1

@SidaoY @MengqingCao
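
For triage, the name of the failing op can be pulled out of a log like the one above with a short script. This is only a minimal sketch: it assumes the final error line always has the form seen in the traceback (`RuntimeError: Unsupported torch op <name> by ge`), and the helper name `find_unsupported_ops` is hypothetical, not part of vllm-ascend or torchair.

```python
import re

# Matches the final error line from the traceback above. The "RuntimeError: "
# prefix keeps the pattern from matching the `raise RuntimeError(f"...")`
# source line that also appears in the log.
UNSUPPORTED_OP = re.compile(r"RuntimeError: Unsupported torch op (\S+) by ge")


def find_unsupported_ops(log_text: str) -> list[str]:
    """Return distinct unsupported op names in first-seen order (hypothetical helper)."""
    seen: list[str] = []
    for match in UNSUPPORTED_OP.finditer(log_text):
        op = match.group(1)
        if op not in seen:
            seen.append(op)
    return seen


sample = "[rank0]: RuntimeError: Unsupported torch op auto_functionalized by ge"
print(find_unsupported_ops(sample))
```

In this log the reported op is `auto_functionalized`, wrapping `atb._npu_rotary_embedding.default`, which points at the rotary embedding call in vllm_ascend/ops/rotary_embedding.py as the place the graph conversion stops.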

mengwei805 force-pushed the graph_mode_v073 branch 2 times, most recently from 30aee2a to f6b64bc, on April 10, 2025 11:20
Signed-off-by: mengwei805 <mengwei25@huawei.com>