[Bug] Several patched models activate all experts in forward #1369

@wenhuach21

Description

Problem Description

Affected models: Qwen3 VL, gpt-oss, Llama 4.
The patched forward activates all experts for every token. Because the same patch remains applied during inference, the decoding phase becomes very slow.
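To illustrate the cost difference being reported, here is a minimal toy MoE sketch (hypothetical class and parameter names, not the project's actual patch): the standard decode path runs only each token's top-k experts, while an "activate all experts" path, of the kind a calibration patch might install, runs every expert on every token.

```python
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    """Minimal mixture-of-experts layer (illustrative only)."""

    def __init__(self, dim=8, num_experts=4, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)
        self.top_k = top_k
        self.expert_calls = 0  # counts how many expert forwards actually run

    def forward(self, x, activate_all=False):
        probs = self.router(x).softmax(dim=-1)  # [tokens, num_experts]
        if not activate_all:
            # Standard decode path: zero out all but the top-k weights per token.
            _, topi = probs.topk(self.top_k, dim=-1)
            mask = torch.zeros_like(probs).scatter_(-1, topi, 1.0)
            probs = probs * mask
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            w = probs[..., e : e + 1]
            # With activate_all=True every expert runs, regardless of routing.
            if activate_all or w.any():
                self.expert_calls += 1
                out = out + w * expert(x)
        return out

torch.manual_seed(0)
moe = ToyMoE()
x = torch.randn(1, 8)          # a single decoded token
moe(x)                          # top-k path: only top_k experts run
calls_topk = moe.expert_calls
moe.expert_calls = 0
moe(x, activate_all=True)       # patched path: all experts run
calls_all = moe.expert_calls
print(calls_topk, calls_all)
```

For a single decoded token this sketch runs 2 experts on the normal path versus 4 on the all-experts path; in a real model with dozens or hundreds of experts, keeping the all-experts patch active at decode time multiplies per-token compute accordingly.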

Reproduction Steps

~

Environment Information

No response

Error Logs

Additional Context

No response
