Is your feature request related to a problem?
Using the same bytes-per-parameter value on every platform may leave graphics memory underutilized, and there is a problem with the Qwen3-next-fp8 model.
Describe the Solution you'd like
Use different bytes-per-parameter values for Mac and GPU.
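A rough sketch of what I mean, in Python. The table `BYTES_PER_PARAM`, the function `estimate_weight_bytes`, the platform/dtype keys, and all numeric values are hypothetical placeholders for illustration, not the project's actual code or measured numbers:

```python
# Hypothetical lookup: bytes per parameter, keyed by (platform, dtype).
# The values are placeholder assumptions, not measurements.
BYTES_PER_PARAM = {
    ("gpu", "fp8"): 1.0,
    ("gpu", "fp16"): 2.0,
    ("mac", "fp8"): 2.0,   # assumption: fp8 weights take 2 bytes each on Mac
    ("mac", "fp16"): 2.0,
}


def estimate_weight_bytes(num_params: int, platform: str, dtype: str) -> int:
    """Estimate weight memory using a platform-specific bytes-per-parameter value."""
    return int(num_params * BYTES_PER_PARAM[(platform, dtype)])
```

With a per-platform value like this, the same fp8 model could be sized differently on Mac and on GPU instead of sharing one estimate.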
Alternatives Considered (Optional)
No response
Additional Context (Optional)
No response