This is a problem with the way we compile our CUDA kernels. I'll try to see if we can modify the compilation process (likely with conditional compilation) to avoid this, but it will probably be difficult.
Describe the bug
It does not support some older hardware.
Could it convert bfloat16 to float16 before loading the model, as vLLM does?
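To make the request concrete, here is a minimal sketch of the fallback I have in mind. It is not code from this project or from vLLM: `pick_load_dtype` and `MIN_BF16_CAPABILITY` are hypothetical names, and the (8, 0) threshold assumes that native bfloat16 support starts at CUDA compute capability 8.0.

```python
# Hypothetical helper for illustration only; not part of this project or vLLM.

# Assumption: native bfloat16 requires CUDA compute capability >= 8.0.
MIN_BF16_CAPABILITY = (8, 0)


def pick_load_dtype(checkpoint_dtype, compute_capability):
    """Return the dtype name to load the weights in.

    checkpoint_dtype: dtype name stored in the checkpoint, e.g. "bfloat16".
    compute_capability: (major, minor) tuple describing the GPU.
    """
    too_old = tuple(compute_capability) < MIN_BF16_CAPABILITY
    if checkpoint_dtype == "bfloat16" and too_old:
        # Fall back to float16 on GPUs that lack bfloat16 support.
        return "float16"
    # Otherwise keep whatever the checkpoint was saved in.
    return checkpoint_dtype


print(pick_load_dtype("bfloat16", (7, 5)))
print(pick_load_dtype("bfloat16", (8, 0)))
```

With PyTorch, the capability tuple could come from `torch.cuda.get_device_capability()`. One caveat: float16 has a narrower exponent range than bfloat16, so large values may overflow after the cast.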