Hi, OpenPPL for CUDA currently supports only fp16. We will support int8 as soon as possible.
When you input fp32 data, OpenPPL will automatically convert it to fp16.
The `--quantization` option is designed for int8, so you don't need to use it yet.
How do I quantize to INT8 / FP16?
```cpp
Define_string_opt("--quantization", g_flag_quantization, "", "declare json file saved quantization information");
```
What does the JSON file look like?
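For reference, quantization files for inference engines are typically a table of per-tensor quantization parameters (scale, zero point, bit width). The sketch below is a hypothetical example of that general shape, with made-up tensor names and fields; it is not OpenPPL's confirmed schema, so check the project's docs or quantization tool output for the real format:

```json
{
    "quant_info": {
        "conv1_weight": {
            "bit_width": 8,
            "scale": 0.0124,
            "zero_point": 0,
            "per_channel": false
        },
        "conv1_output": {
            "bit_width": 8,
            "scale": 0.0871,
            "zero_point": 0,
            "per_channel": false
        }
    }
}
```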