Closed
Labels
type:feature (New feature or request)
Description
Describe the problem or feature request
It would be useful to support quantizing weights to float16 for transfer over the wire and then dequantizing them back to float32 on the client.
I'm working with a model where the 2-byte affine quantization in TFJS performs very poorly but using float16 quantization has a negligible impact on performance.
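To make the request concrete, here is a minimal sketch of what the round trip could look like, assuming weights are stored as raw IEEE 754 half-precision bit patterns in a `Uint16Array`. The helper names (`float32ToFloat16Bits`, `float16BitsToFloat32`, `quantizeToFloat16`, `dequantizeToFloat32`) are hypothetical and are not part of the TFJS API; this is only an illustration of the idea, not a proposed implementation.

```typescript
// Hypothetical helpers for illustration only; none of these names exist in TFJS.

// Shared scratch buffers used to read the raw bits of a float32 value.
const f32Scratch = new Float32Array(1);
const u32Scratch = new Uint32Array(f32Scratch.buffer);

/** Encodes a number as IEEE 754 half-precision bits (round to nearest, ties to even). */
function float32ToFloat16Bits(value: number): number {
  f32Scratch[0] = value;
  const x = u32Scratch[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = (x >>> 23) & 0xff;
  const mant = x & 0x7fffff;

  if (exp === 0xff) {
    // Infinity or NaN: keep NaN distinguishable by setting a mantissa bit.
    return sign | 0x7c00 | (mant !== 0 ? 0x200 : 0);
  }
  const e = exp - 127 + 15; // rebias exponent from float32 to float16
  if (e >= 0x1f) {
    return sign | 0x7c00; // too large for float16: becomes infinity
  }
  if (e <= 0) {
    if (e < -10) {
      return sign; // too small even for a subnormal: becomes signed zero
    }
    // Subnormal result: restore the implicit leading 1, then shift right.
    const m = mant | 0x800000;
    const shift = 14 - e;
    let sub = m >>> shift;
    const rem = m & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (rem > halfway || (rem === halfway && (sub & 1) === 1)) {
      sub++;
    }
    return sign | sub;
  }
  let half = (e << 10) | (mant >>> 13);
  const rem = mant & 0x1fff;
  if (rem > 0x1000 || (rem === 0x1000 && (half & 1) === 1)) {
    half++; // a carry into the exponent field is the correct rounding result
  }
  return sign | half;
}

/** Decodes IEEE 754 half-precision bits back to a JavaScript number. */
function float16BitsToFloat32(bits: number): number {
  const sign = (bits & 0x8000) !== 0 ? -1 : 1;
  const exp = (bits >>> 10) & 0x1f;
  const mant = bits & 0x3ff;
  if (exp === 0) {
    return sign * mant * Math.pow(2, -24); // zero or subnormal
  }
  if (exp === 0x1f) {
    return mant !== 0 ? NaN : sign * Infinity;
  }
  return sign * (1 + mant / 1024) * Math.pow(2, exp - 15);
}

/** Packs float32 weights into 2 bytes each (what would go over the wire). */
function quantizeToFloat16(weights: Float32Array): Uint16Array {
  const out = new Uint16Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    out[i] = float32ToFloat16Bits(weights[i]);
  }
  return out;
}

/** Expands the 2-byte values back to float32 on the client. */
function dequantizeToFloat32(bits: Uint16Array): Float32Array {
  const out = new Float32Array(bits.length);
  for (let i = 0; i < bits.length; i++) {
    out[i] = float16BitsToFloat32(bits[i]);
  }
  return out;
}
```

With this layout the payload is the same size as the existing 2-byte quantization. Unlike an affine mapping with a single scale and offset per tensor, float16 keeps a per-value exponent, so relative precision stays roughly constant across magnitudes within its range (largest finite value 65504).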