float16 quantization + granular quantization #3340
Conversation
pyu10055
left a comment
Awesome work, thank you for the contribution. I did a quick round of review; overall the PR looks good. There are a couple of questions, and there are some conflicts you need to resolve. Thanks!
Reviewable status: 0 of 1 approvals obtained (waiting on @JesseFarebro and @pyu10055)
tfjs-converter/python/tensorflowjs/quantization.py, line 117 at r1 (raw file):
def dequantize_weights(
can you add support for float16 dequantization?
tfjs-converter/python/tensorflowjs/quantization_test.py, line 50 at r1 (raw file):
d = np.ones(5, dtype=np.float32)
q, metadata = quantization.quantize_weights(d, np.uint8)
assert 'scale' in metadata and 'min' in metadata
Can you add a test for float16 quantization?
tfjs-converter/python/tensorflowjs/read_weights.py, line 194 at r1 (raw file):
offset += dtype.itemsize * value.size
if quant_info and dtype in [np.uint8, np.uint16]:
  value = quantization.dequantize_weights(
float16 weights should also convert back to the original dtype
tfjs-converter/python/tensorflowjs/write_weights_test.py, line 701 at r1 (raw file):
manifest = write_weights.write_weights( groups, TMP_DIR, shard_size_bytes=1024, quantization_dtype_map={'uint8': '*'})
can you add a test for float16?
tfjs-converter/python/tensorflowjs/converters/converter.py, line 421 at r1 (raw file):
if float16 is not None:
  quantization_dtype_map[quantization.QUANTIZATION_DTYPE_FLOAT16] = \
      float16.split(',')
if any of the entries is empty, it will generate an empty list; will it translate to match all weights?
tfjs-converter/python/tensorflowjs/converters/wizard.py, line 471 at r1 (raw file):
'default': False }, {
this is a bit confusing: does the user need all three types? Maybe for the wizard flow, have the user choose one compression type first, then specify the name in the second question? What do you think?
tfjs-core/src/io/io_utils.ts, line 144 at r1 (raw file):
`Weight ${spec.name} has unknown ` + `quantization dtype ${quantization.dtype}. ` + `Supported quantization dtypes are: 'uint8' and 'uint16'.`);
add 'float16'
tfjs-core/src/io/io_utils_test.ts, line 630 at r1 (raw file):
});
describe('float16', () => {
can you add a couple of conversion tests for non-zero numbers?
JesseFarebro
left a comment
Reviewable status: 0 of 1 approvals obtained (waiting on @JesseFarebro and @pyu10055)
tfjs-converter/python/tensorflowjs/quantization.py, line 117 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
can you add support for float16 dequantization?
Done.
tfjs-converter/python/tensorflowjs/quantization_test.py, line 50 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
Can you add a test for float16 quantization?
Done. I also refactored these test cases to make it clearer what we're testing here.
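For example, a float16 case can look roughly like the sketch below (this assumes the quantize_weights(data, dtype) signature quoted above; what quantize_weights returns as metadata for float16 is an assumption here, so treat this as illustrative rather than the actual test):

```python
import numpy as np
from tensorflowjs import quantization  # assumed import path

# Values chosen to be exactly representable in float16.
d = np.array([0.0, 0.5, 1.5, -2.25], dtype=np.float32)
q, metadata = quantization.quantize_weights(d, np.float16)
assert q.dtype == np.float16
np.testing.assert_array_equal(q.astype(np.float32), d)
```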
tfjs-converter/python/tensorflowjs/read_weights.py, line 194 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
float16 weights should also convert back to the original dtype
Done. float16 is supported in dequantize_weights and I removed the check for uint8 and uint16 here. I also added a new test case to make sure we decode to the proper dtype.
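Conceptually this boils down to a cast back to the original dtype; a minimal sketch (illustrative helper name, not the actual read_weights.py/quantization.py code):

```python
import numpy as np

def dequantize_float16(data, original_dtype=np.float32):
  # float16 "quantization" is just a precision reduction, so restoring the
  # weights is a plain cast back to the original dtype. No scale/min metadata
  # is involved, unlike the affine uint8/uint16 path.
  return data.astype(original_dtype)

restored = dequantize_float16(np.array([0.5, 1.5, -2.0], dtype=np.float16))
assert restored.dtype == np.float32
```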
tfjs-converter/python/tensorflowjs/write_weights_test.py, line 701 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
can you add a test for float16?
Done. I just added to this test case so there is an example of quantizing different weights.
tfjs-converter/python/tensorflowjs/converters/converter.py, line 421 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
if any of the entries is empty, it will generate an empty list; will it translate to match all weights?
I'm not quite sure what you mean here; are you referring to supplying the flag --quantize_float16 without providing any weight names? If so, this is handled by argparse, which would provide the default value of * if nargs == 0.
tfjs-converter/python/tensorflowjs/converters/wizard.py, line 471 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
this is a bit confusing: does the user need all three types? Maybe for the wizard flow, have the user choose one compression type first, then specify the name in the second question? What do you think?
I think this is a good idea; if one requires more fine-grained control, they can supply these arguments using tensorflowjs_converter. I feel like the wizard flow for quantization will catch most use cases.
tfjs-core/src/io/io_utils.ts, line 144 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
add 'float16'
Done.
Hi @pyu10055, thanks for taking a look. I resolved most of the issues you outlined, but I need to sort out a couple more things before it's ready to be looked at again. More specifically, I need a better way of resolving the case where you want to catch all remaining weights with a specific quantization value; right now that case won't catch all remaining variables and quantize them to the requested dtype. It won't take much work, but I do want to provide sufficient tests for this mapping scheme. I'll let you know when it's ready to be looked at again.
pyu10055
left a comment
Thank you, this is great. One suggestion: can you keep the current quantization flag? When we deprecate a flag, we want to give users some time, for example a couple of releases, before we finally delete it.
In this case, we should show a warning message if the old quantization flag is used. I think the old flag can be mapped to your new quantization map, leaving the real code the same; only the converter.py wrapper code needs to be changed. For the wizard, since it is interactive, we can remove the flag without a problem.
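Roughly what I have in mind, as a sketch (the constant and flag names here follow the snippets quoted later in this review and may not match the final code exactly):

```python
import warnings

# Assumed lookup, mirroring quantization.QUANTIZATION_BYTES_TO_DTYPES in spirit.
QUANTIZATION_BYTES_TO_DTYPES = {1: 'uint8', 2: 'uint16'}

def map_legacy_flag(quantization_bytes, quantization_dtype_map):
  """Translate the deprecated --quantization_bytes value into the new dtype map."""
  if quantization_bytes is not None:
    warnings.warn('--quantization_bytes is deprecated; use --quantize_uint8, '
                  '--quantize_uint16 or --quantize_float16 instead.')
    dtype = QUANTIZATION_BYTES_TO_DTYPES[quantization_bytes]
    quantization_dtype_map[dtype] = True  # True = apply to all weights (fallthrough)
  return quantization_dtype_map
```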
Reviewable status: 0 of 1 approvals obtained (waiting on @JesseFarebro)
tfjs-converter/python/tensorflowjs/converters/converter.py, line 421 at r1 (raw file):
Previously, JesseFarebro (Jesse Farebrother) wrote…
I'm not quite sure what you mean here; are you referring to supplying the flag --quantize_float16 without providing any weight names? If so, this is handled by argparse, which would provide the default value of * if nargs == 0.
got it, thanks for explaining.
JesseFarebro
left a comment
I rebased off the latest changes and re-added the --quantization_bytes flag, which now just translates to the new quantization map. I added some tests for the fallthrough method I discussed above, along with more tests for the quantization map parsing. I also added some more float16 decoding tests and fixed up the failing WASM tests.
@pyu10055 if you could take another look now, I think this is in good shape.
Reviewable status: 0 of 1 approvals obtained (waiting on @pyu10055)
tfjs-core/src/io/io_utils_test.ts, line 630 at r1 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
can you add a couple of conversion tests for non-zero numbers?
Done.
pyu10055
left a comment
Looking really good, I have approved this PR with a couple of minor comments. Thanks, this is awesome.
Reviewed 2 of 18 files at r1, 16 of 16 files at r2.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @JesseFarebro)
tfjs-converter/README.md, line 158 at r3 (raw file):
|`--strip_debug_ops` | Strips out TensorFlow debug operations `Print`, `Assert`, `CheckNumerics`. Defaults to `True`.|
|`--quantization_bytes` | (Deprecated) How many bytes to optionally quantize/compress the weights to. Valid values are 1 and 2. which will quantize int32 and float32 to 1 or 2 bytes respectively. The default (unquantized) size is 4 bytes.|
|`--quantize_float16` | Comma separated list of node names to apply foat16 quantization. You can use the wildcard symbol (*) to apply quantization to multiple nodes. (e.g., conv/*/weights) |
float16
tfjs-converter/README.md, line 158 at r3 (raw file):
|`--strip_debug_ops` | Strips out TensorFlow debug operations `Print`, `Assert`, `CheckNumerics`. Defaults to `True`.|
|`--quantization_bytes` | (Deprecated) How many bytes to optionally quantize/compress the weights to. Valid values are 1 and 2. which will quantize int32 and float32 to 1 or 2 bytes respectively. The default (unquantized) size is 4 bytes.|
|`--quantize_float16` | Comma separated list of node names to apply foat16 quantization. You can use the wildcard symbol (*) to apply quantization to multiple nodes. (e.g., conv/*/weights) |
maybe say you can also use the wildcard symbol, since the default behavior is to match all weights.
tfjs-converter/README.md, line 160 at r3 (raw file):
|`--quantize_float16` | Comma separated list of node names to apply foat16 quantization. You can use the wildcard symbol (*) to apply quantization to multiple nodes. (e.g., conv/*/weights) |
|`--quantize_uint8` | Comma separated list of node names to apply 1 byte affine integer quantization. You can use the wildcard symbol (*) to apply quantization to multiple nodes. (e.g., conv/*/weights) |
|`--quantize_uint16` | Comma separated list of node names to apply affine integer quantization. You can use the wildcard symbol (*) to apply quantization to multiple nodes. (e.g., conv/*/weights) |
2 bytes affine integer quantization
tfjs-converter/README.md, line 411 at r3 (raw file):
This will exclude biases and any weights that don't begin with conv/. This can be a powerful tool to reduce model size while trying to maximize performance.
this is an awesome new feature, thank you!
tfjs-converter/python/tensorflowjs/quantization.py, line 109 at r3 (raw file):
data: A numpy array of dtype 'float32' or 'int32'.
quantization_dtype: A numpy dtype to quantize weights to. Only np.uint8 and np.uint16 are supported.
add np.float32
tfjs-converter/python/tensorflowjs/quantization.py, line 114 at r3 (raw file):
quantized_data: The quantized weights as a numpy array with dtype `quantization_dtype`.
scale: The linearly scaling constant used for quantization.
add a description that scale and min_val will be empty for np.float32
tfjs-converter/python/tensorflowjs/converters/converter.py, line 424 at r3 (raw file):
print(
    'Warning: --quantization_bytes will be deprecated in a future release\n'
    'Please consider using --quantization_uint16, --quantization_uint8, '
nitpick: move --quantization_uint8 before --quantization_uint16
tfjs-converter/python/tensorflowjs/converters/converter.py, line 431 at r3 (raw file):
dtype = quantization.QUANTIZATION_BYTES_TO_DTYPES[quantization_bytes]
quantization_dtype_map[dtype] = True
should this be True or '*'?
tfjs-converter/python/tensorflowjs/converters/converter.py, line 509 at r3 (raw file):
'--%s' % common.QUANTIZATION_TYPE_FLOAT16, type=str, default=None,
should all of these args default to '*' ?
JesseFarebro
left a comment
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @pyu10055)
tfjs-converter/README.md, line 158 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
float16
Done.
tfjs-converter/README.md, line 158 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
maybe say you can also use the wildcard symbol, since the default behavior is to match all weights.
I revised the wording for the flags, hopefully we can both agree on the new format. I wanted to try and mention that it can be provided as a flag without any nodes.
tfjs-converter/README.md, line 160 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
2 bytes affine integer quantization
Done.
tfjs-converter/README.md, line 411 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
this is an awesome new feature, thank you!
Done.
tfjs-converter/python/tensorflowjs/quantization.py, line 109 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
add np.float32
Done.
tfjs-converter/python/tensorflowjs/quantization.py, line 114 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
add a description that scale and min_val will be empty for np.float32
Reformatted the docstring to be more concise with the return value.
tfjs-converter/python/tensorflowjs/converters/converter.py, line 424 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
nitpick: move --quantization_uint8 before --quantization_uint16
I like the nitpick 😛 . I changed it!
tfjs-converter/python/tensorflowjs/converters/converter.py, line 431 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
should this be True or '*'?
It could be either, but I'll stick with True; with the new map parsing, it signifies that the dtype will be the fallthrough value for all weights.
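To illustrate the fallthrough idea, here is a rough sketch of how a weight name could be resolved against such a map (this is an assumption about the scheme for illustration, not the converter's actual parsing code):

```python
import fnmatch

def resolve_weight_dtype(weight_name, quantization_dtype_map):
  """Pick a quantization dtype: explicit patterns win, True is the catch-all."""
  fallthrough = None
  for dtype, patterns in quantization_dtype_map.items():
    if patterns is True:
      fallthrough = dtype  # remember the fallthrough dtype
      continue
    if any(fnmatch.fnmatch(weight_name, p) for p in patterns):
      return dtype
  return fallthrough

dtype_map = {'float16': ['conv/*/weights'], 'uint8': True}
assert resolve_weight_dtype('conv/1/weights', dtype_map) == 'float16'
assert resolve_weight_dtype('dense/bias', dtype_map) == 'uint8'
```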
tfjs-converter/python/tensorflowjs/converters/converter.py, line 509 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
should all of these args default to '*' ?
I don't think so. Isn't default the value in the absence of the flag? The value of const=True below is what we use when only the flag is provided, i.e., --quantize_float16 (the fallthrough case). Defaulting to None essentially means that if no quantization flag is provided they will default to None; I have this check in _parse_quantization_dtype_map, where you can see the is not None check.
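Roughly, the argparse setup behaves like the sketch below (one plausible configuration; the exact nargs/const definition in converter.py may differ slightly):

```python
import argparse

parser = argparse.ArgumentParser()
# nargs='?' allows the flag to appear with zero or one value:
#   flag absent             -> default (None): no float16 quantization requested
#   --quantize_float16      -> const (True):   fallthrough, quantize all remaining weights
#   --quantize_float16 a,b  -> 'a,b':          only the listed weight names/patterns
parser.add_argument('--quantize_float16', type=str, nargs='?',
                    const=True, default=None)

assert parser.parse_args([]).quantize_float16 is None
assert parser.parse_args(['--quantize_float16']).quantize_float16 is True
assert parser.parse_args(['--quantize_float16', 'conv/*']).quantize_float16 == 'conv/*'
```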
Thanks for the careful reviews @pyu10055. Do you have any idea what the timeframe for a new release will look like?
As discussed in #3221, this enables support for quantizing to float16 over the wire. Some models perform better with float16 quantization than with the current integer-based affine scaling method.
I also needed the ability to support a form of granular quantization where I want to quantize only certain values. This PR also includes those changes in the converter; no changes are required in the core to support this.
To recap, the major changes are:
Converter
--quantization_bytes is no longer a flag in the converter. It didn't make sense as we support two methods for 2 bytes. I added three new flags: --quantize_float16, --quantize_uint8, --quantize_uint16. These flags can be supplied on their own, or you can specify particular weight names or groups of weights using wildcard expansion. For example, a single invocation (see the sketch below) can quantize weights that match the prefix conv/ to float16, all biases with the prefix conv/ using 2 byte affine quantization, and all remaining weights with 1 byte affine quantization.
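As a rough illustration (the flag values here are hypothetical, not the original example from the PR description), a combination such as --quantize_float16='conv/*/weights' --quantize_uint16='conv/*/bias' --quantize_uint8 would end up as a quantization dtype map along these lines:

```python
# Hypothetical parsed dtype map (string keys, as in the test snippets quoted
# above); the exact representation inside the converter may differ.
quantization_dtype_map = {
    'float16': ['conv/*/weights'],  # conv/ weights -> 2-byte float16
    'uint16': ['conv/*/bias'],      # conv/ biases  -> 2-byte affine quantization
    'uint8': True,                  # everything else (fallthrough) -> 1-byte affine
}
```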
Weight Decoding
The only other addition revolves around decoding the float16 values in JavaScript to float32 values. JS doesn't support any form of Float16Array at this time, so I manually decode the weights using a table-based lookup method referenced from http://www.fox-toolkit.org/ftp/fasthalffloatconversion.pdf
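For intuition, a minimal per-value half-to-float conversion looks like the Python sketch below (bit layout only; the actual TypeScript implementation in this PR uses the table-based method from the paper above rather than this direct computation):

```python
def half_to_float(h):
  """Decode a 16-bit IEEE 754 half-precision bit pattern into a float."""
  sign = (h >> 15) & 0x1
  exponent = (h >> 10) & 0x1f
  fraction = h & 0x3ff
  if exponent == 0:        # zero / subnormal
    value = fraction * 2.0 ** -24
  elif exponent == 0x1f:   # infinity / NaN
    value = float('inf') if fraction == 0 else float('nan')
  else:                    # normal: implicit leading one, exponent bias of 15
    value = (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)
  return -value if sign else value

assert half_to_float(0x3c00) == 1.0   # 0x3c00 encodes 1.0
assert half_to_float(0xc000) == -2.0  # 0xc000 encodes -2.0
```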
Testing
I tested this myself with my own models and was able to deterministically reproduce my own results from Python TF. I wrote some tests for the float16 decoding edge cases and a couple of basic decoding test cases.
All the converter test cases are passing and I wrote a couple more tests for the new quantization scheme.
I haven't really tested the converter wizard yet; it doesn't run locally for me (tf.app is not supported?) and I haven't gotten around to building the pip package to test it out. I think the wizard quantization flow is kind of weird and will probably require some modifications.