Add model compression to FP16 weights #7588

mvafin · 2021-09-21T20:56:11Z

Details:

Add a new command-line argument to MO that enables compression
Note in the old data_type option that it is deprecated

Tickets:

64953

...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp

...ngine/src/transformations/src/transformations/rt_info/mark_precision_sensitive_subgraphs.cpp

inference-engine/tests/functional/inference_engine/transformations/compress_constants_test.cpp

...gine/src/transformations/include/transformations/common_optimizations/compress_constants.hpp

mvafin · 2021-10-07T11:58:55Z

I had to force-push a rebase because of third-party changes that I couldn't revert otherwise.

inference-engine/src/transformations/include/transformations/rt_info/decompression.hpp

...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp

model-optimizer/mo/back/offline_transformations.py

...clude/transformations/common_optimizations/disable_decomression_convert_constant_folding.hpp

...lude/transformations/common_optimizations/disable_decompression_convert_constant_folding.hpp

rkazants · 2021-11-03T11:21:22Z

...rc/transformations/include/transformations/common_optimizations/compress_float_constants.hpp

+namespace ov {
+namespace pass {
+
+class TRANSFORMATIONS_API CompressFloatConstantsImpl;


Why do we need to expose the Impl class in the API?

rkazants · 2021-11-03T11:25:51Z

...rc/transformations/include/transformations/common_optimizations/compress_float_constants.hpp

+ * @ingroup ie_transformation_common_api
+ * @brief CompressFloatConstants transformation replaces FP32/FP64 Constants with FP16 ones.
+ */
+class ov::pass::CompressFloatConstants : public ov::pass::GraphRewrite {


This seems to be about more than constant compression, since it also covers parameters. I propose renaming it.

It changes only constants; parameters just get an old API map attached via runtime info. I think CompressFloatConstants describes it best.

...rmations/include/transformations/common_optimizations/mark_precision_sensitive_subgraphs.hpp

rkazants · 2021-11-03T13:32:25Z

...ne/src/transformations/src/transformations/common_optimizations/compress_float_constants.cpp

+                    order.resize(r);
+                    std::iota(order.begin(), order.end(), 0);
+                } else {
+                    return false;


Why can't we handle dynamic rank and use {} for the order?

The old API map is used for the old API, right? The old API doesn't support dynamic rank anyway.

This is an open question. It makes sense to discuss offline how we handle dynamic-rank parameters.
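
For reference, the static-rank branch under discussion can be sketched in isolation. This is a minimal sketch, not the PR's code: `make_identity_order` is a hypothetical helper, and it assumes that `r` in the diff is the known (static) rank and that the `return false` branch corresponds to the dynamic-rank case.

```cpp
#include <cstddef>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical helper, not part of the PR: builds the identity order
// {0, 1, ..., rank - 1} when the rank is known, and reports "no order"
// for a dynamic rank (assumed to mirror the `return false` branch).
std::optional<std::vector<std::size_t>> make_identity_order(std::optional<std::size_t> static_rank) {
    if (!static_rank.has_value()) {
        return std::nullopt;  // dynamic rank: the caller skips this node
    }
    std::vector<std::size_t> order(*static_rank);
    // Fill with 0, 1, 2, ... just like order.resize(r) + std::iota in the diff.
    std::iota(order.begin(), order.end(), std::size_t{0});
    return order;
}
```

The alternative raised in the question would return an empty order instead of `std::nullopt` for the dynamic case; which of the two the old API map should use is the point left open above.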

...e/src/transformations/src/transformations/disable_decompression_convert_constant_folding.cpp

rkazants · 2021-11-03T15:38:36Z

inference-engine/src/transformations/src/transformations/rt_info/disable_fp16_compression.cpp

+// Copyright (C) 2018-2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+


I propose combining these routines with the decompression* routines in one place.

I don't think that is a good idea: all attributes are located in individual files, which is easier to manage.

ngraph/core/src/op/util/precision_sensitive_attribute.cpp

ngraph/core/src/op/topk.cpp

…ression

model-optimizer/mo/main.py

* Add model compression to FP16 weights
* Fix build
* Fix build
* Fix build
* Add wrapper over ConvertPrecision
* Add documentation to attributes
* Fix MO IR Reader
* Fix build
* Return DisableDecompressionConvertConstantFolding call in CommonOptimizations
* Temporarily disable old_api map
* Fix TI Convert issue
* Apply review feedback
* Fix build
* Fix build
* Fix build

mvafin requested review from GlebKazantaev and ilyachur as code owners September 21, 2021 20:56

mvafin requested review from a team September 21, 2021 20:56

openvino-pushbot added the "category: Python API" (OpenVINO Python bindings) and "category: MO" (Model Optimizer) labels Sep 21, 2021

ilya-lavrenov reviewed Sep 21, 2021

...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp

mvafin requested review from a team and ilya-lavrenov September 22, 2021 07:40

ilya-lavrenov requested a review from vladimir-paramuzov September 22, 2021 08:15

ilyachur reviewed Sep 23, 2021

mvafin requested a review from a team September 27, 2021 09:38

mvafin requested review from a team as code owners September 29, 2021 19:46

ilyachur approved these changes Oct 4, 2021

...gine/src/transformations/include/transformations/common_optimizations/compress_constants.hpp

mvafin requested a review from a team October 7, 2021 11:26

mvafin force-pushed the api2/compression branch 2 times, most recently from 722c6c7 to 6f254d2 October 7, 2021 11:58

jane-intel requested changes Oct 15, 2021

rkazants reviewed Oct 17, 2021

...clude/transformations/common_optimizations/disable_decomression_convert_constant_folding.hpp

mvafin requested a review from a team October 20, 2021 21:57

mvafin force-pushed the api2/compression branch 2 times, most recently from 246618b to 1789616 October 21, 2021 13:10

GlebKazantaev suggested changes Oct 21, 2021

mvafin requested review from rkazants, jane-intel and GlebKazantaev October 27, 2021 07:32