-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add model compression to FP16 weights #7588
Add model compression to FP16 weights #7588
Conversation
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...ngine/src/transformations/src/transformations/rt_info/mark_precision_sensitive_subgraphs.cpp
Outdated
Show resolved
Hide resolved
inference-engine/tests/functional/inference_engine/transformations/compress_constants_test.cpp
Outdated
Show resolved
Hide resolved
...gine/src/transformations/include/transformations/common_optimizations/compress_constants.hpp
Outdated
Show resolved
Hide resolved
722c6c7
to
6f254d2
Compare
Had to force rebase, because of changes in third-party which I couldn't revert otherwise |
inference-engine/src/transformations/include/transformations/rt_info/decompression.hpp
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp
Outdated
Show resolved
Hide resolved
...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp
Outdated
Show resolved
Hide resolved
...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp
Outdated
Show resolved
Hide resolved
...clude/transformations/common_optimizations/disable_decomression_convert_constant_folding.hpp
Outdated
Show resolved
Hide resolved
246618b
to
1789616
Compare
...gine/src/transformations/include/transformations/common_optimizations/compress_constants.hpp
Outdated
Show resolved
Hide resolved
...lude/transformations/common_optimizations/disable_decompression_convert_constant_folding.hpp
Outdated
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...e-engine/src/transformations/src/transformations/common_optimizations/compress_constants.cpp
Outdated
Show resolved
Hide resolved
...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp
Outdated
Show resolved
Hide resolved
namespace ov { | ||
namespace pass { | ||
|
||
class TRANSFORMATIONS_API CompressFloatConstantsImpl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to reveal Impl
in API?
* @ingroup ie_transformation_common_api | ||
* @brief CompressFloatConstants transformation replaces FP32/FP64 Constants with FP16 ones. | ||
*/ | ||
class ov::pass::CompressFloatConstants : public ov::pass::GraphRewrite { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems not only about constant compression. It also covers parameters. I propose to rename it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It changes constants, parameters only get old api map with runtime info, CompressFloatConstants describe best I think.
...rmations/include/transformations/common_optimizations/mark_precision_sensitive_subgraphs.hpp
Show resolved
Hide resolved
...rmations/include/transformations/common_optimizations/mark_precision_sensitive_subgraphs.hpp
Show resolved
Hide resolved
...ne/src/transformations/src/transformations/common_optimizations/compress_float_constants.cpp
Outdated
Show resolved
Hide resolved
order.resize(r); | ||
std::iota(order.begin(), order.end(), 0); | ||
} else { | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can't we handle dynamic rank and use {} for order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
old api map is used for old api, right? Old api doesnt support dynamic rank anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is open question. It makes sense to discuss offline how we handle dynamic rank paramters.
...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp
Outdated
Show resolved
Hide resolved
...e/src/transformations/src/transformations/disable_decompression_convert_constant_folding.cpp
Show resolved
Hide resolved
...nsformations/src/transformations/common_optimizations/mark_precision_sensitive_subgraphs.cpp
Show resolved
Hide resolved
// Copyright (C) 2018-2021 Intel Corporation | ||
// SPDX-License-Identifier: Apache-2.0 | ||
// | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
propose to combine these routines with decompression*
routines in one place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a good idea in my opinion, all attribute are located in individual files, which is easier to manage
* Add model compression to FP16 weights * Fix build * Fix build * Fix build * Add wrapper over ConvertPrecision * Add documentation to attributes * Fix MO IR Reader * Fix build * Return DisableDecompressionConvertConstantFolding call in CommonOptimizations * Temporarily disable old_api map * Fix TI Convert issue * Apply review feedback * Fix build * Fix build * Fix build
* Add model compression to FP16 weights * Fix build * Fix build * Fix build * Add wrapper over ConvertPrecision * Add documentation to attributes * Fix MO IR Reader * Fix build * Return DisableDecompressionConvertConstantFolding call in CommonOptimizations * Temporarily disable old_api map * Fix TI Convert issue * Apply review feedback * Fix build * Fix build * Fix build
Details:
Tickets: