Merging and pre hooks #302
This PR does two things: adds pre-hook functionality to quantization, and adds quantizer merging logic to the propagation mode.
Will add more commentary in-place.
```python
class InputIndexEntry:
    def __init__(self, path: Tuple[Union[int, str], ...], getter: Callable, setter: Callable):
        self.path = path
        self.getter = getter
        self.setter = setter


class TupleRebuildingSetter:
    def __init__(self, idx_to_set, current_tuple, previous_level_setter_for_current_tuple):
        self._previous_level_setter = previous_level_setter_for_current_tuple
        self._current_tuple = current_tuple
        self._idx_to_set = idx_to_set

    def __call__(self, value):
        tmp_list = list(self._current_tuple)
        tmp_list[self._idx_to_set] = value
        new_tuple = tuple(tmp_list)
        self._current_tuple = new_tuple
        self._previous_level_setter(new_tuple)


class OperatorInput:
```
- A PyTorch function input is not necessarily a single raw `torch.Tensor` for each input argument of the function. Consider `torch.cat([x1, x2])`, where the input tensors `x1` and `x2` are located inside a container, which is itself passed as an argument. This function call, however, conceptually takes two tensors as its input, and if this is not traced properly, the NNCF graph will become disjoint. In a previous implementation this was circumvented by flattening the node inputs - a node's input signature, which identifies it among the other NNCF nodes, is a flat list of all tensors in the input arguments' nesting hierarchy.

The same approach is taken here - a flat view of the function's input tensors is constructed, and each tensor is conceptually assigned a separate "input port ID" ordinal. However, this flat view must be a true view, i.e. it should allow the tensors to be replaced with the results of pre-processing while preserving the original nested structure. There should be an option to do so separately for each traced tensor present in the input argument container hierarchy - only in this case will the pre-hooking for separate inputs work in most cases. Sadly, Python has no real notion of writeable iterators; therefore, each tensor in the flat view must store and use associated getter and setter functions, which are built using the closest owning container's `__getitem__` and `__setitem__` and the tensor's index in said container.
```diff
@@ -234,7 +234,7 @@ def add_node(self, op_exec_context: OperationExecutionContext, inputs) -> NNCFNo
     self._nx_graph.add_edge(parent, node_key)
     has_traced_inputs = True
     self._nx_graph.edges[parent, node_key][NNCFGraph.ACTIVATION_SHAPE_EDGE_ATTR] = info.shape
-    self._nx_graph.edges[parent, node_key][NNCFGraph.IN_PORT_NAME] = i
+    self._nx_graph.edges[parent, node_key][NNCFGraph.IN_PORT_NAME_EDGE_ATTR] = i
```
- Since the PyTorch/Python combo is essentially dynamic, there is no clear definition of how many inputs (in terms of NNCF graph nodes) a given function takes (see the `torch.cat` example above). The function's "input ports" are therefore not static; rather, they are enumerated according to the number of traced tensors arriving as the function (node) input.
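As an illustration, here is a simplified sketch (not the actual NNCF code) of assigning port IDs by flattening the argument containers:

```python
import torch

def enumerate_input_ports(args, kwargs):
    """Assign an "input port ID" ordinal to each traced tensor found by
    flattening the argument containers (simplified sketch)."""
    flat = []

    def _visit(obj):
        if isinstance(obj, torch.Tensor):
            flat.append(obj)
        elif isinstance(obj, (list, tuple)):
            for item in obj:
                _visit(item)
        elif isinstance(obj, dict):
            for item in obj.values():
                _visit(item)

    _visit(args)
    _visit(kwargs)
    return dict(enumerate(flat))

# torch.cat([x1, x2]) has a single Python argument, but two input ports:
x1, x2 = torch.ones(1), torch.zeros(1)
assert list(enumerate_input_ports(([x1, x2],), {})) == [0, 1]
```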
nncf/nncf_network.py
```diff
 class InsertionPoint:
     def __init__(self, ia_op_exec_context: InputAgnosticOperationExecutionContext,
-                 insertion_type: InsertionType):
+                 insertion_type: InsertionType,
+                 input_port_id: int = None):
         self.ia_op_exec_context = ia_op_exec_context
```
- An insertion point now stores an input port to affect, in order to represent the pre-hook semantics as well; `None` specifies a post-hook insertion.
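For illustration, a hedged sketch of using the extended constructor (assuming an `OPERATOR_PRE_HOOK` counterpart to the `OPERATOR_POST_HOOK` value seen later in this PR; `ia_op_exec_context` is a previously obtained node address):

```python
# post-hook insertion point: input_port_id is left as None
post_ip = InsertionPoint(ia_op_exec_context, InsertionType.OPERATOR_POST_HOOK)

# pre-hook insertion point affecting the tensor arriving at input port 0
pre_ip = InsertionPoint(ia_op_exec_context, InsertionType.OPERATOR_PRE_HOOK,
                        input_port_id=0)
```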
nncf/quantization/algo.py
```diff
                  quantizers_between_quantizable_layers: QuantizersBetweenQuantizableLayers = None):
         self.quantizer_module_ref = quantizer_module_ref
-        self.affected_ia_op_exec_contexts = affected_ia_op_exec_contexts
+        self.affected_insertions = affected_insertions
```
- The `InputAgnosticOperationExecutionContext` is the address of a node, but a hook for a node can now be inserted in two ways - either post- or pre-. Therefore the position of a quantizer is no longer defined by a node alone, but by a node and the insertion type.
```python
pre_hook_id = PreHookId(ia_op_exec_context, input_port_id)
if pre_hook_id in self._pre_hooks:
    raise KeyError("Pre hook for context {} is already registered".format(str(pre_hook_id)))
self._pre_hooks[pre_hook_id] = fn_list
```
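A hedged sketch of how such a registered pre-hook list might be applied to the flat input view during tracing; everything except `PreHookId` and `self._pre_hooks` is an assumption for illustration (including the no-argument `getter`/`setter` calling convention):

```python
def execute_pre_hooks(self, ia_op_exec_context, flat_input_entries):
    # flat_input_entries: hypothetical list of entries, each holding a getter/setter
    # pair for one traced tensor (one "input port") in the nested argument structure
    for port_id, entry in enumerate(flat_input_entries):
        pre_hook_id = PreHookId(ia_op_exec_context, port_id)
        tensor = entry.getter()
        for fn in self._pre_hooks.get(pre_hook_id, []):
            tensor = fn(tensor)
        entry.setter(tensor)  # write the pre-processed tensor back into the container
```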
- Pre-hooks are what will allow quantizing branches in the network separately.

Below, on the left, is the result of MobileNetV2 quantization prior to this PR, and on the right - with this PR.

Notice that the `__add__` node used to have a single, nominal pre-hook insertion point, while in fact it takes two tensor activations. The pre-hook insertion points were also non-functional, i.e. the actual insertion could only occur at a post-hook point, which led to a situation where either both branches would be quantized with a single quantizer (if an FQ was inserted at the top-most post-hook insertion point), or neither of them would.

With this PR, the node has two pre-hook insertion points, corresponding to the number of inputs (i.e. "ports"). Furthermore, it is possible to separately quantize the `conv2d` input, using `conv2d`'s pre-hook 0, and the residual `__add__` input, using `__add__`'s pre-hook 0.
nncf/quantization/algo.py
```python
final_quantizer_config_setup[insertion_info] = quantizer_config

finalized_proposal = quantization_proposal.finalize(final_quantizer_config_setup)
final_quantizer_setup = prop_graph_solver.get_final_quantizer_setup(finalized_proposal)
```
- The quantizer propagation algorithm has been modified to work in two stages. This is related to the new quantizer configuration merging algorithm. Basically, the first stage of the propagation results in a proposal - a set of points at which a quantizer should be inserted in order to quantize all operations specified in the HW config with a supported configuration, and, for each point, a list of possible quantizer configurations out of which one and only one should be chosen. The choice can either be made by applying the NNCF config options/presets, or it can be delegated to a specialized algorithm, such as HAWQ (in case multiple bitwidths are present in the quantizer configuration list) or an RL algorithm.

Once the final quantizer configurations are chosen, an extra step is necessary to merge the quantizers that are redundant - e.g. if one and the same final quantizer configuration has been chosen both for the global quantizer of all downstream branches and for the local quantizer on a branch. This is illustrated below - at the top is the "proposed" state, which allows a choice among multiple quantizer configurations for each quantizer (blue nodes). At the bottom is the final state, provided that the user's/algorithm's choice for quantizers 59, 7 and 4 had all been to use 8-bit symmetric per-tensor quantization; quantizers 7 and 4 are redundant with respect to quantizer 59 and are therefore discarded.
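A conceptual sketch (not NNCF's actual API) of the redundancy rule described above, using the quantizer numbers from the illustration:

```python
def is_redundant(branch_quantizer_config, upstream_quantizer_config):
    # a branch quantizer is redundant if its final config matches the one already
    # applied by the merge quantizer above the branching node
    return branch_quantizer_config == upstream_quantizer_config

q59 = {"bits": 8, "mode": "symmetric", "per_channel": False}  # merge quantizer
q7, q4 = dict(q59), dict(q59)  # same final choice made on the branches

assert is_redundant(q7, q59) and is_redundant(q4, q59)  # 7 and 4 are discarded
```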
```python
                    # do not merge at all, or ...
    MODERATE = 1    # ... only merge for exact matches
    AGGRESSIVE = 2  # ... merge common parts, and if a branch quantizer has options for scope
                    # narrowing in addition to the common part, keep the quantizer on branch
```
The `AGGRESSIVE` strategy will be used by default, and it is the one most likely to garner performance benefits.
```python
# if insertion_type == InsertionType.OPERATOR_POST_HOOK:
#     return True
# return False
return True
```
Now that all insertion points, both pre- and post-hook, are functional, this function is mostly obsolete.
```python
else:
    # Not all of the dominated quantizers have reached the blocking node yet
    self._active_propagating_quantizers_queue.appendleft(curr_prop_quantizer)
return quant_prop_graph
```
- The new propagation algorithm will make all quantizers that are trying to pass up through a downward-branching node wait until the quantizers on the neighbouring branches reach that node as well. Only then will the merge process occur, which may result in a new merge quantizer being added above the branching node to propagate further, and in the existing branch quantizers being discarded. With `PropagationStrategy.AGGRESSIVE`, it is also possible that none of the branch quantizers are discarded but the merge quantizer is still created - for instance, if one of the branches will potentially require requantization, in case the user/specialized algorithm chooses a corresponding configuration.

Here, the merge quantizer will have a single 8-bit configuration associated with it, since this is the only option supported by all the downstream branches. If, at the final quantization config selection stage of the setup, some of the branches also end up having 8-bit quantization chosen for them, the corresponding branch quantizers will be discarded, since the global quantizer already provides 8-bit quantization. The user is still, however, presented with the possibility to choose non-8-bit quantization for all the branches, in which case requantization will occur on the branches.
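A sketch of the merge rule just described, under the assumption that the merge quantizer keeps exactly those configurations supported by every downstream branch (illustrative, not the exact NNCF code):

```python
branch_config_options = [
    [{"bits": 8}],               # branch that only supports 8-bit quantization
    [{"bits": 8}, {"bits": 4}],  # branch that also allows 4-bit quantization
]

# the merge quantizer's options are the intersection of all branches' options
merged_options = [cfg for cfg in branch_config_options[0]
                  if all(cfg in options for options in branch_config_options[1:])]
assert merged_options == [{"bits": 8}]  # a single 8-bit configuration remains
```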
```diff
     def get_base(self) -> 'InputAgnosticOperationExecutionContext':
         return self.ia_op_exec_context

     def get_suffix(self) -> str:
-        return ''
+        return '|OUTPUT' if self.input_port_id is None else '|INPUT{}'.format(self.input_port_id)
```
The string representation of a quantizer has changed, since it now needs to reflect the insertion type - either as an input quantizer to an operation or as an output quantizer. This may affect code that relies upon it - please comment if you know of such pieces of code.
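For illustration, assuming the quantizer's string form is its base node address plus the suffix from the diff above (the exact composition and base format are assumptions), the two variants would look like:

```python
# post-hook (output) quantizer of a node:
"Inception3/InceptionA[Mixed_5b]/cat_0|OUTPUT"
# pre-hook quantizer acting on input port 1 of the same node:
"Inception3/InceptionA[Mixed_5b]/cat_0|INPUT1"
```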
@alexsu52 @AlexKoff88 @ljaljushkin @asenina, add others as necessary.
Jenkins please retry a build
"62 Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/NNCFConv2d[conv]/conv2d" [color=lightblue, id=62, label="conv2d_#62", scope="Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/NNCFConv2d[conv]", style=filled, type=conv2d]; | ||
"63 Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/BatchNorm2d[bn]/batch_norm" [id=63, label="batch_norm_#63", scope="Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/BatchNorm2d[bn]", style=filled, type=batch_norm]; | ||
"64 Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/RELU" [id=64, label="RELU_#64", scope="Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]", style=filled, type=RELU]; | ||
"65 Inception3/InceptionA[Mixed_5b]/cat" [id=65, label="cat_#65", scope="Inception3/InceptionA[Mixed_5b]", style=filled, type=cat]; |
After enabling heterogeneous quantizer configuration handling for HAWQ, I managed to restore most of the reference bitwidth distribution graphs to the same state that is currently present on develop.

Current state of the same piece of the graph that we considered in this discussion:

Note that 4-bit AFQ requantizations appeared on the branches. This is due to the fact that the top-level FQ is 8-bit asymmetric per-tensor, which is the only configuration compatible with both the 8-bit symmetric per-channel listed as supported by avg_pool2d in `vpu.json`, and the conv2d inputs, which support 4-bit symmetric per-tensor and 8-bit asymmetric per-tensor quantization. Since 4 bits have been chosen for the conv2d weights, the corresponding activation FQs have been selected as 4-bit, which leads to requantization. Another requantization (no. 104) occurs for avg_pool2d to narrow the quantized output range from asymmetric to symmetric; although the per-channel quantization in no. 104 is not narrowing w.r.t. nos. 34, 45, 61, 69, this does not break anything.

Also note that the requantizers share the adjacency group with their base quantizers. Hope this is OK.
Jenkins please retry a build
* Fix input quantization in case of embeddings (#97)
* Added sanity tests for third party integration (#45)
* Expose quantizer linking through config (#100)
* Add citing section to frontpage README (#103)
* Fix bad rebase in asymmetric quantization ONNX export (#104)
* Use default quantizer configuration for op weights not specified in HW config (#105)
* Update transformers to v3.0.2 (#107)
* Fix symmetric quantizer per-channel init for max values close to 0 (#109)
* Add unified scales in HW config operation (via quantizer linking) (#108)
* Add quantization metric (#33)
* Make HW config parsing conform to the implicit rules (#111) (except for the "any supported quantization for the ops in config without specified quantizations", because they need config wildcarding, to be implemented as a follow-up)
* Fix MobileNetV2 INT8 config (#113)
* Use sequential sampling for evaluation across example scripts (#114) Hopefully this will make nightly compression training "eval" tests more stable.
* Fix third_party_sanity tests (#115)
* Properly handle ops in HW config without quantization configs associated (#119) These get associated with a "wildcard" propagating quantizer, which will either get merged with any other quantizer during propagation, or get assigned a default quantization config.
* Make criterion optional in signature of register_default_init_args() (#121)
* make optional criterion in signature of register_default_init_args()
* update README.md as Vasiliy asked
* Add Googlenet with pruning configs (#122)
* Fix pretrained (#125)
* Mark Convs as non-depthwise for 1 input channel case (#126)
* Add non-RELU activations to fusable patterns (#124)
* Fixed Pylint warnings (#129)
* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)
* Add per layer initialization of ranges. (#116)
* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138)
* Fix PyLint. (#139)
* Introduced compression ratio parameter for Mixed Precision init (#133)
* Introduced compression ratio parameter for Mixed Precision init It's used for choosing optimal mixed precision configuration for a given ratio. Compression ratio of mixed precision quantization is calculated by relation to fully INT8 one. Total compression for the model is sum of compression for each quantized layer, which is multiplication the layer's (Conv, Deconv, Linear) FLOPS and number of bits for its quantization. The ratio is used for estimation of performance boost for quantized model It's a better proxy for amount of calculation then number of parameters multiplied by bitwidth
* Added link to the full configuration file with template usage
* disclaimer about model specific params in template
* corrected articles, contractions, mixed precision-> mixed-precision
* Fix bug with NoCompressionAlgorithmController (#150)
* Set data loading workers to 0 across tests to force single process (#162)
* Set data loading workers to 0 across tests to force single process Could fix the consequences of pytorch/pytorch#39570
* Remove more-itertools dependency
* Specify NNCF import order in docs (#161)
* Specify NNCF import order in docs
* Fix frontpage integration instructions
* Bump mmdetection version to 2.4.0 (#166)
* Fix command line creation for test_compression_training (#167)
* Improve eval test code (#160)
* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)
* Fix third_party_sanity and eval test bugs (#169)
* Fix mmdetection dataset search path for SSD (#176)
* Test stability (#179)
* Increase eval threshold for test_compression_training cases CUDA computation seems to inherently cause differences of at least 0.01% in accuracy metric computation between the train and eval runs
* Reduce batch size for SSD512 eval CI runs (avoid OOM)
* Renamings (#178)
* Fixed disabling gradients of quantizers for HAWQ (#184)
* Corrected default values in range initializers (#183) - Right minimal and maximum values for mean_min_max doesn't skip check for not collected statistics and prevents from initializing by inf values. - Percentile init doesn't crash by default
* Refactor imports in setup.py (#182) Important for CI
* Fix security issues with imports (#185)
* Fix paths to COCO in mmdetection third party sanity tests (#186)
* Build graphs within the torch.no_grad() context (#187) Should reduce memory usage during create_compressed_model
* Fix security issues directly in code (#189)
* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)
* Make default install support non-GPU cases (#193)
* Fixed backward compatibility test (#195)
* Improve quantizer setup for hanging batchnorm nodes (#192)
* Do not merge subgraphs if subgraph has more than one output node
* Mark BatchNorm as INPUTS_QUANTIZABLE by default Will manifest itself in case there is a batch norm operation that was not merged to any previous op, i.e. should accept quantized input instead of FP32
* Fix export for nodes with metatypes not redefined by pruning algo (#171)
* Add more security fixes (#197)
* Removed double logging to stdout (#198)
* ignore frozen layers during filter pruning (#200)
* Use latest matplotlib version (#206)
* Use propagation based mode by default (#181)
* Set propagation_based mode by default.
* Fix compressed graphs.
* Fix quantize inputs option.
* Add operator metatypes for 'sigmoid' and 'add' operator (#209)
* Add operator metatypes for 'sigmoid' and 'add' operator
* remove trailing spaces Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
* Grouping of pruning modules + clusterisation classes
* Small fixes
* Introduced `enabled` parameter for Quantizers (#194) Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default
* Added model analysis file
* Update documentation (#219)
* Update documentation.
* Update docs. Add dependencies for param to json schema.
* Fixes for grads + batch norms
* To fix cpu_only part (#221)
* To update cpu_only part dockerfile; fix issue with setup.py install with --cpy-only opt; fix README.md
* apply remarks
* Fix register_operator (#224)
* Add per-layer sparsity. (#127)
* Do not call _quantize_inputs for propagation based mode (#229)
* Consistent bitwidth for activations and weight in propagation mode (#191)
* Added sota eval tests via AC (#142)
* Refactored HAWQ: split functionality into separate files (#232)
* Allow quantizing modules that share their weights for multiple operations (#235)
* Filter quantizers that directly act upon integer inputs (#228)
* Add support sparsity freeze epoch for magnitude sparsity. (#218)
* Liberal bitwidth assignment mode by default on precision initialization (#222)
* Fix AdaptiveSparsityScheduler. (#236)
* Fix threesigma init (#240)
* Build extensions in a temporary folder (#239)
* Refactoring + added step with model analysis
* Criterion generalization for HAWQ algorithm (#230)
* Criterion generalization for HAWQ algorithm
* scope_node -> node_scope
* Documentation update
* Described in docs when to use additional parameter 'criterion_fn'
* Fixes for pruning info
* fix quantization range initialization in case of 1 scale channel (#241) fix quantization range initialization in case of 1 scale channel to avoid initialization only by single slice of data (data[0]) and ignoring the other (data[1], data[2],.....)
* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244) Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
* Add DW-conv to input quantizable op. (#220)
* Fixed skip Openvino tests and preinstall (#246)
* Small cleanup + refactoring
* Corrected handling of barrier on the graph traverse (#249)
* Extend input handling flexibility (#242)
* Handle inputs better using input_infos
* Update nncf/model_creation.py
* Corrected handling Inception outputs in classification sample (#251)
* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)
* Change quantization levels for SymmetricQuantizer from 255 to 256
* Update test_functions with new level
* Fix bug with weights range, Make formulas dependent only from one value - levels, thereby reducing the chance to make a mistake
* Fix PyLint
* Update HW configs with new quantization level_low
* Fix bug with float type
* Change type() to isinstance()
* Change return values order in calculate_level_ranges
* step 1
* Fix bug with export to Q/DQ (#248)
* Fix bug with export to Q/DQ Add hack of export processing for our old checkpoints Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch ONNX exporting supports only per-tensor.
* Fix Pylint
* Update layers.py
* Fix bug in AssymetricQuantizer export; Add tests
* Fix pylint
* Fix bug in AssymetricQuantizer export; Add tests
* Fix pylint Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
* Update results and links to the checkpoints (#253)
* Update documentation for release v1.5.0 (#252)
* Update documentation for release v1.5.0
* Corrected HAWQ documentation
* Add per-range initialization notes Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)
* rebase
* add third-party sanity tests for Mask-RCNN IS model
* add Mask-RCNN accuracy results to tables
* fix link in README
* add instance segmentation ref to README
* fix voc path
* fix retinanet config
* Update version.py
* Fixed old tests tests
* Add test for pruning groups checks
* Fix pylint + small cleanup
* More clarification about `bits` parameter in docs (#263)
* make customer happy to see param name that is wrong (#259)
* kernel chainges
* Add pruning sample tests. (#268)
* Change an operation order in create_compressed_model (#265)
* Introduce additional evaluation of loss function to SSD application
* Expanded table, skiped unsupported models (#234) Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
* Mlflow log (#243)
* mlflow logging
* something
* some changes
* Some fixes and clear up
* Symbolic link update
* Final Updates
* Little fixes
* Little fixes(one more)
* Test mlflow off
* Deleted hardcoded log dir
* Generalization
* Clear up
* Fixes
* code fixes
* Common classification functions carry out
* Metrics logging changes
* Fix comments
* Fix pylint
* Fix pylint
* Fix last linter warnings
* Cpu nms kernels replaced by torch func
* Extended test for model analysis
* Clean up
* Small pylint + comments fixes
* Fix gradients zeroing + prune batch norms by default
* Fix prune batch norm default
* Fix test
* is cuda
* Compress in eval mode (#257)
* Pruning of ConvTranspose (#274)
* Add pruning of ConvTranspose
* Rename to target_weight_dim_for_compression
* fixes
* Fix zero_grad
* get_op_types_of_pruned_modules
* Fixed collecting metrics.json for incomplete eval test (#279)
* Added Unet Mapillary AC configs (#281)
* Added flag for collection quickly computed stats (#287)
* Remove __getattr__ from SampleConfig (#292) Newer `addict` version uses custom private attributes for internal working and __getattr__ disrupted it. It was quite useless anyway.
* Fix H/W on an image in the mock coco dataset (#291)
* Set proper workdir path for Mask-RCNN (#294)
* Proper BN momentum parameter and train mode setting in BN adaptation (#288)
* proper BN momenta parameter and train mode setting in BN adaptation
* use training mode switcher context maganer for BN adaptation inference
* Testing OPs quantization by synthetic tests (#297) Also
* Made LeakyRELU as input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case
* Revised mixed-precision related content (#300)
* Moved mixed_precision configs to the separate folder
* Minimized the scope of parameters in this config removing as much as possible and let them be the defaults ones.
* Remove .dot extension in the HW config test case descriptor (#303)
* Switch to VOC2012 in eval mode (#295)
* Updated pruning configs and results (#305)
* Don't call MLFlow if it's not enabled (#304) Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.
* Add input/output-names parameters to export_model function. (#296)
* Fixed paths to mixed-precision configs (#306)
* Correct configs for mixed precision models (#307) After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings That's why manual configs should be used to achieve a target accuracy
* Removed custom SqueezeNet model for better user experience (#308)
* Correct configs for mixed precision models After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings That's why manual configs should be used to achieve a target accuracy
* Removed custom SqueezeNet model for better user experience Originally we had a modified copy of SqueezeNet model to workaround a bug in ONNX exporter with converting MaxPool with ceil_mode=True. This bug isn't actual now for torch 1.5 and there's almost identical SqueezeNet model in torchivision > 0.6. That's why custom SqueezeNet was deleted as not needed to remove confusion. There's no changes in the corresponding NNCF graph. Previously trained checkpoints for custom SqueezeNet can be loaded and evaluated with SqueezeNet from torchvision. INT8 model has the same accuracy, mixed model is differ only by ~0.01 in maximum.
* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)
* Added ResNet-18 magnitude Filter Pruning config and snapshot
* Adjusted checkpoint validation
* Move call epoch_step() method to begin of epoch. (#231)
* Move call epoch_step() method to begin of epoch.
* Move sparsity_init parameter to algo logic.
* Fix some sanity sample tests for semantic segmentation.
* Fix object detection example.
* Update docs.
* Fix per_step option scheduler. Refactoring.
* Rename value of target_device from "NONE" to "TRIAL" (#314)
* Move call epoch_step() method to begin of epoch. (#231)
* Move call epoch_step() method to begin of epoch.
* Move sparsity_init parameter to algo logic.
* Fix some sanity sample tests for semantic segmentation.
* Fix object detection example.
* Update docs.
* Fix per_step option scheduler. Refactoring.
* Rename target_device "NONE" to "TRIAL".
* Fix NMS CUDA extensions import for CPU only case (#316)
* Made initialization depending on the number of samples. (#309)
* Wrapped MLFlow for safe access (#313)
* Introduced a separate batch size for initialization (#315)
* Separate data_loader is registered for initialization via `register_default_init_args`
* WA for Python 3.6 on CI (#321)
* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)
* Show subprocess log in test assertion stacktrace (#325)
* Adjust ICNet compressed target values (#326)
* Do not replace parameter during symmetric range init (#327) The initialization using the controller method may occur *after* the optimizer received the list of model's parameters, so replacing the parameter as a whole during such initialization will break the gradient updates.
* Increase number of epochs in sanity test runs (#324) Should uncover more bugs.
* Replace the rest of num_init_steps entries with num_init_samples (#328)
* Use PyTorch 1.7 (#223)
* Move epoch_step and step to the beginning of epoch for staged worker (#318)
* Use torch 1.7.0 for third party sanity tests (#333)
* Fix mixing cyrillic and latin letters (#335)
* Fix calculate statistics in local mode sparsity. (#337)
* Fk/update packages versions (#338)
* Adding definitive version of required packages, move to python3.8, update ReadMe
* Add difinitive versions of packages only
* Add difinitive versions of packages only.fix01
* Update accuracy target values after switching to torch 1.7.0 (#334)
* Change tensorboardX to pytorch.utils.tensorboard (#332)
* Change tensorboardX to tensorboard
* Add tensorboard version
* Add domain in onnx-model for custom operations. (#323)
* Corrected grouping of activation quantizers (#339) Not merged FQ's for activations should be in different groups, if unmerged activation FQ on the branch goes directly after another FQ for activation (common input for different branches). start->FQ_A Conv \ / POST_HOOK / \ PRE_HOOK PRE_HOOK | \ div MaxPool here->|FQ_A| \ / POST_HOOK
* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)
* Changed AC configs for SSD models (#341)
* Revert "Fk/update packages versions (#338)" (#343) This reverts commit 8c17e0c.
* Fk/update packages versions (#344)
* Adding definitive version of required packages, move to python3.8, update ReadMe
* Add difinitive versions of packages only
* Add difinitive versions of packages only.fix01
* Add to requiremet for pandas
* Adding definitive version of required packages, move to python3.8, update ReadMe
* Add difinitive versions of packages only
* Add difinitive versions of packages only.fix01
* Add to requiremet for pandas
* Fix mistake in tensorboard name
* Fix per-layer sparsity. Add stub scheduler. (#340)
* fix config path (#346)
* Add Embedding to the CPU HW config definition (#347)
* Added separated execution OV tests to start parraleling (#282)
* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)
* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)
* add sclae_log parameters for quantization. its allow to increse convergence speed for high scales and increase accuracy for low scales.
* add _ to make some variable "hidden"
* variant of setter for scale
* add setter for input_range for asymetric quantizer
* scale_log_flag used outside to print status so I've back .sclae_log_flag instead of._scale_log_flag
* made scale_log_flag read only
* add test for sclae_log parameter.
* Update test_scale_log.py
* add missing key check due to load_state_dict white spaces fix
* remove quantizer.scale = torch.nn.Parameter() to avoid torch error
* fix test_unified_scales_are_identical_in_onnx fail due to unable to set Parameter by property
* remove useless init method
* split long line
* fix tes_unified_scales
* Update test_scale_log.py
* update ref file by replace scale -> _scale_tensor
* Update README.md
* Update README.md
* Update layers.py
* fix HookAutoRemove
* Improvements Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
* Fixed protobuf error (#349)
* Add quantization support for nn.EmbeddingBag (#330)
* Add quantization support for nn.EmbeddingBag
* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT
* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag
* Remove duplicated synthetic model test of nn.Embedding
* Add EmbeddingBag to the CPU HW config definition
* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc
* Handle network input nodes to NNCFEmbeddingBag
* Fix pylint warnings
* Vpu config revision (#356)
* Revised VPU config
* More extreme ratio for VPU config to test INT2 bitwidth assignment Also updated reference graphs Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
* Renamed case-sensitive files to prevent git issue on Windows (#357) After checkout fresh develop, git confuses file with/without capital letter. As a result this file can't be discarded. `git config --global core.ignorecase true` doesn't work as well
* Update mmdet patch (#354)
* Update mmdet patch
* Update configs and meta
* Add export tests
* Update test
* Update package installation
* Compression statistics before training (#345)
* Compression statistics before training
* Compression statistics before training
* print_statistics sanity test
* Object detection test fixes
* is_main_process aligning
* pylint disabling
* Pruning refactoring to work with FLOPs target too (#320)
* Added pruning_flops_target param and all necessary functions
* Added tests
* Pylint fixed
* Fixed comments: BatchNorm deleted from flops calculations and small refactoring
* Fix tests
* Delete bias from FLOPs calc + test reverting
* Fix bug with mmdet patch (#363)
* Fix bug with mmdet patch
* Fix bugs
* Fix pylint
* Added ONNX Q-DQ converting parameters (#362)
* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368) This reverts commit b0504e9.
* Beta directory (#364)
* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)
* update documentation
* updated checkpoint links
* nncf-tensorflow alpha
* Use PyLint 2.6+ (#370)
* Fix missing default value (#373)
* Enable batch norm adaptation by default (#360)
* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)
* Add pre post processing test (#374)
* Fix missing default value
* Add pre_post processing tests
* Relax upper-bound threshold for mixed precision ResNet50 (#375)
* Use a reduced number of BN adaptation samples for sanity testing (#378)
* Dropped last data point in all DataLoaders to prevent issue with BN (#379) There is a little chance that last data point may have a batch size equal to 1, which leads to an error: ``` ValueError: Expected more than 1 value per channel when training ``` We caught this error in sanity tests with CIFAR10. The dataset has 1000 data points. There're 333 data points with batch_size=3 and the last one with batch_size=1. Training may fail in the end of epoch, which is not accepted for bigger datasets.
* Fix eval failures due to BN adaptation enabled by default (#377)
* Reduce BN adaptation samples count in HAWQ sanity configs (#380)
* Fix object detection sample. (#383)
* Added Q-DQ ONNX converting parameter (#369)
* Links to models were updated (#386)
* include_mask flag for tfds decoder was added (#385)
* include_mask flag for tfds decoder was added
* Support of the input_info param was added (#388)
* change VOC dataset namings (#387)
* Configure device by common function for all samples (#391)
* Reduced num_init_samples for range init to accelerate sanity tests (#392)
* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)
* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)
* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)
* Add pruned ssd300 and unet_mapillary (#393)
* Print flops pruning level in statistic (#367)
* Print flops pruning level in statistic
* Calculate current flops after update masks
* Fix: missed transpose convolution
* add test_calculation_of_flops
* Fix compute_flops_hook for nn.linear
* Add comment for compute_flops_hook
* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)
* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision
* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and initialize layer-wise quantization bitwidth, prior to NNCF quantize-aware fine-tuning
* The mixed-precision initialization is optimized towards minimal accuracy drop given a user-specified model size constraint
* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space
* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)
* Fix pruning l2norm (#310)
* Fix pruning l2norm
* Use register_module for l2norm
* Add filter by algorithms for registred modules
* Add condition to add _registred_name in registred module
* resolve comments
* fix pylint
* Update reference dot files
* Separate the examples and test Python package requirements from NNCF (#384)
* converted relative imports to absolute imports (#396)
* Add ac configs for pruned unet and ssd300 (#399)
* Add ac configs for pruned unet and ssd300
* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median
* Added proper license for DDPG-related code (#398)
* Add some explanations to make doc clearer (#395)
* Add some explanations to make doc clearer
* docs cleanup Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
* Simplify paths to configs (#400)
* Path to config was fixed
* Paths to configs were simplified
* Add ssd_mobilenet_voc_sparsity_int8 config (#404)
* Use links to config files for NNCF READMEs (#407)
* Combined package (#410)
* beta.nncf package
* removed pytest.ini
* Return pandas to the list of requirements (#405)
* Remove NNCF package dependency on tensorboard (#411)
* Small scheduler fixes (#412)
* Add step to pruning shedulers and algo + delete redundant pruning rate setting
* Fix tests
* Revert same pruning rate changes
* Add pruning_init in test_calculation_of_flops Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>
* [TF] Minor fixes (#403)
* Minor fixes
* Pylint issues were fixed
* Extra line was removed Co-authored-by: Alexander Suslov <alexander.suslov@intel.com> Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
* [TF] Add handling of non-distributed strategy (#401)
* Default strategy was added
* cpu-only flag was disabled for Mask R-CNN training
* Fixed non-distributed mode for the object detection sample
* Merging and pre hooks (#302)
* Add pre-hook functionality to quantization
* Add quantizer merging logic to the propagation mode
* Properly update and merge quantizers between quantizable layers
* Move adjacent quantizer group creation closer to the builder stage
* Store affected op node key in the propagating quantizer
* Refactor quantization to jointly quantize weights and activations
* Fix clearing constraint sets during liberal activation bitwidth assignment
* Add initial version of build-time range init
* Make HAWQ work with heterogenous quantizer configurations
* Finalize the switch to build-time range init
* Properly compare quantizer configs for requantization purposes
* Fix quantizer ordering once again
* Improve HAWQ bitwidth reference graph formatting
* Add NNCF network clean view tests
* Fix errors
* Use statistics approach for the runtime range init
* Add tests for separate statistic collectors
* Extend range init setting tests
* Fix rebasing issues
* Switch AutoQ to setting compatible configs instead of bitwidths
* Ref HAWQ file adjustments after fixing experimental controller init
* Relax requirements packages versions (#415)
* using common registry (#414)
* fixed sanity tests for samples (#417)
* Common NNCFConfig (#413)
* using common config
* added jsonschema to requirements
* Fix third-party sanity tests (#420)
* Fix NoCompressionAlgorithmBuilder (#426)
* fixed issues with paths (#425)
* 00.0:Updating NNCF github dockerfiles against last changes (#436)
* Change thresholds for pruned ssd300 (#435) diff_fp32_min from -1.2 to -4.8
* Use one of the registered JSON meta-schemae (#439) Fixes: #416
* Use non-recursive BFS for graph traversal (#440)
* Use non-recursive BFS for graph traversal Python does not handle deep recursion stacks well.
* Use DFS by default, after all
* Add AC config for SSD300_mobilenet on voc. (#441)
* Minor fixes for HAWQ (#442) Set debug log directory for collecting hawq-related data not only in debug mode, but via option `dump_precision_init_data` Corrected printing of chosen bitwidth configuration
* Init on same device by default (#438)
* Use model's own device for initialization by default
* Adjust init args documentation
* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices
* Use cuda for precision init tests
* Remove extra entries from MANIFEST.in (#452)
* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)
* Changed working logic with json metrics (#447)
* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2 Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
* Apply nncf.register_module correctly in transformers (#454)
* Fix metric value for ssd300_mobilenet_voc. (#453)
* Do not follow symlinks when opening files (#451)
* Correctly construct Q-DQ config for E2E tests (#456)
* Update documentation for the v1.6.0 release (#457)
* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
* Fix input quantization in case of embeddings (#97) * Added sanity tests for third party integration (#45) * Expose quantizer linking through config (#100) * Add citing section to frontpage README (#103) * Fix bad rebase in asymmetric quantization ONNX export (#104) * Use default quantizer configuration for op weights not specified in HW config (#105) * Update transformers to v3.0.2 (#107) * Fix symmetric quantizer per-channel init for max values close to 0 (#109) * Add unified scales in HW config operation (via quantizer linking) (#108) * Add quantization metric (#33) * Make HW config parsing conform to the implicit rules (#111) (except for the "any supported quantization for the ops in config without specified quantizations", because they need config wildcarding, to be implemented as a follow-up) * Fix MobileNetV2 INT8 config (#113) * Use sequential sampling for evaluation across example scripts (#114) Hopefully this will make nightly compression training "eval" tests more stable. * Fix third_party_sanity tests (#115) * Properly handle ops in HW config without quantization configs associated (#119) These get associated with a "wildcard" propagating quantizer, which will either get merged with any other quantizer during propagation, or get assigned a default quantization config. * Make criterion optional in signature of register_default_init_args() (#121) * make optional criterion in signature of register_default_init_args() * update README.md as Vasiliy asked * Add Googlenet with pruning configs (#122) * Fix pretrained (#125) * Mark Convs as non-depthwise for 1 input channel case (#126) * Add non-RELU activations to fusable patterns (#124) * Fixed Pylint warnings (#129) * Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132) * Add per layer initialization of ranges. (#116) * Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138) * Fix PyLint. (#139) * Introduced compression ratio parameter for Mixed Precision init (#133) * Introduced compression ratio parameter for Mixed Precision init It's used for choosing optimal mixed precision configuration for a given ratio. Compression ratio of mixed precision quantization is calculated by relation to fully INT8 one. Total compression for the model is sum of compression for each quantized layer, which is multiplication the layer's (Conv, Deconv, Linear) FLOPS and number of bits for its quantization. 
The ratio is used for estimation of performance boost for quantized model It's a better proxy for amount of calculation then number of parameters multiplied by bitwidth * Added link to the full configuration file with template usage * disclaimer about model specific params in template * corrected articles, contractions, mixed precision-> mixed-precision * Fix bug with NoCompressionAlgorithmController (#150) * Set data loading workers to 0 across tests to force single process (#162) * Set data loading workers to 0 across tests to force single process Could fix the consequences of pytorch/pytorch#39570 * Remove more-itertools dependency * Specify NNCF import order in docs (#161) * Specify NNCF import order in docs * Fix frontpage integration instructions * Bump mmdetection version to 2.4.0 (#166) * Fix command line creation for test_compression_training (#167) * Improve eval test code (#160) * Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158) * Fix third_party_sanity and eval test bugs (#169) * Fix mmdetection dataset search path for SSD (#176) * Test stability (#179) * Increase eval threshold for test_compression_training cases CUDA computation seems to inherently cause differences of at least 0.01% in accuracy metric computation between the train and eval runs * Reduce batch size for SSD512 eval CI runs (avoid OOM) * Renamings (#178) * Fixed disabling gradients of quantizers for HAWQ (#184) * Corrected default values in range initializers (#183) - Right minimal and maximum values for mean_min_max doesn't skip check for not collected statistics and prevents from initializing by inf values. - Percentile init doesn't crash by default * Refactor imports in setup.py (#182) Important for CI * Fix security issues with imports (#185) * Fix paths to COCO in mmdetection third party sanity tests (#186) * Build graphs within the torch.no_grad() context (#187) Should reduce memory usage during create_compressed_model * Fix security issues directly in code (#189) * Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190) * Make default install support non-GPU cases (#193) * Fixed backward compatibility test (#195) * Improve quantizer setup for hanging batchnorm nodes (#192) * Do not merge subgraphs if subgraph has more than one output node * Mark BatchNorm as INPUTS_QUANTIZABLE by default Will manifest itself in case there is a batch norm operation that was not merged to any previous op, i.e. should accept quantized input instead of FP32 * Fix export for nodes with metatypes not redefined by pruning algo (#171) * Add more security fixes (#197) * Removed double logging to stdout (#198) * ignore frozen layers during filter pruning (#200) * Use latest matplotlib version (#206) * Use propagation based mode by default (#181) * Set propagation_based mode by default. * Fix compressed graphs. * Fix quantize inputs option. * Add operator metatypes for 'sigmoid' and 'add' operator (#209) * Add operator metatypes for 'sigmoid' and 'add' operator * remove trailing spaces Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com> * Grouping of pruning modules + clusterisation classes * Small fixes * Introduced `enabled` parameter for Quantizers (#194) Also: * corrected script to add new quantization parameters to checkpoints * added warning on exporting disabled quantizations * print statistics about enabled quantizers by default * Added model analysis file * Update documentation (#219) * Update documentation. * Update docs. 
Add dependencies for param to json schema. * Fixes for grads + batch norms * To fix cpu_only part (#221) * To update cpu_only part dockerfile; fix issue with setup.py install with --cpy-only opt; fix README.md * apply remarks * Fix register_operator (#224) * Add per-layer sparsity. (#127) * Do not call _quantize_inputs for propagation based mode (#229) * Consistent bitwidth for activations and weight in propagation mode (#191) * Added sota eval tests via AC (#142) * Refactored HAWQ: split functionality into separate files (#232) * Allow quantizing modules that share their weights for multiple operations (#235) * Filter quantizers that directly act upon integer inputs (#228) * Add support sparsity freeze epoch for magnitude sparsity. (#218) * Liberal bitwidth assignment mode by default on precision initialization (#222) * Fix AdaptiveSparsityScheduler. (#236) * Fix threesigma init (#240) * Build extensions in a temporary folder (#239) * Refactoring + added step with model analysis * Criterion generalization for HAWQ algorithm (#230) * Criterion generalization for HAWQ algorithm * scope_node -> node_scope * Documentation update * Described in docs when to use additional parameter 'criterion_fn' * Fixes for pruning info * fix quantization range initialization in case of 1 scale channel (#241) fix quantization range initialization in case of 1 scale channel to avoid initialization only by single slice of data (data[0]) and ignoring the other (data[1], data[2],.....) * Patch Semantic Segmentation Application to export onnx and test with resume flag (#244) Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com> * Add DW-conv to input quantizable op. (#220) * Fixed skip Openvino tests and preinstall (#246) * Small cleanup + refactoring * Corrected handling of barrier on the graph traverse (#249) * Extend input handling flexibility (#242) * Handle inputs better using input_infos * Update nncf/model_creation.py * Corrected handling Inception outputs in classification sample (#251) * Change quantization levels for SymmetricQuantizer from 255 to 256 (#225) * Change quantization levels for SymmetricQuantizer from 255 to 256 * Update test_functions with new level * Fix bug with weights range, Make formulas dependent only from one value - levels, thereby reducing the chance to make a mistake * Fix PyLint * Update HW configs with new quantization level_low * Fix bug with float type * Change type() to isinstance() * Change return values order in calculate_level_ranges * step 1 * Fix bug with export to Q/DQ (#248) * Fix bug with export to Q/DQ Add hack of export processing for our old checkpoints Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch ONNX exporting supports only per-tensor. 
* Fix Pylint * Update layers.py * Fix bug in AssymetricQuantizer export; Add tests * Fix pylint * Fix bug in AssymetricQuantizer export; Add tests * Fix pylint Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com> * Update results and links to the checkpoints (#253) * Update documentation for release v1.5.0 (#252) * Update documentation for release v1.5.0 * Corrected HAWQ documentation * Add per-range initialization notes Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com> * Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174) * rebase * add third-party sanity tests for Mask-RCNN IS model * add Mask-RCNN accuracy results to tables * fix link in README * add instance segmentation ref to README * fix voc path * fix retinanet config * Update version.py * Fixed old tests tests * Add test for pruning groups checks * Fix pylint + small cleanup * More clarification about `bits` parameter in docs (#263) * make customer happy to see param name that is wrong (#259) * kernel chainges * Add pruning sample tests. (#268) * Change an operation order in create_compressed_model (#265) * Introduce additional evaluation of loss function to SSD application * Expanded table, skiped unsupported models (#234) Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com> * Mlflow log (#243) * mlflow logging * something * some changes * Some fixes and clear up * Symbolic link update * Final Updates * Little fixes * Little fixes(one more) * Test mlflow off * Deleted hardcoded log dir * Generalization * Clear up * Fixes * code fixes * Common classification functions carry out * Metrics logging changes * Fix comments * Fix pylint * Fix pylint * Fix last linter warnings * Cpu nms kernels replaced by torch func * Extended test for model analysis * Clean up * Small pylint + comments fixes * Fix gradients zeroing + prune batch norms by default * Fix prune batch norm default * Fix test * is cuda * Compress in eval mode (#257) * Pruning of ConvTranspose (#274) * Add pruning of ConvTranspose * Rename to target_weight_dim_for_compression * fixes * Fix zero_grad * get_op_types_of_pruned_modules * Fixed collecting metrics.json for incomplete eval test (#279) * Added Unet Mapillary AC configs (#281) * Added flag for collection quickly computed stats (#287) * Remove __getattr__ from SampleConfig (#292) Newer `addict` version uses custom private attributes for internal working and __getattr__ disrupted it. It was quite useless anyway. * Fix H/W on an image in the mock coco dataset (#291) * Set proper workdir path for Mask-RCNN (#294) * Proper BN momentum parameter and train mode setting in BN adaptation (#288) * proper BN momenta parameter and train mode setting in BN adaptation * use training mode switcher context maganer for BN adaptation inference * Testing OPs quantization by synthetic tests (#297) Also * Made LeakyRELU as input_quantizable OP * Removed extra dot-files for ManyNonEvalModules test case * Revised mixed-precision related content (#300) * Moved mixed_precision configs to the separate folder * Minimized the scope of parameters in this config removing as much as possible and let them be the defaults ones. * Remove .dot extension in the HW config test case descriptor (#303) * Switch to VOC2012 in eval mode (#295) * Updated pruning configs and results (#305) * Don't call MLFlow if it's not enabled (#304) Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0. * Add input/output-names parameters to export_model function. 
(#296) * Fixed paths to mixed-precision configs (#306) * Correct configs for mixed precision models (#307) After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings That's why manual configs should be used to achieve a target accuracy * Removed custom SqueezeNet model for better user experience (#308) * Correct configs for mixed precision models After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings That's why manual configs should be used to achieve a target accuracy * Removed custom SqueezeNet model for better user experience Originally we had a modified copy of SqueezeNet model to workaround a bug in ONNX exporter with converting MaxPool with ceil_mode=True. This bug isn't actual now for torch 1.5 and there's almost identical SqueezeNet model in torchivision > 0.6. That's why custom SqueezeNet was deleted as not needed to remove confusion. There's no changes in the corresponding NNCF graph. Previously trained checkpoints for custom SqueezeNet can be loaded and evaluated with SqueezeNet from torchvision. INT8 model has the same accuracy, mixed model is differ only by ~0.01 in maximum. * Added ResNet-18 magnitude Filter Pruning config and snapshot (#311) * Added ResNet-18 magnitude Filter Pruning config and snapshot * Adjusted checkpoint validation * Move call epoch_step() method to begin of epoch. (#231) * Move call epoch_step() method to begin of epoch. * Move sparsity_init parameter to algo logic. * Fix some sanity sample tests for semantic segmentation. * Fix object detection example. * Update docs. * Fix per_step option scheduler. Refactoring. * Rename value of target_device from "NONE" to "TRIAL" (#314) * Move call epoch_step() method to begin of epoch. (#231) * Move call epoch_step() method to begin of epoch. * Move sparsity_init parameter to algo logic. * Fix some sanity sample tests for semantic segmentation. * Fix object detection example. * Update docs. * Fix per_step option scheduler. Refactoring. * Rename target_device "NONE" to "TRIAL". * Fix NMS CUDA extensions import for CPU only case (#316) * Made initialization depending on the number of samples. (#309) * Wrapped MLFlow for safe access (#313) * Introduced a separate batch size for initialization (#315) * Separate data_loader is registered for initialization via `register_default_init_args` * WA for Python 3.6 on CI (#321) * Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322) * Show subprocess log in test assertion stacktrace (#325) * Adjust ICNet compressed target values (#326) * Do not replace parameter during symmetric range init (#327) The initialization using the controller method may occur *after* the optimizer received the list of model's parameters, so replacing the parameter as a whole during such initialization will break the gradient updates. * Increase number of epochs in sanity test runs (#324) Should uncover more bugs. * Replace the rest of num_init_steps entries with num_init_samples (#328) * Use PyTorch 1.7 (#223) * Move epoch_step and step to the beginning of epoch for staged worker (#318) * Use torch 1.7.0 for third party sanity tests (#333) * Fix mixing cyrillic and latin letters (#335) * Fix calculate statistics in local mode sparsity. 
* Fk/update packages versions (#338)
  * Adding definitive versions of the required packages, move to Python 3.8, update README
  * Add definitive versions of packages only
  * Add definitive versions of packages only, fix01
* Update accuracy target values after switching to torch 1.7.0 (#334)
* Change tensorboardX to pytorch.utils.tensorboard (#332)
  * Change tensorboardX to tensorboard
  * Add the tensorboard version
* Add a domain in the ONNX model for custom operations (#323)
* Corrected grouping of activation quantizers (#339). Unmerged activation FQs should be put in different groups if an unmerged activation FQ on a branch comes directly after another activation FQ (a common input for different branches):

      start->FQ_A    Conv
              \      /
             POST_HOOK
              /      \
        PRE_HOOK    PRE_HOOK
            |           \
           div        MaxPool
      here->|FQ_A|      /
               \       /
              POST_HOOK

* Adjust thresholds due to the accuracy drop of the new torchvision FP32 checkpoints (#342)
* Changed AC configs for the SSD models (#341)
* Revert "Fk/update packages versions (#338)" (#343). This reverts commit 8c17e0c.
* Fk/update packages versions (#344)
  * Adding definitive versions of the required packages, move to Python 3.8, update README
  * Add definitive versions of packages only
  * Add definitive versions of packages only, fix01
  * Add the requirement for pandas
  * Fix a mistake in the tensorboard name
* Fix per-layer sparsity; add a stub scheduler (#340)
* Fix config path (#346)
* Add Embedding to the CPU HW config definition (#347)
* Added separated execution of OV tests to start parallelizing (#282)
* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)
* Add an option to optimize logarithms of quantizer scales instead of the scales directly (#329)
  * add the scale_log parameter for quantization; it allows increasing convergence speed for high scales and increasing accuracy for low scales
  * add _ to make some variables "hidden"
  * variant of a setter for scale
  * add a setter for input_range for the asymmetric quantizer
  * scale_log_flag is used outside to print status, so brought back .scale_log_flag instead of ._scale_log_flag
  * made scale_log_flag read-only
  * add a test for the scale_log parameter
  * Update test_scale_log.py
  * add a missing-key check due to the load_state_dict whitespace fix
  * remove quantizer.scale = torch.nn.Parameter() to avoid a torch error
  * fix the test_unified_scales_are_identical_in_onnx failure caused by being unable to set a Parameter through a property
  * remove a useless init method
  * split a long line
  * fix test_unified_scales
  * Update test_scale_log.py
  * update the ref file, replacing scale -> _scale_tensor
  * Update README.md
  * Update README.md
  * Update layers.py
  * fix HookAutoRemove
  * Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
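The scale_log option above boils down to storing log(scale) as the trainable parameter and exponentiating on read, which keeps the scale strictly positive and evens out gradient magnitudes between very large and very small scales. A toy sketch of the idea (illustrative only, not the actual layers.py implementation):

```
import torch

class LogScaleParam(torch.nn.Module):
    """Sketch: train log(scale) instead of the scale itself."""

    def __init__(self, init_scale: float):
        super().__init__()
        self._log_scale = torch.nn.Parameter(torch.tensor(float(init_scale)).log())

    @property
    def scale(self):
        # exp() guarantees positivity and makes a fixed-size optimizer step
        # correspond to a fixed *relative* change of the scale
        return self._log_scale.exp()
```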
* Fixed a protobuf error (#349)
* Add quantization support for nn.EmbeddingBag (#330)
  * Add quantization support for nn.EmbeddingBag
  * Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT
  * Add synthetic-model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag
  * Remove the duplicated synthetic model test of nn.Embedding
  * Add EmbeddingBag to the CPU HW config definition
  * replace the TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc
  * Handle network input nodes to NNCFEmbeddingBag
  * Fix pylint warnings
* VPU config revision (#356)
  * Revised the VPU config
  * A more extreme ratio for the VPU config, to test INT2 bitwidth assignment; also updated the reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent a git issue on Windows (#357). After checking out a fresh develop, git confuses files with/without a capital letter, so such a file can't be discarded; `git config --global core.ignorecase true` doesn't help either.
* Update the mmdet patch (#354)
  * Update the mmdet patch
  * Update configs and meta
  * Add export tests
  * Update test
  * Update package installation
* Compression statistics before training (#345)
  * Compression statistics before training
  * print_statistics sanity test
  * Object detection test fixes
  * is_main_process alignment
  * pylint disabling
* Pruning refactoring to work with a FLOPs target too (#320)
  * Added the pruning_flops_target param and all the necessary functions
  * Added tests
  * Pylint fixed
  * Addressed comments: BatchNorm removed from the FLOPs calculations, plus small refactoring
  * Fix tests
  * Remove bias from the FLOPs calculation + test reverting
* Fix a bug with the mmdet patch (#363)
  * Fix a bug with the mmdet patch
  * Fix bugs
  * Fix pylint
* Added ONNX Q-DQ converting parameters (#362)
* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368). This reverts commit b0504e9.
* Beta directory (#364)
  * create a beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)
  * update documentation
  * updated checkpoint links
  * nncf-tensorflow alpha
* Use PyLint 2.6+ (#370)
* Fix a missing default value (#373)
* Enable batch norm adaptation by default (#360)
* Remove the immediate failure when trying to use NNCF with torch 1.5.0 (#372)
* Add a pre/post processing test (#374)
  * Fix a missing default value
  * Add pre_post processing tests
* Relax the upper-bound threshold for mixed-precision ResNet50 (#375)
* Use a reduced number of BN adaptation samples for sanity testing (#378)
* Dropped the last data point in all DataLoaders to prevent an issue with BN (#379). There is a small chance that the last batch has a size of 1, which leads to an error:

  ```
  ValueError: Expected more than 1 value per channel when training
  ```

  We caught this error in sanity tests with CIFAR10: the dataset has 1000 data points, so there are 333 batches with batch_size=3 and a final one with batch_size=1. Training may then fail at the end of an epoch, which is not acceptable for bigger datasets.
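The #379 fix above maps onto the standard `drop_last` switch of PyTorch's DataLoader; a minimal reproduction of the setup described there, with mock data standing in for CIFAR10:

```
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 3, 32, 32))  # 1000 samples, as above
loader = DataLoader(dataset, batch_size=3, shuffle=True, drop_last=True)

# With drop_last=True the incomplete trailing batch of size 1 is discarded,
# so BatchNorm never sees a single-sample batch during training.
assert all(batch[0].shape[0] == 3 for batch in loader)
```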
* Fix eval failures due to BN adaptation being enabled by default (#377)
* Reduce the BN adaptation sample count in the HAWQ sanity configs (#380)
* Fix the object detection sample (#383)
* Added a Q-DQ ONNX converting parameter (#369)
* Links to models were updated (#386)
* The include_mask flag for the tfds decoder was added (#385)
* Support for the input_info param was added (#388)
* Change VOC dataset namings (#387)
* Configure the device via a common function in all samples (#391)
* Reduced num_init_samples for range init to accelerate sanity tests (#392)
* Basic progress bar to avoid a multiprocessing issue with tqdm(DataLoader) (#390)
* Add pruned ssd300 and unet_mapillary (#393)
* Print the FLOPs pruning level in the statistics (#367)
  * Print the FLOPs pruning level in the statistics
  * Calculate current FLOPs after updating the masks
  * Fix: missed transpose convolution
  * add test_calculation_of_flops
  * Fix compute_flops_hook for nn.Linear
  * Add a comment for compute_flops_hook
* Add an AutoML-based mixed-precision initialization mode, AutoQ (#250)
  * An adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision
  * Introduces a Deep Reinforcement Learning algorithm (DDPG) to learn and initialize layer-wise quantization bitwidths prior to NNCF quantization-aware fine-tuning
  * The mixed-precision initialization is optimized towards minimal accuracy drop under a user-specified model size constraint
  * The supported precisions depend on the target HW (VPU 8/4/2) or a user-specified precision space
* Fix the path to the unet_mapillary_pruning_geometric_median checkpoint (#397)
* Fix pruning l2norm (#310)
  * Fix pruning l2norm
  * Use register_module for l2norm
  * Add filtering by algorithm for registered modules
  * Add a condition for adding _registred_name in a registered module
  * resolve comments
  * fix pylint
  * Update reference dot files
* Separate the examples and test Python package requirements from NNCF (#384)
* Converted relative imports to absolute imports (#396)
* Add AC configs for the pruned unet and ssd300 (#399)
  * Add AC configs for the pruned unet and ssd300
  * Add batch 32 for ssd300_vgg_voc_pruning_geometric_median
* Added the proper license for the DDPG-related code (#398)
* Add some explanations to make the docs clearer (#395)
  * Add some explanations to make the docs clearer
  * docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)
  * The path to the config was fixed
  * Paths to configs were simplified
* Add the ssd_mobilenet_voc_sparsity_int8 config (#404)
* Use links to config files in the NNCF READMEs (#407)
* Combined package (#410)
  * beta.nncf package
  * removed pytest.ini
* Return pandas to the list of requirements (#405)
* Remove the NNCF package dependency on tensorboard (#411)
* Small scheduler fixes (#412)
  * Add step to the pruning schedulers and algo + delete the redundant pruning rate setting
  * Fix tests
  * Revert the same-pruning-rate changes
  * Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)
  * Minor fixes
  * Pylint issues were fixed
  * An extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of the non-distributed strategy (#401)
  * A default strategy was added
  * The cpu-only flag was disabled for Mask R-CNN training
  * Fixed the non-distributed mode for the object detection sample
* Merging and pre hooks (#302)
  * Add pre-hook functionality to quantization
  * Add quantizer merging logic to the propagation mode
  * Properly update and merge quantizers between quantizable layers
  * Move adjacent quantizer group creation closer to the builder stage
  * Store the affected op node key in the propagating quantizer
  * Refactor quantization to jointly quantize weights and activations
  * Fix clearing constraint sets during liberal activation bitwidth assignment
  * Add an initial version of build-time range init
  * Make HAWQ work with heterogeneous quantizer configurations
  * Finalize the switch to build-time range init
  * Properly compare quantizer configs for requantization purposes
  * Fix quantizer ordering once again
  * Improve HAWQ bitwidth reference graph formatting
  * Add NNCF network clean view tests
  * Fix errors
  * Use the statistics approach for runtime range init
  * Add tests for separate statistic collectors
  * Extend the range init setting tests
  * Fix rebasing issues
  * Switch AutoQ to setting compatible configs instead of bitwidths
  * Reference HAWQ file adjustments after fixing the experimental controller init
* Relax the requirements package versions (#415)
* Use a common registry (#414)
* Fixed sanity tests for the samples (#417)
* Common NNCFConfig (#413)
  * using the common config
  * added jsonschema to the requirements
* Fix third-party sanity tests (#420)
* Fix NoCompressionAlgorithmBuilder (#426)
* Fixed issues with paths (#425)
* Updating the NNCF GitHub dockerfiles against the latest changes (#436)
* Change thresholds for the pruned ssd300 (#435): diff_fp32_min from -1.2 to -4.8
* Use one of the registered JSON meta-schemas (#439). Fixes: #416
* Use non-recursive BFS for graph traversal (#440)
  * Use non-recursive BFS for graph traversal; Python does not handle deep recursion stacks well
  * Use DFS by default, after all
* Add an AC config for SSD300_mobilenet on VOC (#441)
* Minor fixes for HAWQ (#442). Set the debug log directory for collecting HAWQ-related data not only in debug mode but also via the `dump_precision_init_data` option; corrected the printing of the chosen bitwidth configuration.
* Init on the same device by default (#438)
  * Use the model's own device for initialization by default
  * Adjust the init args documentation
  * Add at::DeviceGuard invocations in the kernels to support non-'cuda:0' devices
  * Use CUDA for the precision init tests
* Remove extra entries from MANIFEST.in (#452)
* Add an AutoQ end-to-end config for the image classification samples (resnet50 and mobilenet_v2) (#450)
  * Changed the logic of working with JSON metrics (#447)
  * Add an AutoQ config with a fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)
* Fix the metric value for ssd300_mobilenet_voc (#453)
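For reference, the non-recursive traversal idea from #440 above in its simplest form: an explicit queue instead of recursion, so arbitrarily deep model graphs cannot exhaust Python's call stack. A generic sketch over an adjacency-dict graph, not the actual NNCF graph code:

```
from collections import deque

def bfs(graph, start):
    """Iterative BFS over an adjacency-dict graph; avoids the recursion limit."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for successor in graph.get(node, ()):
            if successor not in visited:
                visited.add(successor)
                queue.append(successor)
    return order
```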
* Do not follow symlinks when opening files (#451)
* Correctly construct the Q-DQ config for the E2E tests (#456)
* Update documentation for the v1.6.0 release (#457)
* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
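A rough sketch of the kind of hardening described in #451 and #458 above: refuse symlinked paths and resolve/announce the checkpoint location before unpickling. The helper name and exact behavior here are illustrative assumptions, not NNCF's actual code:

```
import os
import warnings

import torch

def load_checkpoint(path: str):
    # Refuse symlinked paths outright rather than silently following them
    if os.path.islink(path):
        raise RuntimeError(f"Refusing to open a symlink: {path}")
    resolved = os.path.abspath(path)
    # torch.load unpickles arbitrary objects, so make the source explicit
    warnings.warn(f"Loading checkpoint from {resolved}; only load files you trust.")
    return torch.load(resolved, map_location="cpu")
```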
* Release v1.5.0 of NNCF to master (#254)
* Allow sharing activation quantizers across different graph points (#67)
* Update version and docs on develop (#77)
* Update the 3rd party integration patches (#79)
* Doc updates (#84)
  * Add info on export to Usage.md
  * Fix third-party headers
* Fix an import in the transformers patch (#85)
* Fix percentile per-channel init (#86). Fixes: #83
* Omit nodes called during debugging from entering the NNCF graph (#87)
* Enable custom range initializers for overridden scopes in the schema (#89)
  * Enable custom quantization configs and initializers for overridden scopes in the schema
  * code style
  * remove range config duplication
  * obsolete import
* Fix model saving in the transformers patch (#91)
* Patch TracedTensor's __repr__ method instead of torch.Tensor's (#92)
* Fix the mmdetection patch (#93)
* Update the mmdetection patch to v2.3.0 (#95)
* Allow registering user modules as NNCF modules for weight quantization (#99)
* Assign the latest tensor shape during ForwardTraceOnly() (#96)
* Enable GPT2 ops (#98)
* Fix the HW config scenario with ops missing from the HW config definition (#94)
* Fix input quantization in the case of embeddings (#97)
* Added sanity tests for the third-party integration (#45)
* Expose quantizer linking through the config (#100)
* Add a citing section to the frontpage README (#103)
* Fix a bad rebase in asymmetric quantization ONNX export (#104)
* Use the default quantizer configuration for op weights not specified in the HW config (#105)
* Update transformers to v3.0.2 (#107)
* Fix symmetric quantizer per-channel init for max values close to 0 (#109)
* Add unified scales in HW config operation (via quantizer linking) (#108)
* Add a quantization metric (#33)
* Make HW config parsing conform to the implicit rules (#111), except for "any supported quantization for the ops in config without specified quantizations", since that needs config wildcarding, to be implemented as a follow-up
* Fix the MobileNetV2 INT8 config (#113)
* Use sequential sampling for evaluation across the example scripts (#114). Hopefully this will make the nightly compression training "eval" tests more stable.
* Fix the third_party_sanity tests (#115)
* Properly handle ops in the HW config without associated quantization configs (#119). These get associated with a "wildcard" propagating quantizer, which will either get merged with any other quantizer during propagation or get assigned a default quantization config.
* Make criterion optional in the signature of register_default_init_args() (#121)
  * make criterion optional in the signature of register_default_init_args()
  * update README.md as Vasiliy asked
* Add GoogLeNet with pruning configs (#122)
* Fix pretrained (#125)
* Mark Convs as non-depthwise for the 1-input-channel case (#126)
* Add non-ReLU activations to the fusable patterns (#124)
* Fixed Pylint warnings (#129)
* Fix a bug with the CompositeCompressionAlgorithmController export_model() signature (#132)
* Add per-layer initialization of ranges (#116)
* Add prepare_for_export() to commit pre-export steps for CompressionAlgorithmController; update CompositeCompressionAlgorithmController accordingly (#138)
* Fix PyLint (#139)
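Per-layer range initialization (#116) and the per-scope overrides from #89 are driven by the NNCF JSON config. The snippet below is a from-memory sketch of that era's schema expressed as a Python dict; the exact keys (`initializer`, `range`, `scope_overrides`) and the scope string are assumptions to be checked against the shipped schema:

```
# Hypothetical config sketch; key names may not match the real schema exactly.
nncf_config_dict = {
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {
        "algorithm": "quantization",
        # Global range initialization settings
        "initializer": {
            "range": {"num_init_samples": 256, "type": "mean_min_max"}
        },
        # Per-scope override of the range initializer (cf. #89)
        "scope_overrides": {
            "MyModel/Conv2d[first_conv]": {
                "initializer": {"range": {"type": "percentile"}}
            }
        },
    },
}
```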
* Introduced a compression ratio parameter for mixed-precision init (#133). It is used for choosing the optimal mixed-precision configuration for a given ratio. The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one: total compression for the model is the sum of the compression for each quantized layer, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits for its quantization. The ratio is used to estimate the performance boost of the quantized model, and it's a better proxy for the amount of computation than the number of parameters multiplied by the bitwidth.
  * Added a link to the full configuration file with template usage
  * disclaimer about model-specific params in the template
  * corrected articles and contractions; mixed precision -> mixed-precision
* Fix a bug with NoCompressionAlgorithmController (#150)
* Set data loading workers to 0 across tests to force a single process (#162)
  * Set data loading workers to 0 across tests to force a single process; could fix the consequences of pytorch/pytorch#39570
  * Remove the more-itertools dependency
* Specify the NNCF import order in the docs (#161)
  * Specify the NNCF import order in the docs
  * Fix the frontpage integration instructions
* Bump the mmdetection version to 2.4.0 (#166)
* Fix command line creation for test_compression_training (#167)
* Improve the eval test code (#160)
* Fix a bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)
* Fix third_party_sanity and eval test bugs (#169)
* Fix the mmdetection dataset search path for SSD (#176)
* Test stability (#179)
  * Increase the eval threshold for the test_compression_training cases; CUDA computation seems to inherently cause differences of at least 0.01% in accuracy metric computation between the train and eval runs
  * Reduce the batch size for the SSD512 eval CI runs (avoid OOM)
* Renamings (#178)
* Fixed disabling of quantizer gradients for HAWQ (#184)
* Corrected default values in the range initializers (#183)
  - Correct minimum and maximum values for mean_min_max no longer skip the check for not-yet-collected statistics, preventing initialization with inf values
  - Percentile init doesn't crash by default
* Refactor imports in setup.py (#182). Important for CI.
* Fix security issues with imports (#185)
* Fix the paths to COCO in the mmdetection third-party sanity tests (#186)
* Build graphs within the torch.no_grad() context (#187). Should reduce memory usage during create_compressed_model.
* Fix security issues directly in code (#189)
* Return a zero-valued torch.Tensor in CompressionLoss by default instead of an int (#190)
* Make the default install support non-GPU cases (#193)
* Fixed the backward compatibility test (#195)
* Improve quantizer setup for hanging batchnorm nodes (#192)
  * Do not merge subgraphs if a subgraph has more than one output node
  * Mark BatchNorm as INPUTS_QUANTIZABLE by default; this manifests itself when a batch norm operation was not merged into any previous op, i.e. it should accept quantized input instead of FP32
* Fix export for nodes with metatypes not redefined by the pruning algo (#171)
* Add more security fixes (#197)
* Removed double logging to stdout (#198)
* Ignore frozen layers during filter pruning (#200)
* Use the latest matplotlib version (#206)
* Use propagation-based mode by default (#181)
  * Set propagation_based mode by default
  * Fix the compressed graphs
  * Fix the quantize inputs option
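The memory saving in #187 above comes from running the graph-building dummy forwards with autograd bookkeeping disabled, so no intermediate activations are kept for a backward pass that will never happen. In plain PyTorch terms:

```
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())

# A tracing forward needs only the op sequence, not gradients; no_grad()
# keeps intermediate activations from being stored for backward.
with torch.no_grad():
    model(torch.randn(1, 3, 32, 32))
```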
* Add operator metatypes for the 'sigmoid' and 'add' operators (#209)
  * Add operator metatypes for the 'sigmoid' and 'add' operators
  * remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced an `enabled` parameter for Quantizers (#194). Also:
  * corrected the script for adding the new quantization parameters to checkpoints
  * added a warning on exporting disabled quantizations
  * print statistics about enabled quantizers by default
* Update documentation (#219)
  * Update documentation
  * Update docs; add dependencies for the param to the JSON schema
* Fix the cpu_only part (#221)
  * Update the cpu_only part of the dockerfile; fix an issue with setup.py install with the --cpu-only opt; fix README.md
  * apply remarks
* Fix register_operator (#224)
* Add per-layer sparsity (#127)
* Do not call _quantize_inputs for the propagation-based mode (#229)
* Consistent bitwidth for activations and weights in propagation mode (#191)
* Added SOTA eval tests via AC (#142)
* Refactored HAWQ: split the functionality into separate files (#232)
* Allow quantizing modules that share their weights between multiple operations (#235)
* Filter out quantizers that directly act upon integer inputs (#228)
* Add support for a sparsity freeze epoch in magnitude sparsity (#218)
* Liberal bitwidth assignment mode by default on precision initialization (#222)
* Fix AdaptiveSparsityScheduler (#236)
* Fix three-sigma init (#240)
* Build extensions in a temporary folder (#239)
* Criterion generalization for the HAWQ algorithm (#230)
  * Criterion generalization for the HAWQ algorithm
  * scope_node -> node_scope
  * Documentation update
  * Described in the docs when to use the additional 'criterion_fn' parameter
* Fix quantization range initialization in the case of 1 scale channel (#241), to avoid initializing from only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)
* Patch the Semantic Segmentation application to export ONNX and test with the resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to the input-quantizable ops (#220)
* Fixed skipping of the OpenVINO tests and preinstall (#246)
* Corrected handling of the barrier on graph traversal (#249)
* Extend input handling flexibility (#242)
  * Handle inputs better using input_infos
  * Update nncf/model_creation.py
* Corrected handling of Inception outputs in the classification sample (#251)
* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)
  * Update test_functions with the new level
  * Fix a bug with the weights range; make the formulas depend on a single value, levels, thereby reducing the chance of a mistake
  * Fix PyLint
  * Update the HW configs with the new quantization level_low
  * Fix a bug with the float type
  * Change type() to isinstance()
  * Change the order of return values in calculate_level_ranges
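The arithmetic behind the 255 -> 256 change in #225 above derives the whole integer grid from the single `levels` value. A hypothetical reconstruction for the signed symmetric case (the actual calculate_level_ranges in NNCF may differ in signature and return order):

```
def calculate_level_ranges(bits: int = 8):
    levels = 2 ** bits            # 256 for 8 bits, instead of the previous 255
    level_low = -(levels // 2)    # -128
    level_high = levels // 2 - 1  # 127
    return level_low, level_high, levels
```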
* Fix a bug with export to Q/DQ (#248). Adds a hack to the export processing for our old checkpoints, and raises an exception when exporting per-channel Q/DQ layers, since PyTorch ONNX export supports only per-tensor.
  * Fix Pylint
  * Update layers.py
  * Fix bug in AsymmetricQuantizer export; add tests
  * Fix pylint
  * Fix bug in AsymmetricQuantizer export; add tests
  * Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)
* Update documentation for release v1.5.0 (#252)
  * Update documentation for release v1.5.0
  * Corrected HAWQ documentation
  * Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)
  * rebase
  * add third-party sanity tests for the Mask-RCNN IS model
  * add Mask-RCNN accuracy results to tables
  * fix link in README
  * add instance segmentation ref to README
  * fix VOC path
  * fix retinanet config
* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Release v1.6.0 of NNCF to master (#461)
  * Grouping of pruning modules + clusterization classes
  * Small fixes
  * Added model analysis file
  * Fixes for grads + batch norms
  * Refactoring + added a step with model analysis
  * Fixes for pruning info
  * Small cleanup + refactoring
  * step 1
* accuracy aware draft
* refactor to introduce TrainingRunner for training loop control
* move the accuracy-aware loop to common
* address comments
* update accuracy aware
* add support for TF samples
* refactor the Keras API sample

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
* Add pre-hook functionality to quantization
* Add quantizer merging logic to the propagation mode
* Properly update and merge quantizers between quantizable layers
* Move adjacent quantizer group creation closer to the builder stage
* Store the affected op node key in the propagating quantizer
* Refactor quantization to jointly quantize weights and activations
* Fix clearing constraint sets during liberal activation bitwidth assignment
* Add an initial version of build-time range init
* Make HAWQ work with heterogeneous quantizer configurations
* Finalize the switch to build-time range init
* Properly compare quantizer configs for requantization purposes
* Fix quantizer ordering once again
* Improve HAWQ bitwidth reference graph formatting
* Add NNCF network clean view tests
* Fix errors
* Use the statistics approach for runtime range init
* Add tests for separate statistic collectors
* Extend the range init setting tests
* Fix rebasing issues
* Switch AutoQ to setting compatible configs instead of bitwidths
* Reference HAWQ file adjustments after fixing the experimental controller init
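To make the first bullet concrete: a pre-hook is a callable attached to a specific (operation, input port) pair that transforms exactly one input tensor before the op executes, which is what per-input quantization needs. The toy dispatcher below illustrates the mechanism only; the names and structure are invented for this example and are not NNCF's API:

```
from typing import Callable, Dict, Tuple

import torch

# Hooks keyed by (op name, input port): each hook transforms exactly one
# positional input of one op before the op runs.
PRE_HOOKS: Dict[Tuple[str, int], Callable] = {}

def run_op(name: str, op: Callable, *args):
    new_args = list(args)
    for port, tensor in enumerate(new_args):
        hook = PRE_HOOKS.get((name, port))
        if hook is not None:
            new_args[port] = hook(tensor)  # e.g. fake-quantize this input only
    return op(*new_args)

# Demo: "quantize" (here: just round) only input port 1 of an add operation.
PRE_HOOKS[("add", 1)] = lambda t: t.round()
result = run_op("add", torch.add, torch.tensor([0.4]), torch.tensor([0.4]))
print(result)  # tensor([0.4000]): port 0 untouched, port 1 rounded to 0
```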
* Release v1.5.0 of NNCF to master (openvinotoolkit#254) * Allow sharing activation quantizers in different graph points (openvinotoolkit#67) * Update version and docs on develop (openvinotoolkit#77) * Update 3rd party integration patches (openvinotoolkit#79) * Doc updates (openvinotoolkit#84) * Add info on export to Usage.md * Fix third party headers * Fix import in transformers patch (openvinotoolkit#85) * Fix percentile per-channel init (openvinotoolkit#86) Fixes: openvinotoolkit#83 * Omit nodes called during debugging from entering NNCF graph (openvinotoolkit#87) * Enable custom range initializers for overriden scopes in schema (openvinotoolkit#89) * Enable custom quantization configs and initializers for overriden scopes in schema * code style * remove range config duplication * obsolete import * Fix model saving in transformers patch (openvinotoolkit#91) * Patch TracedTensor's __repr__ method instead of torch.Tensor's (openvinotoolkit#92) * Fix mmdetection patch (openvinotoolkit#93) * Update mmdetection patch to v2.3.0 (openvinotoolkit#95) * Allow registering user modules as NNCF modules for weight quantization (openvinotoolkit#99) * Assign latest tensor shape during ForwardTraceOnly() (openvinotoolkit#96) * Enable GPT2 ops (openvinotoolkit#98) * Fix HW config scenario with ops missing in HW config definition (openvinotoolkit#94) * Fix input quantization in case of embeddings (openvinotoolkit#97) * Added sanity tests for third party integration (openvinotoolkit#45) * Expose quantizer linking through config (openvinotoolkit#100) * Add citing section to frontpage README (openvinotoolkit#103) * Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104) * Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105) * Update transformers to v3.0.2 (openvinotoolkit#107) * Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109) * Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108) * Add quantization metric (openvinotoolkit#33) * Make HW config parsing conform to the implicit rules (openvinotoolkit#111) (except for the "any supported quantization for the ops in config without specified quantizations", because they need config wildcarding, to be implemented as a follow-up) * Fix MobileNetV2 INT8 config (openvinotoolkit#113) * Use sequential sampling for evaluation across example scripts (openvinotoolkit#114) Hopefully this will make nightly compression training "eval" tests more stable. * Fix third_party_sanity tests (openvinotoolkit#115) * Properly handle ops in HW config without quantization configs associated (openvinotoolkit#119) These get associated with a "wildcard" propagating quantizer, which will either get merged with any other quantizer during propagation, or get assigned a default quantization config. * Make criterion optional in signature of register_default_init_args() (openvinotoolkit#121) * make optional criterion in signature of register_default_init_args() * update README.md as Vasiliy asked * Add Googlenet with pruning configs (openvinotoolkit#122) * Fix pretrained (openvinotoolkit#125) * Mark Convs as non-depthwise for 1 input channel case (openvinotoolkit#126) * Add non-RELU activations to fusable patterns (openvinotoolkit#124) * Fixed Pylint warnings (openvinotoolkit#129) * Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132) * Add per layer initialization of ranges. 
* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)
* Set data loading workers to 0 across tests to force a single process (openvinotoolkit#162). Could fix the consequences of pytorch/pytorch#39570.
  * Remove more-itertools dependency
* Specify NNCF import order in docs (openvinotoolkit#161)
  * Fix frontpage integration instructions
* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)
* Fix command line creation for test_compression_training (openvinotoolkit#167)
* Improve eval test code (openvinotoolkit#160)
* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)
* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)
* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)
* Test stability (openvinotoolkit#179)
  * Increase eval threshold for test_compression_training cases: CUDA computation seems to inherently cause differences of at least 0.01% in the accuracy metric between the train and eval runs
  * Reduce batch size for SSD512 eval CI runs (avoid OOM)
* Renamings (openvinotoolkit#178)
* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)
* Corrected default values in range initializers (openvinotoolkit#183)
  - Correct minimum and maximum values for mean_min_max no longer skip the check for not-yet-collected statistics, which prevents initialization with inf values
  - Percentile init no longer crashes by default
* Refactor imports in setup.py (openvinotoolkit#182). Important for CI.
* Fix security issues with imports (openvinotoolkit#185)
* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)
* Build graphs within the torch.no_grad() context (openvinotoolkit#187). Should reduce memory usage during create_compressed_model.
* Fix security issues directly in code (openvinotoolkit#189)
* Return a zero-valued torch.Tensor in CompressionLoss by default instead of an int (openvinotoolkit#190)
* Make default install support non-GPU cases (openvinotoolkit#193)
* Fixed backward compatibility test (openvinotoolkit#195)
* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)
  * Do not merge subgraphs if a subgraph has more than one output node
  * Mark BatchNorm as INPUTS_QUANTIZABLE by default. This will manifest itself in case there is a batch norm operation that was not merged into any previous op, i.e. should accept quantized input instead of FP32.
* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)
* Add more security fixes (openvinotoolkit#197)
* Removed double logging to stdout (openvinotoolkit#198)
* Ignore frozen layers during filter pruning (openvinotoolkit#200)
* Use latest matplotlib version (openvinotoolkit#206)
* Use propagation-based mode by default (openvinotoolkit#181)
  * Set propagation_based mode by default.
  * Fix compressed graphs.
  * Fix quantize inputs option.
* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)
  * remove trailing spaces
  Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194). Also:
  * corrected the script that adds new quantization parameters to checkpoints
  * added a warning on exporting disabled quantizations
  * print statistics about enabled quantizers by default
* Update documentation (openvinotoolkit#219)
  * Update docs; add dependencies for param to the json schema.
* Fix cpu_only part (openvinotoolkit#221)
  * Update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only option; fix README.md
  * apply remarks
* Fix register_operator (openvinotoolkit#224)
* Add per-layer sparsity. (openvinotoolkit#127)
* Do not call _quantize_inputs for propagation-based mode (openvinotoolkit#229)
* Consistent bitwidth for activations and weights in propagation mode (openvinotoolkit#191)
* Added sota eval tests via AC (openvinotoolkit#142)
* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)
* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)
* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)
* Add support for a sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)
* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)
* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)
* Fix threesigma init (openvinotoolkit#240)
* Build extensions in a temporary folder (openvinotoolkit#239)
* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)
  * scope_node -> node_scope
  * Documentation update: described in docs when to use the additional parameter 'criterion_fn'
* Fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241), to avoid initialization by only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)
* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)
  Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
* Add DW-conv to input quantizable op. (openvinotoolkit#220)
* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)
* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)
* Extend input handling flexibility (openvinotoolkit#242)
  * Handle inputs better using input_infos
  * Update nncf/model_creation.py
* Corrected handling of Inception outputs in classification sample (openvinotoolkit#251)
* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225); a sketch of the resulting level-range arithmetic follows this release entry
  * Update test_functions with new level
  * Fix bug with weights range; make formulas depend on only one value (levels), thereby reducing the chance of a mistake
  * Fix PyLint
  * Update HW configs with new quantization level_low
  * Fix bug with float type
  * Change type() to isinstance()
  * Change return values order in calculate_level_ranges
* Fix bug with export to Q/DQ (openvinotoolkit#248)
  * Add hack of export processing for our old checkpoints
  * Add exception raising when exporting per-channel Q/DQ layers, as PyTorch ONNX export supports only per-tensor
  * Fix bug in AsymmetricQuantizer export; add tests
  * Update layers.py
  * Fix pylint
  Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
* Update results and links to the checkpoints (openvinotoolkit#253)
* Update documentation for release v1.5.0 (openvinotoolkit#252)
  * Corrected HAWQ documentation
  * Add per-range initialization notes
  Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)
  * rebase
  * add third-party sanity tests for Mask-RCNN IS model
  * add Mask-RCNN accuracy results to tables
  * fix link in README; add instance segmentation ref to README
  * fix voc path; fix retinanet config
* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
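To illustrate the 255 -> 256 change from the (openvinotoolkit#225) entry above: deriving both level bounds from a single `levels` value. This is a hedged sketch of the general arithmetic, not the exact NNCF calculate_level_ranges code:

```python
# Hedged sketch: symmetric quantizer level ranges derived from a single
# `levels` value, as described in the (openvinotoolkit#225) entry.

def calculate_level_ranges(levels: int, signed: bool = True):
    if not signed:
        return 0, levels - 1
    level_low = -(levels // 2)
    # An even level count gives an asymmetric range (one extra negative level).
    level_high = levels // 2 - 1 if levels % 2 == 0 else levels // 2
    return level_low, level_high


print(calculate_level_ranges(255))  # (-127, 127): symmetric, one int8 code unused
print(calculate_level_ranges(256))  # (-128, 127): the full signed int8 range
```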
* Release v1.6.0 of NNCF to master (openvinotoolkit#461)
* Fix input quantization in case of embeddings (openvinotoolkit#97)
* Added sanity tests for third party integration (openvinotoolkit#45)
* Expose quantizer linking through config (openvinotoolkit#100)
* Add citing section to frontpage README (openvinotoolkit#103)
* Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104)
* Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105)
* Update transformers to v3.0.2 (openvinotoolkit#107)
* Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109)
* Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108)
* Add quantization metric (openvinotoolkit#33)
* Make HW config parsing conform to the implicit rules (openvinotoolkit#111), except for the "any supported quantization for the ops in config without specified quantizations" rule, because it needs config wildcarding, to be implemented as a follow-up
* Fix MobileNetV2 INT8 config (openvinotoolkit#113)
* Use sequential sampling for evaluation across example scripts (openvinotoolkit#114). Hopefully this will make nightly compression training "eval" tests more stable.
* Fix third_party_sanity tests (openvinotoolkit#115)
* Properly handle ops in HW config without associated quantization configs (openvinotoolkit#119). These get associated with a "wildcard" propagating quantizer, which will either get merged with any other quantizer during propagation or get assigned a default quantization config.
* Make criterion optional in the signature of register_default_init_args() (openvinotoolkit#121)
  * update README.md as Vasiliy asked
* Add Googlenet with pruning configs (openvinotoolkit#122)
* Fix pretrained (openvinotoolkit#125)
* Mark Convs as non-depthwise for the 1-input-channel case (openvinotoolkit#126)
* Add non-RELU activations to fusable patterns (openvinotoolkit#124)
* Fixed Pylint warnings (openvinotoolkit#129)
* Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132)
* Add per-layer initialization of ranges (openvinotoolkit#116)
* Add prepare_for_export() to commit pre-export steps for CompressionAlgorithmController; update for CompositeCompressionAlgorithmController (openvinotoolkit#138)
* Fix PyLint. (openvinotoolkit#139)
* Introduced compression ratio parameter for Mixed Precision init (openvinotoolkit#133). It is used for choosing the optimal mixed-precision configuration for a given ratio. The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one: the total compression for the model is the sum of the per-layer compression, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it is a better proxy for the amount of computation than the number of parameters multiplied by bitwidth. (Same change as in the v1.5.0 list above; see the sketch there.)
  * Added link to the full configuration file with template usage
  * Disclaimer about model-specific params in template
  * Corrected articles, contractions, mixed precision -> mixed-precision
* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)
* Set data loading workers to 0 across tests to force a single process (openvinotoolkit#162). Could fix the consequences of pytorch/pytorch#39570.
  * Remove more-itertools dependency
* Specify NNCF import order in docs (openvinotoolkit#161)
  * Fix frontpage integration instructions
* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)
* Fix command line creation for test_compression_training (openvinotoolkit#167)
* Improve eval test code (openvinotoolkit#160)
* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)
* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)
* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)
* Test stability (openvinotoolkit#179)
  * Increase eval threshold for test_compression_training cases: CUDA computation seems to inherently cause differences of at least 0.01% in the accuracy metric between the train and eval runs
  * Reduce batch size for SSD512 eval CI runs (avoid OOM)
* Renamings (openvinotoolkit#178)
* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)
* Corrected default values in range initializers (openvinotoolkit#183)
  - Correct minimum and maximum values for mean_min_max no longer skip the check for not-yet-collected statistics, which prevents initialization with inf values
  - Percentile init no longer crashes by default
* Refactor imports in setup.py (openvinotoolkit#182). Important for CI.
* Fix security issues with imports (openvinotoolkit#185)
* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)
* Build graphs within the torch.no_grad() context (openvinotoolkit#187). Should reduce memory usage during create_compressed_model.
* Fix security issues directly in code (openvinotoolkit#189)
* Return a zero-valued torch.Tensor in CompressionLoss by default instead of an int (openvinotoolkit#190)
* Make default install support non-GPU cases (openvinotoolkit#193)
* Fixed backward compatibility test (openvinotoolkit#195)
* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)
  * Do not merge subgraphs if a subgraph has more than one output node
  * Mark BatchNorm as INPUTS_QUANTIZABLE by default. This will manifest itself in case there is a batch norm operation that was not merged into any previous op, i.e. should accept quantized input instead of FP32.
* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)
* Add more security fixes (openvinotoolkit#197)
* Removed double logging to stdout (openvinotoolkit#198)
* Ignore frozen layers during filter pruning (openvinotoolkit#200)
* Use latest matplotlib version (openvinotoolkit#206)
* Use propagation-based mode by default (openvinotoolkit#181)
  * Set propagation_based mode by default.
  * Fix compressed graphs.
  * Fix quantize inputs option.
* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)
  * remove trailing spaces
  Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
* Grouping of pruning modules + clusterisation classes
* Small fixes
* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194). Also:
  * corrected the script that adds new quantization parameters to checkpoints
  * added a warning on exporting disabled quantizations
  * print statistics about enabled quantizers by default
* Added model analysis file
* Update documentation (openvinotoolkit#219)
  * Update docs; add dependencies for param to the json schema.
* Fixes for grads + batch norms
* Fix cpu_only part (openvinotoolkit#221)
  * Update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only option; fix README.md
  * apply remarks
* Fix register_operator (openvinotoolkit#224)
* Add per-layer sparsity. (openvinotoolkit#127)
* Do not call _quantize_inputs for propagation-based mode (openvinotoolkit#229)
* Consistent bitwidth for activations and weights in propagation mode (openvinotoolkit#191)
* Added sota eval tests via AC (openvinotoolkit#142)
* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)
* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)
* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)
* Add support for a sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)
* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)
* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)
* Fix threesigma init (openvinotoolkit#240)
* Build extensions in a temporary folder (openvinotoolkit#239)
* Refactoring + added step with model analysis
* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)
  * scope_node -> node_scope
  * Documentation update: described in docs when to use the additional parameter 'criterion_fn'
* Fixes for pruning info
* Fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241), to avoid initialization by only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)
* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)
  Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
* Add DW-conv to input quantizable op. (openvinotoolkit#220)
* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)
* Small cleanup + refactoring
* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)
* Extend input handling flexibility (openvinotoolkit#242)
  * Handle inputs better using input_infos
  * Update nncf/model_creation.py
* Corrected handling of Inception outputs in classification sample (openvinotoolkit#251)
* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225)
  * Update test_functions with new level
  * Fix bug with weights range; make formulas depend on only one value (levels), thereby reducing the chance of a mistake
  * Fix PyLint
  * Update HW configs with new quantization level_low
  * Fix bug with float type
  * Change type() to isinstance()
  * Change return values order in calculate_level_ranges
* step 1
* Fix bug with export to Q/DQ (openvinotoolkit#248)
  * Add hack of export processing for our old checkpoints
  * Add exception raising when exporting per-channel Q/DQ layers, as PyTorch ONNX export supports only per-tensor
  * Fix bug in AsymmetricQuantizer export; add tests
  * Update layers.py
  * Fix pylint
  Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
* Update results and links to the checkpoints (openvinotoolkit#253)
* Update documentation for release v1.5.0 (openvinotoolkit#252)
  * Corrected HAWQ documentation
  * Add per-range initialization notes
  Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)
  * rebase
  * add third-party sanity tests for Mask-RCNN IS model
  * add Mask-RCNN accuracy results to tables
  * fix link in README; add instance segmentation ref to README
  * fix voc path; fix retinanet config
* Update version.py
* Fixed old tests
* Add test for pruning groups checks
* Fix pylint + small cleanup
* More clarification about `bits` parameter in docs (openvinotoolkit#263)
* Make customer happy to see the param name that is wrong (openvinotoolkit#259)
* kernel changes
* Add pruning sample tests. (openvinotoolkit#268)
* Change an operation order in create_compressed_model (openvinotoolkit#265)
* Introduce additional evaluation of the loss function in the SSD application
* Expanded table, skipped unsupported models (openvinotoolkit#234)
  Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
* Mlflow log (openvinotoolkit#243)
  * mlflow logging
  * some changes, fixes and clean-up
  * Symbolic link update
  * Final updates
  * Little fixes (one more round)
  * Test mlflow off
  * Deleted hardcoded log dir
  * Generalization; clean up; code fixes
  * Carried out common classification functions
  * Metrics logging changes
  * Fix comments
  * Fix pylint and last linter warnings
* Cpu nms kernels replaced by torch func
* Extended test for model analysis
* Clean up
* Small pylint + comments fixes
* Fix gradients zeroing + prune batch norms by default
* Fix prune batch norm default
* Fix test
* is cuda
* Compress in eval mode (openvinotoolkit#257)
* Pruning of ConvTranspose (openvinotoolkit#274)
  * Rename to target_weight_dim_for_compression
  * fixes
  * Fix zero_grad
  * get_op_types_of_pruned_modules
* Fixed collecting metrics.json for incomplete eval test (openvinotoolkit#279)
* Added Unet Mapillary AC configs (openvinotoolkit#281)
* Added flag for collecting quickly computed stats (openvinotoolkit#287)
* Remove __getattr__ from SampleConfig (openvinotoolkit#292). The newer `addict` version uses custom private attributes for its internal workings, and __getattr__ disrupted that. It was quite useless anyway.
* Fix H/W on an image in the mock coco dataset (openvinotoolkit#291)
* Set proper workdir path for Mask-RCNN (openvinotoolkit#294)
* Proper BN momentum parameter and train mode setting in BN adaptation (openvinotoolkit#288)
  * use the training mode switcher context manager for BN adaptation inference
* Testing OPs quantization by synthetic tests (openvinotoolkit#297). Also:
  * Made LeakyRELU an input_quantizable OP
  * Removed extra dot-files for the ManyNonEvalModules test case
* Revised mixed-precision related content (openvinotoolkit#300)
  * Moved mixed_precision configs to a separate folder
  * Minimized the scope of parameters in this config, removing as many as possible and letting them be the defaults
* Remove the .dot extension in the HW config test case descriptor (openvinotoolkit#303)
* Switch to VOC2012 in eval mode (openvinotoolkit#295)
* Updated pruning configs and results (openvinotoolkit#305)
* Don't call MLFlow if it's not enabled (openvinotoolkit#304). Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.
* Add input/output-names parameters to the export_model function. (openvinotoolkit#296)
* Fixed paths to mixed-precision configs (openvinotoolkit#306)
* Correct configs for mixed precision models (openvinotoolkit#307). After openvinotoolkit#300, the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings; that's why manual configs should be used to achieve the target accuracy.
* Removed custom SqueezeNet model for better user experience (openvinotoolkit#308). Originally we had a modified copy of the SqueezeNet model to work around a bug in the ONNX exporter when converting MaxPool with ceil_mode=True. This bug is no longer relevant for torch 1.5, and there is an almost identical SqueezeNet model in torchvision > 0.6, so the custom SqueezeNet was deleted as unneeded, to remove confusion. There are no changes in the corresponding NNCF graph. Previously trained checkpoints for the custom SqueezeNet can be loaded and evaluated with the SqueezeNet from torchvision: the INT8 model has the same accuracy, and the mixed-precision model differs by at most ~0.01.
* Added ResNet-18 magnitude Filter Pruning config and snapshot (openvinotoolkit#311)
  * Adjusted checkpoint validation
* Move the call of the epoch_step() method to the beginning of the epoch. (openvinotoolkit#231)
  * Move the sparsity_init parameter to the algo logic.
  * Fix some sanity sample tests for semantic segmentation.
  * Fix the object detection example.
  * Update docs.
  * Fix the per_step option scheduler. Refactoring.
* Rename value of target_device from "NONE" to "TRIAL" (openvinotoolkit#314)
* Fix NMS CUDA extensions import for the CPU-only case (openvinotoolkit#316)
* Made initialization depend on the number of samples. (openvinotoolkit#309)
* Wrapped MLFlow for safe access (openvinotoolkit#313)
* Introduced a separate batch size for initialization (openvinotoolkit#315): a separate data_loader is registered for initialization via `register_default_init_args`
* WA for Python 3.6 on CI (openvinotoolkit#321)
* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (openvinotoolkit#322)
* Show subprocess log in test assertion stacktrace (openvinotoolkit#325)
* Adjust ICNet compressed target values (openvinotoolkit#326)
* Do not replace the parameter during symmetric range init (openvinotoolkit#327). The initialization via the controller method may occur *after* the optimizer has received the list of the model's parameters, so replacing the parameter as a whole during such initialization would break the gradient updates; see the sketch below.
* Increase the number of epochs in sanity test runs (openvinotoolkit#324). Should uncover more bugs.
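A minimal illustration of the (openvinotoolkit#327) pitfall mentioned above, using plain PyTorch; the quantizer class and its `scale` parameter here are illustrative, not NNCF's actual classes:

```python
# Hedged illustration of the (openvinotoolkit#327) pitfall: reassigning an
# nn.Parameter after the optimizer has captured it detaches it from training.
import torch
from torch import nn


class FakeQuantizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))


q = FakeQuantizer()
opt = torch.optim.SGD(q.parameters(), lr=0.1)  # holds a reference to q.scale

# BAD: the optimizer would keep updating the *old* tensor, not the new q.scale
# q.scale = nn.Parameter(torch.tensor([2.0]))

# GOOD: overwrite the existing parameter's values in place
with torch.no_grad():
    q.scale.copy_(torch.tensor([2.0]))

(q.scale ** 2).sum().backward()
opt.step()  # the update reaches the very tensor the module keeps using
```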
* Replace the rest of num_init_steps entries with num_init_samples (openvinotoolkit#328)
* Use PyTorch 1.7 (openvinotoolkit#223)
* Move epoch_step and step to the beginning of the epoch for staged worker (openvinotoolkit#318)
* Use torch 1.7.0 for third party sanity tests (openvinotoolkit#333)
* Fix mixing Cyrillic and Latin letters (openvinotoolkit#335)
* Fix statistics calculation in local-mode sparsity. (openvinotoolkit#337)
* Fk/update packages versions (openvinotoolkit#338)
  * Adding definitive versions of required packages, move to python3.8, update ReadMe
  * Add definitive versions of packages only (.fix01)
* Update accuracy target values after switching to torch 1.7.0 (openvinotoolkit#334)
* Change tensorboardX to torch.utils.tensorboard (openvinotoolkit#332)
  * Add tensorboard version
* Add domain in onnx-model for custom operations. (openvinotoolkit#323)
* Corrected grouping of activation quantizers (openvinotoolkit#339). Non-merged FQs for activations should be in different groups if an unmerged activation FQ on a branch goes directly after another activation FQ (a common input for different branches):
```
 start->FQ_A  Conv
       \      /
      POST_HOOK
       /      \
  PRE_HOOK  PRE_HOOK
     |          \
    div        MaxPool
           here->|FQ_A|
       \      /
      POST_HOOK
```
* Adjust thresholds due to the new torchvision FP32 checkpoints' accuracy drop (openvinotoolkit#342)
* Changed AC configs for SSD models (openvinotoolkit#341)
* Revert "Fk/update packages versions (openvinotoolkit#338)" (openvinotoolkit#343). This reverts commit 8c17e0c.
* Fk/update packages versions (openvinotoolkit#344)
  * Adding definitive versions of required packages, move to python3.8, update ReadMe
  * Add definitive versions of packages only (.fix01)
  * Add requirement for pandas
  * Fix mistake in tensorboard name
* Fix per-layer sparsity. Add stub scheduler. (openvinotoolkit#340)
* fix config path (openvinotoolkit#346)
* Add Embedding to the CPU HW config definition (openvinotoolkit#347)
* Added separated execution of OV tests to start parallelizing (openvinotoolkit#282)
* Remove no_empty_cache in an attempt to fix sporadic CI failures (openvinotoolkit#348)
* Add an option to optimize logarithms of quantizer scales instead of scales directly (openvinotoolkit#329); a sketch follows this entry
  * add scale_log parameter for quantization; it allows increasing convergence speed for high scales and increasing accuracy for low scales
  * add _ to make some variables "hidden"
  * variant of setter for scale
  * add setter for input_range for the asymmetric quantizer
  * scale_log_flag is used outside to print status, so .scale_log_flag was brought back instead of ._scale_log_flag
  * made scale_log_flag read-only
  * add test for the scale_log parameter
  * Update test_scale_log.py
  * add missing key check due to the load_state_dict whitespace fix
  * remove quantizer.scale = torch.nn.Parameter() to avoid a torch error
  * fix test_unified_scales_are_identical_in_onnx failure caused by being unable to set a Parameter via a property
  * remove useless init method
  * split long line
  * fix test_unified_scales
  * update ref file by replacing scale -> _scale_tensor
  * Update README.md
  * Update layers.py
  * fix HookAutoRemove
  * Improvements
  Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
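To illustrate the (openvinotoolkit#329) option above: learning log(scale) instead of scale makes optimizer steps multiplicative in scale space, which behaves uniformly across very large and very small scales. A hedged, generic sketch; not NNCF's actual SymmetricQuantizer, and the straight-through estimator a real fake-quantizer needs for round() is omitted for brevity:

```python
# Hedged sketch of the log-scale parametrization from (openvinotoolkit#329);
# generic PyTorch, illustrative names.
import torch
from torch import nn


class LogScaleQuantizer(nn.Module):
    def __init__(self, init_scale: float = 1.0):
        super().__init__()
        # Learn log(scale): equal-sized optimizer steps in log space are
        # multiplicative steps in scale space.
        self._scale_log = nn.Parameter(torch.tensor(float(init_scale)).log())

    @property
    def scale(self) -> torch.Tensor:
        return self._scale_log.exp()  # positive by construction

    def forward(self, x: torch.Tensor, levels: int = 256) -> torch.Tensor:
        level_low, level_high = -(levels // 2), levels // 2 - 1
        step = self.scale / level_high
        return torch.clamp(torch.round(x / step), level_low, level_high) * step


q = LogScaleQuantizer(init_scale=6.0)
y = q(torch.randn(8) * 3)  # fake-quantized activations
```

Exposing `scale` as a property over the hidden `_scale_log` parameter also mirrors the setter/read-only-flag work listed in the entry's sub-items.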
* Fixed protobuf error (openvinotoolkit#349)
* Add quantization support for nn.EmbeddingBag (openvinotoolkit#330)
  * Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT
  * Add synthetic model quantization tests for nn.Embedding/EmbeddingBag and F.embedding_bag
  * Remove duplicated synthetic model test of nn.Embedding
  * Add EmbeddingBag to the CPU HW config definition
  * Replace the TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc
  * Handle network input nodes to NNCFEmbeddingBag
  * Fix pylint warnings
* Vpu config revision (openvinotoolkit#356)
  * Revised VPU config
  * More extreme ratio for the VPU config to test INT2 bitwidth assignment; also updated reference graphs
  Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
* Renamed case-sensitive files to prevent a git issue on Windows (openvinotoolkit#357). After checking out a fresh develop, git confuses files with/without a capital letter; as a result such a file can't be discarded. `git config --global core.ignorecase true` doesn't help either.
* Update mmdet patch (openvinotoolkit#354)
  * Update configs and meta
  * Add export tests
  * Update test
  * Update package installation
* Compression statistics before training (openvinotoolkit#345)
  * print_statistics sanity test
  * Object detection test fixes
  * is_main_process aligning
  * pylint disabling
* Pruning refactoring to work with a FLOPs target too (openvinotoolkit#320)
  * Added pruning_flops_target param and all necessary functions
  * Added tests; Pylint fixed
  * Fixed comments: BatchNorm deleted from FLOPs calculations, plus small refactoring
  * Delete bias from FLOPs calc + test reverting
* Fix bug with mmdet patch (openvinotoolkit#363)
  * Fix bugs
  * Fix pylint
* Added ONNX Q-DQ converting parameters (openvinotoolkit#362)
* Revert "Added ONNX Q-DQ converting parameters (openvinotoolkit#362)" (openvinotoolkit#368). This reverts commit b0504e9.
* Beta directory (openvinotoolkit#364)
  * Create a beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)
  * update documentation; updated checkpoint links
  * nncf-tensorflow alpha
* Use PyLint 2.6+ (openvinotoolkit#370)
* Fix missing default value (openvinotoolkit#373)
* Enable batch norm adaptation by default (openvinotoolkit#360)
* Remove immediate failure when trying to use NNCF with torch 1.5.0 (openvinotoolkit#372)
* Add pre/post processing tests (openvinotoolkit#374)
  * Fix missing default value
* Relax upper-bound threshold for mixed precision ResNet50 (openvinotoolkit#375)
* Use a reduced number of BN adaptation samples for sanity testing (openvinotoolkit#378)
* Dropped the last data point in all DataLoaders to prevent an issue with BN (openvinotoolkit#379); see the sketch at the end of this block. There is a small chance that the last data point has a batch size equal to 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10: the dataset has 1000 data points, which gives 333 batches with batch_size=3 and a last batch with batch_size=1. Training may fail at the very end of an epoch, which is not acceptable for bigger datasets.
* Fix eval failures due to BN adaptation enabled by default (openvinotoolkit#377)
* Reduce BN adaptation samples count in HAWQ sanity configs (openvinotoolkit#380)
* Fix object detection sample. (openvinotoolkit#383)
* Added Q-DQ ONNX converting parameter (openvinotoolkit#369)
* Links to models were updated (openvinotoolkit#386)
* include_mask flag for the tfds decoder was added (openvinotoolkit#385)
* Support of the input_info param was added (openvinotoolkit#388)
* Change VOC dataset namings (openvinotoolkit#387)
* Configure the device by a common function for all samples (openvinotoolkit#391)
* Reduced num_init_samples for range init to accelerate sanity tests (openvinotoolkit#392)
* Basic progress bar to avoid a multiprocessing issue with tqdm(DataLoader) (openvinotoolkit#390)
* Add pruned ssd300 and unet_mapillary (openvinotoolkit#393)
* Print FLOPs pruning level in statistics (openvinotoolkit#367)
  * Calculate current FLOPs after updating masks
  * Fix: missed transpose convolution
  * add test_calculation_of_flops
  * Fix compute_flops_hook for nn.Linear; add a comment for compute_flops_hook
* Add AutoML-based mixed-precision initialization mode - AutoQ (openvinotoolkit#250)
  * Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision
  * Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and initialize layer-wise quantization bitwidths prior to NNCF quantization-aware fine-tuning
  * The mixed-precision initialization is optimized towards minimal accuracy drop given a user-specified model size constraint
  * Supported precisions depend on the target HW (VPU 8/4/2) or a user-specified precision space
* Fix path to unet_mapillary_pruning_geometric_median checkpoint (openvinotoolkit#397)
* Fix pruning l2norm (openvinotoolkit#310)
  * Use register_module for l2norm
  * Add filter by algorithms for registered modules
  * Add condition to add _registred_name in the registered module
  * resolve comments; fix pylint
  * Update reference dot files
* Separate the examples and test Python package requirements from NNCF (openvinotoolkit#384)
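The sketch referenced in the (openvinotoolkit#379) entry above; this is plain torch.utils.data API, not NNCF-specific code:

```python
# Sketch of the (openvinotoolkit#379) fix: drop the last incomplete batch so
# BatchNorm never sees a batch of size 1 in training.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 3, 32, 32))
loader = DataLoader(dataset, batch_size=3, drop_last=True)

# 1000 samples / batch_size 3 -> 333 full batches; the trailing 1-sample
# batch that would trip BatchNorm ("Expected more than 1 value per channel
# when training") is discarded.
print(len(loader))  # 333
```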
* Converted relative imports to absolute imports (openvinotoolkit#396)
* Add AC configs for pruned unet and ssd300 (openvinotoolkit#399)
  * Add batch 32 for ssd300_vgg_voc_pruning_geometric_median
* Added proper license for DDPG-related code (openvinotoolkit#398)
* Add some explanations to make the docs clearer (openvinotoolkit#395)
  * docs cleanup
  Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
* Simplify paths to configs (openvinotoolkit#400)
  * Path to config was fixed; paths to configs were simplified
* Add ssd_mobilenet_voc_sparsity_int8 config (openvinotoolkit#404)
* Use links to config files for NNCF READMEs (openvinotoolkit#407)
* Combined package (openvinotoolkit#410)
  * beta.nncf package
  * removed pytest.ini
* Return pandas to the list of requirements (openvinotoolkit#405)
* Remove NNCF package dependency on tensorboard (openvinotoolkit#411)
* Small scheduler fixes (openvinotoolkit#412)
  * Add step to pruning schedulers and algo + delete redundant pruning rate setting
  * Fix tests; revert same-pruning-rate changes
  * Add pruning_init in test_calculation_of_flops
  Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>
* [TF] Minor fixes (openvinotoolkit#403)
  * Pylint issues were fixed; an extra line was removed
  Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
* [TF] Add handling of non-distributed strategy (openvinotoolkit#401)
  * Default strategy was added
  * cpu-only flag was disabled for Mask R-CNN training
  * Fixed non-distributed mode for the object detection sample
* Merging and pre hooks (openvinotoolkit#302)
  * Add pre-hook functionality to quantization
  * Add quantizer merging logic to the propagation mode
  * Properly update and merge quantizers between quantizable layers
  * Move adjacent quantizer group creation closer to the builder stage
  * Store affected op node key in the propagating quantizer
  * Refactor quantization to jointly quantize weights and activations
  * Fix clearing constraint sets during liberal activation bitwidth assignment
  * Add initial version of build-time range init
  * Make HAWQ work with heterogeneous quantizer configurations
  * Finalize the switch to build-time range init
  * Properly compare quantizer configs for requantization purposes
  * Fix quantizer ordering once again
  * Improve HAWQ bitwidth reference graph formatting
  * Add NNCF network clean view tests
  * Fix errors
  * Use statistics approach for the runtime range init
  * Add tests for separate statistic collectors
  * Extend range init setting tests
  * Fix rebasing issues
  * Switch AutoQ to setting compatible configs instead of bitwidths
  * Ref HAWQ file adjustments after fixing experimental controller init
* Relax requirements packages versions (openvinotoolkit#415)
* Using common registry (openvinotoolkit#414)
* Fixed sanity tests for samples (openvinotoolkit#417)
* Common NNCFConfig (openvinotoolkit#413)
  * using common config
  * added jsonschema to requirements
* Fix third-party sanity tests (openvinotoolkit#420)
* Fix NoCompressionAlgorithmBuilder (openvinotoolkit#426)
* Fixed issues with paths (openvinotoolkit#425)
* 00.0: Updating NNCF github dockerfiles against the latest changes (openvinotoolkit#436)
* Change thresholds for pruned ssd300 (openvinotoolkit#435): diff_fp32_min from -1.2 to -4.8
* Use one of the registered JSON meta-schemae (openvinotoolkit#439). Fixes: openvinotoolkit#416
* Use non-recursive BFS for graph traversal (openvinotoolkit#440). Python does not handle deep recursion stacks well; see the sketch after this entry.
  * Use DFS by default, after all
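A minimal sketch of the non-recursive traversal idea behind (openvinotoolkit#440): an explicit container replaces the call stack, so graph depth is no longer limited by Python's recursion limit. Generic code, not NNCF's actual traversal API:

```python
# Hedged sketch of (openvinotoolkit#440): iterative graph traversal with an
# explicit container instead of recursion. Pop from the right for DFS, from
# the left for BFS.
from collections import deque


def traverse(graph, start, dfs=True):
    """graph: dict mapping a node to a list of its successor nodes."""
    visited, order = set(), []
    pending = deque([start])
    while pending:
        node = pending.pop() if dfs else pending.popleft()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        pending.extend(graph.get(node, ()))
    return order


# A 10000-edge chain would overflow Python's default recursion limit (~1000)
# with a recursive implementation; the iterative version handles it fine.
chain = {i: [i + 1] for i in range(10_000)}
assert len(traverse(chain, 0)) == 10_001
```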
* Add AC config for SSD300_mobilenet on voc. (openvinotoolkit#441)
* Minor fixes for HAWQ (openvinotoolkit#442)
  * Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option
  * Corrected printing of the chosen bitwidth configuration
* Init on the same device by default (openvinotoolkit#438)
  * Use the model's own device for initialization by default
  * Adjust init args documentation
  * Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices
  * Use cuda for precision init tests
* Remove extra entries from MANIFEST.in (openvinotoolkit#452)
* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (openvinotoolkit#450)
  * Changed working logic with json metrics (openvinotoolkit#447)
  * Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2
  Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
* Apply nncf.register_module correctly in transformers (openvinotoolkit#454)
* Fix metric value for ssd300_mobilenet_voc. (openvinotoolkit#453)
* Do not follow symlinks when opening files (openvinotoolkit#451)
* Correctly construct Q-DQ config for E2E tests (openvinotoolkit#456)
* Update documentation for the v1.6.0 release (openvinotoolkit#457)
* Add torch.load warnings and path resolution (openvinotoolkit#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>

* accuracy aware draft
* refactor to introduce TrainingRunner for training loop control
* move accuracy aware loop to common
* address comments
* update accuracy aware
* add support for TF samples
* refactor keras API sample

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>