
Merging and pre hooks #302

Merged: 21 commits merged into openvinotoolkit:develop from merging_and_pre_hooks on Jan 19, 2021
Conversation

vshampor (Contributor)

No description provided.

@vshampor (Contributor, Author) commented Dec 1, 2020

This PR does two things:

  1. Allows input tensors for NNCF-wrapped PyTorch functions to be pre-processed separately, i.e. introduces "pre-hook" functionality; previously, only the function output tensors could be post-processed.
  2. Introduces a new algorithm for merging the propagating quantizers that have reached one and the same downward-branching node.

Will add more commentary in-place.

Comment on lines +37 to 57
class InputIndexEntry:
    def __init__(self, path: Tuple[Union[int, str], ...], getter: Callable, setter: Callable):
        self.path = path
        self.getter = getter
        self.setter = setter


class TupleRebuildingSetter:
    def __init__(self, idx_to_set, current_tuple, previous_level_setter_for_current_tuple):
        self._previous_level_setter = previous_level_setter_for_current_tuple
        self._current_tuple = current_tuple
        self._idx_to_set = idx_to_set

    def __call__(self, value):
        tmp_list = list(self._current_tuple)
        tmp_list[self._idx_to_set] = value
        new_tuple = tuple(tmp_list)
        self._current_tuple = new_tuple
        self._previous_level_setter(new_tuple)


class OperatorInput:
@vshampor (Contributor, Author) commented Dec 1, 2020

  1. A PyTorch function input is not necessarily a single raw torch.Tensor for each input argument of the function. Consider torch.cat([x1, x2]), where the input tensors x1 and x2 are located inside a container that is itself passed as a single argument. This function call, however, conceptually takes two tensors as its input, and if this is not traced properly, the NNCF graph will become disjoint. In a previous implementation this was circumvented by flattening the node inputs: a node's input signature, which identifies it among the other NNCF nodes, is a flat list of all tensors in the input arguments' nesting hierarchy.

The same approach is taken here - a flat view of the function's input tensors is constructed, and each tensor is conceptually assigned a separate "input port ID" ordinal. However, this flat view must be a view, i.e. it should allow the tensors to be replaced with the results of pre-processing while preserving the original nested structure. There should be an option to do so separately for each traced tensor present in the input argument container hierarchy - only in this case will pre-hooking of separate inputs work in most cases. Sadly, Python has no real notion of writeable iterators; therefore, each tensor in the flat view has to store and use associated getter and setter functions, which are built using the closest owning container's __getitem__ and __setitem__ and the index of the tensor in said container.
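To make the mechanics concrete, here is a minimal self-contained sketch of such a writeable flat view over one level of list-type arguments; the helper name and structure are hypothetical, while the actual implementation is the OperatorInput/InputIndexEntry/TupleRebuildingSetter machinery above:

```python
import torch

def make_flat_view(args: list):
    # Collect (tensor, getter, setter) triples, one per traced tensor, in
    # input-port order; setters write back into the owning container.
    entries = []
    for arg_idx, arg in enumerate(args):
        if isinstance(arg, torch.Tensor):
            entries.append((arg,
                            lambda a=args, i=arg_idx: a[i],
                            lambda v, a=args, i=arg_idx: a.__setitem__(i, v)))
        elif isinstance(arg, list):  # tensors nested inside a container argument
            for item_idx, item in enumerate(arg):
                if isinstance(item, torch.Tensor):
                    entries.append((item,
                                    lambda a=arg, i=item_idx: a[i],
                                    lambda v, a=arg, i=item_idx: a.__setitem__(i, v)))
    return entries

# torch.cat([x1, x2]) conceptually has two input ports, 0 and 1:
x1, x2 = torch.zeros(3), torch.ones(3)
args = [[x1, x2]]
for port_id, (tensor, getter, setter) in enumerate(make_flat_view(args)):
    setter(tensor * 2)  # pre-process a single tensor; the nested structure is preserved
result = torch.cat(args[0])
```

Lists can be written into directly; immutable containers (tuples) cannot, which is what the TupleRebuildingSetter above works around by rebuilding the tuple and pushing it to the previous nesting level's setter.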

@@ -234,7 +234,7 @@ def add_node(self, op_exec_context: OperationExecutionContext, inputs) -> NNCFNo
                 self._nx_graph.add_edge(parent, node_key)
                 has_traced_inputs = True
                 self._nx_graph.edges[parent, node_key][NNCFGraph.ACTIVATION_SHAPE_EDGE_ATTR] = info.shape
-                self._nx_graph.edges[parent, node_key][NNCFGraph.IN_PORT_NAME] = i
+                self._nx_graph.edges[parent, node_key][NNCFGraph.IN_PORT_NAME_EDGE_ATTR] = i
@vshampor (Contributor, Author) commented Dec 1, 2020

  1. Since the PyTorch/Python combo is essentially dynamic, there is no clear definition of how many inputs (in terms of NNCF graph nodes) a given function takes (see the torch.cat example above). The function's "input ports" are therefore not something static; rather, they are enumerated according to the number of traced tensors arriving as the function's (node's) input.



class InsertionPoint:
    def __init__(self, ia_op_exec_context: InputAgnosticOperationExecutionContext,
-                 insertion_type: InsertionType):
+                 insertion_type: InsertionType,
+                 input_port_id: int = None):
        self.ia_op_exec_context = ia_op_exec_context
@vshampor (Contributor, Author) commented Dec 1, 2020

  1. An insertion point now stores the input port it affects, in order to represent the pre-hook semantics as well; None specifies a post-hook insertion.
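For illustration, constructing both kinds of insertion points with the extended signature could look as follows (InsertionType.OPERATOR_POST_HOOK appears elsewhere in this PR; the pre-hook member name is assumed by analogy):

```python
post_hook_ip = InsertionPoint(ia_op_exec_context, InsertionType.OPERATOR_POST_HOOK)  # affects the op output
pre_hook_ip = InsertionPoint(ia_op_exec_context, InsertionType.OPERATOR_PRE_HOOK,    # member name assumed
                             input_port_id=0)                                        # affects input port 0
```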

                 quantizers_between_quantizable_layers: QuantizersBetweenQuantizableLayers = None):
        self.quantizer_module_ref = quantizer_module_ref
-        self.affected_ia_op_exec_contexts = affected_ia_op_exec_contexts
+        self.affected_insertions = affected_insertions
@vshampor (Contributor, Author) commented Dec 1, 2020

  1. The InputAgnosticOperationExecutionContext is the address of a node, but the insertion of a hook for a node can now proceed in two ways - either post- or pre-. Therefore, the position of a quantizer is no longer defined by a node alone, but by a node and the insertion type.

        pre_hook_id = PreHookId(ia_op_exec_context, input_port_id)
        if pre_hook_id in self._pre_hooks:
            raise KeyError("Pre hook for context {} is already registered".format(str(pre_hook_id)))
        self._pre_hooks[pre_hook_id] = fn_list
@vshampor (Contributor, Author):

  1. Pre-hooks are what will allow quantizing branches in the network separately.
     Below, on the left, is the result of MobileNetV2 quantization prior to this PR, and on the right, with this PR.
     [Screenshot from 2020-12-01 19-33-00]

Notice that the __add__ node used to have a single, nominal pre-hook insertion point, while in fact it takes two tensor activations. The pre-hook insertion points were also non-functional, i.e. the actual insertion could only occur at a post-hook point, which led to a situation where either both branches would be quantized with a single quantizer (if an FQ was inserted at the topmost post-hook insertion point), or neither would.

With this PR, the node has two pre-hook insertion points, corresponding to the number of inputs (i.e. "ports"). Furthermore, it is possible to quantize the conv2d input separately, using conv2d's pre-hook 0, and the residual __add__ input, using __add__'s pre-hook 0.
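A self-contained sketch of the pre-hook mechanics (the registry and function names here are hypothetical, not the NNCF API): a hook registered for an (op address, input port) pair rewrites that single input tensor before the traced function runs.

```python
from collections import defaultdict
import torch

_pre_hooks = defaultdict(list)  # (op_address, input_port_id) -> [fn, ...]

def register_pre_hook(fn, op_address, input_port_id):
    _pre_hooks[(op_address, input_port_id)].append(fn)

def execute_with_pre_hooks(op_address, fn, *tensor_args):
    processed = list(tensor_args)
    for port_id in range(len(processed)):
        for hook in _pre_hooks[(op_address, port_id)]:
            processed[port_id] = hook(processed[port_id])  # per-port pre-processing
    return fn(*processed)

# Fake-quantize only input port 0 of an __add__-like op, leaving port 1 untouched:
register_pre_hook(
    lambda t: torch.fake_quantize_per_tensor_affine(t, scale=0.1, zero_point=0,
                                                    quant_min=-128, quant_max=127),
    "MobileNetV2/__add___0", 0)
out = execute_with_pre_hooks("MobileNetV2/__add___0", torch.add,
                             torch.randn(4), torch.randn(4))
```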

            final_quantizer_config_setup[insertion_info] = quantizer_config

        finalized_proposal = quantization_proposal.finalize(final_quantizer_config_setup)
        final_quantizer_setup = prop_graph_solver.get_final_quantizer_setup(finalized_proposal)
@vshampor (Contributor, Author) commented Dec 1, 2020

  1. The quantizer propagation algorithm has been modified to work in two stages. This is related to the new quantizer configuration merging algorithm. Basically, the first stage of the propagation results in a proposal: a set of points at which a quantizer should be inserted in order to quantize all operations specified in the HW config with a supported configuration, and, for each point, a list of possible quantizer configurations out of which one and only one should be chosen. The choice can either be driven by the NNCF config options/presets, or be delegated to a specialized algorithm, such as HAWQ (in case multiple bitwidths are present in the quantizer configuration lists) or an RL algorithm.

Once the final quantizer configurations are chosen, an extra step is necessary to merge the quantizers that are redundant - e.g. if one and the same final quantizer configuration has been chosen both for the global quantizer of all downstream branches and for the local quantizer on a branch. This is illustrated below - at the top is the "proposed" state, which allows a choice among multiple quantizer configurations for each quantizer (blue nodes). At the bottom is the final state, provided that the user's/algorithm's choice for quantizers 59, 7 and 4 was in each case 8-bit symmetric per-tensor quantization; quantizers 7 and 4 are redundant with respect to quantizer 59 and are therefore discarded.
[Screenshot from 2020-12-01 19-52-31]
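In code terms, the two-stage flow around the snippet above is roughly as follows; finalize() and get_final_quantizer_setup() come from the diff, while the stage-1 entry point and the proposal's attribute name are assumptions:

```python
quantization_proposal = prop_graph_solver.run_on_ip_graph(ip_graph)  # stage 1: propose points + config lists
final_quantizer_config_setup = {}
for insertion_info, config_list in quantization_proposal.quantizer_config_lists.items():
    final_quantizer_config_setup[insertion_info] = config_list[0]  # stand-in for the preset/HAWQ/RL choice
finalized_proposal = quantization_proposal.finalize(final_quantizer_config_setup)
final_quantizer_setup = prop_graph_solver.get_final_quantizer_setup(finalized_proposal)  # stage 2: merge redundant quantizers
```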

    # do not merge at all, or ...
    MODERATE = 1  # ... only merge for exact matches
    AGGRESSIVE = 2  # ... merge common parts, and if a branch quantizer has options for scope narrowing in addition to
    # the common part, keep the quantizer on branch

@vshampor (Contributor, Author):

The AGGRESSIVE strategy will be used by default, and it is the one most likely to garner performance benefits.

        # if insertion_type == InsertionType.OPERATOR_POST_HOOK:
        #     return True
        # return False
        return True
@vshampor (Contributor, Author):

Now that all insertion points, both pre- and post-hook, are functional, this function is mostly obsolete.

            else:
                # Not all of the dominated quantizers have reached the blocking node yet
                self._active_propagating_quantizers_queue.appendleft(curr_prop_quantizer)
        return quant_prop_graph
@vshampor (Contributor, Author):

  1. The new propagation algorithm makes all quantizers that are trying to pass up through a downward-branching node wait until the quantizers on the neighbouring branches reach that node as well. Only then does the merge occur, which may result in a new merge quantizer being added above the branching node to propagate further, and in the existing branch quantizers being discarded. For PropagationStrategy.AGGRESSIVE, it is also possible that none of the branch quantizers are discarded, yet the merge quantizer is still created - for instance, if one of the branches might require requantization, should the user/specialized algorithm choose a corresponding configuration.

[aggressive_merge diagram]

Here, the merge quantizer will have a single 8-bit configuration associated with it, since this is the only option supported by all the downstream branches. If, at the final quantization config selection stage of the setup, some of the branches also end up with 8-bit quantization chosen for them, the corresponding branch quantizers will be discarded, since the global quantizer already provides 8-bit quantization. The user is still, however, free to choose non-8-bit quantization for any of the branches, in which case requantization will occur on those branches.
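A simplified, self-contained model of that merge decision, with quantizer configurations as plain tuples rather than the actual NNCF structures:

```python
def aggressive_merge(branch_config_lists):
    # Returns (configs for the merge quantizer above the branching node,
    #          leftover branch-specific configs per branch).
    common = set(branch_config_lists[0]).intersection(*branch_config_lists[1:])
    if not common:
        return None, branch_config_lists  # nothing in common: keep branch quantizers as-is
    merged = sorted(common)
    # A branch quantizer survives only if it still has branch-specific options beyond
    # the common part - choosing one of those later means requantization on that branch.
    kept = [sorted(set(cfgs) - common) for cfgs in branch_config_lists]
    return merged, kept

merged, kept = aggressive_merge([
    [("8bit", "sym", "per-tensor"), ("4bit", "sym", "per-tensor")],  # conv branch
    [("8bit", "sym", "per-tensor")],                                 # avg_pool branch
])
# merged == [("8bit", "sym", "per-tensor")]; the conv branch keeps its 4-bit option.
```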


    def get_base(self) -> 'InputAgnosticOperationExecutionContext':
        return self.ia_op_exec_context

    def get_suffix(self) -> str:
-        return ''
+        return '|OUTPUT' if self.input_port_id is None else '|INPUT{}'.format(self.input_port_id)
@vshampor (Contributor, Author):

The string representation of a quantizer has changed, since it now needs to reflect the insertion type - either as an input quantizer to an operation, or as an output quantizer. This may affect code that relies on it - please comment if you know of such pieces of code.
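For illustration (the scope strings here are made up), the IDs now look like:

```
ResNet/NNCFConv2d[conv1]/conv2d_0|OUTPUT   <- post-hook (output) quantizer, input_port_id is None
ResNet/__add___0|INPUT0                    <- pre-hook quantizer on input port 0
```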

@vshampor (Contributor, Author) commented Dec 1, 2020

@alexsu52 @AlexKoff88 @ljaljushkin @asenina, add others as necessary.

@asenina (Contributor) commented Dec 1, 2020

Jenkins please retry a build

"62 Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/NNCFConv2d[conv]/conv2d" [color=lightblue, id=62, label="conv2d_#62", scope="Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/NNCFConv2d[conv]", style=filled, type=conv2d];
"63 Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/BatchNorm2d[bn]/batch_norm" [id=63, label="batch_norm_#63", scope="Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/BatchNorm2d[bn]", style=filled, type=batch_norm];
"64 Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]/RELU" [id=64, label="RELU_#64", scope="Inception3/InceptionA[Mixed_5b]/BasicConv2d[branch_pool]", style=filled, type=RELU];
"65 Inception3/InceptionA[Mixed_5b]/cat" [id=65, label="cat_#65", scope="Inception3/InceptionA[Mixed_5b]", style=filled, type=cat];
A contributor commented:

This graph looks strange now:
[image]

  1. The input for AvgPool can't be INT4 according to the HW config.
  2. Previously, we had the same quantization parameters on all inputs and outputs of the concat node, which is HW-friendly. Now we allow uncoupled activation bitwidths on the branches.

How it was before:
[image]

@vshampor (Contributor, Author) commented Jan 7, 2021

After enabling heterogeneous quantizer configuration handling for HAWQ, I managed to restore most of the reference bitwidth distribution graphs to the same state that is currently present on develop.
Current state of the same piece of the graph that we considered in this discussion:
[Screenshot from 2021-01-09 14-35-01]

Note that 4-bit AFQ requantizations have appeared on the branches. This is due to the fact that the top-level FQ is 8-bit asymmetric per-tensor, which is the only configuration compatible both with the 8-bit symmetric per-channel quantization listed as supported by avg_pool2d in vpu.json, and with the conv2d inputs, which support 4-bit symmetric per-tensor and 8-bit asymmetric per-tensor quantization. Since 4 bits have been chosen for the conv2d weights, the corresponding activation FQs have been selected as 4-bit, which leads to requantization. Another requantization (no. 104) occurs for avg_pool2d, to narrow the quantized output range from asymmetric to symmetric; although the per-channel quantization in no. 104 is not narrowing w.r.t. nos. 34, 45, 61, 69, this does not break anything.

Also note that the requantizers share their adjacency group with their base quantizers. Hope this is OK.

@vshampor (Contributor, Author) commented Dec 2, 2020

Jenkins please retry a build

@vshampor vshampor force-pushed the merging_and_pre_hooks branch 4 times, most recently from 9ab84ab to f265239 on December 16, 2020 14:17
@vshampor vshampor merged commit 2f0cab7 into openvinotoolkit:develop Jan 19, 2021
vshampor added a commit that referenced this pull request Jan 29, 2021
* Fix input quantization in case of embeddings (#97)

* Added sanity tests for third party integration (#45)

* Expose quantizer linking through config (#100)

* Add citing section to frontpage README (#103)

* Fix bad rebase in asymmetric quantization ONNX export (#104)

* Use default quantizer configuration for op weights not specified in HW config (#105)

* Update transformers to v3.0.2 (#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (#109)

* Add unified scales in HW config operation (via quantizer linking) (#108)

* Add quantization metric (#33)

* Make HW config parsing conform to the implicit rules (#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (#113)

* Use sequential sampling for evaluation across example scripts (#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (#115)

* Properly handle ops in HW config without quantization configs associated (#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (#121)

* make optional criterion in signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (#122)

* Fix pretrained (#125)

* Mark Convs as non-depthwise for 1 input channel case (#126)

* Add non-RELU activations to fusable patterns (#124)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per-layer initialization of ranges. (#116)

* Add prepare_for_export() to commit pre-export steps for CompressionAlgorithmController; update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing optimal mixed precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to a fully INT8 setup.
Total compression for the model is the sum of per-layer compression over each quantized layer, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth.
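A sketch of the described computation (the helper function is hypothetical):

```python
def mixed_precision_ratio(layer_flops, layer_bits):
    # Compression of a mixed-precision setup relative to a fully-INT8 one.
    int8_cost = sum(f * 8 for f in layer_flops)
    mixed_cost = sum(f * b for f, b in zip(layer_flops, layer_bits))
    return mixed_cost / int8_cost

print(mixed_precision_ratio([1e9, 2e9], [4, 8]))  # 0.833...: ~17% cheaper than fully INT8
```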

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Correct minimum and maximum values for mean_min_max no longer skip the check for not-yet-collected statistics, which prevents initialization with inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged to any previous op, i.e. should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Added model analysis file

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* Fixes for grads + batch norms

* To fix cpu_only part (#221)

* Updated the cpu_only part of the dockerfile; fixed an issue with setup.py install with the --cpu-only opt; fixed README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support for sparsity freeze epoch for magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Refactoring + added step with model analysis

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* Fixes for pruning info

* fix quantization range initialization in case of 1 scale channel (#241)

fix quantization range initialization in the case of 1 scale channel, to avoid initializing from only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skipping of OpenVINO tests and preinstall (#246)

* Small cleanup + refactoring

* Corrected handling of barrier on the graph traverse (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling of Inception outputs in classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make formulas depend only on one value (levels), thereby reducing the chance of mistakes

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Add hack of export processing for our old checkpoints
Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX exporting supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (#263)

* make customer happy to see param name that is wrong (#259)

* kernel changes

* Add pruning sample tests. (#268)

* Change an operation order in create_compressed_model (#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skipped unsupported models (#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions carved out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* Cpu nms kernels replaced by torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (#257)

* Pruning of ConvTranspose (#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (#279)

* Added Unet Mapillary AC configs (#281)

* Added flag for collecting quickly computed stats (#287)

* Remove __getattr__ from SampleConfig (#292)

Newer `addict` version uses custom private attributes for its internal
workings, and __getattr__ disrupted it. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (#291)

* Set proper workdir path for Mask-RCNN (#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (#288)

* proper BN momentum parameter and train mode setting in BN adaptation

* use training mode switcher context manager for BN adaptation inference

* Testing OPs quantization by synthetic tests (#297)

Also
* Made LeakyRELU an input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (#300)

* Moved mixed_precision configs to the separate folder
* Minimized the set of parameters in this config, removing as many as possible and letting them be the default ones.

* Remove .dot extension in the HW config test case descriptor (#303)

* Switch to VOC2012 in eval mode (#295)

* Updated pruning configs and results (#305)

* Don't call MLFlow if it's not enabled (#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (#296)

* Fixed paths to mixed-precision configs (#306)

* Correct configs for mixed precision models (#307)

After #300, *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy.

* Removed custom SqueezeNet model for better user experience (#308)

* Correct configs for mixed precision models

After #300, *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy.

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of the SqueezeNet model to work around a bug in the ONNX exporter when converting MaxPool with ceil_mode=True.
This bug is no longer relevant for torch 1.5, and there's an almost identical SqueezeNet model in torchvision > 0.6.
That's why the custom SqueezeNet was deleted as no longer needed, to remove confusion.

There are no changes in the corresponding NNCF graph.
Previously trained checkpoints for the custom SqueezeNet can be loaded and evaluated with the SqueezeNet from torchvision. The INT8 model has the same accuracy; the mixed-precision model differs by at most ~0.01.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename value of target_device from "NONE" to "TRIAL" (#314)

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (#316)

* Made initialization depend on the number of samples. (#309)

* Wrapped MLFlow for safe access (#313)

* Introduced a separate batch size for initialization (#315)

* Separate data_loader is registered for initialization via `register_default_init_args`

* WA for Python 3.6 on CI (#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)

* Show subprocess log in test assertion stacktrace (#325)

* Adjust ICNet compressed target values (#326)

* Do not replace  parameter during symmetric range init (#327)

The initialization using the controller method may occur *after*
the optimizer has received the list of the model's parameters, so replacing
the parameter as a whole during such initialization would break the
gradient updates.
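A minimal sketch of why the in-place update matters (plain PyTorch, not NNCF code):

```python
import torch

scale = torch.nn.Parameter(torch.ones(1))
optimizer = torch.optim.SGD([scale], lr=0.1)  # the optimizer now holds a reference to `scale`

# Rebinding a new Parameter would orphan that reference:
#   scale = torch.nn.Parameter(torch.tensor([2.0]))  # gradient updates no longer reach the model
# Updating the existing tensor in-place keeps the reference valid:
with torch.no_grad():
    scale.fill_(2.0)
```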

* Increase number of epochs in sanity test runs (#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (#328)

* Use PyTorch 1.7 (#223)

* Move epoch_step and step to the beginning of epoch for staged worker (#318)

* Use torch 1.7.0 for third party sanity tests (#333)

* Fix mixing cyrillic and latin letters (#335)

* Fix statistics calculation in local-mode sparsity. (#337)

* Fk/update packages versions (#338)

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add definitive versions of packages only

* Add definitive versions of packages only. fix01

* Update accuracy target values after switching to torch 1.7.0 (#334)

* Change tensorboardX to pytorch.utils.tensorboard (#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (#323)

* Corrected grouping of activation quantizers (#339)

Unmerged FQs for activations should be in different groups if an unmerged activation FQ on a branch goes directly after another activation FQ (a common input for different branches).
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)

* Changed AC configs for SSD models (#341)

* Revert "Fk/update packages versions (#338)" (#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (#344)

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add definitive versions of packages only

* Add definitive versions of packages only. fix01

* Add pandas to requirements

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add definitive versions of packages only

* Add definitive versions of packages only. fix01

* Add pandas to requirements

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (#340)

* fix config path (#346)

* Add Embedding to the CPU HW config definition (#347)

* Added separate execution of OV tests to start parallelizing (#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)

* add scale_log parameter for quantization; it allows increasing convergence speed for high scales and increasing accuracy for low scales.

* add _ to make some variable "hidden"

* variant of setter for scale

* add setter for input_range for asymmetric quantizer

* scale_log_flag is used outside to print status, so I've brought back .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read only

* add test for the scale_log parameter.

* Update test_scale_log.py

* add missing key check due to load_state_dict; whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx failure due to being unable to set a Parameter via a property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replacing scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Fixed protobuf error (#349)

* Add quantization support for nn.EmbeddingBag (#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings

* Vpu config revision (#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (#357)

After checking out a fresh develop, git confuses files with/without a capital letter.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't work either.

* Update mmdet patch (#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (#362)

* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368)

This reverts commit b0504e9.

* Beta directory (#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (#370)

* Fix missing default value (#373)

* Enable batch norm adaptation by default (#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)

* Add pre post processing test  (#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (#375)

* Use a reduced number of BN adaptation samples for sanity testing (#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (#379)

There is a small chance that the last batch has a batch size equal to 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The dataset has 1000 data points:
there are 333 batches with batch_size=3 and a last one with batch_size=1. Training may fail at the end of an epoch, which is not acceptable for bigger datasets.
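The fix corresponds to dropping the possibly single-sample trailing batch, e.g.:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 3, 32, 32))       # 1000 points, as in the CIFAR10 sanity case
loader = DataLoader(dataset, batch_size=3, drop_last=True)  # 333 full batches; the size-1 remainder is dropped
```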

* Fix eval failures due to BN adaptation enabled by default (#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (#380)

* Fix object detection sample. (#383)

* Added Q-DQ ONNX converting parameter (#369)

* Links to models were updated (#386)

* include_mask flag for tfds decoder was added (#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (#388)

* change VOC dataset namings (#387)

* Configure device by common function for all samples (#391)

* Reduced num_init_samples for range init to accelerate sanity tests (#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (#393)

* Print flops pruning level in statistic (#367)

* Print flops pruning level in statistic

* Calculate current flops after update masks

* Fix: missed transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.linear

* Add comment for compute_flops_hook

* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidths, prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)

* Fix pruning l2norm (#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filter by algorithms for registered modules

* Add condition to add _registered_name in registered module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (#384)

* converted relative imports to absolute imports (#396)

* Add ac configs for pruned unet and ssd300 (#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (#398)

* Add some explanations to make doc clearer (#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (#404)

* Use links to config files for NNCF READMEs (#407)

* Combined package (#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (#405)

* Remove NNCF package dependency on tensorboard (#411)

* Small scheduler fixes (#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (#415)

* using common registry (#414)

* fixed sanity tests for samples (#417)

* Common NNCFConfig (#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (#420)

* Fix NoCompressionAlgorithmBuilder (#426)

* fixed issues with paths (#425)

* 00.0:Updating NNCF github dockerfiles against last changes (#436)

* Change thresholds for pruned ssd300 (#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemae (#439)

Fixes: #416

* Use non-recursive BFS for graph traversal (#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all

* Add AC config for SSD300_mobilenet on voc. (#441)

* Minor fixes for HAWQ (#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the option `dump_precision_init_data`
Corrected printing of the chosen bitwidth configuration

* Init on same device by default (#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests

* Remove extra entries from MANIFEST.in (#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)

* Changed the logic for working with JSON metrics (#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)

* Fix metric value for ssd300_mobilenet_voc. (#453)

* Do not follow symlinks when opening files (#451)

* Correctly construct Q-DQ config for E2E tests (#456)

* Update documentation for the v1.6.0 release (#457)

* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
vshampor added a commit that referenced this pull request Jan 29, 2021
* Fix input quantization in case of embeddings (#97)

* Added sanity tests for third party integration (#45)

* Expose quantizer linking through config (#100)

* Add citing section to frontpage README (#103)

* Fix bad rebase in asymmetric quantization ONNX export (#104)

* Use default quantizer configuration for op weights not specified in HW config (#105)

* Update transformers to v3.0.2 (#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (#109)

* Add unified scales in HW config operation (via quantizer linking) (#108)

* Add quantization metric (#33)

* Make HW config parsing conform to the implicit rules (#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (#113)

* Use sequential sampling for evaluation across example scripts (#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (#115)

* Properly handle ops in HW config without quantization configs associated (#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (#121)

* make optional criterion in signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (#122)

* Fix pretrained (#125)

* Mark Convs as non-depthwise for 1 input channel case (#126)

* Add non-RELU activations to fusable patterns (#124)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per layer initialization of  ranges. (#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing optimal mixed precision configuration for a given ratio.

Compression ratio of mixed precision quantization is calculated by relation to fully INT8 one.
Total compression for the model is sum of compression for each quantized layer, which is multiplication the layer's (Conv, Deconv, Linear) FLOPS and number of bits for its quantization. The ratio is used for estimation of performance boost for quantized model It's a better proxy for amount of calculation then number of parameters multiplied by bitwidth

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Right minimal and maximum values for mean_min_max doesn't skip check for not collected statistics and prevents from initializing by inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

Will manifest itself in case there is a batch norm operation that
was not merged to any previous op, i.e. should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Added model analysis file

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* Fixes for grads + batch norms

* To fix cpu_only part (#221)

* To update cpu_only part dockerfile; fix issue with setup.py install with --cpy-only opt; fix README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support sparsity freeze epoch for magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Refactoring + added step with model analysis

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* Fixes for pruning info

* fix quantization range initialization in case of 1 scale channel (#241)

fix quantization range initialization in case of 1 scale channel to avoid initialization only by single slice of data (data[0]) and ignoring the other (data[1], data[2],.....)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skip Openvino tests and preinstall (#246)

* Small cleanup + refactoring

* Corrected handling of barrier on the graph traverse (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range, Make formulas dependent only from one value - levels, thereby reducing the chance to make a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Add hack of export processing for our old checkpoints
Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX exporting supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AssymetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AssymetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (#263)

* make customer happy to see param name that is wrong (#259)

* kernel chainges

* Add pruning sample tests. (#268)

* Change an operation order in create_compressed_model (#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skiped unsupported models (#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions carry out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* Cpu nms kernels replaced by torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (#257)

* Pruning of ConvTranspose (#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (#279)

* Added Unet Mapillary AC configs (#281)

* Added flag for collection quickly computed stats (#287)

* Remove __getattr__ from SampleConfig (#292)

Newer `addict` version uses custom private attributes for internal
working and __getattr__ disrupted it. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (#291)

* Set proper workdir path for Mask-RCNN (#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (#288)

* proper BN momenta parameter and train mode setting in BN adaptation

* use training mode switcher context maganer for BN adaptation inference

* Testing OPs quantization by synthetic tests (#297)

Also
* Made LeakyRELU an input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (#300)

* Moved mixed_precision configs to a separate folder
* Minimized the set of parameters in these configs, removing as many as possible and letting them take their default values.

* Remove .dot extension in the HW config test case descriptor (#303)

* Switch to VOC2012 in eval mode (#295)

* Updated pruning configs and results (#305)

* Don't call MLFlow if it's not enabled (#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (#296)

* Fixed paths to mixed-precision configs (#306)

* Correct configs for mixed precision models (#307)

After #300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy.

* Removed custom SqueezeNet model for better user experience (#308)

* Correct configs for mixed precision models

After #300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy.

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of the SqueezeNet model to work around a bug in the ONNX exporter when converting MaxPool with ceil_mode=True.
This bug is no longer relevant for torch 1.5, and there is an almost identical SqueezeNet model in torchvision > 0.6.
That's why the custom SqueezeNet was deleted as no longer needed, to remove confusion.

There are no changes in the corresponding NNCF graph.
Previously trained checkpoints for the custom SqueezeNet can be loaded and evaluated with the SqueezeNet from torchvision. The INT8 model has the same accuracy; the mixed-precision model differs by at most ~0.01.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move the epoch_step() call to the beginning of the epoch. (#231)

* Move the epoch_step() call to the beginning of the epoch (a sketch of the revised loop ordering follows this change list).

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix the per_step scheduler option. Refactoring.
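
As a note on the new ordering, here is a minimal sketch; the controller/scheduler method names follow the NNCF API, while the rest of the loop is a hypothetical example:

```
import torch

def train(model, compression_ctrl, data_loader, optimizer, criterion, num_epochs):
    # The compression scheduler is now stepped at the *beginning* of each
    # epoch and of each iteration, rather than at the end.
    for _ in range(num_epochs):
        compression_ctrl.scheduler.epoch_step()
        for inputs, targets in data_loader:
            compression_ctrl.scheduler.step()
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets) + compression_ctrl.loss()
            loss.backward()
            optimizer.step()
```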

* Rename value of target_device from "NONE" to "TRIAL" (#314)

* Move the epoch_step() call to the beginning of the epoch. (#231)

* Move the epoch_step() call to the beginning of the epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix the per_step scheduler option. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (#316)

* Made initialization depend on the number of samples. (#309)

* Wrapped MLFlow for safe access (#313)

* Introduced a separate batch size for initialization (#315)

* A separate data_loader is registered for initialization via `register_default_init_args`

* WA for Python 3.6 on CI (#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)

* Show subprocess log in test assertion stacktrace (#325)

* Adjust ICNet compressed target values (#326)

* Do not replace the parameter during symmetric range init (#327)

The initialization using the controller method may occur *after*
the optimizer has received the list of the model's parameters, so replacing
the parameter as a whole during such initialization would break the
gradient updates.
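
A minimal sketch of the distinction (the names below are hypothetical):

```
import torch

scale = torch.nn.Parameter(torch.ones(1))
optimizer = torch.optim.SGD([scale], lr=0.1)

init_value = torch.tensor([2.5])

# Breaks gradient updates: the optimizer still holds the old Parameter object.
#   scale = torch.nn.Parameter(init_value)

# Safe: write the initialized value into the existing Parameter in place.
with torch.no_grad():
    scale.copy_(init_value)
```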

* Increase number of epochs in sanity test runs (#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (#328)

* Use PyTorch 1.7 (#223)

* Move epoch_step and step to the beginning of epoch for staged worker (#318)

* Use torch 1.7.0 for third party sanity tests (#333)

* Fix mixing of Cyrillic and Latin letters (#335)

* Fix statistics calculation in local-mode sparsity. (#337)

* Fk/update packages versions (#338)

* Added definitive versions of required packages, moved to python3.8, updated README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix01)

* Update accuracy target values after switching to torch 1.7.0 (#334)

* Change tensorboardX to torch.utils.tensorboard (#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (#323)

* Corrected grouping of activation quantizers (#339)

Unmerged activation FQs should be placed in different groups if an unmerged activation FQ on a branch goes directly after another activation FQ (a common input for different branches).
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)

* Changed AC configs for SSD models (#341)

* Revert "Fk/update packages versions (#338)" (#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (#344)

* Added definitive versions of required packages, moved to python3.8, updated README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix01)

* Add requirement for pandas

* Added definitive versions of required packages, moved to python3.8, updated README

* Add definitive versions of packages only

* Add definitive versions of packages only (fix01)

* Add requirement for pandas

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (#340)

* fix config path (#346)

* Add Embedding to the CPU HW config definition (#347)

* Added separate execution of OV tests to start parallelizing (#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)

* add scale_log parameter for quantization; it allows increasing convergence speed for high scales and increasing accuracy for low scales (a minimal sketch of the idea follows this change list)

* add _ prefix to make some variables "hidden"

* variant of setter for scale

* add setter for input_range for asymmetric quantizer

* scale_log_flag is used externally to print status, so reverted to .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read only

* add test for scale_log parameter.

* Update test_scale_log.py

* add missing key check for load_state_dict;
whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx failure due to being unable to set a Parameter via a property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replacing scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
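
As referenced above, a minimal sketch of the log-scale parameterization (a hypothetical class, not NNCF's actual quantizer):

```
import torch

class LogScaleParam(torch.nn.Module):
    # Stores log(scale) as the trainable parameter, so that gradient steps
    # are multiplicative in scale space: large scales converge faster while
    # small scales retain precision.
    def __init__(self, init_scale: float = 1.0):
        super().__init__()
        self._log_scale = torch.nn.Parameter(torch.tensor(float(init_scale)).log())

    @property
    def scale(self) -> torch.Tensor:
        return self._log_scale.exp()
```

Exposing the value through a `scale` property matches the setter-based approach above, so external code that reads `.scale` keeps working.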

* Fixed protobuf error (#349)

* Add quantization support for nn.EmbeddingBag (#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings

* Vpu config revision (#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (#357)

After checking out a fresh develop, git confuses files whose names differ only by a capital letter.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't help either.

* Update mmdet patch (#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (#362)

* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368)

This reverts commit b0504e9.

* Beta directory (#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (#370)

* Fix missing default value (#373)

* Enable batch norm adaptation by default (#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)

* Add pre/post processing test (#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (#375)

* Use a reduced number of BN adaptation samples for sanity testing (#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (#379)

There is a small chance that the last batch may have a batch size equal to 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The dataset has 1000 data points.
There are 333 batches with batch_size=3 and a final one with batch_size=1. Training may fail at the end of an epoch, which is unacceptable for bigger datasets.
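
For illustration, a small self-contained sketch of the fix (the dataset here is synthetic):

```
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 3, 32, 32))
# 1000 = 333 * 3 + 1, so the last batch would hold a single sample, which makes
# BatchNorm raise "ValueError: Expected more than 1 value per channel when training".
loader = DataLoader(dataset, batch_size=3, drop_last=True)  # discard the incomplete batch
```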

* Fix eval failures due to BN adaptation enabled by default (#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (#380)

* Fix object detection sample. (#383)

* Added Q-DQ ONNX converting parameter (#369)

* Links to models were updated (#386)

* include_mask flag for tfds decoder was added (#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (#388)

* change VOC dataset namings (#387)

* Configure device by common function for all samples (#391)

* Reduced num_init_samples for range init to accelerate sanity tests (#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (#393)

* Print FLOPs pruning level in statistics (#367)

* Print FLOPs pruning level in statistics

* Calculate current FLOPs after updating masks

* Fix: missing transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.Linear

* Add comment for compute_flops_hook (a rough sketch of such a hook follows)
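
A rough sketch of a FLOPs-counting forward hook for convolutions, counting multiply-accumulates only and, per the change above, ignoring bias (the dict name is hypothetical):

```
import torch

flops_per_module = {}

def compute_flops_hook(module, inputs, output):
    # FLOPs of a conv = output elements * (in_channels/groups * kH * kW) MACs.
    macs_per_output_element = (module.in_channels // module.groups
                               * module.kernel_size[0] * module.kernel_size[1])
    flops_per_module[module] = output.numel() * macs_per_output_element

conv = torch.nn.Conv2d(16, 32, kernel_size=3, padding=1)
conv.register_forward_hook(compute_flops_hook)
conv(torch.randn(1, 16, 8, 8))
```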

* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidths, prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)

* Fix pruning l2norm (#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filtering by algorithm for registered modules

* Add condition to add _registred_name in registered modules

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (#384)

* converted relative imports to absolute imports (#396)

* Add ac configs for pruned unet and ssd300 (#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (#398)

* Add some explanations to make doc clearer (#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (#404)

* Use links to config files for NNCF READMEs (#407)

* Combined package (#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (#405)

* Remove NNCF package dependency on tensorboard (#411)

* Small scheduler fixes (#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (#415)

* using common registry (#414)

* fixed sanity tests for samples (#417)

* Common NNCFConfig (#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (#420)

* Fix NoCompressionAlgorithmBuilder (#426)

* fixed issues with paths (#425)

* 00.0:Updating NNCF github dockerfiles against last changes (#436)

* Change thresholds for pruned ssd300 (#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemas (#439)

Fixes: #416

* Use non-recursive BFS for graph traversal (#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all
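
Either traversal can be written iteratively; a minimal sketch of the BFS variant over a plain adjacency-dict graph:

```
from collections import deque

def bfs_traverse(graph, start):
    # Iterative BFS; unlike a recursive traversal, it cannot hit
    # Python's recursion depth limit on deep graphs.
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in graph.get(node, ()):
            if succ not in visited:
                visited.add(succ)
                queue.append(succ)
    return order

print(bfs_traverse({"a": ["b"], "b": ["c"], "c": []}, "a"))  # ['a', 'b', 'c']
```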

* Add AC config for SSD300_mobilenet on voc. (#441)

* Minor fixes for HAWQ (#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected printing of the chosen bitwidth configuration.

* Init on same device by default (#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests
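
The default device can be derived from the model itself, e.g. with the common idiom below (function name hypothetical; assumes the model has at least one parameter):

```
import torch

def default_init_device(model: torch.nn.Module) -> torch.device:
    # Initialization data is then moved to this device instead of a
    # hard-coded 'cuda:0'.
    return next(model.parameters()).device
```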

* Remove extra entries from MANIFEST.in (#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)

* Changed the logic for working with json metrics (#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)

* Fix metric value for ssd300_mobilenet_voc. (#453)

* Do not follow symlinks when opening files (#451)

* Correctly construct Q-DQ config for E2E tests (#456)

* Update documentation for the v1.6.0 release (#457)

* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
vshampor added a commit that referenced this pull request Jun 22, 2021
* Release v1.5.0 of NNCF to master (#254)

* Allow sharing activation quantizers in different graph points (#67)

* Update version and docs on develop (#77)

* Update 3rd party integration patches (#79)

* Doc updates (#84)

* Add info on export to Usage.md

* Fix third party headers

* Fix import in transformers patch (#85)

* Fix percentile per-channel init (#86)

Fixes: #83

* Omit nodes called during debugging from entering NNCF graph (#87)

* Enable custom range initializers for overriden scopes in schema (#89)

* Enable custom quantization configs and initializers for overriden scopes in schema

* code style

* remove range config duplication

* obsolete import

* Fix model saving in transformers patch (#91)

* Patch TracedTensor's __repr__ method instead of torch.Tensor's (#92)

* Fix mmdetection patch (#93)

* Update mmdetection patch to v2.3.0 (#95)

* Allow registering user modules as NNCF modules for weight quantization (#99)

* Assign latest tensor shape during ForwardTraceOnly() (#96)

* Enable GPT2 ops (#98)

* Fix HW config scenario with ops missing in HW config definition (#94)

* Fix input quantization in case of embeddings (#97)

* Added sanity tests for third party integration (#45)

* Expose quantizer linking through config (#100)

* Add citing section to frontpage README (#103)

* Fix bad rebase in asymmetric quantization ONNX export (#104)

* Use default quantizer configuration for op weights not specified in HW config (#105)

* Update transformers to v3.0.2 (#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (#109)

* Add unified scales in HW config operation (via quantizer linking) (#108)

* Add quantization metric (#33)

* Make HW config parsing conform to the implicit rules (#111)

(except for the "any supported quantization for the ops in config
without specified quantizations", because they need config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (#113)

* Use sequential sampling for evaluation across example scripts (#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (#115)

* Properly handle ops in HW config without quantization configs associated (#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (#121)

* make criterion optional in the signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (#122)

* Fix pretrained (#125)

* Mark Convs as non-depthwise for 1 input channel case (#126)

* Add non-RELU activations to fusable patterns (#124)

* Fixed Pylint warnings (#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (#132)

* Add per layer initialization of  ranges. (#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (#138)

* Fix PyLint. (#139)

* Introduced compression ratio parameter for Mixed Precision init (#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully INT8 one.
The total compression for the model is the sum of the compression for each quantized layer, which is the product of the layer's (Conv, Deconv, Linear) FLOPs and the number of bits for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth (see the sketch after this change list).

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision
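
A minimal sketch of that computation (hypothetical helper; per-layer FLOPs and bitwidths as plain lists):

```
def mixed_precision_compression_ratio(layer_flops, layer_bits, int8_bits=8):
    # Sum over quantized (Conv, Deconv, Linear) layers of FLOPs * bitwidth,
    # compared against the fully-INT8 configuration; >1 means the mixed
    # configuration is cheaper than pure INT8.
    mixed_cost = sum(f * b for f, b in zip(layer_flops, layer_bits))
    int8_cost = sum(f * int8_bits for f in layer_flops)
    return int8_cost / mixed_cost

print(mixed_precision_compression_ratio([100, 200], [8, 4]))  # 1.5
```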

* Fix bug with NoCompressionAlgorithmController (#150)

* Set data loading workers to 0 across tests to force single process (#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (#166)

* Fix command line creation for test_compression_training (#167)

* Improve eval test code (#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (#158)

* Fix third_party_sanity and eval test bugs (#169)

* Fix mmdetection dataset search path for SSD (#176)

* Test stability (#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (#178)

* Fixed disabling gradients of quantizers for HAWQ (#184)

* Corrected default values in range initializers (#183)

- Correct minimal and maximum values for mean_min_max: the check for not-yet-collected statistics is no longer skipped, which prevents initialization with inf values.
- Percentile init doesn't crash by default

* Refactor imports in setup.py (#182)

Important for CI

* Fix security issues with imports (#185)

* Fix paths to COCO in mmdetection third party sanity tests (#186)

* Build graphs within the torch.no_grad() context (#187)

Should reduce memory usage during create_compressed_model
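
For illustration, a tiny sketch of the idea, with a plain forward call standing in for the graph-building trace:

```
import torch

model = torch.nn.Linear(4, 4)
with torch.no_grad():
    # Running the graph-building forward passes under no_grad() avoids
    # allocating autograd state, reducing peak memory at model creation.
    _ = model(torch.randn(1, 4))
```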

* Fix security issues directly in code (#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (#190)

* Make default install support non-GPU cases (#193)

* Fixed backward compatibility test (#195)

* Improve quantizer setup for hanging batchnorm nodes (#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

This will manifest itself when there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept quantized
input instead of FP32.

* Fix export for nodes with metatypes not redefined by pruning algo (#171)

* Add more security fixes (#197)

* Removed double logging to stdout (#198)

* ignore frozen layers during filter pruning (#200)

* Use latest matplotlib version (#206)

* Use propagation based mode by default (#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix the quantize inputs option.

* Add operator metatypes for 'sigmoid' and 'add' operator (#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced `enabled` parameter for Quantizers (#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Update documentation (#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* To fix cpu_only part (#221)

* Update the cpu_only part of the dockerfile; fix issue with setup.py install with the --cpu-only option; fix README.md

* apply remarks

* Fix register_operator (#224)

* Add per-layer sparsity. (#127)

* Do not call _quantize_inputs for propagation based mode (#229)

* Consistent bitwidth for activations and weight in propagation mode (#191)

* Added sota eval tests via AC (#142)

* Refactored HAWQ: split functionality into separate files (#232)

* Allow quantizing modules that share their weights for multiple operations (#235)

* Filter quantizers that directly act upon integer inputs (#228)

* Add support for a sparsity freeze epoch in magnitude sparsity. (#218)

* Liberal bitwidth assignment mode by default on precision initialization (#222)

* Fix AdaptiveSparsityScheduler. (#236)

* Fix threesigma init (#240)

* Build extensions in a temporary folder (#239)

* Criterion generalization for HAWQ algorithm (#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* fix quantization range initialization in case of 1 scale channel (#241)

Fix quantization range initialization in the case of 1 scale channel, to avoid initializing from only a single slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...).
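
A tiny sketch of the fix, assuming per-tensor (single scale channel) min/max statistics:

```
import torch

data = torch.randn(8, 3, 32, 32)

# Buggy: range taken from the first slice only.
low_buggy, high_buggy = data[0].min(), data[0].max()

# Fixed: reduce over the whole tensor so no slice is ignored.
low, high = data.min(), data.max()
```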

* Patch Semantic Segmentation Application to export onnx and test with resume flag (#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (#220)

* Fixed skip Openvino tests and preinstall (#246)

* Corrected handling of barrier on the graph traversal (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling of Inception outputs in the classification sample (#251)

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Release v1.6.0 of NNCF to master (#461)

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Added model analysis file

* Fixes for grads + batch norms

* Refactoring + added step with model analysis

* Fixes for pruning info

* Small cleanup + refactoring

* Corrected handling of barrier on the graph traverse (#249)

* Extend input handling flexibility (#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range, Make formulas dependent only from one value - levels, thereby reducing the chance to make a mistake

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (#248)

* Fix bug with export to Q/DQ

Add hack of export processing for our old checkpoints
Add Exception raising for exporting per-channel Q/DQ layers, as PyTorch
ONNX exporting supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AssymetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AssymetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (#253)

* Update documentation for release v1.5.0 (#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (#263)

* make customer happy to see param name that is wrong (#259)

* kernel chainges

* Add pruning sample tests. (#268)

* Change an operation order in create_compressed_model (#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skiped unsupported models (#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions carry out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* Cpu nms kernels replaced by torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (#257)

* Pruning of ConvTranspose (#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (#279)

* Added Unet Mapillary AC configs (#281)

* Added flag for collection quickly computed stats (#287)

* Remove __getattr__ from SampleConfig (#292)

Newer `addict` version uses custom private attributes for internal
working and __getattr__ disrupted it. It was quite useless anyway.

* Fix H/W on an image in the mock coco dataset (#291)

* Set proper workdir path for Mask-RCNN (#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (#288)

* proper BN momenta parameter and train mode setting in BN adaptation

* use training mode switcher context maganer for BN adaptation inference

* Testing OPs quantization by synthetic tests (#297)

Also
* Made LeakyRELU as input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (#300)

* Moved mixed_precision configs to the separate folder
* Minimized the scope of parameters in this config removing as much as possible and let them be the defaults ones.

* Remove .dot extension in the HW config test case descriptor (#303)

* Switch to VOC2012 in eval mode (#295)

* Updated pruning configs and results (#305)

* Don't call MLFlow if it's not enabled (#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.

* Add input/output-names parameters to export_model function. (#296)

* Fixed paths to mixed-precision configs (#306)

* Correct configs for mixed precision models (#307)

After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings
That's why manual configs should be used to achieve a target accuracy

* Removed custom SqueezeNet model for better user experience (#308)

* Correct configs for mixed precision models

After #300 *_hawq.json configs are propagation-based, but checkpoint are still for pattern-based quantization settings
That's why manual configs should be used to achieve a target accuracy

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of SqueezeNet model to workaround a bug in ONNX exporter with converting MaxPool with ceil_mode=True.
This bug isn't actual now for torch 1.5 and there's almost identical SqueezeNet model in torchivision > 0.6.
That's why custom SqueezeNet was deleted as not needed to remove confusion.

There's no changes in the corresponding NNCF graph.
Previously trained checkpoints for custom SqueezeNet can be loaded and evaluated with SqueezeNet from torchvision. INT8 model has the same accuracy, mixed model is differ only by ~0.01 in maximum.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename value of target_device from "NONE" to "TRIAL" (#314)

* Move call epoch_step() method to begin of epoch. (#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (#316)

* Made initialization depending on the number of samples. (#309)

* Wrapped MLFlow for safe access (#313)

* Introduced a separate batch size for initialization (#315)

* Separate data_loader is registered for initialization via `register_default_init_args`

* WA for Python 3.6 on CI (#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (#322)

* Show subprocess log in test assertion stacktrace (#325)

* Adjust ICNet compressed target values (#326)

* Do not replace  parameter during symmetric range init (#327)

The initialization using the controller method may occur *after*
the optimizer received the list of model's parameters, so replacing
the parameter as a whole during such initialization will break the
gradient updates.

* Increase number of epochs in sanity test runs (#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (#328)

* Use PyTorch 1.7 (#223)

* Move epoch_step and step to the beginning of epoch for staged worker (#318)

* Use torch 1.7.0 for third party sanity tests (#333)

* Fix mixing cyrillic and latin letters (#335)

* Fix calculate statistics in local mode sparsity. (#337)

* Fk/update packages versions (#338)

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add difinitive versions of packages only

* Add difinitive versions of packages only.fix01

* Update accuracy target values after switching to torch 1.7.0 (#334)

* Change tensorboardX to pytorch.utils.tensorboard (#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (#323)

* Corrected grouping of activation quantizers (#339)

Not merged FQ's for activations should be in different groups, if unmerged activation FQ on the branch goes directly after another FQ for activation (common input for different branches).
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (#342)

* Changed AC configs for SSD models (#341)

* Revert "Fk/update packages versions (#338)" (#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (#344)

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add difinitive versions of packages only

* Add difinitive versions of packages only.fix01

* Add to requiremet for pandas

* Adding definitive version of required packages, move to python3.8, update ReadMe

* Add difinitive versions of packages only

* Add difinitive versions of packages only.fix01

* Add to requiremet for pandas

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (#340)

* fix config path (#346)

* Add Embedding to the CPU HW config definition (#347)

* Added separated execution OV tests to start parraleling (#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (#329)

* add sclae_log parameters for quantization. its allow to increse convergence speed for high scales and increase accuracy for low scales.

* add _ to make some variable "hidden"

* variant of setter for scale

* add setter for input_range for asymetric quantizer

* scale_log_flag used outside to print status so I've back .sclae_log_flag instead of._scale_log_flag

* made scale_log_flag read only

* add test for sclae_log parameter.

* Update test_scale_log.py

* add missing key check due to load_state_dict
white spaces fix

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx fail due to unable to set Parameter by property

* remove useless init method

* split long line

* fix tes_unified_scales

* Update test_scale_log.py

* update ref file by replace scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Fixed protobuf error (#349)

* Add quantization support for nn.EmbeddingBag (#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings

* Vpu config revision (#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (#357)

After checkout fresh develop, git confuses file with/without capital letter.
As a result this file can't be discarded.
`git config --global core.ignorecase true` doesn't work as well

* Update mmdet patch (#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (#362)

* Revert "Added ONNX Q-DQ converting parameters (#362)" (#368)

This reverts commit b0504e9.

* Beta directory (#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (#370)

* Fix missing default value (#373)

* Enable batch norm adaptation by default (#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (#372)

* Add pre post processing test  (#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (#375)

* Use a reduced number of BN adaptation samples for sanity testing (#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (#379)

There is a little chance that last data point may have a batch size equal to 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The dataset has 1000 data points.
There're 333 data points with batch_size=3 and the last one with batch_size=1. Training may fail in the end of epoch, which is not accepted for bigger datasets.

* Fix eval failures due to BN adaptation enabled by default (#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (#380)

* Fix object detection sample. (#383)

* Added Q-DQ ONNX converting parameter (#369)

* Links to models were updated (#386)

* include_mask flag for tfds decoder was added (#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (#388)

* change VOC dataset namings (#387)

* Configure device by common function for all samples (#391)

* Reduced num_init_samples for range init to accelerate sanity tests (#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (#393)

* Print flops pruning level in statistic (#367)

* Print flops pruning level in statistic

* Calculate current flops after update masks

* Fix: missed transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.linear

* Add comment for compute_flops_hook

* Add AutoML-based mixed-precision initialization mode - AutoQ (#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidth, prior to NNCF quantize-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (#397)

* Fix pruning l2norm (#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filter by algorithms for registred modules

* Add condition to add _registred_name in registred module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (#384)

* converted relative imports to absolute imports (#396)

* Add ac configs for pruned unet and ssd300 (#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (#398)

* Add some explanations to make doc clearer (#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (#404)

* Use links to config files for NNCF READMEs (#407)

* Combined package (#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (#405)

* Remove NNCF package dependency on tensorboard (#411)

* Small scheduler fixes (#412)

* Add step to pruning shedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogenous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (#415)

* using common registry (#414)

* fixed sanity tests for samples (#417)

* Common NNCFConfig (#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (#420)

* Fix NoCompressionAlgorithmBuilder (#426)

* fixed issues with paths (#425)

* 00.0:Updating NNCF github dockerfiles against last changes (#436)

* Change thresholds for pruned ssd300 (#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemae (#439)

Fixes: #416

* Use non-recursive BFS for graph traversal (#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all

* Add AC config for SSD300_mobilenet on voc. (#441)

* Minor fixes for HAWQ (#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected the printing of the chosen bitwidth configuration

* Init on same device by default (#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests
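
A minimal sketch of the device-inference idea behind "use model's own device" (the helper name is an assumption, not NNCF's actual API):

```python
import torch

def get_model_device(model: torch.nn.Module) -> torch.device:
    # Take the device of the first parameter, so initialization runs
    # wherever the user already placed the model (cpu, cuda:0, cuda:1, ...).
    try:
        return next(model.parameters()).device
    except StopIteration:
        return torch.device('cpu')  # parameter-less model: assume CPU
```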

* Remove extra entries from MANIFEST.in (#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (#450)

* Changed working logic with json metrics (#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (#454)

* Fix metric value for ssd300_mobilenet_voc. (#453)

* Do not follow symlinks when opening files (#451)

* Correctly construct Q-DQ config for E2E tests (#456)

* Update documentation for the v1.6.0 release (#457)

* Add torch.load warnings and path resolution (#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>

* accuracy aware draft

* refactor to introduce TrainingRunner for training loop control

* move accuracy aware loop to common

* address comments

* update accuracy aware

* add support for TF samples

* refactor keras API sample

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>
@vshampor vshampor deleted the merging_and_pre_hooks branch January 2, 2022 13:26
kshpv pushed a commit to kshpv/nncf that referenced this pull request Oct 11, 2022
* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init
kshpv added a commit to kshpv/nncf that referenced this pull request Oct 11, 2022
* Release v1.5.0 of NNCF to master (openvinotoolkit#254)

* Allow sharing activation quantizers in different graph points (openvinotoolkit#67)

* Update version and docs on develop (openvinotoolkit#77)

* Update 3rd party integration patches (openvinotoolkit#79)

* Doc updates (openvinotoolkit#84)

* Add info on export to Usage.md

* Fix third party headers

* Fix import in transformers patch (openvinotoolkit#85)

* Fix percentile per-channel init (openvinotoolkit#86)

Fixes: openvinotoolkit#83

* Omit nodes called during debugging from entering NNCF graph (openvinotoolkit#87)

* Enable custom range initializers for overriden scopes in schema (openvinotoolkit#89)

* Enable custom quantization configs and initializers for overriden scopes in schema

* code style

* remove range config duplication

* obsolete import

* Fix model saving in transformers patch (openvinotoolkit#91)

* Patch TracedTensor's __repr__ method instead of torch.Tensor's (openvinotoolkit#92)

* Fix mmdetection patch (openvinotoolkit#93)

* Update mmdetection patch to v2.3.0 (openvinotoolkit#95)

* Allow registering user modules as NNCF modules for weight quantization (openvinotoolkit#99)

* Assign latest tensor shape during ForwardTraceOnly() (openvinotoolkit#96)

* Enable GPT2 ops (openvinotoolkit#98)

* Fix HW config scenario with ops missing in HW config definition (openvinotoolkit#94)

* Fix input quantization in case of embeddings (openvinotoolkit#97)

* Added sanity tests for third party integration (openvinotoolkit#45)

* Expose quantizer linking through config (openvinotoolkit#100)

* Add citing section to frontpage README (openvinotoolkit#103)

* Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104)

* Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105)

* Update transformers to v3.0.2 (openvinotoolkit#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109)

* Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108)

* Add quantization metric (openvinotoolkit#33)

* Make HW config parsing conform to the implicit rules (openvinotoolkit#111)

(except for the "any supported quantization for ops without specified
quantization configs" rule, since that requires config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (openvinotoolkit#113)

* Use sequential sampling for evaluation across example scripts (openvinotoolkit#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (openvinotoolkit#115)

* Properly handle ops in HW config without quantization configs associated (openvinotoolkit#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.
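
A hypothetical sketch of the merge rule described above; the `WILDCARD` sentinel and the config representation are assumptions for illustration:

```python
DEFAULT_CONFIG = {'bits': 8, 'mode': 'symmetric'}
WILDCARD = object()  # stands in for the "wildcard" propagating quantizer

def merge_configs(cfg_a, cfg_b):
    # A wildcard adopts whatever the other side requires; two concrete
    # configs only merge if they are identical.
    if cfg_a is WILDCARD:
        return cfg_b
    if cfg_b is WILDCARD:
        return cfg_a
    return cfg_a if cfg_a == cfg_b else None  # None => cannot merge

def finalize(cfg):
    # A wildcard that survives propagation unmerged falls back
    # to the default quantization config.
    return DEFAULT_CONFIG if cfg is WILDCARD else cfg
```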

* Make criterion optional in signature of register_default_init_args() (openvinotoolkit#121)

* make criterion optional in the signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (openvinotoolkit#122)

* Fix pretrained (openvinotoolkit#125)

* Mark Convs as non-depthwise for 1 input channel case (openvinotoolkit#126)

* Add non-RELU activations to fusable patterns (openvinotoolkit#124)

* Fixed Pylint warnings (openvinotoolkit#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132)

* Add per layer initialization of  ranges. (openvinotoolkit#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (openvinotoolkit#138)

* Fix PyLint. (openvinotoolkit#139)

* Introduced compression ratio parameter for Mixed Precision init (openvinotoolkit#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully-INT8 one.
Total compression for the model is the sum of compression over each quantized layer, where a layer's (Conv, Deconv, Linear) contribution is its FLOPS multiplied by the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth (see the sketch below).
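
As a sketch of that calculation (the helper and the orientation of the ratio are illustrative assumptions, not the exact implementation):

```python
def compression_ratio(layers):
    # `layers`: (flops, bits) pairs for each quantized Conv/Deconv/Linear.
    mixed = sum(flops * bits for flops, bits in layers)
    int8 = sum(flops * 8 for flops, _ in layers)
    return int8 / mixed

# Two layers quantized to 4 bits and one kept at 8 bits:
print(compression_ratio([(1e9, 4), (2e9, 4), (0.5e9, 8)]))  # 1.75
```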

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)

* Set data loading workers to 0 across tests to force single process (openvinotoolkit#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (openvinotoolkit#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)

* Fix command line creation for test_compression_training (openvinotoolkit#167)

* Improve eval test code (openvinotoolkit#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)

* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)

* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)

* Test stability (openvinotoolkit#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (openvinotoolkit#178)

* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)

* Corrected default values in range initializers (openvinotoolkit#183)

- Proper minimum and maximum values for mean_min_max no longer skip the check for uncollected statistics, which prevents initialization with inf values.
- Percentile init no longer crashes by default
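
A sketch of the mean_min_max aggregation this fixes, assuming per-batch minima/maxima were collected during init (names are illustrative):

```python
import torch

def mean_min_max_range(per_batch_mins, per_batch_maxs):
    # Refuse to fall back to the +/-inf placeholders when no statistics
    # were collected; otherwise average the per-batch extrema.
    if not per_batch_mins or not per_batch_maxs:
        raise RuntimeError('No statistics collected for range init')
    input_low = torch.stack(per_batch_mins).mean(dim=0)
    input_high = torch.stack(per_batch_maxs).mean(dim=0)
    return input_low, input_high
```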

* Refactor imports in setup.py (openvinotoolkit#182)

Important for CI

* Fix security issues with imports (openvinotoolkit#185)

* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)

* Build graphs within the torch.no_grad() context (openvinotoolkit#187)

Should reduce memory usage during create_compressed_model
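
The idea, in a minimal form (the function name is illustrative):

```python
import torch

def build_graph(model: torch.nn.Module, dummy_input: torch.Tensor):
    # Tracing forward passes for graph construction do not need autograd,
    # so no_grad() avoids storing intermediate activations for backward.
    with torch.no_grad():
        model(dummy_input)  # tracing hooks record the executed ops here
```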

* Fix security issues directly in code (openvinotoolkit#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (openvinotoolkit#190)

* Make default install support non-GPU cases (openvinotoolkit#193)

* Fixed backward compatibility test (openvinotoolkit#195)

* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

This will manifest itself when there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)

* Add more security fixes (openvinotoolkit#197)

* Removed double logging to stdout (openvinotoolkit#198)

* ignore frozen layers during filter pruning (openvinotoolkit#200)

* Use latest matplotlib version (openvinotoolkit#206)

* Use propagation based mode by default (openvinotoolkit#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Update documentation (openvinotoolkit#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* To fix cpu_only part (openvinotoolkit#221)

* Update the cpu_only dockerfile; fix issue with setup.py install with the --cpu-only opt; fix README.md

* apply remarks

* Fix register_operator (openvinotoolkit#224)

* Add per-layer sparsity. (openvinotoolkit#127)

* Do not call _quantize_inputs for propagation based mode (openvinotoolkit#229)

* Consistent bitwidth for activations and weight in propagation mode (openvinotoolkit#191)

* Added sota eval tests via AC (openvinotoolkit#142)

* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)

* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)

* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)

* Add support sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)

* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)

* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)

* Fix threesigma init (openvinotoolkit#240)

* Build extensions in a temporary folder (openvinotoolkit#239)

* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241)

fix quantization range initialization in the case of a single scale channel, to avoid initializing from only one slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...); see the sketch below
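
A sketch of the fixed behavior, under the assumption that the scale channel corresponds to dim 0 (names are illustrative):

```python
import torch

def min_max_for_init(x: torch.Tensor, num_scale_channels: int):
    if num_scale_channels == 1:
        # Single scale channel: reduce over the WHOLE tensor, not just x[0]
        return x.min(), x.max()
    # Per-channel case: one (min, max) pair per channel
    flat = x.reshape(x.shape[0], -1)
    return flat.min(dim=1).values, flat.max(dim=1).values
```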

* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (openvinotoolkit#220)

* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)

* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)

* Extend input handling flexibility (openvinotoolkit#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (openvinotoolkit#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make the formulas depend on only one value - levels - thereby reducing the chance of mistakes

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges
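
A sketch of deriving the level bounds from `levels` alone; the exact convention of the actual calculate_level_ranges may differ, this is the common mapping for an even number of levels:

```python
def calculate_level_ranges(levels: int, signed: bool):
    # levels=256, signed=True  -> (-128, 127)
    # levels=256, signed=False -> (0, 255)
    if signed:
        return -(levels // 2), levels // 2 - 1
    return 0, levels - 1
```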

* Fix bug with export to Q/DQ (openvinotoolkit#248)

* Fix bug with export to Q/DQ

Add a workaround in export processing for our old checkpoints.
Add exception raising when exporting per-channel Q/DQ layers, as PyTorch
ONNX export supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (openvinotoolkit#253)

* Update documentation for release v1.5.0 (openvinotoolkit#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Release v1.6.0 of NNCF to master (openvinotoolkit#461)

* Fix input quantization in case of embeddings (openvinotoolkit#97)

* Added sanity tests for third party integration (openvinotoolkit#45)

* Expose quantizer linking through config (openvinotoolkit#100)

* Add citing section to frontpage README (openvinotoolkit#103)

* Fix bad rebase in asymmetric quantization ONNX export (openvinotoolkit#104)

* Use default quantizer configuration for op weights not specified in HW config (openvinotoolkit#105)

* Update transformers to v3.0.2 (openvinotoolkit#107)

* Fix symmetric quantizer per-channel init for max values close to 0 (openvinotoolkit#109)

* Add unified scales in HW config operation (via quantizer linking) (openvinotoolkit#108)

* Add quantization metric (openvinotoolkit#33)

* Make HW config parsing conform to the implicit rules (openvinotoolkit#111)

(except for the "any supported quantization for ops without specified
quantization configs" rule, since that requires config wildcarding,
to be implemented as a follow-up)

* Fix MobileNetV2 INT8 config (openvinotoolkit#113)

* Use sequential sampling for evaluation across example scripts (openvinotoolkit#114)

Hopefully this will make nightly compression training "eval" tests
more stable.

* Fix third_party_sanity tests (openvinotoolkit#115)

* Properly handle ops in HW config without quantization configs associated (openvinotoolkit#119)

These get associated with a "wildcard" propagating quantizer, which
will either get merged with any other quantizer during propagation,
or get assigned a default quantization config.

* Make criterion optional in signature of register_default_init_args() (openvinotoolkit#121)

* make criterion optional in the signature of register_default_init_args()

* update README.md as Vasiliy asked

* Add Googlenet with pruning configs  (openvinotoolkit#122)

* Fix pretrained (openvinotoolkit#125)

* Mark Convs as non-depthwise for 1 input channel case (openvinotoolkit#126)

* Add non-RELU activations to fusable patterns (openvinotoolkit#124)

* Fixed Pylint warnings (openvinotoolkit#129)

* Fix bug with CompositeCompressionAlgorithmController export_model() signature (openvinotoolkit#132)

* Add per layer initialization of  ranges. (openvinotoolkit#116)

* Add prepare_for_export() to commit pre export for CompressionAlgortihmController; Update for CompositeCompressionAlgorithmController (openvinotoolkit#138)

* Fix PyLint. (openvinotoolkit#139)

* Introduced compression ratio parameter for Mixed Precision init (openvinotoolkit#133)

* Introduced compression ratio parameter for Mixed Precision init

It's used for choosing the optimal mixed-precision configuration for a given ratio.

The compression ratio of mixed-precision quantization is calculated relative to the fully-INT8 one.
Total compression for the model is the sum of compression over each quantized layer, where a layer's (Conv, Deconv, Linear) contribution is its FLOPS multiplied by the number of bits used for its quantization. The ratio is used to estimate the performance boost of the quantized model; it's a better proxy for the amount of computation than the number of parameters multiplied by bitwidth.

* Added link to the full configuration file with template usage

* disclaimer about model specific params in template

* corrected articles, contractions, mixed precision-> mixed-precision

* Fix bug with NoCompressionAlgorithmController (openvinotoolkit#150)

* Set data loading workers to 0 across tests to force single process (openvinotoolkit#162)

* Set data loading workers to 0 across tests to force single process

Could fix the consequences of pytorch/pytorch#39570

* Remove more-itertools dependency

* Specify NNCF import order in docs (openvinotoolkit#161)

* Specify NNCF import order in docs

* Fix frontpage integration instructions

* Bump mmdetection version to 2.4.0 (openvinotoolkit#166)

* Fix command line creation for test_compression_training (openvinotoolkit#167)

* Improve eval test code (openvinotoolkit#160)

* Fix bug with different torch devices in get_scale_zp_from_input_low_input_high (openvinotoolkit#158)

* Fix third_party_sanity and eval test bugs (openvinotoolkit#169)

* Fix mmdetection dataset search path for SSD (openvinotoolkit#176)

* Test stability (openvinotoolkit#179)

* Increase eval threshold for test_compression_training cases

CUDA computation seems to inherently cause differences of at least
0.01% in accuracy metric computation between the train and eval
runs

* Reduce batch size for SSD512 eval CI runs (avoid OOM)

* Renamings (openvinotoolkit#178)

* Fixed disabling gradients of quantizers for HAWQ (openvinotoolkit#184)

* Corrected default values in range initializers (openvinotoolkit#183)

- Proper minimum and maximum values for mean_min_max no longer skip the check for uncollected statistics, which prevents initialization with inf values.
- Percentile init no longer crashes by default

* Refactor imports in setup.py (openvinotoolkit#182)

Important for CI

* Fix security issues with imports (openvinotoolkit#185)

* Fix paths to COCO in mmdetection third party sanity tests (openvinotoolkit#186)

* Build graphs within the torch.no_grad() context (openvinotoolkit#187)

Should reduce memory usage during create_compressed_model

* Fix security issues directly in code (openvinotoolkit#189)

* Return zero-valued torch.Tensor in CompressionLoss by default instead of int (openvinotoolkit#190)

* Make default install support non-GPU cases (openvinotoolkit#193)

* Fixed backward compatibility test (openvinotoolkit#195)

* Improve quantizer setup for hanging batchnorm nodes (openvinotoolkit#192)

* Do not merge subgraphs if subgraph has more than one output node

* Mark BatchNorm as INPUTS_QUANTIZABLE by default

This will manifest itself when there is a batch norm operation that
was not merged into any previous op, i.e. one that should accept quantized
input instead of FP32

* Fix export for nodes with metatypes not redefined by pruning algo (openvinotoolkit#171)

* Add more security fixes (openvinotoolkit#197)

* Removed double logging to stdout (openvinotoolkit#198)

* ignore frozen layers during filter pruning (openvinotoolkit#200)

* Use latest matplotlib version (openvinotoolkit#206)

* Use propagation based mode by default (openvinotoolkit#181)

* Set propagation_based mode by default.

* Fix compressed graphs.

* Fix quantize inputs  option.

* Add operator metatypes for 'sigmoid' and 'add' operator (openvinotoolkit#209)

* Add operator metatypes for 'sigmoid' and 'add' operator

* remove trailing spaces

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Grouping of pruning modules + clusterisation classes

* Small fixes

* Introduced `enabled` parameter for Quantizers (openvinotoolkit#194)

Also:
* corrected script to add new quantization parameters to checkpoints
* added warning on exporting disabled quantizations
* print statistics about enabled quantizers by default

* Added model analysis file

* Update documentation (openvinotoolkit#219)

* Update documentation.

* Update docs. Add dependencies for param to json schema.

* Fixes for grads + batch norms

* To fix cpu_only part (openvinotoolkit#221)

* Update the cpu_only dockerfile; fix issue with setup.py install with the --cpu-only opt; fix README.md

* apply remarks

* Fix register_operator (openvinotoolkit#224)

* Add per-layer sparsity. (openvinotoolkit#127)

* Do not call _quantize_inputs for propagation based mode (openvinotoolkit#229)

* Consistent bitwidth for activations and weight in propagation mode (openvinotoolkit#191)

* Added sota eval tests via AC (openvinotoolkit#142)

* Refactored HAWQ: split functionality into separate files (openvinotoolkit#232)

* Allow quantizing modules that share their weights for multiple operations (openvinotoolkit#235)

* Filter quantizers that directly act upon integer inputs (openvinotoolkit#228)

* Add support sparsity freeze epoch for magnitude sparsity. (openvinotoolkit#218)

* Liberal bitwidth assignment mode by default on precision initialization (openvinotoolkit#222)

* Fix AdaptiveSparsityScheduler. (openvinotoolkit#236)

* Fix threesigma init (openvinotoolkit#240)

* Build extensions in a temporary folder (openvinotoolkit#239)

* Refactoring + added step with model analysis

* Criterion generalization for HAWQ algorithm (openvinotoolkit#230)

* Criterion generalization for HAWQ algorithm

* scope_node -> node_scope

* Documentation update

* Described in docs when to use additional parameter 'criterion_fn'

* Fixes for pruning info

* fix quantization range initialization in case of 1 scale channel (openvinotoolkit#241)

fix quantization range initialization in the case of a single scale channel, to avoid initializing from only one slice of the data (data[0]) while ignoring the rest (data[1], data[2], ...)

* Patch Semantic Segmentation Application to export onnx and test with resume flag (openvinotoolkit#244)

Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>

* Add DW-conv to input quantizable op. (openvinotoolkit#220)

* Fixed skip Openvino tests and preinstall (openvinotoolkit#246)

* Small cleanup + refactoring

* Corrected handling of barrier on the graph traverse (openvinotoolkit#249)

* Extend input handling flexibility (openvinotoolkit#242)

* Handle inputs better using input_infos

* Update nncf/model_creation.py

* Corrected handling Inception outputs in classification sample (openvinotoolkit#251)

* Change quantization levels for SymmetricQuantizer from 255 to 256 (openvinotoolkit#225)

* Change quantization levels for SymmetricQuantizer from 255 to 256

* Update test_functions with new level

* Fix bug with weights range; make the formulas depend on only one value - levels - thereby reducing the chance of mistakes

* Fix PyLint

* Update HW configs with new quantization level_low

* Fix bug with float type

* Change type() to isinstance()

* Change return values order in calculate_level_ranges

* step 1

* Fix bug with export to Q/DQ (openvinotoolkit#248)

* Fix bug with export to Q/DQ

Add a workaround in export processing for our old checkpoints.
Add exception raising when exporting per-channel Q/DQ layers, as PyTorch
ONNX export supports only per-tensor.

* Fix Pylint

* Update layers.py

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

* Fix bug in AsymmetricQuantizer export; Add tests

* Fix pylint

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Update results and links to the checkpoints (openvinotoolkit#253)

* Update documentation for release v1.5.0 (openvinotoolkit#252)

* Update documentation for release v1.5.0

* Corrected HAWQ documentation

* Add per-range initialization notes

Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>

* Add Mask-RCNN-R50FPN-INT8 config for mmdetection (openvinotoolkit#174)

* rebase

* add third-party sanity tests for Mask-RCNN IS model

* add Mask-RCNN accuracy results to tables

* fix link in README

* add instance segmentation ref to README

* fix voc path

* fix retinanet config

* Update version.py

* Fixed old tests

* Add test for pruning groups checks

* Fix pylint + small cleanup

* More clarification about `bits` parameter in docs (openvinotoolkit#263)

* make customer happy to see param name that is wrong (openvinotoolkit#259)

* kernel changes

* Add pruning sample tests. (openvinotoolkit#268)

* Change an operation order in create_compressed_model (openvinotoolkit#265)

* Introduce additional evaluation of loss function to SSD application

* Expanded table, skipped unsupported models (openvinotoolkit#234)

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>

* Mlflow log (openvinotoolkit#243)

* mlflow logging

* something

* some changes

* Some fixes and clear up

* Symbolic link update

* Final Updates

* Little fixes

* Little fixes(one more)

* Test mlflow off

* Deleted hardcoded log dir

* Generalization

* Clear up

* Fixes

* code fixes

* Common classification functions factored out

* Metrics logging changes

* Fix comments

* Fix pylint

* Fix pylint

* Fix last linter warnings

* Cpu nms kernels replaced by torch func

* Extended test for model analysis

* Clean up

* Small pylint + comments fixes

* Fix gradients zeroing + prune batch norms by default

* Fix prune batch norm default

* Fix test

* is cuda

* Compress in eval mode (openvinotoolkit#257)

* Pruning of ConvTranspose (openvinotoolkit#274)

* Add pruning of ConvTranspose

* Rename to target_weight_dim_for_compression

* fixes

* Fix zero_grad

* get_op_types_of_pruned_modules

* Fixed collecting metrics.json for incomplete eval test (openvinotoolkit#279)

* Added Unet Mapillary AC configs (openvinotoolkit#281)

* Added flag for collection quickly computed stats (openvinotoolkit#287)

* Remove __getattr__ from SampleConfig (openvinotoolkit#292)

Newer `addict` versions use custom private attributes for their internal
workings, and __getattr__ disrupted this. It was quite useless anyway.
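
A minimal illustration of the conflict (not the actual SampleConfig code): a __getattr__ that treats every missing attribute as a dict key also intercepts the private attributes that newer `addict` relies on internally, raising KeyError where AttributeError is expected:

```python
class BadConfig(dict):
    def __getattr__(self, name):
        return self[name]  # convenient for cfg.lr, but too greedy

cfg = BadConfig(lr=0.1)
print(cfg.lr)  # 0.1
try:
    cfg._internal_attr  # the kind of lookup library internals perform
except KeyError:
    print('KeyError instead of the AttributeError callers expect')
```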

* Fix H/W on an image in the mock coco dataset (openvinotoolkit#291)

* Set proper workdir path for Mask-RCNN (openvinotoolkit#294)

* Proper BN momentum parameter and train mode setting in BN adaptation (openvinotoolkit#288)

* proper BN momentum parameter and train mode setting in BN adaptation

* use training mode switcher context manager for BN adaptation inference
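
A simplified sketch of such a training-mode switcher (a robust version would record and restore each submodule's mode individually):

```python
import contextlib
import torch

@contextlib.contextmanager
def training_mode_switcher(model: torch.nn.Module, is_training: bool = True):
    original_mode = model.training
    try:
        model.train(is_training)  # BN layers update running stats here
        yield
    finally:
        model.train(original_mode)

# with training_mode_switcher(model):
#     for batch in adaptation_loader:
#         model(batch)  # forward passes only; BN momentum drives the update
```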

* Testing OPs quantization by synthetic tests (openvinotoolkit#297)

Also
* Made LeakyReLU an input_quantizable OP
* Removed extra dot-files for ManyNonEvalModules test case

* Revised mixed-precision related content (openvinotoolkit#300)

* Moved mixed_precision configs to a separate folder
* Minimized the set of parameters in these configs, removing as many as possible and letting them take their default values.

* Remove .dot extension in the HW config test case descriptor (openvinotoolkit#303)

* Switch to VOC2012 in eval mode (openvinotoolkit#295)

* Updated pruning configs and results (openvinotoolkit#305)

* Don't call MLFlow if it's not enabled (openvinotoolkit#304)

Required to avoid mlflow.exceptions.MlflowException: Could not create run under non-active experiment with ID 0.
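
A hypothetical sketch of such a guard (the wrapper class and `enabled` flag are assumptions for illustration):

```python
class SafeMLFlow:
    """Forward calls to mlflow only when logging was explicitly enabled,
    so nothing ever touches the inactive default experiment (ID 0)."""
    def __init__(self, enabled: bool):
        self._enabled = enabled

    def __getattr__(self, name):
        if not self._enabled:
            return lambda *args, **kwargs: None  # silently no-op
        import mlflow
        return getattr(mlflow, name)

# log = SafeMLFlow(enabled=False)
# log.log_metric('accuracy', 76.1)  # no-op; no MlflowException raised
```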

* Add input/output-names parameters to export_model function. (openvinotoolkit#296)

* Fixed paths to mixed-precision configs (openvinotoolkit#306)

* Correct configs for mixed precision models (openvinotoolkit#307)

After openvinotoolkit#300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy

* Removed custom SqueezeNet model for better user experience (openvinotoolkit#308)

* Correct configs for mixed precision models

After openvinotoolkit#300 the *_hawq.json configs are propagation-based, but the checkpoints are still for pattern-based quantization settings.
That's why manual configs should be used to achieve the target accuracy

* Removed custom SqueezeNet model for better user experience

Originally we had a modified copy of the SqueezeNet model to work around a bug in the ONNX exporter when converting MaxPool with ceil_mode=True.
This bug is no longer relevant as of torch 1.5, and an almost identical SqueezeNet model is available in torchvision > 0.6.
That's why the custom SqueezeNet was deleted as no longer needed, to remove confusion.

There are no changes in the corresponding NNCF graph.
Previously trained checkpoints for the custom SqueezeNet can be loaded and evaluated with the SqueezeNet from torchvision. The INT8 model has the same accuracy; the mixed-precision model differs by at most ~0.01.

* Added ResNet-18 magnitude Filter Pruning config and snapshot (openvinotoolkit#311)

* Added ResNet-18 magnitude Filter Pruning config and snapshot

* Adjusted checkpoint validation

* Move call epoch_step() method to begin of epoch. (openvinotoolkit#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename value of target_device from "NONE" to "TRIAL" (openvinotoolkit#314)

* Move call epoch_step() method to begin of epoch. (openvinotoolkit#231)

* Move call epoch_step() method to begin of epoch.

* Move sparsity_init parameter to algo logic.

* Fix some sanity sample tests for semantic segmentation.

* Fix object detection example.

* Update docs.

* Fix per_step option scheduler. Refactoring.

* Rename target_device "NONE" to "TRIAL".

* Fix NMS CUDA extensions import for CPU only case (openvinotoolkit#316)

* Made initialization depend on the number of samples. (openvinotoolkit#309)

* Wrapped MLFlow for safe access (openvinotoolkit#313)

* Introduced a separate batch size for initialization (openvinotoolkit#315)

* Separate data_loader is registered for initialization via `register_default_init_args`

* WA for Python 3.6 on CI (openvinotoolkit#321)

* Use mock 32x32 dataset instead of actual CIFAR for sanity test runs (openvinotoolkit#322)

* Show subprocess log in test assertion stacktrace (openvinotoolkit#325)

* Adjust ICNet compressed target values (openvinotoolkit#326)

* Do not replace  parameter during symmetric range init (openvinotoolkit#327)

The initialization using the controller method may occur *after*
the optimizer received the list of model's parameters, so replacing
the parameter as a whole during such initialization will break the
gradient updates.
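
A sketch of the in-place alternative (`scale` is an assumed attribute name for illustration):

```python
import torch

def init_scale_inplace(quantizer: torch.nn.Module, new_scale: torch.Tensor):
    # Keep the very Parameter object the optimizer already holds:
    with torch.no_grad():
        quantizer.scale.copy_(new_scale)
    # NOT: quantizer.scale = torch.nn.Parameter(new_scale)
    # (rebinding orphans the tensor previously registered with the optimizer)
```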

* Increase number of epochs in sanity test runs (openvinotoolkit#324)

Should uncover more bugs.

* Replace the rest of num_init_steps entries with num_init_samples (openvinotoolkit#328)

* Use PyTorch 1.7 (openvinotoolkit#223)

* Move epoch_step and step to the beginning of epoch for staged worker (openvinotoolkit#318)

* Use torch 1.7.0 for third party sanity tests (openvinotoolkit#333)

* Fix mixing cyrillic and latin letters (openvinotoolkit#335)

* Fix calculate statistics in local mode sparsity. (openvinotoolkit#337)

* Fk/update packages versions (openvinotoolkit#338)

* Adding definitive versions of required packages, move to python3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only.fix01

* Update accuracy target values after switching to torch 1.7.0 (openvinotoolkit#334)

* Change tensorboardX to pytorch.utils.tensorboard (openvinotoolkit#332)

* Change tensorboardX to tensorboard

* Add tensorboard version

* Add domain in onnx-model for custom operations. (openvinotoolkit#323)

* Corrected grouping of activation quantizers (openvinotoolkit#339)

Unmerged activation FQs should be placed in different groups if an unmerged activation FQ on a branch comes directly after another activation FQ (a common input for different branches):
start->FQ_A  Conv
        \   /
       POST_HOOK
         /    \
  PRE_HOOK    PRE_HOOK
    |           \
  div          MaxPool   here->|FQ_A|
                  \     /
                POST_HOOK

* Adjust thresholds due to new torchvision FP32 checkpoints acc. drop (openvinotoolkit#342)

* Changed AC configs for SSD models (openvinotoolkit#341)

* Revert "Fk/update packages versions (openvinotoolkit#338)" (openvinotoolkit#343)

This reverts commit 8c17e0c.

* Fk/update packages versions (openvinotoolkit#344)

* Adding definitive versions of required packages, move to python3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only.fix01

* Add to requirement for pandas

* Adding definitive versions of required packages, move to python3.8, update README

* Add definitive versions of packages only

* Add definitive versions of packages only.fix01

* Add to requirement for pandas

* Fix mistake in tensorboard name

* Fix per-layer sparsity. Add stub scheduler. (openvinotoolkit#340)

* fix config path (openvinotoolkit#346)

* Add Embedding to the CPU HW config definition (openvinotoolkit#347)

* Added separate execution of OV tests to start parallelizing (openvinotoolkit#282)

* Remove no_empty_cache in an attempt to fix sporadic CI failures (openvinotoolkit#348)

* Add an option to optimize logarithms of quantizer scales instead of scales directly (openvinotoolkit#329)

* add scale_log parameter for quantization; it allows increasing convergence speed for high scales and improving accuracy for low scales (see the sketch after this list).

* add _ to make some variable "hidden"

* variant of setter for scale

* add setter for input_range for asymmetric quantizer

* scale_log_flag is used outside to print status, so I've brought back .scale_log_flag instead of ._scale_log_flag

* made scale_log_flag read only

* add test for scale_log parameter.

* Update test_scale_log.py

* add missing-key check for load_state_dict
whitespace fixes

* remove quantizer.scale = torch.nn.Parameter() to avoid torch error

* fix test_unified_scales_are_identical_in_onnx failure caused by being unable to set a Parameter via a property

* remove useless init method

* split long line

* fix test_unified_scales

* Update test_scale_log.py

* update ref file by replacing scale -> _scale_tensor

* Update README.md

* Update README.md

* Update layers.py

* fix HookAutoRemove

* Improvements
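
A minimal sketch of the scale_log idea referenced above (class and attribute names are illustrative): the trainable tensor stores log(scale), so gradient steps act multiplicatively on the scale itself, which helps both very large and very small scales converge.

```python
import torch

class LogScaleParam(torch.nn.Module):
    def __init__(self, init_scale: float):
        super().__init__()
        # Train log(scale) rather than scale directly
        self._scale_log = torch.nn.Parameter(
            torch.tensor(float(init_scale)).log())

    @property
    def scale(self) -> torch.Tensor:
        return self._scale_log.exp()

    @scale.setter
    def scale(self, value):
        with torch.no_grad():
            self._scale_log.copy_(
                torch.as_tensor(value, dtype=torch.float).log())
```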

Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>

* Fixed protobuf error (openvinotoolkit#349)

* Add quantization support for nn.EmbeddingBag (openvinotoolkit#330)

* Add quantization support for nn.EmbeddingBag

* Add EmbeddingBagMetatype to DEFAULT_QUANT_TRAIT_TO_OP_DICT

* Add synthetic model quantization for nn.Embedding/EmbeddingBag and F.embedding_bag

* Remove duplicated synthetic model test of nn.Embedding

* Add EmbeddingBag to the CPU HW config definition

* replace TorchBinaryMethodDesc test of F.embedding_bag with SingleLayerModelDesc

* Handle network input nodes to NNCFEmbeddingBag

* Fix pylint warnings

* Vpu config revision (openvinotoolkit#356)

* Revised VPU config

* More extreme ratio for VPU config to test INT2 bitwidth assignment

Also updated reference graphs

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>

* Renamed case-sensitive files to prevent git issue on Windows (openvinotoolkit#357)

After checking out a fresh develop, git confuses files whose names differ only in capitalization.
As a result such a file can't be discarded.
`git config --global core.ignorecase true` doesn't help either

* Update mmdet patch (openvinotoolkit#354)

* Update mmdet patch

* Update configs and meta

* Add export tests

* Update test

* Update package installation

* Compression statistics before training (openvinotoolkit#345)

* Compression statistics before training

* Compression statistics before training

* print_statistics sanity test

* Object detection test fixes

* is_main_process aligning

* pylint disabling

* Pruning refactoring to work with FLOPs target too (openvinotoolkit#320)

* Added pruning_flops_target param and all necessary functions

* Added tests

* Pylint fixed

* Fixed comments: BatchNorm deleted from flops calculations and small refactoring

* Fix tests

* Delete bias from FLOPs calc + test reverting

* Fix bug with mmdet patch (openvinotoolkit#363)

* Fix bug with mmdet patch

* Fix bugs

* Fix pylint

* Added ONNX Q-DQ converting parameters (openvinotoolkit#362)

* Revert "Added ONNX Q-DQ converting parameters (openvinotoolkit#362)" (openvinotoolkit#368)

This reverts commit b0504e9.

* Beta directory (openvinotoolkit#364)

* create beta directory with the experimental implementation of the Neural Network Compression Framework for TensorFlow (NNCF TF)

* update documentation

* updated checkpoint links

* nncf-tensorflow alpha

* Use PyLint 2.6+ (openvinotoolkit#370)

* Fix missing default value (openvinotoolkit#373)

* Enable batch norm adaptation by default (openvinotoolkit#360)

* Remove immediate failure when trying to use NNCF with torch 1.5.0 (openvinotoolkit#372)

* Add pre post processing test  (openvinotoolkit#374)

* Fix missing default value

* Add pre_post processing tests

* Relax upper-bound threshold for mixed precision ResNet50 (openvinotoolkit#375)

* Use a reduced number of BN adaptation samples for sanity testing (openvinotoolkit#378)

* Dropped last data point in all DataLoaders to prevent issue with BN (openvinotoolkit#379)

There is a small chance that the last data point has a batch size equal to 1, which leads to an error:
```
ValueError: Expected more than 1 value per channel when training
```
We caught this error in sanity tests with CIFAR10. The dataset has 1000 data points:
333 batches with batch_size=3 and a last one with batch_size=1. Training may fail at the end of an epoch, which is not acceptable for bigger datasets.
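
The fix, reproduced as a self-contained check (a sketch; the actual samples simply set drop_last on their DataLoaders):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 1000 samples with batch_size=3 leave a final batch of size 1;
# drop_last=True discards it so BatchNorm never sees a 1-sample batch.
dataset = TensorDataset(torch.randn(1000, 3, 32, 32))
loader = DataLoader(dataset, batch_size=3, drop_last=True)
assert all(batch.shape[0] == 3 for (batch,) in loader)
```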

* Fix eval failures due to BN adaptation enabled by default (openvinotoolkit#377)

* Reduce BN adaptation samples count in HAWQ sanity configs (openvinotoolkit#380)

* Fix object detection sample. (openvinotoolkit#383)

* Added Q-DQ ONNX converting parameter (openvinotoolkit#369)

* Links to models were updated (openvinotoolkit#386)

* include_mask flag for tfds decoder was added (openvinotoolkit#385)

* include_mask flag for tfds decoder was added

* Support of the input_info param was added (openvinotoolkit#388)

* change VOC dataset namings (openvinotoolkit#387)

* Configure device by common function for all samples (openvinotoolkit#391)

* Reduced num_init_samples for range init to accelerate sanity tests (openvinotoolkit#392)

* Basic progress bar to avoid multiprocessing issue with tqdm(DataLoader) (openvinotoolkit#390)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Basic progress bar to avoid multiprocess issue with tqdm(DataLoader)

* Add pruned ssd300 and unet_mapillary (openvinotoolkit#393)

* Print flops pruning level in statistic (openvinotoolkit#367)

* Print flops pruning level in statistic

* Calculate current flops after update masks

* Fix: missed transpose convolution

* add test_calculation_of_flops

* Fix compute_flops_hook for nn.linear

* Add comment for compute_flops_hook

* Add AutoML-based mixed-precision initialization mode - AutoQ (openvinotoolkit#250)

* Adaptation of MIT HAN Lab's HAQ: Hardware-Aware Automated Quantization with Mixed Precision

* Introduce a Deep Reinforcement Learning algorithm (DDPG) to learn and
  initialize layer-wise quantization bitwidths, prior to NNCF quantization-aware fine-tuning

* The mixed-precision initialization is optimized towards minimal accuracy drop given
  a user-specified model size constraint

* Supported precision depends on target HW (VPU 8/4/2) or user-specified precision space

* Fix path to unet_mapillary_pruning_geometric_median checkpoint (openvinotoolkit#397)

* Fix pruning l2norm (openvinotoolkit#310)

* Fix pruning l2norm

* Use register_module for l2norm

* Add filter by algorithms for registred modules

* Add condition to add _registred_name in registred module

* resolve comments

* fix pylint

* Update reference dot files

* Separate the examples and test Python package requirements from NNCF (openvinotoolkit#384)

* converted relative imports to absolute imports (openvinotoolkit#396)

* Add ac configs for pruned unet and ssd300 (openvinotoolkit#399)

* Add ac configs for pruned unet and ssd300

* Add batch 32 for ssd300_vgg_voc_pruning_geometric_median

* Added proper license for DDPG-related code (openvinotoolkit#398)

* Add some explanations to make doc clearer (openvinotoolkit#395)

* Add some explanations to make doc clearer

* docs cleanup

Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>

* Simplify paths to configs (openvinotoolkit#400)

* Path to config was fixed

* Paths to configs were simplified

* Add ssd_mobilenet_voc_sparsity_int8 config (openvinotoolkit#404)

* Use links to config files for NNCF READMEs (openvinotoolkit#407)

* Combined package (openvinotoolkit#410)

* beta.nncf package

* removed pytest.ini

* Return pandas to the list of requirements (openvinotoolkit#405)

* Remove NNCF package dependency on tensorboard (openvinotoolkit#411)

* Small scheduler fixes (openvinotoolkit#412)

* Add step to pruning schedulers and algo + delete redundant pruning rate setting

* Fix tests

* Revert same pruning rate changes

* Add pruning_init in test_calculation_of_flops

Co-authored-by: Kaglinskaya <maria.kaglinskaya@intel.com>

* [TF] Minor fixes (openvinotoolkit#403)

* Minor fixes

* Pylint issues were fixed

* Extra line was removed

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>

* [TF] Add handling of non-distributed strategy (openvinotoolkit#401)

* Default strategy was added

* cpu-only flag was disabled for Mask R-CNN training

* Fixed non-distributed mode for the object detection sample

* Merging and pre hooks (openvinotoolkit#302)

* Add pre-hook functionality to quantization

* Add quantizer merging logic to the propagation mode

* Properly update and merge quantizers between quantizable layers

* Move adjacent quantizer group creation closer to the builder stage

* Store affected op node key in the propagating quantizer

* Refactor quantization to jointly quantize weights and activations

* Fix clearing constraint sets during liberal activation bitwidth assignment

* Add initial version of build-time range init

* Make HAWQ work with heterogeneous quantizer configurations

* Finalize the switch to build-time range init

* Properly compare quantizer configs for requantization purposes

* Fix quantizer ordering once again

* Improve HAWQ bitwidth reference graph formatting

* Add NNCF network clean view tests

* Fix errors

* Use statistics approach for the runtime range init

* Add tests for separate statistic collectors

* Extend range init setting tests

* Fix rebasing issues

* Switch AutoQ to setting compatible configs instead of bitwidths

* Ref HAWQ file adjustments after fixing experimental controller init

* Relax requirements packages versions (openvinotoolkit#415)

* using common registry (openvinotoolkit#414)

* fixed sanity tests for samples (openvinotoolkit#417)

* Common NNCFConfig (openvinotoolkit#413)

* using common config

* added jsonschema to requirements

* Fix third-party sanity tests (openvinotoolkit#420)

* Fix NoCompressionAlgorithmBuilder (openvinotoolkit#426)

* fixed issues with paths (openvinotoolkit#425)

* 00.0:Updating NNCF github dockerfiles against last changes (openvinotoolkit#436)

* Change thresholds for pruned ssd300 (openvinotoolkit#435)

diff_fp32_min from -1.2 to -4.8

* Use one of the registered JSON meta-schemas (openvinotoolkit#439)

Fixes: openvinotoolkit#416

* Use non-recursive BFS for graph traversal (openvinotoolkit#440)

* Use non-recursive BFS for graph traversal

Python does not handle deep recursion stacks well.

* Use DFS by default, after all

* Add AC config for SSD300_mobilenet on voc. (openvinotoolkit#441)

* Minor fixes for HAWQ (openvinotoolkit#442)

Set the debug log directory for collecting HAWQ-related data not only in debug mode, but also via the `dump_precision_init_data` option.
Corrected the printing of the chosen bitwidth configuration

* Init on same device by default (openvinotoolkit#438)

* Use model's own device for initialization by default

* Adjust init args documentation

* Add at::DeviceGuard invocations in kernels to support non-'cuda:0' devices

* Use cuda for precision init tests

* Remove extra entries from MANIFEST.in (openvinotoolkit#452)

* Add AutoQ end-to-end config for image classification samples (resnet50 and mobilenet_v2) (openvinotoolkit#450)

* Changed working logic with json metrics (openvinotoolkit#447)

* Add AutoQ config with fine-tuning recipe for resnet50 and mobilenet_v2

Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>

* Apply nncf.register_module correctly in transformers (openvinotoolkit#454)

* Fix metric value for ssd300_mobilenet_voc. (openvinotoolkit#453)

* Do not follow symlinks when opening files (openvinotoolkit#451)

* Correctly construct Q-DQ config for E2E tests (openvinotoolkit#456)

* Update documentation for the v1.6.0 release (openvinotoolkit#457)

* Add torch.load warnings and path resolution (openvinotoolkit#458)

Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: Ivan Lazarevich <ivan.lazarevich@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>

* accuracy aware draft

* refactor to introduce TrainingRunner for training loop control

* move accuracy aware loop to common

* address comments

* update accuracy aware

* add support for TF samples

* refactor keras API sample

Co-authored-by: Vasily Shamporov <vasily.shamporov@intel.com>
Co-authored-by: Pave Finashov <66466565+pfinashx@users.noreply.github.com>
Co-authored-by: Anastasia Senina <Anastasia.Senina@intel.com>
Co-authored-by: Aleksei Kashapov <aleksei.kashapov@intel.com>
Co-authored-by: Maria Kaglinskaya <maria.kaglinskaya@intel.com>
Co-authored-by: Lyalyushkin Nikolay <nikolay.lyalyushkin@intel.com>
Co-authored-by: vuiseng9 <vuiseng9@gmail.com>
Co-authored-by: Chua, Vui Seng <vui.seng.chua@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fedorx.kutsepin@intel.com>
Co-authored-by: krodyush <konstantin.rodyushkin@intel.com>
Co-authored-by: skholkin <holckin100@gmail.com>
Co-authored-by: Sergei Kholkin <sergei.kholkin@intel.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Pavel Finashov <pavelx.finashov@intel.com>
Co-authored-by: Alexander Suslov <alexander.suslov@intel.com>
Co-authored-by: Daniil Lyakhov <daniil.lyakhov@intel.com>
Co-authored-by: Andrey Churkin <andrey.churkin@intel.com>
Co-authored-by: Fyodor Kutsepin (aka Oddy O) <fyodor.kutsepin@gmail.com>