DenseNet transfer learning, custom dataset quantization: [ERROR] Unexpected exception InvalidArgumentError() happened during tuning #45

@mesolmaz

Description

I trained a multiclass classifier in TensorFlow using transfer learning and have been trying to quantize it with Neural Compressor in a dev container environment. First, I had to modify build_imagenet_data.py to handle grayscale images and produce a custom TFRecord file. Then I modified densenet121.yaml to set the TFRecord location, height, width, scale and mean_value.
During tuning I run into an InvalidArgumentError() that I have not been able to resolve.

root@docker-desktop:/workspaces/Neural_Compressor/neural-compressor/examples/tensorflow/image_recognition/tensorflow_models/quantization/ptq# bash run_tuning.sh --config=densenet121.yaml --input_model="/workspaces/Neural_Compressor/models/tensorflow/model" --output_model=./nc_densenet121

  • main --config=densenet121.yaml --input_model=/workspaces/Neural_Compressor/models/tensorflow/model --output_model=./nc_densenet121
  • init_params --config=densenet121.yaml --input_model=/workspaces/Neural_Compressor/models/tensorflow/model --output_model=./nc_densenet121
  • for var in "$@"
  • case $var in
    ++ echo --config=densenet121.yaml
    ++ cut -f2 -d=
  • config=densenet121.yaml
  • for var in "$@"
  • case $var in
    ++ echo --input_model=/workspaces/Neural_Compressor/models/tensorflow/model
    ++ cut -f2 -d=
  • input_model=/workspaces/Neural_Compressor/models/tensorflow/model
  • for var in "$@"
  • case $var in
    ++ echo --output_model=./nc_densenet121
    ++ cut -f2 -d=
  • output_model=./nc_densenet121
  • run_tuning
  • python main.py --input-graph /workspaces/Neural_Compressor/models/tensorflow/model --output-graph ./nc_densenet121 --config densenet121.yaml --tune
    2022-02-14 18:47:44.290292: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2022-02-14 18:47:44.292744: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
    2022-02-14 18:48:34 [WARNING] Output tensor names should not be empty.
    2022-02-14 18:48:34 [WARNING] Input tensor names should not be empty.
    2022-02-14 18:49:02.538820: I tensorflow/core/grappler/devices.cc:75] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
    2022-02-14 18:49:02.539043: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-02-14 18:49:02.564178: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1149] Optimization results for grappler item: graph_to_optimize
    function_optimizer: function_optimizer did nothing. time = 0.019ms.
    function_optimizer: function_optimizer did nothing. time = 0.001ms.

2022-02-14 18:49:04.277543: I tensorflow/core/grappler/devices.cc:75] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2022-02-14 18:49:04.277786: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-02-14 18:49:04.524204: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1149] Optimization results for grappler item: tf_graph
constant_folding: Graph size after: 1118 nodes (-612), 1787 edges (-612), time = 85.762ms.
constant_folding: Graph size after: 1118 nodes (0), 1787 edges (0), time = 70.253ms.

2022-02-14 18:49:57 [INFO] ConvertLayoutOptimizer elapsed time: 1.95 ms
2022-02-14 18:49:58.696739: I tensorflow/core/grappler/devices.cc:75] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2022-02-14 18:49:58.697680: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-02-14 18:49:58.829495: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1149] Optimization results for grappler item: graph_to_optimize
model_pruner: Graph size after: 1115 nodes (-3), 1784 edges (-3), time = 19.002ms.
shape_optimizer: shape_optimizer did nothing. time = 1.244ms.
dependency_optimizer: Graph size after: 1114 nodes (-1), 1171 edges (-613), time = 19.209ms.
debug_stripper: debug_stripper did nothing. time = 1.636ms.
loop_optimizer: Graph size after: 1114 nodes (0), 1171 edges (0), time = 9.353ms.
model_pruner: Graph size after: 1114 nodes (0), 1171 edges (0), time = 10.482ms.
shape_optimizer: shape_optimizer did nothing. time = 1.148ms.
dependency_optimizer: Graph size after: 1114 nodes (0), 1171 edges (0), time = 13.312ms.
debug_stripper: debug_stripper did nothing. time = 1.285ms.

2022-02-14 18:49:58 [INFO] Pass GrapplerOptimizer elapsed time: 1625.51 ms
2022-02-14 18:49:59 [INFO] Pass SwitchOptimizer elapsed time: 227.41 ms
2022-02-14 18:49:59 [INFO] Pass RemoveTrainingNodesOptimizer elapsed time: 229.28 ms
2022-02-14 18:49:59 [INFO] Pass SplitSharedInputOptimizer elapsed time: 16.05 ms
2022-02-14 18:49:59 [INFO] Pass GraphFoldConstantOptimizer elapsed time: 215.11 ms
2022-02-14 18:49:59 [INFO] Pass FuseColumnWiseMulOptimizer elapsed time: 227.62 ms
2022-02-14 18:50:00 [INFO] Pass StripUnusedNodesOptimizer elapsed time: 892.47 ms
2022-02-14 18:50:00 [INFO] Pass GraphCseOptimizer elapsed time: 225.92 ms
2022-02-14 18:50:01 [INFO] Pass FoldBatchNormNodesOptimizer elapsed time: 718.75 ms
2022-02-14 18:50:01 [INFO] Pass UpdateEnterOptimizer elapsed time: 103.94 ms
2022-02-14 18:50:01 [INFO] Pass ConvertLeakyReluOptimizer elapsed time: 113.15 ms
2022-02-14 18:50:01 [INFO] Pass ConvertAddToBiasAddOptimizer elapsed time: 114.01 ms
2022-02-14 18:50:02 [INFO] Pass FuseTransposeReshapeOptimizer elapsed time: 116.84 ms
2022-02-14 18:50:02 [INFO] Pass FuseConvWithMathOptimizer elapsed time: 115.22 ms
2022-02-14 18:50:02 [INFO] Pass ExpandDimsOptimizer elapsed time: 112.96 ms
2022-02-14 18:50:02 [INFO] Pass InjectDummyBiasAddOptimizer elapsed time: 128.91 ms
2022-02-14 18:50:02 [INFO] Pass MoveSqueezeAfterReluOptimizer elapsed time: 112.66 ms
2022-02-14 18:50:03 [INFO] Pass Pre Optimization elapsed time: 58651.82 ms
2022-02-14 18:50:03 [INFO] Get FP32 model baseline.
2022-02-14 18:50:03 [INFO] Start to evaluate the TensorFlow model.
2022-02-14 18:50:04 [WARNING] Fail to forward with batch size=32, set to 1 now.
2022-02-14 18:50:04 [ERROR] Unexpected exception InvalidArgumentError() happened during tuning.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/experimental/quantization.py", line 151, in execute
self.strategy.traverse()
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/strategy/strategy.py", line 323, in traverse
self.baseline = self._evaluate(self.model)
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/strategy/strategy.py", line 501, in _evaluate
val = self.multi_objective.evaluate(eval_func, model)
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/objective.py", line 222, in evaluate
acc = eval_func(model)
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/utils/create_obj_from_config.py", line 132, in eval_func
return adaptor.evaluate(model, dataloader, postprocess,
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/utils/utility.py", line 240, in fi
res = func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/adaptor/tensorflow.py", line 379, in evaluate
results = eval_func(dataloader)
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/adaptor/tensorflow.py", line 326, in eval_func
for idx, (inputs, labels) in enumerate(dataloader):
File "/usr/local/lib/python3.8/dist-packages/neural_compressor/experimental/data/dataloaders/tensorflow_dataloader.py", line 88, in _generate_dataloader
for iter_tensors in dataset:
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 800, in next
return self._next_internal()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 783, in _next_internal
ret = gen_dataset_ops.iterator_get_next(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2845, in iterator_get_next
_ops.raise_from_not_ok_status(e, name)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 7107, in raise_from_not_ok_status
raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [offset_width must be >= 0.]
[[{{node crop_to_bounding_box/Assert/Assert}}]] [Op:IteratorGetNext]
2022-02-14 18:50:04 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.
Traceback (most recent call last):
File "main.py", line 69, in <module>
evaluate_opt_graph.run()
File "main.py", line 58, in run
q_model.save(self.args.output_graph)
AttributeError: 'NoneType' object has no attribute 'save'
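
A note on reading the first traceback: the failing node is `crop_to_bounding_box/Assert/Assert`, and `tf.image.crop_to_bounding_box` requires its offsets to be non-negative. One way this can happen is a centered crop whose target width is larger than the image width at that point in the pipeline. The sketch below is plain Python and only illustrates that arithmetic; the function names and the sizes are hypothetical, and it assumes the offsets are computed as half the difference between image size and crop size, which may not match what the Neural Compressor transform actually does.

```python
def center_crop_offsets(image_height, image_width, crop_height, crop_width):
    """Return (offset_height, offset_width) for a centered crop.

    Assumption: offsets are half the difference between the image size
    and the requested crop size, rounded down.
    """
    offset_height = (image_height - crop_height) // 2
    offset_width = (image_width - crop_width) // 2
    return offset_height, offset_width


def crop_fits(image_height, image_width, crop_height, crop_width):
    """True when both offsets are non-negative, i.e. the crop fits inside the image."""
    offset_height, offset_width = center_crop_offsets(
        image_height, image_width, crop_height, crop_width
    )
    return offset_height >= 0 and offset_width >= 0


# Hypothetical sizes, not taken from the dataset in this issue.
print(center_crop_offsets(256, 256, 224, 224))  # crop fits: both offsets are 16
print(center_crop_offsets(256, 200, 224, 224))  # image narrower than crop: width offset is -12
print(crop_fits(256, 200, 224, 224))            # False
```

If this is the cause, comparing the height and width set in densenet121.yaml against the actual size of the decoded (and resized) images would show whether the crop is larger than the image.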
