
Analyze onecc options #9369

Open
hyunsik-yoon opened this issue Jun 30, 2022 · 23 comments
@hyunsik-yoon
Contributor

hyunsik-yoon commented Jun 30, 2022

related #9313

Let me analyze onecc options and figure out how to handle this for one-init.

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jun 30, 2022

one-optimize options

Study result:

Format

  • Optimization pass name
    • model type dependency
    • backend dependency
    • related links

Let's start from

if (arser.get<bool>("--fold_add_v2"))

Optimization passes

Another ref:
check #4353, which was a request from ONERT
check grouping issue #5784

There is an issue in my internal journal repo, which has a few more links to internal data.

@seanshpark
Contributor

(image attachment)

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 1, 2022

  • Legend
    • -: don't care
    • o: should be default
    • x: should NOT be default

| option | onnx | tflite | ONERT | npu |
| --- | --- | --- | --- | --- |
| FusePreActivationBatchNorm | x | x | x | x |
| MakeBatchNormGammaPositive | x | x | x | x |
| ReplaceMulAddWithDepthwiseConv | - | - | x | o |
| ReplaceSubWithAdd | - | - | x | o |
| ExpandBroadcastConst | - | - | x | o |
| RemoveQuantDequantSeq | - | - | x | x |
| RemoveFakeQuant | - | - | x | x |
| ShuffleWeightTo16x1Float32 | x | x | x | x |
| ReplaceNonConstFCWithBatchMatMul | - | - | x | o |
| SparsifyTensorPass | x | x | x | x |
| ConvertNCHWToNHWC | see below | x | - | - |
| others | o | o | o | o |

Regarding ConvertNCHWToNHWC:

  • if input is ONNX with NCHW input (rank == 4 && transpose with a specific perm surrounding some op (e.g., conv2d)) (see the detection sketch below)
    • if input data is NCHW ==> ConvertNCHWToNHWC is default
    • if input/output data is NHWC ==> ConvertNCHWToNHWC, nchw_to_nhwc_input_shape, nchw_to_nhwc_output_shape are default

Q. How can we handle a weird ONNX model that has 2 inputs? What if one is an NCHW input and another is an NHWC input?
A. Current onecc cannot handle this either; it changes the layout of all inputs.
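
For reference, here is a minimal sketch of how such a rank-4 / Transpose-perm heuristic could be checked with the onnx Python package. This only illustrates the idea above; it is not actual one-init code, and the helper name is made up.

```python
# Hypothetical helper (illustration only): guess whether an ONNX model has an
# NCHW image input, following the rank-4 + Transpose(perm) heuristic above.
import onnx

def looks_like_nchw_input(model_path: str) -> bool:
    model = onnx.load(model_path)
    graph = model.graph
    initializers = {init.name for init in graph.initializer}
    for graph_input in graph.input:
        if graph_input.name in initializers:
            continue  # constant weights, not a real model input
        dims = graph_input.type.tensor_type.shape.dim
        if len(dims) != 4:
            continue  # heuristic only considers rank-4 inputs
        # look for a Transpose consuming this input with perm NCHW -> NHWC
        for node in graph.node:
            if node.op_type == "Transpose" and graph_input.name in node.input:
                perms = [list(a.ints) for a in node.attribute if a.name == "perm"]
                if perms and perms[0] == [0, 2, 3, 1]:
                    return True
    return False
```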


(updated)
I removed ExpandBroadcastConst from the default list (details in one-vscode-triv repo issue 93).

@hyunsik-yoon
Contributor Author

Some thoughts:

  • CLI could be something like the following:
    • $ one-init a.tflite --model_type=tflite --backend=npu
      • --model_type is optional
    • when the model type is onnx with image input, two more mandatory args could be provided:
      • $ one-init a.onnx --model_type=onnx --input_data_layout=NCHW --output_data_layout=NCHW --backend=npu
  • How about putting all ops in the last row (others) of the above table into -O1?
  • Where should we put npu-only optimization info?
    • location
      • inside the one-init code vs. in a data file read by one-init?
      • I prefer a data file because it has no dependency on the ONE version and can be deployed separately (see the sketch after this list)
      • in the future, we could consider a python script, e.g., def getDefaultOptionsNpu(), when some computation is necessary
    • which repo should store this file?
      • Is it OK to store this file in ONE, named after the npu code name?
      • Or... how about some naming like trix2.json? Note that the end user won't enter --backend=trix2.
      • Or we can make another repo that holds this data file and include it when making the internal package. (Such a repo could also store info that maps the npu code name to the npu tool executable name....)
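
A minimal sketch of the data-file approach, assuming a hypothetical /usr/share/one/backends/<backend>.json layout (both the path and the helper name are made up for illustration):

```python
# Hypothetical sketch: one-init reading backend-specific default options
# from a JSON data file deployed with the backend package.
import json
import os

BACKEND_DATA_DIR = "/usr/share/one/backends"  # assumed location, not a real path

def load_backend_defaults(backend: str) -> dict:
    """Return default options for the backend, or {} if no data file is found."""
    path = os.path.join(BACKEND_DATA_DIR, backend + ".json")
    if not os.path.isfile(path):
        return {}
    with open(path) as f:
        return json.load(f)

# usage (illustrative): defaults = load_backend_defaults("trix2")
# defaults.get("optimize", []) -> list of pass names enabled by default
```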

@hyunsik-yoon changed the title from "Analyze optimization options" to "Analyze onecc options" on Jul 1, 2022
@jinevening
Contributor

In #9369 (comment), RemoveQuantDequantSeq and RemoveFakeQuant should not be turned on by default (even for npu). They change the semantics of a model.

Those options were implemented for experiments. I think we should not expose them to users (CC @seanshpark )

@hyunsik-yoon
Contributor Author

@jinevening Thanks. Table was updated.

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 4, 2022

Let's dig into import options now.

import options

one-import-tflite

  • input path and output path are all we need; they can be default options.
$ ./one-import-tflite -h
usage: one-import-tflite [-h] [-v] [-V] [-C CONFIG] [-i INPUT_PATH]
                         [-o OUTPUT_PATH]

command line tool to convert TensorFlow lite to circle

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -V, --verbose         output additional information to stdout or stderr
  -C CONFIG, --config CONFIG
                        run with configuation file

converter arguments:
  -i INPUT_PATH, --input_path INPUT_PATH
                        full filepath of the input file
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        full filepath of the output file
$ ./one-import-tf
usage: one-import-tf [-h] [-v] [-V] [-C CONFIG] [--v1 | --v2]
                     [--graph_def | --saved_model | --keras_model]
                     [-i INPUT_PATH] [-o OUTPUT_PATH] [-I INPUT_ARRAYS]
                     [-s INPUT_SHAPES] [-O OUTPUT_ARRAYS]
                     [--save_intermediate]
one-import-tf: error: the following arguments are required: -i/--input_path -o/--output_path

one-import-tf

  • If the model is --model_format graph_def, then input_arrays, input_shapes, and output_arrays should be mandatory
  • we should know the following to make default options (see the sketch after the snippet below):
    • we should be able to tell whether the model format is graph_def. If so, the following should also be found to make a complete default list:
      • model inputs
      • model input shapes
      • model outputs
    • we should be able to know which version of TensorFlow (v1 or v2) was used to make the model

if flags.model_format == "graph_def":
    if not flags.input_arrays:
        raise ValueError("--input_arrays must be provided")
    if not flags.output_arrays:
        raise ValueError("--output_arrays must be provided")
    input_shapes = None

    if flags.input_shapes:
        if not flags.input_arrays:
            raise ValueError("--input_shapes must be used with --input_arrays")
        if flags.input_shapes.count(":") != flags.input_arrays.count(","):
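
To fill those values automatically, one-init (or a helper) would need to inspect the GraphDef itself. A rough sketch of what that could look like, assuming the tensorflow package is available (the helper name and the output-detection heuristic are mine, not actual ONE code):

```python
# Hypothetical sketch: list inputs (with shapes) and likely outputs of a frozen GraphDef.
import tensorflow as tf

def inspect_graph_def(pb_path: str):
    graph_def = tf.compat.v1.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    # inputs: Placeholder nodes, with their (possibly unknown, -1) dimensions
    inputs = []
    for node in graph_def.node:
        if node.op == "Placeholder":
            dims = [d.size for d in node.attr["shape"].shape.dim]
            inputs.append((node.name, dims))

    # outputs (heuristic): nodes whose results are never consumed by another node
    consumed = {name.split(":")[0].lstrip("^")
                for node in graph_def.node for name in node.input}
    outputs = [node.name for node in graph_def.node if node.name not in consumed]
    return inputs, outputs
```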

ref:

one-import-onnx

  • default options can be decided (without difficulty, probably)

#9161

  • [-I INPUT_ARRAYS] [-O OUTPUT_ARRAYS] should be removed from the help message
$ ./one-import-onnx  -h
...
usage: one-import-onnx [-h] [-v] [-V] [-C CONFIG] [-i INPUT_PATH]
                       [-o OUTPUT_PATH] [-I INPUT_ARRAYS] [-O OUTPUT_ARRAYS]
                       [--model_format MODEL_FORMAT]
                       [--converter_version CONVERTER_VERSION]
                       [--save_intermediate]

command line tool to convert ONNX to circle

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -V, --verbose         output additional information to stdout or stderr
  -C CONFIG, --config CONFIG
                        run with configuation file
  --save_intermediate   Save intermediate files to output folder

converter arguments:
  -i INPUT_PATH, --input_path INPUT_PATH
                        full filepath of the input file
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        full filepath of the output file
  -I INPUT_ARRAYS, --input_arrays INPUT_ARRAYS
                        names of the input arrays, comma-separated
  -O OUTPUT_ARRAYS, --output_arrays OUTPUT_ARRAYS
                        names of the output arrays, comma-separated
  --model_format MODEL_FORMAT
  --converter_version CONVERTER_VERSION

one-import-bcq

has the same options as one-import-tf.

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 4, 2022

one-quantize

  • one-quantize provides default options (e.g., --input_dtype float32, --quantized_dtype uint8, --granularity layer, --input_type quantized_dtype, --output_type quantized_dtype, --min_percentile 1.0, --max_percentile 99.0, --mode percentile)
    • can we use these as default quantization params?
      • note: default values could vary depending on the backend. E.g., some backends may prefer channel over layer; some impose limitations, e.g., channel and layer for q8 but only channel for q16, etc.
  • input_data
    parser.add_argument(
        '-d',
        '--input_data',
        type=str,
        help=
        'full filepath of the input data used for post-training quantization. if not specified, run with random input data.'
    )
  • We may need to set some options to backend-specific default values.
$ ./one-quantize -h
usage: one-quantize [-h] [-v] [-V] [-C CONFIG] [-i INPUT_PATH] [-d INPUT_DATA]
                    [-f INPUT_DATA_FORMAT] [-o OUTPUT_PATH] [-p]
                    [--input_dtype INPUT_DTYPE]
                    [--input_model_dtype INPUT_MODEL_DTYPE]
                    [--quantized_dtype QUANTIZED_DTYPE]
                    [--granularity GRANULARITY] [--input_type INPUT_TYPE]
                    [--output_type OUTPUT_TYPE]
                    [--min_percentile MIN_PERCENTILE]
                    [--max_percentile MAX_PERCENTILE] [--mode MODE]
                    [--force_quantparam] [--tensor_name TENSOR_NAME]
                    [--scale SCALE] [--zero_point ZERO_POINT]

command line tool to quantize circle model

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -V, --verbose         output additional information to stdout or stderr
  -C CONFIG, --config CONFIG
                        run with configuation file
  -i INPUT_PATH, --input_path INPUT_PATH
                        full filepath of the input circle model
  -d INPUT_DATA, --input_data INPUT_DATA
                        full filepath of the input data used for post-training
                        quantization. if not specified, run with random input
                        data.
  -f INPUT_DATA_FORMAT, --input_data_format INPUT_DATA_FORMAT
                        file format of input data. h5/hdf5 (default),
                        list/filelist (a text file where a file path of input
                        data is written in each line), or dir/directory (a
                        directory where input data are saved)
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        full filepath of the output quantized model
  -p, --generate_profile_data
                        generate profiling data

arguments for quantization:
  --input_dtype INPUT_DTYPE
                        input model data type (supported: float32,
                        default=float32). Deprecated (Use input_model_dtype)
  --input_model_dtype INPUT_MODEL_DTYPE
                        input model data type (supported: float32,
                        default=float32)
  --quantized_dtype QUANTIZED_DTYPE
                        data type of output quantized model (supported: uint8,
                        int16, default=uint8)
  --granularity GRANULARITY
                        quantization granularity (supported: layer, channel,
                        default=layer)
  --input_type INPUT_TYPE
                        data type of inputs of quantized model (supported:
                        uint8, int16, default=quantized_dtype). QUANTIZE Op
                        will be inserted at the beginning of the quantized
                        model if input_type is different from quantized_dtype.
  --output_type OUTPUT_TYPE
                        data type of outputs of quantized model (supported:
                        uint8, int16, default=quantized_dtype). QUANTIZE Op
                        will be inserted at the end of the quantized model if
                        output_type is different from quantized_dtype.
  --min_percentile MIN_PERCENTILE
                        minimum percentile (0.0~100.0, default=1.0). Algorithm
                        parameter for calibration. This is valid when
                        calibration algorithm is percentile.
  --max_percentile MAX_PERCENTILE
                        maximum percentile (0.0~100.0, default=99.0).
                        Algorithm parameter for calibration. This is valid
                        when calibration algorithm is percentile.
  --mode MODE           calibration algorithm for post-training quantization
                        (supported: percentile/moving_average,
                        default=percentile). 'percentile' mode uses the n-th
                        percentiles as min/max values. 'moving_average' mode
                        records the moving average of min/max.

arguments for force_quantparam option:
  --force_quantparam    overwrite quantparam (scale, zero_point) to the
                        specified tensor in the quantized model.
  --tensor_name TENSOR_NAME
                        tensor name (string)
  --scale SCALE         scale (float)
  --zero_point ZERO_POINT
                        zero point (int)

  • if --input_data is not provided, one-quantize will use random data and this decreases accuracy.
    • option 1) Add a comment (with a suggestion ID) inside the generated cfg file mentioning this (an illustrative cfg sketch follows this list)
      • ; #suggestion: With no 'input_data', random data will be used and this could decrease accuracy. Consider using an input data file.
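
For illustration, a generated cfg could then look roughly like the sketch below. This is an assumed example (file names and values are made up), only meant to show where such a suggestion comment would sit; onecc cfg files are INI-style:

```ini
; illustrative sketch of a one-init-generated cfg (not actual one-init output)
[onecc]
one-import-tflite=True
one-optimize=True
one-quantize=True

[one-import-tflite]
input_path=model.tflite
output_path=model.circle

[one-optimize]
input_path=model.circle
output_path=model.opt.circle

[one-quantize]
input_path=model.opt.circle
output_path=model.q8.circle
; #suggestion: With no 'input_data', random data will be used and this could
; decrease accuracy. Consider using an input data file.
```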

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 4, 2022

discussed with @lemmaa and @seanshpark

How to handle options that depend on the backend

  • option 1.
  • option 2.
    • like other ONE backend tools, create a backendname-init tool (e.g., for the foo backend, create a foo-init tool)
    • one-init first creates a cfg for ONE; then foo-init adds more default options for the foo backend.

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 6, 2022

one-init

Let's design CLI.

I added an option to the CLI if

  • the option does not have any default value
  • without the option, compilation fails
$ /usr/share/one/bin/one-init -h
usage: one-init [-h] [-v] [-i INPUT_PATH] [-t INPUT_TYPE] [-b BACKEND] [--] [COMMANDS FOR BACKEND]

command line tool to generate a *.cfg file for the given model

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -V, --verbose         output additional information to stdout or stderr

arguments for generation:
  -i INPUT_PATH, --input_path INPUT_PATH
                        full filepath of the input model file
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        full filepath of the output cfg file
  -m MODEL_TYPE, --model_type MODEL_TYPE
                        type of input model: "onnx", "tflite", "tf", "bcq"
                        If this option is not provided, file extension is used to decide model type.
  --tf_graphdef_input_arrays INPUT_ARRAYS 
                        names of the input arrays, comma-separated
                        (argument when input model is TF graphdef)            
  --tf_graphdef_input_shapes INPUT_SHAPES
                        shapes corresponding to --input_arrays, colon-
                        separated (ex:"1,4,4,3:1,20,20,3")
                        (argument when input model is TF graphdef)            
  --tf_graphdef_output_arrays OUTPUT_ARRAYS (argument when input model is TF graphdef)
                        names of the output arrays, comma-separated
                        (argument when input model is TF graphdef)            
  --onnx_input_data_layout LAYOUT
                        layout of input data files (argument when input model is onnx)
                        value could be: "NCHW", "NHWC", "not_image"
                        When input layout is neither NCHW nor NHWC, use "not_image".
  --onnx_output_data_layout LAYOUT
                        layout of output data files (argument when input model is onnx)
                        value could be: "NCHW", "NHWC", "not_image"
                        When output layout is neither NCHW nor NHWC, use "not_image".
  -b BACKEND, --backend BACKEND
                        backend name to use

trix-init

Like other ONE tools and backend tools, one-init could call a backend-init cmd.
Let's assume that the backend name is "trix".

candidate 1.

$ trix-init -h [-C CONFIG]

arguments for generation:
  -C CONFIG     add default options into ONE configuration file
  • However, we cannot assume that every backend provider has an ini or json lib, so this approach does not seem good.

candidate 2.

Calling trix-init prints default options to the console in JSON.
Note that printing such JSON does not require a json lib.

$ trix-init 
{ 
  "optimize": ["ReplaceMulAddWithDepthwiseConv", "ReplaceSubWithAdd", ..., ],
  "quantize": { "quantized_dtype": "uint8", "granularity": "channel", ...  }
}
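
Under this candidate, one-init could then collect the defaults by running the backend tool and parsing its stdout. A minimal sketch (the helper name is mine, not actual ONE code):

```python
# Hypothetical sketch: one-init querying "<backend>-init" for default options
# printed as JSON on stdout (candidate 2 above).
import json
import subprocess

def query_backend_defaults(backend: str) -> dict:
    """Run e.g. trix-init and return its JSON output, or {} on failure."""
    try:
        result = subprocess.run([backend + "-init"], capture_output=True,
                                text=True, check=True)
        return json.loads(result.stdout)
    except (OSError, subprocess.CalledProcessError, json.JSONDecodeError):
        return {}

# usage (illustrative): defaults = query_backend_defaults("trix")
# defaults.get("quantize", {}).get("granularity") -> e.g. "channel"
```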

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 13, 2022

Discussion with @seanshpark

  1. args for onnx only
    • args that originated from onnx models had better NOT be included for, e.g., tflite model compilation because
      • adding more args will degrade compilation speed
      • such op sequences rarely appear in tflite
        - [ ] classify those args and apply them only to onnx models
  2. ConvertNCHWToNHWC
    - [ ] For onnx models, this needs to be default. Add this option for onnx models
  3. Make cfg for compilation (not for accuracy, memory, latency) at the first step
    • Related to [one-init] Discuss level of expectation on cfg file #9313 (comment)
    • Accuracy may not be good with some default options (e.g., the default args of one-quantize use random data, which could lower accuracy)
    • For the first step, focus on a cfg file with which compilation will succeed; for the next step, consider accuracy, memory, and latency
  4. comments inside cfg
    • one-init can generate useful comments for the user
    • Consider how the ONE-vscode cfg editor can do the following: (/cc @llFreetimell)
      • show comments
      • save comments when the user presses Ctrl-S
  5. grouping args of one-optimize
    • We can start tasks related to grouping args after releasing one-init if there is not enough development time before the one-init release

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 13, 2022

args that originated from onnx models had better NOT be included for, e.g., tflite model compilation

Let's list onnx-only args. These will be removed when the model is not onnx.

one-optimize args from onnx (WIP)

suspects (I have a hunch that these were introduced for onnx, but I couldn't find any info):

@hyunsik-yoon
Contributor Author

@seanshpark @jinevening
Could you briefly check #9369 (comment)?
Thanks in advance!

@jinevening
Contributor

jinevening commented Jul 13, 2022

The options in #9369 (comment) were introduced for onnx models, but they can be used also for other frontends (tf, tflite).

Among the list, I think the only onnx-only option is ConvertNCHWToNHWC.

@seanshpark
Contributor

seanshpark commented Jul 13, 2022

Could you briefly check #9369 (comment)?

but they can be used also for other frontends (tf, tflite).

I agree with this, but in the first step, for safety, I prefer to apply these only to ONNX models.
And in the second step, with some experience and testing, we can apply them commonly.

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 13, 2022

@jinevening

The options in #9369 (comment) were introduced for onnx models, but they can be used also for other frontends (tf, tflite).

I agree.

What @seanshpark was concerned about is that the patterns for these args might rarely occur in tflite models. So we thought that adding those args could increase compilation time for tflite models while having little impact on optimization.

As @seanshpark suggested, how about adding those args in the second step someday?
Meanwhile, inside cfg files for tflite models, I think we can add some comments about these args so that users can easily turn them on.

@jinevening
Contributor

I get it. Let's go with that option :)

@hyunsik-yoon
Contributor Author

Let me close this and create another issue to implement one-init.

@hyunsik-yoon
Contributor Author

hyunsik-yoon commented Jul 18, 2022

I got the following error:

$ onecc -C gen-from-one-init-for-tflite.cfg
usage: one-optimize [-h] [-v] [-V] [-C CONFIG] [-p]
                    [--change_outputs CHANGE_OUTPUTS] [-i INPUT_PATH]
                    [-o OUTPUT_PATH] [--convert_nchw_to_nhwc]
...
                    [--fuse_activation_function] [--fuse_instnorm]
                    [--replace_cw_mul_add_with_depthwise_conv]
                    [--remove_fakequant] [--remove_quantdequant]
...
                    [--remove_unnecessary_split]
                    [--replace_non_const_fc_with_batch_matmul]
                    [--resolve_customop_add] [--resolve_customop_batchmatmul]
...
                    [--transform_min_relu_to_relu6]

one-optimize: error: the following arguments are unrecognized: replace_sub_with_add

The error msg says that replace_sub_with_add is not recognized. Hmm..

replace_sub_with_add is not in constant.py. I am not sure if this is intentional or a bug.
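
For reference, a rough guess at what the fix would look like, assuming onelib/constant.py keeps the recognized one-optimize flags as (name, help-text) pairs (the exact layout and help text below are guesses, not the actual file):

```python
# Sketch only; the exact structure of onelib/constant.py may differ.
# one-optimize builds its accepted arguments from this list, so a flag that is
# missing here is reported as "unrecognized".
OPTIMIZATION_OPTS = (
    # ... existing (flag, help-text) pairs ...
    ('replace_sub_with_add', 'replace Sub with Add'),  # the missing entry
)
```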

@seanshpark
Contributor

seanshpark commented Jul 18, 2022

I am not sure if this is intentional or a bug.

Bug -_-; If you like, you can add this to one-cmds :)

or... @llFreetimell was this intentional ?

@hyunsik-yoon
Contributor Author

I added replace_sub_with_add into /usr/share/one/bin/onelib/constant.py and now I got model-opt.circle.
However one-quantize doesn't seem to work.

/usr/share/one/bin/one-quantize -V -i ./models/model-opt.circle -o ./models/model-q.circle --quantized_dtype uint8 --granularity channel

@seanshpark
Contributor

However one-quantize doesn't seem to work.

ping @jinevening

@llFreetimell
Contributor

replace_sub_with_add is not in constant.py. I am not sure if this is intentional or a bug

I think it's a simple mistake by @mhs4670go @.@
After I finished the implementation of replaceSubWithAdd, constant.py was created...!
