[onert/odc] Draft for odc: hidden switching mechanism. fcircle-qcircle step #13530

Torrero · 2024-07-27T09:54:27Z

This is a draft for odc: hidden switching mechanism. The fcircle-qcircle step is implemented.

Conv2D_000.circle is used as the test model (it was added only to verify this draft).
Size of the model: 556 B

This draft contains auto compilation (odc: hidden switching mechanism), which is executed in the nnfw_run_with_auto_compilation function.

The nnfw_run_with_auto_compilation function was added.
odc::QuantizerManager, odc::Quantizer and odc::MinMaxReader were changed to set the minmax threshold and check readiness for quantization. A function for removing the minmax file was added. (Open question: how to set up the minmax file location.)
The OdcInfo class, which stores the current state of the odc, was added.

After model quantization, a step that removes the minmax file was added.
After the quantized model is loaded for the first time, the following sequence is performed:

save the input and output buffers of the initial fcircle model
reload the session with the qcircle model
restore the inputs and outputs

After that, inference runs on the quantized model.
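The save, reload, and restore sequence above can be sketched roughly as follows. This is only an illustration under the assumption that loading a model drops the existing buffer bindings; `Session`, `BufferBinding`, and `switch_to_quantized` are hypothetical names and not the onert API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a user-provided tensor buffer (not an onert type).
struct BufferBinding
{
  void *data;
  std::size_t size;
};

// Hypothetical stand-in for a session (not the onert session class).
struct Session
{
  std::string model_path;
  std::vector<BufferBinding> inputs;
  std::vector<BufferBinding> outputs;

  // Assumption: loading a model resets all bindings,
  // which is why they have to be saved beforehand.
  void load(const std::string &path)
  {
    model_path = path;
    inputs.clear();
    outputs.clear();
  }
};

// Swap the float model for the quantized one while keeping the caller's buffers.
inline void switch_to_quantized(Session &session, const std::string &qcircle_path)
{
  // 1. Save the input and output buffers bound to the fcircle model.
  std::vector<BufferBinding> saved_inputs = session.inputs;
  std::vector<BufferBinding> saved_outputs = session.outputs;

  // 2. Reload the session with the qcircle model.
  session.load(qcircle_path);
  assert(session.inputs.empty() && session.outputs.empty());

  // 3. Restore the saved bindings so the next run uses the same buffers.
  session.inputs = std::move(saved_inputs);
  session.outputs = std::move(saved_outputs);
}
```

After `switch_to_quantized` returns, the caller's next run would use the quantized model with the same input and output buffers as before.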

To run the test for this draft:

BUILD_TYPE=Debug BUILD_JOBS=6 make -f Makefile.template

cd ./Product/out/unittest

./nnfw_api_gtest --gtest_filter=TestOdcAutoCompilation*

Time of the test execution: 11 ms
Peak memory footprint of the test: 4,939 KB

ONE-DCO-1.0-Signed-off-by: Evgenii Maltsev e.maltsev@samsung.com

hseok-oh · 2024-07-30T07:57:06Z

I haven't checked your whole implementation yet, but here are my comments:

Please introduce a different run API for hidden switching instead of putting it in nnfw_run.
Your QDQBufferAdapter implementation looks almost the same as [onert] Type-aware quantized model input/output buffer setting #13287. Please remove the functional duplication if it is the same.

Torrero · 2024-08-01T19:24:09Z

> Please introduce a different run API for hidden switching instead of putting it in nnfw_run.
>
> Your QDQBufferAdapter implementation looks almost the same as #13287. Please remove the functional duplication if it is the same.

@hseok-oh Thank you for pointing out your implementation of buffer quantization and dequantization. I removed the functional duplication and moved hidden_switching_mechanizm from nnfw_run to nnfw_run_odc_hsm.

jyoungyun · 2024-08-02T05:54:37Z

(optional)

I think it would be good to use a different function name to make it easier for the user to understand, but I am not sure about the appropriate function name.. :( It's a long name, but how about nnfw_run_with_auto_compilation? If you have a better idea for the function name, you can use the name. :)

Torrero · 2024-08-02T08:13:12Z

> (optional)
>
> I think it would be good to use a different function name to make it easier for the user to understand, but I am not sure about the appropriate function name.. :( It's a long name, but how about nnfw_run_with_auto_compilation? If you have a better idea for the function name, you can use the name. :)

@jyoungyun It's a good idea, thank you. I renamed the function to nnfw_run_with_auto_compilation and updated some descriptions.

Torrero · 2024-08-23T17:33:49Z

@hseok-oh @jyoungyun
I updated this draft and added a compilation step.
It now attempts to compile the quantized model and load the compiled model. If that fails, it tries to load and run the quantized model.
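As a rough sketch of this fallback order (compiled model first, quantized model second): the `Backend` hooks, `LoadedModel`, and `load_best_available` below are hypothetical names used for illustration only; the actual draft goes through the session API instead.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Which model ended up loaded after the attempt.
enum class LoadedModel
{
  Compiled,
  Quantized,
  None
};

// Hypothetical hooks standing in for the real compile and load calls.
// Each hook returns true on success.
struct Backend
{
  std::function<bool(const std::string &)> compile;
  std::function<bool(const std::string &)> load;
};

inline LoadedModel load_best_available(const Backend &backend, const std::string &qcircle_path,
                                       const std::string &compiled_path)
{
  assert(backend.compile != nullptr && backend.load != nullptr);

  // Prefer the compiled model: compile the quantized model, then load the result.
  if (backend.compile(qcircle_path) && backend.load(compiled_path))
    return LoadedModel::Compiled;

  // Compilation or loading of the compiled model failed: fall back to the quantized model.
  if (backend.load(qcircle_path))
    return LoadedModel::Quantized;

  return LoadedModel::None;
}
```

Injecting the hooks as callables also gives a test a way to force the compile step to fail and check that the quantized path is taken.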

Could you please advise how I can add a test for the compilation part?

hseok-oh · 2024-08-27T11:00:36Z

IMO, the responsibility for storing the minmax recording count threshold and checking the recording count should lie with the quantizer, not the minmax recorder. So the API should ask the quantizer whether it is ready to quantize, and the quantizer should check the recording count from the file. From this point of view, if the quantizer replies that it is ready, the API changes the exec config to stop recording and requests the quantizer to actually quantize.
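A minimal sketch of that split of responsibilities, with every name hypothetical (`QuantizerSketch`, `ExecConfig`, `quantize_if_ready`) and the file read replaced by an injected callable so the example stays self-contained:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical quantizer that owns the threshold and the readiness check.
class QuantizerSketch
{
public:
  // count_reader stands in for reading the recording count from the minmax file.
  explicit QuantizerSketch(std::function<std::uint32_t()> count_reader)
    : _count_reader(std::move(count_reader))
  {
    assert(_count_reader != nullptr);
  }

  void setThreshold(std::uint32_t threshold) { _threshold = threshold; }

  // Ready once a threshold is set and enough runs have been recorded.
  bool readyForQuantize() const { return _threshold > 0 && _count_reader() >= _threshold; }

  // Placeholder for the actual quantization step.
  bool quantize()
  {
    _quantized = true;
    return true;
  }

  bool quantized() const { return _quantized; }

private:
  std::function<std::uint32_t()> _count_reader;
  std::uint32_t _threshold = 0;
  bool _quantized = false;
};

// Hypothetical execution config holding only the recording switch.
struct ExecConfig
{
  bool record_minmax = true;
};

// API side: ask the quantizer, then stop recording and request quantization.
inline bool quantize_if_ready(QuantizerSketch &quantizer, ExecConfig &config)
{
  if (!quantizer.readyForQuantize())
    return false;

  config.record_minmax = false;
  return quantizer.quantize();
}
```

With this shape the API layer never sees the threshold or the count; it only reacts to the quantizer's answer.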

Additionally, the hdf5 minmax dumper is not used now and will be removed.

Torrero · 2024-09-11T19:33:02Z

@hseok-oh

I moved the quantization readiness check and the minmax recording threshold to odc::Quantizer. A function for removing the minmax file was also added to it. (Maybe it is better to move this function to odc::MinMaxReader, or we can define another location for it.)

Torrero · 2024-09-26T18:02:05Z

A negative test was added.

…circle-qcircle step was implemented with compilation

Conv2D_000.circle is used as the test model (it was added just for this draft verification).

This draft contains auto compilation, which is executed in the `nnfw_run_auto_compilation` function.

- `nnfw_run_auto_compilation` function was added.
- QuantizerManager, Quantizer and MinMaxReader were changed to set the minmax threshold and check readiness for quantization. A function for removing the minmax file was added.
- OdcInfo class for storing the current state of the odc was added.
- The compilation step uses the "session::codegen" function.

After model quantization, a step that removes the minmax file was added.
After the quantized model is loaded for the first time, the following sequence was implemented:

- save the input and output buffers of the initial fcircle model
- attempt to compile the quantized model and load the compiled model; if that fails, try to load the quantized model
- recover inputs and outputs

After that:

- do inference of the compiled or quantized model

Positive and negative tests were added.

ONE-DCO-1.0-Signed-off-by: Evgenii Maltsev e.maltsev@samsung.com

Torrero · 2024-10-25T15:55:33Z

@hseok-oh PTAL

Torrero requested review from jyoungyun, glistening and hseok-oh July 27, 2024 09:54

Torrero force-pushed the odc_hidden_switching_mechanizm branch from fe5bf20 to 967a66c July 27, 2024 10:09

Torrero mentioned this pull request Jul 27, 2024

[onert] Hidden switching mechanism for on-device compiler #13288 (Open)

Torrero force-pushed the odc_hidden_switching_mechanizm branch from 967a66c to e5e7230 August 1, 2024 19:14

Torrero force-pushed the odc_hidden_switching_mechanizm branch from e5e7230 to da388c1 August 2, 2024 08:08

Torrero force-pushed the odc_hidden_switching_mechanizm branch 3 times, most recently from 7140f5b to 992deca August 23, 2024 17:27

Torrero force-pushed the odc_hidden_switching_mechanizm branch 2 times, most recently from ad10e22 to 4d0ae5f September 11, 2024 19:31

Torrero force-pushed the odc_hidden_switching_mechanizm branch from 4d0ae5f to 28ed907 September 26, 2024 18:00

Torrero force-pushed the odc_hidden_switching_mechanizm branch from 28ed907 to fe7d8ce October 18, 2024 07:54
