[onert/odc] Draft for odc: hidden switching mechanism. fcircle-qcircle step #13530

Draft: wants to merge 1 commit into base: master

Torrero
Contributor

@Torrero Torrero commented Jul 27, 2024

This is a draft for odc: hidden switching mechanism. The fcircle-qcircle step has been implemented.

Conv2D_000.circle is used as the test model (it was added only to verify this draft).
Size of the model: 556 B

This draft contains auto compilation (odc: hidden switching mechanism), which is executed in the nnfw_run_with_auto_compilation function.

  • The nnfw_run_with_auto_compilation function was added.
  • odc::QuantizerManager, odc::Quantizer and odc::MinMaxReader were changed to set the minmax threshold and to check readiness for quantization. A function for removing the minmax file was added. (Open question: where the minmax file should be located.)
  • The OdcInfo class was added to store the current state of the odc.

A step that removes the minmax file after model quantization was added.
After the quantized model is first loaded, the following sequence is performed:

  • save the input and output buffers of the initial fcircle model
  • reload the qcircle model into the session
  • restore the inputs and outputs

After that, inference runs on the quantized model.
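
The save/reload/restore sequence above can be sketched as follows. This is only an illustration under stated assumptions: `BufferBinding`, `Session` and `switch_to_quantized` are hypothetical stand-ins, not names from the onert code base, and the sketch ignores the buffer quantization and dequantization that the real switch may need.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a user buffer bound to a model input or output.
struct BufferBinding
{
  void *data;
  std::size_t size;
};

// Hypothetical stand-in for a session; it is NOT the real nnfw_session.
struct Session
{
  std::string model_path;
  std::vector<BufferBinding> inputs;
  std::vector<BufferBinding> outputs;

  // Assumption: loading a model drops every existing binding,
  // which is why the bindings have to be saved before the reload.
  void load(const std::string &path)
  {
    model_path = path;
    inputs.clear();
    outputs.clear();
  }
};

// Save the bindings, reload the quantized model, then restore the bindings.
inline void switch_to_quantized(Session &session, const std::string &qcircle_path)
{
  assert(!qcircle_path.empty());
  std::vector<BufferBinding> saved_inputs = session.inputs;
  std::vector<BufferBinding> saved_outputs = session.outputs;

  session.load(qcircle_path);

  session.inputs = std::move(saved_inputs);
  session.outputs = std::move(saved_outputs);
}
```

After the call, the caller's buffers are bound to the quantized model, so the next run needs no extra setup from the user.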

For running test of this draft:

BUILD_TYPE=Debug BUILD_JOBS=6 make -f Makefile.template

cd ./Product/out/unittest

./nnfw_api_gtest --gtest_filter=TestOdcAutoCompilation*

Time of the test execution: 11 ms
Peak memory footprint of the test: 4,939 Kb

ONE-DCO-1.0-Signed-off-by: Evgenii Maltsev e.maltsev@samsung.com

@hseok-oh
Contributor

I haven't checked your whole implementation yet, but here are my comments

@Torrero Torrero force-pushed the odc_hidden_switching_mechanizm branch from 967a66c to e5e7230 Compare August 1, 2024 19:14
@Torrero
Contributor Author

Torrero commented Aug 1, 2024

> Please introduce a different run API for hidden switching, not in nnfw_run.
>
> Your QDQBufferAdapter implementation looks almost the same as #13287. Please remove the functional duplication if it is the same.

@hseok-oh Thank you for pointing out your implementation of buffer quantization and dequantization. I removed this functional duplication and moved hidden_switching_mechanizm from nnfw_run to nnfw_run_odc_hsm.

@jyoungyun
Contributor

(optional)

I think it would be good to use a different function name to make it easier for the user to understand, but I am not sure about the appropriate function name.. :( It's a long name, but how about nnfw_run_with_auto_compilation? If you have a better idea for the function name, you can use the name. :)

@Torrero Torrero force-pushed the odc_hidden_switching_mechanizm branch from e5e7230 to da388c1 Compare August 2, 2024 08:08
@Torrero
Contributor Author

Torrero commented Aug 2, 2024

> (optional)
>
> I think it would be good to use a different function name to make it easier for the user to understand, but I am not sure about the appropriate function name.. :( It's a long name, but how about nnfw_run_with_auto_compilation? If you have a better idea for the function name, you can use the name. :)

@jyoungyun It's a good idea, thank you. I renamed the function to nnfw_run_with_auto_compilation and changed some descriptions.

@Torrero Torrero force-pushed the odc_hidden_switching_mechanizm branch 3 times, most recently from 7140f5b to 992deca Compare August 23, 2024 17:27
@Torrero
Contributor Author

Torrero commented Aug 23, 2024

@hseok-oh @jyoungyun
I updated this draft and added a compilation step.
It now attempts to compile the quantized model and load the compiled model; if that fails, it tries to load and run the quantized model.
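
A minimal sketch of this compile-then-fall-back order is shown below. The `SwitchHooks` callbacks, the `LoadedModel` enum and `load_best_available` are hypothetical names introduced for illustration; they only model the decision logic and do not call any real onert API.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Which model ended up loaded in the session.
enum class LoadedModel
{
  Compiled,
  Quantized,
  None
};

// Hypothetical hooks standing in for the real compile and load steps.
struct SwitchHooks
{
  std::function<bool()> compile;                 // compile the quantized model
  std::function<bool(const std::string &)> load; // load a model file into the session
};

// Prefer the compiled model; fall back to the quantized one if compiling or loading fails.
inline LoadedModel load_best_available(const SwitchHooks &hooks, const std::string &compiled_path,
                                       const std::string &quantized_path)
{
  assert(hooks.compile && hooks.load);

  if (hooks.compile() && hooks.load(compiled_path))
    return LoadedModel::Compiled;

  if (hooks.load(quantized_path))
    return LoadedModel::Quantized;

  return LoadedModel::None;
}
```

Because the hooks are plain callbacks, a test can force the compile step to fail and check that the quantized model is still loaded, which is one possible way to exercise the fallback path without a real compiler backend.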

Could you please advise how I can add a test for the compilation part?

@hseok-oh
Contributor

hseok-oh commented Aug 27, 2024

IMO, the responsibility for saving the minmax recording count threshold and checking the recording count should belong to the quantizer, not the minmax recorder. So the API should ask the quantizer whether it is ready to quantize, and the quantizer should check the recording count from the file. From this point of view, if the quantizer replies that it is ready, the API will change the exec config to stop recording and request the quantizer to actually quantize.
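
The division of responsibility proposed here could look roughly like the sketch below. Everything in it is an assumption made for illustration: `QuantizerSketch`, `ExecConfig`, `maybe_quantize` and the injected record-count reader are hypothetical, and the reader simply stands in for reading the count from the minmax file.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical quantizer that owns the threshold and the readiness check.
class QuantizerSketch
{
public:
  // The callback stands in for reading the recording count from the minmax file.
  explicit QuantizerSketch(std::function<std::uint32_t()> read_record_count)
    : _read_record_count(std::move(read_record_count))
  {
    assert(_read_record_count);
  }

  void set_threshold(std::uint32_t threshold) { _threshold = threshold; }

  // Ready once a non-zero threshold is set and enough runs have been recorded.
  bool ready_to_quantize() const
  {
    return _threshold != 0 && _read_record_count() >= _threshold;
  }

private:
  std::function<std::uint32_t()> _read_record_count;
  std::uint32_t _threshold = 0;
};

// Hypothetical execution config with a single recording switch.
struct ExecConfig
{
  bool record_minmax = true;
};

// API-side step: ask the quantizer, stop recording, then request quantization.
// Returns true when quantization was requested.
inline bool maybe_quantize(const QuantizerSketch &quantizer, ExecConfig &config,
                           const std::function<void()> &quantize)
{
  if (!quantizer.ready_to_quantize())
    return false;

  config.record_minmax = false;
  quantize();
  return true;
}
```

With this split, the API layer never inspects the recording count itself; it only reacts to the quantizer's answer.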

Additionally, the hdf5 minmax dumper is no longer used and will be removed.

@Torrero Torrero force-pushed the odc_hidden_switching_mechanizm branch 2 times, most recently from ad10e22 to 4d0ae5f Compare September 11, 2024 19:31
@Torrero
Contributor Author

Torrero commented Sep 11, 2024

@hseok-oh

I moved the readiness check for quantization and the minmax recording threshold to odc::Quantizer. A function for removing the minmax file was also added to it. (Maybe it would be better to move this function to odc::MinMaxReader, or we can define another location for it.)
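
Wherever the removal function ends up living, its core can stay small. The sketch below is a hypothetical free function, not the one in this PR; it uses `std::remove` from `<cstdio>` and, as an assumption that holds on POSIX systems, relies on `errno` being set to `ENOENT` when the file is missing.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>

// Hypothetical helper: delete the minmax file after quantization.
// Returns true if the file was removed or was already absent.
inline bool remove_minmax_file(const std::string &path)
{
  assert(!path.empty());

  if (std::remove(path.c_str()) == 0)
    return true;

  // Assumption (POSIX): errno is ENOENT when the file does not exist.
  // A missing file is not treated as a failure here.
  return errno == ENOENT;
}
```

Treating a missing file as success keeps the call idempotent, so it does not matter whether the quantizer or the reader invokes it first.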

@Torrero Torrero force-pushed the odc_hidden_switching_mechanizm branch from 4d0ae5f to 28ed907 Compare September 26, 2024 18:00
@Torrero
Contributor Author

Torrero commented Sep 26, 2024

A negative test was added.

…circle-qcircle step was implemented with compilation

Conv2D_000.circle is used as the test model (it was added only to verify this draft)

This draft contains auto compilation, which is executed in `nnfw_run_auto_compilation` function.

- `nnfw_run_auto_compilation` function was added.
- QuantizerManager, Quantizer and MinMaxReader were changed to set the minmax threshold and to check readiness for quantization. A function for removing the minmax file was added.
- The OdcInfo class was added to store the current state of the odc.
- The compilation step uses the "session::codegen" function.
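
One way to picture a class that stores the current odc state is a small forward-only state holder, sketched below. `OdcStage` and `OdcInfoSketch` are hypothetical; the actual OdcInfo class in this commit may track different or additional fields.

```cpp
#include <cassert>

// Hypothetical stages of the hidden switching flow.
enum class OdcStage
{
  Float,     // original fcircle model, minmax values are being recorded
  Quantized, // qcircle model is loaded
  Compiled   // compiled model is loaded
};

// Hypothetical state holder; not the OdcInfo class from this PR.
class OdcInfoSketch
{
public:
  OdcStage stage() const { return _stage; }

  // Minmax recording is only meaningful while the float model is running.
  bool should_record_minmax() const { return _stage == OdcStage::Float; }

  void mark_quantized()
  {
    assert(_stage == OdcStage::Float);
    _stage = OdcStage::Quantized;
  }

  void mark_compiled()
  {
    assert(_stage == OdcStage::Quantized);
    _stage = OdcStage::Compiled;
  }

private:
  OdcStage _stage = OdcStage::Float;
};
```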

A step that removes the minmax file after model quantization was added.
After the quantized model is first loaded, the following sequence is performed:

- save the input and output buffers of the initial fcircle model
- attempt to compile the quantized model and load the compiled model; if that fails, try to load the quantized model
- restore the inputs and outputs

after that

- run inference on the compiled or quantized model

Positive and negative tests were added

ONE-DCO-1.0-Signed-off-by: Evgenii Maltsev e.maltsev@samsung.com
@Torrero Torrero force-pushed the odc_hidden_switching_mechanizm branch from 28ed907 to fe7d8ce Compare October 18, 2024 07:54
@Torrero
Contributor Author

Torrero commented Oct 25, 2024

@hseok-oh PTAL
