
Enable const folding transformation for QNN operations #9164

Closed

Conversation

@apeskov (Contributor) commented Sep 30, 2021

Currently the sequence const -> qnn.quantize is not treated as a constant subgraph. The suggestion is to allow the FoldConstant pass to replace this pattern with a single int8 constant tensor.

Reason: some BYOC runtimes may require weights to be constant tensors. The FoldConstant limitation pointed out above may break the applicability of such BYOC runtimes.
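
For illustration, a minimal sketch (not taken from the PR's test code) of the pattern and the folding being proposed:

```python
import numpy as np
import tvm
from tvm import relay
from tvm.relay import transform

# const -> qnn.quantize: a float32 constant weight quantized inside the graph
weight_fp32 = relay.const(np.random.rand(16, 16).astype("float32"))
quantized = relay.qnn.op.quantize(weight_fp32,
                                  relay.const(0.05, "float32"),  # output_scale
                                  relay.const(0, "int32"),       # output_zero_point
                                  out_dtype="int8")
mod = tvm.IRModule.from_expr(relay.Function([], quantized))

# With this change, FoldConstant should collapse the pattern into a single int8 constant.
mod = transform.FoldConstant()(mod)
```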

Signed-off-by: Alexander Peskov <peskovnn@gmail.com>

@manupak (Contributor) left a comment

Thanks @apeskov !

It is an interesting change.

However, looking at the test case, I am struggling to understand why this cannot be achieved by running Legalize before the FoldConstant pass. Would you be able to share some motivation?

func = relay.Function([x], add)
return func

zz = run_opt_pass(before(), transform.FoldConstant())

How is this different from running Legalize followed by FoldConstant?
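
For reference, a rough sketch of what I mean (my own illustration; `mod` is assumed to be an IRModule holding the const -> qnn.quantize graph from the test case):

```python
from tvm.relay import transform

# assumption: `mod` already contains the const -> qnn.quantize pattern from the test
mod = transform.Legalize()(mod)      # lowers qnn.quantize into plain Relay ops
mod = transform.FoldConstant()(mod)  # then folds the now-constant subgraph
```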

@masahi (Member) commented Oct 7, 2021

@apeskov Please see PR #9135. I understand why you want to do this, namely to constant-fold quantize(weight_fp32) in a QNN graph. Returning float32 weights from the PyTorch frontend and relying on Relay constant folding to recover quantized weights was my design mistake. Now you can obtain quantized weights directly from the frontend (we quantize at the numpy level).
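
Roughly speaking (my own sketch, not the exact code from #9135), quantizing at the numpy level means the frontend does something like this instead of emitting qnn.quantize(weight_fp32) into the graph:

```python
import numpy as np

def quantize_weight(w_fp32, scale, zero_point):
    # affine quantization performed eagerly on the numpy array
    q = np.round(w_fp32 / scale) + zero_point
    return np.clip(q, -128, 127).astype("int8")

w_int8 = quantize_weight(np.random.rand(16, 16).astype("float32"), 0.05, 0)
```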

@manupa-arm Running lowering before const fold is not acceptable when we want to keep the rest of the QNN graph intact (for BYOC), while selectively lowering constant subgraphs and evaluating them.

@manupak (Contributor) commented Oct 7, 2021

@masahi What is stopping us from running the legalization on an IRModule containing just the external function? i.e. in relay.ext.<codegen>

@masahi (Member) commented Oct 7, 2021

I'm thinking of use cases in BYOC where we want to pattern match against QNN ops, in which case we don't want to run QNN legalization. Not sure if this answers your question @manupa-arm

In my previous job, I directly took QNN subgraphs and sent them to an external codegen. I believe ethos-N does something similar. We had to develop constant folding on the external codegen side.
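
As a rough sketch of that kind of flow (the pattern and target names here are hypothetical, and `mod` is an assumed pre-existing QNN module), matching qnn.conv2d directly for partitioning is exactly why legalization can't run first:

```python
from tvm import relay
from tvm.relay.dataflow_pattern import is_op, wildcard

def qnn_conv2d_pattern():
    # qnn.conv2d(data, weight, input_zero_point, kernel_zero_point,
    #            input_scale, kernel_scale)
    return is_op("qnn.conv2d")(wildcard(), wildcard(), wildcard(),
                               wildcard(), wildcard(), wildcard())

patterns = [("my_codegen.qnn_conv2d", qnn_conv2d_pattern())]
mod = relay.transform.MergeComposite(patterns)(mod)   # `mod` assumed to hold the QNN graph
mod = relay.transform.AnnotateTarget("my_codegen")(mod)
mod = relay.transform.PartitionGraph()(mod)
```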

@manupak (Contributor) commented Oct 7, 2021

Hmmm, I was not suggesting to run the legalization before the partitioning.

We could identify the patterns with QNN ops and then partition out the external function.
I believe the requirement to do legalization + constant folding comes post-partitioning.

So there are two places we could do this:

  1. post-PartitionGraph pass, in the partition_for_* function
  2. relay.ext.<codegen>, which gets the partitioned function passed in via the relay.build process

I believe this particular requirement is to mutate only the external function and not 'main' (thus we could do it in 2) but not in 1)). Therefore, why can't we achieve the same effect by running the two passes -- legalization + constant folding -- in 2)?
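
A rough sketch of what I mean by 2) (the codegen name here is hypothetical, not code from this PR):

```python
import tvm
from tvm import relay

@tvm.register_func("relay.ext.my_codegen")  # hypothetical BYOC codegen hook
def my_codegen(func):
    # build an IRModule containing only the partitioned external function
    mod = tvm.IRModule.from_expr(func)
    mod = relay.transform.Legalize()(mod)      # affects only this function
    mod = relay.transform.FoldConstant()(mod)
    # ... translate mod["main"] into the external runtime.Module and return it ...
```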

@masahi (Member) commented Oct 7, 2021

Hmm interesting, I never thought about doing constant folding on partitioned functions. My use cases have always been doing constant folding on 'main', before partitioning. For example, that was the case in the PyTorch frontend before #9135, which always produced something like qnn.quantize(const_weight_fp32). The other case is QNN produced by the FakeQuantizationToInteger pass, which also generates many qnn.quantize ops with constant weights.

In 2), if we run legalization on partitioned functions, wouldn't that decompose all QNN ops? I couldn't easily extract qparams anymore, for example. I needed to retain QNN ops all the way until I translated them to the external IR, so running legalization was never an option for me. I did wish we could selectively lower only the const-foldable QNN subgraphs. Maybe I'm missing something.

@manupak (Contributor) commented Oct 7, 2021

> In 2), if we run legalization on partitioned functions, wouldn't that decompose all QNN ops?

I think the constant folding pass is supposed to work on the IRModule (with the external function). Therefore, everything in the IRModule will be affected. However, we could create IRModules containing only what is in scope for the transformation.

> I needed to retain QNN ops all the way until I translated them to the external IR, so running legalization was never an option for me. I did wish we could selectively lower only the const-foldable QNN subgraphs. Maybe I'm missing something.

It is about the granularity at which one would do further partitioning. Today, I think we need to do further partitioning to achieve this. However, whether we want annotations to block constant folding seems like an interesting but orthogonal conversation to this one.

Within the scope of the changes in this PR, I feel it does the same thing (destroys QNN info in the process of constant folding). However, we could control what we pass into the constant folding pass.

@masahi (Member) commented Jan 9, 2022

@apeskov please update or close this PR

@masahi (Member) commented Jan 19, 2022

@apeskov Please update or close this PR

@areusch (Contributor) commented Apr 8, 2022

closing due to inactivity, feel free to reopen
