
[RFC] AlterOpLayout Pass Refactoring #3670

Open
yzhliu opened this issue Jul 30, 2019 · 7 comments


@yzhliu (Member) commented Jul 30, 2019

This is the follow-up issue to https://discuss.tvm.ai/t/rfc-functionality-of-alteroplayout-and-possible-refactoring/ on enhancing the AlterOpLayout pass.

The motivation for the refactoring is to address the limitations of the current AlterOpLayout pass:

  • The altered op must have the same number of arguments as the original one.
  • Nested tuple arguments are not supported.
  • It cannot properly handle the case where the altered operator has a different type, e.g., conv2d replaced by conv2d+add, which @anijain2305 recently needed for quantization.

It is extremely difficult (or even impossible) to address these problems within the current design; the detailed reasons can be found in the "Expand me to read the functionality of AlterOpLayout, as well as the motivation of doing refactoring." part of the discuss post.

I would like to propose four passes to replace the current AlterOpLayout pass:

  • Layout inference pass
    Infers the layout of each layer.
    Examples:
    https://github.com/yzhliu/tvm-1/blob/refactor_alter_layout/src/relay/op/nn/convolution.cc#L144
    https://github.com/yzhliu/tvm-1/blob/refactor_alter_layout/tests/python/relay/test_pass_infer_layout.py

  • Rewrite operator pass
    This pass rewrites an operator into another (set of) operator(s), while the input and output shape/dtype/layout must remain the same. The API looks like:

    @conv2d_rewrite_op.register("cpu")
    def _rewrite_conv2d(attrs, inputs, tinfo):
        # return a new Relay expression that replaces the original conv2d call
        ...

    This can be used to convert
    (NCHW) -> conv2d -> (NCHW)
    to
    (NCHW) -> LT(NCHW->NCHW16c) -> conv2d_NCHW16c -> LT(NCHW16c -> NCHW) -> (NCHW)

  • Propagate layout pass
    This pass converts other operators to use the layout of their preceding operator. For example, it can convert
    conv2d_NCHW16c -> LT(NCHW16c->NCHW) -> add
    to
    conv2d_NCHW16c -> LT(NCHW16c->NCHW) -> LT(NCHW->NCHW16c) -> add -> LT(NCHW16c->NCHW)
    The API looks like (rules can also be pre-defined):

    @add_propagate_layout()
    def propagate_layout_add(origin=["NCHW", "CHW"], preferred=["NCHW16c", "CHW16c"]):
        ...
  • Peephole pass
    Removes unnecessary layout transform operators. For example, it can convert
    conv2d_NCHW16c -> LT(NCHW16c->NCHW) -> LT(NCHW->NCHW16c) -> add -> LT(NCHW16c->NCHW)
    to
    conv2d_NCHW16c -> add -> LT(NCHW16c->NCHW)
    (a runnable toy sketch of this step follows the list)
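As a concrete illustration of the peephole step, here is a minimal, self-contained toy sketch in plain Python (not the proposed Relay API; the (op, src, dst) tuple encoding and op names are made up for illustration). It cancels adjacent, mutually-inverse layout transforms and reproduces the example above:

    def peephole_cancel_layout_transforms(ops):
        """ops is a list of ("lt", src, dst) or (op_name,) tuples; returns a new
        list with adjacent, mutually-inverse layout transforms removed."""
        out = []
        for op in ops:
            if (out and op[0] == "lt" and out[-1][0] == "lt"
                    and out[-1][2] == op[1] and out[-1][1] == op[2]):
                out.pop()  # LT(A->B) followed by LT(B->A): the pair cancels
            else:
                out.append(op)
        return out

    seq = [("conv2d_NCHW16c",),
           ("lt", "NCHW16c", "NCHW"),
           ("lt", "NCHW", "NCHW16c"),
           ("add",),
           ("lt", "NCHW16c", "NCHW")]
    print(peephole_cancel_layout_transforms(seq))
    # [('conv2d_NCHW16c',), ('add',), ('lt', 'NCHW16c', 'NCHW')]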

@tqchen @merrymercy @anijain2305

@anijain2305 (Contributor) commented Aug 6, 2019

What do you guys think about having Layout as a member of TensorType? Currently, Type basically means dtype and shape. I think it would be very useful to have Layout there as well. If that's the case, the Legalize API would get arg_dtypes, and thus layouts, enabling transformations based on input layouts.

This also means infer_type gets more complicated, as it would have to infer layouts as well. However, I think it should be fine to leave some Layouts as undefined/non-inferable (something like that).
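For illustration only, a hypothetical sketch in plain Python (not the actual Relay TensorType or Legalize signature; all names here are made up) of what carrying a layout in the tensor type and dispatching on it in a legalize callback could look like:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class TensorTypeWithLayout:          # hypothetical: shape + dtype + layout
        shape: Tuple[int, ...]
        dtype: str
        layout: Optional[str] = None     # None models "undefined / not inferable"

    def legalize_conv2d(arg_types):
        # illustrative dispatch on the (hypothetical) layout field
        data = arg_types[0]
        if data.layout == "NCHW":
            return "rewrite to conv2d_NCHW16c"
        return "keep as-is"

    print(legalize_conv2d([TensorTypeWithLayout((1, 3, 224, 224), "float32", "NCHW")]))
    # rewrite to conv2d_NCHW16c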

@tqchen @merrymercy @zhiics

@tqchen (Member) commented Aug 9, 2019

Right now the design decision is not to put Layout into the type system, because most of the original deep learning programs do not have layouts; they are implied by the operators. We could, however, provide a separate column of layout attributes that gets passed in during rewriting if we need it later.

@anijain2305 (Contributor) commented Aug 9, 2019

If it's ok, I will give a couple of reasons why I think treating layouts as first-class citizens is important (the world can do with one more opinion :) )

  • It seems to me that layout was an afterthought for the frameworks. They started with just one layout; as deep learning progressed, we realized we need more layouts. For backward compatibility, we are now in this somewhat awkward state.
  • Another strong reason is that frameworks care about layouts on a per-op basis, but a deep learning compiler cares about layouts on a per-network basis. As a major deep learning compiler, we might be in a unique position to address layout handling.

That being said, I agree that as far as the Legalize API goes, we can add the layouts as another argument. That should be enough for our purposes.

Please also let me know if there was an Issue or design discussion for layouts.

@tqchen (Member) commented Aug 9, 2019

I agree that layouts need to be treated carefully and that handling them deserves major effort on the compiler optimization side.

Technically, the only question is how to store this information:

  • Types are stored in a row-based format (each Expr has a type).
  • Layout and other attributes could be stored in a column-based format (just like a dataframe) in the function's attributes.

We could have an InferLayout pass that takes in a Function and returns a new Function with an additional function attribute layout (which is a Map<Expr, Layout>).
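A toy model of this column-based idea, in plain Python rather than Relay (the class names and the traversal are simplified stand-ins): layouts live in a per-function side table keyed by expression, instead of inside each expression's type.

    class Expr:
        def __init__(self, name):
            self.name = name

    class Function:
        def __init__(self, body, attrs=None):
            self.body = body             # simplified: a flat list of Exprs
            self.attrs = attrs or {}

    def infer_layout(func):
        """Return a new Function whose attrs carry an {expr: layout} map."""
        layouts = {}
        for expr in func.body:           # stand-in for a post-order traversal
            layouts[expr] = "NCHW" if "conv" in expr.name else "undef"
        return Function(func.body, {**func.attrs, "layout": layouts})

    f = infer_layout(Function([Expr("conv2d"), Expr("relu")]))
    print({e.name: l for e, l in f.attrs["layout"].items()})
    # {'conv2d': 'NCHW', 'relu': 'undef'}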

@anijain2305 (Contributor) commented Aug 9, 2019

I see. I missed the implementation-detail point. My first preference is to place it inside Type (but I guess that may not be the preferred choice as of now, given how frameworks handle layouts).

The second option that you give is pretty good too. However, how do we read the layout, for example, in a CallNode? If I want to pass layouts to the Legalize API for a particular call node, the node would not know which function it belongs to (correct me if I am wrong).

@kevinthesun (Contributor) commented Aug 12, 2019

To support CV models in TensorFlow/TFLite, do we need to add propagate rules for ops to support conversion from "NHWC" to "NCHW"? If so, would it be easier to add these rules as operator attributes?
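One possible shape of that, sketched as a toy registry in plain Python (this is not TVM's actual operator-attribute mechanism; all names are illustrative): each op registers its own layout-propagation rule as an attribute that the propagate pass looks up.

    PROPAGATE_RULES = {}                 # toy stand-in for per-op attributes

    def register_propagate_rule(op_name):
        def wrap(rule):
            PROPAGATE_RULES[op_name] = rule
            return rule
        return wrap

    @register_propagate_rule("add")
    def propagate_add(origin, preferred):
        # add is layout-agnostic, so each input simply adopts its preferred layout
        return dict(zip(origin, preferred))

    print(PROPAGATE_RULES["add"](["NHWC", "HWC"], ["NCHW", "CHW"]))
    # {'NHWC': 'NCHW', 'HWC': 'CHW'}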

@anijain2305 (Contributor) commented Aug 16, 2019

@yzhliu I gave it some more thought over the last few days. I think there might be a slightly better way to deal with layouts.

  • Instead of directly using transpose operators, maybe we can use new annotation ops like annotate.change_layout (or maybe reuse layout_transform). Each one would have a src and a dst layout. For each op that goes through legalization, we can insert one change_layout at the start and one at the end.

  • Now, instead of propagating the layout, I would hoist/sink the change_layout annotation ops past ops that are layout-agnostic. This is similar to Loop Invariant Code Motion (or here, if you allow me, Layout Invariant Code Motion ;-) ). We can hoist these annotation ops as far as possible. Therefore, you would not insert any new ops; you would just try to move them to the right place.

  • If done properly, the end state will have back-to-back change_layout ops. A peephole pass can then remove them, and a realize pass can convert the remaining annotation ops into a sequence of existing Relay ops.

This should work for both TF and MXNet, i.e., support both NCHW and NHWC layouts.
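To make the hoist/sink idea concrete, a small self-contained toy in plain Python (same illustrative tuple encoding as the peephole sketch above, with made-up op names): change_layout annotations are pushed past layout-agnostic ops until inverse pairs meet and can be cancelled by the peephole.

    LAYOUT_AGNOSTIC = {"relu", "add"}    # illustrative set of layout-agnostic ops

    def sink_change_layout(ops):
        ops = list(ops)
        changed = True
        while changed:
            changed = False
            for i in range(len(ops) - 1):
                cur, nxt = ops[i], ops[i + 1]
                if cur[0] == "change_layout" and nxt[0] in LAYOUT_AGNOSTIC:
                    ops[i], ops[i + 1] = nxt, cur  # push the annotation downward
                    changed = True
        return ops

    seq = [("conv2d_NCHW16c",),
           ("change_layout", "NCHW16c", "NCHW"),
           ("relu",),
           ("change_layout", "NCHW", "NCHW16c"),
           ("conv2d_NCHW16c",)]
    print(sink_change_layout(seq))
    # relu moves above the first annotation, so the two inverse change_layout
    # ops become adjacent and the peephole can remove them.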
