Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIR] Introduce tir.allocate_const to TIR #8472

Closed
wants to merge 1 commit into from

Conversation

d-smirnov
Copy link
Contributor

This PR is adding non-scalar constant representation in TIR. This is used to
express constants (i.e., parameters) in the TIR instead of bypassing the
TIR as it's done until now.

Co-authored-by: Giuseppe Rossini giuseros85@gmail.com

@tqchen
Copy link
Member

tqchen commented Jul 14, 2021

Thanks @d-smirnov . Given that this will bring a non-trival change to the key TIR data structure, it might be worth discussing it first. I realized that it was touched a bit in the USMP RFC but slipped through since initially I thought it was a feature proposal without touching the IR itself. We can open a new discussion thread or followup on that thread.

In particular, up until now we have avoided the constant node in the low level because the specific implementation can really differ in different backends. Specific allocations might and data copy might be needed on devices with special memory. So from the general architecture PoV we want to avoid the allocate constant node. But instead have a way to point to the global data section.

It would be great to discuss the motivation and possible alternatives. My understanding is that we have a potential need to embed parameters into the data segments. We might be able to do things differently. e.g. introduce a constant data segment to IRModule and allow TIR to refer to these global Vars.

cc @areusch @manupa-arm who I believe was previously involved in the conversation.

@tqchen tqchen added the status: need RFC need RFC discussion label Jul 14, 2021
@manupak
Copy link
Contributor

manupak commented Jul 14, 2021

Hi @tqchen ,

Thanks for taking a look at this.

Our initial plan is to use this to support link-param implementation via faithfully expressing them in the TIR -- thus it could get lowered to using current link-param code generation (There is a subsequent PR to this effect). In a way, this will "bound" to IR only if the user provides link-param argument. If the specific implementation requires, the constants to be specially handled in the application layer, we think that is use-case where --link-params is not provided, in which case FuseOps will lift the constant tensor out of the IR into arguments (p0, p1, etc)

One of the motivation, this allows compiled constant artifacts to be expressed in the IR (e.g. compressed weights). Moreover, it allows the USMP to pool constants into pool in a clear fashion with ultimate goal of exposing them out of the "main" function.

Therefore, I dont think ability to express non-scalar constants always force us to define a general lowering for all targets rather it allows the ability to perform analysis/mutations/pooling of such constants before they being lifted out.

WDYT ?

@tqchen
Copy link
Member

tqchen commented Jul 14, 2021

Thanks @manupa-arm . Changing the IR definition can create a bigger impact than attaching optional attrs etc. Since the new node introduce semantics that existing passes may not taken into account. So it would be indeed great to put some thoughts into this. We need to put extra care into this to maintain architecture consistency across all the sub areas of interest.

My current read is that the function attrs approach can also be used to run the same optimization mentioned(pooling the constant). Except that rewrite needs to be done which takes the function attrs and intrinsics. I would still recommend us take that route first.

Given the common need of embedding artifacts, we might want to find a way to express global constant artifact in the IRModule, perhaps as Map<GlobalVar, DataSegment>, and referring them in the IR. This way could be cleaner than inlining the constant directly.

@manupak
Copy link
Contributor

manupak commented Jul 14, 2021

Hmmm, we do view the tir.allocate_const having closer semantics to tir.allocate, the difference being the former have data inside of it unlike the tir.allocate.

The existing TIR anyway treats non-scalar constant data as opaque vars passed in, therefore, I dont quite follow why existing passes get effected. (because we are expressing more information about a var that used to be an input).

Despite not seeing the effect on the existing passes, the alternatives would be to define the constant data within a boundary of a PrimFunc. The ability to define constants within the scope of PrimFunc allows specialized lowering of Relay to TIR express the compiled constants with proximity to the PrimFunc while enabling the use of TIR (USMP or otherwise) Passes of TVM. To explore the alternative you mentioned a bit, what do you think about positioning that Map inside the PrimFunc or using function attrs ?

@tqchen
Copy link
Member

tqchen commented Jul 14, 2021

I think we could start with PrimFunc attrs to store that Map<String, Data>, and then associate the Vars with the name.

@tvm.script.tir
def myfunc():
    tir.attrs({
        "link_params": {"model0": array} 
    })        
   my_param_var = tir.get_link_param("model0")
   ...

@manupak
Copy link
Contributor

manupak commented Jul 15, 2021

Hi @tqchen,

After considering the suggestions, these are few options we could think of :

C1 :

@tvm.script.tir
def myfunc():
    tir.attrs({
        "link_params": {"model0": array} 
    })        
   my_param_var = tir.get_link_param("model0")

C2 :

@tvm.script.tir
def myfunc():
    tir.attrs({
        "link_params": {my_param_var: array} 
    })        

C3 :

@tvm.script.tir
def myfunc():   
   param = tir.allocate_const([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], "int32", [10])

First of all, we do feel all the options could achieve what we want here.

If we consider C1, it seems to hide behind a tir builtin as a level of indirection to obtain 'array'. Therefore, between C1 & C2, we feel C2 is better and we do not see the requirement for that level of in-direction.

Between, C2 and C3, 'array' need to express information such as data, datatype, alignment, etc. Thus, we do have to introduce a specialized buffer_var (or something to that effect) to hold those information. Therefore, we feel tir.allocate_const is cleaner and intuitive.

Regarding the cons of C3, would you be able to elaborate how this affects negatively (where the compilation requires constants to be bound to the IR) in other areas/passes ? We cannot figure out how certain mutators and visitors get affected by adding an extra leaf node to IR w.r.t to other approaches mentioned here.

cc : @giuseros @mbaret

@areusch
Copy link
Contributor

areusch commented Jul 15, 2021

@manupa-arm @d-smirnov I believe your primary goal as I understand it is to encode the constant data in TIR so that post-scheduling visitors can modify the IR and the data based on whole-program analysis prior to exporting the model parameters, is that correct? it would be great to capture this in an e.g. RFC or design doc. I agree that --link-params constants are a reasonable place to start. I'm not sure what the right IR construct is, but perhaps we could consider it in terms of this longer-term goal rather than the immediate goal of extracting --link-params from TIR constants.

Traditionally, params were fetched by the executor. With AOT, we now have the option to encode much of the executor logic in TIR; this means though that we need to be careful not to over-complicate the various codegen. I think this latter concern isn't generally a problem if we are judicious with the things we add to TIR. Finally, we need to ensure that post-scheduling visitors won't break given the new TIR nodes. Given constants are not currently encoded in post-schedule TIR, I think the main case we need to consider is when a pass expected to find params by inspecting a PrimFunc param, but now needs to also inspect something else.

I agree with @tqchen it would be good to discuss the longer-term vision on the forum or with an RFC, even if it is small. We also need to consider how other executors could be coupled with passes that modify the constant data. We don't want to create special cases in TIR for specific executors when we don't need to.

@tqchen
Copy link
Member

tqchen commented Jul 15, 2021

Thanks everyone. While it is indeed possible that adding a new TIR node won't impact the current flow, changing the IR data structure itself is indeed a major architecture change that we need to think more thouroughly. Since a stable set of IR data structure and semantics would help us in the long run.

Just want to clarify the situations. We can also roughly map the solutions to:

  • C3: Introduce a new IR node
  • C2: embed into attrs, but allow attrs to refer to the IR node
  • C1: embed as meta data and use instrinsics to refer to it.

For the particular application in mind usually, C3 certainly brings the least amount of indirection. However, it also brings deeper surgery to the IR architecture itself. In the extreme case, one might fall into a that introduces new IR node for each kind of application to touch meta-data.

The main possible problem in C2 is that we want to make sure meta data in attrs remains constant during rewrite, so that the my_param_var won't get rewritten to something else, and passes does not need to consider the context to check if all variables are bound.

C1 is the approach that falls within the current architecture of meta data handling. While I agree that it may not be the most direct option at the moment, it will bring the least amount of impact to the TIR architecture itself. We also want to make sure that the metadata handling mechanism works for cases like this and possible future ones, and improvement if things does not work out well.

Popping up a level, we want to carefully think about the general need to referring to a constant meta data, and perhaps build some more native mechanism than C1, but also generalizes the particular case of allocate_const. For example, maybe we want to allow relay and TIR to share a same mechanism of such constant data reference. That is the long term solution that we might want to think about along the directions(e.g. bring the constant map as part of IRModule).

@manupak
Copy link
Contributor

manupak commented Jul 16, 2021

Thanks everyone! great discussions.

I ll first reply to @areusch comments.

believe your primary goal as I understand it is to encode the constant data in TIR so that post-scheduling visitors can modify the IR and the data based on whole-program analysis prior to exporting the model parameters, is that correct? it would be great to capture this in an e.g. RFC or design doc.

Yes this is correct. Once we conclude the discussion, we are happy to create a pull request to tvm-rfcs. For the discussion, I feel its better to have it here rather than going back and forth via discuss forum -- so its easier for reader to follow the discussion later.

I'm not sure what the right IR construct is, but perhaps we could consider it in terms of this longer-term goal rather than the immediate goal of extracting --link-params from TIR constants.

Agreed, though --link-params kind of mean to bind the params to the IR --> then to the codegeneration of operators itself (unless they are pooled and pulled out). Yes, the subsequent PR will support the code generation as well for linked constants :).

Given constants are not currently encoded in post-schedule TIR, I think the main case we need to consider is when a pass expected to find params by inspecting a PrimFunc param, but now needs to also inspect something else.

I dont believe this (a pass expected to find params by inspecting a PrimFunc param) is possible until we embed the constants in TIR for TIR passes.

We also need to consider how other executors could be coupled with passes that modify the constant data. We don't want to create special cases in TIR for specific executors when we don't need to.

Agreed. Actually, the work we are trying to do here makes the params insensitive to the executors when the user wants them linked to the operators. Therefore, we dont see how embeddeding constants to TIR is executor specific.

@manupak
Copy link
Contributor

manupak commented Jul 16, 2021

Hi @tqchen,

Thanks for the elaboration on the choices that we could take to support constants in TIR.

For the particular application in mind usually, C3 certainly brings the least amount of indirection. However, it also brings deeper surgery to the IR architecture itself. In the extreme case, one might fall into a that introduces new IR node for each kind of application to touch meta-data.

I think it should be stated --link-params are just one of useful usage of having the ability to express constant data in TIR. However, having the ability to express constant in TIR allows to introduce constants that are properties of the operators, mutations required by the HW, etc. Therefore, this is an extension of the IR rather than a surgery to the IR. I'd view surgery as rooting out an established TIR construct and relocating/redefining with to support more features/usecases. Moreover, the process of compilation for variety of targets will include many constants that undergoes different transformation for optimization and also to be prepared as requirement for the HW to execute/decode them.

I think we could agree that having to introduce an IR node for each meta-data (of the model) might not be desirable. However, we feel ability to express constants in the TIR goes beyond them being meta-data (for ones to be considered meta-data). This is because the constants does not necessarily are meta-data of the model. Frankly, Im not even sure why we call constants involved in a computation as meta-data ?

Therefore, its better to understand why we are categorizing constant non-scalar data as meta-data (always) is quite important in making the decision.

The main possible problem in C2 is that we want to make sure meta data in attrs remains constant during rewrite, so that the my_param_var won't get rewritten to something else, and passes does not need to consider the context to check if all variables are bound.

So to summarize, the tir builtin will provide a write-acess restriction, correct? I feel that is an overkill where we should use a 'const' specifier (or qualify the buffer_var to just loadable and not storeable) to restrict if thats required. Then again, not sure what value it brings because users will not write TIR directly -- unlike 'const' specifier in other languages.

C1 is the approach that falls within the current architecture of meta data handling. While I agree that it may not be the most direct option at the moment, it will bring the least amount of impact to the TIR architecture itself. We also want to make sure that the metadata handling mechanism works for cases like this and possible future ones, and improvement if things does not work out well.

I'd agree with this if we have good reasons why constant non-scalar data is always treated as metadata. That is dependent on the resolution for the above point. Then again, I think we should welcome positive extention to the IR which does not involve a surgery.

Popping up a level, we want to carefully think about the general need to referring to a constant meta data, and perhaps build some more native mechanism than C1, but also generalizes the particular case of allocate_const. For example, maybe we want to allow relay and TIR to share a same mechanism of such constant data reference. That is the long term solution that we might want to think about along the directions(e.g. bring the constant map as part of IRModule).

While all meta-data are constants, we would need to think whether all constants are meta-data :) . Again, related to the same point above.

Let say, we had good reasons to decide all constants are meta-data. Then, we'd propose not to pull out constants that are accessed by primfuncs early on to IRModule scope -- that could happen later. This allows different compilation flows to contribute the constants for global optimizations later on.

@tqchen
Copy link
Member

tqchen commented Jul 16, 2021

Thanks @manupa-arm . I totally agree that we need to evolve the IR design itself. Constants are indeed special, that would require some careful thinking, I just meant to say that it would take more thoughts and possibly some lessons in less impactful way(e.g. the attribute approach)

For example, assume that if we decided that it is a good idea to introduce constant in TIR, there are quite a few set of considerations that needs to go into them:

  • Q0: What is the interaction model with the high level TIR schedule(e.g. do we need to allocate multi dimensional constant buffer)
  • Q1: What if we want to express constant that are shared across functions(in which case inling the data may not be the right approach)
  • Q2: What is the semantics (when the scope itself is special and requires the loading pattern).

Different considerations of the above Qs might come to a different conclusions. For example, considering Q1 we may not inline the data into the allocate_const node. It would be important to make an informed choice at some time point and stick with that one and avoid introducing a node then change it later.

Our current consideration to constants so far is that most of the transformations happens at the relay level, and at the level of TIR, there are a diverse set of operations involved to allocate, initialize load the constant before use, and it may be hard to converge on a single Q2. As a result, we usually represent constant as a var.

So to summarize:

  • We understand it is important to keep an eye on how do we handle and transform constant that might bring architecture changes.
  • Because it is hard to make an immediate choice, and we can start with an less impactful approach via attrs, we could unblock by going with that approach.
  • Once we have learnt enough lessons, we can then come back and revisit this matter, introducing architecture changes to the IR or IRModule.

This PR is adding non-scalar constant representation in TIR. This is used to
express constants (i.e., parameters) in the TIR instead of bypassing the
TIR as it's done until now.

Change-Id: Id3afc4d7197260cb43ecde60f05ccbce3fc42430

Co-authored-by: Giuseppe Rossini <giuseros85@gmail.com>
Change-Id: Id4a09a637c9c1fd7d49989c6c10f474a78569e18
@manupak
Copy link
Contributor

manupak commented Jul 20, 2021

Hi @tqchen,

Thanks for explanation of the concerns. I agree that they are valid concerns.

Q0: What is the interaction model with the high level TIR schedule(e.g. do we need to allocate multi dimensional constant buffer) ?

By this if you mean what we currently know as TE, then we can bound the constants post-scheduling to the TIR, Can't we ?. I may not have the full picture in a world where whole of TE disappears. An elaboration would be more than welcome here :) .

Q1: What if we want to express constant that are shared across functions(in which case inling the data may not be the right approach)

While params are one of the type of constants that could appear in a TIR PrimFunc, I feel global constant sharing is a subsequent optimization for merge duplicate global constants. (Similar to LLVM's ConstantMerge).

Q2 is seems a bit broader point that is based on there would be different interpretation for Q1 and Q0.

Different considerations of the above Qs might come to a different conclusions. For example, considering Q1 we may not inline the data into the allocate_const node. It would be important to make an informed choice at some time point and stick with that one and avoid introducing a node then change it later.

If the constants in consideration are params, then there is a choice whether to inline them or not. However, if the constants are generated in the process of compilation, we might need something like ConstantMerge, to achieve the objective of Q1. I generally agree its better to know most possibilities before defining a IR node, but its not clear to me how varied the representation of the constants could be in TIR as a IR node.

Our current consideration to constants so far is that most of the transformations happens at the relay level, and at the level of TIR, there are a diverse set of operations involved to allocate, initialize load the constant before use, and it may be hard to converge on a single Q2. As a result, we usually represent constant as a var.

We could also use transformations at a TIR level by enabling the representation of constants in TIR as well. For the semantics, how will it differ from a tir.allocate apart from the buffer_var being read-only (non store-able) and has values loaded by definition ? I dont particularly see mechanics of Q1 having an impact on how constants should be defined in TIR.

Let's say if we did decide to delay C3,
between C1 and C2, why is C2 different from loading an input var that matched to a buffer_var, where the latter does not need an intrinsic like tir.get_input_param() ?

Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manupak
Copy link
Contributor

manupak commented Dec 13, 2021

@d-smirnov I suppose this is the first PR that needs to go in ?
(I m waiting to review this though I've seen updates to the PR that depends on this).

Let me know what PR should I take a look ?

@manupak manupak added status: need update need update based on feedbacks and removed status: need RFC need RFC discussion labels Dec 13, 2021
@manupak
Copy link
Contributor

manupak commented Jan 13, 2022

This is superseded by #8509 .

@manupak manupak closed this Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: need update need update based on feedbacks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants