[TableGen] Add support for DefaultMode in per-HwMode encode/decode. #83029

nvjle · 2024-02-26T16:41:13Z

Currently the decoder and encoder emitters will crash if DefaultMode is used within an EncodingByHwMode. As can be done today for RegInfoByHwMode and ValueTypeByHwMode, this patch adds support for this usage in EncodingByHwMode:
let EncodingInfos =
EncodingByHwMode<[ModeA, DefaultMode], [EncA, EncDefault]>;

Currently the decoder and encoder emitters will crash if DefaultMode is used within an EncodingByHwMode. As can be done today for RegInfoByHwMode and ValueTypeByHwMode, this patch adds support for this usage in EncodingByHwMode: let EncodingInfos = EncodingByHwMode<[ModeA, DefaultMode], [EncA, EncDefault]>;

nvjle · 2024-02-26T17:02:31Z

I hope one of @topperc @jyknight @jmolloy @0x59616e @wangpc-pp (based on recent commits or past reviews to the same general area) might review this small patch, thank you.

topperc · 2024-02-26T18:43:02Z

llvm/utils/TableGen/CodeGenHwModes.h

@@ -52,6 +52,11 @@ struct CodeGenHwModes {
    assert(Id != 0 && "Mode id of 0 is reserved for the default mode");
    return Modes[Id - 1];
  }
+  const StringRef getModeName(unsigned Id, bool IncludeDefault = false) const {


Drop const. It doesn't do anything since it's returning by value.

Drop const. It doesn't do anything since it's returning by value.

Done.
And could I again trouble you (I'll soon ask for write privs) to merge on my behalf?

jyknight

This comment isn't exactly related to this patch, but, I'm wondering what you are intending to use this feature for, and what the long-term strategy for EncodingByHwMode is.

I had actually been considering proposing to remove it entirely, because it seems pretty incompatible with how tablegen instruction specification normally works.

In particular, an instruction encoding is generally shared between instructions which use a constant value for a given field, and those which use it as an operand. E.g. take RISCV's "RVInstR", which defines rs1, rs2, and rd fields. In different subtypes, it might be populated via an instruction operand, or e.g. via "let rd = 0".

But with the EncodingByHwMode, the constant "let" stops working, because the encoding's named fields are no longer part of the tablegen record for the instruction.

Any thoughts?

nvjle · 2024-02-26T21:06:19Z

This comment isn't exactly related to this patch, but, I'm wondering what you are intending to use this feature for, and what the long-term strategy for EncodingByHwMode is.

We are already using this feature (that is, EncodingByHwMode) and have been for quite some time in a production downstream backend. This backend targets a very complicated, long-lived, evolving ISA and has encodings that can (and already has, and will) change from hardware generation-to-generation/version-to-version according to the needs of the computer architects in our organization. One could literally have an instruction set today with encoding A, and a year later (or less), have largely the same instruction set but with a vastly different encoding. Hardware efficiency concerns (speed, power, etc) are a big reason for this.

For most existing common ISAs, it is almost unthinkable that the encodings would substantially change beyond a handful of instructions-- and the encodings are basically forever. But for some machines (e.g., the ones I am targeting) this would overly constrain and handcuff the architects when trying to meet extreme performance goals.

I believe we're using this feature as it was intended. Maybe one of way of thinking about it as a way to make encodings independently configurable for a given opcode. One could, say, bake-in a prefix or postfix for every such opcode (e.g., MYINSN_MODEA, MYINSN_MODEB, ...) for every mode. That is clearly terrible-- at least when one has many such instructions.
For example, there are many possible passes in our backend (let's just say peephole-like optimizers for discussion's sake) that refer to these opcodes. That code ideally just references one instruction MYINSN rather than having to consider all variants of what are really just one instruction-- and all of that code must be updated if the encodings change in the future. It is less error-prone, reduces code duplication and maintenance.

Although we did not introduce EncodingByHwMode, we certainly find it to be an extremely useful feature that would cause us grief if it were gone-- and that would definitely make an already complicated backend even more so. We're not the only downstream backend using it.

From that point of view, the long-term strategy is to continue using this wonderful feature-- and time permitting-- contribute any fixes or improvements upstream. Some of these fixes have been sitting locally in at least one downstream backend. I personally am trying to reduce the impedance mismatch of "living downstream" so our merges go smoother. That-- and I think the whole community benefits. In my personal case, I've used this feature in two different backends for two different organizations-- making some of the same fixes twice.

Further, it is my plan/hope that the backend discussed above is eventually upstreamed so that LLVM will have an in-tree target using the feature. The timing of that goes all the way up to near deity-levels in the organization, so in the meantime, I hope to continue to push enhancements we make upstream (including unit tests to catch bitrot).

I had actually been considering proposing to remove it entirely, because it seems pretty incompatible with how tablegen instruction specification normally works.

In particular, an instruction encoding is generally shared between instructions which use a constant value for a given field, and those which use it as an operand. E.g. take RISCV's "RVInstR", which defines rs1, rs2, and rd fields. In different subtypes, it might be populated via an instruction operand, or e.g. via "let rd = 0".

But with the EncodingByHwMode, the constant "let" stops working, because the encoding's named fields are no longer part of the tablegen record for the instruction.

Any thoughts?

jyknight · 2024-02-26T21:40:33Z

That's great context, thanks -- and I don't want to cause you any grief. :)

But I also know of a private backend that tried to use this feature, but ran into real trouble due to the inability to set constant values in the usual way. Hearing that you're using this feature extensively makes me think that maybe you have figured out good ways to workaround (or fix?) this issue?

nvjle · 2024-02-27T00:09:41Z

That's great context, thanks -- and I don't want to cause you any grief. :)

But I also know of a private backend that tried to use this feature, but ran into real trouble due to the inability to set constant values in the usual way. Hearing that you're using this feature extensively makes me think that maybe you have figured out good ways to workaround (or fix?) this issue?

Without more context, I can't claim to have a solution to the (potential?) issue. I've worked around problems, but those were mainly just implementation shortcomings (like the one this patch addresses). The feature may or may not make sense for every use case related to encoding. It makes huge sense in our context-- and I/we are not the only stakeholders.

Perhaps creating a stand-alone, unit-test-sized .td file (not unlike HwEncodeDecode*.td) exhibiting what you wish to do would help give context. But I don't think that has much to do with this patch, which just fixes a minor problem in the existing feature (which has been around since 2019). Here, all I wish to do is allow DefaultMode as one of the allowable modes-- just as can be done with RegInfoByHwMode. This was just a minor oversight in the original implementation.

jyknight · 2024-02-27T02:08:30Z

Given the context you've already provided, I don't object to this PR (nor other future ones addressing issues in HwMode support), so, yes, this definitely isn't the right place to continue the discussion.

But, this is what I mean: https://gist.github.com/jyknight/af4c461b7ad7a5f0bcfac6dd5a33d4a1

wangpc-pp

LGTM.
It seems that we are all good for this PR?

nvjle · 2024-02-28T15:34:55Z

LGTM. It seems that we are all good for this PR?

It appears so, would you mind merging on my behalf?

topperc reviewed Feb 26, 2024

View reviewed changes

jyknight reviewed Feb 26, 2024

View reviewed changes

Address review comment-- remove 'const'.

a588378

wangpc-pp approved these changes Feb 27, 2024

View reviewed changes

wangpc-pp merged commit ad43ea3 into llvm:main Feb 28, 2024
3 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TableGen] Add support for DefaultMode in per-HwMode encode/decode. #83029

[TableGen] Add support for DefaultMode in per-HwMode encode/decode. #83029

nvjle commented Feb 26, 2024

nvjle commented Feb 26, 2024

topperc Feb 26, 2024

nvjle Feb 26, 2024

jyknight left a comment

nvjle commented Feb 26, 2024 •

edited

jyknight commented Feb 26, 2024

nvjle commented Feb 27, 2024

jyknight commented Feb 27, 2024

wangpc-pp left a comment •

edited

nvjle commented Feb 28, 2024

[TableGen] Add support for DefaultMode in per-HwMode encode/decode. #83029

[TableGen] Add support for DefaultMode in per-HwMode encode/decode. #83029

Conversation

nvjle commented Feb 26, 2024

nvjle commented Feb 26, 2024

topperc Feb 26, 2024

Choose a reason for hiding this comment

nvjle Feb 26, 2024

Choose a reason for hiding this comment

jyknight left a comment

Choose a reason for hiding this comment

nvjle commented Feb 26, 2024 • edited

jyknight commented Feb 26, 2024

nvjle commented Feb 27, 2024

jyknight commented Feb 27, 2024

wangpc-pp left a comment • edited

Choose a reason for hiding this comment

nvjle commented Feb 28, 2024

nvjle commented Feb 26, 2024 •

edited

wangpc-pp left a comment •

edited