Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TableGen] Add support for DefaultMode in per-HwMode encode/decode. #83029

Merged
merged 2 commits into from
Feb 28, 2024

Conversation

nvjle
Copy link
Contributor

@nvjle nvjle commented Feb 26, 2024

Currently the decoder and encoder emitters will crash if DefaultMode is used within an EncodingByHwMode. As can be done today for RegInfoByHwMode and ValueTypeByHwMode, this patch adds support for this usage in EncodingByHwMode:
let EncodingInfos =
EncodingByHwMode<[ModeA, DefaultMode], [EncA, EncDefault]>;

Currently the decoder and encoder emitters will crash if DefaultMode is
used within an EncodingByHwMode. As can be done today for RegInfoByHwMode
and ValueTypeByHwMode, this patch adds support for this usage in
EncodingByHwMode:
  let EncodingInfos =
    EncodingByHwMode<[ModeA, DefaultMode], [EncA, EncDefault]>;
@nvjle
Copy link
Contributor Author

nvjle commented Feb 26, 2024

I hope one of @topperc @jyknight @jmolloy @0x59616e @wangpc-pp (based on recent commits or past reviews to the same general area) might review this small patch, thank you.

@@ -52,6 +52,11 @@ struct CodeGenHwModes {
assert(Id != 0 && "Mode id of 0 is reserved for the default mode");
return Modes[Id - 1];
}
const StringRef getModeName(unsigned Id, bool IncludeDefault = false) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop const. It doesn't do anything since it's returning by value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop const. It doesn't do anything since it's returning by value.

Done.
And could I again trouble you (I'll soon ask for write privs) to merge on my behalf?

Copy link
Member

@jyknight jyknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't exactly related to this patch, but, I'm wondering what you are intending to use this feature for, and what the long-term strategy for EncodingByHwMode is.

I had actually been considering proposing to remove it entirely, because it seems pretty incompatible with how tablegen instruction specification normally works.

In particular, an instruction encoding is generally shared between instructions which use a constant value for a given field, and those which use it as an operand. E.g. take RISCV's "RVInstR", which defines rs1, rs2, and rd fields. In different subtypes, it might be populated via an instruction operand, or e.g. via "let rd = 0".

But with the EncodingByHwMode, the constant "let" stops working, because the encoding's named fields are no longer part of the tablegen record for the instruction.

Any thoughts?

@nvjle
Copy link
Contributor Author

nvjle commented Feb 26, 2024

This comment isn't exactly related to this patch, but, I'm wondering what you are intending to use this feature for, and what the long-term strategy for EncodingByHwMode is.

We are already using this feature (that is, EncodingByHwMode) and have been for quite some time in a production downstream backend. This backend targets a very complicated, long-lived, evolving ISA and has encodings that can (and already has, and will) change from hardware generation-to-generation/version-to-version according to the needs of the computer architects in our organization. One could literally have an instruction set today with encoding A, and a year later (or less), have largely the same instruction set but with a vastly different encoding. Hardware efficiency concerns (speed, power, etc) are a big reason for this.

For most existing common ISAs, it is almost unthinkable that the encodings would substantially change beyond a handful of instructions-- and the encodings are basically forever. But for some machines (e.g., the ones I am targeting) this would overly constrain and handcuff the architects when trying to meet extreme performance goals.

I believe we're using this feature as it was intended. Maybe one of way of thinking about it as a way to make encodings independently configurable for a given opcode. One could, say, bake-in a prefix or postfix for every such opcode (e.g., MYINSN_MODEA, MYINSN_MODEB, ...) for every mode. That is clearly terrible-- at least when one has many such instructions.
For example, there are many possible passes in our backend (let's just say peephole-like optimizers for discussion's sake) that refer to these opcodes. That code ideally just references one instruction MYINSN rather than having to consider all variants of what are really just one instruction-- and all of that code must be updated if the encodings change in the future. It is less error-prone, reduces code duplication and maintenance.

Although we did not introduce EncodingByHwMode, we certainly find it to be an extremely useful feature that would cause us grief if it were gone-- and that would definitely make an already complicated backend even more so. We're not the only downstream backend using it.

From that point of view, the long-term strategy is to continue using this wonderful feature-- and time permitting-- contribute any fixes or improvements upstream. Some of these fixes have been sitting locally in at least one downstream backend. I personally am trying to reduce the impedance mismatch of "living downstream" so our merges go smoother. That-- and I think the whole community benefits. In my personal case, I've used this feature in two different backends for two different organizations-- making some of the same fixes twice.

Further, it is my plan/hope that the backend discussed above is eventually upstreamed so that LLVM will have an in-tree target using the feature. The timing of that goes all the way up to near deity-levels in the organization, so in the meantime, I hope to continue to push enhancements we make upstream (including unit tests to catch bitrot).

I had actually been considering proposing to remove it entirely, because it seems pretty incompatible with how tablegen instruction specification normally works.

In particular, an instruction encoding is generally shared between instructions which use a constant value for a given field, and those which use it as an operand. E.g. take RISCV's "RVInstR", which defines rs1, rs2, and rd fields. In different subtypes, it might be populated via an instruction operand, or e.g. via "let rd = 0".

But with the EncodingByHwMode, the constant "let" stops working, because the encoding's named fields are no longer part of the tablegen record for the instruction.

Any thoughts?

@jyknight
Copy link
Member

That's great context, thanks -- and I don't want to cause you any grief. :)

But I also know of a private backend that tried to use this feature, but ran into real trouble due to the inability to set constant values in the usual way. Hearing that you're using this feature extensively makes me think that maybe you have figured out good ways to workaround (or fix?) this issue?

@nvjle
Copy link
Contributor Author

nvjle commented Feb 27, 2024

That's great context, thanks -- and I don't want to cause you any grief. :)

But I also know of a private backend that tried to use this feature, but ran into real trouble due to the inability to set constant values in the usual way. Hearing that you're using this feature extensively makes me think that maybe you have figured out good ways to workaround (or fix?) this issue?

Without more context, I can't claim to have a solution to the (potential?) issue. I've worked around problems, but those were mainly just implementation shortcomings (like the one this patch addresses). The feature may or may not make sense for every use case related to encoding. It makes huge sense in our context-- and I/we are not the only stakeholders.

Perhaps creating a stand-alone, unit-test-sized .td file (not unlike HwEncodeDecode*.td) exhibiting what you wish to do would help give context. But I don't think that has much to do with this patch, which just fixes a minor problem in the existing feature (which has been around since 2019). Here, all I wish to do is allow DefaultMode as one of the allowable modes-- just as can be done with RegInfoByHwMode. This was just a minor oversight in the original implementation.

@jyknight
Copy link
Member

Given the context you've already provided, I don't object to this PR (nor other future ones addressing issues in HwMode support), so, yes, this definitely isn't the right place to continue the discussion.

But, this is what I mean: https://gist.github.com/jyknight/af4c461b7ad7a5f0bcfac6dd5a33d4a1

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
It seems that we are all good for this PR?

@nvjle
Copy link
Contributor Author

nvjle commented Feb 28, 2024

LGTM. It seems that we are all good for this PR?

It appears so, would you mind merging on my behalf?

@wangpc-pp wangpc-pp merged commit ad43ea3 into llvm:main Feb 28, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants