Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'H'-extension ordering #837

Open
a4lg opened this issue Apr 20, 2022 · 8 comments
Open

'H'-extension ordering #837

a4lg opened this issue Apr 20, 2022 · 8 comments

Comments

@a4lg
Copy link
Contributor

a4lg commented Apr 20, 2022

Related to: #781

My Proposal

Update extension category list:

From: "IMAFDQLCBKJTPVN"
To:   "IMAFDQLCBKJTPVNH" (add "H" to the end)

(assuming #831 is applied)

This is incompatible with Spike but required changes to the software ecosystem will be minimal.

IMPORTANT

This is about making consensus for software adoption.

  1. Faster adoption
  2. Fewer incompatibilities (with some software implementing H already)

So we can close this issue once consensus is made (not when documentation is changed).

Background

There is a ongoing work on QEMU to generate long ISA string (for DeviceTree) from enabled extensions.

The problem is, this includes following extensions:

  • H (Hypervisor extension)
  • Zhinx (IEEE754 binary16 arithmetic in GPR)
  • Zhinxmin (IEEE754 binary16 arithmetic in GPR; conversion only)

Because extension category order of H is not yet determined, we cannot make sure that generated ISA string is valid on long-term.

Some other software (including Spike, Linux and LLVM) also began using H as hypervisor extension and/or Zhinx* as IEEE754 binary16 support extensions and some ordering is incompatible with another.

I request to define category order of H in the extension category list (version 20191213: IMAFDQLCBJTPVN, latest draft: IMAFDQLCBKJTPV) to minimize confusion (as fast as possible; even before official documentation changes) and enable preliminary support for hypervisor extension.

Software List and Compatibility Issues So Far

Ordering Table

Software / Spec Related Ext. Ordering Notes
1. Spike H → P → V Incompatible with others (with P→V→H)
2. Linux (GENERATOR) P → V → H
3. QEMU (Past) P → V → N… → H
3. QEMU (Future) P → V → H
4. GNU Binutils (K… →) P → V → N
5. GCC (K… →) P → V → N
6. LLVM (single-letter) P → V → N
6. LLVM (Z* extensions) P → V → N… → H → K K must be fixed (out of scope from this issue)
RISC-V ISA 20191213 P → V → N H is missing
RISC-V ISA Draft (K… →) P → V N is removed (but is it okay? >#831), H is missing
My Proposal (K… →) P → V → N → H Keep N (#831), add H here for
compatibility with Linux/QEMU/LLVM

Software 1: Spike (PARSER / GENERATOR)

Spike parses ISA string given by --isa command line argument. The parser is relatively strict on single-letter extensions (not that strict on multi-letter ones). It also expands G to IMAFD so it's also a generator.

Source: riscv/isa_parser.cc (on master)

isa_parser_t::isa_parser_t(const char* str, const char *priv)
  : extension_table(256, false)
{
  isa_string = strtolower(str);
  const char* all_subsets = "mafdqchpv";

Note that this is incompatible with Linux and QEMU.

Software 2: Linux (PARSER / GENERATOR)

Version (PARSER): 4.15 or later (parser is changed multiple times)
Version (GENERATOR): 5.18 or later (before 5.18, it printed DeviceTree entry as-is)
Scope (GENERATOR): Output of /proc/cpuinfo

The ISA string parser is very permissive, accepting ISA strings with invalid order. So, the effect of changing the rule is minimal, unless user-mode software tries to parse ISA string given by the kernel strictly.

ISA string generator is used on 5.18 or later to print support ISA and extensions from feature bitmap (before 5.18, original DeviceTree value is directly printed out).

Source: arch/riscv/kernel/cpu.c (on master)

static const char base_riscv_exts[13] = "imafdqcbkjpvh";

This is used by the GENERATOR to generate ISA string when printing /proc/cpuinfo.

This is incompatible with Spike but mostly compatible with QEMU (except K but it's harmless because K will not be a single extension).

Software 3: QEMU (GENERATOR)

QEMU generates ISA string from enabled extensions (specified by QEMU-style extension-enabling options). So, it's a generator.

Version: Past (possibly before 7.1)

Source: target/riscv/cpu.c (on master)

static const char riscv_exts[26] = "IEMAFDQCLBJTPVNSUHKORWXYZG";

ISA string is generated from this variable and target misa CSR. It has a few problems as follows:

  1. C and L have wrong order (harmless since L is actually never supported)
  2. K and J have wrong order (harmless since both J and K are not going to be a single extension)
  3. Some bits should not be reflected to the ISA string (I sent a patch to fix this as shown in "Version: Future", short portion)

This is incompatible with Spike but mostly compatible with Linux (except K but it's harmless because K will not be a single extension).

Version: Future (possibly 7.1 or later)

Version: merged to QEMU RISC-V branch for the next version (possibly 7.1 release or later)

On this branch, it supports adding multi-letter extensions to the ISA string (stored as a part of DeviceTree).

Source: target/riscv/cpu.c (on riscv-to-apply.next branch)

static const char riscv_single_letter_exts[] = "IEMAFDQCPVH";

ISA string (short portion) is generated from this variable and target misa CSR.

This is chosen for compatibility with past QEMU (riscv_single_letter_exts is a strict subset of riscv_exts before, preserving order).

Source: target/riscv/cpu.c (on riscv-to-apply.next branch)

    struct isa_ext_data isa_edata_arr[] = {
        ISA_EDATA_ENTRY(zfh, ext_zfh),
        ISA_EDATA_ENTRY(zfhmin, ext_zfhmin),
        ISA_EDATA_ENTRY(zfinx, ext_zfinx),
        ISA_EDATA_ENTRY(zhinx, ext_zhinx),
        ISA_EDATA_ENTRY(zhinxmin, ext_zhinxmin),
        ISA_EDATA_ENTRY(zdinx, ext_zdinx),

This array is used to generate long portion of the ISA string by enabled extensions and must be sorted by canonical order.

Note that, despite that H is after D, Zhinx and Zhinxmin are placed before Zdinx. This is not good and I'll fix or remove Zhinx* tentatively (independently to this discussion).

This is incompatible with Spike but compatible with Linux (except Zhinx*).

Software 4: GNU Binutils (not implemented yet; PARSER / GENERATOR)

Not implemented yet (but requires handling once H extension is introduced). The parser is relatively strict.

It also need to emit RISC-V attributes (containing required ISA and extensions) so it's not only a parser but a generator.

Source: bfd/elfxx-riscv.c (on master)

static const char riscv_ext_canonical_order[] = "eigmafdqlcbjktpvn";

H is not included yet. Note that J and K have wrong order and I already sent a patch to fix it.

Software 5: GNU Compiler Collection (not implemented yet; PARSER / GENERATOR)

Not implemented yet (but requires handling once H extension is introduced). The parser is relatively strict.

It can also emit RISC-V attributes (containing required ISA and extensions) so it's not only a parser but a generator.

Source: gcc/common/config/riscv/riscv-common.cc (on master)

  return "mafdqlcbjktpvn";

H is not included yet. Note that J and K have wrong order and I will send a patch to fix it.

Software 6: LLVM (not implemented yet; PARSER / GENERATOR)

Not implemented yet (but requires handling once H extension is introduced). The parser is relatively strict.

It also need to emit RISC-V attributes (containing required ISA and extensions) so it's not only a parser but a generator.

Source: llvm/lib/Support/RISCVISAInfo.cpp (on main)

static constexpr StringLiteral AllStdExts = "mafdqlcbjtpvn";

H is not included yet. Maybe I should add K to the list, though.

The most notable thing in LLVM is, it already supports both Zk* and Zhinx* extensions. LLVM handles all single-letters not listed in AllStdExts (except i and e) are ordered by its alphabetical order, after all supported single-letter extensions (listed in AllStdExts). This rule takes effect on Z* extensions.

That means, current LLVM effectively has extension order mafdqlcbjtpvnhk. If we move K (by adding k to AllStdExts), H is located at the end (making almost compatible with others).

@kito-cheng
Copy link
Member

kito-cheng commented Apr 20, 2022

binutils, GCC and LLVM are all treat H as a multi-letter extension name prefix, that's what spec defined now.

As a RISC-V GCC maintainer, I really hope this could be fixed in spec soon.

@atishp04
Copy link
Contributor

Thanks @a4lg for the detailed analysis. Can you send a patch for the unpriv spec naming section (chapter 28) as well ?
@gfavor @aswaterman : Is it possible to get it merged asap to avoid further incompatibility across the software stack ?

@aswaterman
Copy link
Member

We'll discuss it at the next architecture review meeting.

@a4lg
Copy link
Contributor Author

a4lg commented Apr 22, 2022

@atishp04 Give me two days.
Although this kind of proposal is to be reviewed in the architecture review meeting (when? >@aswaterman), making draft proposal with actual changes is not a bad idea (even if the draft pull-request itself is rejected).

@aswaterman
Copy link
Member

Sorry, I’m not willing to establish the precedent of giving a timeline for this kind of AR. We’ll do our best to be prompt.

@a4lg
Copy link
Contributor Author

a4lg commented Apr 22, 2022

@aswaterman It's alright. I just wanted to know whether we have some time to discuss based on "proof-of-concept" changes.

@a4lg
Copy link
Contributor Author

a4lg commented Apr 23, 2022

@aswaterman
Copy link
Member

This is temporarily resolved via db7a4a0, but the ordering aspect will soon be made moot; see #831 (comment)

a4lg added a commit to a4lg/riscv-isa-sim that referenced this issue May 22, 2022
Based on current consensus made in:
<riscv/riscv-isa-manual#837>,
this commit changes the canonical order of 'H'.

Of course, making --isa string more permissive can be a solution to
possible incompatibility caused by this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants