[compiler-rt][RISCV] Implement __riscv_ifunc_select #85790

Draft
wants to merge 15 commits into
base: main

@BeMg BeMg commented Mar 19, 2024

This patch implements __riscv_ifunc_select based on hwprobe.

unsigned __riscv_ifunc_select(struct riscv_hwprobe *ReqirePreKey, unsigned Length);

The __riscv_ifunc_select function checks if all required keys and their corresponding values from the caller match the hwprobe result. If all features are available in the current runtime environment, it returns 1. Otherwise, it returns 0.
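A minimal sketch of the matching step described above, not the patch's actual code. It assumes that "match" means every bit set in a required value is also set in the probed value, and that the probe was issued with the caller's keys in the same order. `struct hwprobe_pair` and `all_pairs_match` are hypothetical names used only for illustration; the struct mirrors the key/value layout of the kernel's `struct riscv_hwprobe`.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for struct riscv_hwprobe: a signed 64-bit key
   and an unsigned 64-bit value. */
struct hwprobe_pair {
  int64_t key;
  uint64_t value;
};

/* Hypothetical helper: returns 1 when every required pair is satisfied by
   the probed pair at the same index, 0 otherwise. */
unsigned all_pairs_match(const struct hwprobe_pair *required,
                         const struct hwprobe_pair *probed, size_t length) {
  for (size_t i = 0; i < length; ++i) {
    /* The kernel rewrites an unrecognized key to -1, so a key mismatch
       means the requirement cannot be verified. */
    if (probed[i].key != required[i].key)
      return 0;
    /* Assumption: a requirement holds when all of its bits are present. */
    if ((probed[i].value & required[i].value) != required[i].value)
      return 0;
  }
  return 1;
}
```

In the real function the `probed` array would be filled by the hwprobe system call before this comparison runs.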

This patch implements `__riscv_ifunc_select` based on `Hwprobe/cpuinfo`.

```
unsigned __riscv_ifunc_select(char *FeatStrs);
```

The __riscv_ifunc_select function checks if the features specified in FeatStrs are supported.
If all features are available in the current runtime environment, it returns 1. Otherwise, it returns 0.

BeMg commented Mar 19, 2024

This patch lets #85786 run some real tests.

#define RISCV_HWPROBE_MISALIGNED_FAST (3 << 0)
#define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0)
#define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0)

Contributor

Can we align to the newest interface: https://docs.kernel.org/arch/riscv/hwprobe.html?

Contributor Author

Sure.


BeMg commented Mar 30, 2024

  1. Align with the latest sys_riscv_hwprobe.
  2. Change the prototype from __riscv_ifunc_select(char *) to __riscv_ifunc_select(unsigned long long, unsigned long long).
  3. Remove the cpuinfo-related and string-processing code.
  4. Use a bitset to determine whether a set of extensions is available in the current environment.
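The bitset check in item 4 can be sketched as a subset test on two masks. This is an illustration under the assumption that each extension maps to one bit; `has_all_features` is a hypothetical name, not a function from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if every bit set in `required` is also
   set in `available`, i.e. the required set is a subset of what the
   environment reports. */
unsigned has_all_features(uint64_t available, uint64_t required) {
  return (available & required) == required;
}
```

An empty requirement (`required == 0`) is trivially satisfied, which is the behavior one would want for a default version with no extra extensions.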

@BeMg BeMg marked this pull request as ready for review March 30, 2024 12:13
@BeMg BeMg requested a review from kito-cheng March 30, 2024 12:14

github-actions bot commented Apr 1, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@BeMg BeMg requested review from lukel97 and wangpc-pp April 1, 2024 02:35
@wangpc-pp
Contributor

Is there any progress on a GNU/GCC implementation? If we want to port glibc, I think it should be required.


BeMg commented Apr 8, 2024

  1. Let the caller manage and construct the necessary key/value pairs for hwprobe, so the runtime side no longer needs to stay in sync with the hwprobe key table.
  2. Modify __riscv_ifunc_select to accept a pointer to riscv_hwprobe and its length, so the prototype does not need to change when hwprobe keys are added or changed.


#endif // defined(__linux__)

unsigned __riscv_ifunc_select(struct riscv_hwprobe *ReqireKeys,
Collaborator

Reqire -> Require

Contributor Author

Fixed.

}

// hwprobe not success
if (initHwProbe(HwprobePairs, 2))
Collaborator

What is the 2 here?

Contributor Author

It should be replaced with Length from the argument. Fixed.

unsigned Length) {
#if defined(__linux__)
// Init Hwprobe
struct riscv_hwprobe HwprobePairs[64];
Collaborator

Where does 64 come from?

Contributor Author

It needs a buffer big enough to hold the keys from the __riscv_ifunc_select argument and pass them to sys_hwprobe.

For now, the RISCV_HWPROBE_MAX_KEY is 6.

https://github.com/torvalds/linux/blob/20cb38a7af88dc40095da7c2c9094da3873fea23/arch/riscv/include/asm/hwprobe.h#L11

unsigned Length) {
#if defined(__linux__)
// Init Hwprobe
struct riscv_hwprobe HwprobePairs[Length];
Contributor

Just a question: are we allowed to use VLA here?

Collaborator

Can we use malloc?

Contributor

I don't know whether this is a compiler-rt convention, but I think we should avoid using the standard library here.
A VLA would work here; I just don't know whether we are allowed to use that feature.

Member

I suspect using a VLA here may introduce vulnerabilities: there is no length check, so this could trigger a stack overflow.

Contributor Author

Or could we define a MAX_PAIR_LENGTH constant here and discard content exceeding that limit before invoking the system call, or simply return false in this situation?

For example, something like:

#define MAX_PAIR_LENGTH 6

struct riscv_hwprobe HwprobePairs[MAX_PAIR_LENGTH];

if (Length > MAX_PAIR_LENGTH)
   Length = MAX_PAIR_LENGTH;

Contributor

Does this mean that we limit the number of features in target_clones?

Contributor Author

No. That limit comes from the number of keys in the hwprobe system call. I expect the compiler to translate the features inside target_clones into the minimum set of hwprobe key/value pairs.

Each pair in the arguments of __riscv_ifunc_select maps to a struct riscv_hwprobe.

For example:

; query for version "arch=rv64im"
// CHECK: @__riscv_hwprobe_args = internal global [2 x %riscv_hwprobe_pair] [%riscv_hwprobe_pair { i64 3, i64 1 }, %riscv_hwprobe_pair { i64 4, i64 0 }]
; query for version "arch=+zbb,+zba,+zbc,+zbkb,+zknd,+zknh,+zksed,+zvksh,+zfh,+zfa,+v"
// CHECK: @__riscv_hwprobe_args.1 = internal global [2 x %riscv_hwprobe_pair.0] [%riscv_hwprobe_pair.0 { i64 3, i64 1 }, %riscv_hwprobe_pair.0 { i64 4, i64 4462766492 }]

// CHECK-LABEL: define weak_odr ptr @foo1.resolver() comdat {
// CHECK-NEXT:  resolver_entry:
// CHECK-NEXT:    [[TMP0:%.*]] = call i1 @__riscv_ifunc_select(ptr @__riscv_hwprobe_args, i32 2)
...

// CHECK-LABEL: define weak_odr ptr @foo2.resolver() comdat {
// CHECK-NEXT:  resolver_entry:
// CHECK-NEXT:    [[TMP0:%.*]] = call i1 @__riscv_ifunc_select(ptr @__riscv_hwprobe_args.1, i32 2)
// CHECK-NEXT:    br i1 [[TMP0]], label [[RESOLVER_RETURN:%.*]], label [[RESOLVER_ELSE:%.*]]
...

The "arch=+zbb,+zba,+zbc,+zbkb,+zknd,+zknh,+zksed,+zvksh,+zfh,+zfa,+v" version has more features than the "arch=rv64im" version, but only the pair values differ in the __riscv_ifunc_select invocation; the number of pairs stays the same.

But we need to increase the limit when hwprobe adds more keys for new features.
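To make the second CHECK line concrete, the sketch below rebuilds the value of the key-4 pair from individual extension bits. The bit positions are my recollection of the RISCV_HWPROBE_EXT_* definitions in the Linux uapi hwprobe header and should be checked against that header; the `SKETCH_` names and the function are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumed bit positions inside the RISCV_HWPROBE_KEY_IMA_EXT_0 value
   (verify against the kernel's uapi hwprobe header). */
enum sketch_ima_ext_bit {
  SKETCH_EXT_V = 2,
  SKETCH_EXT_ZBA = 3,
  SKETCH_EXT_ZBB = 4,
  SKETCH_EXT_ZBC = 7,
  SKETCH_EXT_ZBKB = 8,
  SKETCH_EXT_ZKND = 11,
  SKETCH_EXT_ZKNH = 13,
  SKETCH_EXT_ZKSED = 14,
  SKETCH_EXT_ZVKSH = 25,
  SKETCH_EXT_ZFH = 27,
  SKETCH_EXT_ZFA = 32
};

/* ORs together one bit per extension named in the second version string. */
uint64_t sketch_second_version_mask(void) {
  static const unsigned bits[] = {
      SKETCH_EXT_V,    SKETCH_EXT_ZBA,   SKETCH_EXT_ZBB,  SKETCH_EXT_ZBC,
      SKETCH_EXT_ZBKB, SKETCH_EXT_ZKND,  SKETCH_EXT_ZKNH, SKETCH_EXT_ZKSED,
      SKETCH_EXT_ZVKSH, SKETCH_EXT_ZFH,  SKETCH_EXT_ZFA};
  uint64_t mask = 0;
  for (unsigned i = 0; i < sizeof(bits) / sizeof(bits[0]); ++i)
    mask |= (uint64_t)1 << bits[i];
  return mask;
}
```

Under these assumed bit positions the function returns 4462766492, the same value that appears in the second pair of @__riscv_hwprobe_args.1: eleven extensions collapse into a single 64-bit value, which is why the pair count stays at 2.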

Contributor

Then the MAX_PAIR_LENGTH approach is OK with me.
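For reference, the "simply return false" variant of the MAX_PAIR_LENGTH idea discussed in this thread might look like the sketch below. It is an illustration, not the code in the PR: `struct hwprobe_pair` and `copy_required_pairs` are hypothetical names, and the limit of 6 is taken from the snippet earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAIR_LENGTH 6

/* Illustrative stand-in for struct riscv_hwprobe. */
struct hwprobe_pair {
  int64_t key;
  uint64_t value;
};

/* Hypothetical helper: copies the caller's keys into a fixed-size buffer
   that could then be handed to the hwprobe system call. Returns 0 on
   success and -1 when the request does not fit, so the caller can report
   "not supported" instead of probing a truncated list. */
int copy_required_pairs(struct hwprobe_pair dst[MAX_PAIR_LENGTH],
                        const struct hwprobe_pair *required,
                        unsigned length) {
  if (required == NULL || length > MAX_PAIR_LENGTH)
    return -1;
  for (unsigned i = 0; i < length; ++i) {
    dst[i].key = required[i].key;
    dst[i].value = 0; /* the kernel fills in the value */
  }
  return 0;
}
```

Rejecting an oversized request, rather than clamping Length, avoids the case where requirements beyond the limit are silently dropped and the resolver reports a match it never verified. The plain loop also keeps the helper free of standard-library calls, in line with the concern raised above.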


BeMg commented Apr 22, 2024

Since this resolver function is expected to be available and interchangeable for both libgcc and compiler-rt, a formal specification for the resolver function interface is necessary.

I've created one for this PR, riscv-non-isa/riscv-c-api-doc#74, which provides three candidate approaches that achieve the same purpose.
