Function multi-version proposal #48

Open
wants to merge 14 commits into
base: main

@BeMg BeMg commented Jul 26, 2023

When function multi-versioning dispatches a function, we need a way to query the RISC-V hardware environment to make sure all required extensions are available.

The open questions are:

  • How to implement this function
  • Where to provide this function

From the compiler's point of view, an IFUNC resolver is generated when there is more than one implementation with the same symbol name.

Consider the following example:

__attribute__((target("default"))) int foo (int index)
{
  return index;
}

__attribute__((target("arch=rv64gc"))) int foo (int index)
{
  return index;
}

void bar() {
  foo(0);
}

The code the compiler generates will look roughly like the following pseudocode:

bar() {
(foo.ifunc())(0);
}

.set foo.ifunc, foo.resolver

func_ptr foo.resolver() {
  if (__riscv_ifunc_select("m_a_f_d_c"))
    return ptr foo.m_a_f_d_c;
  return ptr foo.default;
}

int foo.default(int index) {
	...
}

int foo.m_a_f_d_c(int index) {
	...
}

The compiler-generated resolver checks the requirements of each candidate function in turn. When a candidate's requirements are fulfilled, the resolver returns the corresponding function pointer.

In this proposal, the core of the resolver is __riscv_ifunc_select, which must retrieve the hardware information needed to decide whether a specific version can be executed.

We propose the following declaration for that function:

bool __riscv_ifunc_select(char *FeatureStr);

Here FeatureStr is a string that concatenates all target features belonging to a particular function. Its format can be described by the following BNF.

The function returns true when the hardware fulfills FeatureStr, and false otherwise.
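To make the intended semantics concrete, here is a minimal sketch in C of how such a check could behave. It assumes the feature string is a list of extension names separated by underscores, as in "m_a_f_d_c" above. The names `ifunc_select_sketch`, `ext_available`, and the fixed `available_exts` table are hypothetical stand-ins, not part of the proposal; a real implementation would query the platform instead of a table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the platform query: a fixed list of extensions
   that this sketch pretends the hardware provides. */
static const char *const available_exts[] = {"m", "a", "f", "d", "c", "zba", "zbb"};

/* Returns true if the name of length len is in the pretend hardware list. */
static bool ext_available(const char *name, size_t len) {
  size_t n = sizeof(available_exts) / sizeof(available_exts[0]);
  for (size_t i = 0; i < n; i++) {
    if (strlen(available_exts[i]) == len &&
        memcmp(available_exts[i], name, len) == 0)
      return true;
  }
  return false;
}

/* Sketch of the proposed check: true only if every '_'-separated feature
   in feature_str is available. An empty string has no requirements. */
bool ifunc_select_sketch(const char *feature_str) {
  assert(feature_str != NULL);
  const char *p = feature_str;
  while (*p != '\0') {
    size_t len = strcspn(p, "_");   /* length of the current token */
    if (len == 0 || !ext_available(p, len))
      return false;                 /* empty token or missing extension */
    p += len;
    if (*p == '_')
      p++;                          /* skip the separator */
  }
  return true;
}
```

With the pretend table above, `ifunc_select_sketch("m_a_f_d_c")` succeeds, while `ifunc_select_sketch("m_v")` fails because "v" is not listed.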


2023/09/04 Update: The following section takes the Linux platform as an example for the __riscv_ifunc_select implementation.

There are two ways to retrieve hardware information.

  • RISC-V Hardware Probing Interface [1]
    • Does not cover all extensions, and the defined symbols need to be kept in sync with the Linux kernel source code.
  • /proc/cpuinfo isa string
    • Not every system has the cpuinfo file.
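As an illustration of the second option, the sketch below checks whether an extension appears in an ISA string of the kind reported on the isa line of /proc/cpuinfo. It assumes the string has the form "rv64imafdc_zicsr_zifencei", with no version numbers, and it does not expand shorthand such as "g". The function name `isa_string_has_ext` is hypothetical, and reading the file itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if ext appears in isa, where isa is assumed to look like
   "rv64imafdc_zicsr_zifencei": a base prefix, a run of single-letter
   extensions, then '_'-separated multi-letter extensions. */
bool isa_string_has_ext(const char *isa, const char *ext) {
  assert(isa != NULL && ext != NULL);
  size_t ext_len = strlen(ext);
  if (ext_len == 0 ||
      (strncmp(isa, "rv32", 4) != 0 && strncmp(isa, "rv64", 4) != 0))
    return false;

  const char *p = isa + 4;
  size_t base_len = strcspn(p, "_");   /* run of single-letter extensions */
  if (ext_len == 1)
    return memchr(p, ext[0], base_len) != NULL;

  p += base_len;
  while (*p == '_') {
    p++;                               /* step over the separator */
    size_t len = strcspn(p, "_");      /* length of this extension name */
    if (len == ext_len && memcmp(p, ext, len) == 0)
      return true;
    p += len;
  }
  return false;
}
```

String parsing of this kind is fragile compared with a structured interface, which is one reason the choice between the two options matters.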

Another problem is where to place the function definition.

compiler-rt/libgcc is a good place to implement these functions, as is done for other targets (x86/AArch64).

[1] https://docs.kernel.org/riscv/hwprobe.html

sorear commented Aug 2, 2023

This repository is for specifications of features that are portable between multiple RISC-V toolchains. As such it is inappropriate to specify any behavior exclusively in terms of Linux-only interfaces like hwprobe and cpuinfo.

Providing a string-to-bool or string-to-int (for things like Zicbo* cache block size) lookup interface as a portable frontend to Linux's syscalls, the HWCAP-inspired interfaces on the BSDs, and whatever NT ends up with is a good idea, although it's useful for more than just ifunc; @jrtc27 suggested __riscv_get_extension for essentially this interface.

BeMg added a commit to BeMg/llvm-project that referenced this pull request Sep 6, 2023
Here is proposal riscv-non-isa/riscv-c-api-doc#48.

During the Function multi-version dispatch the function, we need a method to retrieve the RISC-V hardware environment to make sure all extension must be available.

Differential Revision: https://reviews.llvm.org/D155938
riscv-c-api.md Outdated
bool __riscv_ifunc_select(char *FeatureString)
```

Where FeatureString is a string that concatenating all target features belonging to a particular function. The form can be described in the following BNF form.

concatenating -> concatenates

riscv-c-api.md Outdated

## Function Multi-version

Function multi-versioning(FMV) provides an approach to selecting the appropriate function according to the runtime environment. This feature is triggered by `target/target_clones` function attribute. The compiler generates the resolver function based on the IFUNC mechanism. It expects that there is an API in the runtime environment for FMV to check if it fulfils all extension requirements.

fulfils -> fulfills

riscv-c-api.md Outdated
| target("arch=rv64gc") | "BASE-FUNC-NAME".zifencei_zicsr_m_f_d_c_a |
| target("default") | "BASE-FUNC-NAME" |

NOTE: Should mangling name need to consider the feature come form extension dependency?

form -> from

riscv-c-api.md Outdated

NOTE: Should mangling name need to consider the feature come form extension dependency?

### Runtime Featrue API

Featrue -> Feature

riscv-c-api.md Outdated

Each `ATTR-STRING` defines the distinguished version for the current function. Notably, the ATTR-STRING list must include `default` implying the translation unit scope build attributes.

The syntax and constraints of `ATTR-STRING` are identical to [target attribute](#__attribute__targetattr-string).

I don't think it makes sense to use anything other than arch= here, so it probably doesn't make sense to use the full target attribute syntax. We can only probe for features, not for tune-ness.

riscv-c-api.md Outdated
@@ -317,6 +317,44 @@ __attribute__((target("arch=+v"))) int foo(void) { return 0; }
__attribute__((target("arch=+zbb"))) int foo(void) { return 1; }
```

### `__attribute__((target_clones("<ATTR-STRING>", ...)))

The `target_clones` attribute is used to create multiple versions of a function. The compiler emits multiple versions based on the provided arguments.

Suggested change
The `target_clones` attribute is used to create multiple versions of a function. The compiler emits multiple versions based on the provided arguments.
The `target_clones` attribute is used to create multiple versions of a function. The compiler will emit multiple versions based on the provided arguments.

riscv-c-api.md Outdated

The `target_clones` attribute is used to create multiple versions of a function. The compiler emits multiple versions based on the provided arguments.

Each `ATTR-STRING` defines the distinguished version for the current function. Notably, the ATTR-STRING list must include `default` implying the translation unit scope build attributes.

Suggested change
Each `ATTR-STRING` defines the distinguished version for the current function. Notably, the ATTR-STRING list must include `default` implying the translation unit scope build attributes.
Each `ATTR-STRING` defines a distinguished version of the function. The ATTR-STRING list must include `default` indicating the translation unit scope build attributes.

riscv-c-api.md Outdated

The `target_version` attribute is used to create one version of a function. Functions with the same signature may exist with multiple versions in the same translation unit.

The syntax and constraints of `ATTR-STRING` are identical to the [target attribute](#__attribute__targetattr-string). Notably, the version of the function must include `default`, that implying the translation unit scope build attributes.

Suggested change
The syntax and constraints of `ATTR-STRING` are identical to the [target attribute](#__attribute__targetattr-string). Notably, the version of the function must include `default`, that implying the translation unit scope build attributes.
The syntax and constraints of `ATTR-STRING` are identical to the [target attribute](#__attribute__targetattr-string). The version of one function must equal `default`, indicating the translation unit scope build attributes.


For example, the following foo function creates one version.

```c

Maybe have an example with two versions? The description in clang's documentation is very unclear.

riscv-c-api.md Outdated

## Function Multi-version

Function multi-versioning(FMV) provides an approach to selecting the appropriate function according to the runtime environment. This feature is triggered by `target_version/target_clones` function attribute. The compiler generates the resolver function based on the IFUNC mechanism. It expects that there is an API in the runtime environment for FMV to check if it fulfills all extension requirements.

Matching ACLE I think this should document the result (it's constant for the life of the process and the first or most specific version) while leaving the rest psABI-defined or implementation-defined, including the runtime feature API.

sorear commented Feb 27, 2024

Can arch= be removed from target_version and target_clones, since nothing else can appear there? Then it becomes just [[gnu::target_clones("+zbb","default")]] or similar.

How are versions and clones prioritized? A big list of every possible exception isn't going to work for us, so it should be something in the source code. Not sure if declaration order would cause problems.

BeMg commented Mar 3, 2024

Hi @sorear, thanks for the comment.

Can arch= be removed from target_version and target_clones, since nothing else can appear there? Then it becomes just [[gnu::target_clones("+zbb","default")]] or similar.

For the ATTR-STRING of target_version and target_clones, I would prefer to reuse the format of the target attribute to avoid confusion. Removing mtune and mcpu would reduce the complexity of usage, and the result could be treated as a subset of the target attribute syntax, but a format like [[gnu::target_clones("+zbb","default")]] would make target_clones/target_version inconsistent with the target attribute.

From the compiler's perspective, mtune and mcpu information strongly affects the compilation result. If we someday need to add mtune and mcpu to the ATTR-STRING of target_version/target_clones, keeping the arch= prefix will make that easier and will not break existing code.

@kito-cheng any other comments on this topic?

How are versions and clones prioritized? A big list of every possible exception isn't going to work for us, so it should be something in the source code. Not sure if declaration order would cause problems.

Currently, the selection order depends on the IFUNC resolver's implementation. We plan to add another option inside ATTR-STRING that represents a user-specified priority weight, for example target_clones("default", "arch=rv64gc;prior=5", "arch=rv64g;prior=7").
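One way a resolver could use such a weight is sketched below in C. This is only an illustration under assumptions not settled in this thread: that a larger prior value wins, that ties keep the earlier declaration, and that the default version is used only when no other version is usable. The struct `version_candidate` and the function `select_version` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One function version as the resolver might see it. */
struct version_candidate {
  const char *name;  /* label, e.g. "arch=rv64gc" */
  int priority;      /* hypothetical "prior=" weight; larger wins (assumption) */
  bool usable;       /* whether the hardware fulfills its features */
};

/* Returns the index of the usable candidate with the highest priority,
   or fallback_index (the default version) if no other one is usable.
   Ties keep the earlier candidate. */
size_t select_version(const struct version_candidate *c, size_t n,
                      size_t fallback_index) {
  assert(c != NULL && fallback_index < n);
  size_t best = fallback_index;
  bool found = false;
  for (size_t i = 0; i < n; i++) {
    if (i == fallback_index || !c[i].usable)
      continue;
    if (!found || c[i].priority > c[best].priority) {
      best = i;
      found = true;
    }
  }
  return best;
}
```

For the three versions in the example above, this picks "arch=rv64g" (weight 7) when both non-default versions are usable, and falls back to "default" when neither is.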

BeMg commented Mar 21, 2024

I have created a pull request llvm/llvm-project#85786 to LLVM to implement target_clones in the current proposal.

I have also created a draft pull request llvm/llvm-project#85790 to implement __riscv_ifunc_select, which allows target_clones to run in a QEMU environment.

For example

clang -march=rv64g -rtlib=compiler-rt targetclones.c
qemu-riscv64 -cpu rv64,zbb=true,zba=true -B 0x100000 -L /path/to/sysroot a.out
