[FMV] Runtime Resolver Function #74
Related patch: |
cc @kito-cheng |
Is the two-word "FullFill" supposed to be the single word "Fulfill"? |
Do we intend to support __builtin_cpu_supports which is built on the same interface as function multiversioning on other targets like X86? That will require a reasonably fast query mechanism. String processing may be too much for that. |
Oops, I think there is a typo here. Updated. |
If we only allow one extension at a time, does that provide a reasonably fast query mechanism? Or must it be some kind of bit operation to determine support? For example, the compiler generates this resolver function based on
|
My concern is that each time you pass a string into the compiler-rt interface, it will need to execute multiple strcmps to compare the input string against every extension name the library knows about to figure out which extension is being asked for. That gets expensive if called very often. On x86, builtin_cpu_supports calls the library the first time to update some global variables. After the first time it is a load and a bit test |
If you use a sensible data structure like a trie you can do it linearly in the length of the input string |
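To make the trie suggestion concrete, here is a minimal sketch of a name-to-bit-position lookup whose cost depends only on the length of the queried string, not on how many extension names are stored. Everything in it is an assumption for illustration: the names `trie_node`, `trie_insert`, and `trie_lookup`, the lowercase-plus-digits alphabet, and the bit positions used by the caller are hypothetical and are not taken from libgcc, compiler-rt, or this proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical trie over extension names; not part of any real runtime. */
#define TRIE_FANOUT 36 /* 'a'-'z' plus '0'-'9' (assumes an ASCII-like charset) */

struct trie_node {
    struct trie_node *child[TRIE_FANOUT];
    int bit_pos; /* -1 when no extension name ends at this node */
};

/* Map a character to a child slot, or -1 if it is outside the alphabet. */
static int slot_of(char c) {
    if (c >= 'a' && c <= 'z') return c - 'a';
    if (c >= '0' && c <= '9') return 26 + (c - '0');
    return -1;
}

static struct trie_node *trie_new(void) {
    struct trie_node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->bit_pos = -1;
    return n;
}

static void trie_insert(struct trie_node *root, const char *name, int bit_pos) {
    for (; *name; ++name) {
        int s = slot_of(*name);
        assert(s >= 0);
        if (!root->child[s]) root->child[s] = trie_new();
        root = root->child[s];
    }
    root->bit_pos = bit_pos;
}

/* One step per input character, regardless of how many names are stored. */
static int trie_lookup(const struct trie_node *root, const char *name) {
    for (; *name; ++name) {
        int s = slot_of(*name);
        if (s < 0 || !root->child[s]) return -1;
        root = root->child[s];
    }
    return root->bit_pos;
}
```

A caller would populate the trie once, for example with `trie_insert(root, "zbb", 28)`, and then `trie_lookup(root, "zbb")` walks three nodes instead of running a `strcmp` against every known name. This still costs more than a single load and bit test, which is the point raised about `__builtin_cpu_supports`.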
To improve both performance (compared to the string-based approach) and portability (compared to the hwprobe-based approach), I have updated the runtime interface with a new layer for each queryable extension. This approach is similar to approach 2 described in the PR's description. This comment aims to explain it with a concrete example using the IFUNC resolver function and Two structures are defined in the runtime library to store the status of hardware-enabled extensions: Each queryable extension has a unique bit position inside the structure that represents whether it is enabled. For example, the enable bit for extension m could be stored inside
Additionally, there is a function to initialize these two structures using a system-provided mechanism:
In summary, this approach uses When the compiler emits the IFUNC resolver function, it can use these structures to check whether all extension requirements are fulfilled. Here is a simple example for a resolver:
func_ptr foo1.resolver() {
__init_riscv_features_bit();
if (MAX_QUERY_LENGTH > __riscv_feature_bits.length)
raise_error();
// Try arch=rv64im
unsigned long long rv64im_require_feature_0 = constant_build_during_compilation_time();
unsigned long long rv64im_require_feature_1 = constant_build_during_compilation_time();
...
if (
((rv64im_require_feature_0 & __riscv_feature_bits.features[0]) == rv64im_require_feature_0) &&
((rv64im_require_feature_1 & __riscv_feature_bits.features[1]) == rv64im_require_feature_1) &&
...)
return foo1.rv64im;
return foo1.default;
} |
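The check in the resolver above is a subset test: a version is selectable when every required bit is also set in the enabled bits. The following is a self-contained sketch of that test, written so it can be compiled on any host. The struct layout is only modelled on the `.length` and `.features[]` fields used in the example, and `FEATURE_GROUPS`, `features_satisfied`, `resolve_foo`, and the bit position chosen for "m" are hypothetical. Unlike the example, this sketch falls back to the default version instead of raising an error when the runtime reports fewer groups than the compiler expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FEATURE_GROUPS 2 /* assumed group count for this sketch */

/* Assumed layout, modelled on the .length/.features[] fields in the thread. */
struct feature_bits {
    unsigned length;
    unsigned long long features[FEATURE_GROUPS];
};

/* True when every bit set in `required` is also set in `enabled`. */
static bool features_satisfied(const struct feature_bits *enabled,
                               const unsigned long long *required,
                               unsigned required_len) {
    assert(enabled != NULL && required != NULL);
    if (required_len > enabled->length) return false;
    for (unsigned i = 0; i < required_len; ++i)
        if ((required[i] & enabled->features[i]) != required[i])
            return false;
    return true;
}

typedef int (*func_ptr)(void);
static int foo_default(void) { return 0; }
static int foo_rv64im(void) { return 1; }

static func_ptr resolve_foo(const struct feature_bits *enabled) {
    /* Hypothetical mask: pretend bit 12 of group 0 means "m". */
    static const unsigned long long rv64im_required[FEATURE_GROUPS] = {1ULL << 12, 0};
    if (features_satisfied(enabled, rv64im_required, FEATURE_GROUPS))
        return foo_rv64im;
    return foo_default;
}
```

Because the required masks are compile-time constants, each candidate version costs one AND and one compare per group once the feature bits have been initialized.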
Who's specifying which bit is what? |
My idea is that each bit is only meaningful to the runtime function and to the compiler that is using The remaining problem is how to synchronize the extension bitmask across LLVM, compiler-rt, GCC, and libgcc. I don't have a solution for this yet. @kito-cheng Any ideas on how we can achieve this synchronization? |
Update: add the extension |
This proposal got positive feedback from the RISC-V GNU community :) |
Based on riscv-non-isa/riscv-c-api-doc#74. This patch defines the groupid/bitmask in RISCVFeatures.td and generates the corresponding table in RISCVTargetParserDef.inc. The groupid/bitmask of extensions provides an abstraction layer between the compiler and runtime functions.
…ature_bits/__init_riscv_features_bit Based on riscv-non-isa/riscv-c-api-doc#74, this patch defines the __riscv_feature_bits and __riscv_vendor_feature_bits structures to store the enabled feature bits at runtime. It also introduces the __init_riscv_features_bit function to update these structures based on the platform query mechanism. Additionally, the groupid/bitmask definitions from riscv-non-isa/riscv-c-api-doc#74 are declared and used to update the __riscv_feature_bits and __riscv_vendor_feature_bits structures.
LGTM with one minor comment :)
IMO it's way simpler to just have the resolver call hwprobe directly, rather than trying to introduce this intermediate format and the associated library helper functions. We don't even need to specify anything here: the compiler could just generate the hwprobe calls directly and then call into the VDSO via the provided argument to the IFUNC resolver. That said: this is essentially just duplicating one of the early hwprobe designs, and thus has a bunch of design flaws we spent a few versions sorting out. So if you want to go with it, probably best to sort out things like:
So I'd recommend doing basically nothing here: we already have all the tools we need to implement FMV at the binary/library level, we just need to mark the multi-target attributes as legal so we can implement them. |
The resolver isn't the only use of this. I'm assuming we should support |
On X86, it's called by the resolver function. Only the first call does anything real; the other calls early out if it's already been done. I suggested we should cache the information rather than doing a hwprobe syscall for every multiversioned function. |
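The caching behaviour described here, where only the first call does real work and later calls reduce to a load and a bit test, can be sketched as follows. All names are hypothetical stand-ins: `expensive_platform_probe` is a placeholder for whatever the platform provides (for example a hwprobe syscall), the value it returns is made up, and the sketch does not attempt to be thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from libgcc or compiler-rt. */
static unsigned long long cached_features;
static bool cache_ready;
static int probe_calls; /* counts how often the expensive path runs */

/* Placeholder for the platform query (for example a hwprobe syscall). */
static unsigned long long expensive_platform_probe(void) {
    ++probe_calls;
    return 1ULL << 12; /* made-up value: only bit 12 is reported as enabled */
}

static void init_features_once(void) {
    if (cache_ready) return; /* early out on every call after the first */
    cached_features = expensive_platform_probe();
    cache_ready = true;
}

/* After initialization this is a load, a shift, and a mask. */
static bool cpu_supports_bit(unsigned bit_pos) {
    assert(bit_pos < 64);
    init_features_once();
    return (cached_features >> bit_pos) & 1ULL;
}
```

However many resolvers or feature queries run, the probe executes once per process; a real runtime would also need to decide how initialization interacts with threads and with resolvers that run during early relocation.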
I am not sure whether the compiler can generate code to invoke the vDSO directly, but that part is an optimization to reduce the overhead of querying the capabilities of the host machine. I am less concerned about it, since the current proposal can cache that on the first call. For the other concerns: we intend to add extensions first, and we believe a bitmask is enough for now. Our goal is to reach the same capability as IFUNC in glibc; we don't intend to address heterogeneous-ISA systems or extensions from multiple vendors yet, and we may extend the syntax in the future if needed. As for IFUNC, I believe there are a few security issues around it, but I don't see another choice in the short term: neither LLVM nor GCC provides such infrastructure without IFUNC, and I am not sure it is worth spending another half year on that. We also don't document that we use IFUNC, so we can change the implementation to get rid of IFUNC in the future if we think it's necessary, and the |
Remove the bitmasks that can't be queried by hwprobe directly, and update Bitpos based on the alphabetical order of the currently supported extensions. |
This PR proposes a runtime resolver function that retrieves the environment information. Since this resolver function is expected to be available and interchangeable for both libgcc and compiler-rt, a formal specification for the resolver function interface is necessary.
When generating the resolver function for function multiversioning, a mechanism is necessary to obtain the environment information.
To achieve this goal, several steps need to be taken:
Step 1 is handled by the compiler, while step 3 must follow the necessary steps from the platform during runtime.
This RFC aims to propose how the compiler and runtime function can tackle step 2.
Here is an example
In this example, there are two versions of function bar: one for default, another for "rv64gcv".
If the environment meets the requirements, then bar can use the arch=rv64gcv version. Otherwise, it will invoke the default version.
This process is controlled by the ifunc resolver function.
The version arch=rv64gcv requires
Problem 2 is about where to maintain the relationship between extension names and platform-dependent probe forms.
Here are three possible approaches to achieve this goal.