[RFC] Null pointer check for destination to store new vl of Fault-Only-First Loads intrinsics #153
Comments
|
I second your proposal. A null-pointer check can easily be removed during optimization, so I prefer this approach (okay, and I wrote the GCC implementation :P). I also believe there is a practical scenario, since you hit this issue. |
|
What is the use case for not wanting the new_vl value? Don't you need to know how many elements were read to be able to use the loaded data? |
|
With the solution you gave, you end up with the assembly below: test_vleff_save_new_vl_to_not_null: I don't think users want to see branch instructions here. If there really are cases where users don't want the new_vl value, I think we should try another solution. Actually, I tried Clang/LLVM, and it seems to work well: https://godbolt.org/z/8r5vznc3o |
|
I have the same question as Craig. Can anyone please provide a real-world use-case where |
I would like to use a stream cipher (https://en.wikipedia.org/wiki/Stream_cipher) as an example, but on reflection I don't think it is a typical use case. I also asked my colleagues who use these intrinsics, and they said they barely use Fault-Only-First loads, let alone have a real-world use case where new_vl is unused. Most of our failing cases come from extreme testing. So I have to say it's hard to find one. As for a stream cipher, there are about two important parameters: 1) a byte stream (file stream, network stream, etc.) to encrypt/decrypt, denoted
That is because we don't know whether
As you can see, the assembly is wrong for
#include <stddef.h>
#include <riscv_vector.h>
int8_t a[]={1,2,3};
vint8mf8_t __attribute__((noinline))
test_vleff_save_new_vl_to_null(const int8_t *base, size_t vl) {
return vle8ff_v_i8mf8(base, NULL, vl);
}
int main(){
test_vleff_save_new_vl_to_null(a, 3);
return 0;
}
The assembly is:
test_vleff_save_new_vl_to_null: # @test_vleff_save_new_vl_to_null
vsetvli zero, a1, e8, mf8, ta, mu
vle8ff.v v8, (a0)
# recursion here.
main: # @main
lui a0, %hi(a)
addi a0, a0, %lo(a)
li a1, 3
call test_vleff_save_new_vl_to_null |
|
I tried to implement https://godbolt.org/z/cvTaYn5Tn
#include <riscv_vector.h>
size_t strlen_1(const char *str) {
int first;
size_t gvl = vsetvlmax_e8m8();
const char *start = str;
do {
vint8m8_t v = vle8ff_v_i8m8(str, NULL, gvl);
vbool1_t m = vmseq_vx_i8m8_b1(v, '\0', gvl);
first = vfirst_m_b1(m, gvl);
str += gvl;
} while (first < 0);
return (str - start) + (first - gvl);
}
size_t strlen_2(const char *str) {
int first;
size_t new_vl;
size_t gvl = vsetvlmax_e8m8();
const char *start = str;
do {
vint8m8_t v = vle8ff_v_i8m8(str, &new_vl, gvl);
vbool1_t m = vmseq_vx_i8m8_b1(v, '\0', gvl);
first = vfirst_m_b1(m, gvl);
str += new_vl;
} while (first < 0);
return (str - start) + (first - new_vl);
}
I don't know if it is a normal use case, but I think this may be an example where |
|
Thanks for taking the time to write up this example. I see two problems with
The first problem is this code segment:
Suppose that the V-register allocated to
Of course, you could avoid this problem by, say, initializing
Suppose that the true end of the string occurs between
I don't see a simple way to avoid both of these problems without checking
EDIT: @topperc just pointed out to me in a personal communication that vector loads have the following property:
So the mitigations I proposed are probably insufficient. This doesn't change my conclusion that |
|
You can also see the spec example https://github.com/riscv/riscv-v-spec/blob/master/example/strlen.s#L12; it reads the |
|
Similarly to the
If, in the future, we discover a use-case for null
So my suggestion is to not change anything right now. |
|
After thinking, I agree that there are some differences between
Under your given hypothesis, the behavior of
Both
Both
For
For
Actually, I may have implemented
size_t strlen_3(const char *str) {
int first;
size_t new_vl;
size_t gvl = vsetvlmax_e8m8();
const char *start = str;
do {
vint8m8_t v = vle8ff_v_i8m8(str, &new_vl, gvl);
vbool1_t m = vmseq_vx_i8m8_b1(v, '\0', new_vl); // pass new_vl here.
first = vfirst_m_b1(m, new_vl); // pass new_vl here.
str += new_vl;
} while (first < 0);
return (str - start) + (first - new_vl);
}
For
I think it is impossible that the true end of the string occurs after an address which causes a trap, isn't it? |
Suppose the end of the string is in a page that belongs to your process, but is not currently backed by physical memory. It has been swapped to disk and needs to be paged back in when accessed. Here's what will happen.
Suppose the start of that page occurs somewhere after element 0. The load will detect the page fault for the unavailable page. The vl will be trimmed to the elements that could be read, and that count is returned in new_vl. A well-implemented loop will see that the 0 wasn't found in the first new_vl elements. The pointer will be advanced by new_vl and a new load issued on the next loop iteration.
Now the start of the unmapped page is in element 0. This will trap to the operating system. The operating system will fetch the page from disk and put it into physical memory. The user process will be restarted and will reissue the load with a vstart of 0. This time the load will succeed without trapping, since the page has now been mapped by the OS.
The fault-only-first load can't tell the difference between a page that the process is allowed to access but isn't mapped and one that the process is not allowed to access. The vl will be trimmed in either case. It's not until the start of the page is at element 0 that the OS gets involved to distinguish the two cases. |
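To make the sequence above concrete, here is a minimal scalar sketch of it in plain C. It is not RVV code and not how any implementation works internally: the page size, the mem and resident arrays, and the names model_vle8ff, os_page_in, model_strlen and demo_strlen are all hypothetical, chosen only to mimic "trim vl at the first non-resident page, trap only when element 0 faults".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE   8   /* assumed toy page size, in bytes */
#define NPAGES 4

static char mem[PAGE * NPAGES];   /* toy address space */
static int  resident[NPAGES];     /* 1 if the page is backed by memory */
static int  traps;                /* how many times the "OS" was invoked */

/* Models the OS handling a page fault: page the data in, then resume. */
static void os_page_in(size_t page) { resident[page] = 1; traps++; }

/* Hypothetical model of a fault-only-first byte load.
 * Returns the trimmed vl (the value an intrinsic would store to new_vl). */
static size_t model_vle8ff(char *dst, size_t addr, size_t vl) {
    if (vl == 0) return 0;
    assert(addr < sizeof mem);
    /* A fault on element 0 is a real trap; the load restarts afterwards. */
    if (!resident[addr / PAGE]) os_page_in(addr / PAGE);
    size_t n = 0;
    /* A fault on any later element only trims vl; no trap is taken. */
    while (n < vl && addr + n < sizeof mem && resident[(addr + n) / PAGE]) {
        dst[n] = mem[addr + n];
        n++;
    }
    return n;
}

/* strlen loop that scans and advances by new_vl, never by the requested vl. */
static size_t model_strlen(size_t start, size_t vlmax) {
    char v[PAGE * NPAGES];
    assert(vlmax <= sizeof v);
    size_t addr = start;
    for (;;) {
        size_t new_vl = model_vle8ff(v, addr, vlmax);
        for (size_t i = 0; i < new_vl; i++)      /* only loaded elements */
            if (v[i] == '\0') return addr + i - start;
        addr += new_vl;                          /* next load starts at the fault */
    }
}

/* Sets up a 16-byte string that spans three pages, with only the first
 * page resident, and reports the measured length and the trap count. */
static size_t demo_strlen(int *traps_out) {
    const char *s = "fault-only-first";
    memset(mem, 0, sizeof mem);
    memset(resident, 0, sizeof resident);
    traps = 0;
    memcpy(mem + 3, s, strlen(s) + 1);
    resident[0] = 1;
    size_t len = model_strlen(3, 16);
    *traps_out = traps;
    return len;
}
```

In this toy setup demo_strlen returns 16 and reports 2 traps: the first load is trimmed to 5 elements without trapping, and each of the next two loads starts exactly at a non-resident page, so the model OS is invoked once per page.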
Oh, I get it! Thanks! I think I missed something here. @nick-knight |
There are two outputs of the Fault-Only-First Load intrinsics: 1) the loaded vector; 2) the new vl. For example,
vint8mf8_t vle8ff_v_i8mf8 (const int8_t *base, size_t *new_vl, size_t vl);
returns a vector of vint8mf8_t and stores the new vl to new_vl.
Sometimes we just want to ignore the new vl. In the previous GCC implementation [1], the new vl is not stored to the destination new_vl if it is a null pointer. It is common for users to pass a null pointer and expect the new vl to be thrown away.
However, when we switched to LLVM we found that it behaves differently, and some of our codebases got compilation failures or runtime errors. The reason is that LLVM doesn't do any null pointer checking on the destination for the new vl, and storing to a null pointer is undefined behavior. See also [2] for earlier discussion.
My proposal is that we should do a null pointer check on the destination for the new vl of Fault-Only-First Loads and document it.
Or we should add a note for users that new_vl should not be a null pointer, as @zakk0610 proposed.
References:
[1] https://github.com/riscv-collab/riscv-gcc/blob/riscv-gcc-10.1-rvv-dev/gcc/config/riscv/riscv_vector.h#L289
[2] https://reviews.llvm.org/D126461
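For illustration only, the null check proposed in this issue could look roughly like the sketch below. It uses a plain C stand-in rather than the real intrinsic: raw_load_ff, load_ff_checked and the readable parameter are hypothetical names and are not part of riscv_vector.h, GCC, or LLVM.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for the load itself: copies at most
 * `readable` of the `vl` requested elements and returns the count. */
static size_t raw_load_ff(int8_t *dst, const int8_t *base,
                          size_t vl, size_t readable) {
    size_t n = vl < readable ? vl : readable;
    memcpy(dst, base, n);
    return n;
}

/* Wrapper with the proposed semantics: the new vl is stored only when
 * the caller passed a non-null destination; NULL means "discard it". */
static void load_ff_checked(int8_t *dst, const int8_t *base,
                            size_t *new_vl, size_t vl, size_t readable) {
    size_t n = raw_load_ff(dst, base, vl, readable);
    if (new_vl != NULL)
        *new_vl = n;
}
```

When such a wrapper is inlined and the argument is the address of a local variable, a compiler can usually prove the pointer is non-null and drop the branch, which is the optimization argument made earlier in the thread. When the pointer's value is unknown at compile time, the branch remains.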