UPC-IR: parametrize Remote Pointer (RP) #2

nenadv · 2014-10-13T12:49:04Z

At this point the remote pointer in IR is a 64-bit quantity, with 44 bits allocated for the address (offset) and 20 bits for the thread number.

Add configuration options so that these bit widths can be changed.
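
To make the request concrete, here is a minimal sketch of a parametrized remote pointer in C. The macro names `RP_ADDR_BITS` and `RP_THREAD_BITS`, the helper names, and the choice to place the thread number in the bits directly above the address are all assumptions made for illustration; the issue only states the current 44/20 split, not the field order or the names of the eventual options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical configuration knobs (not the actual option names).
 * Defaults mirror the 44/20 split described in the issue. */
#ifndef RP_ADDR_BITS
#define RP_ADDR_BITS 44
#endif
#ifndef RP_THREAD_BITS
#define RP_THREAD_BITS 20
#endif

/* Both fields must be non-empty and fit together in 64 bits. */
_Static_assert(RP_ADDR_BITS > 0 && RP_THREAD_BITS > 0,
               "each field needs at least one bit");
_Static_assert(RP_ADDR_BITS + RP_THREAD_BITS <= 64,
               "remote pointer fields must fit in 64 bits");

#define RP_ADDR_MASK   ((UINT64_C(1) << RP_ADDR_BITS) - 1)
#define RP_THREAD_MASK ((UINT64_C(1) << RP_THREAD_BITS) - 1)

/* Assumed layout: address in the low bits, thread number above it. */
static inline uint64_t rp_pack(uint64_t thread, uint64_t addr)
{
    assert(thread <= RP_THREAD_MASK); /* thread number must fit its field */
    assert(addr <= RP_ADDR_MASK);     /* offset must fit its field */
    return (thread << RP_ADDR_BITS) | addr;
}

/* Extract the thread number from a packed remote pointer. */
static inline uint64_t rp_thread(uint64_t rp)
{
    return (rp >> RP_ADDR_BITS) & RP_THREAD_MASK;
}

/* Extract the address (offset) from a packed remote pointer. */
static inline uint64_t rp_addr(uint64_t rp)
{
    return rp & RP_ADDR_MASK;
}
```

With this shape, a different split could be selected at build time, for example `-DRP_ADDR_BITS=48 -DRP_THREAD_BITS=16`, and the static assertions would reject a combination that does not fit in 64 bits.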

nenadv · 2014-10-14T14:33:01Z

Resolved by 040d61e and a similar change to clang-upc.

…oring the subregister. For 0-lane stores, we used to generate code similar to: fmov w8, s0 str w8, [x0, x1, lsl #2] instead of: str s0, [x0, x1, lsl #2] To correct that: for store lane 0 patterns, directly match to STR <subreg>0. Byte-sized instructions don't have the special case for a 0 index, because FPR8s are defined to have untyped content. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225181 91177308-0d34-0410-b5e6-96231b3b80d8

…nction to end of subclass. NFC The previous attempt at fixing this only moved the problem to the subclass vtable. We can safely move the function into the subclass so attempt to fix it that way. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@236112 91177308-0d34-0410-b5e6-96231b3b80d8

…aries. I have two immediate motivations for adding this: 1) It makes writing expectations in tests *dramatically* easier. A quick example that is a taste of what is possible: std::vector<int> v = ...; EXPECT_THAT(v, UnorderedElementsAre(1, 2, 3)); This checks that v contains '1', '2', and '3' in some order. There are a wealth of other helpful matchers like this. They tend to be highly generic and STL-friendly so they will in almost all cases work out of the box even on custom LLVM data structures. I actually find the matcher syntax substantially easier to read even for simple assertions: EXPECT_THAT(a, Eq(b)); EXPECT_THAT(b, Ne(c)); Both of these make it clear what is being *tested* and what is being *expected*. With `EXPECT_EQ` this is implicit (the LHS is expected, the RHS is tested) and often confusing. With `EXPECT_NE` it is just not clear. Even the failure error messages are superior with the matcher based expectations. 2) When testing any kind of generic code, you are continually defining dummy types with interfaces and then trying to check that the interfaces are manipulated in a particular way. This is actually what mocks are *good* for -- testing *interface interactions*. With generic code, there is often no "fake" or other object that can be used. For a concrete example of where this is currently causing significant pain, look at the pass manager unittests which are riddled with counters incremented when methods are called. All of these could be replaced with mocks. The result would be more effective at testing the code by having tighter constraints. It would be substantially more readable and maintainable when updating the code. And the error messages on failure would have substantially more information as mocks automatically record stack traces and other information *when the API is misused* instead of trying to diagnose it after the fact. 
I expect that #1 will be the overwhelming majority of the uses of gmock, but I think that is sufficient to justify having it. I would actually like to update the coding standards to encourage the use of matchers rather than any other form of `EXPECT_...` macros as they are IMO a strict superset in terms of functionality and readability. I think that #2 is relatively rarely useful, but there *are* cases where it is useful. Historically, I think misuse of actual mocking as described in #2 has led to resistance towards this framework. I am actually sympathetic to this -- mocking can easily be overused. However I think this is not a significant concern in LLVM. First and foremost, LLVM has very careful and rare exposure of abstract interfaces or dependency injection, which are the most prone to abuse with mocks. So there are few opportunities to abuse them. Second, a large fraction of LLVM's unittests are testing *generic code* where mocks actually make tremendous sense. And gmock is well suited to building interfaces that exercise generic libraries. Finally, I still think we should be willing to have testing utilities in tree even if they should be used rarely. We can use code review to help guide the usage here. For a longer and more complete discussion of this, see the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108672.html The general consensus seems that this is a reasonable direction to start down, but that doesn't mean we should race ahead and use this everywhere. I have one test that is blocked on this to land and that was specifically used as an example. Before widespread adoption, I'm going to work up some (brief) guidelines as some of these facilities should be used sparingly and carefully. Differential Revision: https://reviews.llvm.org/D28156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291606 91177308-0d34-0410-b5e6-96231b3b80d8

Introduction
============

This patch added initial support for bpf program compile once and run everywhere (CO-RE).

The main motivation is for bpf programs which depend on kernel headers which may vary between different kernel versions. The initial discussion can be found at https://lwn.net/Articles/773198/.

Currently, a bpf program accesses kernel internal data structures through the bpf_probe_read() helper. The idea is to capture the kernel data structure to be accessed through bpf_probe_read() and relocate it on different kernel versions.

On each host, right before bpf program load, the bpfloader will look at the types of the native linux through vmlinux BTF, calculate the proper access offset and patch the instruction.

To accommodate this, three intrinsic functions preserve_{array,union,struct}_access_index are introduced which in clang will preserve the base pointer, struct/union/array access_index and struct/union debuginfo type information. Later, the bpf IR pass can reconstruct the whole gep access chains without looking at gep itself.

This patch did the following:
  . An IR pass is added to convert preserve_*_access_index to a global variable whose name encodes the getelementptr access pattern. The global variable has metadata attached to describe the corresponding struct/union debuginfo type.
  . A SimplifyPatchable MachineInstruction pass is added to remove unnecessary loads.
  . The BTF output pass is enhanced to generate relocation records located in the .BTF.ext section.

Typical CO-RE also needs support for global variables which can be assigned different values on different hosts. For example, the kernel version can be used to guard different versions of code. This patch added support for patchable externals as well.

Example
=======

The following is an example.

  struct pt_regs {
    long arg1;
    long arg2;
  };
  struct sk_buff {
    int i;
    struct net_device *dev;
  };

  #define _(x) (__builtin_preserve_access_index(x))
  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) =
          (void *) 4;
  extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version;
  int bpf_prog(struct pt_regs *ctx) {
    struct net_device *dev = 0;

    // ctx->arg* does not need bpf_probe_read
    if (__kernel_version >= 41608)
      bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev));
    else
      bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev));
    return dev != 0;
  }

In the above, we want to translate the third argument of bpf_probe_read() as relocations.

  -bash-4.4$ clang -target bpf -O2 -g -S trace.c

The compiler will generate two new subsections in .BTF.ext, OffsetReloc and ExternReloc. OffsetReloc is to record the structure member offset operations, and ExternReloc is to record the external globals where only u8, u16, u32 and u64 are supported.

  BPFOffsetReloc Size
  struct SecOffsetReloc for ELF section #1
  A number of struct BPFOffsetReloc for ELF section #1
  struct SecOffsetReloc for ELF section #2
  A number of struct BPFOffsetReloc for ELF section #2
  ...
  BPFExternReloc Size
  struct SecExternReloc for ELF section #1
  A number of struct BPFExternReloc for ELF section #1
  struct SecExternReloc for ELF section #2
  A number of struct BPFExternReloc for ELF section #2

  struct BPFOffsetReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t TypeID;        ///< TypeID for the relocation
    uint32_t OffsetNameOff; ///< The string to traverse types
  };

  struct BPFExternReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t ExternNameOff; ///< The string for external variable
  };

Note that only externs with attribute section ".BPF.patchable_externs" are considered for Extern Reloc, which will be patched by the bpf loader right before the load.

For the above test case, two offset records and one extern record will be generated:

  OffsetReloc records:
        .long   .Ltmp12                 # Insn Offset
        .long   7                       # TypeId
        .long   242                     # Type Decode String
        .long   .Ltmp18                 # Insn Offset
        .long   7                       # TypeId
        .long   242                     # Type Decode String

  ExternReloc record:
        .long   .Ltmp5                  # Insn Offset
        .long   165                     # External Variable

  In string table:
        .ascii  "0:1"                   # string offset=242
        .ascii  "__kernel_version"      # string offset=165

The default member offset can be calculated as the 2nd member offset (0 representing the 1st member) of struct "sk_buff".

The asm code:

  .Ltmp5:
  .Ltmp6:
        r2 = 0
        r3 = 41608
  .Ltmp7:
  .Ltmp8:
        .loc    1 18 9 is_stmt 0        # t.c:18:9
  .Ltmp9:
        if r3 > r2 goto LBB0_2
  .Ltmp10:
  .Ltmp11:
        .loc    1 0 9                   # t.c:0:9
  .Ltmp12:
        r2 = 8
  .Ltmp13:
        .loc    1 19 66 is_stmt 1       # t.c:19:66
  .Ltmp14:
  .Ltmp15:
        r3 = *(u64 *)(r1 + 0)
        goto LBB0_3
  .Ltmp16:
  .Ltmp17:
  LBB0_2:
        .loc    1 0 66 is_stmt 0        # t.c:0:66
  .Ltmp18:
        r2 = 8
        .loc    1 21 66 is_stmt 1       # t.c:21:66
  .Ltmp19:
        r3 = *(u64 *)(r1 + 8)
  .Ltmp20:
  .Ltmp21:
  LBB0_3:
        .loc    1 0 66 is_stmt 0        # t.c:0:66
        r3 += r2
        r1 = r10
  .Ltmp22:
  .Ltmp23:
  .Ltmp24:
        r1 += -8
        r2 = 8
        call 4

For instructions .Ltmp12 and .Ltmp18, "r2 = 8", the number 8 is the structure offset based on the current BTF. The loader needs to adjust it if it changes on the host.

For instruction .Ltmp5, "r2 = 0", the external variable got a default value of 0; the loader needs to supply an appropriate value for the particular host.

Compiling to generate object code and disassembling:

  0000000000000000 bpf_prog:
         0:       b7 02 00 00 00 00 00 00         r2 = 0
         1:       7b 2a f8 ff 00 00 00 00         *(u64 *)(r10 - 8) = r2
         2:       b7 02 00 00 00 00 00 00         r2 = 0
         3:       b7 03 00 00 88 a2 00 00         r3 = 41608
         4:       2d 23 03 00 00 00 00 00         if r3 > r2 goto +3 <LBB0_2>
         5:       b7 02 00 00 08 00 00 00         r2 = 8
         6:       79 13 00 00 00 00 00 00         r3 = *(u64 *)(r1 + 0)
         7:       05 00 02 00 00 00 00 00         goto +2 <LBB0_3>

  0000000000000040 LBB0_2:
         8:       b7 02 00 00 08 00 00 00         r2 = 8
         9:       79 13 08 00 00 00 00 00         r3 = *(u64 *)(r1 + 8)

  0000000000000050 LBB0_3:
        10:       0f 23 00 00 00 00 00 00         r3 += r2
        11:       bf a1 00 00 00 00 00 00         r1 = r10
        12:       07 01 00 00 f8 ff ff ff         r1 += -8
        13:       b7 02 00 00 08 00 00 00         r2 = 8
        14:       85 00 00 00 04 00 00 00         call 4

Instructions #2, #5 and #8 need relocation resolutions from the loader.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D61524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365503 91177308-0d34-0410-b5e6-96231b3b80d8

nenadv closed this as completed Oct 14, 2014
