Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix build for upstream LLVM and clang #210

Merged
merged 1 commit into from
Apr 15, 2019

Conversation

mmarchini
Copy link
Contributor

No description provided.

@brendangregg
Copy link
Contributor

This works for me with an llvm+clang build, and I also tested on llvm 5 from packages, and it still worked.

@drzaeus77 @yonghong-song does this work for you too?

@brendangregg
Copy link
Contributor

@drzaeus77 @yonghong-song can someone else please confirm this works so we can merge this PR?

@mmarchini
Copy link
Contributor Author

ping @drzaeus77 @yonghong-song can you check if this patch fixed the issue?

@mmarchini
Copy link
Contributor Author

Rebased and added Travis tests for LLVM 8. Three tests are failing. Apparently anonymous structs are broken with clang 8:

# ./src/bpftrace -de 'struct Foo { struct { int x; } bar; } kprobe:f { $foo = (Foo*)0; @x = $foo->bar.x; }'
struct Foo { struct { int x; } bar; };

Program
 kprobe:f
  =
   variable: $foo
   (Foo*)
    int: 0
  =
   map: @x
   .
    .
     dereference
      variable: $foo
     bar
    x

Unknown struct/union: 'Foo::(anonymous at definitions.h:1:14)'
# ./src/bpftrace -de 'struct Foo { int m; struct { int x; int y; } bar; int n; } k:f { @foo = (Foo)0; @bar = @foo.bar; @x = @foo.bar.x; }'
struct Foo { int m; struct { int x; int y; } bar; int n; };

Program
 kprobe:f
  =
   map: @foo
   (Foo)
    int: 0
  =
   map: @bar
   .
    map: @foo
    bar
  =
   map: @x
   .
    .
     map: @foo
     bar
    x

Unknown struct/union: 'Foo::(anonymous at definitions.h:1:21)'

We should probably fix this before landing this PR.

@mmarchini
Copy link
Contributor Author

Apparently clang_Cursor_isAnonymous behavior changed (likely because of llvm/llvm-project@c05e6f430eba, but I'm not sure).

@mmarchini
Copy link
Contributor Author

Yes, clang_Cursor_isAnonymous is behaving differently in clang <=7 and clang 8.

Let's use the struct below as an example:

struct Foo { 
  struct { 
    int x; 
  } bar; 
}

If we call clang_Cursor_isAnonymous passing a cursor to struct { int x; }, clang 7 returns false and clang 8 returns true.

Interestingly though, both versions return true for indirect structs (example below).

struct Foo { 
  struct { 
    int x; 
  }; 
}

In theory, the new clang behavior is correct (in the first example struct { int x; } is indeed anonymous), but we're using clang_Cursor_isAnonymous to determine if the struct has direct or indirect access in our code.

I'm not familiar with clang API, so I don't know if there's a way to check if a struct has indirect access.

A workaround we could implement on bpftrace side is to make our lexer give a name to all anonymous structs with direct access. For example, struct Foo { struct { int x; } bar; } would become struct Foo { struct __Foo__bar { int x; } bar; }. This would probably fix the issue.

@MaskRay
Copy link
Contributor

MaskRay commented Mar 16, 2019

Can you split the source change and .travis.yml Dockerfile changes? The source level change looks good to me to make bpftrace build with llvm 8 or trunk.

bison_target(bison_parser src/parser.yy ${CMAKE_BINARY_DIR}/parser.tab.cc)

For people who

bison generated parser.tab.h uses C++ typeid keyword so bpftrace has to be built
with -frtti (default). While LLVM defaults to -DLLVM_ENABLE_RTTI=OFF

A workaround we could implement on bpftrace side is to make our lexer give a name to all anonymous structs with direct access. For example, struct Foo { struct { int x; } bar; } would become struct Foo { struct __Foo__bar { int x; } bar; }. This would probably fix the issue.

An alternative is to check if clang_getCursorSpelling(cursor) is empty.

gentoo-bot pushed a commit to gentoo/gentoo that referenced this pull request Mar 20, 2019
Patch taken from upstream PR:
bpftrace/bpftrace#210

Copyright: Sony Interactive Entertainment Inc.
Package-Manager: Portage-2.3.62, Repoman-2.3.12
Signed-off-by: Patrick McLean <chutzpah@gentoo.org>
@mmarchini
Copy link
Contributor Author

Can you split the source change and .travis.yml Dockerfile changes? The source level change looks good to me to make bpftrace build with llvm 8 or trunk.

I can, but the failing tests are uncovering a real bug. Not sure if it's a good idea to fix the build if the resulting binary will not be fully functional.

@mmarchini
Copy link
Contributor Author

Unless libclang has a clear, reliable way to determine if a struct is indirect, the code will get pretty messy and fragile. I'm close to make it work, but I'm not liking the result.

@ajor we only use libclang for union/struct parsing, right? Do you think it would be feasible to move union/struct parsing to bpftrace and drop libclang as a dependency? This would also solve #266 and #139 (comment). I'm willing to work on this.

@ajor
Copy link
Member

ajor commented Mar 28, 2019

I did try to do my own struct parsing at first before turning to libclang. I think it would be a lot of work to do properly:

and probably more

@mmarchini
Copy link
Contributor Author

Yes, it would be laborious. OTOH, we are introducing some ugly hacks and workarounds because of libclang (this PR and #476), and we are unable to build bpftrace statically outside of alpine because of libclang as a dependency.

I'm not saying we must remove libclang and go with our own implementation, but using libclang is not pain-free, and having our own implementation could be more maintainable in the long-term.

@mmarchini
Copy link
Contributor Author

Updated the message on my last commit to explain the logic behind the changes. This is good from my side, just need some reviews.

@jasonk000
Copy link
Contributor

For me, this looks good. Builds and passes all tests on clang8.

[jkoch@jkoch-nflx build]$ clang -v
clang version 8.0.0 (tags/RELEASE_800/final)
[jkoch@jkoch-nflx build]$ sudo ./tests/bpftrace_test 
....
[----------] Global test environment tear-down
[==========] 255 tests from 8 test cases ran. (1482 ms total)
[  PASSED  ] 255 tests.

Two warnings introduced (below), fixed with:

--- a/src/types.h
+++ b/src/types.h
@@ -73,7 +73,7 @@ public:
   bool is_internal = false;
   bool is_pointer = false;
   bool is_tparg = false;
-  size_t pointee_size;
+  size_t pointee_size = 0;
[ 20%] Building CXX object src/CMakeFiles/bpftrace.dir/clang_parser.cpp.o
In file included from /home/jkoch/code/bpftrace/src/ast/ast.h:8,
                 from /home/jkoch/code/bpftrace/src/clang_parser.cpp:9:
/home/jkoch/code/bpftrace/src/types.h: In function ‘bpftrace::ClangParser::parse(bpftrace::ast::Program*, bpftrace::StructMap&)::<lambda(CXCursor, CXCursor, CXClientData)>’:
/home/jkoch/code/bpftrace/src/types.h:59:7: warning: ‘<anonymous>.bpftrace::SizedType::pointee_size’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 class SizedType
       ^~~~~~~~~

In file included from /home/jkoch/code/bpftrace/src/ast/ast.h:8,
                 from /home/jkoch/code/bpftrace/src/clang_parser.cpp:9:
/home/jkoch/code/bpftrace/src/types.h: In function ‘bpftrace::ClangParser::parse(bpftrace::ast::Program*, bpftrace::StructMap&)::<lambda(CXCursor, CXCursor, CXClientData)>’:
/home/jkoch/code/bpftrace/src/types.h:59:7: warning: ‘<anonymous>.bpftrace::SizedType::pointee_size’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 class SizedType
       ^~~~~~~~~

@mmarchini
Copy link
Contributor Author

pointee_size was not defined in this PR, so I guess it's related to a new clang check instead of the changes on the PR. Anyway, I need to rebase this, so I'll fix this warning as well :)
Thank you for looking into this!

LLVM changed the behavior of clang_Cursor_isAnonymous in [1]. The old
behavior would returned false for direct-acccess anonymous structs
within other structs, whereas LLVM 8 returns true. To circumvent this
behavior change among LLVM versions, we keep track of all structs
defined within a struct. We don't parse the substruct recursively (if we
do it might be parsed multiple times, and since we don't know yet if
this is a direct or indirect strucutre, we might parse it incorrectly),
instead we keep the cursor saved in a map. If this substruct is later
declared as an attribute of the supersctruct, that means we have a
direct-accessible struct. We remove it from our map, and we parse
recursively (parsing recursively the cursor pointing to the declaration
will effectively parse the struct definition).

After the first parse, any struct left in our map is an indirect struct.
Since we skipped the parsing stage for those, we need to run
`clang_visitChildren` again for each struct cursor we have saved. We
repeat this until there's no unvisited structs in the map. Keep in mind
that while visiting a new struct we might find more indirect structs.

Also add Travis jobs to test against LLVM and clang 8 on Ubuntu.

[1]: llvm/llvm-project@c05e6f4
@mmarchini mmarchini merged commit 80ce138 into bpftrace:master Apr 15, 2019
if (struct_name == "")
struct_name = ptypestr;
remove_struct_prefix(struct_name);
return CXChildVisit_Continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we returned CXChildVisit_Recurse here can we get rid of the looping logic and recording univisited indirect structs? Sorry, my build is still broken so I can't check myself (LLVM 8 wasn't the only problem on my computer apparently)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because at this point we don't know if this struct is just being defined (indirect struct) or if it will be used as the type for a field of the parent struct (direct struct). The cursor is pointing to the struct declaration, not in the field declaration.

Look at the struct below (the current clang cursor is what's inside < and >:

struct Foo {
  int field1;
  <struct {
    int field2;
  }> field3;
  char field4;
}

In this case, we want to be able to access field2 as Foo.field3.field2. But we don't know if this is a direct or indirect struct, and we'll only know once we move our cursor to forward, in which case we'll be at:

struct Foo {
  int field1;
  struct {
    int field2;
  } <field3>;
  char field4;
}

Same thing goes to indirect structs, we'll only know how the access will work once we move the cursor.

Older versions of LLVM would differentiate those two access types with clang_Cursor_isAnonymous, but from what I can tell this was a bug which was fixed on LLVM 8. Keeping track of inner structs and deferring their parsing until we have enough information was the only way I could find to keep things working across LLVM versions.

auto type = clang_getCanonicalType(clang_getCursorType(c));
auto typestr = get_clang_string(clang_getTypeSpelling(type));

if (indirect_structs.count(typestr))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would we hit this? Could you add a comment as it's not immediately clear to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the example above, would be in this case:

struct Foo {
  int field1;
  struct {
    int field2;
  } <field3>;
  char field4;
}

We saved the possibly indirect struct in our previous step, and then when we moved the cursor we realized that we were actually declaring a field of the parent struct. In this case we don't want to treat the struct as indirect, so we remove it from our list of indirect structs.

Looking at this again, this seems to be incomplete, as we'll never parse the unnamed struct in that example (unless I'm missing something). I'll investigate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently it works, but I don't know why 🙃

Copy link
Contributor Author

@mmarchini mmarchini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have added waaaay more comments to this code :)

if (struct_name == "")
struct_name = ptypestr;
remove_struct_prefix(struct_name);
return CXChildVisit_Continue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because at this point we don't know if this struct is just being defined (indirect struct) or if it will be used as the type for a field of the parent struct (direct struct). The cursor is pointing to the struct declaration, not in the field declaration.

Look at the struct below (the current clang cursor is what's inside < and >:

struct Foo {
  int field1;
  <struct {
    int field2;
  }> field3;
  char field4;
}

In this case, we want to be able to access field2 as Foo.field3.field2. But we don't know if this is a direct or indirect struct, and we'll only know once we move our cursor to forward, in which case we'll be at:

struct Foo {
  int field1;
  struct {
    int field2;
  } <field3>;
  char field4;
}

Same thing goes to indirect structs, we'll only know how the access will work once we move the cursor.

Older versions of LLVM would differentiate those two access types with clang_Cursor_isAnonymous, but from what I can tell this was a bug which was fixed on LLVM 8. Keeping track of inner structs and deferring their parsing until we have enough information was the only way I could find to keep things working across LLVM versions.

auto type = clang_getCanonicalType(clang_getCursorType(c));
auto typestr = get_clang_string(clang_getTypeSpelling(type));

if (indirect_structs.count(typestr))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the example above, would be in this case:

struct Foo {
  int field1;
  struct {
    int field2;
  } <field3>;
  char field4;
}

We saved the possibly indirect struct in our previous step, and then when we moved the cursor we realized that we were actually declaring a field of the parent struct. In this case we don't want to treat the struct as indirect, so we remove it from our list of indirect structs.

Looking at this again, this seems to be incomplete, as we'll never parse the unnamed struct in that example (unless I'm missing something). I'll investigate.

@mmarchini
Copy link
Contributor Author

Ok, I figured it out. When we move the cursor forward we'll be holding the struct definition and the field declaration. The cursor type will be FieldDecl and when we return CXChildVisit_Recurse we'll parse the inner struct. If we pass CXChildVisit_Recurse instead of CXChildVisit_Continue when the cursor is still a StructDecl, we'll parse the struct two times (and in the first time we don't know how we should access the struct fields).

So the flow looks like this:

Direct struct access

Step 1

struct Foo {
  int field1;
  <struct {
    int field2;
  }> field3;
  char field4;
}

In this case, we want to be able to access field2 as Foo.field3.field2. But we don't know if this is a direct or indirect struct, and we'll only know once we move our cursor to forward. So we store the struct for later and use CXChildVisit_Continue.

Step 2

struct Foo {
  int field1;
  <struct {
    int field2;
  } field3>;
  char field4;
}

Now we have all the information we need. We know this is a field declaration and we don't want to treat the struct as indirect, so we remove it from the indirect structs list. We'll follow the most common code path here, and at the end we'll return CXChildVisit_Recurse, which will then parse the annonymous struct

Step 3

struct Foo {
  int field1;
  struct {
    <int field2>;
  } field3;
  char field4;
}

We know we are inside a struct used in a field declaration, so we'll assign field2 to that struct, which will allow us to access it as Foo.field3.field2.

Indirect struct access

Step 1

struct Foo {
  int field1;
  <struct {
    int field2;
  }>;
  char field4;
}

In this case, we want to be able to access field2 as Foo.field2. But we don't know if this is a direct or indirect struct, and we'll only know once we move our cursor to forward. So we store the struct for later and use CXChildVisit_Continue.

Step 2

struct Foo {
  int field1;
  struct {
    int field2;
  };
  <char field4>;
}

We'll keep advancing our cursor until we reach the end of our file. At that point we know the fields of our struct will be indirectly accessed, but we havent't parsed the anonymous struct yet. So we start parsing again, passing the cursor we saved on step 1 as starting point. Note that we'll only parse the children of our struct, so we are not parsing the entire file again.

Step 3

struct Foo {
  int field1;
  struct {
    <int field2>;
  };
  char field4;
}

We know we are inside a struct which was not used in a field declaration, so we'll assign field2 to it's parent struct, which will allow us to access it as Foo.field2.


I'm not 100% sure the terminology direct/indirect is correct, but I'm trying to keep their use coherent across comments.

@mmarchini mmarchini mentioned this pull request May 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants