Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generate perf event data structure in Python automatically #2198

Merged
merged 2 commits into from Feb 14, 2019

Conversation

@boat0
Copy link
Contributor

boat0 commented Feb 12, 2019

When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is redundant and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

// define output data structure in C
struct data_t {
    u32 pid;
    u64 ts;
    char comm[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(events);
...

Old way:

# define output data structure in Python
TASK_COMM_LEN = 16    # linux/sched.h
class Data(ct.Structure):
    _fields_ = [("pid", ct.c_ulonglong),
                ("ts", ct.c_ulonglong),
                ("comm", ct.c_char * TASK_COMM_LEN)]

def print_event(cpu, data, size):
    event = ct.cast(data, ct.POINTER(Data)).contents
...

New way:

def print_event(cpu, data, size):
    event = b["events"].event(data)
...
@yonghong-song

This comment has been minimized.

Copy link
Collaborator

yonghong-song commented Feb 12, 2019

[buildbot, test this please]

@boat0

This comment has been minimized.

Copy link
Contributor Author

boat0 commented Feb 12, 2019

It seems that buildbot has skipped 60073ef upon which 78ea22f is based.

'long long': ct.c_longlong,
'unsigned long long': ct.c_ulonglong,
'u64': ct.c_ulonglong,
'void *': ct.c_void_p }

This comment has been minimized.

Copy link
@yonghong-song

yonghong-song Feb 12, 2019

Collaborator

Maybe add s8, s16, s32, s64 as well?
How do we handle __int128 and unsigned __int128? I guess in such cases, the old way is needed where __int128 is coded as s64[2]?

This comment has been minimized.

Copy link
@boat0

boat0 Feb 13, 2019

Author Contributor

Yes. The old way works great for __int128 here too:
'__int128' : (ct.c_longlong * 2),

s8/16/32/64 also added. Any other type that is not supported will stop
the script from running with message
"Type: '%s' not recognized. Please define the data with ctypes manually."

i = 0
while i < num_fields:
field_name = lib.bpf_perf_event_field_name(self.bpf.module, i)
field_type = lib.bpf_perf_event_field_type(self.bpf.module, i)

This comment has been minimized.

Copy link
@yonghong-song

yonghong-song Feb 12, 2019

Collaborator

Is it possible to have just one libbcc call like lib.bpf_perf_event_field to get field name and type in one shot?

This comment has been minimized.

Copy link
@boat0

boat0 Feb 13, 2019

Author Contributor

Ok. We can concatenate type and name in libbcc and then split them in Python.

@yonghong-song

This comment has been minimized.

Copy link
Collaborator

yonghong-song commented Feb 12, 2019

@boat0 buildbot is okay. I see the error with llvm 3.8.0 and llvm 3.8.1 on ubuntu.

CMakeFiles/usdt-static.dir/usdt_args.cc.o -c /tmp/debuild.DTQhKj/bcc/src/cc/usdt/usdt_args.cc
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc: In member function ‘bool ebpf::BTypeVisitor::VisitCallExpr(clang::CallExpr*)’:
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc:818:46: error: ‘const class clang::Type’ has no member named ‘getPointeeOrArrayElementType’; did you mean ‘getPointeeType’?
                                            ->getPointeeOrArrayElementType()->getAsTagDecl();
                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              getPointeeType

Could you check whether you can find an alternative implementation? It would be good to find an implementation working for all llvm's. But if you cannot, using #ifdef is okay.

Also, could you add some tests to test some of your supported types including array types?
This patch can greatly simplify how to write bcc python scripts. Thanks!

@boat0

This comment has been minimized.

Copy link
Contributor Author

boat0 commented Feb 13, 2019

@boat0 buildbot is okay. I see the error with llvm 3.8.0 and llvm 3.8.1 on ubuntu.

CMakeFiles/usdt-static.dir/usdt_args.cc.o -c /tmp/debuild.DTQhKj/bcc/src/cc/usdt/usdt_args.cc
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc: In member function ‘bool ebpf::BTypeVisitor::VisitCallExpr(clang::CallExpr*)’:
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc:818:46: error: ‘const class clang::Type’ has no member named ‘getPointeeOrArrayElementType’; did you mean ‘getPointeeType’?
                                            ->getPointeeOrArrayElementType()->getAsTagDecl();
                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              getPointeeType

Could you check whether you can find an alternative implementation? It would be good to find an implementation working for all llvm's. But if you cannot, using #ifdef is okay.

Now we use getPointeeType().getTypePtr() instead of getPointeeOrArrayElementType()

@boat0

This comment has been minimized.

Copy link
Contributor Author

boat0 commented Feb 13, 2019

Also, could you add some tests to test some of your supported types including array types?

OK. I will do it in a subsequent PR.
I am now using tools/tcpconnect.py as a test and everything is OK.

tcpconnect.py has two events defined and features __int128 and array types:

// separate data structs for ipv4 and ipv6
struct ipv4_data_t {
    u64 ts_us;
    u32 pid;
    u32 uid;
    u32 saddr;
    u32 daddr;
    u64 ip;
    u16 dport;
    char task[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(ipv4_events);

struct ipv6_data_t {
    u64 ts_us;
    u32 pid;
    u32 uid;
    unsigned __int128 saddr;
    unsigned __int128 daddr;
    u64 ip;
    u16 dport;
    char task[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(ipv6_events);
@boat0 boat0 force-pushed the boat0:gen-pydata branch 2 times, most recently from d8dbbd5 to b3868c5 Feb 13, 2019
@yonghong-song

This comment has been minimized.

Copy link
Collaborator

yonghong-song commented Feb 13, 2019

[buildbot, test this please]

boat0 added 2 commits Feb 14, 2019
When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is tedious and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

    // define output data structure in C
    struct data_t {
        u32 pid;
        u64 ts;
        char comm[TASK_COMM_LEN];
    };
    BPF_PERF_OUTPUT(events);
    ...

Old way:

    # define output data structure in Python
    TASK_COMM_LEN = 16    # linux/sched.h
    class Data(ct.Structure):
        _fields_ = [("pid", ct.c_ulonglong),
                    ("ts", ct.c_ulonglong),
                    ("comm", ct.c_char * TASK_COMM_LEN)]

    def print_event(cpu, data, size):
        event = ct.cast(data, ct.POINTER(Data)).contents
    ...

New way:

    def print_event(cpu, data, size):
        event = b["events"].event(data)
    ...
Take tcpconnect.py as an example to show the new, simpler way
of outputing perf event.

Other tools/examples can be simplified in a similar way.
@boat0 boat0 force-pushed the boat0:gen-pydata branch from b3868c5 to e415438 Feb 14, 2019
@boat0

This comment has been minimized.

Copy link
Contributor Author

boat0 commented Feb 14, 2019

Last build failed at test_usdt2.py in which event type is defined as a simple int type.
It is supposed to do nothing in this case, but the checking turns out to be not working.
In this update I make it working by

auto type_arg1 = Call->getArg(1)->IgnoreCasts()->getType().getTypePtr()->getPointeeType().getTypePtr();

if (type_arg1->isStructureType()) {...}
@yonghong-song

This comment has been minimized.

Copy link
Collaborator

yonghong-song commented Feb 14, 2019

[buildbot, test this please]

@yonghong-song

This comment has been minimized.

Copy link
Collaborator

yonghong-song commented Feb 14, 2019

The following __int128 should be the most common usage:

                      '__int128'          : (ct.c_longlong * 2),
                       'unsigned __int128' : (ct.c_ulonglong * 2),

If users try to map __int128 differently in python, they should try to explicit ct_type definitions in python instead.

@yonghong-song yonghong-song merged commit 3156303 into iovisor:master Feb 14, 2019
1 check passed
1 check passed
default
Details
boat0 added a commit to boat0/bcc that referenced this pull request Feb 15, 2019
Simplify code following iovisor#2198 (iovisor#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
yonghong-song added a commit that referenced this pull request Feb 15, 2019
Simplify code following #2198 (#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
@boat0 boat0 deleted the boat0:gen-pydata branch Feb 15, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 16, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 16, 2019
yonghong-song added a commit that referenced this pull request Feb 16, 2019
@pchaigno

This comment has been minimized.

Copy link
Contributor

pchaigno commented Feb 22, 2019

This is great! Thanks a lot @boat0!

The developer tutorial would also need to be updated to take this pull request into account.

boat0 added a commit to boat0/bcc that referenced this pull request Feb 23, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 23, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 23, 2019
@boat0

This comment has been minimized.

Copy link
Contributor Author

boat0 commented Feb 23, 2019

This is great! Thanks a lot @boat0!

The developer tutorial would also need to be updated to take this pull request into account.

pr submitted #2226. Thanks!

arighi pushed a commit to arighi/bcc that referenced this pull request Mar 21, 2019
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
)

* generate perf event data structure in Python automatically

When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is tedious and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

    // define output data structure in C
    struct data_t {
        u32 pid;
        u64 ts;
        char comm[TASK_COMM_LEN];
    };
    BPF_PERF_OUTPUT(events);
    ...

Old way:

    # define output data structure in Python
    TASK_COMM_LEN = 16    # linux/sched.h
    class Data(ct.Structure):
        _fields_ = [("pid", ct.c_ulonglong),
                    ("ts", ct.c_ulonglong),
                    ("comm", ct.c_char * TASK_COMM_LEN)]

    def print_event(cpu, data, size):
        event = ct.cast(data, ct.POINTER(Data)).contents
    ...

New way:

    def print_event(cpu, data, size):
        event = b["events"].event(data)
    ...

* tools/tcpconnect.py: deduce perf event data structure automatically

Take tcpconnect.py as an example to show the new, simpler way
of outputing perf event.

Other tools/examples can be simplified in a similar way.
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
…sor#2204)

Simplify code following iovisor#2198 (iovisor#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.