Skip to content

generate perf event data structure in Python automatically #2198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 14, 2019

Conversation

boat0
Copy link
Contributor

@boat0 boat0 commented Feb 12, 2019

When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is redundant and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

// define output data structure in C
struct data_t {
    u32 pid;
    u64 ts;
    char comm[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(events);
...

Old way:

# define output data structure in Python
TASK_COMM_LEN = 16    # linux/sched.h
class Data(ct.Structure):
    _fields_ = [("pid", ct.c_ulonglong),
                ("ts", ct.c_ulonglong),
                ("comm", ct.c_char * TASK_COMM_LEN)]

def print_event(cpu, data, size):
    event = ct.cast(data, ct.POINTER(Data)).contents
...

New way:

def print_event(cpu, data, size):
    event = b["events"].event(data)
...

@yonghong-song
Copy link
Collaborator

[buildbot, test this please]

@boat0
Copy link
Contributor Author

boat0 commented Feb 12, 2019

It seems that buildbot has skipped 60073ef upon which 78ea22f is based.

'long long': ct.c_longlong,
'unsigned long long': ct.c_ulonglong,
'u64': ct.c_ulonglong,
'void *': ct.c_void_p }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add s8, s16, s32, s64 as well?
How do we handle __int128 and unsigned __int128? I guess in such cases, the old way is needed where __int128 is coded as s64[2]?

Copy link
Contributor Author

@boat0 boat0 Feb 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The old way works great for __int128 here too:
'__int128' : (ct.c_longlong * 2),

s8/16/32/64 also added. Any other type that is not supported will stop
the script from running with message
"Type: '%s' not recognized. Please define the data with ctypes manually."

i = 0
while i < num_fields:
field_name = lib.bpf_perf_event_field_name(self.bpf.module, i)
field_type = lib.bpf_perf_event_field_type(self.bpf.module, i)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have just one libbcc call like lib.bpf_perf_event_field to get field name and type in one shot?

Copy link
Contributor Author

@boat0 boat0 Feb 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. We can concatenate type and name in libbcc and then split them in Python.

@yonghong-song
Copy link
Collaborator

@boat0 buildbot is okay. I see the error with llvm 3.8.0 and llvm 3.8.1 on ubuntu.

CMakeFiles/usdt-static.dir/usdt_args.cc.o -c /tmp/debuild.DTQhKj/bcc/src/cc/usdt/usdt_args.cc
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc: In member function ‘bool ebpf::BTypeVisitor::VisitCallExpr(clang::CallExpr*)’:
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc:818:46: error: ‘const class clang::Type’ has no member named ‘getPointeeOrArrayElementType’; did you mean ‘getPointeeType’?
                                            ->getPointeeOrArrayElementType()->getAsTagDecl();
                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              getPointeeType

Could you check whether you can find an alternative implementation? It would be good to find an implementation working for all llvm's. But if you cannot, using #ifdef is okay.

Also, could you add some tests to test some of your supported types including array types?
This patch can greatly simplify how to write bcc python scripts. Thanks!

@boat0
Copy link
Contributor Author

boat0 commented Feb 13, 2019

@boat0 buildbot is okay. I see the error with llvm 3.8.0 and llvm 3.8.1 on ubuntu.

CMakeFiles/usdt-static.dir/usdt_args.cc.o -c /tmp/debuild.DTQhKj/bcc/src/cc/usdt/usdt_args.cc
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc: In member function ‘bool ebpf::BTypeVisitor::VisitCallExpr(clang::CallExpr*)’:
/tmp/debuild.DTQhKj/bcc/src/cc/frontends/clang/b_frontend_action.cc:818:46: error: ‘const class clang::Type’ has no member named ‘getPointeeOrArrayElementType’; did you mean ‘getPointeeType’?
                                            ->getPointeeOrArrayElementType()->getAsTagDecl();
                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              getPointeeType

Could you check whether you can find an alternative implementation? It would be good to find an implementation working for all llvm's. But if you cannot, using #ifdef is okay.

Now we use getPointeeType().getTypePtr() instead of getPointeeOrArrayElementType()

@boat0
Copy link
Contributor Author

boat0 commented Feb 13, 2019

Also, could you add some tests to test some of your supported types including array types?

OK. I will do it in a subsequent PR.
I am now using tools/tcpconnect.py as a test and everything is OK.

tcpconnect.py has two events defined and features __int128 and array types:

// separate data structs for ipv4 and ipv6
struct ipv4_data_t {
    u64 ts_us;
    u32 pid;
    u32 uid;
    u32 saddr;
    u32 daddr;
    u64 ip;
    u16 dport;
    char task[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(ipv4_events);

struct ipv6_data_t {
    u64 ts_us;
    u32 pid;
    u32 uid;
    unsigned __int128 saddr;
    unsigned __int128 daddr;
    u64 ip;
    u16 dport;
    char task[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(ipv6_events);

@boat0 boat0 force-pushed the gen-pydata branch 2 times, most recently from d8dbbd5 to b3868c5 Compare February 13, 2019 09:50
@yonghong-song
Copy link
Collaborator

[buildbot, test this please]

When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is tedious and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

    // define output data structure in C
    struct data_t {
        u32 pid;
        u64 ts;
        char comm[TASK_COMM_LEN];
    };
    BPF_PERF_OUTPUT(events);
    ...

Old way:

    # define output data structure in Python
    TASK_COMM_LEN = 16    # linux/sched.h
    class Data(ct.Structure):
        _fields_ = [("pid", ct.c_ulonglong),
                    ("ts", ct.c_ulonglong),
                    ("comm", ct.c_char * TASK_COMM_LEN)]

    def print_event(cpu, data, size):
        event = ct.cast(data, ct.POINTER(Data)).contents
    ...

New way:

    def print_event(cpu, data, size):
        event = b["events"].event(data)
    ...
Take tcpconnect.py as an example to show the new, simpler way
of outputing perf event.

Other tools/examples can be simplified in a similar way.
@boat0
Copy link
Contributor Author

boat0 commented Feb 14, 2019

Last build failed at test_usdt2.py in which event type is defined as a simple int type.
It is supposed to do nothing in this case, but the checking turns out to be not working.
In this update I make it working by

auto type_arg1 = Call->getArg(1)->IgnoreCasts()->getType().getTypePtr()->getPointeeType().getTypePtr();

if (type_arg1->isStructureType()) {...}

@yonghong-song
Copy link
Collaborator

[buildbot, test this please]

@yonghong-song
Copy link
Collaborator

The following __int128 should be the most common usage:

                      '__int128'          : (ct.c_longlong * 2),
                       'unsigned __int128' : (ct.c_ulonglong * 2),

If users try to map __int128 differently in python, they should try to explicit ct_type definitions in python instead.

@yonghong-song yonghong-song merged commit 3156303 into iovisor:master Feb 14, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 15, 2019
Simplify code following iovisor#2198 (iovisor#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
yonghong-song pushed a commit that referenced this pull request Feb 15, 2019
Simplify code following #2198 (#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
@boat0 boat0 deleted the gen-pydata branch February 15, 2019 05:27
boat0 added a commit to boat0/bcc that referenced this pull request Feb 16, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 16, 2019
yonghong-song pushed a commit that referenced this pull request Feb 16, 2019
@pchaigno
Copy link
Contributor

This is great! Thanks a lot @boat0!

The developer tutorial would also need to be updated to take this pull request into account.

boat0 added a commit to boat0/bcc that referenced this pull request Feb 23, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 23, 2019
boat0 added a commit to boat0/bcc that referenced this pull request Feb 23, 2019
@boat0
Copy link
Contributor Author

boat0 commented Feb 23, 2019

This is great! Thanks a lot @boat0!

The developer tutorial would also need to be updated to take this pull request into account.

pr submitted #2226. Thanks!

arighi pushed a commit to arighi/bcc that referenced this pull request Mar 21, 2019
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
)

* generate perf event data structure in Python automatically

When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is tedious and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

    // define output data structure in C
    struct data_t {
        u32 pid;
        u64 ts;
        char comm[TASK_COMM_LEN];
    };
    BPF_PERF_OUTPUT(events);
    ...

Old way:

    # define output data structure in Python
    TASK_COMM_LEN = 16    # linux/sched.h
    class Data(ct.Structure):
        _fields_ = [("pid", ct.c_ulonglong),
                    ("ts", ct.c_ulonglong),
                    ("comm", ct.c_char * TASK_COMM_LEN)]

    def print_event(cpu, data, size):
        event = ct.cast(data, ct.POINTER(Data)).contents
    ...

New way:

    def print_event(cpu, data, size):
        event = b["events"].event(data)
    ...

* tools/tcpconnect.py: deduce perf event data structure automatically

Take tcpconnect.py as an example to show the new, simpler way
of outputing perf event.

Other tools/examples can be simplified in a similar way.
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
…sor#2204)

Simplify code following iovisor#2198 (iovisor#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
palexster pushed a commit to palexster/bcc that referenced this pull request Jul 7, 2019
CrackerCat pushed a commit to CrackerCat/bcc that referenced this pull request Jul 31, 2024
)

* generate perf event data structure in Python automatically

When ring buffers are opened to receive custom perf event, we have
to define the event data structure twice: once in BPF C and once
in Python. It is tedious and error-prone.

This patch implements the automatic generation of Python data structure
from the C declaration, thus making the redundant definition in Python
unnecessary. Example:

    // define output data structure in C
    struct data_t {
        u32 pid;
        u64 ts;
        char comm[TASK_COMM_LEN];
    };
    BPF_PERF_OUTPUT(events);
    ...

Old way:

    # define output data structure in Python
    TASK_COMM_LEN = 16    # linux/sched.h
    class Data(ct.Structure):
        _fields_ = [("pid", ct.c_ulonglong),
                    ("ts", ct.c_ulonglong),
                    ("comm", ct.c_char * TASK_COMM_LEN)]

    def print_event(cpu, data, size):
        event = ct.cast(data, ct.POINTER(Data)).contents
    ...

New way:

    def print_event(cpu, data, size):
        event = b["events"].event(data)
    ...

* tools/tcpconnect.py: deduce perf event data structure automatically

Take tcpconnect.py as an example to show the new, simpler way
of outputing perf event.

Other tools/examples can be simplified in a similar way.
CrackerCat pushed a commit to CrackerCat/bcc that referenced this pull request Jul 31, 2024
…sor#2204)

Simplify code following iovisor#2198 (iovisor#2198).

Some tools are not touched: mountsnoop.py, trace.py, lib/*.py, old/*.py.
CrackerCat pushed a commit to CrackerCat/bcc that referenced this pull request Jul 31, 2024
CrackerCat pushed a commit to CrackerCat/bcc that referenced this pull request Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants