Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ringbuf Support for Python API #2989

Merged
merged 44 commits into from Jun 25, 2020

Conversation

willfindlay
Copy link
Contributor

@willfindlay willfindlay commented Jun 24, 2020

This pull request contains an implementation for ringbuf support in bcc's Python API.

Fixes #2985.

Summary of Changes

  • Added ringbuf helpers from libbpf API to libbcc
  • Added a new RingBuf class to represent the ringbuf map
  • Added a new EventArrayBase class to reduce code duplication between PerfEventArray and RingBuf
  • Added BPF_RINGBUF_OUTPUT macro for BPF programs
  • Added documentation

Examples

The following example programs describe how the API works.

On the BPF side, we have two choices for submitting events. map.ringbuf_output works like map.perf_submit for perf buffers,
while map.ringbuf_reserve/map.ringbuf_discard/map.ringbuf_submit present an alternative approach that allows for finer control.

map.ringbuf_output works like:

BPF_RINGBUF_OUTPUT(events);

struct data_t data {
    u32 pid;
};

TRACEPOINT_PROBE(raw_syscalls, sys_enter) {
    u32 pid = bpf_get_current_pid_tgid();

    struct data_t data = { .pid = pid };

    events.ringbuf_output(&data, sizeof(data), 0 /* flags */);
    
    return 0;
}

map.ringbuf_reserve/map.ringbuf_discard/map.ringbuf_submit work like:

BPF_RINGBUF_OUTPUT(events);

struct data_t data {
    u32 pid;
};

TRACEPOINT_PROBE(raw_syscalls, sys_enter) {
    u32 pid = bpf_get_current_pid_tgid();

    struct data_t *data = events.ringbuf_reserve(sizeof(struct data_t));
    if (!data)
        return 1;

    data->pid = pid;

    events.ringbuf_submit(data, 0 /* flags */);
    // or, to discard the data...
    events.ringbuf_discard(data, 0 /* flags */);

    return 0;
}

On the Python side, the user calls open_ring_buffer to associate a callback with the buffer. There are two options for reading data from kernelspace: ring_buffer_poll which works like perf_buffer_poll and ring_buffer_consume which reads data without polling first.

import time

from bcc import BPF

b = BPF(src_file=b'source.c')

def callback(ctx, data, size):
    event = b['events'].event(data)
    print(event.pid)

b['events'].open_ring_buffer(callback)

while True:
    # Option 1...
    b.ring_buffer_poll(30)
    # Option 2...
    b.ring_buffer_consume()
    time.sleep(0.5)

Unlike perf buffer callbacks, ringbuf supports return values from the callback, which can be used to indicate error conditions to stop polling early. If a callback does not return an integer, we fall back to returning 0, which approximates perf buffer behavior.

@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 24, 2020

[buildbot, test this please]

@willfindlay
Copy link
Contributor Author

willfindlay commented Jun 24, 2020

Is there an easy way to see what tests specifically are failing? Seems to be something in test_tools_smoke.py from what I can tell.

@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 24, 2020

Is there an easy way to see what tests specifically are failing? Seems to be something in test_tools_smoke.py from what I can tell.

Probably a flaky test. Let me try again.

@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 24, 2020

[buildbot, test this please]

@willfindlay
Copy link
Contributor Author

willfindlay commented Jun 24, 2020

Ah seems like it was indeed a flaky test.

src/cc/export/helpers.h Outdated Show resolved Hide resolved
src/cc/frontends/clang/b_frontend_action.cc Outdated Show resolved Hide resolved
src/cc/libbpf.c Outdated Show resolved Hide resolved
src/cc/libbpf.h Outdated Show resolved Hide resolved
src/python/bcc/table.py Outdated Show resolved Hide resolved
src/python/bcc/table.py Outdated Show resolved Hide resolved
@willfindlay
Copy link
Contributor Author

willfindlay commented Jun 24, 2020

Thank you for your comments @anakryiko. I left a few followup questions / concerns as replies. I should have time to update this PR later today or tomorrow.

@willfindlay willfindlay marked this pull request as draft Jun 24, 2020
@willfindlay
Copy link
Contributor Author

willfindlay commented Jun 24, 2020

@anakryiko I have now addressed all of your feedback (with the exception of pages vs bytes in the ringbuf macro, as discussed). Things seem to be in a good place now, but if you have any other concerns let me know.

@willfindlay willfindlay marked this pull request as ready for review Jun 25, 2020
@willfindlay willfindlay requested a review from brendangregg as a code owner Jun 25, 2020
@willfindlay
Copy link
Contributor Author

willfindlay commented Jun 25, 2020

@yonghong-song Should be good to test again. I've added unit tests under tests/python/test_ringbuf.py, as well as examples and documentation.

@willfindlay willfindlay changed the title [RFC] Ringbuf Support for Python API Ringbuf Support for Python API Jun 25, 2020
Copy link
Contributor

@anakryiko anakryiko left a comment

It looks good from API perspective, would be good to get a review from someone more familiar with BCC internals, though. See also the point of being able to size ringbuf from user-space.


Syntax: ```table.open_ring_buffer(callback, ctx=None)```

This operates on a table as defined in BPF as BPF_RINGBUF_OUTPUT(), and associates the callback Python function ```callback``` to be called when data is available in the ringbuf ring buffer. This is part of the new (Linux 5.8+) recommended mechanism for transferring per-event data from kernel to user space. Unlike perf buffers, ringbuf sizes are specified within the BPF program, as part of the ```BPF_RINGBUF_OUTPUT``` macro. If the callback is not processing data fast enough, some submitted data may be lost. In this case, the events should be polled more frequently and/or the size of the ring buffer should be increased.
Copy link
Contributor

@anakryiko anakryiko Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to set ringbuf size from Python API in user-space? There are probably cases, where this size is determined dynamically by user space logic

Copy link
Contributor Author

@willfindlay willfindlay Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this would be easy to implement given the way bcc treats maps. We could potentially change this in the future, but for now, I think the current API is fine. For now, if the user desires different map sizes according to userspace logic, they can just programmatically generate the correct C code. This is already a common pattern in bcc tools.


try:
while 1:
b.ring_buffer_consume()
Copy link
Contributor

@anakryiko anakryiko Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ring_buffer_consume is very specialized use-case, I'd do all the examples with ring_buffer_poll() and just mention ring_buffer_consume as an alternative, as you did in the example above.

@anakryiko
Copy link
Contributor

anakryiko commented Jun 25, 2020

@yonghong-song, do you squash all those 44 commits into one when you land? I wonder if it makes sense to ask people to re-organize and squash into fewer logic commits before landing. E.g., implementation, testing and documentation commits. 44 commits are pretty unwieldy.

@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 25, 2020

@anakryiko Thanks for review. I won't push these 44 commits. I indeed frequently squash them into one commit. llvm sometimes also has giant commits. I think it is okay not to put burden on the developer for already done work.
I will take a look shortly.

@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 25, 2020

[buildbot, test this please]

1 similar comment
@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 25, 2020

[buildbot, test this please]

@yonghong-song yonghong-song merged commit fe730f2 into iovisor:master Jun 25, 2020
2 checks passed
@yonghong-song
Copy link
Collaborator

yonghong-song commented Jun 25, 2020

Merged. Thanks @willfindlay and @anakryiko!

ismhong added a commit to ismhong/bcc that referenced this issue Dec 6, 2021
The merge request [iovisor#2989] add
include <asm/page.h> in src/cc/export/helper.h for new feature in
Linux 5.7, but it will cause compiling bpf program failed, since
it tried to include assembly code in header file.

Temporally comment out this include, maybe need to check how to fix
this issue in the future.
@raoufkh
Copy link

raoufkh commented Jul 11, 2022

Hello

First, thank you @willfindlay for this work.

I have one question, in TC and XDP programs, how can we send the network packet to the user space program. Using perf buffers, this was possible by using the perf_submit_skb method while perf_submit allows only a data struct we define.

I'm trying to include an unsigned char * pointer in my struct (which I will send to the user space program using ringbuf_output but I don't manage to copy the network packet inside the struct. Doing so, the struct memory space will contain the value of the pointer (memory address of netwokr packet) but not the network packet itself. Regarding to the linux documentation, ring buffers should allow sending entire network packet to the user space.

In the BCC reference guide and by browsing the libbpf library, I saw that when opening a ring buffer, there is a ctx argument to pass but there is no information on how to use it. Does this argument allow to solve my problem?

Could you provide help please?

Regards
Raouf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants