[FEATURE] Refactor bpftime for Enhanced Extensibility and Maintainability #202

Open
yunwei37 opened this issue Jan 29, 2024 · 21 comments

@yunwei37
Member

Issue Summary:

The current bpftime architecture intermingles code for different eBPF program types and backends, such as uprobe and syscall tracing, within the syscall-server.so. For example:

  1. The attach_ctx class contains both the uprobe and the syscall code, and also maintains the state of the eBPF virtual machines.
    bool check_exist_syscall_trace_program();
    // Check whether there is a syscall trace program
    bool check_exist_syscall_trace_program(const handler_manager *manager);
    // Check whether a certain pid was already equipped with syscall tracer
    // Using a set stored in the shared memory
    bool check_syscall_trace_setup(int pid);
    // Set whether a certain pid was already equipped with syscall tracer
    // Using a set stored in the shared memory
    void set_syscall_trace_setup(int pid, bool whether);
  2. The shared memory also contains uprobe- and syscall-related attach information, which is computed at bpftime load time by the syscall-server.so. For instance:
    if ((int)attr->type == determine_uprobe_perf_type()) {
        // NO legacy bpf types
        bool retprobe =
            attr->config & (1 << determine_uprobe_retprobe_bit());
        size_t ref_ctr_off =
            attr->config >> PERF_UPROBE_REF_CTR_OFFSET_SHIFT;
        const char *name = (const char *)(uintptr_t)attr->config1;
        uint64_t offset = attr->config2;
        SPDLOG_DEBUG(
            "Creating uprobe name {} offset {} retprobe {} ref_ctr_off {} attr->config={:x}",
            name, offset, retprobe, ref_ctr_off, attr->config);
        int id = bpftime_uprobe_create(
    It may be better to have an API that allows the backend (attach target) to control which kinds of perf events should be mocked in userspace, and how to attach certain eBPF progs to these events (e.g. XDP in userspace DPDK, nginx modules, plugins).

This design limits the addition of new attach backends and eBPF program types, and also complicates the codebase. A refactor is proposed to address these issues and set the stage for future enhancements.

Proposed Changes:

  1. Decouple Syscall Server Responsibilities:

    • Restrict syscall-server.so and the daemon to only handle recording syscall traces and states, such as the creation of progs, maps, and links within the kernel.
    • Remove mixed responsibilities and allow for the loading of other eBPF program types.
  2. Split Attach Context:

    • Separate the attach context class into two distinct targets, runtime and attach_events.
    • The runtime should offer two types of APIs:
      • An API for initializing and managing all progs and maps.
      • An API for customizing attach-related information processing and operations.
    • Developers can inherit from the attach_events class to implement their own event sources.
  3. Temporary Feature Development Freeze:

    • Before the completion of the refactor, pause the addition of new features to avoid further complicating the current state.
  4. Future-Proofing and Extensibility:

    • The refactor will facilitate replacing components like Frida, embedding the runtime statically in other applications, and expanding to new domains such as GPU tracing, XDP, and the attach types requested in #158 ([FEATURE] Add new attach types).
    • Future modifications will be isolated to the attach context, simplifying updates and maintenance.
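
To make the runtime / attach_events split from item 2 more concrete, here is a minimal sketch. Every name in it (runtime, attach_events, named_event_source, load_prog, fire) is hypothetical and only illustrates the shape of the idea; it is not the existing bpftime API, and the "program" is a plain callable rather than a real eBPF VM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a loaded eBPF program: a callable taking a context pointer.
using prog_fn = std::function<std::uint64_t(void *ctx)>;

// Hypothetical "runtime" target: owns programs only and knows nothing about events.
class runtime {
public:
    // Stores the program and returns its id.
    int load_prog(prog_fn fn) {
        progs_.push_back(std::move(fn));
        return static_cast<int>(progs_.size()) - 1;
    }
    std::uint64_t run(int prog_id, void *ctx) const { return progs_.at(prog_id)(ctx); }

private:
    std::vector<prog_fn> progs_;
};

// Hypothetical "attach_events" base class that developers would inherit from.
class attach_events {
public:
    virtual ~attach_events() = default;
    virtual const char *name() const = 0;
    virtual bool attach(runtime &rt, int prog_id, const std::string &target) = 0;
};

// Toy event source: programs are attached to a target name and run when it "fires".
// No real hooking happens here.
class named_event_source : public attach_events {
public:
    const char *name() const override { return "named"; }
    bool attach(runtime &rt, int prog_id, const std::string &target) override {
        rt_ = &rt;
        attached_[target].push_back(prog_id);
        return true;
    }
    // Simulates the event; returns how many programs were run.
    std::size_t fire(const std::string &target, void *ctx) {
        auto it = attached_.find(target);
        if (it == attached_.end()) return 0;
        for (int id : it->second) rt_->run(id, ctx);
        return it->second.size();
    }

private:
    runtime *rt_ = nullptr;
    std::map<std::string, std::vector<int>> attached_;
};
```

In this sketch a new event source only has to subclass attach_events; the runtime class never changes when a new attach type is added, which is the property the proposal is after.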

Rationale:

This refactor addresses fundamental design flaws that were not evident in the initial conception of bpftime. It aims to simplify the current codebase and prepare for more stable and scalable expansion. Although this entails a significant overhaul, it is manageable given the current code volume.

Next Steps:

  • Draft a detailed plan for the refactor, outlining the separation of responsibilities and the new API designs.
  • Address the current stability issues that have been exposed.
  • Establish a timeline for the refactor process, considering the complexity of the task.

Call for Input:

We welcome input from the community on this proposed refactor. Any insights or suggestions, especially regarding the decoupling of components and API design, would be highly appreciated.

@yunwei37 yunwei37 added the enhancement New feature or request label Jan 29, 2024
@Officeyutong
Contributor

Replacing Frida is already possible through the attach manager (by subclassing).

@Officeyutong
Contributor

Officeyutong commented Jan 30, 2024

Split attach manager into a separate target

#205

  • Attach manager: a small piece of code that provides an API to replace or listen on a given userspace function. Currently this piece lives under the runtime target.

The attach manager is relatively independent of the runtime: it doesn't invoke any runtime API. It only provides the uprobe/uretprobe implementation to the runtime (bpf_attach_ctx, for now). Besides, the attach manager is the only part of bpftime that has a strong dependency on Frida.

So if we split the attach manager into a separate target, we might gain:

  • It would be easier to replace FridaGum with another hooking framework, since only the attach manager relies on FridaGum.
  • A clearer codebase: the attach manager doesn't rely on anything in the runtime, so splitting it out makes the boundaries more obvious.
  • It would be easier to add other "attach" implementations: the attach manager only provides the ability to hook userspace functions. If we need other "attach" implementations, such as an eBPF program as an nginx module, it would be easier to update the runtime to accommodate them.

@yunwei37
Member Author

yunwei37 commented Jan 30, 2024

What about attach_ctx class?

Is it possible to implement all uprobe/syscall related code outside of runtime target and runtime dir? If we can do that, is it a better solution?

Also, there is some attach-related code in the syscall transformer and syscall-server.so.

@Officeyutong
Contributor

Officeyutong commented Jan 30, 2024

What about attach_ctx class?

Is it possible to implement all uprobe/syscall related code outside of runtime target and runtime dir? If we can do that, is it a better solution?

Also, there is some attach-related code in the syscall transformer and syscall-server.so.

The implementation of syscall tracing is not compatible with the attach manager (let's call it the uprobe attach manager):

  • uprobes: register a callback function at a certain function address, then wait for calls to the callback. Calls to the callback are made by the hooked function itself.
  • syscall trace: some foreign code (the text transformer) calls a specified callback function, and this callback function dispatches the call to eBPF programs.

So I think we might split the syscall trace attach implementation into another target, and just call it the syscall trace attach manager. Syscall trace callbacks would be registered with this target, and this target would provide a dispatch entry (the function that the text transformer calls when a syscall is captured). We would also rename the current attach manager to uprobe attach manager, since it's only responsible for uprobes.
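
A minimal sketch of what such a syscall trace attach manager could look like, assuming callbacks are keyed by syscall number. The names syscall_event, syscall_trace_attach_manager, register_callback and dispatch are hypothetical, and the sketch ignores thread safety and the question of how the original syscall is eventually invoked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical raw syscall event as the text transformer might hand it over.
struct syscall_event {
    std::int64_t nr;      // syscall number
    std::int64_t args[6]; // raw syscall arguments
};

using syscall_callback = std::function<void(const syscall_event &)>;

// Hypothetical syscall trace attach manager.
class syscall_trace_attach_manager {
public:
    // Registers a callback for one syscall number.
    void register_callback(std::int64_t nr, syscall_callback cb) {
        callbacks_[nr].push_back(std::move(cb));
    }

    // Dispatch entry: the function the text transformer would call for each
    // captured syscall. Returns how many callbacks were invoked.
    std::size_t dispatch(const syscall_event &ev) const {
        auto it = callbacks_.find(ev.nr);
        if (it == callbacks_.end()) return 0;
        for (const auto &cb : it->second) cb(ev);
        return it->second.size();
    }

private:
    std::unordered_map<std::int64_t, std::vector<syscall_callback>> callbacks_;
};
```

The contrast with uprobes shows up in who calls whom: here the foreign code pushes every captured syscall into dispatch, whereas a uprobe implementation installs a hook and is later called back by the hooked function itself.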

@yunwei37
Member Author

So maybe we can create two targets?

  • /attach/uprobe
  • /attach/syscalls

Here, attach is a new directory under the project root.

The user can then specify which one to compile with the runtime.

@yunwei37
Member Author

Shall we come up with a better name for the uprobe attach manager? Maybe attach_ctx again, or attach_target, or attach_events?

Some background:

  • The eBPF runtime can be embedded in shared memory, or compiled and linked into other applications as an extension. The runtime is responsible for loading and managing the eBPF programs in the process.
  • There can be multiple eBPF attach methods at the same time; for example, uprobes and syscall tracepoints should be able to work together.
  • One eBPF program can be attached to multiple targets or events, and one event can have multiple eBPF programs attached to it.

So maybe we can have a design like this:

  1. One attach_manager manages all attach_ctx (or attach_target) instances. The attach manager can hold unique_ptr ownership of the runtime.
  2. attach_ctx has a base class and some subclasses, for example uprobe_attach_ctx and syscalls_attach_ctx. An attach_ctx should be able to access the shared memory through the runtime API, and also configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel.

All of this code lives in the attach directory in the project root.

Another problem: how can the attach_targets configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel, if we don't want to hardcode this in the -server.so as we do now?
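
One possible shape for this design, as a sketch only: each attach_ctx subclass declares which event kinds it can mock, and the attach_manager routes an event kind either to the owning context or to "pass to the kernel". The event_kind enum, can_mock, and route are hypothetical names; ownership of the runtime and shared-memory access are left out to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event kinds an attach context might claim.
enum class event_kind { uprobe, uretprobe, syscall_tracepoint, hardware_counter };

// Hypothetical base class for attach contexts.
class attach_ctx {
public:
    virtual ~attach_ctx() = default;
    virtual std::string name() const = 0;
    // True if this context can emulate the given event kind in userspace.
    virtual bool can_mock(event_kind kind) const = 0;
};

class uprobe_attach_ctx : public attach_ctx {
public:
    std::string name() const override { return "uprobe"; }
    bool can_mock(event_kind kind) const override {
        return kind == event_kind::uprobe || kind == event_kind::uretprobe;
    }
};

class syscalls_attach_ctx : public attach_ctx {
public:
    std::string name() const override { return "syscalls"; }
    bool can_mock(event_kind kind) const override {
        return kind == event_kind::syscall_tracepoint;
    }
};

// Hypothetical manager that owns every registered attach context.
class attach_manager {
public:
    void add(std::unique_ptr<attach_ctx> ctx) { ctxs_.push_back(std::move(ctx)); }

    // Returns the context that mocks `kind` in userspace, or nullptr when no
    // context claims it, meaning the request should go to the kernel.
    const attach_ctx *route(event_kind kind) const {
        for (const auto &ctx : ctxs_)
            if (ctx->can_mock(kind)) return ctx.get();
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<attach_ctx>> ctxs_;
};
```

With this arrangement the mock-or-passthrough decision is derived from whatever contexts were registered, rather than being hardcoded in the syscall server.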

@Officeyutong
Contributor

So maybe we can create two targets?

  • /attach/uprobe
  • /attach/syscalls

Here, attach is a new directory under the project root.

The user can then specify which one to compile with the runtime.

Sounds good. More attach implementations could be added in the future.

@Officeyutong
Contributor

Shall we come up with a better name for the uprobe attach manager? Maybe attach_ctx again, or attach_target, or attach_events?

Some background:

  • The eBPF runtime can be embedded in shared memory, or compiled and linked into other applications as an extension. The runtime is responsible for loading and managing the eBPF programs in the process.
  • There can be multiple eBPF attach methods at the same time; for example, uprobes and syscall tracepoints should be able to work together.
  • One eBPF program can be attached to multiple targets or events, and one event can have multiple eBPF programs attached to it.

So maybe we can have a design like this:

  1. One attach_manager manages all attach_ctx (or attach_target) instances. The attach manager can hold unique_ptr ownership of the runtime.
  2. attach_ctx has a base class and some subclasses, for example uprobe_attach_ctx and syscalls_attach_ctx. An attach_ctx should be able to access the shared memory through the runtime API, and also configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel.

All of this code lives in the attach directory in the project root.

Another problem: how can the attach_targets configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel, if we don't want to hardcode this in the -server.so as we do now?

This also sounds good. But the uprobe attach manager I mentioned above is only a few classes that provide an API to register a callback at a certain function. It has nothing to do with eBPF. Maybe a name like uprobe_attach_impl is more suitable for this part of the code?

@Officeyutong
Contributor

Shall we come up with a better name for the uprobe attach manager? Maybe attach_ctx again, or attach_target, or attach_events?

Some background:

  • The eBPF runtime can be embedded in shared memory, or compiled and linked into other applications as an extension. The runtime is responsible for loading and managing the eBPF programs in the process.
  • There can be multiple eBPF attach methods at the same time; for example, uprobes and syscall tracepoints should be able to work together.
  • One eBPF program can be attached to multiple targets or events, and one event can have multiple eBPF programs attached to it.

So maybe we can have a design like this:

  1. One attach_manager manages all attach_ctx (or attach_target) instances. The attach manager can hold unique_ptr ownership of the runtime.
  2. attach_ctx has a base class and some subclasses, for example uprobe_attach_ctx and syscalls_attach_ctx. An attach_ctx should be able to access the shared memory through the runtime API, and also configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel.

All of this code lives in the attach directory in the project root.

Another problem: how can the attach_targets configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel, if we don't want to hardcode this in the -server.so as we do now?

This also sounds good. But the uprobe attach manager I mentioned above is only a few classes that provide an API to register a callback at a certain function. It has nothing to do with eBPF. Maybe a name like uprobe_attach_impl is more suitable for this part of the code?

And the attach manager you mentioned seems to be something that is responsible for "resolving perf events (or other equivalents), and allowing a certain event to call a certain eBPF program". Did I misunderstand what you said? If not, I think the name attach manager suits this thing better, and it should be split into individual targets.

But from another perspective, I still think uprobe_attach_impl should be split into an individual target. It has few dependencies on other parts of bpftime. Splitting it into an individual target would make the codebase clearer, and would be more convenient for users who only want to use our uprobe implementation.

@yunwei37
Member Author

yunwei37 commented Jan 31, 2024

Yes, uprobe_attach_impl should be split into an individual target.

Can you describe the full dependency graph and class inheritance of all the modules/CMake targets, as you think they should be?

I think it could be something like:

  1. attach_manager owns all the attach_impl instances and owns the runtime. It's built as an 'object' target in CMake, with the runtime as a dependency.
  2. The attach_impl base class lives in a header. uprobe_attach_impl inherits from it and is also built as a standalone target.
  3. The agent.so depends on these attach_impl targets.

@yunwei37
Member Author

Also, can we add new attach events at load time, so that they are not statically compiled in?

For example, we could have three kinds of agent.so:

  1. One is compiled with both uprobe and syscall tracepoints enabled.
  2. The second is compiled with only uprobe support.
  3. The third is used statically inside the application, like the nginx module or XDP in DPDK.

We can let the agent or the user configure which syscall functionality should be mocked in the syscall-server.so. For example, some perf event syscalls and bpf link types would be mocked or answered in the syscall-server.so, while others would not.

The config can be stored in the shared memory, so that new attach targets can register themselves in it.
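
As a sketch of what such a shared-memory config might look like, the table below is a fixed-size, pointer-free struct, so in principle it could be placed directly in a shared memory segment. The names mock_config, register_mocked_type and should_mock are hypothetical, the 64-entry limit is an arbitrary assumption, and a real version would need synchronization between processes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical config table: bit i set means perf event type i is mocked in
// userspace; a clear bit means the request is passed through to the kernel.
struct mock_config {
    static constexpr std::size_t max_types = 64; // arbitrary limit for this sketch
    std::uint64_t mocked_perf_types = 0;
};

// Trivially copyable and pointer-free, so it can live in shared memory.
static_assert(std::is_trivially_copyable<mock_config>::value,
              "mock_config must be safe to place in shared memory");

// Called by an attach target at load time to claim a perf event type.
inline bool register_mocked_type(mock_config &cfg, std::uint32_t perf_type) {
    if (perf_type >= mock_config::max_types) return false;
    cfg.mocked_perf_types |= (std::uint64_t{1} << perf_type);
    return true;
}

// Called by the syscall server to decide between mocking and passthrough.
inline bool should_mock(const mock_config &cfg, std::uint32_t perf_type) {
    return perf_type < mock_config::max_types &&
           ((cfg.mocked_perf_types >> perf_type) & 1u) != 0;
}
```

An agent built with only uprobe support would register just the uprobe perf type, and the syscall server would forward everything else to the kernel without needing to know which agents exist.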

@Officeyutong
Contributor

Yes, uprobe_attach_impl should be split into an individual target.

Can you describe the full dependency graph and class inheritance of all the modules/CMake targets, as you think they should be?

I think it could be something like:

  1. attach_manager owns all the attach_impl instances and owns the runtime. It's built as an 'object' target in CMake, with the runtime as a dependency.
  2. The attach_impl base class lives in a header. uprobe_attach_impl inherits from it and is also built as a standalone target.
  3. The agent.so depends on these attach_impl targets.

What does ownership of the runtime mean? Is it something that holds ownership of all compiled eBPF programs? (Ownership of maps remains in the shm and isn't held by anything, I think.)

@Officeyutong
Contributor

Also, can we add new attach events at load time, so that they are not statically compiled in?

For example, we could have three kinds of agent.so:

  1. One is compiled with both uprobe and syscall tracepoints enabled.
  2. The second is compiled with only uprobe support.
  3. The third is used statically inside the application, like the nginx module or XDP in DPDK.

We can let the agent or the user configure which syscall functionality should be mocked in the syscall-server.so. For example, some perf event syscalls and bpf link types would be mocked or answered in the syscall-server.so, while others would not.

The config can be stored in the shared memory, so that new attach targets can register themselves in it.

This sounds good

@yunwei37
Member Author

What does ownership of the runtime mean? Is it something that holds ownership of all compiled eBPF programs? (Ownership of maps remains in the shm and isn't held by anything, I think.)

It means holding a unique_ptr in the code, and being responsible for opening and closing the maps and for compiling and loading the progs.

@Officeyutong
Contributor

What does ownership of the runtime mean? Is it something that holds ownership of all compiled eBPF programs? (Ownership of maps remains in the shm and isn't held by anything, I think.)

It means holding a unique_ptr in the code, and being responsible for opening and closing the maps and for compiling and loading the progs.

Maps and programs are held in the shared memory, and may outlive the agent or the syscall server. So maybe their ownership should not be tied to the bpftime runtime?

@Officeyutong
Contributor

Officeyutong commented Jan 31, 2024

What does ownership of the runtime mean? Is it something that holds ownership of all compiled eBPF programs? (Ownership of maps remains in the shm and isn't held by anything, I think.)

It means holding a unique_ptr in the code, and being responsible for opening and closing the maps and for compiling and loading the progs.

Maps and programs are held in the shared memory, and may outlive the agent or the syscall server. So maybe their ownership should not be tied to the bpftime runtime?

"Ownership" here means "things in the heap memory of a given process that are required to operate on the shared memory", for example the bpftime_shm class itself.

@yunwei37 yunwei37 added the help wanted Extra attention is needed label Feb 1, 2024
@Officeyutong
Contributor

Targets

  • uprobe_attach_impl: Implementation of uprobe attaches. Attaches to a specified address and waits for the callback. At /attach/uprobe/uprobe_attach_impl
  • text_segment_transformer: Same as the current target of that name, but moved to /attach/syscall/text_segment_transformer. Provides a function to register a callback for syscall invocations.
  • syscall_attach_impl: Implementation of syscall attach. Provides a function to receive raw syscall events, and allows registering callbacks for a specified syscall id or for all syscalls. At /attach/syscall/syscall_attach_impl
  • base_attach_impl: Provides a set of headers that abstract over attach implementations. The pure virtual base class should provide functions to iterate over attaches and to detach. Attach API prototypes may not be provided in the base class, since attach arguments vary among attach types. Should be at /attach/base_attach_impl
  • base_event_source: Provides a set of headers and base classes that define an event source. Events are things like "load and compile an eBPF program", "attach a uretprobe to function XYZ which will call eBPF program ABC", "create an eBPF hash map", or "detach uretprobe with id XXX". With this we won't be limited to collecting eBPF operations through libbpf; we could even use an nginx module as an event source. Should be at /attach/event_source/base_event_source
  • syscall_server_event_source: Provides events collected from mocked libbpf functions. This should be adapted from the current syscall server. Should be at /attach/event_source/syscall_server_event_source
  • attach_manager: Holds all available attach implementations at runtime. Receives events from attach_event_source and drives the attach impls to do the underlying work. The attach manager should hold ownership of the compiled eBPF programs and their corresponding virtual machines, as well as the attach implementations. Should be in /attach/attach_manager
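
The relationship between these targets could be sketched roughly as follows. All class and method names are hypothetical illustrations of the list above, the uprobe implementation is a toy that only records addresses (no real hooking), and the event set is reduced to "attach uprobe" and "detach" for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical base: only iterate and detach, since attach arguments vary by type.
class base_attach_impl {
public:
    virtual ~base_attach_impl() = default;
    virtual bool detach_by_id(int id) = 0;
    virtual void iterate_attaches(const std::function<void(int id)> &visit) const = 0;
};

// Toy uprobe implementation with its own, type-specific attach function.
class uprobe_attach_impl : public base_attach_impl {
public:
    int attach_at(std::uint64_t func_addr) {
        int id = next_id_++;
        attaches_[id] = func_addr;
        return id;
    }
    bool detach_by_id(int id) override { return attaches_.erase(id) > 0; }
    void iterate_attaches(const std::function<void(int)> &visit) const override {
        for (const auto &kv : attaches_) visit(kv.first);
    }

private:
    int next_id_ = 0;
    std::map<int, std::uint64_t> attaches_;
};

// Hypothetical event produced by an event source.
struct attach_event {
    enum class kind { attach_uprobe, detach } what;
    std::uint64_t func_addr; // used by attach_uprobe
    int attach_id;           // used by detach
};

class base_event_source {
public:
    virtual ~base_event_source() = default;
    virtual std::vector<attach_event> poll() = 0;
};

// Event source that replays a fixed list, standing in for e.g. an nginx module.
class scripted_event_source : public base_event_source {
public:
    explicit scripted_event_source(std::vector<attach_event> evs) : evs_(std::move(evs)) {}
    std::vector<attach_event> poll() override { return std::exchange(evs_, {}); }

private:
    std::vector<attach_event> evs_;
};

// Manager: receives events and drives the attach implementation.
class attach_manager {
public:
    void process(base_event_source &src) {
        for (const auto &ev : src.poll()) {
            if (ev.what == attach_event::kind::attach_uprobe)
                uprobes_.attach_at(ev.func_addr);
            else
                uprobes_.detach_by_id(ev.attach_id);
        }
    }
    std::size_t active() const {
        std::size_t n = 0;
        uprobes_.iterate_attaches([&n](int) { ++n; });
        return n;
    }

private:
    uprobe_attach_impl uprobes_;
};
```

Because the manager only sees base_event_source, swapping the scripted source for one backed by the syscall server would not require changes to the manager in this sketch.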

@yunwei37
Member Author

yunwei37 commented Feb 7, 2024

Would it be better to first come up with a small example of how to use the new API to implement a new eBPF attach type (e.g. nginx module eBPF)?

@Officeyutong
Contributor

Would it be better to first come up with a small example of how to use the new API to implement a new eBPF attach type (e.g. nginx module eBPF)?

OK, I'll take it

@Officeyutong
Contributor

Refer to https://github.com/eunomia-bpf/bpftime-new-api-poc for a detailed POC.

Other notes:

  • Avoid invasive nginx-handler-related APIs in the bpftime codebase; such things could be implemented by exposing more low-level APIs (such as manually managed extra data on an attach target entry).

@Officeyutong
Contributor

Instantiation of handler

  • prog_handler: The result of instantiating a prog handler is an instance of bpftime_prog, which includes a ready-to-execute eBPF virtual machine.
  • perf event handler: The result of instantiating a perf event handler is an instance of attach_private_data. This is an abstract base class used to pass attach data to *_attach_impl, describing the attach arguments (such as the function offset of a uprobe, the syscall id of a syscall trace, or a custom string for user-defined attach impls).
  • link handler: Usually, instantiating a link handler first leads to the instantiation of its related prog handler. After that, the process is determined by the user-registered instantiation handlers for the different link types. For the built-in perf event link type, it first instantiates the related perf event handler (to get the attach private data), then calls the *_attach_impl to register a native attach entry, and records the mapping from handler id to native attach id. For the built-in uprobe_multi link type, no perf event is required: when instantiating a link of uprobe_multi type, it directly creates attach_private_data instances based on the arguments stored in the link (note that creating a uprobe_multi link requires the uprobe/uretprobe attach impl to be registered). When instantiating a user-defined link type, it directly calls the user-registered instantiation function.
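
The link instantiation flow described above might be sketched like this. It is an illustration under assumptions, not the POC's code: link_handler, link_instantiator, toy_attach_impl and their members are hypothetical, link types are plain integers, and the prog handler instantiation step is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Abstract carrier for attach arguments, as described in the comment above.
struct attach_private_data {
    virtual ~attach_private_data() = default;
    virtual std::string describe() const = 0;
};

// Hypothetical private data for a uprobe: just a function offset.
struct uprobe_private_data : attach_private_data {
    std::uint64_t offset;
    explicit uprobe_private_data(std::uint64_t off) : offset(off) {}
    std::string describe() const override { return "uprobe@" + std::to_string(offset); }
};

// Toy attach impl: records the last request and hands out native attach ids.
struct toy_attach_impl {
    int next_id = 100;
    std::string last;
    int create_attach(const attach_private_data &data) {
        last = data.describe();
        return next_id++;
    }
};

// Hypothetical link handler as stored in shared memory.
struct link_handler {
    int prog_id;
    int link_type;
    std::uint64_t arg;
};

class link_instantiator {
public:
    // A user-registered function that instantiates one link type and returns
    // the native attach id.
    using instantiate_fn = std::function<int(const link_handler &)>;

    void register_link_type(int link_type, instantiate_fn fn) {
        fns_[link_type] = std::move(fn);
    }

    // Instantiates the link and records handler id -> native attach id.
    // Returns false when no function is registered for the link type.
    bool instantiate(int handler_id, const link_handler &link) {
        auto it = fns_.find(link.link_type);
        if (it == fns_.end()) return false;
        native_ids_[handler_id] = it->second(link);
        return true;
    }

    int native_id_of(int handler_id) const { return native_ids_.at(handler_id); }

private:
    std::map<int, instantiate_fn> fns_;
    std::map<int, int> native_ids_;
};
```

In this sketch a user-defined link type is supported by registering one more function; the instantiator itself never needs to know what the private data contains.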
