Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ubpf needs to support maps #45

Closed
Alan-Jowett opened this issue Dec 3, 2020 · 8 comments
Closed

ubpf needs to support maps #45

Alan-Jowett opened this issue Dec 3, 2020 · 8 comments

Comments

@Alan-Jowett
Copy link
Collaborator

Adding support for maps to ubpf.

As near as I can tell from examining dumps of ELF eBPF programs it looks like:

  1. All maps are in a section labelled ".maps"
  2. Maps layout is sequentially.
  3. Symbol table entries contain segment index, offset and size for maps.

Proposed behavior is then:

  1. Look for a section name ".maps" that should be marked as "PROGBITS" and have the "write" and "allocate" flags set.
  2. When processing section name "XYZ", find the corresponding relocation section named ".relXYZ" and process it's relocations.
  3. For each relocation, check if it refers to an entry in the .maps section and apply the relocation to point to that location.

Maps appear to have relocation type R_BPF_64_64 / 1

@jpsamaroo
Copy link

I think this would be great to have! Having these maps also implies that we'll want to have helpers for each map, so I guess we'll need to provide a "kernel-compatible" ext_func set.

@Alan-Jowett
Copy link
Collaborator Author

I think it might make more sense to provide a function like:

/*
 * Register a map.
 *
 *  Set the address of the map in the lddw instruction based on the name resolved from the 
 *  symbol section of the ELF file.
 *
 * 'name' should be a string with a lifetime longer than the VM.
 *
 * Returns 0 on success, -1 on error.
 */
int ubpf_register_map(struct ubpf_vm *vm, const char *name, void *map);

This would do the following:

  1. During loading of the ELF, process the relocations.
  2. If the relocation points to the .maps section.
  3. Look up symbol for the relocation.
  4. Look up symbol -> map in internal table.
  5. Set map address in lddw instruction.

x64 instruction then is just a load of the immediate value. Any helpers that use maps then get passed the address of the map (this is similar to what Linux jitter does, AFAIK)

@jpsamaroo
Copy link

What about hash maps (and other non-contiguous data structures)? We'd still need special helpers to process them. Also, contiguous array access is usually done first via helper (to get the address of the element passed), so to support programs generated by tools focused on the Linux kernel's implementation, we'd need these helpers available.

@Alan-Jowett
Copy link
Collaborator Author

See BPF syscall, maps, verifier, samples, llvm

I am proposing we adopt something similar to the process that Alexi describes in the email thread, but with ubpf_load_elf replacing the instruction with a lddw with the address directly.

In the design listed, bpf_lookup_elem kernel helper function accepts the address of the map, not a FD. So we need the jitter to emit code with the address.

@Alan-Jowett
Copy link
Collaborator Author

After researching this, eBPF it appears as if the Linux kernel implements this via a BPF_LD_MAP_FD pseudo instruction, where the jitter replaces imm value with the address of the map.

@YutaroHayakawa
Copy link

YutaroHayakawa commented Jan 6, 2021

Just a comment

I think if we support maps in ubpf, it's worth to think about how we implement the map it self, especially about the concurrency. Of course, it depends on the execution environment, but I guess it would be more generic if it support multi threading environment.

In Linux-kernel implementation, there are various form of concurrency for maps. For example, array map basically doesn't perform any locking during lookup/update for performance. On the other hand, hash map takes lock when update/delete.

And the most notable stuff is the eBPF program in kernel is basically running under RCU (e.g. https://elixir.bootlin.com/linux/latest/source/drivers/net/veth.c#L580). This strong assumption makes possible to do the operation like below.

elem = bpf_map_lookup_elem(&map, &key);
...
// in-place update
elem->foo = 0xdeadbeef;

It is possible that another thread deletes the map element between lookup and in-place update, but thanks to the RCU, the memory for the element will never released before this program finish running. So nothing will be broken.

There is a project which implements RCU-like feature in user space (https://liburcu.org). However, this is LGPL so, not compatible with ubpf. Concurrency-kit is a good candidate I think, it's 2-clause-BSD (http://concurrencykit.org). "epoch" feature of it can be used like RCU. For implementation, notice both of them requires to register some context per thread, means maybe we need to enforce users to register it or intercept pthread_create or something...

Actually, I tried to do the same thing for generic-ebpf project (https://github.com/generic-ebpf/generic-ebpf). This was the pitfall I've met.

Good luck :)

@Alan-Jowett
Copy link
Collaborator Author

I think it make sense to decouple the maps part from the jitter part of ubpf.

Taking a look at the docs here eBPF maps it seems like it would make sense to implement this as follows:

  1. Consumer of ubpf provides a map_create callback passing (context, map_definition).
  2. Parse the .maps section in the ELF, invoking the map_create callback for each map, storing symbol->map_fd
  3. During the relocation, when we hit a map relocation, modify byte code setting the lddw map pseudo instruction and setting the fd in the instruction.

The consumer of the ubpf can then provide their own map implementation as they see fit via the map_create callback and the bpf_map_* helper functions.

@Alan-Jowett
Copy link
Collaborator Author

Going to build this on top of this PR as I need the code to lookup sections by name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants