Support for memory-backed file descriptor #24

Open
wants to merge 1 commit into
base: master
from

Conversation

@dalehamel
commented Apr 15, 2019

This abstracts the creation of the temporary ELF file to allow for two possible approaches:

  • The current (default) approach of creating a temporary file in /tmp and dlopen'ing it
  • A new (optional) approach that uses a memory-backed file descriptor when LIBSTAPSDT_MEMORY_BACKED_FD is defined.
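
A minimal sketch of the memory-backed approach, assuming the descriptor is exposed through its /proc/self/fd entry (the /proc/12646/fd/7 path in the tplist output further down suggests something similar, but the patch itself is not reproduced here). The helper names raw_memfd_create and image_to_memfd are hypothetical and are not part of libstapsdt.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper over the raw system call; glibc ships its own only from 2.27. */
static int raw_memfd_create(const char *name, unsigned int flags) {
    return (int)syscall(__NR_memfd_create, name, flags);
}

/*
 * Copy an in-memory ELF image into an anonymous memory-backed file and
 * write a path for it into `path`. Returns the descriptor, or -1 on error.
 * A caller would then pass `path` to dlopen(path, RTLD_LAZY).
 */
int image_to_memfd(const char *name, const void *image, size_t len,
                   char *path, size_t path_len) {
    int fd = raw_memfd_create(name, 0);
    if (fd < 0)
        return -1;

    const char *p = image;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Keep the descriptor open while tools need to reach the image by this path. */
    snprintf(path, path_len, "/proc/self/fd/%d", fd);
    return fd;
}
```

Nothing is written under /tmp here, and the kernel frees the backing memory once the descriptor is closed and any mappings are gone.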

The new approach is beneficial for two reasons:

  • There is no need to clean up the file: it is removed automatically when the process dies (the primary reason for this approach).
  • No calls are made to the underlying filesystem, everything is done in memory. While tmpfs is probably already memory-based, this could be advantageous in some circumstances.

libusdt does something similar: it valloc's directly into the process's address space, so there is no temporary file to clean up.

Overall, I find that doing the whole thing in memory and having it cleaned up automatically with the process's lifecycle is a bit more elegant.

The drawbacks to this approach are:

  • It was not supported by bcc when this PR was opened. Support has since been added to BCC by iovisor/bcc#2314.
  • If a large number of providers are allocated, the process will have a large number of file descriptors open. This is probably not a problem in practice.
  • memfd_create was not added until Linux 3.17 (not a problem in practice, as BCC needs 4.1+ to do anything meaningful with eBPF), and the glibc wrapper did not arrive until glibc 2.27. The system call number may not be defined on some systems.

I think the drawbacks are mitigated by hiding the new approach behind a preprocessor macro check, so the default behavior remains unchanged. In the future, hopefully once iovisor/bcc#2314 is in a major bcc release, we could remove the macro and default to a memory-based shared object, falling back to a temporary file if we detect that memfd_create is not supported.
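
The fallback described above could look roughly like the sketch below. It assumes the build headers define __NR_memfd_create and treats ENOSYS from the kernel as the signal that memfd_create is unsupported. The function name open_elf_backing and the /tmp name pattern are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open a backing file for the generated ELF image and write a path that
 * can later be opened (or dlopen'ed) into `path`.
 * Tries a memory-backed descriptor first and falls back to a file in /tmp
 * when the kernel reports that memfd_create does not exist.
 * Returns the descriptor, or -1 with errno set.
 */
int open_elf_backing(const char *name, char *path, size_t path_len) {
    int fd = (int)syscall(__NR_memfd_create, name, 0u);
    if (fd >= 0) {
        snprintf(path, path_len, "/proc/self/fd/%d", fd);
        return fd;
    }
    if (errno != ENOSYS)
        return -1; /* a real failure (e.g. EMFILE), not a missing syscall */

    /* Kernel older than 3.17: use a regular temporary file instead. */
    char tmpl[] = "/tmp/libstapsdt-XXXXXX"; /* hypothetical name pattern */
    fd = mkstemp(tmpl);
    if (fd >= 0)
        snprintf(path, path_len, "%s", tmpl);
    return fd;
}
```

Only the fallback path leaves a file behind that the caller has to unlink later.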

After attaching a probe (global:helloworld) this way, the process's memory map looks like this:

/proc/${PID}/maps:

...
7f9e79f0c000-7f9e79f0d000 r-xp 00000000 00:05 11706507                   /memfd:libstapsdt:global (deleted)
7f9e79f0d000-7f9e7a10c000 ---p 00001000 00:05 11706507                   /memfd:libstapsdt:global (deleted)
7f9e7a10c000-7f9e7a10d000 rw-p 00000000 00:05 11706507                   /memfd:libstapsdt:global (deleted)
...

And if we check out the file descriptors for the process:

lrwx------ 1 dale.hamel dale.hamel 64 Apr 14 20:57 7 -> '/memfd:libstapsdt:global (deleted)'

So if we check for ELF notes on that fd:

$ readelf --notes /proc/${PID}/fd/7

Displaying notes found in: .note.stapsdt
  Owner                 Data size       Description
  stapsdt              0x00000039       NT_STAPSDT (SystemTap probe descriptors)
    Provider: global
    Name: helloword
    Location: 0x0000000000000260, Base: 0x0000000000000318, Semaphore: 0x0000000000000000
    Arguments: 8@%rdi -8@%rsi

And if I run tplist -p:

$ tplist -p ${PID} | grep global
/proc/12646/fd/7 global:hello_nsec

Or use my latest branch of bpftrace with wildcard USDT support:

$ bpftrace -l 'usdt:*:global:*' -p ${PID}
usdt:/proc/12646/fd/7:global:helloworld

I can attach to the probe by that path, or by PID (using my bcc branch).

If I disable the provider, the file descriptor is closed and the ELF file is removed from the memory map.


// Note that Linux must be 3.17 or greater to support this
static inline int memfd_create(const char *name, unsigned int flags) {
    return syscall(__NR_memfd_create, name, flags);
}

@dalehamel

dalehamel Apr 15, 2019

Author

We should probably check that __NR_memfd_create is defined, or else bail out gracefully.
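
One way to do that check, as a sketch only: guard the syscall at compile time and report ENOSYS when the headers lack the number. The name memfd_create_compat is hypothetical. It is also chosen to avoid a possible clash with the memfd_create declaration that glibc 2.27 and later provide in <sys/mman.h>.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Note that Linux must be 3.17 or greater to support this. */
static inline int memfd_create_compat(const char *name, unsigned int flags) {
#ifdef __NR_memfd_create
    return (int)syscall(__NR_memfd_create, name, flags);
#else
    /* Headers predate memfd_create: report it the same way the kernel would. */
    (void)name;
    (void)flags;
    errno = ENOSYS;
    return -1;
#endif
}
```

Callers can then treat a return of -1 with errno set to ENOSYS as the cue to use the temporary-file path, whether the gap is in the headers or in the running kernel.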

@dalehamel dalehamel force-pushed the dalehamel:memfd-dlopen branch from 16de001 to effbd84 Apr 15, 2019

@Drieger Drieger requested a review from mmarchini Apr 15, 2019

@dalehamel dalehamel marked this pull request as ready for review Apr 16, 2019

@dalehamel

Author

commented Apr 16, 2019

I've marked this as ready for review, as the dependent patch has landed in BCC.

@mmarchini

Collaborator

commented Apr 22, 2019

Awesome! I was not aware of memfd, and this seems like a much better approach compared to creating a file in the filesystem. libusdt doesn't have to write to a file first because Solaris/DTrace have an interface to register User Tracepoints.

IMO memfd should be the default, and we should fall back to tmpfs only if necessary. Also, it might be useful to have a way to choose which filesystem to use during runtime?

@dalehamel

Author

commented Apr 22, 2019

@mmarchini great, I'll remove the macro then and make memfd the default, falling back only if we can't find the necessary call.

Also, it might be useful to have a way to choose which filesystem to use during runtime?

To do this, I'll probably need to add a new method signature for providerLoad that forces the use of /tmp files. If the original interface is called, it will go through the fallback path.
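
A rough sketch of what such a pair of entry points might look like. Every name here (backing_t, provider_t, provider_load_with, provider_load) is a hypothetical stand-in, not the libstapsdt API, and the automatic path falls back on any memfd failure for brevity.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* All names below are hypothetical stand-ins, not the libstapsdt API. */
typedef enum { BACKING_AUTO, BACKING_MEMFD, BACKING_TMPFILE } backing_t;

typedef struct {
    const char *name; /* provider name */
    int fd;           /* descriptor holding the generated ELF image */
    backing_t used;   /* backing store that was actually chosen */
} provider_t;

static int open_tmpfile(void) {
    char tmpl[] = "/tmp/provider-XXXXXX";
    int fd = mkstemp(tmpl);
    if (fd >= 0)
        unlink(tmpl); /* sketch only: a real loader needs the path for dlopen */
    return fd;
}

/* New entry point: the caller can force the /tmp path explicitly. */
int provider_load_with(provider_t *p, backing_t want) {
    p->fd = -1;
    if (want != BACKING_TMPFILE) {
        p->fd = (int)syscall(__NR_memfd_create, p->name, 0u);
        p->used = BACKING_MEMFD;
    }
    if (p->fd < 0 && want != BACKING_MEMFD) {
        p->fd = open_tmpfile();
        p->used = BACKING_TMPFILE;
    }
    return p->fd < 0 ? -1 : 0;
}

/* Original entry point: unchanged signature, automatic fallback. */
int provider_load(provider_t *p) {
    return provider_load_with(p, BACKING_AUTO);
}
```

Existing callers keep calling the one-argument function and get memfd where available, while callers that need a real file on disk opt in through the new function.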

I'll modify this PR accordingly :)
