Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding to support to attach uprobe to offset #803

Merged
merged 12 commits into from Nov 1, 2019

Conversation

olsajiri
Copy link
Contributor

@olsajiri olsajiri commented Jul 1, 2019

It's now possible to attach uprobe to specific offset
either directly to offset within the binary:

# bpftrace -e 'uprobe:./ex:1024 { printf("hit\n"); }'

or offset within the symbol:

# bpftrace -e 'uprobe:./ex:main+0xc { printf("hit\n"); }'

I did not implemented any checking of the offset validity.
If the offset is out of the binary scope, kernel will fail
to add the uprobe and bpftrace will fail with:

# bpftrace -e 'uprobe:./ex:0x100000 { printf("KOKOT\n"); }'
Attaching 1 probe...
perf_event_open(/sys/kernel/debug/tracing/events/uprobes/p___ex_100000_1_bcc_9172/id): Invalid argument
Error attaching probe: uprobe:./ex:1048576

If the uprobe is not added on the instruction boundary, it succeeds,
because there's no such check on the registration path, but it will
most likely crash the probed binary.. users should be aware of this.

@tyroguru
Copy link
Contributor

tyroguru commented Jul 1, 2019

@olsajiri - this is a much needed addition so thanks massively from all the uprobe junkies out here! However, I don't think we can give a user such an onerous responsibility of ensuring that they have provided the correct offset - bpftrace must do the heavy lifting of ensuring that they have provided a valid instruction offset. The only way I know of doing this is to use a disassembler library to verify the instruction sequence and ensure the provided offset is at a valid instruction boundary. Maybe there are other ways though - doesn't perf / bcc allow you to specify instruction offsets with uprobe's? If so, how do they ensure that they only patch valid offsets?

@brendangregg
Copy link
Contributor

Maybe it needs to check using a disassembly library (or even a pipe to gdb), and if such a method isn't available, it rejects the uprobe with an explanation unless --unsafe is used.

@olsajiri
Copy link
Contributor Author

olsajiri commented Jul 2, 2019

@tyroguru - hum, I'll double check but I don't think perf probe has a check for this, AFAIK it allows you to add uprobe anywhere.. let me ask the 'perf probe' guys, perhaps they already had some discussion about this

@olsajiri
Copy link
Contributor Author

olsajiri commented Jul 2, 2019

more info.. perf checks for the symbols size, but does allow to
add uprobe in the middle of the instruction:

0000000000401126 <main>:
  401126:       55                      push   %rbp
  401127:       48 89 e5                mov    %rsp,%rbp
  40112a:       bf 10 20 40 00          mov    $0x402010,%edi
  40112f:       e8 fc fe ff ff          callq  401030 <puts@plt>
  401134:       b8 00 00 00 00          mov    $0x0,%eax
  401139:       5d                      pop    %rbp
  40113a:       c3                      retq   
  40113b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)


# perf probe -x ./ex main+200
Offset 200 is bigger than the size of main

# perf probe -x ./ex main+2
Added new event:
  probe_ex:main        (on main+2 in /home/jolsa/kernel/linux-perf/tools/perf/ex)

You can now use it in all perf tools, such as:

        perf record -e probe_ex:main -aR sleep 1

@tyroguru
Copy link
Contributor

tyroguru commented Jul 2, 2019

Thanks for the test case -that's really helpful. The perf interface into uprobes is.... interesting.

Yes, it's as I thought and it's pretty bad. Being able to overwrite any arbitrary byte in the text can not only lead to abnormal termination but, more importantly, data corruption. A trivial and tongue-in-cheek example:

# cat t.c
#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Stock price is %d USD\n", 3);
}

00000000004004f0 <main>:
  4004f0:       55                      push   %rbp
  4004f1:       48 89 e5                mov    %rsp,%rbp
  4004f4:       48 83 ec 20             sub    $0x20,%rsp
  4004f8:       89 7d fc                mov    %edi,-0x4(%rbp)
  4004fb:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
  4004ff:       48 bf c0 05 40 00 00    movabs $0x4005c0,%rdi
  400506:       00 00 00
  400509:       be 03 00 00 00          mov    $0x3,%esi
  40050e:       b0 00                   mov    $0x0,%al
  400510:       e8 cb fe ff ff          callq  4003e0 <printf@plt>

We put our probe on the second byte of the mov instruction which loads up the value 3...

# perf probe -x ./t main+26

Without tracing this gives us our regular market value for the stock:

# ./t
Stock price is 3 USD

We enable this probe in another window and I'm a rich man:

#  ./t
Stock price is 204 USD

Hopefully this shows in a lightly humoured way why this problem is actually an extremely serious one.

@olsajiri
Copy link
Contributor Author

olsajiri commented Jul 2, 2019

still I'm not totally convinced that forcing the instruction boundary is the right thing

how about assembly code that does some jump tricks and ends up executing
instructions not aligned to the nearest symbol.. unlikely, but doable.. bpftrace
would be useless then

@ajor
Copy link
Member

ajor commented Jul 2, 2019

We could maybe add an option to allow users to add uprobes in the middle of instructions if that's something people want to do, but bpftrace should be safe to run in production by default - don't want a one-character typo corrupting data or taking down a server.

@brendangregg
Copy link
Contributor

We've piped to shell commands before -- how we originally did user symbol resolution until we switched to the bcc interface. Could pipe to gdb to check the offset. If gdb isn't available, you can only use the uprobe with --unsafe.

@olsajiri
Copy link
Contributor Author

olsajiri commented Jul 3, 2019

ok, I'll try to add something like this

@ajor
Copy link
Member

ajor commented Jul 3, 2019 via email

@tyroguru
Copy link
Contributor

tyroguru commented Jul 3, 2019

Don't look into using gdb for this use case. Introducing a disassembler library would be by far the easiest and best solution to this as you haven't got to write one yourself - by the look of it there are loads of them out there and we just need to locate one that looks to be well maintained and active. I'd imagine that most of the work will be finding one and integrating it into the source base as using it should be extremely simple.

@javierhonduco
Copy link
Contributor

We could maybe use LLVM's MC, the post is a bit old, but seems like a good introduction and we are already using LLVM, so maybe there's less friction.

With respect to the safety measures, I personally think we should never allow for binary corruption, even if we gate it under the --unsafe flag, this could be a nightmare to debug, especially if the process does not crash immediately with SIGILL, for example

@jasonkeene
Copy link
Contributor

First, thanks so much for working on this! I'm linking the issue I filed for this just to connect them: #273

I would prefer to have bpftrace link against a disassembler library vs calling out to a tool. I also agree, we should prevent users from setting uprobes on non-instruction boundaries, at least by default.

@olsajiri
Copy link
Contributor Author

olsajiri commented Jul 15, 2019

I added the support to check on the uprobe offset and make sure
it's on instruction boundary, unless there's --unsafe option specified.

I used the binutils bfd/opcodes libraries for that, but the code allows
to add another disassembler implementation.

0000000000401136 <main>:
  401136:       55                      push   %rbp
  401137:       48 89 e5                mov    %rsp,%rbp
  40113a:       48 83 ec 10             sub    $0x10,%rsp
  40113e:       bf 02 00 00 00          mov    $0x2,%edi
  401143:       e8 f8 fe ff ff          callq  401040 <malloc@plt>
  401148:       48 89 45 f8             mov    %rax,-0x8(%rbp)

attaching with symbol:

# bpftrace  -e 'uprobe:./ex:main { printf("hit\n"); }'
Attaching 1 probe...
^C

# bpftrace  -e 'uprobe:./ex:main+1 { printf("hit\n"); }'
Attaching 1 probe...
^C

# bpftrace  -e 'uprobe:./ex:main+2 { printf("hit\n"); }'
Attaching 1 probe...
Could not add uprobe into middle of instruction: ./ex:main+2

# bpftrace --unsafe -e 'uprobe:./ex:main+2 { printf("hit\n"); }'
Attaching 1 probe...
Unsafe uprobe in the middle of the instruction: ./ex:main+2
^C

attaching with address:

# bpftrace -e 'uprobe:./ex:0x401136 { printf("hit\n"); }'
Attaching 1 probe...
^C

# bpftrace -e 'uprobe:./ex:0x401137 { printf("hit\n"); }'
Attaching 1 probe...
^C

# bpftrace -e 'uprobe:./ex:0x401138 { printf("hit\n"); }'
Attaching 1 probe...
Could not add uprobe into middle of instruction: ./ex:main+2

# bpftrace --unsafe -e 'uprobe:./ex:0x401138 { printf("hit\n"); }'
Attaching 1 probe...
Unsafe uprobe in the middle of the instruction: ./ex:main+2
^C

@olsajiri
Copy link
Contributor Author

rebased on latest master and pushed

@javierhonduco
Copy link
Contributor

Sorry for being stubborn on the safety concern, but I still think we should not allow misaligned instruction rewriting even if --unsafe is passed 😥

I don't really understand some case in which this could help at debugging a given problem, could you clarify this? I might be missing something! 🙍‍♂

In case this behaviour is needed for some reason, I'd disable it by default at build time and if some advanced user really wants to use this needs to enable it there and pass --unsafe

What do you think?

@olsajiri
Copy link
Contributor Author

I dont have any specific use case, but I can imagine some code (like glibc)
could have assembler codes which does not strictly follows symbols

I think it's safe enough to have it under --unsafe option ;-) but
I dont mind adding it behind compile option, I'll push new version

@olsajiri olsajiri force-pushed the uprobe_offset branch 2 times, most recently from 52ba9f7 to 3438544 Compare August 6, 2019 08:43
@olsajiri
Copy link
Contributor Author

olsajiri commented Aug 6, 2019

Added cmake CHECK_UPROBE_ALIGNMENT variable.

Now it's needed to specify -DCHECK_UPROBE_ALIGNMENT=TRUE
on the cmake command line.

@olsajiri
Copy link
Contributor Author

olsajiri commented Aug 6, 2019

Added cmake CHECK_UPROBE_ALIGNMENT variable.

Now it's needed to specify -DCHECK_UPROBE_ALIGNMENT=TRUE
on the cmake command line.

I actually made it more specific and changed it to ALLOW_UNSAFE_UPROBE,
which only enables the --unsafe option for unaligned uprobe. I'll push the
changes shortly.

@jasonkeene
Copy link
Contributor

With respect to use cases, I could imagine a use case where the disassembler does not properly interpret the byte sequences as the correct instructions. This happens with undocumented (hidden) instructions, under-documented instructions, and disassembler bugs. In these situations a user might want to be able to pass an --unsafe flag to set the uprobe at a specific byte offset.

@olsajiri
Copy link
Contributor Author

With respect to use cases, I could imagine a use case where the disassembler does not properly interpret the byte sequences as the correct instructions. This happens with undocumented (hidden) instructions, under-documented instructions, and disassembler bugs. In these situations a user might want to be able to pass an --unsafe flag to set the uprobe at a specific byte offset.

@jasonkeene are you be ok with ALLOW_UNSAFE_UPROBE compile variable,
or you want it there unconditionally?

@jasonkeene
Copy link
Contributor

I could go either way.

I imagine that not all users of bpftrace will be comfortable rebuilding. Also, I ship bpftrace as part of a container image and rebuilding/redeploying all of that would be additional friction.

I guess the question is what does a compile time guard provide? It adds friction to the process. Is that a good thing? A bad offset can corrupt instructions. It seems important to not do that.

Ultimately this is a UX question.

Copy link
Member

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok LGTM now. Could you fix the merge conflict (sorry)? Then I'll merge

I'm not able to build bpftrace with bcc being installed
under directory that is not searched by linker by dafault.

Adding -L<dirname ${LIBBCC_LIBDIR}> option, which allows me
to build with:
  $ cmake \
    ...
    -DLIBBCC_INCLUDE_DIRS:PATH=/opt/bcc/include/bcc \
    -DLIBBCC_LIBRARIES:PATH=/opt/bcc/lib64/libbcc.so \
    ...
To carry the virtual address or symbol's function offset to define uprobe.

Changing accordingly the generated name for uprobes:

 - when address 1000 is defined:
   "uprobe:/bin/sh:1000"

 - when function offset 10 is defined for 'somefunc' symbol:
   "uprobe:/bin/sh:somefunc+10"
To properly resolve offset for uprobe, which can be specified
now with virtual address or symbol + offset.

Keeping the original offset() code in resolve_offset function,
which is still used for usdt probes.

Adding 'AttachedProbe::offset_' variable to carry the resolved
offset value.

The reason for resolve_offset_uprobe function is the
extra checking/resolving on for uprobes we need to have
plus the offset check coming in following patches.
We need both libbfd and libopcodes to use the binutils
disassembler later on. Adding the cmake detection of
these, mostly copied/stolen from already present cmake
Find* modules (FindLibBcc/FindLibElf)
With single 'is_aligned' interface that will be
used to validate proper uprobe offset.

The implementation allows to add different disassembler
implementations. I'll add BFD one in following changes.
As a IDisasm interface implementation by using the
binutils disassembler. It will be compiled in, if
the libbfd and libopcodes libraries are detected
during the build.
And fail the execution if unaligned access point is detected.
Add AttachPoint constructor to specify uprobe with symbol:offset.
It's possible to specify uprobe with:

  AttachPoint("uprobe", "/bin/sh", "foo", 10);

which creates uprobe attach point to /bin/sh at symbol "foo" and
its relative offset 10.

Allow existing AttachPoint constructor to specify uprobe
with offset. It's possible to specify uprobe with:

  AttachPoint("uprobe", "/bin/sh", 1024);

which creates uprobe attach point to /bin/sh at offset 1024.
Adding 2 tests into ast object and 2 under bpftrace.
Allowing the unaligned uprobe accespoint to
be added if --unsafe option is specified.

This only works if you build bpftrace with
-DALLOW_UNSAFE_UPROBE:BOOL=TRUE
Updating reference guide on detail how to specify
uprobe with virtual address and function+offset.
Adding 3 runtimes test for generic uprobe offset stuff.

To run more detailed tests we'd need more support from test
framework like to run add arch specific tests to run instruction
alignment tests.
@olsajiri
Copy link
Contributor Author

olsajiri commented Nov 1, 2019

pushed ;-)

@danobi danobi merged commit d9c2bab into bpftrace:master Nov 1, 2019
@mmarchini
Copy link
Contributor

Do we need to update our install section with any extra instructions?

@danobi
Copy link
Member

danobi commented Nov 4, 2019

Yeah I think we might need binutils. @olsajiri , mind putting up a PR?

mmisono added a commit to mmisono/bpftrace that referenced this pull request Nov 8, 2019
Example: `bpftrace -e 'kprobe:do_sys_open+9 { ... }'`.
This is based on the support of attaching uprobe to offset (bpftrace#803).  The
offset is checked whether it is an instruction boundary using vmlinux.
The environment variable `BPFTRACE_VMLINUX` can be used to specify
vmlinux file.  ALLOW_UNSAFE_UPROBE option is renamed to
ALLOW_UNSASFE_PROBE so that it supports both uprobe and kprobe.
@danobi
Copy link
Member

danobi commented Dec 23, 2019

I added the docs in c57204c

mmisono added a commit to mmisono/bpftrace that referenced this pull request Jan 12, 2020
Example: `bpftrace -e 'kprobe:do_sys_open+9 { ... }'`.

This is based on the support of attaching uprobe to offset (bpftrace#803).  The
offset is checked whether it is an instruction boundary using vmlinux
(with debug symbols).  The environment variable `BPFTRACE_VMLINUX` can
be used to specify vmlinux file.  ALLOW_UNSAFE_UPROBE option is renamed
to ALLOW_UNSASFE_PROBE so that it supports both uprobe and kprobe.

If bpftrace is compoiled with ALLOW_UNSASFE_PROBE option and vmlinux
file is not found, and --unsafe option is given, the userspace check is
skipped.  Note that linux kernel checks the address alignment, but it
does not check whether it is within the function.

Closes bpftrace#42
mmisono added a commit to mmisono/bpftrace that referenced this pull request Jan 18, 2020
Example: `bpftrace -e 'kprobe:do_sys_open+9 { ... }'`.

This is based on the support of attaching uprobe to offset (bpftrace#803).  The
offset is checked whether it is an instruction boundary using vmlinux
(with debug symbols).  The environment variable `BPFTRACE_VMLINUX` can
be used to specify vmlinux file.  ALLOW_UNSAFE_UPROBE option is renamed
to ALLOW_UNSASFE_PROBE so that it supports both uprobe and kprobe.

If bpftrace is compoiled with ALLOW_UNSASFE_PROBE option and vmlinux
file is not found, and --unsafe option is given, the userspace check is
skipped.  Note that linux kernel checks the address alignment, but it
does not check whether it is within the function.

Closes bpftrace#42
danobi pushed a commit that referenced this pull request Jan 22, 2020
Example: `bpftrace -e 'kprobe:do_sys_open+9 { ... }'`.

This is based on the support of attaching uprobe to offset (#803).  The
offset is checked whether it is an instruction boundary using vmlinux
(with debug symbols).  The environment variable `BPFTRACE_VMLINUX` can
be used to specify vmlinux file.  ALLOW_UNSAFE_UPROBE option is renamed
to ALLOW_UNSASFE_PROBE so that it supports both uprobe and kprobe.

If bpftrace is compoiled with ALLOW_UNSASFE_PROBE option and vmlinux
file is not found, and --unsafe option is given, the userspace check is
skipped.  Note that linux kernel checks the address alignment, but it
does not check whether it is within the function.

Closes #42
@vincentbernat
Copy link
Contributor

Just for information, in Debian, we are not allowed to link to libbfd-dev because the API/ABI is considered unstable. Therefore, for the official Debian package, I have to build without BFD support. From my understanding, this is a minor inconvenience. Do you agree?

@danobi
Copy link
Member

danobi commented Jan 20, 2021

Just for information, in Debian, we are not allowed to link to libbfd-dev because the API/ABI is considered unstable. Therefore, for the official Debian package, I have to build without BFD support. From my understanding, this is a minor inconvenience. Do you agree?

Yeah, should be fine. Users will just get an error saying "not compiled with offset support". And if they use --unsafe then bpftrace will skip error checks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants