Report helper function errors #1276

mmisono · 2020-04-17T15:05:11Z

Some helper functions return error codes when failing, but currently,
bpftrace does not check and discards it. Several times I had to use bcc
to check error codes for debugging.

This patch adds an error checking mechanism, and helper function errors
are reported in runtime using '-k' option. For example:

% sudo BPFTRACE_MAP_KEYS_MAX=2 ./src/bpftrace -e \
  'BEGIN { @[0] = 0; @[1] = 1; @[2] = 2;} -k'
Attaching 1 probe...
stdin:1:35-36: WARNING: Failed to map_update_elem: Argument list too long (-7)
BEGIN { @[0] = 0; @[1] = 1; @[2] = 2;}
                                 ~

^C

@[0]: 0
@[1]: 1

This mechanism is implemented as an async action. By default, errors of
bpf_probe_read()-related functions and bpf_map_lookup_elem() are not
checked. This is because when these functions fail, bpftrace returns
zero as a result, and some scripts already check it, for example, by
using predicate / @[tid] /. -kk option forces to check all errors
including these functions.

Note that the error check increases code size. All codegen tests are
performed with the error check disabled.

mmisono · 2020-04-17T15:10:52Z

src/ast/irbuilderbpf.cpp

+  BasicBlock *helper_merge_block = BasicBlock::Create(module_.getContext(),
+                                                      "helper_merge",
+                                                      parent);
+  auto *ret = CreateIntCast(return_value, getInt32Ty(), true);


I found that a return type of most helper functions was Int64Ty but they actually returned int and we needed to use Int32Ty value to check if a return value was negative. I cast the value here to keep code changes of this PR as small as possible.

Can you leave that as a comment in the code? Will help next person who wonders why there's a cast.

mmisono · 2020-04-17T15:15:54Z

I think this solves #464.

I forgot to mention but this uses perf_event_output() for error reporting, so an error of perf_event_output() is not reported. An error message is printed using bpftrace.error() and is not included in json if an output format is json.

mmisono · 2020-04-17T15:35:33Z

This is the comparison of processed insn sizes (not the actual code size. Since the verifier explorers all possible code path, processed insn size becomes larger than the code size). Multiple values mean that the scripts have multiple functions.

So, basically error checking increases the number of processed insn by tens or hundreds, but some scripts (especially which have many map updates) increase code size by thousands (e.g., capable.bt). I also measured the compilation times but I didn't see noticeable differences with and without error checking. I didn't measure runtime speed.

I'm still wondering what is the best default behavior, and I'd appreciate any feedback.

tools	default	--check-all-helper	--disable-helper-check
tcpsynbl.bt	14,14,185,185	14,14,478,478	14,14,139,139
bashreadline.bt	161,14	161,14	161,14
tcpaccept.bt	394,391	394,1166	394,331
tcpconnect.bt	362,14	362,14	362,14
runqlen.bt	14,92	14,251	14,70
writeback.bt	2094,34,516,14	2094,34,546,14	828,14,453,14
setuids.bt	270,154,154,272,200,201,314	270,154,154,272,298,299,513	270,45,45,71,123,124,197
mdflush.bt	149,14	149,14	149,14
loads.bt	14,66	14,215	14,66
biosnoop.bt	159,14,253,198	159,14,352,198	159,14,182,42
cpuwalk.bt	14,62	14,77	14,41
opensnoop.bt	211,36,36,266,266,14	211,36,36,318,318,14	211,15,15,193,193,14
capable.bt	10463,14,365	10515,14,379	3382,14,335
naptime.bt	256,182	256,633	256,154
dcsnoop.bt	193,175,14,42	193,223,14,86	193,121,14,21
biostacks.bt	14,14,278,278,132	14,14,341,341,132	14,14,185,185,27
xfsdist.bt	14,250,250,250,250,14,94,94,94,94	14,305,305,305,305,14,94,94,94,94	14,150,150,150,150,14,30,30,30,30
pidpersec.bt	14,42,41,14	14,57,41,14	14,21,41,14
biolatency.bt	14,14,191,34	14,14,236,34	14,14,134,14
runqlat.bt	14,36,36,363,14	14,36,36,388,14	14,16,16,135,14
tcpretrans.bt	3262,14,875	3262,14,1769	1229,14,875
threadsnoop.bt	117,14	117,14	117,14
statsnoop.bt	181,36,36,36,36,161,161,161,161,14	181,36,36,36,36,195,195,195,195,14	181,15,15,15,15,108,108,108,108,14
tcpdrop.bt	3278,14,975	3279,14,1915	1246,14,894
vfsstat.bt	14,41,14,42,42,42,42,42,42,42,42	14,41,14,57,57,57,57,57,57,57,57	14,41,14,21,21,21,21,21,21,21,21
oomkill.bt	14,14	14,14	14,14
syncsnoop.bt	177,70,72,70,72,72,72	177,70,72,70,72,72,72	177,43,44,43,44,44,44
syscount.bt	14,162,14	14,203,14	14,53,14
swapin.bt	41,14	41,14	41,14
execsnoop.bt	101,218	101,707	101,218
killsnoop.bt	227,93,171	227,93,270	227,29,115
bitesize.bt	14,198,14	14,221,14	14,177,14
gethostlatency.bt	193,14,216,216,216,94,94	193,14,275,275,275,94,94	193,14,163,163,163,30,30
tcplife.bt	490,14,1015	490,14,2110	490,14,635

fbs · 2020-04-17T20:41:35Z

docs/reference_guide.md

@@ -141,6 +141,8 @@ OPTIONS:
    -p PID         enable USDT probes on PID
    -c 'CMD'       run CMD and enable USDT probes on resulting process
    -v             verbose messages
+    --check-all-helper     check return codes of all helper functions


Suggested change

--check-all-helper check return codes of all helper functions

--check-helper-errors emit a warning when a bpf helper returns an error

I think this is a bit more descriptive.

Also alignment :)

src/ast/irbuilderbpf.cpp

fbs · 2020-04-17T20:55:13Z

src/main.cpp

@@ -57,14 +57,16 @@ void usage()
  std::cerr << "    --unsafe       allow unsafe builtin functions" << std::endl;
  std::cerr << "    -v             verbose messages" << std::endl;
  std::cerr << "    --info         Print information about kernel BPF support" << std::endl;
+  std::cerr << "    --check-all-helper     check return codes of all helper functions" << std::endl;


same comment and alignment thing here

fbs · 2020-04-17T21:34:11Z

Nice, this is useful :)

Seems like the processed instruction count is actually reduced in the --disable-helper-check case, what is going on there?

mmisono · 2020-04-17T22:57:56Z

Thanks for the review, will fix.

"default" in the above table means the default behavior when this patch applied (so it checks errors other than bpf_probe_read_xxx / bpf_map_lookup_elem). --disable-helper-check does not check error at all, and this is the same behavior as the current.

mmisono · 2020-04-22T12:51:40Z

I examined the generated code. Basically, this error check creates one branch and adds 10 instructions per helper function call. A problem is a branch creates another state and the verifier needs to explore paths both that take a branch and not. This is the reason the processed instructions sharply increase in scripts that call many helper functions. The verifier prunes a search if two states are equivalent, but unfortunately not.

I'm wondering if we can implement this functionality using bpf-to-bpf call. I guess it reduces states. I'll give it a try later. (there is little information about this..)

mmisono · 2020-04-27T12:39:55Z

I fixed the reviewed points and changed the option since the initial version was a bit confusing:

-k : emit a warning when a bpf helper returns an error, except read functions (bpf_probe_read_xxx(), bpf_map_lookup_elem()).
- The reason excludes these functions is described in the commit message.
-kk: check all helper function errors.

Therefore, if these options are not given, program behavior is the same as the current. What do you think about this? @fbs

(As I wrote, I'm seeing if bpf-to-bpf call is usable, but I don't plan to include it in this PR even if it works.)

danobi · 2020-05-13T18:26:59Z

Nice, this looks really useful. Code looks good to me. Can you add some codegen tests with the error checks and also a runtime test or two? Just so we don't accidentally break this later.

Sorry about the delay, didn't see this PR. In the future feel free to ping me if PR get delayed too much.

src/bpftrace.h

danobi · 2020-05-13T18:20:00Z

src/ast/irbuilderbpf.cpp

+  BasicBlock *helper_merge_block = BasicBlock::Create(module_.getContext(),
+                                                      "helper_merge",
+                                                      parent);
+  auto *ret = CreateIntCast(return_value, getInt32Ty(), true);


Can you leave that as a comment in the code? Will help next person who wonders why there's a cast.

mmisono · 2020-05-17T13:04:59Z

Updated:

Rebased onto master
Use unordered map instead of map
Add comments about IntCast
Add codegen and runtime tests

Please take a look when you have time @danobi

Some helper functions return error codes when failing, but currently, bpftrace does not check and discards it. Several times I had to use bcc to check error codes for debugging. This patch adds an error checking mechanism, and helper function errors are reported in runtime using '-k' option. For example: ``` % sudo BPFTRACE_MAP_KEYS_MAX=2 ./src/bpftrace -e \ 'BEGIN { @[0] = 0; @[1] = 1; @[2] = 2;} -k' Attaching 1 probe... stdin:1:35-36: WARNING: Failed to map_update_elem: Argument list too long (-7) BEGIN { @[0] = 0; @[1] = 1; @[2] = 2;} ~ ^C @[0]: 0 @[1]: 1 ``` This mechanism is implemented as an async action. By default, errors of `bpf_probe_read()`-related functions and `bpf_map_lookup_elem()` are not checked. This is because when these functions fail, bpftrace returns zero as a result, and some scripts already check it, for example, by using predicate `/ @[tid] /`. `-kk` option forces to check all errors including these functions. Note that the error check increases code size. All codegen tests are performed with the error check disabled.

Helper error checking increases code size and some scripts requires more bigger log size.

tests/runtime/other

mmisono commented Apr 17, 2020

View reviewed changes

fbs reviewed Apr 17, 2020

View reviewed changes

mmisono force-pushed the report_helper_errors branch 2 times, most recently from a0c1e8d to 027c544 Compare April 27, 2020 12:32

danobi reviewed May 13, 2020

View reviewed changes

mmisono force-pushed the report_helper_errors branch from 027c544 to 5461b5d Compare May 17, 2020 12:54

mmisono added 2 commits May 18, 2020 22:48

Increase error log size

181c0cb

Helper error checking increases code size and some scripts requires more bigger log size.

danobi reviewed May 18, 2020

View reviewed changes

tests/runtime/other Outdated Show resolved Hide resolved

mmisono force-pushed the report_helper_errors branch 2 times, most recently from 811059e to 181c0cb Compare May 18, 2020 21:45

danobi approved these changes May 18, 2020

View reviewed changes

danobi merged commit 5ec9148 into bpftrace:master May 18, 2020

mmisono mentioned this pull request May 19, 2020

need to warn when exceeding map key limits #464

Closed

mmisono mentioned this pull request Oct 5, 2020

Add path builtin function #1492

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Report helper function errors #1276

Report helper function errors #1276

mmisono commented Apr 17, 2020 •

edited

Loading

mmisono Apr 17, 2020

danobi May 13, 2020

mmisono commented Apr 17, 2020

mmisono commented Apr 17, 2020

fbs Apr 17, 2020 •

edited

Loading

fbs Apr 17, 2020

fbs commented Apr 17, 2020

mmisono commented Apr 17, 2020

mmisono commented Apr 22, 2020

mmisono commented Apr 27, 2020

danobi commented May 13, 2020

danobi May 13, 2020

mmisono commented May 17, 2020

	--check-all-helper check return codes of all helper functions
	--check-helper-errors emit a warning when a bpf helper returns an error

Report helper function errors #1276

Report helper function errors #1276

Conversation

mmisono commented Apr 17, 2020 • edited Loading

mmisono Apr 17, 2020

Choose a reason for hiding this comment

danobi May 13, 2020

Choose a reason for hiding this comment

mmisono commented Apr 17, 2020

mmisono commented Apr 17, 2020

fbs Apr 17, 2020 • edited Loading

Choose a reason for hiding this comment

fbs Apr 17, 2020

Choose a reason for hiding this comment

fbs commented Apr 17, 2020

mmisono commented Apr 17, 2020

mmisono commented Apr 22, 2020

mmisono commented Apr 27, 2020

danobi commented May 13, 2020

danobi May 13, 2020

Choose a reason for hiding this comment

mmisono commented May 17, 2020

mmisono commented Apr 17, 2020 •

edited

Loading

fbs Apr 17, 2020 •

edited

Loading