Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BPF: Use DebugLoc to find Filename for BTF line info #90302

Merged
merged 1 commit into from
Apr 29, 2024

Conversation

yonghong-song
Copy link
Contributor

@yonghong-song yonghong-song commented Apr 27, 2024

Andrii found an issue where the BTF line info may have empty source which seems wrong. The program is a Meta internal bpf program. I can reproduce with latest upstream compiler as well. Let the bpf program built without this patch and then with the following veristat check where veristat is a bpf verifier tool to do kernel verification for bpf programs:

$ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
$ rg '^;' log | sort | uniq -c | sort -nr | head -n10
4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
3907 ; if (now < start_allow_time) { @ ban.h:17
3674 ; @ src_mitigations.h:0
3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
1688 ; if (throttle_cfg) { @ rate_limit.h:85

You can see

3674 ; @ src_mitigations.h:0

where we do not have proper line information and line number.

In LLVM Machine IR, some instructions may carry DebugLoc information
to specify where the corresponding source is for this instruction.
The information includes file_name, line_num and col_num.
Each instruction may also attribute to a function in debuginfo.
So there are two ways to find file_name for a particular insn:
(1) find the corresponding function in debuginfo
(MI->getMF()->getFunction().getSubprogram()) and then
find the file_name from DISubprogram.
(2) find the corresponding file_name from DebugLoc.

The option (1) is used in current implementation. This mostly works.
But if one instruction is somehow generated from multiple functions,
the compiler has to pick just one. This may cause a mismatch between
file_name and line_num/col_num.

Besides potential incorrect mismatch of file_name vs. line_num/col_num,
There is another issue where some DebugLoc has line number 0. For example,
I dumped the dwarf line table for the above bpf program:

  Address            Line   Column File   ISA Discriminator OpIndex Flags
  ------------------ ------ ------ ------ --- ------------- ------- -------------
  0x0000000000000000     96      0     17   0             0       0  is_stmt
  0x0000000000000010    100     12     17   0             0       0  is_stmt prologue_end
  0x0000000000000020      0     12     17   0             0       0
  0x0000000000000058     37      7     17   0             0       0  is_stmt
  0x0000000000000060      0      0     17   0             0       0
  0x0000000000000088     37      7     17   0             0       0
  0x0000000000000090     42     75     17   0             0       0  is_stmt
  0x00000000000000a8     42     52     17   0             0       0
  0x00000000000000c0    120      9     17   0             0       0  is_stmt
  0x00000000000000c8      0      9     17   0             0       0
  0x00000000000000d0    106     21     17   0             0       0  is_stmt
  0x00000000000000d8    106      3     17   0             0       0
  0x00000000000000e0    110     25     17   0             0       0  is_stmt
  0x00000000000000f8    110     36     17   0             0       0
  0x0000000000000100      0     36     17   0             0       0
  ...

These DebugLoc with line number 0 needs to be skipped since we cannot
map them to the correct source code. Note that selftest offset-reloc-basic.ll
has this issue as well which is adjusted by this patch.

With the above two fixes, empty lines for source annotation are removed.

$ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
$ rg '^;' log.latest | sort | uniq -c | sort -nr | head -n10
4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
3907 ; if (now < start_allow_time) { @ ban.h:17
3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
1688 ; if (throttle_cfg) { @ rate_limit.h:85
1670 ; if (rl_cfg) { @ rate_limit.h:77

You can see that we do not have empty line any more.

3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85

@yonghong-song
Copy link
Contributor Author

cc @anakryiko

@@ -1389,8 +1388,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
OS.emitLabel(LineSym);

// Construct the lineinfo.
auto SP = DL->getScope()->getSubprogram();
constructLineInfo(SP, LineSym, DL.getLine(), DL.getCol());
constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most important part of the fix :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. that is the key change!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this could be done as follows, as get() is used to implement -> operator:

-  constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
+  constructLineInfo(LineSym, DL->getFile(), DL->getLine(), DL->getColumn());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Just make the change to do 'DL->getFile()' directly.

Copy link
Contributor

@eddyz87 eddyz87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I'll double check the impact on line information in selftests and write back this evening.

@@ -1389,8 +1388,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
OS.emitLabel(LineSym);

// Construct the lineinfo.
auto SP = DL->getScope()->getSubprogram();
constructLineInfo(SP, LineSym, DL.getLine(), DL.getCol());
constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this could be done as follows, as get() is used to implement -> operator:

-  constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
+  constructLineInfo(LineSym, DL->getFile(), DL->getLine(), DL->getColumn());

Andrii found an issue where the BTF line info may have empty
source which seems wrong. The program is a Meta internal
bpf program. I can reproduce with latest upstream compiler
as well. Let the bpf program built without this patch and then with
the following veristat check where veristat is a bpf verifier tool
to do kernel verification for bpf programs:

  $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
  $ rg '^;' log | sort | uniq -c | sort -nr | head -n10
   4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
   3907 ; if (now < start_allow_time) { @ ban.h:17
   3674 ;  @ src_mitigations.h:0
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
   1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
   1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
   1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
   1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
   1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
   1688 ; if (throttle_cfg) { @ rate_limit.h:85

You can see

   3674 ;  @ src_mitigations.h:0

where we do not have proper line information and line number.

In LLVM Machine IR, some instructions may carry DebugLoc information
to specify where the corresponding source is for this instruction.
The information includes file_name, line_num and col_num.
Each instruction may also attribute to a function in debuginfo.
So there are two ways to find file_name for a particular insn:
  (1) find the corresponding function in debuginfo
      (MI->getMF()->getFunction().getSubprogram()) and then
      find the file_name from DISubprogram.
  (2) find the corresponding file_name from DebugLoc.

The option (1) is used in current implementation. This mostly works.
But if one instruction is somehow generated from multiple functions,
the compiler has to pick just one. This may cause a mismatch between
file_name and line_num/col_num.

Besides potential incorrect mismatch of file_name vs. line_num/col_num,
There is another issue where some DebugLoc has line number 0. For example,
I dumped the dwarf line table for the above bpf program:

  Address            Line   Column File   ISA Discriminator OpIndex Flags
  ------------------ ------ ------ ------ --- ------------- ------- -------------
  0x0000000000000000     96      0     17   0             0       0  is_stmt
  0x0000000000000010    100     12     17   0             0       0  is_stmt prologue_end
  0x0000000000000020      0     12     17   0             0       0
  0x0000000000000058     37      7     17   0             0       0  is_stmt
  0x0000000000000060      0      0     17   0             0       0
  0x0000000000000088     37      7     17   0             0       0
  0x0000000000000090     42     75     17   0             0       0  is_stmt
  0x00000000000000a8     42     52     17   0             0       0
  0x00000000000000c0    120      9     17   0             0       0  is_stmt
  0x00000000000000c8      0      9     17   0             0       0
  0x00000000000000d0    106     21     17   0             0       0  is_stmt
  0x00000000000000d8    106      3     17   0             0       0
  0x00000000000000e0    110     25     17   0             0       0  is_stmt
  0x00000000000000f8    110     36     17   0             0       0
  0x0000000000000100      0     36     17   0             0       0
  ...

These DebugLoc with line number 0 needs to be skipped since we cannot
map them to the correct source code. Note that selftest offset-reloc-basic.ll
has this issue as well which is adjusted by this patch.

With the above two fixes, empty lines for source annotation are removed.

  $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
  $ rg '^;' log.latest | sort | uniq -c | sort -nr | head -n10
   4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
   3907 ; if (now < start_allow_time) { @ ban.h:17
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
   1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
   1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
   1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
   1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
   1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
   1688 ; if (throttle_cfg) { @ rate_limit.h:85
   1670 ; if (rl_cfg) { @ rate_limit.h:77

You can see that we do not have empty line any more.

   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
@yonghong-song
Copy link
Contributor Author

Looks good, I'll double check the impact on line information in selftests and write back this evening.

Thanks, @eddyz87 I added additional checking for line_num == 0. Now I checked with bpf selftest and our internal bpf program, there is zero line info with line_num == 0. But please double check.

@eddyz87
Copy link
Contributor

eddyz87 commented Apr 27, 2024

Looks good, I'll double check the impact on line information in selftests and write back this evening.

I used the following (ugly) script to dump all line information for *.bpf.o files generated for kernel selftests. It appears that there is no difference between line information generated for this branch and for 'main'.

@yonghong-song
Copy link
Contributor Author

yonghong-song commented Apr 27, 2024

I am not able to use your print_btf.py script as in my environment, I do not have 'lief' module and I do not know how to install it. Note that I cannot use 'pip insteal lief' due to internal restriction. Then I directly use compiler to generate ".s" file to make it easy to compare.
The command line:

clang  -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /home/yhs/work/yhs/llvm-project/llvm/build/install/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/include   -DENABLE_ATOMICS_TESTS  -O2 --target=bpf -S progs/percpu_alloc_array.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/percpu_alloc_array.bpf.o.s

With the current 'main' branch, we have

        .long   .Ltmp253
        .long   140
        .long   0
        .long   0                               # Line 0 Col 0

With this patch, there is no 'Line 0 Col 0' entry.

@eddyz87
Copy link
Contributor

eddyz87 commented Apr 29, 2024

@yonghong-song, I've updated the script to use llvm-objdump binary to extract BTF/BTF.ext section contents. I've also switched to using your branch dev2 instead of dev. After these changes I do see some changes in new vs old generated line info, it could be summarized as follows:

  • line info with line 0 is not generated anymore;
  • this causes removal of a few (~400) line infos in cases like below:
Insn       File                 Line Col
...
 1968       bpf_cubic.c          359  17  		delta = (cwnd * scale) >> 3;
 1984       bpf_cubic.c          360  27  		if (ca->ack_cnt > delta && delta) {
-1992       bpf_cubic.c          0    0   (anon)
-2008       bpf_cubic.c          360  27  		if (ca->ack_cnt > delta && delta) {

The record for instruction 2008 is removed, because it's coordinate is identical to record for instruction 1984.

=> I think that it is safe to merge this patch.

@yonghong-song
Copy link
Contributor Author

Great. The print_btf.py script works in my devserver now!

@yonghong-song yonghong-song merged commit 4c70157 into llvm:main Apr 29, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants