-
Notifications
You must be signed in to change notification settings - Fork 11k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BPF: Use DebugLoc to find Filename for BTF line info #90302
Conversation
cc @anakryiko |
llvm/lib/Target/BPF/BTFDebug.cpp
Outdated
@@ -1389,8 +1388,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) { | |||
OS.emitLabel(LineSym); | |||
|
|||
// Construct the lineinfo. | |||
auto SP = DL->getScope()->getSubprogram(); | |||
constructLineInfo(SP, LineSym, DL.getLine(), DL.getCol()); | |||
constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most important part of the fix :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. that is the key change!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this could be done as follows, as get()
is used to implement ->
operator:
- constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
+ constructLineInfo(LineSym, DL->getFile(), DL->getLine(), DL->getColumn());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Just make the change to do 'DL->getFile()' directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I'll double check the impact on line information in selftests and write back this evening.
llvm/lib/Target/BPF/BTFDebug.cpp
Outdated
@@ -1389,8 +1388,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) { | |||
OS.emitLabel(LineSym); | |||
|
|||
// Construct the lineinfo. | |||
auto SP = DL->getScope()->getSubprogram(); | |||
constructLineInfo(SP, LineSym, DL.getLine(), DL.getCol()); | |||
constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this could be done as follows, as get()
is used to implement ->
operator:
- constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
+ constructLineInfo(LineSym, DL->getFile(), DL->getLine(), DL->getColumn());
Andrii found an issue where the BTF line info may have empty source which seems wrong. The program is a Meta internal bpf program. I can reproduce with latest upstream compiler as well. Let the bpf program built without this patch and then with the following veristat check where veristat is a bpf verifier tool to do kernel verification for bpf programs: $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log $ rg '^;' log | sort | uniq -c | sort -nr | head -n10 4206 ; } else if (action->dry_run) { @ src_mitigations.h:57 3907 ; if (now < start_allow_time) { @ ban.h:17 3674 ; @ src_mitigations.h:0 3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85 1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26 1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28 1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25 1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498 1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76 1688 ; if (throttle_cfg) { @ rate_limit.h:85 You can see 3674 ; @ src_mitigations.h:0 where we do not have proper line information and line number. In LLVM Machine IR, some instructions may carry DebugLoc information to specify where the corresponding source is for this instruction. The information includes file_name, line_num and col_num. Each instruction may also attribute to a function in debuginfo. So there are two ways to find file_name for a particular insn: (1) find the corresponding function in debuginfo (MI->getMF()->getFunction().getSubprogram()) and then find the file_name from DISubprogram. (2) find the corresponding file_name from DebugLoc. The option (1) is used in current implementation. This mostly works. But if one instruction is somehow generated from multiple functions, the compiler has to pick just one. This may cause a mismatch between file_name and line_num/col_num. Besides potential incorrect mismatch of file_name vs. line_num/col_num, There is another issue where some DebugLoc has line number 0. For example, I dumped the dwarf line table for the above bpf program: Address Line Column File ISA Discriminator OpIndex Flags ------------------ ------ ------ ------ --- ------------- ------- ------------- 0x0000000000000000 96 0 17 0 0 0 is_stmt 0x0000000000000010 100 12 17 0 0 0 is_stmt prologue_end 0x0000000000000020 0 12 17 0 0 0 0x0000000000000058 37 7 17 0 0 0 is_stmt 0x0000000000000060 0 0 17 0 0 0 0x0000000000000088 37 7 17 0 0 0 0x0000000000000090 42 75 17 0 0 0 is_stmt 0x00000000000000a8 42 52 17 0 0 0 0x00000000000000c0 120 9 17 0 0 0 is_stmt 0x00000000000000c8 0 9 17 0 0 0 0x00000000000000d0 106 21 17 0 0 0 is_stmt 0x00000000000000d8 106 3 17 0 0 0 0x00000000000000e0 110 25 17 0 0 0 is_stmt 0x00000000000000f8 110 36 17 0 0 0 0x0000000000000100 0 36 17 0 0 0 ... These DebugLoc with line number 0 needs to be skipped since we cannot map them to the correct source code. Note that selftest offset-reloc-basic.ll has this issue as well which is adjusted by this patch. With the above two fixes, empty lines for source annotation are removed. $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log $ rg '^;' log.latest | sort | uniq -c | sort -nr | head -n10 4206 ; } else if (action->dry_run) { @ src_mitigations.h:57 3907 ; if (now < start_allow_time) { @ ban.h:17 3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85 1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26 1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28 1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25 1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498 1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76 1688 ; if (throttle_cfg) { @ rate_limit.h:85 1670 ; if (rl_cfg) { @ rate_limit.h:77 You can see that we do not have empty line any more. 3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85 Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Thanks, @eddyz87 I added additional checking for line_num == 0. Now I checked with bpf selftest and our internal bpf program, there is zero line info with line_num == 0. But please double check. |
I used the following (ugly) script to dump all line information for *.bpf.o files generated for kernel selftests. It appears that there is no difference between line information generated for this branch and for 'main'. |
I am not able to use your print_btf.py script as in my environment, I do not have 'lief' module and I do not know how to install it. Note that I cannot use 'pip insteal lief' due to internal restriction. Then I directly use compiler to generate ".s" file to make it easy to compare.
With the current 'main' branch, we have
With this patch, there is no 'Line 0 Col 0' entry. |
@yonghong-song, I've updated the script to use
Insn File Line Col
...
1968 bpf_cubic.c 359 17 delta = (cwnd * scale) >> 3;
1984 bpf_cubic.c 360 27 if (ca->ack_cnt > delta && delta) {
-1992 bpf_cubic.c 0 0 (anon)
-2008 bpf_cubic.c 360 27 if (ca->ack_cnt > delta && delta) { The record for instruction 2008 is removed, because it's coordinate is identical to record for instruction 1984. => I think that it is safe to merge this patch. |
Great. The print_btf.py script works in my devserver now! |
Andrii found an issue where the BTF line info may have empty source which seems wrong. The program is a Meta internal bpf program. I can reproduce with latest upstream compiler as well. Let the bpf program built without this patch and then with the following veristat check where veristat is a bpf verifier tool to do kernel verification for bpf programs:
$ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
$ rg '^;' log | sort | uniq -c | sort -nr | head -n10
4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
3907 ; if (now < start_allow_time) { @ ban.h:17
3674 ; @ src_mitigations.h:0
3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
1688 ; if (throttle_cfg) { @ rate_limit.h:85
You can see
3674 ; @ src_mitigations.h:0
where we do not have proper line information and line number.
In LLVM Machine IR, some instructions may carry DebugLoc information
to specify where the corresponding source is for this instruction.
The information includes file_name, line_num and col_num.
Each instruction may also attribute to a function in debuginfo.
So there are two ways to find file_name for a particular insn:
(1) find the corresponding function in debuginfo
(MI->getMF()->getFunction().getSubprogram()) and then
find the file_name from DISubprogram.
(2) find the corresponding file_name from DebugLoc.
The option (1) is used in current implementation. This mostly works.
But if one instruction is somehow generated from multiple functions,
the compiler has to pick just one. This may cause a mismatch between
file_name and line_num/col_num.
Besides potential incorrect mismatch of file_name vs. line_num/col_num,
There is another issue where some DebugLoc has line number 0. For example,
I dumped the dwarf line table for the above bpf program:
These DebugLoc with line number 0 needs to be skipped since we cannot
map them to the correct source code. Note that selftest offset-reloc-basic.ll
has this issue as well which is adjusted by this patch.
With the above two fixes, empty lines for source annotation are removed.
$ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
$ rg '^;' log.latest | sort | uniq -c | sort -nr | head -n10
4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
3907 ; if (now < start_allow_time) { @ ban.h:17
3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
1688 ; if (throttle_cfg) { @ rate_limit.h:85
1670 ; if (rl_cfg) { @ rate_limit.h:77
You can see that we do not have empty line any more.
3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85