-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SymbolIndex and kallsyms helpers #388
Conversation
44f9b17
to
acd7534
Compare
Alright, sorry for the failure (it was an old commit changing how ELF symbols are stored in |
6c8cb73
to
429053f
Compare
These latest pushes rebase on the current main branch, with the updated symbol finder API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only looked at the first two commits so far, so feel free to ignore these comments until I'm done or address them now, up to you. I'll continue reviewing tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I finally made it through the whole series. Good stuff! I can't claim that I fully understood the kallsyms format this time around, but I at least tried to look for coding errors.
The guts of the implementation look good, but the drgn.helpers.linux.kallsyms
interface is a bit overwhelming. There are lots of similar-looking helpers that I wouldn't know how to make use as an end user. I wonder if we can pare it down a bit. What are the most important use cases that you're targeting?
libdrgn/kallsyms.c
Outdated
} | ||
} else { | ||
if (kind_ret) | ||
*kind_ret = *token_ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we return without ever setting this, leaving the caller with uninitialized garbage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that's worth guarding against. I think the easiest place to do it is actually in drgn_load_builtin_kallsyms()
, where we can simply check whether the string builder length is zero and raise an error.
There's a lot of stuff in the kallsyms tables that, if fuzzed, could cause these parsers to behave weirdly or maybe crash. I'm not certain how paranoid we want to be here. At this point, if the correct set of symbols are provided, with real values, and we've managed to read the token and name table successfully (without faulting or mallocing a buffer that's way too large), then it seems likely that the data will match the kallsyms format. So in practical terms, these sorts of things shouldn't happen. What are your opinions here? I can add paranoia to this, but it does feel like it will take rather unclear code and make it even worse...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of stuff in the kallsyms tables that, if fuzzed, could cause these parsers to behave weirdly or maybe crash.
The general philosophy in drgn is that bogus input can produce bogus results, but it should never cause drgn to crash or invoke undefined behavior. That does tend to require lots of extra bounds checking and overflow checking. I'll keep an eye out for that as I review the kallsyms code.
Thanks for the review! I'll take a look through it in detail and try to get an update ASAP, rebased on main as well. It looks like you made a lot of good catches, so thanks for that.
Yeah, I have even confused myself... I've occasionally imported from there and not known which function I want to use. So it does need to be simplified. The primary use cases I can think of are:
That would mean at a minimum, we'd really just need The |
e268fbd
to
8a9c865
Compare
Ok after a few botched pushes I think I've addressed everything here. I've also rebased on |
In commit c8ff872 ("Support systems without qsort_r"), usage of qsort_r was eliminated because it is a glibc extension. There was discussion of creating a utility function that implements qsort_r(), but the approach described uses thread local variables, so it is not actually reentrant, and it was dropped to avoid confusion. However, upcoming commits will also prefer a comparator function which takes an argument, and they also won't require a reentrant implementation. Add this helper in with a name that shouldn't spark confusion: qsort_arg(). Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
This will be reused in an upcoming commit. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
8a9c865
to
f1d53eb
Compare
Noticed that the branch had become stale due to some merge conflicts. This first push here has just resolved the conflicts. Got a lint error (git rebase does not run pre-commit :/). I still need to incorporate your new binary search macro, so I'll resolve both shortly. |
I'm loving the binary search helpers, they simplified things a lot here: diff --git a/libdrgn/symbol.c b/libdrgn/symbol.c
index 46e21a6b..92bcfdb9 100644
--- a/libdrgn/symbol.c
+++ b/libdrgn/symbol.c
@@ -6,6 +6,7 @@
#include <stdlib.h>
#include <string.h>
+#include "binary_search.h"
#include "drgn_internal.h"
#include "string_builder.h"
#include "symbol.h"
@@ -300,38 +301,21 @@ drgn_symbol_index_deinit(struct drgn_symbol_index *index)
static void address_search_range(struct drgn_symbol_index *index, uint64_t address,
uint32_t *start_ret, uint32_t *end_ret)
{
- // bsearch() only handles exact matches, and we need to find what is
- // essentially the correct insertion point, even if there's not an exact
- // match. So bsearch() can't be used here. It's helpful to think of this
- // code in terms of the Python library functions bisect_right().
- uint32_t lo = 0, hi = index->num_syms;
- uint32_t mid;
-
// First, identify the maximum symbol index which could possibly contain
// this address. Think of this as:
// end_ret = bisect_right([s.address for s in symbols], address)
- while (lo < hi) {
- mid = lo + (hi - lo) / 2;
- if (address < index->symbols[mid].address)
- hi = mid;
- else
- lo = mid + 1;
- }
- *end_ret = lo;
+ #define less_than_start(a, b) (*(a) < (b)->address)
+ *end_ret = binary_search_gt(index->symbols, index->num_syms, &address,
+ less_than_start);
+ #undef less_than_start
// Second, identify first symbol index which could possibly contain this
// address. We need to use "max_addrs" for this task:
// bisect_right(max_addrs, address)
- hi = lo;
- lo = 0;
- while (lo < hi) {
- mid = lo + (hi - lo) / 2;
- if (address < index->max_addrs[mid])
- hi = mid;
- else
- lo = mid + 1;
- }
- *start_ret = lo;
+ #define less_than_end(a, b) (*(a) < *(b))
+ *start_ret = binary_search_gt(index->max_addrs, index->num_syms, &address,
+ less_than_end);
+ #undef less_than_end
}
struct drgn_error * |
f1d53eb
to
963370d
Compare
Clarification for #388 (comment). Signed-off-by: Omar Sandoval <osandov@osandov.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signing off for the day after looking at commits 1-3. Will continue tomorrow, thanks!
libdrgn/kallsyms.c
Outdated
} | ||
} else { | ||
if (kind_ret) | ||
*kind_ret = *token_ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of stuff in the kallsyms tables that, if fuzzed, could cause these parsers to behave weirdly or maybe crash.
The general philosophy in drgn is that bogus input can produce bogus results, but it should never cause drgn to crash or invoke undefined behavior. That does tend to require lots of extra bounds checking and overflow checking. I'll keep an eye out for that as I review the kallsyms code.
963370d
to
0f72fa5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that's the rest of it, looking good overall. I gave the kallsyms code a much closer read than I had before, so there are some new comments there that I missed in the last round. Have a good weekend!
The Symbol Finder API gives us the ability to register a dynamic callback for symbol lookup. However, many common use cases are satisfied by a simple static list of symbols. Correct and efficient lookup in this simple case is rather tricky. Implement a new type, SymbolIndex, which can take a list of symbols and index them for efficient lookup by name or address. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
0f72fa5
to
88780c7
Compare
I double checked to ensure the s390x failure was transient, so we've got a clean test run :) I believe I have addressed all your comments, and I avoided rebasing on the |
Yeah Packit has been flaky today. I'll give these another look today! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, just need a few final easy fixes.
88780c7
to
7994296
Compare
(LGTM other than the |
When properly configured, Linux contains its own symbol table within kernel memory at runtime. It is exposed as the /proc/kallsyms file, which is the easiest way to consume it, for live kernels. However, with recent changes to the Linux kernel in 6.0, necessary symbols are exposed within VMCOREINFO that allow us to interpret the data structures inside kernel memory, without needing debuginfo. This allows us to write helpers that can load kallsyms on vmcores, or on live systems. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Add Python helpers which load module kallsyms and return a symbol index for them. Unlike the /proc/kallsyms and built-in kallsyms, these are quite easy to handle using regular Python & drgn code, so implement them as Python helpers. There are (at least) two use cases for these helpers: 1. After loading CTF and built-in vmlinux kallsyms, support for module kallsyms is still necessary. 2. Sometimes, people only extract vmlinux DWARF debuginfo. Adding module symbols can allow stack traces and other symbolization to work even without module debuginfo. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
The SymbolIndex structure owns the memory for symbols and names, and once added to the Program, it cannot be removed. Making copies of any of these symbols is purely a waste: we should be able to treat them as static. So add a lifetime and allow the SymbolIndex to specify static, avoiding unnecessary copies and frees. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
7994296
to
ee3f77f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all of your work on this, it looks ready to go 🎉
Yay, that's great! Thanks for all the reviewing work on this! |
This is the next step of the DWARFless debugging / CTF feature, #176. Unlike the symbol finder, there aren't too many questions of API, instead we're just implementing the API. This PR adds:
SymbolIndex
type, which holds an immutable list of symbols and indexes them for efficient name and address based lookup. Some interesting items here:SymbolIndex
overSymbolTable
since I didn't want to overload the common usage of that term applying to ELF symbol tables. That said,SymbolIndex
is a bit strange so I'm open to suggestions.load_proc_kallsyms()
andload_builtin_kallsyms()
) enumerate all kallsyms symbols and return aSymbolIndex
object. The first uses/proc/kallsyms
and the second uses information fromVMCOREINFO
to find the kallsyms tables./proc/kallsyms
from a vmcore./proc/kallsyms
includes module symbols, while the built-in vmlinux kallsyms of course does not. I've included an option to parse these symbols, however by default, we do not. This is because you can actually get the ELF symbol objects, which have more information, directly from thestruct module
.load_module_kallsyms()
) for loading module symbols from the kernel image and creating a symbol index from that. This does require debuginfo for vmlinux.I also of course have tests for the above.
Below are two issues you'll probably notice and want to discuss:
guess_long_names()
function works around torvalds/linux@73bbb94466fd ("kallsyms: support "big" kernel symbols") which was merged in 6.1. I didn't catch it immediately so I'm not sure whether there's any use in sending a commit to include a version number for kallsyms... Especially since I believe it's unlikely to be backported, but I could be wrong. It has certainly not been backported to stable kernels.drgn_symbol_destroy()
. This enables theSymbolIndex
to provides symbols with zero copying, even all the way out to the Python layer! However, I'm not certain it's worth the complexity it adds. That's why I kept the change as a second commit; it's easy to drop if you don't like it. I haven't measured any performance impact and admittedly, I don't know if it would be noticeable.