-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for parsing and reading C++ type templates #60
Conversation
libdrgn/dwarf_info_cache.c
Outdated
|
||
r = dwarf_child(die, &child); | ||
while (r == 0) { | ||
if (dwarf_tag(&child) == DW_TAG_template_type_parameter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since implementing support for incomplete types was a little sensitive as I moved stuff around, I made it into its own commit for now. If it looks good, I'll squash it into the others where they belong.
Right now, I ended up doing two passes over the children, to avoid creating the members array for incomplete types. However I could also do a single pass, and delete any (erroneous) member entries for incomplete types. Let me know what you like more!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep it as one pass by modifying the loop over the DWARF children to check if (!declaration && dwarf_tag(&child) == DW_TAG_member)
. You'll also have to move the incomplete type initialization to happen in the same place as the complete types and skip the dwarf_bytesize()
for incomplete types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it took me awhile to get to this. In terms of splitting up the commits, I think it'd make the most sense to have two commits: one which adds this support for structs, unions, and classes (including the Python scaffolding + tests) and a separate commit which adds the same to functions. Would it be painful to respin that way?
libdrgn/type.c
Outdated
@@ -197,6 +204,7 @@ void drgn_int_type_init(struct drgn_type *type, const char *name, uint64_t size, | |||
} | |||
type->_private.size = size; | |||
type->_private.is_signed = is_signed; | |||
type->_private.num_template_parameters = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't bother initializing this for types without template parameters. It should be caught by asserting in the getter.
_drgn.pyi
Outdated
|
||
template_parameters: Sequence[TypeTemplateParameter] | ||
""" | ||
List of template parameters making up this type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "List of template parameters of this type"? This should also mention what types it's valid for (structure, union, class, and function?).
libdrgn/dwarf_info_cache.c
Outdated
|
||
r = dwarf_child(die, &child); | ||
while (r == 0) { | ||
if (dwarf_tag(&child) == DW_TAG_template_type_parameter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep it as one pass by modifying the loop over the DWARF children to check if (!declaration && dwarf_tag(&child) == DW_TAG_member)
. You'll also have to move the incomplete type initialization to happen in the same place as the complete types and skip the dwarf_bytesize()
for incomplete types
tests/test_dwarf.py
Outdated
self.assertEqual(die_type.template_parameters[0].name, "T") | ||
self.assertEqual(die_type.template_parameters[1].name, "U") | ||
self.assertEqual(die_type.template_parameters[0].type, int_type("int", 4, True)) | ||
self.assertEqual(die_type.template_parameters[1].type, int_type("int", 4, True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add the tests after the Python binding support so you can compare against a constructed type instead of having to check each template parameter.
libdrgn/python/type.c
Outdated
@@ -1808,67 +1863,73 @@ static DrgnType *compound_type(PyObject *tag_obj, PyObject *size_obj, | |||
DrgnType *struct_type(PyObject *self, PyObject *args, PyObject *kwds) | |||
{ | |||
static char *keywords[] = { | |||
"tag", "size", "members", "qualifiers", "language", NULL, | |||
"tag", "size", "members", "qualifiers", "template_parameters", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have template_parameters
between members
and qualifiers
? Or does that break a bunch of tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it only breaks one, so I'll fix it and change the order
0dcff84
to
2e1b88e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I focused most on my efforts on the areas that conflicted, so I'm not sure if changes are needed in the places that didnt (most of the python boilerplate). It seems to still work though, so I'll leave that as is for now. I'm expecting the rebase onto the members patch to have less conflicts however.
libdrgn/python/type.c
Outdated
bool can_cache_members = true; | ||
bool can_cache_templates = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these 'can_cache' variables for? I don't think I understand what they're doing for members, so I'm not sure if something is needed if it is true on templates as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we access some attribute of a type, we translate the C attribute to Python and cache it for subsequent uses. When a type is created in Python, we translate the Python arguments to C to build the type. As an optimization, we can sometimes cache the Python argument as an attribute directly when the type is created (that's what all the _PyDict_SetItemId()
calls at the end of this function do). We can't cache the list of members/parameters/template parameters if they use a callback, because they each need their own thunk. That's what these are for, and you do it for template parameters (as well as a call to _PyDict_SetItemId()
).
libdrgn/type.c
Outdated
@@ -499,12 +511,50 @@ drgn_compound_type_builder_add_member(struct drgn_compound_type_builder *builder | |||
return NULL; | |||
} | |||
|
|||
struct drgn_error * | |||
drgn_type_builder_add_template_parameter(struct drgn_program *prog, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just flagging this - I didn't want to duplicate almost the same thing for both compound and function type builders - so I unified the "add template parameter path" by splitting the arguments and omitting the builder. Not sure if there's a nicer way to de-duplicate the code without something like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be better to have drgn_compound_type_builder_add_template_parameter()
and drgn_function_type_builder_add_template_parameter()
be the public interfaces, but they can both call through to this (static) helper internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the public interfaces are separate, is there a way I could not duplicate the callers, ie unpack_template
and parse_template
? I could pass both builders and see which one is non-null but that seems a bit hacky.
libdrgn/debug_info.c
Outdated
name = dwarf_formstring(attr); | ||
if (!name) { | ||
return drgn_error_create(DRGN_ERROR_OTHER, | ||
"DW_TAG_member has invalid DW_AT_name"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DW_TAG_template_type_parameter
, not DW_TAG_member
, right?
libdrgn/debug_info.c
Outdated
name = NULL; | ||
} | ||
|
||
err = drgn_lazy_type_from_dwarf(dbinfo, die, false, "DW_TAG_member", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
libdrgn/debug_info.c
Outdated
|
||
Dwarf_Die child; | ||
int r = dwarf_child(die, &child); | ||
while (r == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can do this in the same pass as parsing the members. The main difference is that if the type is a declaration, you shouldn't get the size or endianness, and you should ignore any DW_TAG_member
children.
libdrgn/type.c
Outdated
enum drgn_type_kind kind = builder->kind; | ||
assert(kind == DRGN_TYPE_STRUCT || | ||
kind == DRGN_TYPE_UNION || | ||
kind == DRGN_TYPE_CLASS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drgn_compound_type_builder_init()
already checked this, so no need to repeat it.
libdrgn/type.c
Outdated
@@ -828,6 +859,9 @@ drgn_function_type_create(struct drgn_function_type_builder *builder, | |||
type->_private.language = | |||
lang ? lang : drgn_program_language(builder->prog); | |||
*ret = type; | |||
// TODO fix this, functions can have templates | |||
type->_private.num_template_parameters = 0; | |||
type->_private.language = drgn_language_or_default(lang); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this snuck in from a conflict. We're already assigning the language.
libdrgn/type.h
Outdated
@@ -348,8 +364,9 @@ drgn_compound_type_builder_add_member(struct drgn_compound_type_builder *builder | |||
* member must remain valid for the lifetime of @c builder->prog. | |||
* @param[in] tag Name of the type. Not copied; must remain valid for the | |||
* lifetime of @c builder->prog. May be @c NULL if the type is anonymous. | |||
* @param[in] size Size of the type in bytes. | |||
* @param[in] size Size of the type in bytes. Ignored if type is incomplete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about it, I think we want to explicitly return an error if is_complete
and size != 0
. We should also return an error if is_complete
and members were added to the builder.
_drgn.pyi
Outdated
@@ -529,6 +529,7 @@ class Program: | |||
tag: Optional[str], | |||
size: IntegerLike, | |||
members: Sequence[TypeMember], | |||
template_parameters: Optional[Sequence[TypeTemplateParameter]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it mean for template_parameters
to be None
? For members
, there's a distinction: an empty sequence is no members, and None
is an incomplete type. I don't see a distinction for template_parameters
. Perhaps this should be template_parameters: Sequence[TypeTemplateParameter] = ()
(also, template_parameters
should be mentioned in the docstring).
_drgn.pyi
Outdated
:param type: Type of the template parameter. This may be a :class:`Type` or ``None``. | ||
:param name: Name of the template parameter. This may be ``None`` if the parameter is | ||
unnamed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should go in the docstring for __init__()
now (and you can copy the text for TypeMember.__init__()
since that correctly mentions that type
can be a callable).
libdrgn/python/type.c
Outdated
if (append_member(parts, self, &first, template_parameters) == -1) | ||
goto out_repr_leave; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make this conditional on the type having template parameters? That'll make repr()
less noisy for the common case of no parameters.
libdrgn/python/type.c
Outdated
bool can_cache_members = true; | ||
bool can_cache_templates = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we access some attribute of a type, we translate the C attribute to Python and cache it for subsequent uses. When a type is created in Python, we translate the Python arguments to C to build the type. As an optimization, we can sometimes cache the Python argument as an attribute directly when the type is created (that's what all the _PyDict_SetItemId()
calls at the end of this function do). We can't cache the list of members/parameters/template parameters if they use a callback, because they each need their own thunk. That's what these are for, and you do it for template parameters (as well as a call to _PyDict_SetItemId()
).
56f3f8e
to
985a060
Compare
libdrgn/python/type.c
Outdated
cached_members : Py_None) == -1)) | ||
cached_members : Py_None) == -1) || | ||
(can_cache_templates && | ||
_PyDict_SetItemId(type_obj->attr_cache, &DrgnType_attr_template_parameters.id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I'm doing this right - can you take a closer look at this (and the function version as well). I have no idea how to test it.
985a060
to
81cbb8a
Compare
I rebased this diff on top of #72 - still no value templates support but it shouldn't be too difficult. I'm just trying to resolve conflicts right now and make sure things are working as is. |
a05be65
to
fe2e6c6
Compare
fe2e6c6
to
b3c4f65
Compare
This is more of a note to myself so I don't forget than anything else - on void templates there is no DW_AT_type (at least on clang), but there is a name, which means we have to do something to represent these instead of just failing - or we can just drop them completely in a worst case. |
In preparation for struct drgn_type referencing struct drgn_object, move the former after the latter. Signed-off-by: Omar Sandoval <osandov@osandov.com>
Getting the bit field size of a member will soon require evaluating the lazy type, so return it from drgn_member_type() instead of accessing it directly. Signed-off-by: Omar Sandoval <osandov@osandov.com>
In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>
This will be needed for types containing reference objects. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>
In preparation for calling the object parsing code from the type parsing code, move it up in the file (and update the coding style in drgn_object_from_dwarf_enumerator() while we're at it). Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add struct drgn_type_template_parameter to libdrgn, the corresponding TypeTemplateParameter to the Python bindings, and support for parsing them from DWARF. With this, support for templates is almost, but not quite, complete. The main wart is that DW_TAG_name of compound types includes the template parameters, so the type tag includes it as well. We should remove that from the tag and instead have the type formatting code add it only when getting the full type name. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>
I rebased your patches on the lazy object rework I did, added support for non-type parameters, and cleaned it up. If all the tests pass, I think this should be ready to go in! |
Merged! Let me know if you get the chance to test it out. |
There are a couple of things missing, but I want to try to keep this as small as possible for now.
Completed:
Missing:
Right now this support is always enabled (not only for C++ units), as we won't see this dwarf tag (I hope) in C code, and plumbing the current language down might be a little difficult. If there's a straightforward (and desirable) way to do this, let me know!
I'm the most unsure about the python bindings (and the TypeTemplateParameter python object). Most of it was derived from the Members array, so hopefully it's correct, but if you could take a close look at that it would be nice. Hopefully I didn't miss anything too obvious!
The commits are not too well seperated at the moment (too much is in the first commit). I can break it out or move things around if it would make more sense, however!)