Support C++ references #81

Open
Tracked by #328
osandov opened this issue Jan 9, 2021 · 2 comments
Labels
C++ (Support for C++ debugging), enhancement (New feature or request), help wanted (Extra attention is needed)

Comments


osandov commented Jan 9, 2021

E.g., type& foo. This should be a matter of adding the boilerplate for a new "reference" type kind and parsing it from DWARF. It will be almost identical to a pointer type. The only difference that comes to mind is that a reference type may not refer to a void type (in C++, at least, but maybe we can be more permissive).

osandov added the enhancement, good first issue, and C++ labels on Jan 9, 2021

osandov commented Jan 9, 2021

On second thought, there's more to it. Adding the type definition is easy, but the tricky part is Objects with reference type.

Although a reference is physically stored much like a pointer, logically it is used like the object it refers to. This means that an Object with reference type should be stored like a pointer, but whenever it is used in an operation, it should be transparently dereferenced. This includes Object.value_(), so we would probably need to add another method for getting the address stored in the reference.

We'll also need good documentation about the difference between an Object with reference type, a reference Object, and a reference Object with reference type (ouch).

osandov removed the good first issue label on Jan 9, 2021

osandov commented Jan 25, 2022

I finally got around to writing up more details on how to go about this.

Problem Statement

drgn was originally developed for C, so it's still missing lots of C++ features. One of those missing features is references.

The mechanism for implementing references is not specified by the C++ standard. However, they are typically implemented like a pointer: the compiler stores an address which it transparently dereferences when the reference is used. See here and here for some discussions. DWARF seems to assume this implementation; search for "reference type" in the DWARF standard.

The goal of this issue is to support reference types and objects with a reference type in drgn.

Implementation

Types

drgn represents types with the drgn.Type class in the Python bindings and the struct drgn_type structure in libdrgn. There are various kinds of types, each of which has different attributes. For example, pointer types have a size and a referenced type (e.g., on a 64-bit platform, int * is 8 bytes and the referenced type is int).

Representing reference types in drgn should be fairly straightforward and mostly boilerplate along the lines of how we represent pointer types.

The first step is allowing struct drgn_type/drgn.Type to represent a reference type by:

  1. Adding a reference type kind to enum drgn_type_kind.
  2. Defining the getters for reference types (likely the same as for pointer types, so drgn_type_size(), drgn_type_little_endian(), and drgn_type_type()).
  3. Adding a constructor for reference types (see drgn_pointer_type_create(); either generalize this function or add a drgn_reference_type_create()) and the matching drgn.Program.reference_type() Python binding (see Program_pointer_type()).
  4. Adding a unit test: see tests.test_type.TestType.test_pointer.
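
To make the steps above concrete, here is a minimal sketch in plain Python. It is a toy model, not drgn's API: TypeKind, ToyType, and the three constructor functions are hypothetical names, and the default size of 8 bytes is an assumption for a 64-bit target. The point it illustrates is that a reference type can carry exactly the same attributes as a pointer type and differ only in its kind.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class TypeKind(Enum):
    # Toy subset of type kinds; REFERENCE is the proposed addition.
    INT = auto()
    POINTER = auto()
    REFERENCE = auto()


@dataclass(frozen=True)
class ToyType:
    # Hypothetical stand-in for struct drgn_type / drgn.Type.
    kind: TypeKind
    size: int                          # size in bytes
    name: Optional[str] = None         # only set for base types
    type: Optional["ToyType"] = None   # referenced type (pointer/reference)
    little_endian: bool = True


def int_type(name: str, size: int) -> ToyType:
    return ToyType(TypeKind.INT, size, name=name)


def pointer_type(referenced: ToyType, size: int = 8) -> ToyType:
    return ToyType(TypeKind.POINTER, size, type=referenced)


def reference_type(referenced: ToyType, size: int = 8) -> ToyType:
    # Same attributes as a pointer type; only the kind differs.
    return ToyType(TypeKind.REFERENCE, size, type=referenced)


int_t = int_type("int", 4)
int_ptr = pointer_type(int_t)
int_ref = reference_type(int_t)
```

A unit test modeled on the existing pointer test would presumably check the same three getters (size, byte order, referenced type) plus the kind.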

Then, we need to support parsing reference types from DWARF. Reference types are represented the same as pointer types in DWARF, except that they use the DW_TAG_reference_type tag. We need to update drgn_type_from_dwarf_internal() and either generalize drgn_pointer_type_from_dwarf() to also support reference types or add a drgn_reference_type_from_dwarf().
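
As a rough illustration of the "generalize the pointer parser" option, the sketch below dispatches on the DWARF tag and otherwise treats both kinds identically. The two tag values are the ones defined by the DWARF standard; everything else (the dict-based DIE, the field names, and the fallback size of 8) is a hypothetical simplification and not how libdrgn represents DIEs.

```python
# Tag values as defined by the DWARF standard.
DW_TAG_pointer_type = 0x0F
DW_TAG_reference_type = 0x10

# Hypothetical mapping from DWARF tag to a toy type kind.
KIND_BY_TAG = {
    DW_TAG_pointer_type: "pointer",
    DW_TAG_reference_type: "reference",
}


def pointer_like_type_from_die(die: dict) -> dict:
    """Build a toy pointer or reference type from a toy DIE.

    `die` is a plain dict with a "tag" key and optional "byte_size" and
    "type" keys; this is an assumption made for illustration only.
    """
    tag = die["tag"]
    if tag not in KIND_BY_TAG:
        raise ValueError(f"unsupported DWARF tag {tag:#x}")
    return {
        "kind": KIND_BY_TAG[tag],
        # Assumption: fall back to an 8-byte address size when the DIE
        # does not carry an explicit size.
        "size": die.get("byte_size", 8),
        "type": die.get("type"),
    }


ref = pointer_like_type_from_die(
    {"tag": DW_TAG_reference_type, "byte_size": 8, "type": "int"}
)
ptr = pointer_like_type_from_die({"tag": DW_TAG_pointer_type, "type": "int"})
```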

Objects

drgn represents variables and values with the drgn.Object class. Objects have a type. Objects can be used in expressions, which behave as if they had that type in C/C++.

Objects with a reference type are the trickier part of this task. In some respects, these objects should behave like pointers, and in others, they should behave like the referenced type.

Background

A struct drgn_object can be either a "value" or a "reference" (which is different from a C++ reference, but similar in spirit). See the documentation on references vs. values. Essentially:

  • struct drgn_object::kind indicates whether an object is a value or a reference (or absent, which isn't relevant here).
  • For a "value" object, we store the actual value of the variable in struct drgn_object::value. E.g., for an int object, we store an integer value; for a pointer, we store the pointer's integer value; for a struct object, we store a buffer containing the raw bytes of the structure.
  • For a "reference" object, we store the address of the variable in struct drgn_object::address.
  • struct drgn_object::encoding indicates how the value is encoded (e.g., as an unsigned integer, floating-point value, raw buffer, etc.).
  • When we get the value of an object, e.g. to use it in an expression: if it is a value, we simply get the stored value; if it is a reference, we read the value from the program's memory.
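
The value/reference distinction in the list above can be sketched with a toy model. This is not libdrgn code: ToyObject, MEMORY, and read_int are hypothetical names, and the "program memory" is just a dict from address to bytes.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Toy "program memory": address 0x1000 holds a little-endian 4-byte int 42.
MEMORY = {0x1000: struct.pack("<i", 42)}


@dataclass
class ToyObject:
    kind: str                       # "value" or "reference"
    value: Optional[int] = None     # used when kind == "value"
    address: Optional[int] = None   # used when kind == "reference"


def read_int(obj: ToyObject) -> int:
    if obj.kind == "value":
        # Value object: the value is stored in the object itself.
        return obj.value
    # Reference object: read the value from the program's memory.
    return struct.unpack("<i", MEMORY[obj.address])[0]


by_value = ToyObject("value", value=42)
by_reference = ToyObject("reference", address=0x1000)
```

Both objects yield the same integer; they differ only in where it comes from.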

Storing Objects with Reference Type

I think we want to store objects with reference type the same way as we store objects with pointer type: encoded as DRGN_OBJECT_ENCODING_UNSIGNED, where the value is the address of the referred-to object. I.e.:

  • A "value" object with reference type stores the referred-to address in struct drgn_object::value.
  • A "reference" object with reference type stores the address of the reference object in struct drgn_object::address. (Yes, this is confusing.)

Operations on Objects with Reference Type

There are a couple of fundamental operations that we need to support on objects with reference type:

  1. Initializing an object with reference type. Like pointers, this will likely be drgn_object_set_unsigned() for initializing a value object with reference type from the referred-to address and drgn_object_set_reference() for initializing a reference object with reference type.
  2. Getting the referred-to value. The existing APIs here are drgn_object_read_value(), drgn_object_read_bytes(), drgn_object_read_signed(), drgn_object_read_unsigned(), drgn_object_read_integer(), and drgn_object_read_float(). I think we want to make those transparently dereference the object and return the referred-to value. Another option would be to make the existing functions treat the reference as a pointer and add new functions that dereference the object.
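
Here is a small sketch of how the two read paths could relate, assuming the storage scheme described above for an object of type int &. The function names and the dict-based objects and memory are hypothetical; they only model the idea that a reference object needs one extra memory read to find the referred-to address, after which both object kinds are dereferenced the same way.

```python
# Toy memory keyed by address (an assumption for illustration).
MEMORY = {
    0x1000: (-5).to_bytes(4, "little", signed=True),  # int x = -5;
    0x2000: (0x1000).to_bytes(8, "little"),           # int &r = x;
}


def read_referred_address(obj: dict) -> int:
    """Pointer-like read: return the address stored in the reference."""
    if obj["kind"] == "value":
        # Value object: the referred-to address is stored directly.
        return obj["value"]
    # Reference object: obj["address"] is where the reference itself lives,
    # so read the stored address from memory first.
    return int.from_bytes(MEMORY[obj["address"]], "little")


def read_signed(obj: dict) -> int:
    """Transparent read: dereference and return the referred-to int."""
    target = read_referred_address(obj)
    return int.from_bytes(MEMORY[target], "little", signed=True)


r_as_value = {"kind": "value", "value": 0x1000}
r_as_reference = {"kind": "reference", "address": 0x2000}
```

With this split, the transparent read is a thin layer over the pointer-like one, which is one argument for exposing both regardless of which behavior the existing functions get.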

The remaining operations on references are mostly built on top of these fundamental operations and should behave like in C++:

  • Arithmetic operators, bitwise operators, casts, comparisons, member accesses (foo.bar), sizeof, etc. should transparently dereference the reference.
  • drgn_object_address_of() (&foo) should return a pointer whose value is the referred-to address.
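
The intended behavior can be modeled in a few lines. ToyRef below is a hypothetical stand-in for an object of type int &, not drgn's Object class, and memory is simplified to a dict from address to integer. Operators go through the referred-to value, while address_of() yields the referred-to address rather than the address of the reference itself.

```python
class ToyRef:
    """Toy stand-in for an object of type `int &` (hypothetical)."""

    def __init__(self, memory: dict, target_address: int):
        self._memory = memory
        self._target = target_address  # the referred-to address

    def _deref(self) -> int:
        return self._memory[self._target]

    def __int__(self) -> int:
        return self._deref()

    def __add__(self, other) -> int:
        # Arithmetic operates on the referred-to value.
        return self._deref() + int(other)

    def __eq__(self, other) -> bool:
        # Comparisons operate on the referred-to value.
        return self._deref() == int(other)

    def address_of(self) -> int:
        # Like &foo in C++: the address of the referent.
        return self._target


memory = {0x1000: 10}        # int x = 10; at hypothetical address 0x1000
r = ToyRef(memory, 0x1000)   # int &r = x;
total = r + 5
memory[0x1000] = 11          # x = 11; the reference observes the change
```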

These operators are mostly defined in libdrgn/language_c.c and libdrgn/object.c.

There are a couple of extra operations that we may want to support:

  • "Dereferencing" a reference, i.e., going from an object with reference type (foo&) to the referred-to object (foo).
  • Getting the referred-to address of a reference. This isn't crucial since the same information is available via drgn_object_address_of(), but it may be a useful shortcut/optimization.

The hard part for both of these is defining and naming the API.

Pretty-Printing

drgn_format_type_name() and drgn_format_type() need to be implemented for reference types, and drgn_format_object() needs to be implemented for objects with reference type. We should also consider references inside of structs/classes, typedefs of references, etc.
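
For type names, a first approximation is to treat & as one more declarator symbol next to *. The sketch below assumes that int & and char *& are the desired spellings (mirroring how pointer type names are written) and uses a hypothetical nested-tuple representation of types. It deliberately ignores arrays and functions, where a reference needs parentheses, e.g. int (&)[3].

```python
SYMBOLS = {"pointer": "*", "reference": "&"}


def format_type_name(t) -> str:
    """Format a toy type: either a base name or a (kind, inner) tuple."""
    if isinstance(t, str):
        return t
    kind, inner = t
    inner_name = format_type_name(inner)
    # One space after the base name, none between declarator symbols.
    sep = "" if inner_name.endswith(("*", "&")) else " "
    return f"{inner_name}{sep}{SYMBOLS[kind]}"


int_ref_name = format_type_name(("reference", "int"))
ptr_ref_name = format_type_name(("reference", ("pointer", "char")))
```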

osandov added the help wanted label on Jan 25, 2022