Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix debug info for substituted methods #2885

Closed
wants to merge 4 commits into from

Conversation

zakkak
Copy link
Collaborator

@zakkak zakkak commented Sep 30, 2020

@zakkak
Copy link
Collaborator Author

zakkak commented Sep 30, 2020

@adinn can you please review?

@adinn
Copy link
Collaborator

adinn commented Oct 1, 2020

I am not clear what problem this change is fixing? Can you provide an example?

@adinn
Copy link
Collaborator

adinn commented Oct 1, 2020

Ah, ok, I spotted the issue in the mandrel repo.

@adinn
Copy link
Collaborator

adinn commented Oct 1, 2020

This looks to be correct, Foivos. It clearly gets the right declaring class and, hence, source file for the method in question. However, I'm concerned that this same change might need working through elsewhere in a consistent fashion once we introduce type info and embed method info in the class type. Indeed, I'm not sure yet whether there is a consistent solution here.

If we have an original class Foo with a substitution for method m that replaces it with method n of class Bar then this is going to correctly locate the source code for n in Bar.java. However, when we locate the method n with its 'owning' class do we install it under class Foo or under class Bar? Indeed, if Bar is merely a source of substituted methods for class Foo do we even have a record of class Bar in the debug info? (it makes sense only to have one such type -- this needs investigating further by looking up the original and substitute types (Foo and Bar) with the types present in the suite of types reported as being compiled into the image).

n.b. we can specify a method's file differently to that of its owner class but at present a FioleEntry is owned by a ClassEntry and methods added to the ClassEntry inherit their file form it. If we have to associate method n with class Bar in order to inherit file Bar.java then we still need to be able to embed it in the class info for class Foo. Contrariwise, if we associate it with class Foo then it won't get the right file. So, there i smor eot do here.

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 1, 2020

If we have an original class Foo with a substitution for method m that replaces it with method n of class Bar then this is going to correctly locate the source code for n in Bar.java. However, when we locate the method n with its 'owning' class do we install it under class Foo or under class Bar? Indeed, if Bar is merely a source of substituted methods for class Foo do we even have a record of class Bar in the debug info? (it makes sense only to have one such type -- this needs investigating further by looking up the original and substitute types (Foo and Bar) with the types present in the suite of types reported as being compiled into the image).

I would like to see both types (Foo and Bar) in the debug info (and that's what this PR achieves). I believe we should treat substitutions as "different" methods. So if you invoke Foo.m and the code actually runs Bar.n then gdb should tell you that you are actually invoking Bar.n (again what this PR achieves).

If we want Foo.m to appear somewhere in the info (to show how we got to Bar.n although we didn't explicitly invoke it) I suggest adding it as an inlined caller of Bar.n. So instead of Main->Bar.n one would get Main->Foo.m->Bar.n (that is after #2880 makes it in).

n.b. we can specify a method's file differently to that of its owner class but at present a FileEntry is owned by a ClassEntry and methods added to the ClassEntry inherit their file form it. If we have to associate method n with class Bar in order to inherit file Bar.java then we still need to be able to embed it in the class info for class Foo. Contrariwise, if we associate it with class Foo then it won't get the right file. So, there i some more to do here.

FWIW in #2880 I handle this by adding DW_AT_decl_file in DW_TAG_subprogram DIEs (see adda6c4#diff-70c027a109cff44ead87ce8510dc76e1R185-R190). I am still not convinced why we need it here though :)

@adinn
Copy link
Collaborator

adinn commented Oct 1, 2020

I would like to see both types (Foo and Bar) in the debug info (and that's what this PR achieves). I believe we should treat substitutions as "different" methods. So if you invoke Foo.m and the code actually runs Bar.n then gdb should tell you that you are actually invoking Bar.n (again what this PR achieves).

Sorry but that's not what this PR achieves. All we have at present is debug info that identifies all compiled methods. That info only tangentially refers to classes by embedding the class name in the full method name. There is no actual description of the class (Foo or Bar) irrespective of whether your patch is included.

The problem that lies ahead is how to populate the debug info with details of the classes that are actually compiled into the image. The substitution mechanism means we can have a wholesale replacement Bar for an original class Foo or we can have a hybrid of Foo and Bar containing members of both (whether fields or methods). However, we will not have both Foo and Bar.

In order to have a coherent data model for runtime instances the info model needs to provide a description of the replacement/hybrid class identifying the fields it owns and inherits. It also needs to associate compiled methods with the class class. In doing so it may well be able to associate these fields and methods with the alternative source file definitions form which they they originate but it cannot pretend that the instances in question belong to two different classes.

So, your patch fixes a small piece of a larger problem -- i.e. how to find the right source file for a method member. However, this can only be used when we resolve the bigger problem.

n.b. pretending that a substituted method is inlined might be one way to do that but that option needs thinking about. It's certainly not going to work for substituted fields.

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 1, 2020

Uh now I see :) (having only the current implementation in mind I map types to files, which is wrong apparently)
Thanks for the explanation @adinn , you are clearly are ahead of me in respect to types and DWARF!

In order to argue further on this I would like to see how you implement the types, so I will have to wait for that PR.
In the meantime do you think we should put this PR on hold as well? or should we proceed with this and figure out how to solve the issue with the types when your PR is ready?

@adinn
Copy link
Collaborator

adinn commented Oct 2, 2020

I think we could perhaps go ahead with this and then rework it in my up and coming version that models types. As the debug info model currently stands I think this is a valid correction apart from one detail that I still need to check with you. Does this still preserve the original class name for the substituted method throughout the info model? I think it does but just need to be sure.

To clarify the question, assume method Foo.m() gets substituted with Bar.n(). Your change will ensure that we use file Bar.java instead of Foo.java when looking up the source. However, we still need to associate the compiled code for this method with class Foo and label it as Foo.m(). More fully, the info model needs to populate all remaining info elements -- other than source file+line details -- using the original method m owned by Foo rather than the target substitution n owned by Bar.

One reason we need to treat the method as being attached to class Foo is that this ensures that we group the compiled method with other methods from the 'same' owning class. That is important for ordering the info subroutine records by increasing address and ensuring per-class aranges info is ordered monotonically. The same preference for Foo over Bar will be needed later to ensure that the original class is included in the model of the class base. That is important for ensuring that e.g. class name references in method signatures, field value types, super or extends links resolve correctly.

Basically, when it comes to naming I believe we have to use Foo not Bar for everything other than source file lookups because, source files and line numbers excepted, everything connected to Foo references Foo not Bar.

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 2, 2020

Judging by the logs the class is consistently (for both primary ranges and sub ranges) set to the original, while the source file + line is pointing to the substituted method:

      [0x000b1644] primary range [0x00515d80, 0x00515e24] java.lang.System::identityHashCode(java.lang.Object):294
      [0x000b1644] Set File Name to entry 18 in the File Name Table (JavaLangSubstitutions.java)
      [0x000b1646] Set basic block
      [0x000b1647] Extended opcode 2: Set Address to 0x00515d80
      [0x000b1652] Advance Line by 293 to 294
      [0x000b1655] Copy 1
      [0x000b1656] sub range [0x00515d8e, 0x00515d97] java.lang.System::identityHashCode:294
      [0x000b1656] Special Opcode 201: advance Address by 14 to 0x00515d8e and Line by 0 to 294
      [0x000b1657] sub range [0x00515d97, 0x00515d97] java.lang.System::identityHashCode:295

Additionally, when it comes to DIEs, there is no compilation unit for JavaLangSubstitutions.java

@adinn
Copy link
Collaborator

adinn commented Oct 2, 2020

Ok, so I think this is ok to push. Let me just review the changes one by one to be sure.

Copy link
Collaborator

@adinn adinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks fine to me.

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 2, 2020

Thank you Andrew!

@olpaw could you please be the next to review this? :)

@adinn
Copy link
Collaborator

adinn commented Oct 2, 2020

@olpaw This change corrects the file lookup for susbtituted methods to use the file defining the target class. Can you help integrate this/ n.b. the Travis failure for jdk8 is a Linz download error rather than an error in the patch.

Copy link
Member

@olpaw olpaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is fine. But the test needs adjustments.

substratevm/mx.substratevm/testhello.py Outdated Show resolved Hide resolved
@@ -334,6 +324,20 @@ private void computeFullFilePath() {
}
}

private ResolvedJavaType getDeclaringClass() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. Makes sense!

@olpaw
Copy link
Member

olpaw commented Oct 2, 2020

@olpaw This change corrects the file lookup for susbtituted methods to use the file defining the target class. Can you help integrate this/ n.b. the Travis failure for jdk8 is a Linz download error rather than an error in the patch.

Yes. Looks like this was bad timing and lafo.ssw.uni-linz.ac.at was down (or got rebooted) while the gate was running.

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 7, 2020

@adinn I ended up adding CUs for the substitution classes and DIEs for the substitution methods. The class name in the extended method name renames the original:

      [0x00080b53] primary range [0x004c6a80, 0x004c6acb] hello.Hello.DefaultGreeter::greet:37
      [0x00080b53] Set File Name to entry 1 in the File Name Table (Target_hello_Hello_DefaultGreeter.java)
      [0x00080b55] Set basic block
      [0x00080b56] Extended opcode 2: Set Address to 0x004c6a80
      [0x00080b61] Advance Line by 36 to 37
      [0x00080b63] Copy 1
      [0x00080b64] sub range [0x004c6a8e, 0x004c6a99] hello.Hello.DefaultGreeter::greet:37
      [0x00080b64] Special Opcode 201: advance Address by 14 to 0x004c6a8e and Line by 0 to 37
      [0x00080b65] sub range [0x004c6a99, 0x004c6aa4] hello.Hello.DefaultGreeter::greet:37
      [0x00080b65] Special Opcode 159: advance Address by 11 to 0x004c6a99 and Line by 0 to 37
      [0x00080b66] sub range [0x004c6ab5, 0x004c6ab5] hello.Hello.DefaultGreeter::greet:39
      [0x00080b66] Advance PC by constant 17 to 0x004c6aaa
      [0x00080b67] Special Opcode 161: advance Address by 11 to 0x004c6ab5 and Line by 2 to 39
      [0x00080b68] sub range [0x004c6ab5, 0x004c6aba] hello.Hello.DefaultGreeter::greet:39
      [0x00080b68] sub range [0x004c6ac0, 0x004c6ac5] hello.Hello.DefaultGreeter::greet:39
      [0x00080b68] Special Opcode 159: advance Address by 11 to 0x004c6ac0 and Line by 0 to 39
      [0x00080b69] sub range [0x004c6ac5, 0x004c6ac5] hello.Hello.DefaultGreeter::greet:39
      [0x00080b69] Special Opcode 75: advance Address by 5 to 0x004c6ac5 and Line by 0 to 39

and the only DIE we get is that of the substitution method, the original method doesn't appear in debug_info:

  Compilation Unit @ offset 0x9104:
   Length:        0x3c (32-bit)
   Version:       2
   Abbrev Offset: 0x0
   Pointer Size:  8
<0><910f>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <9110>   DW_AT_language    : 11     (Java)
    <9111>   DW_AT_name        : (indirect string, offset: 0x7175): Target_hello_Hello_DefaultGreeter.java
    <9115>   DW_AT_comp_dir    : (indirect string, offset: 0x25a70): sources
    <9119>   DW_AT_low_pc      : 0x4c2a80
    <9121>   DW_AT_high_pc     : 0x4c2acb
    <9129>   DW_AT_stmt_list   : 0x80b06
 <1><912d>: Abbrev Number: 3 (DW_TAG_subprogram)
    <912e>   DW_AT_name        : (indirect string, offset: 0x850bc): hello.Hello.DefaultGreeter::greet
    <9132>   DW_AT_low_pc      : 0x4c2a80
    <913a>   DW_AT_high_pc     : 0x4c2acb
    <9142>   DW_AT_external    : 1

WDYT? Is this still OK or should we wait for types?

Edit: Mistaken hello.Hello.NamedGreeter::greet with hello.Hello.DefaultGreeter::greet

@zakkak zakkak force-pushed the substitutees-debug branch 2 times, most recently from a97b6c6 to 61b403b Compare October 7, 2020 00:36
@zakkak
Copy link
Collaborator Author

zakkak commented Oct 7, 2020

FYI I have added comments in the commits that might help in the review.

@adinn
Copy link
Collaborator

adinn commented Oct 7, 2020

WDYT? Is this still OK or should we wait for types?

I don't think this is the right way to go. In the expanded info model which includes type info method info (subprograms) has to be embedded within an owning class type. That's following the layout used by C++ code which we really need to do as closely as possible in order to ensure that gdb understands it. The class layout is itself embedded in an enclosing CU which, unlike with C++, is 1-1 with the class.

So, we will have a CU and associated class info record for class Hello in which we can place the method info you are generating but there will not be a CU or class info record for class Target_hello_Hello_DefaultGreeter. Firstly, that class does not actually appear in the compiled program (i.e. it has no instances). Secondly, even if we created a dummy CU for that class and pretended it existed the substituted method would end up being associated with the dummy class rather than being viewed as a method of the class onto which it has been grafted by the substitution process.

A similar issue arises with substituted fields and makes the problem more visible. If we located them in the 'dummy' class then instances of the original would have a 'blind' region where no field was defined while at the same time the field would be defined in a class that had no instances and had gaps where the unsubstituted fields were meant to sit.

n.b. Also even with the current model there is a problem with relocating substituted methods in a separate CU. At present each info section CU specifies a low and hi address range and all CUs are expected (by both the DWARF spec and gdb) to be listed in ascending address range order with no overlaps. The aranges section is also supposed to list those ranges in ascending order and map them to a CU. Your current change is pulling one or more methods out of a CU and listing them in an adjacent CU. You have been lucky in this case that the order constraint has not been broken. However, in general that will not be the case.

What this means is that you need to insert the method info record for the substituted method into the original class's CU even though that CU is associated with the file of the original class. The change you originally proposed will ensure that file and line numbers pulled out of the debug_line section reference the correct file and, hence, that gdb visits the correct lines when stepping thorugh the method code.

Now, you /could/ also update the method info to record the association of the susbtituted method with the alternative file. You can achieve that by adding a DW_AT_decl_file element to the method info (that is one of the things that I do in the updated info model). The value would be the index of the method's file in the debug_line section file table for the CU. You can obtain that from the class entry by looking up the local file index of the method's file entry. However, there is no great benefit to doing that just now as it won't materially improve the debugging experience and that fix will be arriving soon enough in my update.

So, I recommend you go back to the previous version of this fix which adjusted computation of the filename and cachedir for the method.

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 7, 2020

TLDR: I think that maintaining the current "approach" but skipping the generation of DIEs (and thus CUs) for substitution methods would do the trick and adding DW_AT_decl_file would be nice to have.

n.b. Also even with the current model there is a problem with relocating substituted methods in a separate CU. At present each info section CU specifies a low and hi address range and all CUs are expected (by both the DWARF spec and gdb) to be listed in ascending address range order with no overlaps. The aranges section is also supposed to list those ranges in ascending order and map them to a CU. Your current change is pulling one or more methods out of a CU and listing them in an adjacent CU. You have been lucky in this case that the order constraint has not been broken. However, in general that will not be the case.

I don't think this is luck. AFAIU GraalVM at compilation treats the substitution methods as regular methods of the substitution class so the ranges of the substitution method are expected to be outside of the ranges of the original CU.

To demonstrate this I have added two extra methods in Hello.DefaultGreeter that I don't substitute:

diff --git a/substratevm/src/com.oracle.svm.test/src/hello/Hello.java b/substratevm/src/com.oracle.svm.test/src/hello/Hello.java
index 0482024ce1a..b5bae88401b 100644
--- a/substratevm/src/com.oracle.svm.test/src/hello/Hello.java
+++ b/substratevm/src/com.oracle.svm.test/src/hello/Hello.java
@@ -28,11 +28,16 @@ package hello;
 
 // Checkstyle: stop
 
+import com.oracle.svm.core.annotate.NeverInline;
+
 public class Hello {
     public abstract static class Greeter {
         static Greeter greeter(String[] args) {
             if (args.length == 0) {
-                return new DefaultGreeter();
+                final DefaultGreeter defaultGreeter = new DefaultGreeter();
+                defaultGreeter.greetOne();
+                defaultGreeter.greetTwo();
+                return defaultGreeter;
             } else if (args.length == 1) {
                 return new NamedGreeter(args[0]);
             } else {
@@ -44,10 +49,21 @@ public class Hello {
     }
 
     public static class DefaultGreeter extends Greeter {
+        @NeverInline("testing purposes")
+        public void greetOne() {
+            System.out.println("Hello, One!");
+        }
+
         @Override
         public void greet() {
             System.out.println("Hello, world!");
         }
+
+        @NeverInline("testing purposes")
+        public void greetTwo() {
+            System.out.println("Hello, Two!");
+        }
+
     }
 
     public static class NamedGreeter extends Greeter {

and the corresponding debug_info segment looks like this:

  Compilation Unit @ offset 0x9104:
   Length:        0x3c (32-bit)
   Version:       2
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><910f>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <9110>   DW_AT_language    : 11     (Java)
    <9111>   DW_AT_name        : (indirect string, offset: 0x7175): Target_hello_Hello_DefaultGreeter.java
    <9115>   DW_AT_comp_dir    : (indirect string, offset: 0x25aba): sources
    <9119>   DW_AT_low_pc      : 0x4c2a80
    <9121>   DW_AT_high_pc     : 0x4c2acb
    <9129>   DW_AT_stmt_list   : 0x80db0
 <1><912d>: Abbrev Number: 3 (DW_TAG_subprogram)
    <912e>   DW_AT_name        : (indirect string, offset: 0x85106): hello.Hello.DefaultGreeter::greet
    <9132>   DW_AT_low_pc      : 0x4c2a80
    <913a>   DW_AT_high_pc     : 0x4c2acb
    <9142>   DW_AT_external    : 1
 <1><9143>: Abbrev Number: 0
  Compilation Unit @ offset 0x9144:
   Length:        0x52 (32-bit)
   Version:       2
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><914f>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <9150>   DW_AT_language    : 11     (Java)
    <9151>   DW_AT_name        : (indirect string, offset: 0x65d3c): Hello.java
    <9155>   DW_AT_comp_dir    : (indirect string, offset: 0x25aba): sources
    <9159>   DW_AT_low_pc      : 0x4c2b00
    <9161>   DW_AT_high_pc     : 0x4c2bcb
    <9169>   DW_AT_stmt_list   : 0x80e19
 <1><916d>: Abbrev Number: 3 (DW_TAG_subprogram)
    <916e>   DW_AT_name        : (indirect string, offset: 0x2597a): hello.Hello.DefaultGreeter::greetOne
    <9172>   DW_AT_low_pc      : 0x4c2b00
    <917a>   DW_AT_high_pc     : 0x4c2b4b
    <9182>   DW_AT_external    : 1
 <1><9183>: Abbrev Number: 3 (DW_TAG_subprogram)
    <9184>   DW_AT_name        : (indirect string, offset: 0xfb64): hello.Hello.DefaultGreeter::greetTwo
    <9188>   DW_AT_low_pc      : 0x4c2b80
    <9190>   DW_AT_high_pc     : 0x4c2bcb
    <9198>   DW_AT_external    : 1
 <1><9199>: Abbrev Number: 0

That said, I would expect adding the substitution method in the original CU to break the ranges ordering and not the other way around.

It's true that at the moment there is no benefit of creating that extra CU, but it helps solve the issue described on the end of this comment without breaking existing assumptions of the design.

What this means is that you need to insert the method info record for the substituted method into the original class's CU even though that CU is associated with the file of the original class. The change you originally proposed will ensure that file and line numbers pulled out of the debug_line section reference the correct file and, hence, that gdb visits the correct lines when stepping through the method code.

Now, you /could/ also update the method info to record the association of the susbtituted method with the alternative file. You can achieve that by adding a DW_AT_decl_file element to the method info (that is one of the things that I do in the updated info model). The value would be the index of the method's file in the debug_line section file table for the CU. You can obtain that from the class entry by looking up the local file index of the method's file entry. However, there is no great benefit to doing that just now as it won't materially improve the debugging experience and that fix will be arriving soon enough in my update.

Note that there is also DW_AT_abstract_origin that can be used to create partial subprogram DIEs and link them with the full DIE in another CU.
This could be an overkill though since DW_AT_decl_file can give us all we need.

So, I recommend you go back to the previous version of this fix which adjusted computation of the filename and cachedir for the method.

There were at least two problems with the previous approach.

  1. primary ranges were not always getting the correct source file path because https://github.com/oracle/graal/pull/2885/files#diff-069332e1ae845021510db48ab8f8c337L101-L108 was returning the "original" type, and that was later used to retrieve the file

  2. Even after fixing 1 and getting the correct file path for primaryRanges there was the following issue.

To set DW_AT_name of a CU we rely on the corresponding ClassEntry's fileEntry field.
ClassEntries are generated "lazely" while traversing primary ranges.
This is meant to ensure that each primaryRange can be associated with an existing ClassEntry.
Note that the fileEntry is passed to the ClassEntry at instantiation and originates from the corresponding primaryRange using the primary range's filename .

The above process assumes that primaryRanges corresponding to methods of the same Class are defined in the same file. However this is not true for substituted methods. As a result in the previous approach (after fixing 1) I ended up having primaryRanges mapping to the same ClassEntry but needing different fileEntries which was triggering this assertion. I then tried working around this but nothing I came up with seemed elegant. The toughest part was figuring which fileEntry of which primary range should be used to set DW_AT_name in the corresponding CU.

HTH

@zakkak
Copy link
Collaborator Author

zakkak commented Oct 7, 2020

@olpaw we are putting this on hold till @adinn's work on adding types is ready.

Thanks for your help so far :)

@zakkak zakkak marked this pull request as draft October 7, 2020 12:14
@olpaw
Copy link
Member

olpaw commented Oct 7, 2020

@olpaw we are putting this on hold till @adinn's work on adding types is ready.

Thanks for your help so far :)

No problem. Ping me if this becomes relevant again.

@zakkak
Copy link
Collaborator Author

zakkak commented May 17, 2021

@adinn I have updated this PR could you please have a look?

In the current state the substitute methods are being added to the CU of the original Class, and there is no CU for the substitute class (unless it defines non substitute methods as well).

@adinn
Copy link
Collaborator

adinn commented May 18, 2021

@adinn I have updated this PR could you please have a look?

In the current state the substitute methods are being added to the CU of the original Class, and there is no CU for the substitute class (unless it defines non substitute methods as well).

Yes, I know. That is being done deliberately and is not a bug. Which means that the changes you just pushed to 'correct' this are a mistake and will need to be removed.

We actually discussed this some while back while you were preparing this patch. The thing you need to recall is that substitutions are used to synthesize /one/ class in the target image from /two/ original source classes. The debuginfo must describe the types that exist in the final image. It can only model the one target class even though it may locate the elements of that class in different source files.

Let us take use an obvious example to explain this: java.lang.Class which is substituted by org.graal....DynamicHub. The two input classes provide /two/ possible names we can use to identify /one/ debuginfo class record. That one class has to own all the methods/fields of the synthesized class, whether they were provided by the original or by a substitution.

Suppose we were instead to introduce two class records. We might encode field name of type String in the debug info for java.lang.Class and encode field super of type Class (or, according to preference, DynamicHub) in the debug info for DynamicHub. Now suppose gdb is looking at some arbitrary Java instance and we ask it to dereference and print the Class/DynamicHub whose address is stored in the header.

If gdb were told this pointer is of type java.lang.Class * it would print field name but it would not know about field superHub. If instead we chose to use DynamicHub * to type this pointer then we would see the superHub field but not the name field. The same issue would arise if we were distribute methods to one or other class: a Class would have a getName() method but not a getSuperHub() method; a DynamicHub would be the other way round.

This need to provide only one class debuginfo record also means we have to make a choice about naming and use it consistently. We can choose to use name java.lang.Class or name org...DynamicHub for the class debuginfo record (n.b. the current code consistently employs the original name, not the substituted name). That choice also implies that gdb will use the qualifying class name for embedded fields and methods. So, gdb will identify method getSuperHub() added by class DynamicHub as java.lang.Class::getSuperHub(). We don't get a choice about that.

Note that all this is only to do with how the class, field or method debuginfo records are linked to each other and, indirectly, how they end up being named. We still get to choose independently how we associate the class and its elements with source files. So, can choose to associate our class debuginfo record with file Class.java or file DynamicHub.java (if we only had line info for the class declaration in the original bytecode we could even point gdb at one or other actual class definition). Not surprisingly, given the naming choice the file is Class.java.

Similarly, we can associate the methods and fields with either file (only once again bytecode does not actually give line numbers for declarations so we cannot place them precisely). We can also certainly ensure that the line number used to step through the code are for the source file which contains the code that actually got compiled. In this case we associate original fields/methods with Class.java and substituted fields/methods with DynamicHub.java because that is where the actual declaration and init/code for them resides.

The current implementation very carefully arranges to do exactly this file association described above -- that's what the wantOriginal flag in those lookup methods is there for.

So, please remove the last four commits which added the substituted classes and relocated method entries to them. The code should be correct without these changes. If you think there is still something wrong in the model and/or the association of model elements with source files please explain it to me first before making any other changes.

Once you have reverted back to where we were last night I will proceed to review the rest of this change as planned.

@zakkak
Copy link
Collaborator Author

zakkak commented May 18, 2021

Hi @adinn I am afraid there is some misunderstanding so let me try to clarify things.

@adinn I have updated this PR could you please have a look?
In the current state the substitute methods are being added to the CU of the original Class, and there is no CU for the substitute class (unless it defines non substitute methods as well).

Yes, I know. That is being done deliberately and is not a bug. Which means that the changes you just pushed to 'correct' this are a mistake and will need to be removed.

It appears that my choice of words were wrong. By "the current state" I meant "the current state of this PR" not "the current state of master branch". AFAIU the changes in this PR do not 'correct' this, they just make sure to associate the method DIE in the CU of the original Class with the file path of the source file of the substitute Class.

...

Similarly, we can associate the methods and fields with either file (only once again bytecode does not actually give line numbers for declarations so we cannot place them precisely). We can also certainly ensure that the line number used to step through the code are for the source file which contains the code that actually got compiled. In this case we associate original fields/methods with Class.java and substituted fields/methods with DynamicHub.java because that is where the actual declaration and init/code for them resides.

The current implementation very carefully arranges to do exactly this file association described above -- that's what the wantOriginal flag in those lookup methods is there for.

To my experience, it does not, or at least not always. In 9ba04e4 I add a case were the current implementation (the master branch) fails to provide the correct file path for a substitution.
I believe that this PR fixes this issue without breaking anything else.

So, please remove the last four commits which added the substituted classes and relocated method entries to them.

The last four commits do not add substituted classes or relocate (to other CUs) method entries, they just fix the FileEntry associations for substituted methods.

The code should be correct without these changes. If you think there is still something wrong in the model and/or the association of model elements with source files please explain it to me first before making any other changes.

Once you have reverted back to where we were last night I will proceed to review the rest of this change as planned.

The changes I made today (and the force-pushes) should only be cosmetic to make the CI happy, so I am not sure to which state you refer to by "last night". For clarity the changes that I applied today are the following:

$ git diff 2c440e8df91eede16fbee27afee34a94b36f2cd7
diff --git a/substratevm/mx.substratevm/testhello.py b/substratevm/mx.substratevm/testhello.py
index da63d4986f1..71703247fed 100644
--- a/substratevm/mx.substratevm/testhello.py
+++ b/substratevm/mx.substratevm/testhello.py
@@ -263,7 +263,7 @@ def test():
                           r"%s = 1000"%(wildcard_pattern))
         checker.check(exec_string, skip_fails=False)
 
-    # set a break point at PrintStream::println(String)
+    # set a break point at hello.Hello$DefaultGreeter::greet
     # expect "Breakpoint 2 at 0x[0-9a-f]+: file hello/Target_Hello_DefaultGreeter.java, line [0-9]+."
     exec_string = execute("break hello.Hello$DefaultGreeter::greet")
     rexp = r"Breakpoint 2 at %s: file hello/Target_hello_Hello_DefaultGreeter\.java, line %s\."%(address_pattern, digits_pattern)
diff --git a/substratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debugentry/MethodEntry.java b/substratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debugentry/MethodEntry.java
index 981bbd8f7e5..c7d93473c79 100644
--- a/substratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debugentry/MethodEntry.java
+++ b/substratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debugentry/MethodEntry.java
@@ -30,16 +30,18 @@ public class MethodEntry extends MemberEntry implements Comparable<MethodEntry>
     final TypeEntry[] paramTypes;
     final String[] paramNames;
     final boolean isDeoptTarget;
-    final boolean isSubstitutionMethod;
+    final boolean isSubstitution;
 
-    public MethodEntry(FileEntry fileEntry, String methodName, ClassEntry ownerType, TypeEntry valueType, TypeEntry[] paramTypes, String[] paramNames, int modifiers, boolean isDeoptTarget, boolean isSubstitutionMethod) {
+    public MethodEntry(FileEntry fileEntry, String methodName, ClassEntry ownerType, TypeEntry valueType,
+                    TypeEntry[] paramTypes, String[] paramNames, int modifiers, boolean isDeoptTarget,
+                    boolean isSubstitutionMethod) {
         super(fileEntry, methodName, ownerType, valueType, modifiers);
         assert ((paramTypes == null && paramNames == null) ||
                         (paramTypes != null && paramNames != null && paramTypes.length == paramNames.length));
         this.paramTypes = paramTypes;
         this.paramNames = paramNames;
         this.isDeoptTarget = isDeoptTarget;
-        this.isSubstitutionMethod = isSubstitutionMethod;
+        this.isSubstitution = isSubstitutionMethod;
     }
 
     public String methodName() {
@@ -102,8 +104,8 @@ public class MethodEntry extends MemberEntry implements Comparable<MethodEntry>
                 return paraComparison;
             }
         }
-        if (this.isSubstitutionMethod != isSubstitutionMethod) {
-            return this.isSubstitutionMethod ? 1 : -1;
+        if (this.isSubstitution != isSubstitutionMethod) {
+            return this.isSubstitution ? 1 : -1;
         }
         return 0;
     }
@@ -133,8 +135,8 @@ public class MethodEntry extends MemberEntry implements Comparable<MethodEntry>
                 return paramComparison;
             }
         }
-        if (this.isSubstitutionMethod != other.isSubstitutionMethod) {
-            return this.isSubstitutionMethod ? 1 : -1;
+        if (this.isSubstitution != other.isSubstitution) {
+            return this.isSubstitution ? 1 : -1;
         }
         return 0;
     }

Please let me know if there is still an issue.

@adinn
Copy link
Collaborator

adinn commented May 18, 2021

AFAIU the changes in this PR do not 'correct' this, they just make sure to associate the method DIE in the CU of the original Class with the file path of the source file of the substitute Class.

Ok, apologies for misreading your proposed change.

The current implementation very carefully arranges to do exactly this file association described above -- that's what the wantOriginal flag in those lookup methods is there for.

To my experience, it does not, or at least not always. In 9ba04e4 I add a case were the current implementation (the master branch) fails to provide the correct file path for a substitution.
I believe that this PR fixes this issue without breaking anything else.

Ok, let me have a look at what is going wrong with this example.

@adinn
Copy link
Collaborator

adinn commented May 19, 2021

Hi Foivos,

I debugged the current code using your substitution based Hello example and found 3 problems that were causing it to employ the wrong file when presented with a substituted class/method.

  • a change to how SubstitutionMethod/Field report their declaring class meant they return the original class when trying to identify the source file (wantOriginal = false) where previously they returned the annotated class
  • a similar lookup error in the debug code info generator meant it used the original class of a SubstitutionMethod to compute the file name for a range when it ought use the annotated class
  • an error in the line info generator where it uses the class's file as the start file of a primary range rather than the file of the range/its underlying method

I created a separate issue for this error #3420 I also provided a PR #3421 to fix the problem that includes two commits:

  1. your changes the Hello example and python scripts to test substitutions under mx debuginfotest
  2. the debuginfo changes that ensure this test passes

@zakkak
Copy link
Collaborator Author

zakkak commented May 19, 2021

Superseded by #3421

@zakkak zakkak closed this May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants