Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GSYM] Include end_sequence debug_line rows in Dwarf transform #90535

Open
wants to merge 1 commit into
base: users/avillega/main.gsym-include-end_sequence-debug_line-rows-in-dwarf-transform
Choose a base branch
from

Conversation

avillega
Copy link
Contributor

Work around for #46494.
This change adds debug_line end_sequence rows when converting
the function line tables. By including the end_sequence
it is possible to handle some edge cases like icf optimizations.

Created using spr 1.3.5
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 30, 2024

@llvm/pr-subscribers-debuginfo

Author: Andres Villegas (avillega)

Changes

Work around for #46494.
This change adds debug_line end_sequence rows when converting
the function line tables. By including the end_sequence
it is possible to handle some edge cases like icf optimizations.


Patch is 23.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90535.diff

2 Files Affected:

  • (modified) llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp (+21-12)
  • (added) llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml (+564)
diff --git a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
index ff6b560d11726b..d12dce510e663e 100644
--- a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
+++ b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
@@ -321,7 +321,10 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
       StartAddress, object::SectionedAddress::UndefSection};
 
 
-  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize, RowVector)) {
+  // end_sequence markers can be located at RangeSize position,
+  // lookupAddressRange search up to RangeSize not inclusive, to include
+  // end_sequence markers it is necessary to lookup until RangeSize+1
+  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize + 1, RowVector)) {
     // If we have a DW_TAG_subprogram but no line entries, fall back to using
     // the DW_AT_decl_file an d DW_AT_decl_line if we have both attributes.
     std::string FilePath = Die.getDeclFile(
@@ -354,6 +357,18 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   for (uint32_t RowIndex : RowVector) {
     // Take file number and line/column from the row.
     const DWARFDebugLine::Row &Row = CUI.LineTable->Rows[RowIndex];
+
+    // TODO(avillega): With this conditional, functions folded by `icf`
+    // optimizations will only include 1 of all the folded functions. There is
+    // not a clear path forward to have the information of all folded functions
+    // in gsym.
+    if (Row.EndSequence) {
+      // End sequence markers are included for the last address
+      // in a function or the last contiguos address in a sequence.
+      break;
+    }
+
+
     std::optional<uint32_t> OptFileIdx =
         CUI.DWARFToGSYMFileIndex(Gsym, Row.File);
     if (!OptFileIdx) {
@@ -411,7 +426,7 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
       else
         Out.Report("Non-monotonically increasing addresses",
                    [&](raw_ostream &OS) {
-                     OS << "error: line table has addresses that do not "
+                     OS << "warning: line table has addresses that do not "
                         << "monotonically increase:\n";
                      for (uint32_t RowIndex2 : RowVector)
                        CUI.LineTable->Rows[RowIndex2].dump(OS);
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
     auto LastLE = FI.OptLineTable->last();
     if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
         continue;
+
     // Only push a row if it isn't an end sequence. End sequence markers are
     // included for the last address in a function or the last contiguous
     // address in a sequence.
-    if (Row.EndSequence) {
-      // End sequence means that the next line entry could have a lower address
-      // that the previous entries. So we clear the previous row so we don't
-      // trigger the line table error about address that do not monotonically
-      // increase.
-      PrevRow = DWARFDebugLine::Row();
-    } else {
-      FI.OptLineTable->push(LE);
-      PrevRow = Row;
-    }
+    FI.OptLineTable->push(LE);
+    PrevRow = Row;
+
   }
   // If not line table rows were added, clear the line table so we don't encode
   // on in the GSYM file.
diff --git a/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml
new file mode 100644
index 00000000000000..0e1e507179057b
--- /dev/null
+++ b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml
@@ -0,0 +1,564 @@
+## Test loading an ELF file with DWARF with icf (identical code folding) 
+## optimizations.
+## First we make the ELF file from yaml,
+## then we convert the ELF file to GSYM, then we do lookups on the newly
+## created GSYM, and finally we dump the entire GSYM.
+##
+## The elf file corresponds to this c program:
+## int f() {
+##   return 1;
+## }
+## 
+## int g() {
+##   return 1;
+## }
+## 
+## int main() {
+##   f();
+##   g();
+##   return 0;
+## }
+
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-gsymutil --convert %t --out-file=%t.gsym 2>&1 | FileCheck %s --check-prefix=CONVERT --dump-input=always
+
+# CONVERT-NOT: warning: line table has addresses that do not monotonically increase
+# CONVERT: Input file: {{.*\.yaml\.tmp}}
+# CONVERT: Output file (x86_64): {{.*\.yaml\.tmp\.gsym}}
+# CONVERT: Loaded 2 functions from DWARF.
+# CONVERT: Loaded 8 functions from symbol table.
+# CONVERT: Pruned 3 functions, ended with 7 total
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_DYN
+  Machine:         EM_X86_64
+  Entry:           0x1680
+ProgramHeaders:
+  - Type:            PT_PHDR
+    Flags:           [ PF_R ]
+    VAddr:           0x40
+    Align:           0x8
+    Offset:          0x40
+  - Type:            PT_INTERP
+    Flags:           [ PF_R ]
+    FirstSec:        .interp
+    LastSec:         .interp
+    VAddr:           0x2A8
+    Offset:          0x2A8
+  - Type:            PT_LOAD
+    Flags:           [ PF_R ]
+    FirstSec:        .interp
+    LastSec:         .eh_frame
+    Align:           0x1000
+    Offset:          0x0
+  - Type:            PT_LOAD
+    Flags:           [ PF_X, PF_R ]
+    FirstSec:        .text
+    LastSec:         .plt
+    VAddr:           0x1680
+    Align:           0x1000
+    Offset:          0x680
+  - Type:            PT_LOAD
+    Flags:           [ PF_W, PF_R ]
+    FirstSec:        .init_array
+    LastSec:         .got
+    VAddr:           0x2810
+    Align:           0x1000
+    Offset:          0x810
+  - Type:            PT_LOAD
+    Flags:           [ PF_W, PF_R ]
+    FirstSec:        .data
+    LastSec:         .bss
+    VAddr:           0x39E8
+    Align:           0x1000
+    Offset:          0x9E8
+  - Type:            PT_DYNAMIC
+    Flags:           [ PF_W, PF_R ]
+    FirstSec:        .dynamic
+    LastSec:         .dynamic
+    VAddr:           0x2820
+    Align:           0x8
+    Offset:          0x820
+  - Type:            PT_GNU_RELRO
+    Flags:           [ PF_R ]
+    FirstSec:        .init_array
+    LastSec:         .got
+    VAddr:           0x2810
+    Offset:          0x810
+  - Type:            PT_GNU_EH_FRAME
+    Flags:           [ PF_R ]
+    FirstSec:        .eh_frame_hdr
+    LastSec:         .eh_frame_hdr
+    VAddr:           0x57C
+    Align:           0x4
+    Offset:          0x57C
+  - Type:            PT_GNU_STACK
+    Flags:           [ PF_W, PF_R ]
+    Align:           0x0
+    Offset:          0x0
+  - Type:            PT_NOTE
+    Flags:           [ PF_R ]
+    FirstSec:        .note.ABI-tag
+    LastSec:         .note.gnu.build-id
+    VAddr:           0x2C4
+    Align:           0x4
+    Offset:          0x2C4
+Sections:
+  - Name:            .interp
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x2A8
+    AddressAlign:    0x1
+    Content:         2F6C696236342F6C642D6C696E75782D7838362D36342E736F2E3200
+  - Name:            .note.ABI-tag
+    Type:            SHT_NOTE
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x2C4
+    AddressAlign:    0x4
+    Notes:
+      - Name:            GNU
+        Desc:            '00000000030000000200000000000000'
+        Type:            NT_VERSION
+  - Name:            .note.gnu.build-id
+    Type:            SHT_NOTE
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x2E4
+    AddressAlign:    0x4
+    Notes:
+      - Name:            GNU
+        Desc:            849F3F19CCA98C8C
+        Type:            NT_PRPSINFO
+  - Name:            .dynsym
+    Type:            SHT_DYNSYM
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x300
+    Link:            .dynstr
+    AddressAlign:    0x8
+  - Name:            .gnu.version
+    Type:            SHT_GNU_versym
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x390
+    Link:            .dynsym
+    AddressAlign:    0x2
+    Entries:         [ 0, 2, 1, 1, 3, 1 ]
+  - Name:            .gnu.version_r
+    Type:            SHT_GNU_verneed
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x39C
+    Link:            .dynstr
+    AddressAlign:    0x4
+    Dependencies:
+      - Version:         1
+        File:            libc.so.6
+        Entries:
+          - Name:            GLIBC_2.2.5
+            Hash:            157882997
+            Flags:           0
+            Other:           3
+          - Name:            GLIBC_2.34
+            Hash:            110530996
+            Flags:           0
+            Other:           2
+  - Name:            .gnu.hash
+    Type:            SHT_GNU_HASH
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x3D0
+    Link:            .dynsym
+    AddressAlign:    0x8
+    Header:
+      SymNdx:          0x6
+      Shift2:          0x1A
+    BloomFilter:     [ 0x0 ]
+    HashBuckets:     [ 0x0 ]
+    HashValues:      [  ]
+  - Name:            .dynstr
+    Type:            SHT_STRTAB
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x3EC
+    AddressAlign:    0x1
+  - Name:            .rela.dyn
+    Type:            SHT_RELA
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x470
+    Link:            .dynsym
+    AddressAlign:    0x8
+    Relocations:
+      - Offset:          0x2810
+        Type:            R_X86_64_RELATIVE
+        Addend:          5808
+      - Offset:          0x2818
+        Type:            R_X86_64_RELATIVE
+        Addend:          5888
+      - Offset:          0x39F0
+        Type:            R_X86_64_RELATIVE
+        Addend:          14832
+      - Offset:          0x29C0
+        Symbol:          __libc_start_main
+        Type:            R_X86_64_GLOB_DAT
+      - Offset:          0x29C8
+        Symbol:          __gmon_start__
+        Type:            R_X86_64_GLOB_DAT
+      - Offset:          0x29D0
+        Symbol:          __register_frame_info
+        Type:            R_X86_64_GLOB_DAT
+      - Offset:          0x29D8
+        Symbol:          __cxa_finalize
+        Type:            R_X86_64_GLOB_DAT
+      - Offset:          0x29E0
+        Symbol:          __deregister_frame_info
+        Type:            R_X86_64_GLOB_DAT
+  - Name:            .rela.plt
+    Type:            SHT_RELA
+    Flags:           [ SHF_ALLOC, SHF_INFO_LINK ]
+    Address:         0x530
+    Link:            .dynsym
+    AddressAlign:    0x8
+    Info:            .got.plt
+    Relocations:
+      - Offset:          0x3A10
+        Symbol:          __register_frame_info
+        Type:            R_X86_64_JUMP_SLOT
+      - Offset:          0x3A18
+        Symbol:          __cxa_finalize
+        Type:            R_X86_64_JUMP_SLOT
+      - Offset:          0x3A20
+        Symbol:          __deregister_frame_info
+        Type:            R_X86_64_JUMP_SLOT
+  - Name:            .rodata
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_MERGE ]
+    Address:         0x578
+    AddressAlign:    0x4
+    EntSize:         0x4
+    Content:         '01000200'
+  - Name:            .eh_frame_hdr
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x57C
+    AddressAlign:    0x4
+    Content:         011B033B3000000005000000041100004C00000034110000780000008411000098000000F4110000B800000004120000D8000000
+  - Name:            .eh_frame
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x5B0
+    AddressAlign:    0x8
+    Content:         1400000000000000017A5200017810011B0C070890010710100000001C000000B010000022000000000000001400000000000000017A5200017810011B0C0708900100001C0000001C000000B41000004700000000410E108602430D0602420C070800001C0000003C000000E41000006100000000410E108602430D06025C0C070800001C0000005C000000341100000B00000000410E108602430D06460C07080000001C0000007C000000241100002100000000410E108602430D065C0C070800000000000000
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x1680
+    AddressAlign:    0x10
+    Content:         31ED4989D15E4889E24883E4F050544531C031C9488D3DE5000000FF151F130000F4CCCCCCCCCCCCCCCCCCCCCCCCCCCC554889E5F6056D230000010F8405000000E92F000000C6055B23000001488B05FC1200004885C00F8418000000E900000000488D3DC7EEFFFF488D3540230000E8EB0000005DC3660F1F840000000000554889E5F60565230000010F8405000000E949000000C6055323000001488B05B41200004885C00F8411000000E900000000488B3DB7220000E8B2000000488B059B1200004885C00F8411000000E900000000488D3D56EEFFFFE8A10000005DC3CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC554889E5B8010000005DC3CCCCCCCCCC554889E54883EC10C745FC00000000E8DCFFFFFFE8D7FFFFFF31C04883C4105DC3
+  - Name:            .init
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x17A4
+    AddressAlign:    0x4
+    Content:         4883EC08488B05191200004885C07402FFD04883C408C3
+  - Name:            .fini
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x17BC
+    AddressAlign:    0x4
+    Content:         4883EC084883C408C3
+  - Name:            .plt
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x17D0
+    AddressAlign:    0x10
+    Content:         FF352A220000FF252C2200000F1F4000FF252A2200006800000000E9E0FFFFFFFF25222200006801000000E9D0FFFFFFFF251A2200006802000000E9C0FFFFFF
+  - Name:            .init_array
+    Type:            SHT_INIT_ARRAY
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x2810
+    AddressAlign:    0x8
+    Content:         '0000000000000000'
+  - Name:            .fini_array
+    Type:            SHT_FINI_ARRAY
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x2818
+    AddressAlign:    0x8
+    Content:         '0000000000000000'
+  - Name:            .dynamic
+    Type:            SHT_DYNAMIC
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x2820
+    Link:            .dynstr
+    AddressAlign:    0x8
+    Entries:
+      - Tag:             DT_NEEDED
+        Value:           0x5F
+      - Tag:             DT_FLAGS_1
+        Value:           0x8000000
+      - Tag:             DT_DEBUG
+        Value:           0x0
+      - Tag:             DT_RELA
+        Value:           0x470
+      - Tag:             DT_RELASZ
+        Value:           0xC0
+      - Tag:             DT_RELAENT
+        Value:           0x18
+      - Tag:             DT_RELACOUNT
+        Value:           0x3
+      - Tag:             DT_JMPREL
+        Value:           0x530
+      - Tag:             DT_PLTRELSZ
+        Value:           0x48
+      - Tag:             DT_PLTGOT
+        Value:           0x39F8
+      - Tag:             DT_PLTREL
+        Value:           0x7
+      - Tag:             DT_SYMTAB
+        Value:           0x300
+      - Tag:             DT_SYMENT
+        Value:           0x18
+      - Tag:             DT_STRTAB
+        Value:           0x3EC
+      - Tag:             DT_STRSZ
+        Value:           0x80
+      - Tag:             DT_GNU_HASH
+        Value:           0x3D0
+      - Tag:             DT_INIT_ARRAY
+        Value:           0x2810
+      - Tag:             DT_INIT_ARRAYSZ
+        Value:           0x8
+      - Tag:             DT_FINI_ARRAY
+        Value:           0x2818
+      - Tag:             DT_FINI_ARRAYSZ
+        Value:           0x8
+      - Tag:             DT_INIT
+        Value:           0x17A4
+      - Tag:             DT_FINI
+        Value:           0x17BC
+      - Tag:             DT_VERSYM
+        Value:           0x390
+      - Tag:             DT_VERNEED
+        Value:           0x39C
+      - Tag:             DT_VERNEEDNUM
+        Value:           0x1
+      - Tag:             DT_NULL
+        Value:           0x0
+  - Name:            .got
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x29C0
+    AddressAlign:    0x8
+    Content:         '00000000000000000000000000000000000000000000000000000000000000000000000000000000'
+  - Name:            .data
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x39E8
+    AddressAlign:    0x8
+    Content:         '00000000000000000000000000000000'
+  - Name:            .got.plt
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x39F8
+    AddressAlign:    0x8
+    Content:         202800000000000000000000000000000000000000000000E617000000000000F6170000000000000618000000000000
+  - Name:            .bss
+    Type:            SHT_NOBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x3A28
+    AddressAlign:    0x8
+    Size:            0x49
+  - Name:            .comment
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_MERGE, SHF_STRINGS ]
+    AddressAlign:    0x1
+    EntSize:         0x1
+    Content:         4C696E6B65723A2046756368736961204C4C442031382E302E3000004675636873696120636C616E672076657273696F6E2031382E302E30303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303030303000
+  - Name:            .debug_abbrev
+    Type:            SHT_PROGBITS
+    AddressAlign:    0x1
+    Content:         011101252513050325721710171B2511015523731774170000022E00111B1206401803253A0B3B0B49133F19000003240003253E0B0B0B000000
+  - Name:            .debug_info
+    Type:            SHT_PROGBITS
+    AddressAlign:    0x1
+    Content:         59000000050001080000000001001D0001080000000000000002000000000000000000080000000C00000002000B00000001560300015800000002010B0000000156050005580000000202210000000156060009580000000304050400
+  - Name:            .debug_rnglists
+    Type:            SHT_PROGBITS
+    AddressAlign:    0x1
+    Content:         1600000005000800010000000400000003000B03010B03022100
+  - Name:            .debug_str_offsets
+    Type:            SHT_PROGBITS
+    AddressAlign:    0x1
+    Content:         20000000050000002F00000020000000000000002D00000029000000A200000024000000
+  - Name:            .debug_line
+    Type:            SHT_PROGBITS
+    AddressAlign:    0x1
+    Content:         8F0000000500080037000000010101FB0E0D00010101010000000100000101011F010000000003011F020F051E0120000000008DEC93A3F4F70159E4940B929722C023040000090270170000000000000105030A4B060B580202000101040000090270170000000000001605030A4B060B580202000101040000090280170000000000001A05030AE55959060B2E0206000101
+  - Name:            .debug_line_str
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_MERGE, SHF_STRINGS ]
+    AddressAlign:    0x1
+    EntSize:         0x1
+    Content:         2F7573722F6C6F63616C2F6C6F63616C6C2F686F6D652F303030303030303000742E6300
+Symbols:
+  - Name:            __abi_tag
+    Type:            STT_OBJECT
+    Section:         .note.ABI-tag
+    Value:           0x2C4
+    Size:            0x20
+  - Name:            crtbegin.c
+    Type:            STT_FILE
+    Index:           SHN_ABS
+  - Name:            __do_init
+    Type:            STT_FUNC
+    Section:         .text
+    Value:           0x16B0
+    Size:            0x47
+  - Name:            __do_init.__initialized
+    Type:            STT_OBJECT
+    Section:         .bss
+    Value:           0x3A28
+    Size:            0x1
+  - Name:            __EH_FRAME_LIST__
+    Type:            STT_OBJECT
+    Section:         .eh_frame
+    Value:           0x5B0
+  - Name:            __do_init.__object
+    Type:            STT_OBJECT
+    Section:         .bss
+    Value:           0x3A30
+    Size:            0x40
+  - Name:            __do_fini
+    Type:            STT_FUNC
+    Section:         .text
+    Value:           0x1700
+    Size:            0x61
+  - Name:            __do_fini.__finalized
+    Type:            STT_OBJECT
+    Section:         .bss
+    Value:           0x3A70
+    Size:            0x1
+  - Name:            __init
+    Type:            STT_OBJECT
+    Section:         .init_array
+    Value:           0x2810
+    Size:            0x8
+  - Name:            __fini
+    Type:            STT_OBJECT
+    Section:         .fini_array
+    Value:           0x2818
+    Size:            0x8
+  - Name:            __d...
[truncated]

@avillega
Copy link
Contributor Author

I think I can accomplish the same behaviour exposed in #89703 which requires a change to the DWARF apis without actually changing them.

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this makes sense after checking the code for DWARFDebugLine::LineTable::lookupAddressRangeImpl(...). If a line table has multiple sequences that contain an address, it will find the first sequence that contains the address and then return the rows for the function.

What is the effect of this change on the test case? Does it change the final line table in the GSYM file?

Comment on lines -324 to +327
if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize, RowVector)) {
// end_sequence markers can be located at RangeSize position,
// lookupAddressRange search up to RangeSize not inclusive, to include
// end_sequence markers it is necessary to lookup until RangeSize+1
if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize + 1, RowVector)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a line table has two functions that share the same address range within the same line table, this call currently will return only matches from the first sequence that contains an address. So we won't get all rows from all sequences that match. I checked the DWARFDebugLine::LineTable::lookupAddressRangeImpl(...) function to verify.

Comment on lines +360 to +371

// TODO(avillega): With this conditional, functions folded by `icf`
// optimizations will only include 1 of all the folded functions. There is
// not a clear path forward to have the information of all folded functions
// in gsym.
if (Row.EndSequence) {
// End sequence markers are included for the last address
// in a function or the last contiguos address in a sequence.
break;
}


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a function is split into two discontiguous ranges? This will lose the line table information for any subsequent discontiguous ranges since we break out of the loop here.

// Only push a row if it isn't an end sequence. End sequence markers are
// included for the last address in a function or the last contiguous
// address in a sequence.
if (Row.EndSequence) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to not break out on Row.EndSequence as it allows functions to have discontiguous ranges. Not sure if that happens. I was assuming that if we asked for the rows for a given address range we wouldn't get all entries if two merged functions with different line table entries were found, but that assumption might not be correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allows functions to have discontiguous ranges. Not sure if that happens.

I think it can, Bolt and/or Propeller put each basic block in its own section IIRC.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite following the collection of thoughts here, they seem disjoint to me, so trying to discuss:

We used to not break out on Row.EndSequence

I don't understand the relationship between end_sequence and functions with discontiguous ranges - could you describe this connection in more detail?

as it allows functions to have discontiguous ranges.

Agreed with @pogo59, I believe both Bolt and Propeller can create discontiguous address ranges for a function (but you'll see DW_AT_ranges on the subprogram)

I was assuming that if we asked for the rows for a given address range we wouldn't get all entries if two merged functions with different line table entries were found, but that assumption might not be correct?

Yeah, looking at the implementation of lookupAddressRangeImpl it finds a single sequence that starts closest to the start address, then adds all rows within that sequence that cover the range requested. So, no, it won't retrieve addresses/ranges from multiple sequences.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite following the collection of thoughts here, they seem disjoint to me, so trying to discuss:

We used to not break out on Row.EndSequence

I don't understand the relationship between end_sequence and functions with discontiguous ranges - could you describe this connection in more detail?

If a DW_TAG_subprogram has N discontiguous ranges, we will create N FunctionInfo objects, one for each range. We will request the line table entries for each range in the DW_TAG_subprogram's DW_AT_ranges attribute. So if there are end sequences in there, we try to keep going. If there is an end_sequence in the line table this is probably the indication of a bug in the line tables.

as it allows functions to have discontiguous ranges.

Agreed with @pogo59, I believe both Bolt and Propeller can create discontiguous address ranges for a function (but you'll see DW_AT_ranges on the subprogram)

Yes, and as I mention above, we create individual FunctionInfo objects for each range and only request the line table entries for each individual range.

I was assuming that if we asked for the rows for a given address range we wouldn't get all entries if two merged functions with different line table entries were found, but that assumption might not be correct?

Yeah, looking at the implementation of lookupAddressRangeImpl it finds a single sequence that starts closest to the start address, then adds all rows within that sequence that cover the range requested. So, no, it won't retrieve addresses/ranges from multiple sequences.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants