[SHT_LLVM_BB_ADDR_MAP] Encode and decode callsite offsets in a newly-introduced SHT_LLVM_BB_ADDR_MAP version. #144426

Open
wants to merge 1 commit into
base: main

Conversation

rlavaee
Contributor

@rlavaee rlavaee commented Jun 16, 2025

Recently, we have been looking at some optimizations targeting individual calls. In particular, we plan to extend the address mapping technique to map to individual callsites. For example, in this piece of code for a basic block:

<BB>:
1200:    lea 0x1(%rcx), %rdx
1204:    callq foo
1209:    cmpq 0x10, %rdx
120d:    ja  L1

We want to emit 0x9 as the callsite offset for callq foo (the offset from the block entry to the address right after the call), so that we can tell whether a sampled address falls before or after the call.
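
As a rough illustration of the arithmetic in this example, the sketch below computes the offset for the listing above and classifies a sampled address. The helper names `callsiteOffset` and `isAfterCall` are hypothetical and not part of the patch, and treating a sample at the return address itself as "after" is an assumption.

```cpp
#include <cstdint>

// Hypothetical helper: offset from the block entry to the end of a call
// instruction, i.e. to the address right after the call.
constexpr uint32_t callsiteOffset(uint64_t BlockStart, uint64_t CallAddr,
                                  uint32_t CallSize) {
  return static_cast<uint32_t>(CallAddr + CallSize - BlockStart);
}

// Assumption: a sample exactly at the return address counts as "after".
constexpr bool isAfterCall(uint64_t BlockStart, uint64_t SampledAddr,
                           uint32_t CallsiteOff) {
  return SampledAddr - BlockStart >= CallsiteOff;
}

// Values from the listing above: the block starts at 0x1200 and the 5-byte
// callq at 0x1204 ends at 0x1209.
static_assert(callsiteOffset(0x1200, 0x1204, 5) == 0x9);
static_assert(!isAfterCall(0x1200, 0x1204, 0x9)); // sample at the callq
static_assert(isAfterCall(0x1200, 0x1209, 0x9));  // sample at the cmpq
```

The point of storing the offset in the section is that a consumer only needs this subtraction and comparison, with no knowledge of the instruction encoding.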

This PR implements the decode/encode/emit capability. The Codegen change will be implemented in a later PR.

@llvmbot
Member

llvmbot commented Jun 16, 2025

@llvm/pr-subscribers-objectyaml

@llvm/pr-subscribers-llvm-binary-utilities

Author: Rahman Lavaee (rlavaee)

Changes

Recently, we have been looking at some optimizations targeting individual calls. In particular, we plan to extend the address mapping technique to map to individual callsites. For example, in this piece of code for a basic block:

<BB>:
1200:    lea 0x1(%rcx), %rdx
1204:    callq foo
1209:    cmpq 0x10, %rdx
120d:    ja  L1

We want to emit 0x9 as the callsite offset for callq foo (the offset from the block entry to the address right after the call), so that we can tell whether a sampled address falls before or after the call.

This PR implements the decode/encode/emit capability. The Codegen change will be implemented in a later PR.


Patch is 30.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144426.diff

12 Files Affected:

  • (modified) llvm/include/llvm/Object/ELFTypes.h (+15-7)
  • (modified) llvm/include/llvm/ObjectYAML/ELFYAML.h (+1)
  • (modified) llvm/lib/Object/ELF.cpp (+29-3)
  • (modified) llvm/lib/ObjectYAML/ELFEmitter.cpp (+17-1)
  • (modified) llvm/lib/ObjectYAML/ELFYAML.cpp (+1)
  • (modified) llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test (+6-4)
  • (modified) llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml (+16-14)
  • (modified) llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml (+36-13)
  • (modified) llvm/tools/llvm-readobj/ELFDumper.cpp (+2)
  • (modified) llvm/tools/obj2yaml/elf2yaml.cpp (+12-2)
  • (modified) llvm/unittests/Object/ELFObjectFileTest.cpp (+59-35)
  • (modified) llvm/unittests/Object/ELFTypesTest.cpp (+17-14)
diff --git a/llvm/include/llvm/Object/ELFTypes.h b/llvm/include/llvm/Object/ELFTypes.h
index 87e4dbe448091..d7a468f1116d7 100644
--- a/llvm/include/llvm/Object/ELFTypes.h
+++ b/llvm/include/llvm/Object/ELFTypes.h
@@ -831,6 +831,7 @@ struct BBAddrMap {
     bool BrProb : 1;
     bool MultiBBRange : 1;
     bool OmitBBEntries : 1;
+    bool CallsiteOffsets : 1;
 
     bool hasPGOAnalysis() const { return FuncEntryCount || BBFreq || BrProb; }
 
@@ -842,7 +843,8 @@ struct BBAddrMap {
              (static_cast<uint8_t>(BBFreq) << 1) |
              (static_cast<uint8_t>(BrProb) << 2) |
              (static_cast<uint8_t>(MultiBBRange) << 3) |
-             (static_cast<uint8_t>(OmitBBEntries) << 4);
+             (static_cast<uint8_t>(OmitBBEntries) << 4) |
+             (static_cast<uint8_t>(CallsiteOffsets) << 5);
     }
 
     // Decodes from minimum bit width representation and validates no
@@ -851,7 +853,7 @@ struct BBAddrMap {
       Features Feat{
           static_cast<bool>(Val & (1 << 0)), static_cast<bool>(Val & (1 << 1)),
           static_cast<bool>(Val & (1 << 2)), static_cast<bool>(Val & (1 << 3)),
-          static_cast<bool>(Val & (1 << 4))};
+          static_cast<bool>(Val & (1 << 4)), static_cast<bool>(Val & (1 << 5))};
       if (Feat.encode() != Val)
         return createStringError(
             std::error_code(), "invalid encoding for BBAddrMap::Features: 0x%x",
@@ -861,9 +863,10 @@ struct BBAddrMap {
 
     bool operator==(const Features &Other) const {
       return std::tie(FuncEntryCount, BBFreq, BrProb, MultiBBRange,
-                      OmitBBEntries) ==
+                      OmitBBEntries, CallsiteOffsets) ==
              std::tie(Other.FuncEntryCount, Other.BBFreq, Other.BrProb,
-                      Other.MultiBBRange, Other.OmitBBEntries);
+                      Other.MultiBBRange, Other.OmitBBEntries,
+                      Other.CallsiteOffsets);
     }
   };
 
@@ -914,13 +917,18 @@ struct BBAddrMap {
     uint32_t Size = 0;   // Size of the basic block.
     Metadata MD = {false, false, false, false,
                    false}; // Metdata for this basic block.
+    // Offsets of callsites (end of call instructions), relative to the basic
+    // block start.
+    SmallVector<uint32_t, 1> CallsiteOffsets;
 
-    BBEntry(uint32_t ID, uint32_t Offset, uint32_t Size, Metadata MD)
-        : ID(ID), Offset(Offset), Size(Size), MD(MD){};
+    BBEntry(uint32_t ID, uint32_t Offset, uint32_t Size, Metadata MD,
+            SmallVector<uint32_t, 1> CallsiteOffsets)
+        : ID(ID), Offset(Offset), Size(Size), MD(MD),
+          CallsiteOffsets(std::move(CallsiteOffsets)) {}
 
     bool operator==(const BBEntry &Other) const {
       return ID == Other.ID && Offset == Other.Offset && Size == Other.Size &&
-             MD == Other.MD;
+             MD == Other.MD && CallsiteOffsets == Other.CallsiteOffsets;
     }
 
     bool hasReturn() const { return MD.HasReturn; }
diff --git a/llvm/include/llvm/ObjectYAML/ELFYAML.h b/llvm/include/llvm/ObjectYAML/ELFYAML.h
index dfdfa055d65fa..539d56bed102b 100644
--- a/llvm/include/llvm/ObjectYAML/ELFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/ELFYAML.h
@@ -162,6 +162,7 @@ struct BBAddrMapEntry {
     llvm::yaml::Hex64 AddressOffset;
     llvm::yaml::Hex64 Size;
     llvm::yaml::Hex64 Metadata;
+    std::optional<std::vector<llvm::yaml::Hex64>> CallsiteOffsets;
   };
   uint8_t Version;
   llvm::yaml::Hex8 Feature;
diff --git a/llvm/lib/Object/ELF.cpp b/llvm/lib/Object/ELF.cpp
index e6864ca508a54..359dfb14b8c37 100644
--- a/llvm/lib/Object/ELF.cpp
+++ b/llvm/lib/Object/ELF.cpp
@@ -837,7 +837,7 @@ decodeBBAddrMapImpl(const ELFFile<ELFT> &EF,
       Version = Data.getU8(Cur);
       if (!Cur)
         break;
-      if (Version > 2)
+      if (Version > 3)
         return createError("unsupported SHT_LLVM_BB_ADDR_MAP version: " +
                            Twine(static_cast<int>(Version)));
       Feature = Data.getU8(Cur); // Feature byte
@@ -893,7 +893,32 @@ decodeBBAddrMapImpl(const ELFFile<ELFT> &EF,
                             ? readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr)
                             : BlockIndex;
           uint32_t Offset = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
-          uint32_t Size = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
+          // Read the callsite offsets.
+          uint32_t LastCallsiteOffset = 0;
+          SmallVector<uint32_t, 1> CallsiteOffsets;
+          if (FeatEnable.CallsiteOffsets) {
+            if (Version < 3) {
+              return createError(
+                  "version should be >= 3 for SHT_LLVM_BB_ADDR_MAP when "
+                  "callsite offsets feature is enabled: version = " +
+                  Twine(static_cast<int>(Version)) +
+                  " feature = " + Twine(static_cast<int>(Feature)));
+            }
+            uint32_t NumCallsites =
+                readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
+            CallsiteOffsets.reserve(NumCallsites);
+            for (uint32_t CallsiteIndex = 0;
+                 !ULEBSizeErr && Cur && (CallsiteIndex < NumCallsites);
+                 ++CallsiteIndex) {
+              LastCallsiteOffset +=
+                  readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
+              CallsiteOffsets.push_back(LastCallsiteOffset);
+            }
+            if (!Cur || ULEBSizeErr)
+              break;
+          }
+          uint32_t Size = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr) +
+                          LastCallsiteOffset;
           uint32_t MD = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
           if (Version >= 1) {
             // Offset is calculated relative to the end of the previous BB.
@@ -906,7 +931,8 @@ decodeBBAddrMapImpl(const ELFFile<ELFT> &EF,
             MetadataDecodeErr = MetadataOrErr.takeError();
             break;
           }
-          BBEntries.push_back({ID, Offset, Size, *MetadataOrErr});
+          BBEntries.push_back(
+              {ID, Offset, Size, *MetadataOrErr, CallsiteOffsets});
         }
         TotalNumBlocks += BBEntries.size();
       }
diff --git a/llvm/lib/ObjectYAML/ELFEmitter.cpp b/llvm/lib/ObjectYAML/ELFEmitter.cpp
index 9ae76a71ede5e..76d679c9ceffb 100644
--- a/llvm/lib/ObjectYAML/ELFEmitter.cpp
+++ b/llvm/lib/ObjectYAML/ELFEmitter.cpp
@@ -1452,7 +1452,7 @@ void ELFState<ELFT>::writeSectionContent(
   for (const auto &[Idx, E] : llvm::enumerate(*Section.Entries)) {
     // Write version and feature values.
     if (Section.Type == llvm::ELF::SHT_LLVM_BB_ADDR_MAP) {
-      if (E.Version > 2)
+      if (E.Version > 3)
         WithColor::warning() << "unsupported SHT_LLVM_BB_ADDR_MAP version: "
                              << static_cast<int>(E.Version)
                              << "; encoding using the most recent version";
@@ -1483,6 +1483,13 @@ void ELFState<ELFT>::writeSectionContent(
     if (!E.BBRanges)
       continue;
     uint64_t TotalNumBlocks = 0;
+    bool EmitCallsiteOffsets = FeatureOrErr->CallsiteOffsets;
+    if (!EmitCallsiteOffsets) {
+      for (const ELFYAML::BBAddrMapEntry::BBRangeEntry &BBR : *E.BBRanges)
+        for (const ELFYAML::BBAddrMapEntry::BBEntry &BBE : *BBR.BBEntries)
+          EmitCallsiteOffsets |=
+              BBE.CallsiteOffsets.has_value() && !BBE.CallsiteOffsets->empty();
+    }
     for (const ELFYAML::BBAddrMapEntry::BBRangeEntry &BBR : *E.BBRanges) {
       // Write the base address of the range.
       CBA.write<uintX_t>(BBR.BaseAddress, ELFT::Endianness);
@@ -1500,6 +1507,15 @@ void ELFState<ELFT>::writeSectionContent(
         if (Section.Type == llvm::ELF::SHT_LLVM_BB_ADDR_MAP && E.Version > 1)
           SHeader.sh_size += CBA.writeULEB128(BBE.ID);
         SHeader.sh_size += CBA.writeULEB128(BBE.AddressOffset);
+        if (EmitCallsiteOffsets) {
+          uint32_t NumCallsites =
+              BBE.CallsiteOffsets.has_value() ? BBE.CallsiteOffsets->size() : 0;
+          SHeader.sh_size += CBA.writeULEB128(NumCallsites);
+          if (BBE.CallsiteOffsets.has_value()) {
+            for (uint32_t Offset : *BBE.CallsiteOffsets)
+              SHeader.sh_size += CBA.writeULEB128(Offset);
+          }
+        }
         SHeader.sh_size += CBA.writeULEB128(BBE.Size);
         SHeader.sh_size += CBA.writeULEB128(BBE.Metadata);
       }
diff --git a/llvm/lib/ObjectYAML/ELFYAML.cpp b/llvm/lib/ObjectYAML/ELFYAML.cpp
index 520e956fdab9f..c38f86e4f4f1b 100644
--- a/llvm/lib/ObjectYAML/ELFYAML.cpp
+++ b/llvm/lib/ObjectYAML/ELFYAML.cpp
@@ -1882,6 +1882,7 @@ void MappingTraits<ELFYAML::BBAddrMapEntry::BBEntry>::mapping(
   IO.mapRequired("AddressOffset", E.AddressOffset);
   IO.mapRequired("Size", E.Size);
   IO.mapRequired("Metadata", E.Metadata);
+  IO.mapOptional("CallsiteOffsets", E.CallsiteOffsets);
 }
 
 void MappingTraits<ELFYAML::PGOAnalysisMapEntry>::mapping(
diff --git a/llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test b/llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
index c5d071c11d1de..5d7bc8baa9b25 100644
--- a/llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
+++ b/llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
@@ -49,7 +49,8 @@
 # CHECK-NEXT:           {
 # CHECK-NEXT:             ID: 2
 # CHECK-NEXT:             Offset: 0x3
-# CHECK-NEXT:             Size: 0x4
+# CHECK-NEXT:             Callsite Offsets: [1, 3]
+# CHECK-NEXT:             Size: 0x7
 # CHECK-NEXT:             HasReturn: Yes
 # CHECK-NEXT:             HasTailCall: No
 # CHECK-NEXT:             IsEHPad: Yes
@@ -75,7 +76,7 @@
 # CHECK-NEXT:             HasTailCall: No
 # CHECK-NEXT:             IsEHPad: No
 # CHECK-NEXT:             CanFallThrough: Yes
-# CHECK-NEXT:            HasIndirectBranch: No
+# CHECK-NEXT:             HasIndirectBranch: No
 # CHECK-NEXT:           }
 # CHECK-NEXT:         ]
 # CHECK-NEXT:       }
@@ -143,8 +144,8 @@ Sections:
     ShSize: [[SIZE=<none>]]
     Link:   .text
     Entries:
-      - Version: 2
-        Feature: 0x8
+      - Version: 3
+        Feature: 0x28
         BBRanges:
           - BaseAddress: [[ADDR=0x11111]]
             BBEntries:
@@ -158,6 +159,7 @@ Sections:
                 AddressOffset: 0x3
                 Size:          0x4
                 Metadata:      0x15
+                CallsiteOffsets: [ 0x1 , 0x2 ]
       - Version: 2
         BBRanges:
           - BaseAddress: 0x22222
diff --git a/llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml b/llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
index 8dbf97ef2bc12..861cb94692947 100644
--- a/llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
+++ b/llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
@@ -14,7 +14,7 @@
 # VALID-NEXT:   - Name: .llvm_bb_addr_map
 # VALID-NEXT:     Type: SHT_LLVM_BB_ADDR_MAP
 # VALID-NEXT:     Entries:
-# VALID-NEXT:       - Version: 2
+# VALID-NEXT:       - Version: 3
 # VALID-NEXT:         BBRanges:
 ## The 'BaseAddress' field is omitted when it's zero.
 # VALID-NEXT:           - BBEntries:
@@ -30,15 +30,16 @@
 # VALID-NEXT:               AddressOffset: 0xFFFFFFFFFFFFFFF7
 # VALID-NEXT:               Size:          0xFFFFFFFFFFFFFFF8
 # VALID-NEXT:               Metadata:      0xFFFFFFFFFFFFFFF9
-# VALID-NEXT:       - Version: 2
-# VALID-NEXT:         Feature: 0x8
+# VALID-NEXT:       - Version: 3
+# VALID-NEXT:         Feature: 0x28
 # VALID-NEXT:         BBRanges:
 # VALID-NEXT:           - BaseAddress: 0xFFFFFFFFFFFFFF20
 # VALID-NEXT:             BBEntries:
-# VALID-NEXT:               - ID:            6
-# VALID-NEXT:                 AddressOffset: 0xA
-# VALID-NEXT:                 Size:          0xB
-# VALID-NEXT:                 Metadata:      0xC
+# VALID-NEXT:               - ID:              6
+# VALID-NEXT:                 AddressOffset:   0xA
+# VALID-NEXT:                 Size:            0xB
+# VALID-NEXT:                 Metadata:        0xC
+# VALID-NEXT:                 CallsiteOffsets: [ 0x1, 0x2 ]
 
 --- !ELF
 FileHeader:
@@ -50,7 +51,7 @@ Sections:
     Type:   SHT_LLVM_BB_ADDR_MAP
     ShSize: [[SIZE=<none>]]
     Entries:
-      - Version: 2
+      - Version: 3
         Feature: 0x0
         BBRanges:
           - BaseAddress: 0x0
@@ -67,17 +68,18 @@ Sections:
                 AddressOffset: 0xFFFFFFFFFFFFFFF7
                 Size:          0xFFFFFFFFFFFFFFF8
                 Metadata:      0xFFFFFFFFFFFFFFF9
-      - Version:   2
-        Feature:   0x8
+      - Version:   3
+        Feature:   0x28
         NumBBRanges: [[NUMBBRANGES=<none>]]
         BBRanges:
           - BaseAddress:   0xFFFFFFFFFFFFFF20
             NumBlocks: [[NUMBLOCKS=<none>]]
             BBEntries:
-             - ID:            6
-               AddressOffset: 0xA
-               Size:          0xB
-               Metadata:      0xC
+             - ID:              6
+               AddressOffset:   0xA
+               Size:            0xB
+               Metadata:        0xC
+               CallsiteOffsets: [ 0x1, 0x2 ]
 
 ## Check obj2yaml can dump empty .llvm_bb_addr_map sections.
 
diff --git a/llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml b/llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml
index 709938babffbf..bc12884311e96 100644
--- a/llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml
+++ b/llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml
@@ -36,7 +36,8 @@
 # Case 4: Specify Entries.
 # CHECK:        Name: .llvm_bb_addr_map (1)
 # CHECK:        SectionData (
-# CHECK-NEXT:     0000: 02002000 00000000 0000010B 010203
+# CHECK-NEXT:     0000: 03002000 00000000 0000010B 01020102
+# CHECK-NEXT:     0010: 0203
 # CHECK-NEXT:   )
 
 # Case 5: Specify Entries and omit the Address field.
@@ -44,28 +45,32 @@
 # CHECK:        Address:
 # CHECK-SAME:   {{^ 0x0$}}
 # CHECK:        SectionData (
-# CHECK-NEXT:     0000: 02000000 00000000 0000010C 010203
+# CHECK-NEXT:     0000: 03000000 00000000 0000010C 010203
 # CHECK-NEXT:   )
 
 # Case 6: Override the NumBlocks field.
 # CHECK:        Name: .llvm_bb_addr_map (1)
 # CHECK:        SectionData (
-# CHECK-NEXT:     0000: 02002000 00000000 0000020D 010203
+# CHECK-NEXT:     0000: 03002000 00000000 0000020D 010203
 # CHECK-NEXT:   )
 
 # Case 7: Specify empty BBRanges.
 # CHECK:        Name: .llvm_bb_addr_map (1)
 # CHECK:        SectionData (
-# CHECK-NEXT:     0000: 020000
+# CHECK-NEXT:     0000: 030000
 # CHECK-NEXT:   )
 
 # Case 8: Specify empty BBRanges with multi-bb-range.
 # CHECK:        Name: .llvm_bb_addr_map (1)
 # CHECK:        SectionData (
-# CHECK-NEXT:     0000: 020800
+# CHECK-NEXT:     0000: 030800
 # CHECK-NEXT:   )
 
-
+# Case 9: Specify empty CallsiteOffsets.
+# CHECK:        Name: .llvm_bb_addr_map (1)
+# CHECK:        SectionData (
+# CHECK-NEXT:     0000: 03202000 00000000 0000010E 01000203
+# CHECK-NEXT:   )
 
 
 --- !ELF
@@ -100,7 +105,7 @@ Sections:
   - Name: '.llvm_bb_addr_map (4)'
     Type: SHT_LLVM_BB_ADDR_MAP
     Entries:
-      - Version: 2
+      - Version: 3
         BBRanges:
           - BaseAddress: 0x0000000000000020
             BBEntries:
@@ -108,13 +113,14 @@ Sections:
                 AddressOffset: 0x00000001
                 Size:          0x00000002
                 Metadata:      0x00000003
+                CallsiteOffsets: [0x1, 0x2]
 
 ## 5) When specifying the description with Entries, the 'Address' field will be
 ##    zero when omitted.
   - Name: '.llvm_bb_addr_map (5)'
     Type: SHT_LLVM_BB_ADDR_MAP
     Entries:
-      - Version: 2
+      - Version: 3
         BBRanges:
           - BBEntries:
             - ID:            12
@@ -127,7 +133,7 @@ Sections:
   - Name: '.llvm_bb_addr_map (6)'
     Type: SHT_LLVM_BB_ADDR_MAP
     Entries:
-      - Version:   2
+      - Version: 3
         BBRanges:
           - BaseAddress:   0x0000000000000020
             NumBlocks: 2
@@ -142,7 +148,7 @@ Sections:
   - Name: '.llvm_bb_addr_map (7)'
     Type: SHT_LLVM_BB_ADDR_MAP
     Entries:
-      - Version: 2
+      - Version: 3
         BBRanges: []
 
 ## 8) We can produce a SHT_LLVM_BB_ADDR_MAP section from a multi-bb-range
@@ -150,10 +156,27 @@ Sections:
   - Name: '.llvm_bb_addr_map (8)'
     Type: SHT_LLVM_BB_ADDR_MAP
     Entries:
-      - Version: 2
+      - Version: 3
         Feature: 0x8
         BBRanges: []
 
+## 9) We can produce a SHT_LLVM_BB_ADDR_MAP section from a description
+##    with empty callsite offsets.
+  - Name: '.llvm_bb_addr_map (9)'
+    Type: SHT_LLVM_BB_ADDR_MAP
+    Entries:
+      - Version: 3
+        Feature: 0x20
+        BBRanges:
+          - BaseAddress: 0x0000000000000020
+            BBEntries:
+             - ID:              14
+               AddressOffset:   0x00000001
+               Size:            0x00000002
+               Metadata:        0x00000003
+               CallsiteOffsets: []
+
+
 ## Check we can't use Entries at the same time as either Content or Size.
 # RUN: not yaml2obj --docnum=2 -DCONTENT="00" %s 2>&1 | FileCheck %s --check-prefix=INVALID
 # RUN: not yaml2obj --docnum=2 -DSIZE="0" %s 2>&1 | FileCheck %s --check-prefix=INVALID
@@ -175,7 +198,7 @@ Sections:
 
 ## Check that yaml2obj generates a warning when we use unsupported versions.
 # RUN: yaml2obj --docnum=3  %s 2>&1 | FileCheck %s --check-prefix=INVALID-VERSION
-# INVALID-VERSION: warning: unsupported SHT_LLVM_BB_ADDR_MAP version: 3; encoding using the most recent version
+# INVALID-VERSION: warning: unsupported SHT_LLVM_BB_ADDR_MAP version: 4; encoding using the most recent version
 
 --- !ELF
 FileHeader:
@@ -187,4 +210,4 @@ Sections:
     Type: SHT_LLVM_BB_ADDR_MAP
     Entries:
 ##  Specify unsupported version
-      - Version: 3
+      - Version: 4
diff --git a/llvm/tools/llvm-readobj/ELFDumper.cpp b/llvm/tools/llvm-readobj/ELFDumper.cpp
index abaf6077ba9e7..7250d0a129cf5 100644
--- a/llvm/tools/llvm-readobj/ELFDumper.cpp
+++ b/llvm/tools/llvm-readobj/ELFDumper.cpp
@@ -7878,6 +7878,8 @@ void LLVMELFDumper<ELFT>::printBBAddrMaps(bool PrettyPGOAnalysis) {
             DictScope BBED(W);
             W.printNumber("ID", BBE.ID);
             W.printHex("Offset", BBE.Offset);
+            if (!BBE.CallsiteOffsets.empty())
+              W.printList("Callsite Offsets", BBE.CallsiteOffsets);
             W.printHex("Size", BBE.Size);
             W.printBoolean("HasReturn", BBE.hasReturn());
             W.printBoolean("HasTailCall", BBE.hasTailCall());
diff --git a/llvm/tools/obj2yaml/elf2yaml.cpp b/llvm/tools/obj2yaml/elf2yaml.cpp
index c56ed15501b40..53455b8c7580a 100644
--- a/llvm/tools/obj2yaml/elf2yaml.cpp
+++ b/llvm/tools/obj2yaml/elf2yaml.cpp
@@ -899,7 +899,7 @@ ELFDumper<ELFT>::dumpBBAddrMapSection(const Elf_Shdr *Shdr) {
   while (Cur && Cur.tell() < Content.size()) {
     if (Shdr->sh_type == ELF::SHT_LLVM_BB_ADDR_MAP) {
       Version = Data.getU8(Cur);
-      if (Cur && Version > 2)
+      if (Cur && Version > 3)
         return createStringError(
             errc::invalid_argument,
             "invalid SHT_LLVM_BB_ADDR_MAP section version: " +
@@ -934,9 +934,19 @@ ELFDumper<ELFT>::dumpBBAddrMapSection(const Elf_Shdr *Shdr) {
            ++BlockIndex) {
         uint32_t ID = Version >= 2 ? Data.getULEB128(Cur) : BlockIndex;
         uint64_t Offset = Data.getULEB128(Cur);
+        std::optional<std::vector<llvm::yaml::Hex64>> CallsiteOffsets;
+        if (FeatureOrErr->CallsiteOffsets) {
+          uint32_t NumCallsites = Data.getULEB128(Cur);
+          CallsiteOffsets = std::vector<llvm::yaml::Hex64>(NumCallsites, 0);
+          for (uint32_t CallsiteIndex = 0; Cur && CallsiteIndex < NumCallsites;
+               ++CallsiteIndex) {
+            (*CallsiteOffsets)[CallsiteIndex] = Data.getULEB128(Cur);
+          }
+        }
         uint64_t Size = Data.getULEB128(Cur);
         uint64_t Metadata = Data.getULEB128(Cur);
-        BBEntries.push_back({ID, Offset, Size, Metadata});
+        BBEntries.push_back(
+            {ID, Offset, Size, Metadata, std::move(CallsiteOffsets)});
       }
       TotalNumBlocks += BBEntries.size();
       BBRanges.push_back({BaseAddress, /*NumBlocks=*/{}, BBEntries});
diff --git a/llvm/unittests/Object/ELFObjectFileTest.cpp b/llvm/unittests/Object/ELFObjectFileTest.cpp
index 1073df95c379a..423f92ea07b39 100644
--- a/llvm/unittests/Object/ELFObjectFileTest.cpp
+++ b/llvm/unittests/Object/ELFObjectFileTest.cpp
@@ -531,7 +531,7 @@ TEST(ELFObjectFileTest, InvalidDecodeBBAddrMap) {
   // Check that we can detect unsupported versions.
   SmallString<128> UnsupportedVersionYamlString(CommonYamlString);
   UnsupportedVersionYamlString += R"(
-      - Version: 3
+      - Version: 4
         BBRanges:
           - BaseAddress: 0x11111...
[truncated]
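
For readers following the decoder change in llvm/lib/Object/ELF.cpp, here is a simplified, standalone sketch of how one basic block entry appears to be decoded under version 3. It is not the LLVM code: `Entry`, `readULEB128`, and `decodeEntry` are hypothetical stand-ins, the sketch assumes version >= 2 (so the ID is always present), it leaves the block offset unadjusted, and it skips the 32-bit overflow checks of the real decoder. What it tries to show is that callsite offsets are stored as deltas and that the encoded size is relative to the last callsite offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Simplified stand-in for BBAddrMap::BBEntry (metadata kept as a raw value).
struct Entry {
  uint32_t ID = 0;
  uint32_t Offset = 0;
  std::vector<uint32_t> CallsiteOffsets; // relative to the block start
  uint32_t Size = 0;
  uint32_t Metadata = 0;
};

constexpr uint8_t CallsiteOffsetsBit = 1 << 5; // 0x20 in the feature byte

// Reads one ULEB128 value at Pos; returns std::nullopt on truncated input.
std::optional<uint32_t> readULEB128(const std::vector<uint8_t> &Data,
                                    size_t &Pos) {
  uint64_t Value = 0;
  for (unsigned Shift = 0; Pos < Data.size() && Shift < 64; Shift += 7) {
    uint8_t Byte = Data[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if (!(Byte & 0x80))
      return static_cast<uint32_t>(Value);
  }
  return std::nullopt;
}

std::optional<Entry> decodeEntry(const std::vector<uint8_t> &Data, size_t &Pos,
                                 uint8_t Version, uint8_t Feature) {
  const bool HasCallsites = (Feature & CallsiteOffsetsBit) != 0;
  if (HasCallsites && Version < 3)
    return std::nullopt; // the feature requires version >= 3
  Entry E;
  auto Read = [&](uint32_t &Out) {
    std::optional<uint32_t> V = readULEB128(Data, Pos);
    if (V)
      Out = *V;
    return V.has_value();
  };
  if (!Read(E.ID) || !Read(E.Offset))
    return std::nullopt;
  uint32_t LastCallsiteOffset = 0;
  if (HasCallsites) {
    uint32_t NumCallsites = 0;
    if (!Read(NumCallsites))
      return std::nullopt;
    for (uint32_t I = 0; I < NumCallsites; ++I) {
      uint32_t Delta = 0;
      if (!Read(Delta))
        return std::nullopt;
      LastCallsiteOffset += Delta; // each value is a delta from the previous
      E.CallsiteOffsets.push_back(LastCallsiteOffset);
    }
  }
  uint32_t EncodedSize = 0;
  if (!Read(EncodedSize) || !Read(E.Metadata))
    return std::nullopt;
  E.Size = EncodedSize + LastCallsiteOffset; // size is stored past the last call
  return E;
}
```

Feeding this sketch the bytes `02 03 02 01 02 04 15` with version 3 and feature 0x28 (an illustrative ID and offset, followed by two callsite deltas 0x1 and 0x2, encoded size 0x4, metadata 0x15) yields callsite offsets [1, 3] and size 0x7. This lines up with the updated llvm-readobj test, where the YAML input `CallsiteOffsets: [ 0x1 , 0x2 ]` with `Size: 0x4` is expected to print `Callsite Offsets: [1, 3]` and `Size: 0x7`.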

@rlavaee rlavaee requested a review from xur-llvm June 16, 2025 20:08
@rlavaee rlavaee changed the title Introduce the new SHT_LLVM_BB_ADDR_MAP version 3, to allow encoding callsite offsets. [SHT_LLVM_BB_ADDR_MAP] Encode and decode callsite offsets in a newly-introduced SHT_LLVM_BB_ADDR_MAP version. Jun 16, 2025
Contributor

@boomanaiden154 boomanaiden154 left a comment

Can you describe your motivation for doing this a bit more?

We also needed to do something similar for basic block trace modelling given call instructions are sort of an implicit CFG edge. We just did this by parsing the binary/object files though. It was pretty inexpensive and didn't require any overhead/additional complexity within the BBAddrMap. It sounds like there are additional constraints here that I'm missing though.

@rlavaee
Contributor Author

rlavaee commented Jun 16, 2025

> Can you describe your motivation for doing this a bit more?

I think your understanding is correct. To explain a bit more, we want to map from an address X to the following:
{Function name, BB ID, Callsite region index, offset within the callsite region}

Callsite region index #i denotes the address range from the end of call #i-1 to the end of call #i.
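
A minimal sketch of that lookup within a single basic block, assuming the callsite offsets have already been decoded and are sorted in ascending order. `CallsiteRegion` and `findRegion` are hypothetical names, and treating each region as a half-open range (with region #0 starting at the block entry) is an assumption rather than something this PR specifies. Resolving the function name and BB ID is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct CallsiteRegion {
  uint32_t Index;          // callsite region index within the block
  uint32_t OffsetInRegion; // offset from the start of that region
};

// Assumes CallsiteOffsets is sorted ascending and OffsetInBB lies inside the
// block. Region #i is treated as [end of call #i-1, end of call #i).
CallsiteRegion findRegion(const std::vector<uint32_t> &CallsiteOffsets,
                          uint32_t OffsetInBB) {
  assert(std::is_sorted(CallsiteOffsets.begin(), CallsiteOffsets.end()));
  // Count the calls that end at or before the sampled offset.
  auto It = std::upper_bound(CallsiteOffsets.begin(), CallsiteOffsets.end(),
                             OffsetInBB);
  uint32_t Index = static_cast<uint32_t>(It - CallsiteOffsets.begin());
  uint32_t RegionStart = Index == 0 ? 0 : CallsiteOffsets[Index - 1];
  return {Index, OffsetInBB - RegionStart};
}
```

With the single callsite offset 0x9 from the PR description, an offset of 0x4 lands in region 0 at offset 4, while an offset of 0xd lands in region 1 at offset 4. A block with N callsites therefore has N+1 regions under this reading.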

> We just did this by parsing the binary/object files though.

By "parsing" do you mean disassembly? The approach implemented here does not need any disassembly. And our constraint is no disassembly.

@boomanaiden154
Contributor

> By "parsing" do you mean disassembly? The approach implemented here does not need any disassembly. And our constraint is no disassembly.

Yeah. Why is not disassembling a hard requirement? If you're parsing the BBAddrMap I would assume you have access to the rest of the binary and it's not super expensive to disassemble a large binary, at least with our setup when we tried it.

@rlavaee
Contributor Author

rlavaee commented Jun 16, 2025

> Yeah. Why is not disassembling a hard requirement?

We would have to support every architecture separately, whereas here it's architecture-independent. We allow disassembly in the llvm-propeller tooling as long as it's a specific architecture-dependent case. What's your specific objection to this approach? We have extended the section multiple times so far (BB-relative offsets, BB ranges).

@boomanaiden154
Contributor

> We would have to support every architecture separately, whereas here it's architecture-independent.

Ah, true. I would think MCInstrDesc::isCall() would handle most of it though assuming you can get the architecture from a user-provided target triple or the binary itself?

> What's your specific objection to this approach? We have extended the section multiple times so far (BB-relative offsets, BB ranges).

No specific objection, I just want to make sure the additional complexity/version bump has sufficient motivation and isn't something that can just be trivially reconstructed from the binary.

The CI failures also look relevant: https://github.com/llvm/llvm-project/actions/runs/15690836465?pr=144426

Metadata: 0x00000003
CallsiteOffsets: []


Collaborator

Nit: double blank line.

Contributor Author

Fixed.

@rlavaee
Contributor Author

rlavaee commented Jun 17, 2025

> > We would have to support every architecture separately, whereas here it's architecture-independent.
>
> Ah, true. I would think MCInstrDesc::isCall() would handle most of it though assuming you can get the architecture from a user-provided target triple or the binary itself?

Correct. We will be using the Codegen intrinsic here too.

> > What's your specific objection to this approach? We have extended the section multiple times so far (BB-relative offsets, BB ranges).
>
> No specific objection, I just want to make sure the additional complexity/version bump has sufficient motivation and isn't something that can just be trivially reconstructed from the binary.
>
> The CI failures also look relevant: https://github.com/llvm/llvm-project/actions/runs/15690836465?pr=144426

Thanks. Fixed the issues.

@rlavaee rlavaee force-pushed the callsite_anchors branch 2 times, most recently from 6632858 to 0b25028 Compare June 17, 2025 23:51
@rlavaee rlavaee force-pushed the callsite_anchors branch from 0b25028 to 1f848b5 Compare June 18, 2025 00:16
4 participants