Enable LLDB to load large dSYM files. #164471
@llvm/pr-subscribers-lldb
Author: Greg Clayton (clayborg)
Changes
llvm-dsymutil can produce mach-o files where some sections in __DWARF exceed the 4GB barrier and subsequent sections in the dSYM will be inaccessible because the mach-o section_64 structure only has a 32 bit file offset. This patch enables LLDB to load a large dSYM file by figuring out when this happens and properly adjusting the file offset of the LLDB sections.
I was unable to add a test as obj2yaml and yaml2obj are broken for mach-o files and they can't convert a yaml file back into a valid mach-o object file. Any suggestions for adding a test would be appreciated.
Full diff: https://github.com/llvm/llvm-project/pull/164471.diff
1 Files Affected:
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
index 9cdb8467bfc60..6878f7331e0f5 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -1674,6 +1674,10 @@ void ObjectFileMachO::ProcessSegmentCommand(
uint32_t segment_sect_idx;
const lldb::user_id_t first_segment_sectID = context.NextSectionIdx + 1;
+ // dSYM files can create sections whose data exceeds the 4GB barrier, but
+ // mach-o sections only have 32 bit offsets. So keep track of when we
+ // overflow and fix the sections offsets as we iterate.
+ uint64_t section_offset_adjust = 0;
const uint32_t num_u32s = load_cmd.cmd == LC_SEGMENT ? 7 : 8;
for (segment_sect_idx = 0; segment_sect_idx < load_cmd.nsects;
++segment_sect_idx) {
@@ -1697,6 +1701,14 @@ void ObjectFileMachO::ProcessSegmentCommand(
// isn't stored in the abstracted Sections.
m_mach_sections.push_back(sect64);
+ // Make sure we can load dSYM files whose __DWARF sections exceed the 4GB
+ // barrier. llvm::MachO::section_64 have only 32 bit file offsets for the
+ // section contents.
+ const uint64_t section_file_offset = sect64.offset + section_offset_adjust;
+ // If this section overflows a 4GB barrier, then we need to adjust any
+ // subsequent the section offsets.
+ if (is_dsym && ((uint64_t)sect64.offset + sect64.size) >= UINT32_MAX)
+ section_offset_adjust += 0x100000000ull;
if (add_section) {
ConstString section_name(
sect64.sectname, strnlen(sect64.sectname, sizeof(sect64.sectname)));
@@ -1736,13 +1748,13 @@ void ObjectFileMachO::ProcessSegmentCommand(
}
// Grow the section size as needed.
- if (sect64.offset) {
+ if (section_file_offset) {
const lldb::addr_t segment_min_file_offset =
segment->GetFileOffset();
const lldb::addr_t segment_max_file_offset =
segment_min_file_offset + segment->GetFileSize();
- const lldb::addr_t section_min_file_offset = sect64.offset;
+ const lldb::addr_t section_min_file_offset = section_file_offset;
const lldb::addr_t section_max_file_offset =
section_min_file_offset + sect64.size;
const lldb::addr_t new_file_offset =
@@ -1770,9 +1782,9 @@ void ObjectFileMachO::ProcessSegmentCommand(
sect64.addr, // File VM address == addresses as they are
// found in the object file
sect64.size, // VM size in bytes of this section
- sect64.offset, // Offset to the data for this section in
+ section_file_offset, // Offset to the data for this section in
// the file
- sect64.offset ? sect64.size : 0, // Size in bytes of
+ section_file_offset ? sect64.size : 0, // Size in bytes of
// this section as
// found in the file
sect64.align,
@@ -1792,14 +1804,14 @@ void ObjectFileMachO::ProcessSegmentCommand(
SectionSP section_sp(new Section(
segment_sp, module_sp, this, ++context.NextSectionIdx, section_name,
sect_type, sect64.addr - segment_sp->GetFileAddress(), sect64.size,
- sect64.offset, sect64.offset == 0 ? 0 : sect64.size, sect64.align,
- sect64.flags));
+ section_file_offset, section_file_offset == 0 ? 0 : sect64.size,
+ sect64.align, sect64.flags));
// Set the section to be encrypted to match the segment
bool section_is_encrypted = false;
if (!segment_is_encrypted && load_cmd.filesize != 0)
section_is_encrypted = context.EncryptedRanges.FindEntryThatContains(
- sect64.offset) != nullptr;
+ section_file_offset) != nullptr;
section_sp->SetIsEncrypted(segment_is_encrypted || section_is_encrypted);
section_sp->SetPermissions(segment_permissions);
|
|
llvm-dsymutil will assert when asserts are enabled when creating large dSYM files that exhibit this behavior. If asserts are disabled, then llvm-dsymutil can produce such binaries. |
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Is there an issue open for this? Would it be difficult to fix? |
+ if (is_dsym && ((uint64_t)sect64.offset + sect64.size) >= UINT32_MAX)
+ section_offset_adjust += 0x100000000ull;
I know it's an edge case, but I believe this will be incorrect if sect64.size > 2^32. I would suggest masking out the lower 32 bits and adding that to the adjust value, but there might be a cleaner way.
- if (is_dsym && ((uint64_t)sect64.offset + sect64.size) >= UINT32_MAX)
- section_offset_adjust += 0x100000000ull;
+ if (is_dsym && ((uint64_t)sect64.offset + sect64.size) >= UINT32_MAX)
+ section_offset_adjust += ~((1 << 32) - 1) & ((uint64_t)sect64.offset + sect64.size);
I know it's an edge case, but I believe this will be incorrect if sect64.size > 2^32. I would suggest masking out the lower 32 bits and adding that to the adjust value, but there might be a cleaner way.
I have an offending dSYM file that loads properly with this fix, so I believe math is correct. Here is the output of my mach-o.py tool that dumps the sections of a mach-o file that has this problem:
% macho.py --sections Foo.dSYM
FILE OFF INDEX ADDRESS SIZE OFFSET ALIGN RELOFF NRELOC FLAGS RESERVED1 RESERVED2 RESERVED3 NAME
=========== ===== ------------------ ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------------------
0x000000f8: [ 1] 0x0000000100004000 0x00000000092531bc 0x00000000 0x00000005 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__text
0x00000148: [ 2] 0x00000001092571bc 0x000000000003e22c 0x00000000 0x00000002 0x00000000 0x00000000 0x80000408 0x00008dba 0x0000000c 0x00000000 __TEXT.__stubs
0x00000198: [ 3] 0x00000001092953e8 0x0000000000163d40 0x00000000 0x00000002 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__objc_stubs
0x000001e8: [ 4] 0x00000001093f9128 0x0000000000001010 0x00000000 0x00000002 0x00000000 0x00000000 0x00000016 0x00000000 0x00000000 0x00000000 __TEXT.__init_offsets
0x00000238: [ 5] 0x00000001093fa140 0x000000000000acb8 0x00000000 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__literals
0x00000288: [ 6] 0x0000000109404df8 0x00000000002e6390 0x00000000 0x00000002 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __TEXT.__objc_methlist
0x000002d8: [ 7] 0x00000001096eb188 0x000000000014a098 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__constg_swiftt
0x00000328: [ 8] 0x0000000109835220 0x0000000000159542 0x00000000 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_typeref
0x00000378: [ 9] 0x000000010998e770 0x00000000001726f0 0x00000000 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_reflstr
0x000003c8: [ 10] 0x0000000109b00e60 0x000000000015a4e8 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_fieldmd
0x00000418: [ 11] 0x0000000109c5b348 0x0000000000082ed8 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_capture
0x00000468: [ 12] 0x0000000109cde220 0x00000000000165f4 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_types
0x000004b8: [ 13] 0x0000000109cf4814 0x000000000006a178 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__ustring
0x00000508: [ 14] 0x0000000109d5e98c 0x000000000000ae60 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_builtin
0x00000558: [ 15] 0x0000000109d697ec 0x0000000000001388 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_mpenum
0x000005a8: [ 16] 0x0000000109d6ab74 0x0000000000002c9c 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_protos
0x000005f8: [ 17] 0x0000000109d6d810 0x00000000000187a0 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_proto
0x00000648: [ 18] 0x0000000109d85fb0 0x00000000000076e0 0x00000000 0x00000002 0x00000000 0x00000000 0x1000000b 0x00000000 0x00000000 0x00000000 __TEXT.__swift_as_entry
0x00000698: [ 19] 0x0000000109d8d690 0x0000000000006d94 0x00000000 0x00000002 0x00000000 0x00000000 0x1000000b 0x00000000 0x00000000 0x00000000 __TEXT.__swift_as_ret
0x000006e8: [ 20] 0x0000000109d94424 0x0000000000012688 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__swift5_assocty
0x00000738: [ 21] 0x0000000109da6aac 0x000000000015c8b8 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__unwind_info
0x00000788: [ 22] 0x0000000109f03368 0x000000000015d55c 0x240be000 0x00000003 0x00000000 0x00000000 0x6000000b 0x00000000 0x00000000 0x00000000 __TEXT.__eh_frame
0x00000820: [ 23] 0x000000010a064000 0x0000000000046dc8 0x00000000 0x00000003 0x00000000 0x00000000 0x00000006 0x00000000 0x00000000 0x00000000 __DATA_CONST.__got
0x00000870: [ 24] 0x000000010a0aadd0 0x0000000000aa6788 0x00000000 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__const
0x000008c0: [ 25] 0x000000010ab51558 0x000000000026de60 0x00000000 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__cfstring
0x00000910: [ 26] 0x000000010adbf3b8 0x0000000000043c60 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__objc_classlist
0x00000960: [ 27] 0x000000010ae03018 0x0000000000013708 0x00000000 0x00000003 0x00000000 0x00000000 0x1000000b 0x00000000 0x00000000 0x00000000 __DATA_CONST.__objc_protolist
0x000009b0: [ 28] 0x000000010ae16720 0x0000000000021240 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__objc_catlist
0x00000a00: [ 29] 0x000000010ae37960 0x00000000000002f0 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__objc_nlclslist
0x00000a50: [ 30] 0x000000010ae37c50 0x0000000000000010 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__objc_nlcatlist
0x00000aa0: [ 31] 0x000000010ae37c60 0x0000000000000008 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA_CONST.__objc_imageinfo
0x00000b38: [ 32] 0x000000010ae38000 0x00000000000359c8 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_classrefs
0x00000b88: [ 33] 0x000000010ae6d9c8 0x00000000006d3218 0x00000000 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_data
0x00000bd8: [ 34] 0x000000010b540be0 0x000000000001b4e0 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_superrefs
0x00000c28: [ 35] 0x000000010b55c0c0 0x00000000000faa10 0x00000000 0x00000003 0x00000000 0x00000000 0x10000005 0x00000000 0x00000000 0x00000000 __DATA.__objc_selrefs
0x00000c78: [ 36] 0x000000010b656ad0 0x000000000006d008 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_ivar
0x00000cc8: [ 37] 0x000000010b6c3ad8 0x0000000000e536e0 0x00000000 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_const
0x00000d18: [ 38] 0x000000010c5171c0 0x00000000002f0599 0x00000000 0x00000005 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__data
0x00000d68: [ 39] 0x000000010c807760 0x000000000000b0a8 0x00000000 0x00000003 0x00000000 0x00000000 0x1000000b 0x00000000 0x00000000 0x00000000 __DATA.__objc_protorefs
0x00000db8: [ 40] 0x000000010c812808 0x0000000000000810 0x00000000 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_clsrolist
0x00000e08: [ 41] 0x000000010c813018 0x00000000000001a0 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_stublist
0x00000e58: [ 42] 0x000000010c8131b8 0x0000000000000978 0x00000000 0x00000003 0x00000000 0x00000000 0x00000013 0x00000000 0x00000000 0x00000000 __DATA.__thread_vars
0x00000ea8: [ 43] 0x000000010c813b30 0x0000000000000008 0x00000000 0x00000003 0x00000000 0x00000000 0x10000000 0x00000000 0x00000000 0x00000000 __DATA.__objc_catlist2
0x00000ef8: [ 44] 0x000000010c813b38 0x0000000000000190 0x00000000 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__s_async_hook
0x00000f48: [ 45] 0x000000010c813cc8 0x00000000000000b0 0x00000000 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__swift56_hooks
0x00000f98: [ 46] 0x000000010c813d78 0x0000000000000008 0x00000000 0x00000003 0x00000000 0x00000000 0x00000014 0x00008db9 0x00000000 0x00000000 __DATA.__thread_ptrs
0x00000fe8: [ 47] 0x000000010c813d80 0x0000000000000058 0x00000000 0x00000004 0x00000000 0x00000000 0x00000011 0x00000000 0x00000000 0x00000000 __DATA.__thread_data
0x00001038: [ 48] 0x000000010c813de0 0x0000000000003b88 0x00000000 0x00000004 0x00000000 0x00000000 0x00000012 0x00000000 0x00000000 0x00000000 __DATA.__thread_bss
0x00001088: [ 49] 0x000000010c817980 0x0000000000064290 0x00000000 0x00000007 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DATA.__common
0x000010d8: [ 50] 0x000000010c87bc40 0x00000000002f4d80 0x00000000 0x00000006 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DATA.__bss
0x00001170: [ 51] 0x000000010cb74000 0x00000000003623d4 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __RODATA.__gcc_except_tab
0x000011c0: [ 52] 0x000000010ced6400 0x00000000006f9cd3 0x00000000 0x00000008 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __RODATA.__const
0x00001210: [ 53] 0x000000010d5d00e0 0x0000000001098b2a 0x00000000 0x00000004 0x00000000 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 __RODATA.__cstring
0x00001260: [ 54] 0x000000010e668c0a 0x0000000000749b6d 0x00000000 0x00000000 0x00000000 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 __RODATA.__objc_methname
0x00001340: [ 55] 0x0000000132e70000 0x00000000afe7d690 0x2421c000 0x00000005 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__swift_ast
0x00001390: [ 56] 0x00000001e2ced690 0x000000002515ddf1 0xd4099690 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_line
0x000013e0: [ 57] 0x0000000207e4b481 0x000000007de6527a 0xf91f7481 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_info
warning: this section's end file range 0x17705c6fb (0xf91f7481 + 0x7de6527a) exceeds the 4GB boundary, subsequent sections might have their offsets truncated since 64 bit mach-o sections only have 32 bit offsets.
Subsequent section might have an offset greater than or equal to 0x7705c6fb
0x00001430: [ 58] 0x0000000285cb06fb 0x0000000003665210 0x7705c6fb 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_aranges
error: this section's file range overlaps another
0x00001480: [ 59] 0x000000028931590b 0x0000000006d67dc0 0x7a6c190b 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_ranges
error: this section's file range overlaps another
0x000014d0: [ 60] 0x000000029007d6cb 0x000000000ba354aa 0x814296cb 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_loc
error: this section's file range overlaps another
0x00001520: [ 61] 0x000000029bab2b75 0x0000000000009e78 0x8ce5eb75 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_frame
error: this section's file range overlaps another
0x00001570: [ 62] 0x000000029babc9ed 0x000000000000d370 0x8ce689ed 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_abbrev
error: this section's file range overlaps another
0x000015c0: [ 63] 0x000000029bac9d5d 0x0000000068f46242 0x8ce75d5d 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__debug_str
error: this section's file range overlaps another
0x00001610: [ 64] 0x0000000304a0ff9f 0x0000000000261e20 0xf5dbbf9f 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__apple_namespac
error: this section's file range overlaps another
0x00001660: [ 65] 0x0000000304c71dbf 0x0000000014921828 0xf601ddbf 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__apple_names
error: this section's file range overlaps another
warning: this section's end file range 0x10a93f5e7 (0xf601ddbf + 0x14921828) exceeds the 4GB boundary, subsequent sections might have their offsets truncated since 64 bit mach-o sections only have 32 bit offsets.
Subsequent section might have an offset greater than or equal to 0x0a93f5e7
0x000016b0: [ 66] 0x00000003195935e7 0x00000000083a2b70 0x0a93f5e7 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__apple_types
0x00001700: [ 67] 0x0000000321936157 0x000000000027cfb8 0x12ce2157 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DWARF.__apple_objc
Prior to the fix LLDB shows this:
SectID Type File Address Perm File Off. File Size Flags Section Name
------------------ ---------------------- --------------------------------------- ---- ---------- ---------- ---------- ----------------------------
0x0000000000000600 container [0x000000010edb4000-0x0000000132e70000) r-- 0x00002000 0x240bb056 0x00000000 Foo.__LINKEDIT
0x0000000000000700 container [0x0000000132e70000-0x0000000321bb4000) rw- 0x2421c000 0x1eed4310f 0x00000000 Foo.__DWARF
0x0000000000000037 swift-modules [0x0000000132e70000-0x00000001e2ced690) rw- 0x2421c000 0xafe7d690 0x00000000 Foo.__DWARF.__swift_ast
0x0000000000000038 dwarf-line [0x00000001e2ced690-0x0000000207e4b481) rw- 0xd4099690 0x2515ddf1 0x00000000 Foo.__DWARF.__debug_line
0x0000000000000039 dwarf-info [0x0000000207e4b481-0x0000000285cb06fb) rw- 0xf91f7481 0x7de6527a 0x00000000 Foo.__DWARF.__debug_info
0x000000000000003a dwarf-aranges [0x0000000285cb06fb-0x000000028931590b) rw- 0x7705c6fb 0x03665210 0x00000000 Foo.__DWARF.__debug_aranges
0x000000000000003b dwarf-ranges [0x000000028931590b-0x000000029007d6cb) rw- 0x7a6c190b 0x06d67dc0 0x00000000 Foo.__DWARF.__debug_ranges
0x000000000000003c dwarf-loc [0x000000029007d6cb-0x000000029bab2b75) rw- 0x814296cb 0x0ba354aa 0x00000000 Foo.__DWARF.__debug_loc
0x000000000000003d dwarf-frame [0x000000029bab2b75-0x000000029babc9ed) rw- 0x8ce5eb75 0x00009e78 0x00000000 Foo.__DWARF.__debug_frame
0x000000000000003e dwarf-abbrev [0x000000029babc9ed-0x000000029bac9d5d) rw- 0x8ce689ed 0x0000d370 0x00000000 Foo.__DWARF.__debug_abbrev
0x000000000000003f dwarf-str [0x000000029bac9d5d-0x0000000304a0ff9f) rw- 0x8ce75d5d 0x68f46242 0x00000000 Foo.__DWARF.__debug_str
0x0000000000000040 apple-namespaces [0x0000000304a0ff9f-0x0000000304c71dbf) rw- 0xf5dbbf9f 0x00261e20 0x00000000 Foo.__DWARF.__apple_namespac
0x0000000000000041 apple-names [0x0000000304c71dbf-0x00000003195935e7) rw- 0xf601ddbf 0x14921828 0x00000000 Foo.__DWARF.__apple_names
0x0000000000000042 apple-types [0x00000003195935e7-0x0000000321936157) rw- 0x0a93f5e7 0x083a2b70 0x00000000 Foo.__DWARF.__apple_types
0x0000000000000043 apple-objc [0x0000000321936157-0x0000000321bb310f) rw- 0x12ce2157 0x0027cfb8 0x00000000 Foo.__DWARF.__apple_objc
After the fix LLDB shows this:
(lldb) image dump sections Foo
SectID Type File Address Perm File Off. File Size Flags Section Name
------------------ ---------------------- --------------------------------------- ---- ---------- ---------- ---------- ----------------------------
0x0000000000000600 container [0x000000010edb4000-0x0000000132e70000) r-- 0x00002000 0x240bb056 0x00000000 Foo.__LINKEDIT
0x0000000000000700 container [0x0000000132e70000-0x0000000321bb4000) rw- 0x2421c000 0x1eed4310f 0x00000000 Foo.__DWARF
0x0000000000000037 swift-modules [0x0000000132e70000-0x00000001e2ced690) rw- 0x2421c000 0xafe7d690 0x00000000 Foo.__DWARF.__swift_ast
0x0000000000000038 dwarf-line [0x00000001e2ced690-0x0000000207e4b481) rw- 0xd4099690 0x2515ddf1 0x00000000 Foo.__DWARF.__debug_line
0x0000000000000039 dwarf-info [0x0000000207e4b481-0x0000000285cb06fb) rw- 0xf91f7481 0x7de6527a 0x00000000 Foo.__DWARF.__debug_info
0x000000000000003a dwarf-aranges [0x0000000285cb06fb-0x000000028931590b) rw- 0x17705c6fb 0x03665210 0x00000000 Foo.__DWARF.__debug_aranges
0x000000000000003b dwarf-ranges [0x000000028931590b-0x000000029007d6cb) rw- 0x17a6c190b 0x06d67dc0 0x00000000 Foo.__DWARF.__debug_ranges
0x000000000000003c dwarf-loc [0x000000029007d6cb-0x000000029bab2b75) rw- 0x1814296cb 0x0ba354aa 0x00000000 Foo.__DWARF.__debug_loc
0x000000000000003d dwarf-frame [0x000000029bab2b75-0x000000029babc9ed) rw- 0x18ce5eb75 0x00009e78 0x00000000 Foo.__DWARF.__debug_frame
0x000000000000003e dwarf-abbrev [0x000000029babc9ed-0x000000029bac9d5d) rw- 0x18ce689ed 0x0000d370 0x00000000 Foo.__DWARF.__debug_abbrev
0x000000000000003f dwarf-str [0x000000029bac9d5d-0x0000000304a0ff9f) rw- 0x18ce75d5d 0x68f46242 0x00000000 Foo.__DWARF.__debug_str
0x0000000000000040 apple-namespaces [0x0000000304a0ff9f-0x0000000304c71dbf) rw- 0x1f5dbbf9f 0x00261e20 0x00000000 Foo.__DWARF.__apple_namespac
0x0000000000000041 apple-names [0x0000000304c71dbf-0x00000003195935e7) rw- 0x1f601ddbf 0x14921828 0x00000000 Foo.__DWARF.__apple_names
0x0000000000000042 apple-types [0x00000003195935e7-0x0000000321936157) rw- 0x20a93f5e7 0x083a2b70 0x00000000 Foo.__DWARF.__apple_types
0x0000000000000043 apple-objc [0x0000000321936157-0x0000000321bb310f) rw- 0x212ce2157 0x0027cfb8 0x00000000 Foo.__DWARF.__apple_objc
Note that in the "Prior to the fix" case the file offsets keep only their bottom 32 bits, so they are too small by 1 or 2 multiples of 4GB. In the "After the fix" case the offsets are properly increasing. LLDB wasn't able to load the debug info prior to the fix and it is able to load it after.
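The truncation in the two dumps can be reproduced with plain integer arithmetic. The following is only an illustrative sketch: `TruncateToSection64Offset` is a hypothetical helper, not LLDB code, and the constants are copied from the `__debug_info` and `__debug_aranges` rows of the dumps.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLDB): models what happens when a real
// 64-bit file offset is stored in the 32-bit section_64 offset field.
constexpr uint32_t TruncateToSection64Offset(uint64_t real_file_offset) {
  return static_cast<uint32_t>(real_file_offset); // keeps the low 32 bits
}

// __debug_info starts at 0xf91f7481 and is 0x7de6527a bytes long, so it ends
// (and __debug_aranges begins) past the 4GB mark.
constexpr uint64_t kDebugInfoEnd = 0xf91f7481ull + 0x7de6527aull;
static_assert(kDebugInfoEnd == 0x17705c6fbull, "end of __debug_info");

// The stored offset of __debug_aranges loses the high bits, which is the
// value shown in the dump taken before the fix.
static_assert(TruncateToSection64Offset(kDebugInfoEnd) == 0x7705c6fbu,
              "truncated __debug_aranges offset");
```

The checks run at compile time, so the snippet only needs to compile to confirm the numbers.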
nit: (1 << 32) -> (1ULL << 32)
clang is very likely to treat 1 as int
Right, I think this won't work if the section size is > 8GB, which is not the case in your examples (I know the example is contrived, but not impossible). If the offset + size spills over 4GB, then we need to add 0x100000000ull like we do here. We can see the file offset of __debug_aranges is 0x17705c6fb rather than 0x7705c6fb. But imagine if the size of the prior section __debug_info was 4GB larger. Then we would need to add 0x200000000ull to the file offset of the next section __debug_aranges to get 0x27705c6fb.
In the code, we don't just want to add 0x100000000ull, we want to add 0x100000000ull * N where N is the number of 4GB boundaries that the section crosses. That is basically the math I was attempting to do with the bit mask.
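The "0x100000000ull * N" idea can be written without a loop by keeping only the high 32 bits of the section's end offset. This is a sketch of the arithmetic under discussion, not necessarily the code that landed; `OverflowCarry` is a hypothetical name. It uses a 64-bit mask constant because `1 << 32` would shift a 32-bit `int` by its full width.

```cpp
#include <cstdint>

// Hypothetical helper: how much must be added to the 32-bit offsets of all
// *later* sections, given one section's stored offset and its 64-bit size.
constexpr uint64_t OverflowCarry(uint32_t stored_offset, uint64_t size) {
  // Widen before adding so the sum itself cannot wrap at 32 bits.
  const uint64_t end_offset = static_cast<uint64_t>(stored_offset) + size;
  // Keep only bits 32..63. The result is 0x100000000 * N, where N is the
  // number of 4GB boundaries the end of the section has passed.
  return end_offset & 0xffffffff00000000ull;
}

// __debug_info from the dump crosses one boundary.
static_assert(OverflowCarry(0xf91f7481u, 0x7de6527aull) == 0x100000000ull,
              "one boundary");
// The case imagined in the review: the same section, 4GB larger.
static_assert(OverflowCarry(0xf91f7481u, 0x7de6527aull + 0x100000000ull) ==
                  0x200000000ull,
              "two boundaries");
// __swift_ast ends below 4GB and contributes nothing.
static_assert(OverflowCarry(0x2421c000u, 0xafe7d690ull) == 0, "no boundary");
```

Adding this carry to a running adjustment handles both the many-small-sections case and a single section larger than 4GB.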
Imagine there are two sections:
__sec_a, size 8GB, offset 0
__sec_b, size 100, offset 0 (because of overflow when it should be 0x2_0000_0000)
Current 0x100000000ull (4G) increment will think that __sec_b starts at 4G instead of 8G.
Imagine there are two sections:
__sec_a, size 8GB, offset 0
__sec_b, size 100, offset 0 (because of overflow when it should be 0x2_0000_0000)
Current 0x100000000ull (4G) increment will think that __sec_b starts at 4G instead of 8G.
I believe this is right. Any section after a >8GB section will not be handled correctly. But a file with many sections all less than 4GB, but totaling 8+GB in sum, will be handled correctly.
It can't be fixed, the mach-o file format doesn't support 64 bit section offsets. The section offset is only 32 bit. |
Sorry, I mean the section offset can't be fixed. I am not familiar with the obj2yaml/yaml2obj stuff. I am sure it can be fixed. |
uint32_t segment_sect_idx;
const lldb::user_id_t first_segment_sectID = context.NextSectionIdx + 1;
+ // dSYM files can create sections whose data exceeds the 4GB barrier, but
This is a problem of a Mach-O file being larger than 4GB, not a problem with a segment that is larger than 4GB. In fact if it is the final section of a file, there are no problems with a >4GB section.
To recap, a Mach-O file has zero or more Segments, and a Segment has zero or more Sections. A typical segment might be __TEXT or __DATA. A typical section might be __text or __eh_frame. The naming convention is opposite from ELF so I wanted to call it out explicitly.
The segment_command_64 structure from loader.h has a 64-bit fileoff and filesize for the Segment. The section_64 structure from loader.h has a 64-bit virtual address and size, but a 32-bit file offset (oops, probably) and file offset to the relocation entries. lldb doesn't care about the latter. This file offset in section_64 is from the start of the file, not the start of a segment, so if the start of this section in the file is greater than 4GB of the way through the file, you'll overflow the offset.
If a section starts within the first 4GB of the file, and extends past that point, that's fine. It's only if there is a section after this one that we will hit the problem.
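The field widths described here can be laid out side by side. The structs below are illustrative mirrors of the `segment_command_64` and `section_64` layouts; the `*Layout` names are hypothetical, chosen so they do not clash with the real declarations, and the protection fields are written as plain `int32_t`.

```cpp
#include <cstdint>

// Illustrative mirror of segment_command_64 (hypothetical name).
struct SegmentCommand64Layout {
  uint32_t cmd;
  uint32_t cmdsize;
  char segname[16];
  uint64_t vmaddr;
  uint64_t vmsize;
  uint64_t fileoff;  // 64-bit: a segment can start anywhere in the file
  uint64_t filesize; // 64-bit
  int32_t maxprot;
  int32_t initprot;
  uint32_t nsects;
  uint32_t flags;
};

// Illustrative mirror of section_64 (hypothetical name).
struct Section64Layout {
  char sectname[16];
  char segname[16];
  uint64_t addr;   // 64-bit virtual address
  uint64_t size;   // 64-bit size
  uint32_t offset; // 32-bit file offset, measured from the start of the file
  uint32_t align;
  uint32_t reloff; // 32-bit file offset of the relocation entries
  uint32_t nreloc;
  uint32_t flags;
  uint32_t reserved1;
  uint32_t reserved2;
  uint32_t reserved3;
};

static_assert(sizeof(SegmentCommand64Layout) == 72, "segment command size");
static_assert(sizeof(Section64Layout) == 80, "section header size");
static_assert(sizeof(Section64Layout::offset) == 4, "offset is 32 bits");
```

The 80-byte section header matches the 0x50 stride between consecutive rows in the FILE OFF column of the macho.py dump, and the extra 0x48 at each segment change matches the 72-byte segment command.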
"the 4GB barrier" reads a little odd to me, just my two cents, but I would probably phrase it more simply as "file larger than 4GB". 4GB barrier makes it sound more exciting somehow lol, like we're breaking the sound barrier over here.
|
FWIW a month ago I had a PR #159849 where lldb loads some mach-o structures into llvm::MachO representations, and then adds a virtual memory slide to them which could exceed 4GB. I made local copies of the structures (e.g. |
You could have a tool that creates a mach-o binary, I've written a few similar utilities in the testsuite, mostly mach-o corefile creators. But the problem is that you'll need to create a >4GB file in the test build directory. I might be mistaken, but it sounds risky to assume this is OK on all of our various CI environments. Even if yaml2obj could do this, you'd be looking at a 2*4GB binary checked in to the testsuite - an even less good idea. |
+ // If this section overflows a 4GB barrier, then we need to adjust any
+ // subsequent the section offsets.
+ if (is_dsym && ((uint64_t)sect64.offset + sect64.size) >= UINT32_MAX)
+ section_offset_adjust += 0x100000000ull;
const uint64_t section_file_offset = sect64.offset + section_offset_adjust;
if (is_dsym && ((uint64_t)sect64.offset + sect64.size) >= UINT32_MAX)
section_offset_adjust += 0x100000000ull;
I wouldn't check is_dsym. If this is correct, it is correct for any mach-o file that exceeds 4GB; the fact that we only see this in practice in dSYMs doesn't mean we should restrict it here.
|
All inline suggestions have been fixed. I also tried to make a minimal yaml file that could create a large enough file, but it complains when trying to emit the file as it runs into the 32 bit section offset limitation and refuses to make a bad mach-o file. So no easy way to test this with yaml. |
I also tried to make a minimal yaml file that could create a large enough file, but it complains when trying to emit the file as it runs into the 32 bit section offset limitation and refuses to make a bad mach-o file. So no easy way to test this with yaml.
Since it is difficult to generate an invalid Mach-O, I wonder if in this case a compressed file that's mostly zeros can be added to the repository. Uncompress during the test, hope the contributor has the 4 GBs to spare, then delete as one of the last things on the test (because nobody wants a silly 4 GB in their system).
llvm-dsymutil can produce mach-o files where some sections in __DWARF exceed the 4GB barrier and subsequent sections in the dSYM will be inaccessible because the mach-o section_64 structure only has a 32 bit file offset. This patch enables LLDB to load a large dSYM file by figuring out when this happens and properly adjusting the file offset of the LLDB sections. I was unable to add a test as obj2yaml and yaml2obj are broken for mach-o files and they can't convert a yaml file back into a valid mach-o object file. Any suggestions for adding a test would be appreciated.
- Fix a case where a section can be larger than 4GB
- Fix comments to be a bit more clear
- Don't only do this for dSYM files
The binary is as minimal as possible and it contains 1 segment named "__DWARF" with 3 sections:
FILE OFF INDEX ADDRESS SIZE OFFSET ALIGN RELOFF NRELOC FLAGS RESERVED1 RESERVED2 RESERVED3 NAME
=========== ===== ------------------ ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------------------
0x00000068: [ 1] 0x00000000fffffff0 0x0000000000000020 0xfffffff0 0x00000002 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DWARF.__debug_abbrev
0x000000b8: [ 2] 0x0000000100000010 0x0000000200000000 0x00000010 0x00000002 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DWARF.__debug_info
0x00000108: [ 3] 0x0000000300000010 0x0000000000000020 0x00000010 0x00000002 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 __DWARF.__debug_line
The file offsets should be parsed correctly by LLDB as:
__debug_abbrev file_offset=0x00000000fffffff0
__debug_info file_offset=0x0000000100000010
__debug_line file_offset=0x0000000300000010
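The expected offsets for this three-section binary can be checked with a small model of the carry-forward logic. This is a simplified sketch, not the code in ObjectFileMachO.cpp: `SectionInfo`, `ResolvedFileOffset`, and `kTestBinary` are hypothetical names, and the model assumes the sections appear in file order and that each one has file data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified view of a section: only the fields that matter.
struct SectionInfo {
  uint32_t stored_offset; // 32-bit file offset as found in the section header
  uint64_t size;          // 64-bit section size
};

// Recompute the real file offset of sections[index] by carrying the 4GB
// overflow of every earlier section forward.
constexpr uint64_t ResolvedFileOffset(const SectionInfo *sections,
                                      size_t index) {
  uint64_t adjust = 0;
  for (size_t i = 0; i < index; ++i) {
    const uint64_t end =
        static_cast<uint64_t>(sections[i].stored_offset) + sections[i].size;
    adjust += end & 0xffffffff00000000ull; // whole multiples of 4GB only
  }
  return sections[index].stored_offset + adjust;
}

// OFFSET and SIZE columns of the three sections in the test binary.
constexpr SectionInfo kTestBinary[] = {
    {0xfffffff0u, 0x0000000000000020ull}, // __debug_abbrev
    {0x00000010u, 0x0000000200000000ull}, // __debug_info
    {0x00000010u, 0x0000000000000020ull}, // __debug_line
};

static_assert(ResolvedFileOffset(kTestBinary, 0) == 0x00000000fffffff0ull,
              "__debug_abbrev");
static_assert(ResolvedFileOffset(kTestBinary, 1) == 0x0000000100000010ull,
              "__debug_info");
static_assert(ResolvedFileOffset(kTestBinary, 2) == 0x0000000300000010ull,
              "__debug_line");
```

The second section is exactly 8GB, so this layout exercises both the boundary-crossing case and the larger-than-4GB case raised in review.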