gdb warns about our llvm6.0 fullaot dwarf data (linux) #8806

Closed
jaykrell opened this issue May 22, 2018 · 19 comments · Fixed by #19770, dotnet/runtime#36320, #20031 or dotnet/runtime#38454

Comments

@jaykrell
Contributor

jaykrell commented May 22, 2018

linux
running llvm 6.0 fullaot llvm system.xml tests subset

jay@ubuntu:/s/mono/mcs/class/System.XML$ pwd

/s/mono/mcs/class/System.XML

jay@ubuntu:/s/mono/mcs/class/System.XML$ PATH="/s/mono/runtime/_tmpinst/bin:/i/monollvm60opt/bin:/i/monollvm/bin:/i/monollvm-deb/bin:/home/jay/Komodo-Edit-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin" MONO_REGISTRY_PATH="/home/jay/.mono/registry" MONO_TESTS_IN_PROGRESS="yes" MONO_PATH="./../../class/lib/testing_aot_full/:./../../class/lib/testing_aot_full//tests:.:$MONO_PATH"  gdb --args  /s/mono/mono/mini/mono    ./../../class/lib/testing_aot_full/nunit-lite-console.exe  ./../../class/lib/testing_aot_full/tests/testing_aot_full_System.Xml_test.dll      -test=MonoTests.System.XmlSerialization.XmlSerializerTests.TestSerializeZeroFlagEnum_InvalidValue 

gdb issues diagnostic/warning:

DW_FORM_strp pointing outside of .debug_str section [in module /s/mono/mcs/class/lib/testing_aot_full/tests/testing_aot_full_System.Xml_test.dll.so]

Presumably the expected behavior is that gdb emits no warnings about our DWARF data, though this one may be harmless.

@jaykrell changed the title from "gdb warns about our llvm6.0 fullaot dwarf data" to "gdb warns about our llvm6.0 fullaot dwarf data (linux)" on May 22, 2018
@jaykrell
Contributor Author

Hm, indeed: gdb cannot walk the stack from System_Xml_Serialization_XmlCustomFormatter_FromEnum_long_string___long___string when using LLVM 6.0 fullaot, but it can with regular mini fullaot.

i.e. break System_Xml_Serialization_XmlCustomFormatter_FromEnum_long_string___long___string
run
bt

@jaykrell
Contributor Author

Clarification, it can walk one level, to System_Xml_Serialization_EnumMap_GetXmlName_string_object, and that's it.

@mathieubourgeois
Contributor

I've been investigating for a while what I think might be the same issue, and I think I may know what the issue is.

I've been having issues with Xamarin.Android, where I would like to generate DWARF symbols for our AOT+LLVM application, send the DWARF info to Crashlytics and strip the resulting binary. I can easily do the part with generating the symbols and stripping the binary afterwards. However, the resulting DWARF that I get seems to be broken :

  • dwarfdump mentions dwarfdump ERROR: Cannot get a formstr (or a formstrp)....: DW_DLE_STRP_OFFSET_BAD(204) (204)
  • objdump --dwarf gives a bit more info :
  Compilation Unit @ offset 0x2f5:
   Length:        0x52e8 (32-bit)
   Version:       2
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><300>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <301>   DW_AT_producer    :objdump: Warning: DW_FORM_strp offset too big: 6f6e6f4d
 (indirect string, offset: 0x6f6e6f4d): <offset is too big>
    <305>   DW_AT_language    : 16672   (Unknown: 4120)
    <307>   DW_AT_name        :objdump: Warning: DW_FORM_strp offset too big: 4320544f
 (indirect string, offset: 0x4320544f): <offset is too big>
    <30b>   DW_AT_stmt_list   : 0x69706d6f
    <30f>   DW_AT_comp_dir    :objdump: Warning: DW_FORM_strp offset too big: 2072656c
 (indirect string, offset: 0x2072656c): <offset is too big>
    <313>   DW_AT_GNU_pubnames: 54
    <314>   DW_AT_low_pc      : 0x30322820302e382e
    <31c>   DW_AT_high_pc     : 0x30632f30312d3931
 <1><324>: Abbrev Number: 99
objdump: Warning: DIE at offset 0x324 refers to abbreviation number 99 which does not exist

From there, I noticed that the problematic offsets are actually ASCII characters. More specifically, they are the bytes of the producer string "Mono AOT Compiler 6.8.0 (2019-10/c0c5c78e2bd)" that was emitted as part of the compile unit DIE, but they are not being decoded as a string.
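The ASCII observation can be checked mechanically. The Python sketch below takes the values objdump printed for the DIE at <300>, re-encodes each one as little-endian bytes, and joins the result. The field widths (4 bytes for DW_FORM_strp and DW_FORM_data4, 2 for DW_FORM_data2, 1 for DW_FORM_flag, 8 for DW_FORM_addr) and the little-endian byte order are assumptions about this 64-bit target, and the names in the code are illustrative only.

```python
# Values printed by objdump for the DIE at <300>, paired with the byte width
# of the form that objdump appears to have applied to each one.
# The widths are assumptions (32-bit DWARF, 8-byte addresses).
misdecoded_fields = [
    (0x6F6E6F4D, 4),          # DW_AT_producer,     DW_FORM_strp
    (16672, 2),               # DW_AT_language,     DW_FORM_data2
    (0x4320544F, 4),          # DW_AT_name,         DW_FORM_strp
    (0x69706D6F, 4),          # DW_AT_stmt_list,    DW_FORM_data4
    (0x2072656C, 4),          # DW_AT_comp_dir,     DW_FORM_strp
    (54, 1),                  # DW_AT_GNU_pubnames, DW_FORM_flag
    (0x30322820302E382E, 8),  # DW_AT_low_pc,       DW_FORM_addr
    (0x30632F30312D3931, 8),  # DW_AT_high_pc,      DW_FORM_addr
]


def reassemble(fields):
    """Re-encode each value as little-endian bytes and decode the run as ASCII."""
    raw = b"".join(value.to_bytes(width, "little") for value, width in fields)
    return raw.decode("ascii")


recovered = reassemble(misdecoded_fields)
print(recovered)
```

The printed text is the first 35 characters of the producer string, which is consistent with the string bytes having been consumed as if they were fixed-size attribute values.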

I then noticed that when we compile with LLVM, we actually link two assembly files, one from Mono and one from LLVM, so we end up with two .debug_abbrev tables. objdump confirms this:

Contents of the .debug_abbrev section:

  Number TAG (0x0)
   1      DW_TAG_compile_unit    [has children]
    DW_AT_producer     DW_FORM_strp
    DW_AT_language     DW_FORM_data2
    DW_AT_name         DW_FORM_strp
    DW_AT_stmt_list    DW_FORM_data4
    DW_AT_comp_dir     DW_FORM_strp
    DW_AT_GNU_pubnames DW_FORM_flag
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT value: 0     DW_FORM value: 0
   2      DW_TAG_subprogram    [no children]
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_frame_base   DW_FORM_block1
    DW_AT_MIPS_linkage_name DW_FORM_strp
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   3      DW_TAG_subprogram    [no children]
    DW_AT_MIPS_linkage_name DW_FORM_strp
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data1
    DW_AT_inline       DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   4      DW_TAG_subprogram    [has children]
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_frame_base   DW_FORM_block1
    DW_AT_MIPS_linkage_name DW_FORM_strp
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   5      DW_TAG_inlined_subroutine    [no children]
    DW_AT_abstract_origin DW_FORM_ref4
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_call_file    DW_FORM_data1
    DW_AT_call_line    DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   6      DW_TAG_subprogram    [no children]
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_frame_base   DW_FORM_block1
    DW_AT_abstract_origin DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
   7      DW_TAG_subprogram    [no children]
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_frame_base   DW_FORM_block1
    DW_AT_MIPS_linkage_name DW_FORM_strp
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data2
    DW_AT value: 0     DW_FORM value: 0
   8      DW_TAG_subprogram    [no children]
    DW_AT_MIPS_linkage_name DW_FORM_strp
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data2
    DW_AT_inline       DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   9      DW_TAG_subprogram    [has children]
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_frame_base   DW_FORM_block1
    DW_AT_MIPS_linkage_name DW_FORM_strp
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data2
    DW_AT value: 0     DW_FORM value: 0
   10      DW_TAG_inlined_subroutine    [no children]
    DW_AT_abstract_origin DW_FORM_ref4
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_call_file    DW_FORM_data1
    DW_AT_call_line    DW_FORM_data2
    DW_AT value: 0     DW_FORM value: 0
  Number TAG (0xb2)
   1      DW_TAG_compile_unit    [has children]
    DW_AT_producer     DW_FORM_string
    DW_AT_name         DW_FORM_string
    DW_AT_comp_dir     DW_FORM_string
    DW_AT_language     DW_FORM_data1
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_stmt_list    DW_FORM_data4
    DW_AT value: 0     DW_FORM value: 0
   2      DW_TAG_subprogram    [has children]
    DW_AT_name         DW_FORM_string
    DW_AT_MIPS_linkage_name DW_FORM_string
    DW_AT_decl_file    DW_FORM_udata
    DW_AT_decl_line    DW_FORM_udata
    DW_AT_description  DW_FORM_string
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT_frame_base   DW_FORM_block1
    DW_AT value: 0     DW_FORM value: 0
   3      DW_TAG_formal_parameter    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_type         DW_FORM_ref4
    DW_AT_location     DW_FORM_block1
    DW_AT value: 0     DW_FORM value: 0
   15      DW_TAG_formal_parameter    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_type         DW_FORM_ref4
    DW_AT_location     DW_FORM_data4
    DW_AT value: 0     DW_FORM value: 0
   4      DW_TAG_base_type    [no children]
    DW_AT_byte_size    DW_FORM_data1
    DW_AT_encoding     DW_FORM_data1
    DW_AT_name         DW_FORM_string
    DW_AT value: 0     DW_FORM value: 0
   5      DW_TAG_class_type    [has children]
    DW_AT_name         DW_FORM_string
    DW_AT_byte_size    DW_FORM_udata
    DW_AT value: 0     DW_FORM value: 0
   17      DW_TAG_class_type    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_byte_size    DW_FORM_udata
    DW_AT value: 0     DW_FORM value: 0
   6      DW_TAG_member    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_type         DW_FORM_ref4
    DW_AT_data_member_location DW_FORM_block1
    DW_AT value: 0     DW_FORM value: 0
   7      DW_TAG_typedef    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_type         DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
   8      DW_TAG_enumeration_type    [has children]
    DW_AT_name         DW_FORM_string
    DW_AT_byte_size    DW_FORM_udata
    DW_AT_type         DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
   9      DW_TAG_enumerator    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_const_value  DW_FORM_sdata
    DW_AT value: 0     DW_FORM value: 0
   10      DW_TAG_namespace    [has children]
    DW_AT_name         DW_FORM_string
    DW_AT value: 0     DW_FORM value: 0
   11      DW_TAG_variable    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_type         DW_FORM_ref4
    DW_AT_location     DW_FORM_block1
    DW_AT value: 0     DW_FORM value: 0
   12      DW_TAG_variable    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_type         DW_FORM_ref4
    DW_AT_location     DW_FORM_data4
    DW_AT value: 0     DW_FORM value: 0
   13      DW_TAG_pointer_type    [no children]
    DW_AT_type         DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
   14      DW_TAG_reference_type    [no children]
    DW_AT_type         DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
   16      DW_TAG_inheritance    [no children]
    DW_AT_type         DW_FORM_ref4
    DW_AT_data_member_location DW_FORM_block1
    DW_AT value: 0     DW_FORM value: 0
   18      DW_TAG_subprogram    [no children]
    DW_AT_name         DW_FORM_string
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_addr
    DW_AT value: 0     DW_FORM value: 0

However, the Mono abbreviation table is placed after the LLVM one in the merged section (at offset 0xb2), and the previous objdump output makes it clear that the Mono compilation unit is not using that table:

Abbrev Offset: 0x0

even though the offset should really be 0xb2 for the unit to be interpreted properly. To compare, I looked at the assembly that LLVM generates for the .debug_abbrev offset:

LLVM

	.section	.debug_info,"",@progbits
.Lcu_begin0:
	.word	753
	.hword	2
	.word	.debug_abbrev
	.byte	8
	.byte	1
	.word	.Linfo_string0
....

Mono

.section ".debug_info"
.subsection 0
.Ldebug_info_start:

.LDIFF_SYM0=.Ldebug_info_end - .Ldebug_info_begin
	.long .LDIFF_SYM0
.Ldebug_info_begin:

	.hword 2
	.long 0
	.byte 8,1
	.string "Mono AOT Compiler 6.8.0 (2019-10/c0c5c78e2bd)"

The difference is that LLVM emits a .word that references the .debug_abbrev section symbol, so the value is relocated at link time, whereas Mono emits a .long with the absolute value 0. Because we hard-code 0, it looks like the unit will always claim offset 0 of the .debug_abbrev section however the linker merges the sections, even though our table may have been placed elsewhere. To test this, I modified the Mono output to match the LLVM behavior:
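The effect described above can be illustrated with a toy model of section merging. This is a deliberately simplified sketch, not the behavior of any particular linker: it assumes the linker concatenates the .debug_abbrev contributions in input order and adds the contribution's base offset only to values that carry a relocation. The class and function names are hypothetical, and the Mono size is illustrative since only the ordering matters.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    name: str         # which object file the bytes come from
    abbrev_size: int  # size of its .debug_abbrev contribution, in bytes


def merged_bases(contributions):
    """Return the offset at which each contribution starts in the merged section."""
    bases, offset = {}, 0
    for contribution in contributions:
        bases[contribution.name] = offset
        offset += contribution.abbrev_size
    return bases


def cu_abbrev_offset(bases, name, relocated):
    """Value that ends up in the CU header's debug_abbrev_offset field."""
    addend = 0  # both producers mean "the start of my own table"
    if relocated:
        return bases[name] + addend  # the linker adjusts the field
    return addend                    # a hard-coded constant is left untouched


# LLVM's table comes first and is 0xb2 bytes long, per the objdump output.
bases = merged_bases([Contribution("llvm", 0xB2), Contribution("mono", 196)])

print(hex(cu_abbrev_offset(bases, "mono", relocated=False)))
print(hex(cu_abbrev_offset(bases, "mono", relocated=True)))
```

In this model the hard-coded value keeps pointing at the first table in the merged section, while the relocated value follows the Mono table to 0xb2.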

.section ".debug_abbrev"
.subsection 0
.Ldebug_abbrev_start:

	.byte 1,17,1,37,8,3,8,27,8,19,11,17,1,18,1,16,6,0,0,2,46,1,3,8,135,64,8,58,15,59,15,90
	.byte 8,17,1,18,1,64,10,0,0,3,5,0,3,8,73,19,2,10,0,0,15,5,0,3,8,73,19,2,6,0,0,4
	.byte 36,0,11,11,62,11,3,8,0,0,5,2,1,3,8,11,15,0,0,17,2,0,3,8,11,15,0,0,6,13,0,3
	.byte 8,73,19,56,10,0,0,7,22,0,3,8,73,19,0,0,8,4,1,3,8,11,15,73,19,0,0,9,40,0,3,8
	.byte 28,13,0,0,10,57,1,3,8,0,0,11,52,0,3,8,73,19,2,10,0,0,12,52,0,3,8,73,19,2,6,0
	.byte 0,13,15,0,73,19,0,0,14,16,0,73,19,0,0,16,28,0,73,19,56,10,0,0,18,46,0,3,8,17,1,18
	.byte 1,0,0,0
.section ".debug_info"
.subsection 0
.Ldebug_info_start:

.LDIFF_SYM0=.Ldebug_info_end - .Ldebug_info_begin
	.long .LDIFF_SYM0
.Ldebug_info_begin:

	.hword 2
	.word	.Ldebug_abbrev_start
	.byte 8,1
	.string "Mono AOT Compiler 6.8.0 (2019-10/c0c5c78e2bd)"

That is, I manually added a label at the start of .debug_abbrev and emitted a word referencing that label instead of .long 0. I re-ran as and ld, and the generated DWARF now works: dwarfdump no longer fails and provides all the relevant information.
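To confirm that the .byte values above really are the Mono abbreviation table that objdump lists at 0xb2, they can be decoded by hand. The Python sketch below is a minimal .debug_abbrev reader written for this purpose (the function names are illustrative, and it assumes a single well-formed table terminated by a zero code). Each entry is a ULEB128 code, a ULEB128 tag, a one-byte children flag, and then ULEB128 (attribute, form) pairs ending in 0, 0.

```python
def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def parse_abbrev_table(data):
    """Parse one table into {code: (tag, has_children, [(attr, form), ...])}."""
    table, pos = {}, 0
    while True:
        code, pos = read_uleb128(data, pos)
        if code == 0:  # a zero abbreviation code terminates the table
            return table
        tag, pos = read_uleb128(data, pos)
        has_children = data[pos] != 0
        pos += 1
        specs = []
        while True:
            attr, pos = read_uleb128(data, pos)
            form, pos = read_uleb128(data, pos)
            if attr == 0 and form == 0:  # end of this entry's attribute list
                break
            specs.append((attr, form))
        table[code] = (tag, has_children, specs)


# The .byte values from the Mono-generated .debug_abbrev section.
MONO_ABBREV = bytes([
    1,17,1,37,8,3,8,27,8,19,11,17,1,18,1,16,6,0,0,2,46,1,3,8,135,64,8,58,15,59,15,90,
    8,17,1,18,1,64,10,0,0,3,5,0,3,8,73,19,2,10,0,0,15,5,0,3,8,73,19,2,6,0,0,4,
    36,0,11,11,62,11,3,8,0,0,5,2,1,3,8,11,15,0,0,17,2,0,3,8,11,15,0,0,6,13,0,3,
    8,73,19,56,10,0,0,7,22,0,3,8,73,19,0,0,8,4,1,3,8,11,15,73,19,0,0,9,40,0,3,8,
    28,13,0,0,10,57,1,3,8,0,0,11,52,0,3,8,73,19,2,10,0,0,12,52,0,3,8,73,19,2,6,0,
    0,13,15,0,73,19,0,0,14,16,0,73,19,0,0,16,28,0,73,19,56,10,0,0,18,46,0,3,8,17,1,18,
    1,0,0,0,
])

table = parse_abbrev_table(MONO_ABBREV)
tag, has_children, specs = table[1]
print(hex(tag), has_children, [(hex(attr), hex(form)) for attr, form in specs])
```

Abbreviation 1 decodes to tag 0x11 (DW_TAG_compile_unit) whose first attribute is 0x25 (DW_AT_producer) with form 0x08 (DW_FORM_string), which matches the second table in the objdump listing rather than the first. The pair 135,64 in entry 2 is the ULEB128 encoding of 0x2007 (DW_AT_MIPS_linkage_name).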

I've been looking at fixing this in mono_dwarf_writer_emit_base_info. However, I couldn't find any facility to emit a word for a specific label. I'm willing to run some tests in my free time to try to resolve it, but I'm not sure what the best way to emit the required code would be. I'll try something soon, but if anyone knows how best to accomplish that, I'd gladly take the help.

@akoeplinger
Member

akoeplinger commented May 12, 2020

@lambdageek @vargaz do you have some suggestions for @mathieubourgeois ?

@mathieubourgeois
Contributor

mathieubourgeois commented May 12, 2020

FYI, I just tried to look for the same signs on the initial issue (Linux, fullaot) by running

mono --debug --llvm --aot=full,save-temps,nodebug,dwarfdebug,no-write-symbols,asmwriter,llvm-path=<path-to-llvm>,temp-path=./temp ./System.IO.Compression.FileSystem.dll

and I found the same symptoms in an objdump of the .so (with the same consequence that dwarfdump fails) :

<1><2c6>: Abbrev Number: 0
Compilation Unit @ offset 0x2c7:
Length:        0x52ad (32-bit)
Version:       2
Abbrev Offset: 0x0
Pointer Size:  8
<0><2d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
<2d3>   DW_AT_producer    :objdump: Warning: DW_FORM_strp offset too big: 6f6e6f4d
(indirect string, offset: 0x6f6e6f4d):
<2d7>   DW_AT_language    : 16672   (Unknown: 4120)
<2d9>   DW_AT_name        :objdump: Warning: DW_FORM_strp offset too big: 4320544f
(indirect string, offset: 0x4320544f):
<2dd>   DW_AT_stmt_list   : 0x69706d6f
<2e1>   DW_AT_comp_dir    :objdump: Warning: DW_FORM_strp offset too big: 2072656c
(indirect string, offset: 0x2072656c):
<2e5>   DW_AT_GNU_pubnames: 54
<2e6>   DW_AT_low_pc      : 0x6d2820302e33312e
<2ee>   DW_AT_ranges      : 0x65747361
<1><2f2>: Abbrev Number: 114
objdump: Warning: DIE at offset 0x2f2 refers to abbreviation number 114 which does not exist

Again, the offset is 0x0, but should be 0x2b, so it does seem to be the same issue. Without full-aot the issue doesn't occur here, but the case I'm hitting on Android doesn't require full-aot. What the two cases seem to have in common is that the LLVM code is written to assembly first instead of directly to an object file (no clue why that would have an effect, though). Either way, using a relocatable label for the offset into the .debug_abbrev section seems like the safer option.

@mathieubourgeois
Contributor

From what I read of the DWARF specification, we should still emit a .long (which is appropriate, since the field is always meant to be 4 bytes long). However, we currently have no way to emit a label as a .long.

I did do a fix by adding such an ability and it works on linux+fullaot as well.

monojenkins pushed a commit to monojenkins/runtime that referenced this issue May 13, 2020
When emitting the DWARF that starts a compilation unit in .debug_info, the standard expects a 4-byte offset into the .debug_abbrev section. Mono has always emitted an offset of 0.

However, this doesn't work in every case. With linux+fullaot, we link two object files (one from Mono, one from LLVM), and both have their own .debug_abbrev section. If we hard-code 0, it seems the offset stays 0 however the linker lays out the merged section, so the unit can end up using the wrong abbreviation table (i.e. the one from the LLVM assembly instead of the one from the Mono assembly). The consequence is that the linked file is not valid DWARF (dwarfdump and objdump will complain about invalid offsets). At best, some tools will be able to work with part of what we have, but any program requiring entirely valid DWARF will fail.

To fix this, we generate a label at the start of our .debug_abbrev section and emit the offset as a long referencing that label. This matches the existing behavior seen in the LLVM-generated code, and makes dwarfdump and objdump handle the linked product properly.

## Notes

I tested this fix in two ways :
- Linux + fullaot
- Xamarin.Android (which was my initial issue to begin with)
  - I didn't test the emission of the label (I didn't want to build Xamarin.Android in full to try the change)
  - However, I did replicate the change myself on the generated .s file and applied the commands normally done by Xamarin.Android to confirm that the fix worked properly (that's how I found the solution in the first place)

I'm not 100% convinced this is the right fix, since the DWARF spec mentions an offset in the debug_abbrev section and I'm not sure if they mean relative to the start of the debug_abbrev section or an address in the file (I would guess that if it's 0, they treat it as relative and absolute otherwise but I'm really not sure).

There was no existing functionality for emitting the label, so I added one, based on emit_symbol_diff.

## Fixes

Fixes mono/mono#8806
akoeplinger pushed a commit that referenced this issue May 13, 2020
…f 0 (#19770)

Fixes #8806
akoeplinger pushed a commit to dotnet/runtime that referenced this issue May 13, 2020
…f 0 (#36320)

Fixes mono/mono#8806

Co-authored-by: mathieubourgeois <mathieubourgeois@users.noreply.github.com>
monojenkins pushed a commit to monojenkins/mono that referenced this issue May 16, 2020
Fixes mono#8806
akoeplinger pushed a commit that referenced this issue May 26, 2020
…f 0 (#19794)

Fixes #8806

Co-authored-by: Mathieu Bourgeois <mathieu.bourgeois@gameloft.com>
@dalexsoto
Member

@vargaz The Xamarin iOS mtouch tests regressed, apparently because of the fix for this issue, with the following warnings:

Xamarin.MTouch.BuildWithCulture("sl_SI") : No warnings expected, but got:
Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory105/mtouch-test-cache/arm64/testApp.exe.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory105/mtouch-test-cache/arm64/Xamarin.iOS.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory105/mtouch-test-cache/arm64/System.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory105/mtouch-test-cache/arm64/mscorlib.dll.o	
  at Xamarin.Tests.Tool.AssertNoWarnings () [0x0007c] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.BuildWithCulture (System.String culture) [0x00086] in <446203e45dcf45328d3265c417775e93>:0 
Xamarin.MTouch.BuildWithCulture("ur_IN") : No warnings expected, but got:
Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory107/mtouch-test-cache/arm64/testApp.exe.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory107/mtouch-test-cache/arm64/Xamarin.iOS.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory107/mtouch-test-cache/arm64/System.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory107/mtouch-test-cache/arm64/mscorlib.dll.o	
  at Xamarin.Tests.Tool.AssertNoWarnings () [0x0007c] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.BuildWithCulture (System.String culture) [0x00086] in <446203e45dcf45328d3265c417775e93>:0 
Xamarin.MTouch.LinkedAwayTypesInContainerAppLinker : No warnings expected, but got:
Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory194/mtouch-test-cache/arm64/testApp.exe.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory192/mtouch-test-cache/arm64/testServiceExtension.dll.o	
  at Xamarin.Tests.Tool.AssertNoWarnings () [0x0007c] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.LinkedAwayTypesInContainerAppLinker () [0x000ce] in <446203e45dcf45328d3265c417775e93>:0 
Xamarin.MTouch.MT0095_NotSharedCode : No warnings expected, but got:
Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory272/mtouch-test-cache/arm64/testApp.exe.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory272/mtouch-test-cache/arm64/Xamarin.iOS.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory272/mtouch-test-cache/arm64/System.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory272/mtouch-test-cache/arm64/mscorlib.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory270/mtouch-test-cache/arm64/testServiceExtension.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory270/mtouch-test-cache/arm64/Xamarin.iOS.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory270/mtouch-test-cache/arm64/System.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory270/mtouch-test-cache/arm64/mscorlib.dll.o	
  at Xamarin.Tests.Tool.AssertNoWarnings () [0x0007c] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.MT0095_NotSharedCode () [0x00129] in <446203e45dcf45328d3265c417775e93>:0 
Xamarin.MTouch.MT0095_SharedCode : No warnings expected, but got:
Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory276/mtouch-test-cache/arm64/testApp.exe.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory274/mtouch-test-cache/arm64/testServiceExtension.dll.o	
  at Xamarin.Tests.Tool.AssertNoWarnings () [0x0007c] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.MT0095_SharedCode () [0x00101] in <446203e45dcf45328d3265c417775e93>:0 
Xamarin.MTouch.MT0113_interpreter("-all","-all",null) : warnings
Expected: 0
But was: 2
Warnings:
	warning MT5203: Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory371/mtouch-test-cache/arm64/testApp.exe.o
	warning MT5203: Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory369/mtouch-test-cache/arm64/testServiceExtension.dll.o
  at Xamarin.Tests.Tool.AssertWarningCount (System.Int32 count, System.String message) [0x0009a] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.MT0113_interpreter (System.String app_interpreter, System.String appex_interpreter, System.String msg) [0x000c1] in <446203e45dcf45328d3265c417775e93>:0 
Xamarin.MTouch.WatchOSExtensionsWithExtensions : No warnings expected, but got:
Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory575/mtouch-test-cache/armv7k/testApp.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in /Users/builder/jenkins/workspace/xamarin-macios-pr-builder/tests/mtouch/bin/Debug/tmp-test-dir/Xamarin.Tests.BundlerTool.CreateTemporaryDirectory574/mtouch-test-cache/armv7k/intentsExtension.dll.o	
  at Xamarin.Tests.Tool.AssertNoWarnings () [0x0007c] in <446203e45dcf45328d3265c417775e93>:0 
  at Xamarin.MTouch.WatchOSExtensionsWithExtensions () [0x000b1] in <446203e45dcf45328d3265c417775e93>:0 

@akoeplinger
Member

/cc @mathieubourgeois we'll probably need to revert that patch from 2020-02, unless you have a quick idea of what might be up.

@mathieubourgeois
Contributor

Not offhand, no. If it's possible to extract one of the problematic .o/.s files, I could look directly at the structure and try to figure out what's wrong. Otherwise, I'd need to set myself up just to repro the issue. The only theory I have so far is that this may be a difference in offset handling in the linker between clang and gcc (the initial problem was on Linux/Android, which link via gcc's ld, while iOS works with clang's ld).

@dalexsoto
Member

dalexsoto commented Jun 23, 2020

Hello @mathieubourgeois, inside the arm64 folder you will find the .o/.s files.

No warnings expected, but got:
	Native linking warning: warning: can't parse dwarf compilation unit info in mtouch-test-cache/arm64/testApp.exe.o
	Native linking warning: warning: can't parse dwarf compilation unit info in mtouch-test-cache/arm64/Xamarin.iOS.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in mtouch-test-cache/arm64/System.dll.o
	Native linking warning: warning: can't parse dwarf compilation unit info in mtouch-test-cache/arm64/mscorlib.dll.o	

mtouch-test-cache.zip

@mathieubourgeois
Contributor

mathieubourgeois commented Jun 23, 2020

From running objdump on testApp.exe.o

0000000000000165 __debug_abbrev:
  ...

0000000000000227 __debug_info:
227: 45 01 00 00 --> Length of the compile unit in debug_info (0x145)
22b: 02 00 65 01 --> First two bytes are the DWARF version (2); last two bytes are the low half of the debug_abbrev offset
22f: 00 00 08 01 --> First two bytes are the high half of the debug_abbrev offset; 08 is the address size

So the abbrev_offset is 0x00000165, which is exactly the offset of the .debug_abbrev section in the file rather than an offset within the section itself. It really looks like we have a clash of behavior between different compilers. Offhand, I'm not sure why different compilers would interpret this differently, so in the meantime it's probably better to revert until we understand what's going on. Making it configurable without understanding why it happens in the first place seems very problematic to me.
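For anyone who wants to double-check the decoding, here is a minimal sketch that unpacks the same twelve bytes as a DWARF v2 compile unit header. It assumes the 32-bit DWARF format and little-endian byte order; the variable names are only illustrative.

```python
import struct

# Bytes copied from the objdump excerpt above (file offset 0x227 onward).
raw = bytes.fromhex("45010000" "02006501" "00000801")

# DWARF v2, 32-bit format, little-endian compile unit header:
#   unit_length (4 bytes), version (2), debug_abbrev_offset (4), address_size (1)
unit_length, version, abbrev_offset, address_size = struct.unpack_from("<IHIB", raw)

print(f"unit_length={unit_length:#x} version={version} "
      f"abbrev_offset={abbrev_offset:#x} address_size={address_size}")
```

The unpacked abbrev offset comes out as 0x165, matching the start of __debug_abbrev in the listing instead of 0.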

@akoeplinger
Member

Ok, thanks for looking. I reverted the commit in 2020-02. It's still enabled in master (and dotnet/runtime).

@mathieubourgeois
Contributor

Also, I just took another look at an equivalent file that I built on Linux at the time, and the offset into debug_abbrev in the .o file is 0x00000000. So that's definitely an issue (as to why, that's another story).

@mathieubourgeois
Contributor

Interesting thing I just noticed: passing the linked files to ld in the opposite order (Mono's .o file before LLVM's .o file, instead of LLVM's followed by Mono's) doesn't seem to trigger the initial issue at all (at least on Linux).

@mathieubourgeois
Contributor

I can't completely confirm this (the zip file contains only the Mono-built file, not the LLVM-built assembly and object), but from the LLVM source code, specifically doesDwarfUseRelocationsAcrossSections() and MCGenDwarfInfo::Emit, I can infer that LLVM would generate the offset with a label if:

  • We're not targeting Darwin (doesDwarfUseRelocationsAcrossSections seems to return false on Darwin by default)
  • Or we're using DWARF v3 or later and we have more than one code section

The second one is never true, so AbbrevSectionSymbol is never set on Darwin and LLVM will just emit a 0 directly. So, in Darwin's case, it looks like the fix is actually introducing the discrepancy. One possible fix therefore could be to match LLVM's behavior and just disable it on any Darwin target.
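To make the inferred rule concrete, here is a rough sketch of it. This only restates the two conditions listed above and is not LLVM's or Mono's actual code; the function names and the `debug_abbrev_start` label are hypothetical.

```python
def llvm_emits_abbrev_label(is_darwin: bool, dwarf_version: int, code_sections: int) -> bool:
    """Hypothetical restatement of the two conditions inferred above."""
    # Condition 1: relocations across sections are used (assumed false on Darwin).
    uses_relocs_across_sections = not is_darwin
    # Condition 2: DWARF v3 or later with more than one code section.
    multi_section_v3 = dwarf_version >= 3 and code_sections > 1
    return uses_relocs_across_sections or multi_section_v3


def abbrev_offset_operand(is_darwin: bool, label: str = "debug_abbrev_start") -> str:
    """Suggested fix: keep the literal 0 on Darwin, use a label elsewhere."""
    return "0" if is_darwin else label
```

With DWARF v2 and a single code section, the first function returns False on Darwin and True elsewhere, which is the split the suggested fix mirrors.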

@akoeplinger
Member

> One possible fix therefore could be to match LLVM's behavior and just disable it on any Darwin target.

That sounds like a good approach to me. What do you think, @vargaz and @dalexsoto?

@mathieubourgeois
Contributor

I opened a tentative PR based on my suggestion.

monojenkins pushed a commit to monojenkins/runtime that referenced this issue Jun 26, 2020
mono/mono#19794 broke on iOS. Analyzing the result showed that, while the assembly was generated as expected, the linked result was different. Analyzing the LLVM code (which triggered the original fix) seems to imply that, on Apple platforms, LLVM does not generate a label for it and only emits the 0 offset, as we were doing before. Therefore, match the LLVM behavior by bringing back our previous logic when targeting Mach; otherwise use the new way of doing things.

Fixes mono/mono#8806 (again)
akoeplinger pushed a commit that referenced this issue Jun 30, 2020
…20031)

Fixes #8806 (again)
akoeplinger pushed a commit to dotnet/runtime that referenced this issue Jun 30, 2020
…38454)

Fixes mono/mono#8806 (again)

Co-authored-by: mathieubourgeois <mathieubourgeois@users.noreply.github.com>
akoeplinger pushed a commit to akoeplinger/mono that referenced this issue Jun 30, 2020
…ono#20031)

Fixes mono#8806 (again)

(cherry picked from commit 51dfcfe)
kevinwkt pushed a commit to kevinwkt/runtimelab that referenced this issue Jul 15, 2020
…#38454)

Fixes mono/mono#8806 (again)

Co-authored-by: mathieubourgeois <mathieubourgeois@users.noreply.github.com>
akoeplinger added a commit that referenced this issue Jul 24, 2020
…s a label instead of 0" (#20046)

* Revert "Revert "Emit DWARF debug_abbrev offset for compile units as a label instead of 0 (#19794)" (#20013)"

This reverts commit 83105ba.

* Bring back previous debug_abbrev offset behavior on Apple platforms (#20031)

Fixes #8806 (again)

(cherry picked from commit 51dfcfe)

Co-authored-by: mathieubourgeois <mathieu.bourgeois@gameloft.com>
jonpryor added a commit to jonpryor/xamarin-android that referenced this issue Aug 14, 2020
Context: mono/mono#19860
Context: mono/mono#19964
Context: mono/mono#20138
Context: mono/mono#8806
Context: xamarin/xamarin-macios#9289

Changes: mono/mono@83105ba...66e2b84

  * mono/mono@66e2b84002f: [aot] Fix an assert which is hit for generic instances with a lot of arguments. (#20239)
  * mono/mono@d3daacdaa80: Bump msbuild to latest commit
  * mono/mono@e59c1cd70f4: Fix Cairo issue on macOS Big Sur (#20154)
  * mono/mono@648655b86d5: [aot] Avoid a crash in generic sharing for invalid generic instances. (#20158)
  * mono/mono@ec71e8a7ae3: [2020-02] Reapply "Emit DWARF debug_abbrev offset for compile units as a label instead of 0" (#20046)
  * mono/mono@20bb4f9a6d3: [mono][mini] Do a non-virtual call for bound delegates (#20039)
  * mono/mono@9ca6fa646a8: [merp] Remove dead code (#20043)
  * mono/mono@2ff424be293: [crashing] Improve crash chaining (#20018)
@brendanzagaeski
Contributor

Release status update for Xamarin SDKs

New Preview versions of the Xamarin.Android and Xamarin.iOS SDKs have now been published that include the fix for this item. The fix is not yet included in Release versions. I will update this item again when Release versions are available that include the fix.

Fix included in Xamarin.Android SDK version 11.1.0.15.
Fix included in Xamarin.iOS SDK version 14.4.0.9.

Fix included on Windows in Visual Studio 2019 version 16.8 Preview 4. To try the Preview version that includes the fix, check for the latest updates in Visual Studio Preview.

Fix included on macOS in Visual Studio 2019 for Mac version 8.8 Preview 4. To try the Preview version that includes the fix, check for the latest updates on the Preview updater channel.

@brendanzagaeski
Contributor

Release status update for Xamarin SDKs

New Release versions of the Xamarin.Android and Xamarin.iOS SDKs have now been published that include the fix for this item.

Fix included in Xamarin.Android SDK version 11.1.0.17.
Fix included in Xamarin.iOS SDK version 14.4.1.3.

Fix included on Windows in Visual Studio 2019 version 16.8. To get the new version that includes the fix, check for the latest updates or install the most recent release from https://visualstudio.microsoft.com/downloads/.

Fix included on macOS in Visual Studio 2019 for Mac version 8.8. To get the new version that includes the fix, check for the latest updates on the Stable updater channel.
