[ARM/Linux] coreclr fails due to lack of DWARF feature in libunwind #6698

Open
hqueue opened this Issue Aug 11, 2016 · 9 comments


hqueue commented Aug 11, 2016

CoreCLR uses the libunwind library to unwind stack frames during exception handling, among other things.
(https://github.com/dotnet/core/blob/master/release-notes/1.0/Release-Notes-RC2.md)

However, libunwind sometimes cannot unwind the stack correctly, as reported in #6598.
After investigation, this failure turned out to be caused by the lack of support for ARM vfpv3/NEON registers in libunwind.
At least two issues (#6525, #6272) are related, and I think about 10 TC failures on ARM/Linux (both arm and arm-softfp) have the same cause.
Usually this failure is observed only when coreclr is built in Release mode, where compiler (clang) optimization is enabled with "-g -O1" or "-g -O3", because ARM vfpv3/NEON registers are spilled to the stack when optimization is enabled. (Also, -mfpu=vfpv3 is turned on by default in coreclr.)

For example, when we build coreclr for ARM/Linux using clang-3.6 with -O1 optimization, the following function makes use of ARM vfpv3/NEON registers.

https://github.com/dotnet/coreclr/blob/master/src/classlibnative/bcltype/number.cpp#L2457

FCIMPL3_VII(Object*, COMNumber::FormatDouble, double value, StringObject* formatUNSAFE, NumberFormatInfo* numfmtUNSAFE)

After compilation, the above C++ function is turned into the assembly below. You can observe a push instruction for the d8 register, which is an ARM vfpv3/NEON register, i.e. vpush {d8}, and the DWARF CFI generated for the d8 register, i.e. .cfi_offset d8, -48.

.Ltmp2030:
        .cfi_offset r4, -36
        .setfp  r7, sp, #12
        add     r7, sp, #12
.Ltmp2031:
        .cfi_def_cfa r7, 24
        .pad    #4
        sub     sp, #4
        .vsave  {d8}
        vpush   {d8}
.Ltmp2032:
        .cfi_offset d8, -48
        .pad    #296

After linking, the above code resides in libcoreclr.so, and the .cfi_offset information is stored in the .debug_frame section of libcoreclr.so.dbg. For Debug and Checked builds, libcoreclr.so itself contains .debug_frame.
If we look into .debug_frame, we find DW_CFA_offset_extended: r264 at cfa-48, which corresponds to the .cfi_offset d8, -48 statement in the assembly above.

001644d8 0000002c 00163dc4 FDE cie=00163dc4 pc=0024da6c..0024dccc
DW_CFA_advance_loc: 4 to 0024da70
DW_CFA_def_cfa_offset: 36
DW_CFA_offset: r14 at cfa-4
DW_CFA_offset: r11 at cfa-8
DW_CFA_offset: r10 at cfa-12
DW_CFA_offset: r9 at cfa-16
DW_CFA_offset: r8 at cfa-20
DW_CFA_offset: r7 at cfa-24
DW_CFA_offset: r6 at cfa-28
DW_CFA_offset: r5 at cfa-32
DW_CFA_offset: r4 at cfa-36
DW_CFA_advance_loc: 2 to 0024da72
DW_CFA_def_cfa: r7 ofs 24
DW_CFA_advance_loc: 6 to 0024da78
DW_CFA_offset_extended: r264 at cfa-48
DW_CFA_nop
DW_CFA_nop
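
To make the r264 entry in the dump easier to read, here is a minimal sketch of the register-number mapping, assuming the numbering in the "DWARF for the ARM Architecture" ABI document (core registers at 0..15, VFPv3/NEON double registers d0..d31 at 256..287). The helper name ArmDwarfRegName is hypothetical and is not part of libunwind or coreclr.

```cpp
#include <string>

// Hypothetical helper: maps a 32-bit ARM DWARF register number to a name.
// Assumed numbering (ARM DWARF ABI document):
//   0..15    -> core registers r0..r15
//   256..287 -> VFPv3/NEON double registers d0..d31
// Returns an empty string for numbers outside these two ranges.
std::string ArmDwarfRegName(unsigned dwarfReg) {
    if (dwarfReg <= 15) {
        return "r" + std::to_string(dwarfReg);
    }
    if (dwarfReg >= 256 && dwarfReg <= 287) {
        return "d" + std::to_string(dwarfReg - 256);
    }
    return std::string();
}
```

With this mapping, ArmDwarfRegName(264) yields "d8", which matches the .cfi_offset d8, -48 directive in the assembly, and ArmDwarfRegName(14) yields "r14" (lr).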

libunwind tries to unwind the stack using the information in .debug_frame when UNW_ARM_METHOD_DWARF is turned on (coreclr turns on this method along with the other methods by default). However, libunwind cannot interpret DW_CFA_offset_extended: r264 at cfa-48 correctly, and it fails to unwind the stack frame when information about a vfpv3 register is in .debug_frame.
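
For context, this is roughly what an unwinder has to do with that entry. The sketch below is not libunwind's code; it only illustrates the DWARF encoding. DW_CFA_offset packs the register number into the low 6 bits of the opcode byte, so a register above 63 (such as 264) has to use DW_CFA_offset_extended (opcode 0x05), which is followed by two unsigned LEB128 operands: the register number and a factored offset. The data alignment factor of -4 and the example bytes are assumptions chosen to reproduce "r264 at cfa-48"; the real bytes in libcoreclr.so were not inspected. All names here are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Result of decoding one DW_CFA_offset_extended instruction.
struct CfaOffsetExtended {
    std::uint64_t reg;       // DWARF register number
    std::int64_t cfaOffset;  // byte offset from the CFA where the register was saved
};

// Reads one unsigned LEB128 value starting at bytes[pos] and advances pos.
std::uint64_t ReadUleb128(const std::vector<std::uint8_t>& bytes, std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    while (pos < bytes.size() && shift < 64) {
        const std::uint8_t b = bytes[pos++];
        result |= static_cast<std::uint64_t>(b & 0x7f) << shift;
        if ((b & 0x80) == 0) {
            break;  // high bit clear: last byte of this value
        }
        shift += 7;
    }
    return result;
}

// Decodes DW_CFA_offset_extended (opcode 0x05) at the start of `bytes`.
// dataAlign is the CIE's data alignment factor (assumed to be -4 here).
// Returns false if the first byte is not the expected opcode.
bool DecodeOffsetExtended(const std::vector<std::uint8_t>& bytes,
                          std::int64_t dataAlign,
                          CfaOffsetExtended& out) {
    if (bytes.empty() || bytes[0] != 0x05) {
        return false;
    }
    std::size_t pos = 1;
    out.reg = ReadUleb128(bytes, pos);
    const std::uint64_t factored = ReadUleb128(bytes, pos);
    out.cfaOffset = static_cast<std::int64_t>(factored) * dataAlign;
    return true;
}
```

Under these assumptions, the byte sequence 0x05 0x88 0x02 0x0c decodes to register 264 at CFA offset -48. An unwinder whose register table only covers the core registers has nowhere to record such an entry, which is consistent with the failure described above.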

The same pattern is observed on both arm and arm-softfp, and with both "-g -O1" and "-g -O3".

In short,
For ARM/Linux (both arm and arm-softfp)

  1. When compiling coreclr itself,
    (1) The coreclr binary contains code that spills ARM vfpv3/NEON registers (e.g. d8, d9, etc.) to the stack when built in Release mode using clang (with the -O1 or -O3 option).
    (2) At the same time, stack frame information is stored in DWARF format in the .debug_frame section, because the "-g" option is enabled by default in Release builds. The DWARF CFI contains information about ARM vfpv3/NEON registers when these registers are saved in the stack frame.
  2. At runtime,
    (1) coreclr tries to unwind the stack frame using libunwind.
    (2) However, libunwind cannot interpret ARM vfpv3/NEON registers correctly, and stack unwinding fails when UNW_ARM_METHOD_DWARF is enabled, which is currently the default in coreclr.

Possible solutions:

  • Method 1: Disable UNW_ARM_METHOD_DWARF for coreclr
  • Method 2: Add support to libunwind for ARM vfpv3/NEON registers in UNW_ARM_METHOD_DWARF

I've tried both method 1 and method 2, and both seem to work.

For method 1, with UNW_ARM_METHOD_DWARF disabled by setting UNW_ARM_UNWIND_METHOD=6, #6525 reported that no obvious regression was observed.
For method 2, about 10 more TCs pass, but it is not fully tested yet and it requires a change to libunwind. I will make an upstream patch for this and contact the libunwind community.

Therefore I think method 1 is a feasible and practical way to fix this issue. I will add a PR soon.
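
To show where the value 6 comes from, here is a minimal sketch of method 1, assuming the libunwind method bits listed in this issue (DWARF 0x01, FRAME 0x02, EXIDX 0x04). The constant and function names are hypothetical, and the sketch uses POSIX setenv for illustration; it is not necessarily how the actual change sets the variable. The variable presumably has to be set before libunwind first reads it.

```cpp
#include <cstdlib>
#include <string>

// Method bits as listed in this issue (assumed to match libunwind's ARM header).
constexpr int kUnwArmMethodDwarf = 0x01;
constexpr int kUnwArmMethodFrame = 0x02;
constexpr int kUnwArmMethodExidx = 0x04;

// Keep frame-based and EXIDX-based unwinding, drop DWARF: 0x02 | 0x04 == 6.
constexpr int kMethodsWithoutDwarf = kUnwArmMethodFrame | kUnwArmMethodExidx;

// Hypothetical helper: sets UNW_ARM_UNWIND_METHOD so that libunwind skips
// DWARF-based unwinding. Returns true if the variable was set.
bool DisableDwarfUnwinding() {
    const std::string value = std::to_string(kMethodsWithoutDwarf);
    return ::setenv("UNW_ARM_UNWIND_METHOD", value.c_str(), /*overwrite=*/1) == 0;
}
```

Building the mask from named bits rather than hard-coding 6 makes it clear that only the DWARF method is being dropped and that the other two methods stay enabled.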

hqueue added a commit to hqueue/coreclr that referenced this issue Aug 11, 2016

ARM: Disable stack unwinding using DWARF
+    // libunwind library is used to unwind stack frame, but libunwind for ARM
+    // does not support ARM vfpv3/NEON registers in DWARF format correctly
+    // Therefore let's disable stack unwinding using DWARF information
+    // See dotnet#6698
+    //
+    // libunwind use following methods to unwind stack frame.
+    // UNW_ARM_METHOD_ALL          0xFF
+    // UNW_ARM_METHOD_DWARF        0x01
+    // UNW_ARM_METHOD_FRAME        0x02
+    // UNW_ARM_METHOD_EXIDX        0x04
+    putenv(const_cast<char *>("UNW_ARM_UNWIND_METHOD=6"));

Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com>

hqueue commented Aug 11, 2016

ghost commented Aug 11, 2016

Is this issue reported in upstream's mailing list (http://lists.nongnu.org/archive/html/libunwind-devel/)?

hqueue commented Aug 12, 2016

@jasonwilliams200OK I will report this issue with the patch soon.

ghost commented Aug 12, 2016

Great! Would be nice to have a back and forth to this issue for better context. :)

@gkhanna79 gkhanna79 modified the milestone: Future Jan 12, 2017

BruceForstall commented Jan 9, 2019

Is the libunwind issue mentioned above still an issue? I see that the workaround implemented in coreruncommon.cpp in #6700 is still there.

@hqueue @parjong @janvorli

parjong commented Jan 10, 2019

IMHO, this issue can be closed (the issue itself is still alive, but the workaround works well).

janvorli commented Jan 10, 2019

I think it would be worth checking it now that we have libunwind sources as part of coreclr tree. It may be already fixed in the version we have or we could debug and fix it.

BruceForstall commented Jan 10, 2019

@janvorli Do you want to take the issue, then?

@janvorli janvorli self-assigned this Jan 10, 2019

janvorli commented Jan 10, 2019

Yes, I've just assigned it to myself.
