Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libunwind: Don't save/restore DDC for RISC-V CheriABI. #515

Merged
merged 1 commit into from Mar 19, 2021

Conversation

bsdjhb
Copy link
Contributor

@bsdjhb bsdjhb commented Feb 16, 2021

Since the register doesn't have a DWARF number, the compiler will
never emit any rules in .eh_frame to unwind its value, so it will
always be restored to the value at the time the exception is thrown.

Since the register doesn't have a DWARF number, the compiler will
never emit any rules in .eh_frame to unwind its value, so it will
always be restored to the value at the time the exception is thrown.
@bsdjhb bsdjhb requested a review from jrtc27 February 16, 2021 17:39
@jrtc27
Copy link
Member

jrtc27 commented Feb 16, 2021

This is an ABI break but I suspect we have nothing other than gdb that would care, which is hybrid anyway? Maybe parts of Qt have knowledge of libunwind, but @pentelbart is about the only user of that on CHERI-RISC-V (LLVM itself also does but only ever treats the context as an opaque pointer/handle, it never dereferences it, and purecap LLVM remains in my experimental local branch), so I think this breakage is fine.

@bsdjhb
Copy link
Contributor Author

bsdjhb commented Feb 16, 2021

GDB doesn't know/care about unwind cursor layout. It knows about signal frame layouts and sometimes knows some bits about longjmp buffers (but even then it only cares about the location of the PC for software single stepping, not the full layout).

@jrtc27
Copy link
Member

jrtc27 commented Feb 16, 2021

GDB doesn't know/care about unwind cursor layout. It knows about signal frame layouts and sometimes knows some bits about longjmp buffers (but even then it only cares about the location of the PC for software single stepping, not the full layout).

I thought it did though and that's why aarch64 GDB was segfaulting when run with a hybrid libunwind when it had erroneously started saving/restoring capabilities?

@bsdjhb
Copy link
Contributor Author

bsdjhb commented Feb 16, 2021

Hmmm, it uses C++ exceptions and I think the unwind changes in morello-dev just broke C++ exception handling for hybrid at the time. GDB was just the first program we ran under Morello that used exceptions. devd uses C++, but not exceptions IIRC. I'd have to think more about why that broke though as in theory I think the _cxa_*() routines are what allocate exception objects (which are what contain a unw_cursor_t IIRC) so it should only be libcxxrt and libunwind that need to be in sync. Hmm, I guess the libthr does use libunwind directly as well for the unwinding it does in pthread_exit() to run destructors.

@pentelbart
Copy link

pentelbart commented Feb 17, 2021

Maybe parts of Qt have knowledge of libunwind, but @pentelbart is about the only user of that on CHERI-RISC-V

AFAIK it's not used in the parts of Qt that I'm using (I think it's just in the WebEngine/Chromium stuff).

@bsdjhb
Copy link
Contributor Author

bsdjhb commented Mar 11, 2021

Ping.

@jrtc27 jrtc27 requested a review from arichardson March 11, 2021 22:19
@@ -158,8 +158,8 @@
# if __riscv_xlen == 64
# define _LIBUNWIND_TARGET_RISCV 1
# ifdef __CHERI_PURE_CAPABILITY__
# define _LIBUNWIND_CONTEXT_SIZE 98
# define _LIBUNWIND_CURSOR_SIZE 121
# define _LIBUNWIND_CONTEXT_SIZE 96
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does context shrink by two but cursor only by one? But there should be static_assert()s so it must be fine.

Copy link
Member

@jrtc27 jrtc27 Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I wonder if I mis-counted before, I'd expect it to always be even for CHERI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably because of padding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so here is the "old" cursor layout:

(gdb) ptype /o libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_riscv>
/* offset    |  size */  type = class libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_riscv> 
    [with A = class libunwind::LocalAddressSpace, 
         R = class libunwind::Registers_riscv] 
        : public libunwind::AbstractUnwindCursor {
                         private:
/*   16      |    16 */    class A &_addressSpace;
/*   32      |   784 */    R _registers;
/*  816      |   144 */    unw_proc_info_t _info;
/*  960      |     1 */    bool _unwindInfoMissing;
/*  961      |     1 */    bool _isSignalFrame;
/* XXX 14-byte padding */

                           /* total size (bytes):  976 */
                         }

That should result in 976 / 8 == 122 for the cursor size. I'm not sure why this is not tripping up the assertion in the current code. The new size is 960 bytes which results in the 120 in this commit.

Humm, the existing RISC-V world builds with this set to 122 as well as 121. Given how picky the check_fit<> template has been in the past, I don't know why.

Oh:

struct __attribute__((aligned(sizeof(void*)))) unw_cursor_t {
  uint64_t data[_LIBUNWIND_CURSOR_SIZE];
};

That alignment means that too-small values can work since it gets padded out. This is the existing unw_cursor_t:

(gdb) ptype /o unw_cursor_t 
/* offset    |  size */  type = struct unw_cursor_t {
/*    0      |   968 */    uint64_t data[121];
/* XXX  8-byte padding */

                           /* total size (bytes):  976 */
                         }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably also my fault as I probably just used the smallest value which satisfied the assertion failure when doing RISC-V the first time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I was the one responsible for this code. I probably did something stupid like work out how many pointers there were in the struct and just add that on again, forgetting about padding for alignment.

Copy link
Member

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retracting approval for now; will work out tomorrow what's up with the cursor/context sizes

@@ -158,8 +158,8 @@
# if __riscv_xlen == 64
# define _LIBUNWIND_TARGET_RISCV 1
# ifdef __CHERI_PURE_CAPABILITY__
# define _LIBUNWIND_CONTEXT_SIZE 98
# define _LIBUNWIND_CURSOR_SIZE 121
# define _LIBUNWIND_CONTEXT_SIZE 96
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I was the one responsible for this code. I probably did something stupid like work out how many pointers there were in the struct and just add that on again, forgetting about padding for alignment.

@jrtc27 jrtc27 merged commit b4bb2ea into dev Mar 19, 2021
@jrtc27 jrtc27 deleted the riscv_libunwind_ddc branch March 19, 2021 01:32
arichardson added a commit to arichardson/llvm-project that referenced this pull request Mar 17, 2022
Commit b4bb2ea removed the space to
store DDC in the unw_context_t, but the assembly was not deleted, so we
still store it to the one-past-the-end offset 768. This results in a
bounds violation when called from _Unwind_RaiseException(). We did not
notice this initially since it was only added to cheribsd in January
with CTSRD-CHERI/cheribsd@898cdc7.
I've also updated _ZN9libunwind15Registers_riscv6jumptoEv, which should
have the same problem.


Fixes: CTSRD-CHERI#515
arichardson added a commit to arichardson/llvm-project that referenced this pull request Mar 17, 2022
Commit b4bb2ea removed the space to
store DDC in the unw_context_t, but the assembly was not deleted, so we
still store it to the one-past-the-end offset 768. This results in a
bounds violation when called from _Unwind_RaiseException(). We did not
notice this initially since it was only added to cheribsd in January
with CTSRD-CHERI/cheribsd@898cdc7.
I've also updated _ZN9libunwind15Registers_riscv6jumptoEv, which should
have the same problem.


Fixes: CTSRD-CHERI#515
arichardson added a commit that referenced this pull request Mar 17, 2022
Commit b4bb2ea removed the space to
store DDC in the unw_context_t, but the assembly was not deleted, so we
still store it to the one-past-the-end offset 768. This results in a
bounds violation when called from _Unwind_RaiseException(). We did not
notice this initially since it was only added to cheribsd in January
with CTSRD-CHERI/cheribsd@898cdc7.
I've also updated _ZN9libunwind15Registers_riscv6jumptoEv, which should
have the same problem.


Fixes: #515
jrtc27 pushed a commit to CTSRD-CHERI/libunwind that referenced this pull request Mar 17, 2022
Commit b4bb2ea39b362b1899e4a361e7bbc6964885cb05 removed the space to
store DDC in the unw_context_t, but the assembly was not deleted, so we
still store it to the one-past-the-end offset 768. This results in a
bounds violation when called from _Unwind_RaiseException(). We did not
notice this initially since it was only added to cheribsd in January
with CTSRD-CHERI/cheribsd@898cdc7.
I've also updated _ZN9libunwind15Registers_riscv6jumptoEv, which should
have the same problem.


Fixes: CTSRD-CHERI/llvm-project#515
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants