Commit b58caef

author
Sergei Trofimovich
committed
ia64: fix small struct return
This change fixes the libffi.call/struct10.c failure on ia64:

    FAIL: libffi.call/struct10.c -W -Wall -Wno-psabi -O0 execution test

.Lst_small_struct handles returns for structs less than 32 bytes (following the ia64 return value ABI [1]). The subroutine does roughly the following:

```
mov [sp+0] = r8
mov [sp+8] = r9
mov [sp+16] = r10
mov [sp+24] = r11
memcpy(destination, source=sp, 12);
```

The problem: the ia64 ABI guarantees that the top 16 bytes of the stack are scratch space for the callee function, so the callee is free to clobber them. [1] says (7.1 Procedure Frames):

"""
* Scratch area. This 16-byte region is provided as scratch storage for procedures that are called by the current procedure. Leaf procedures do not need to allocate this region. A procedure may use the 16 bytes at the top of its own frame as scratch memory, but the contents of this area are not preserved by a procedure call.
"""

In our case the top 16 bytes are clobbered by the PLT resolver when memcpy() is called for the first time. As a result, the memcpy implementation reads already clobbered data from the top of the stack.

The fix is simple: allocate 16 bytes of scratch space prior to the memcpy() call.

[1]: https://www.intel.com/content/dam/www/public/us/en/documents/guides/itanium-software-runtime-architecture-guide.pdf

Bug: https://bugs.gentoo.org/634190
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
1 parent 45da2fc commit b58caef

1 file changed

+6
-1
lines changed


src/ia64/unix.S

Lines changed: 6 additions & 1 deletion
@@ -175,7 +175,6 @@ ffi_call_unix:
 	;;
 
 .Lst_small_struct:
-	add sp = -16, sp
 	cmp.lt p6, p0 = 8, in3
 	cmp.lt p7, p0 = 16, in3
 	cmp.lt p8, p0 = 24, in3
@@ -191,6 +190,12 @@ ffi_call_unix:
 (p8)	st8 [r18] = r11
 	mov out1 = sp
 	mov out2 = in3
+	;;
+	// ia64 software calling convention requires
+	// top 16 bytes of stack to be scratch space
+	// PLT resolver uses that scratch space at
+	// 'memcpy' symbol reolution time
+	add sp = -16, sp
 	br.call.sptk.many b0 = memcpy#
 	;;
 	mov ar.pfs = loc0
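
The effect of moving `add sp = -16, sp` can be illustrated with a small C model. This is only a sketch under stated assumptions, not libffi code: the stack is a plain byte array, `clobber_scratch` is a hypothetical stand-in for the PLT resolver overwriting the 16-byte scratch area at `sp`, and `small_struct_return_ok` is a hypothetical helper that spills all four return registers unconditionally (the real assembly predicates the stores on the struct size).

```c
#include <stdint.h>
#include <string.h>

enum { STACK_SIZE = 128, SCRATCH_SIZE = 16, REGS_SIZE = 32 };

/* Hypothetical stand-in for a callee (such as the PLT resolver) that
 * overwrites the 16-byte scratch area starting at the caller's sp. */
static void clobber_scratch(uint8_t *stack, size_t sp)
{
    memset(stack + sp, 0xAA, SCRATCH_SIZE);
}

/* Simplified model of .Lst_small_struct.
 * reserve_scratch == 0: spilled data sits at sp when memcpy is called,
 *                       so it overlaps the scratch area (old behaviour).
 * reserve_scratch != 0: sp is lowered by 16 after the spill, so the
 *                       scratch area lies below the data (the fix).
 * Returns 1 if the first `size` bytes copied out match the registers. */
int small_struct_return_ok(int reserve_scratch, size_t size)
{
    uint8_t stack[STACK_SIZE] = {0};
    uint8_t regs[REGS_SIZE];   /* models r8..r11, 8 bytes each */
    uint8_t dest[REGS_SIZE] = {0};
    size_t sp = 64;            /* arbitrary offset inside the model stack */
    size_t src;
    size_t i;

    if (size > REGS_SIZE)
        return 0;

    for (i = 0; i < REGS_SIZE; i++)
        regs[i] = (uint8_t)(i + 1);   /* never equal to the 0xAA fill */

    memcpy(stack + sp, regs, REGS_SIZE);  /* st8 [sp+0..24] = r8..r11 */
    src = sp;                             /* mov out1 = sp */

    if (reserve_scratch)
        sp -= SCRATCH_SIZE;               /* add sp = -16, sp */

    clobber_scratch(stack, sp);           /* first call resolves memcpy via PLT */
    memcpy(dest, stack + src, size);      /* br.call ... memcpy# */

    return memcmp(dest, regs, size) == 0;
}
```

In this model, `small_struct_return_ok(0, 12)` reports a mismatch because the first 16 spilled bytes are overwritten before the copy, while `small_struct_return_ok(1, 12)` reports a match because the scratch area no longer overlaps the spilled registers.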
