Browse files

Ensure 16-byte alignment for amd64 coroutine stack

The AMD64 ABI requires that stacks be 16-byte aligned.  stackend is
going to be 16-byte aligned, 128 is a multiple of 16, and
sizeof(uintptr_t) is going to 8 on amd64, so this results in a stack
that is not 16-byte aligned on amd64 (at least for platforms where
this code branch is used, such as OpenBSD).  This results in bus errors
when floating point numbers are converted to strings, since snprintf
calls movaps with a non 16-byte aligned memory location.

With this patch, the iovm correctness tests pass on OpenBSD/amd64.
  • Loading branch information...
1 parent 7bf702c commit 84c9e4f3093d26c36642bb53303e7213e48402e3 @jeremyevans jeremyevans committed Oct 12, 2012
Showing with 1 addition and 1 deletion.
  1. +1 −1 libs/coroutine/source/Coro.c
2 libs/coroutine/source/Coro.c
@@ -599,7 +599,7 @@ void Coro_setup(Coro *self, void *arg)
if (64 > (- sav[i] + (uintptr_t)&i))
assert(i < sz);
- sav[i] = stackend - sizeof(uintptr_t) - 128;
+ sav[i] = stackend - sizeof(uintptr_t)*2 - 128;

6 comments on commit 84c9e4f


The proper way to do alignment is like this:

static inline size_t aligned_size(size_t size, size_t alignment)
    assert(alignment <= 0x8000);
    size_t r = size + --alignment + 2;
    return (r + 2 + alignment) & ~alignment;

Calling this function assumes that "alignment" is power of 2, and "size" would be the original stackend - sizeof(uintptr_t) - 128.


That's certainly looks like a more portable solution, compared to the quick hack I used. I'll be happy as long as the problem get's fixed. Not being familiar with the amd64 ABI or Io's internals, it took me many hours to diagnose the underlying problem causing the bus error, and figure out where the fix should be applied.


I'll accept whichever working version you guys recommend. Thanks for helping with this.


If you plan on having Io run on many platforms with different alignment requirements, then Jeremy Tregunna's code is probably best. If you only care about fixing amd64, then this pull request gets the job done with the least number of changes (2 characters).


I'll test out my code in place, and see if there are any regressions. Should all look ok, then I will send another pull.


Alright tested, and seems to work just fine. It's in #232.

Please sign in to comment.