-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PIC32: Fix stack align #474
Conversation
@@ -38,7 +38,8 @@ typedef uint32_t os_sr_t; | |||
/* Stack type, aligned to a 32-bit word. */ | |||
#define OS_STACK_PATTERN (0xdeadbeef) | |||
|
|||
typedef uint32_t os_stack_t; | |||
/* uint64_t in an attempt to get the stack 8 aligned */ | |||
typedef uint64_t os_stack_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about :
typedef uint32_t __attribute__((aligned(8))) os_stack_t;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When running on sim x86 we have similar restrictions, i.e. stack has to be aligned to 16 bytes.
The way this was dealt with was by making sure at task's stack init function that the SP was aligned to that boundary.
If you can follow the assembly: kernel/os/src/arch/sim/os_arch_stack_frame.s
Although the code is commented :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also adjust stack_top in os_arch_task_stack_init(), if it was not aligned on 8-byte boundary.
The use case that would worry me is where user allocates the stack with malloc(), as opposed to declaring a global variable. malloc() might, or might not, be returning 8-byte aligned data.
typedef'ing os_stack_t as uint64, or adding that aligned() attribute seems useful also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first attempt at a fix for this did something similar (modifying the stack pointer in os_arch_task_stack_init()
) but I didn't like that the user may end up with a slightly smaller stack than they specified. Let me know If you would rather I did this to fit in with the x86 sim port.
typedef uint32_t __attribute__((aligned(8))) os_stack_t;
I couldn't get this to work, looks like it aligns the elements and not the array?
repos/apache-mynewt-core/kernel/os/src/os.c:48:1: error: alignment of array elements is greater than element size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My experience with aligned attribute is a bit limited, I usually have used it with specific symbols, not with a type definition. I'm ok if you end up using uint64_t as os_stack_t type, BTW.
However, I would recommend making os_arch_task_stack_init() modifications regardless. This because there's no guarantee regarding alignment of malloc()'d stack.
With 8 byte alignment of os_stack_t, the number of cases where os_arch_task_stack_init() will get called with unaligned stack should not be many. Usually user would get exactly the stack space they wanted.
I'm not too worried about losing 4 bytes here due to alignment, BTW. In practice I think people leave bigger margins for stack; it's a bit tedious to figure the deepest possible call stack, and any code modification/compiler flag change has a chance of changing this for the worse.
However, that's just my opinion :). I'm ok with even the current change, and if it turns out not to be enough, we'll modify this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense. I have just dug out my old attempt at that code and remembered another problem I had with it: I am storing my lazy FPU context as an offset to the top of the stack (which must be aligned). The actual top of the stack cannot be modified in the os_arch_task_stack_init()
function as it is overwritten shortly after. Do you know how much would be broken by changing this (in os_task.c):
t->t_stackptr = os_arch_task_stack_init(t, &stack_bottom[stack_size],
stack_size);
t->t_stacktop = &stack_bottom[stack_size];
t->t_stacksize = stack_size;
to this:
t->t_stacktop = &stack_bottom[stack_size];
t->t_stacksize = stack_size;
t->t_stackptr = os_arch_task_stack_init(t, &stack_bottom[stack_size],
stack_size);
? (You could even get rid of 2 args to os_arch_task_stack_init()
).
The alternative is computing the alignment every time I want to access the FPU context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter option would be unbearable :)
I can't think of a thing which would break with the change you're proposing. And if they do, we'll fix them. Maybe supplement it with a comment saying that these lines should not be reordered from this on? So that folks later coming here later would know to be aware what os_arch_task_stack_init() will do.
As for changing the arguments to os_arch_task_stack_init(), I'd be cool with that. Simpler is better.
It would be good to send an email to mailing list regarding that change, though. This because it is a bit of an API change, though I imagine most of the folks wouldn't even notice. Maybe it would make sense to do that via a separate pull request? So that we can get this stack alignment issue fixed sooner rather than later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I will get on with making another PR + mailing list explanation for that. It might take me a bit of time though, I have a lot on at the moment!
Sometimes the stack was not 8-aligned causing FPU load/store problems (and is incompatible with the ABI). Changing the type to uint64_t seems to fix it but I don't think this is portable so if anyone has any ideas for a better fix please let me know!