Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fault on accessing minimal user stack with syscall #22

Closed
rick-masters opened this issue Feb 1, 2023 · 5 comments
Closed

Fault on accessing minimal user stack with syscall #22

rick-masters opened this issue Feb 1, 2023 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@rick-masters
Copy link
Contributor

If a user process tries to access a stack page for the first time using a system call, it may cause a page fault.

This problem was seen with the make program while building linux but the setup to reproduce the problem would be impractical, so I created a small test program. It took quite a bit of investigation and testing to isolate the specific problem.

The following code can be used to trigger the problem. Admittedly, this test is not something anyone would run into casually, but I do believe it this is "valid" code that should work and represents what I think make was doing when it faulted.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <errno.h>

int main() {
        char filename[3670] = "stacktest";
        struct stat statinfo;

        /*
        puts("hello");
        __asm__ (
                        "mov $0x6A,%eax\t\n"
                        "mov $0xbffff00a,%ebx\t\n"
                        "mov $0xbfffefc8,%ecx\t\n"
                        "int $0x80"
                );
        printf("my size is %u\n", statinfo.st_size);
        */
        printf("filename is 0x%08x\n", (unsigned int) filename);
        printf("statinfo is 0x%08x\n", (unsigned int) &statinfo);
}

You may need to do perform some manual adjustments to make this work.
First, compile and run the program as is:

cc stacktest.c -o stacktest
./stacktest
filename is 0xbffff00a
statinfo is 0xbfffefc8

Note the address of filename and statinfo. It is important that filename and statinfo are in different memory pages. You may need to adjust the size of filename in order to make this happen.

Now change the assembly code (if necessary) to match those addresses for setting %ebx and %ecx, respectively, and uncomment that code.

You should be able to run the program now and get the correct size of the file:

cc stacktest.c -o stacktest
./stacktest
hello
my size is 586769
filename is 0xbffff00a
statinfo is 0xbfffefc8

The assembly code is executing a stat system call and is the equivalent of stat(filename, &statinfo);.

Now, edit the code and remove the line puts("hello");. Compile and run again. For me, this produces a segfault.

The issue is that the stack page for statinfo is not mapped in.
Normally, any access of an unmapped stack page from user land would cause the page to be mapped in by a page fault. The puts("hello"); call triggers this mechanism when it pushes a call frame onto the stack. However, when that is removed, the program is using a system call to write to that memory for the first time. In this case, the fault occurs in the kernel and it appears the normal mechanism for growing the stack does not trigger and a page fault terminates the program instead.

It seems there are a couple of ways to resolve this. At first, I thought the problem was simply that a lack of sufficient stack space was being mapped for the user program. I developed a solution with that in mind. I wrote a change in which the user program
is mapped a significant chunk of free stack from the start of execution. (See attached PR). This indeed resolves the problem.

After trying to create a test case I came to understand the current mechanism for growing the user stack dynamically on a fault, which examines the user process %esp register. It might be possible to create a similar mechanism for growing the stack when a
fault occurs during a system call. On the other hand, I don't see a downside to just setting up more stack memory in the vma to begin with.

@mikaku
Copy link
Owner

mikaku commented Feb 2, 2023

I've tried your example program with several values but I'm unable to reproduce the segfault.
Can you, please, paste all the kernel output (registers, etc.) from the segmentation fault?

@rick-masters
Copy link
Contributor Author

Test program attached
stacktest.gz

@mikaku mikaku self-assigned this Feb 2, 2023
@mikaku mikaku added the bug Something isn't working label Feb 2, 2023
@mikaku
Copy link
Owner

mikaku commented Feb 2, 2023

I think I know what is happening here.

The function verify_address() doesn't do a stack expansion when the address being checked belongs to the user stack, which would save the inevitable page fault. Having the option CONFIG_LAZY_USER_ADDR_CHECK enabled or disabled doesn't make any change on this.

Solution:

  • Add a check in verify_address() to do a stack expansion (if needed) when CONFIG_LAZY_USER_ADDR_CHECK is disabled.

  • Add a check in do_page_fault() to see if the faulting address belongs to the user stack and then, let page_not_present() expand the stack accordingly. This will only happen when CONFIG_LAZY_USER_ADDR_CHECK is enabled because the check in verify_address() will not exist.

The following patch belongs to the second point above. For now, you don't need the patch in verify_address() because your kernel has CONFIG_LAZY_USER_ADDR_CHECK enabled.

diff --git a/mm/fault.c b/mm/fault.c
index 79f624b..a336587 100644
--- a/mm/fault.c
+++ b/mm/fault.c
@@ -266,6 +266,16 @@ void do_page_fault(unsigned int trap, struct sigcontext *sc)
 
                /* in kernel mode */
                } else {
+                       struct sigcontext *usc;
+
+                       /* does it look like a user stack address? */
+                       usc = (unsigned int *)sc->esp + 34;     /* user sigcontext before the page fault */
+                       if(cr2 >= (usc->oldesp - 32) && cr2 < PAGE_OFFSET) {
+                               if((!page_not_present(vma, cr2, usc))) {
+                                       return;
+                               }
+                       }
+
                        dump_registers(trap, sc);
                        show_vma_regions(current);
                        do_exit(SIGTERM);

Please, let me know if it works for you.
I'll submit a complete patch soon.

@rick-masters
Copy link
Contributor Author

Yes that works. Thank you.
I'd recommend this change to avoid a pointer conversion warning:

                        usc = (struct sigcontext *) ((unsigned int *) sc->esp + 34);    /* user sigcontext before the page fault */

@mikaku
Copy link
Owner

mikaku commented Feb 3, 2023

I'd recommend this change to avoid a pointer conversion warning:

Thanks, I'll take it into account when sending the definitive patch.

@mikaku mikaku closed this as completed in 0e0df33 Feb 3, 2023
mikaku added a commit that referenced this issue Feb 6, 2023
…ddress when CONFIG_LAZY_USER_ADDR_CHECK is disabled, since 'do_page_fault()' will do the rest of the job #22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants