Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overlapping VMA regions not merged correctly #16

Closed
rick-masters opened this issue Jan 4, 2023 · 13 comments · Fixed by #17
Closed

Overlapping VMA regions not merged correctly #16

rick-masters opened this issue Jan 4, 2023 · 13 comments · Fixed by #17
Assignees
Labels
bug Something isn't working

Comments

@rick-masters
Copy link
Contributor

If a program calls mmap on memory already mapped the segments are not split/merged correctly. This can result in parts of an existing segment being lost. In the case where the segment was the heap, attempts to access that part of the heap results in a segfault. (See test case 3 below).

This is partly a regression introduced by c786079

The following is a list of various kinds of segment overlaps that can occur. ("old" refers to an existing segment).

1:
old: xxxx
new:     xxxx

2:
old: xxxxxxxx
new:     xxxx

3:
old: xxxxxxxx
new: xxxx

4:
old: xxxx
new: xxxxxxxx

5:
old: xxxxxxxx
new:   xxxx

6:
old: xxxxxxxx
new:   xxxxxxxx

7:
new: xxxxxxxxx
old:   xxxxxxxxx

The new segment should be mapped fully and any overlapping part of the old segment should be discarded. For case 1, if the segments are compatible, they should be merged into one.

Background information that describes the expected behavior is provided here: https://stackoverflow.com/questions/14943990/overlapping-pages-with-mmap-map-fixed

Prior to c786079 cases 2, 3, and 5 worked properly while 1, 4, 6, and 7 caused reboots.

After c786079 case 1 did not merge, 2 and 4 worked, while 3, 5, 6 and 7 left overlapping segments. Also 4 leaks memory and 1 subjects vma memory to a double-free.

With the PR provided, 1, 2, 3, 4, 5, and 6 work properly but 7 still leaves an overlapping segment.

The following are test programs for cases 1 through 7.

Case 1:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = (((unsigned int) heap_start) + 8192) & PAGE_MASK;
 char inbuff[100];

 /* expand heap to more than one page */
 char *expansion = malloc(8192);

 /* mmap memory at the start of the heap, but only part of it. */
 if ((mem = mmap ((void *)aligned, 16384, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 aligned += 16384;
 /* mmap memory right after previous mmap segment */
 if ((mem = mmap ((void *)aligned, 16384, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 return 0;
}

Case 2:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = (((unsigned int) heap_start) + 4096) & PAGE_MASK;

 /* expand heap to more than one page */
 char *expansion = malloc(16384);

 /* mmap memory to the end of the heap. */
 if ((mem = mmap ((void *)aligned, 20480, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 return 0;
}

Case 3:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = ((unsigned int) heap_start) & PAGE_MASK;

 /* expand heap to more than one page */
 char *expansion = malloc(16384);

 /* mmap memory at the start of the heap, but only part of it. */
 if ((mem = mmap ((void *)aligned, 4096, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 /* test accessing the end of the heap */
 expansion[16000] = 0;
 return 0;
}

Case 4:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = ((unsigned int) heap_start) & PAGE_MASK;

 /* mmap memory past the end of the heap. */
 if ((mem = mmap ((void *)aligned, 16384, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 return 0;
}

Case 5:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = (((unsigned int) heap_start) + 4096) & PAGE_MASK;

 /* expand heap to more than one page */
 char *expansion = malloc(16384);

 /* mmap memory in the middle of the heap. */
 if ((mem = mmap ((void *)aligned, 4096, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 /* test accessing the end of the heap */
 expansion[16000] = 0;
 return 0;
}

Case 6:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = (((unsigned int) heap_start) + 4096) & PAGE_MASK;

 /* expand heap to more than one page */
 char *expansion = malloc(8192);

 /* mmap memory past the start of the heap and extending beyond the end of the heap. */
 if ((mem = mmap ((void *)aligned, 16384, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 return 0;
}

Case 7:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PAGE_SIZE            4096
#define PAGE_MASK            ~(PAGE_SIZE - 1)        /* 0xFFFFF000 */

int main (int argc, char *argv[])
{
 char *mem;
 char *heap_start = malloc(1);
 unsigned int aligned = (((unsigned int) heap_start) + 8192) & PAGE_MASK;
 char inbuff[100];

 /* expand heap to more than one page */
 char *expansion = malloc(8192);

 /* mmap memory at the start of the heap, but only part of it. */
 if ((mem = mmap ((void *)aligned, 16384, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 aligned -= 4096;
 /* mmap memory starting before previous mmap and overlapping it  */
 if ((mem = mmap ((void *)aligned, 16384, 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0)) == (caddr_t) -1) {
    puts ("mmap error!");
    return 1;
 }

 return 0;
}

It can be helpful to print the vma regions before and after every mmap syscall by applying this patch to the kernel:

diff --git a/kernel/syscalls.c b/kernel/syscalls.c
index 5dece15..ebc65eb 100644
--- a/kernel/syscalls.c
+++ b/kernel/syscalls.c
@@ -8,6 +8,7 @@
 #include <fiwix/types.h>
 #include <fiwix/syscalls.h>
 #include <fiwix/mm.h>
+#include <fiwix/mman.h>
 #include <fiwix/stat.h>
 #include <fiwix/errno.h>
 #include <fiwix/string.h>
@@ -406,9 +407,17 @@ int do_syscall(unsigned int num, int arg1, int arg2, int arg3, int arg4, int arg
                return -ENOSYS;
        }
        current->sp = (unsigned int)&sc;
+       int retval;
+       if (num == 90) {
+               show_vma_regions(current);
+       }
 #ifdef CONFIG_SYSCALL_6TH_ARG
-       return sys_func(arg1, arg2, arg3, arg4, arg5, arg6, &sc);
+       retval = sys_func(arg1, arg2, arg3, arg4, arg5, arg6, &sc);
 #else
-       return sys_func(arg1, arg2, arg3, arg4, arg5, &sc);
+       retval = sys_func(arg1, arg2, arg3, arg4, arg5, &sc);
 #endif /* CONFIG_SYSCALL_6TH_ARG */
+       if (num == 90) {
+               show_vma_regions(current);
+       }
+       return retval;
 }
diff --git a/kernel/syscalls/old_mmap.c b/kernel/syscalls/old_mmap.c
index de28f0f..1e804f6 100644
--- a/kernel/syscalls/old_mmap.c
+++ b/kernel/syscalls/old_mmap.c
@@ -12,6 +12,7 @@
 #include <fiwix/errno.h>
 #include <fiwix/string.h>

+#define __DEBUG__
 #ifdef __DEBUG__
 #include <fiwix/stdio.h>
 #include <fiwix/process.h>
@mikaku
Copy link
Owner

mikaku commented Jan 4, 2023

Wow Rick, good catch!

I periodically do a massive build (all packages in the CDROM) when testing new features, basically to avoid regressions. After the changes introduced in c786079 I didn't see anything worse. I keep seeing the same 3 or 4 sporadic segfaults as always. They are pending to fix, of course, but seeing the same results and in the same place made me think I was in the good line.

Your 7 test programs will be really of help. I'll double check the case 7 to see if I can fix the issue.
Meanwhile, I'll apply your patch here and will do another massive build.

I'll let you know the results.

@mikaku
Copy link
Owner

mikaku commented Jan 5, 2023

After the massive build, I still have seen the same number of segfaults in the same places. Perhaps this confirms that sys_mmap() is not working properly yet.

Regarding your program tests:

After c786079 case 1 did not merge, 2 and 4 worked, while 3, 5, 6 and 7 left overlapping segments. Also 4 leaks memory and 1 subjects vma memory to a double-free.

I confirm all cases.

With the PR provided, 1, 2, 3, 4, 5, and 6 work properly but 7 still leaves an overlapping segment.

No, in all cases the heap is affected by the new segments and in the case 4, it even disappears.

For instance, this is the output for the case 1:
(instead of modifying the kernel, I've modified the program test to show the vma regions before and after calling mmap)

(before first mmap)
mmap regions:
08048000-08060000 r-xp 00000000 03:02 18015      [text]
08060000-08061000 rw-p 00018000 03:02 18015      [data]
08061000-08065000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
(after first mmap)
mmap regions:
08048000-08060000 r-xp 00000000 03:02 18015      [text]
08060000-08061000 rw-p 00018000 03:02 18015      [data]
08061000-08063000 rw-p 00000000 00:00 0          [heap]
08063000-08067000 ---p 00000000 00:00 0          [mmap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
(after second mmap)
mmap regions:
08048000-08060000 r-xp 00000000 03:02 18015      [text]
08060000-08061000 rw-p 00018000 03:02 18015      [data]
08061000-08063000 rw-p 00000000 00:00 0          [heap]
08063000-0806b000 ---p 00000000 00:00 0          [mmap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]

As you can see, the heap initially had 08061000-08065000 and at the end it has 08061000-08063000.

I've executed the same program test 1 on a Linux, and this is the result:

mmap regions:
08048000-08060000 r-xp 00000000 03:02 18015
08060000-08061000 rw-p 00018000 03:02 18015
08061000-08064000 rwxp 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
Segmentation fault

The segfault here would confirm that heap has been clobbered by a different segment (mmap):

From the URL you provided above:

... mmap(2) manual, it clearly states:
If the memory region specified by addr and len overlaps pages of any existing mapping(s), then the overlapped part of the existing mapping(s) will be discarded.

As I said, the rest of cases have similar results affecting heap directly.

Also, after applying your patch I have seen a regression in the case of a new mmap() with an address above the first entry. I mean, imagine the following vm regions:

08048000-08072000 r-xp 00000000 03:02 17967      [text]
08072000-08073000 rw-p 0002a000 03:02 17967      [data]
08073000-08074000 rw-p 00000000 00:00 0          [heap]
bfffe000-c0000000 rwxp 00000000 00:00 0          [stack]

Before apply your patch, I was able to include a new mmap before the first one, i.e., an address like 0x06000000 (with a length of 4096 bytes):

# ./mmap 

Address (hex): 06000000
Length (hex): 1000
Mode (0=R,1=W,2=RW): 0
mapping addr = 06000000, len = 4096
Success
mmap regions:
06000000-06001000 r--p 00000000 03:02 17         [mmap]
08048000-08072000 r-xp 00000000 03:02 17967      [text]
08072000-08073000 rw-p 0002a000 03:02 17967      [data]
08073000-08074000 rw-p 00000000 00:00 0          [heap]
bfffe000-c0000000 rwxp 00000000 00:00 0          [stack]

Address (hex): 

After applying your patch, this is not possible and it caused a reboot:

WARNING: release_page(): trying to free an already freed page (150)!
WARNING: release_page(): trying to free an already freed page (158)!
WARNING: release_page(): trying to free an already freed page (1422)!
...
... and reboot.

The following is the test program used to include a new mmap region:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <errno.h>

int main(void)
{
        int *addr, start, len;
        int pid, flags, mode;
        int fd;
        char cmd[100];

        for(;;) {
                printf("\n");
                printf("Address (hex): ");
                scanf("%x", &start);
                getchar();      /* eats a '\n' */
                printf("Length (hex): ");
                scanf("%x", &len);
                getchar();      /* eats a '\n' */
                printf("Mode (0=R,1=W,2=RW): ");
                scanf("%d", &mode);
                getchar();      /* eats a '\n' */

                addr = (unsigned int *)start;
                if(start) {
                        flags = MAP_FIXED | MAP_PRIVATE;
                } else {
                        addr = NULL;
                        flags = MAP_PRIVATE | MAP_ANONYMOUS;
                }

                if(mode == 0) {
                        mode = PROT_READ;
                } else if(mode == 1) {
                        mode = PROT_WRITE;
                } else {
                        mode = PROT_READ | PROT_WRITE;
                }

                if(start >= 0) {
                        printf("mapping addr = %08x, len = %d\n", (int)addr, len);
                        if(flags == (MAP_FIXED | MAP_PRIVATE)) {
                                fd = open("/dev/zero", 0);
                                mmap(addr, len, mode, flags, fd, 0);
                                close(fd);
                        } else {
                                mmap(addr, len, mode, flags, -1, 0);
                        }
                        perror("");
                } else {
                        /* if address is negative then remove the vm region */
                        start *= -1;
                        addr = (unsigned int *)start;
                        printf("unmapping addr = %08x, len = %d\n", (int)addr, len);
                        munmap(addr, len);
                        perror("");
                }

                printf("mmap regions:\n");
                memset(cmd, 0, sizeof(cmd));
                sprintf(cmd, "cat /proc/%d/maps", getpid());
                system(cmd);
        }

        return 0;
}

I think this could be related to the case 7.

@rick-masters
Copy link
Contributor Author

For the test cases, the heap should be changed by the mmap. When an mmap segment overlaps the heap, the overlapping part of the heap should be discarded. So, I think the test cases are working as expected (except for 7).

From here: https://man7.org/linux/man-pages/man2/mmap.2.html

If the memory region specified by addr and length overlaps pages of any existing mapping(s), then the overlapped part of the existing mapping(s) will be discarded.

The reason I filed this issue was because a program in live-bootstrap crashed because it was doing something similar to test case 3. It remapped part of its own heap using mmap. But the rest of the heap was deleted unexpectedly and it segfaulted when trying to access that. Honestly, I didn't look closely at why it was remapping the first page of its heap but it appears to be something the musl-1.2.3 library does internally related to dynamic linking. So, remapping the heap is something that should work and that is the reason I wrote the tests to overlap the heap.

I also tried some of these tests on Linux and experienced segfaults, but that's probably because other standard libraries like glibc don't react well to the heap being remapped. But the kernel may be correctly doing what it was told to do according to the documentation of mmap. It would have been simpler for me to provide tests that avoided the heap, and those might not have segfaulted on Linux, but I was trying to address the specific situation I was experiencing with a musl-based program running on Fiwix.

With regards to 0x60000000 I agree that if that worked before it should still work. I'll take a look at that today and report back later.

@rick-masters
Copy link
Contributor Author

I added a commit to the PR that handles the 0x06000000 situation. The previous code did not handle case where vma->prev wrapped around to the end of list or to itself. Sorry about that.

@mikaku
Copy link
Owner

mikaku commented Jan 5, 2023

For the test cases, the heap should be changed by the mmap. When an mmap segment overlaps the heap, the overlapping part of the heap should be discarded. So, I think the test cases are working as expected (except for 7).

Yes, the problem here is that even after applying your patch Fiwix wasn't managing correctly the discarded segments. It didn't discard the memory pages associated to those overlapped segments. That's why we don't get a segfault when we should.

I also tried some of these tests on Linux and experienced segfaults, but that's probably because other standard libraries like glibc don't react well to the heap being remapped. But the kernel may be correctly doing what it was told to do according to the documentation of mmap. It would have been simpler for me to provide tests that avoided the heap, and those might not have segfaulted on Linux, but I was trying to address the specific situation I was experiencing with a musl-based program running on Fiwix.

I appreciate a lot the effort you put on fixing this. Fiwix indeed was lacking of a properly sys_mmap behavior.

I'll include a call to free_vma_pages() in merge_vma_regions() in order to free the pages associated to a discarded segment.

@rick-masters
Copy link
Contributor Author

It didn't discard the memory pages associated to those overlapped segments.

Oh, yeah that's something I never thought of. The previous code adjusted the vma ranges without addressing the underlying pages so it didn't even occur to me. I don't have tests which verify pages are unmapped. I only have test 5 which verified a page was still mapped.

You may need to add or remove a subset of pages from a vma or transfer some of them to that new vma region created in merge_vma_regions, which is more complicated than I thought. If you make that change I'll be sure to look it over and make sure it works for the situations I'm testing.

@rick-masters
Copy link
Contributor Author

After thinking about this a little bit more I am wondering why the test 5 program segfaulted before my fix was applied. The vma for the heap was shortened but the underlying pages for the end of the heap should have still been mapped...

@mikaku mikaku closed this as completed in #17 Jan 6, 2023
@mikaku mikaku added the bug Something isn't working label Jan 6, 2023
@mikaku
Copy link
Owner

mikaku commented Jan 6, 2023

I don't have tests which verify pages are unmapped. I only have test 5 which verified a page was still mapped.

You mean this line?

 expansion[16000] = 0;

Tests 3 and 5 have this line.

or transfer some of them to that new vma region created in merge_vma_regions, which is more complicated than I thought.

No, the system call sys_mremap is not supported yet. So, AFAIK, memory pages must be only freed or removed.

After thinking about this a little bit more I am wondering why the test 5 program segfaulted before my fix was applied. The vma for the heap was shortened but the underlying pages for the end of the heap should have still been mapped...

AFAIK, it segfaulted because you did an mmap() in the middle of heap and, hence, part of the heap was discarded (memory pages freed).

Oh!, you're right. It doesn't segfaults under Linux. The address of the variable expansion is in the first part of the heap which has not been overlapped by the mmap().

@mikaku
Copy link
Owner

mikaku commented Jan 6, 2023

Case 5 (continued):

This is the result under Fiwix:

# ./5
expansion = 8066018
before mmap
mmap regions:
08048000-08065000 r-xp 00000000 03:02 18021      [text]
08065000-08066000 rw-p 0001d000 03:02 18021      [data]
08066000-0806c000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
after mmap
mmap regions:
08048000-08065000 r-xp 00000000 03:02 18021      [text]
08065000-08066000 rw-p 0001d000 03:02 18021      [data]
08066000-08067000 rw-p 00000000 00:00 0          [heap]
08067000-08068000 ---p 00000000 00:00 0          [mmap]
08068000-0806c000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
Segmentation fault

and this is the result under Linux:

# ./5
expansion = 8065b00
before mmap
mmap regions:
08048000-08065000 r-xp 00000000 03:02 18021
08065000-08066000 rw-p 0001d000 03:02 18021
08066000-0806b000 rwxp 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
after mmap
mmap regions:
08048000-08065000 r-xp 00000000 03:02 18021
08065000-08066000 rw-p 0001d000 03:02 18021
08066000-08067000 ---p 00000000 00:00 0
08067000-0806b000 rwxp 00001000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0

I see two different things here:

  1. the address of expansion in Fiwix resides in the heap segment, but it is in the data segment under Linux.
  2. Linux introduced and offset of 4096 bytes (0x1000) in the resulting heap.

@mikaku
Copy link
Owner

mikaku commented Jan 6, 2023

This is the change in mm/mmap.c to free memory pages in overlapped segments:

diff --git a/mm/mmap.c b/mm/mmap.c
index 30a9d3c..ba2e45b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -208,6 +245,8 @@ void merge_vma_regions(struct vma *a, struct vma *b)
                new->s_type = a->s_type;
                new->inode = a->inode;
                new->o_mode = a->o_mode;
+               free_vma_pages(a, b->start, a->end - b->start);
+               invalidate_tlb();
                a->end = b->start;
                if(a->start == a->end) {
                        del_vma_region(a);
@@ -220,7 +259,7 @@ void merge_vma_regions(struct vma *a, struct vma *b)
        }
 }
 
-static void free_vma_pages(struct vma *vma, unsigned int start, __size_t length)
+void free_vma_pages(struct vma *vma, unsigned int start, __size_t length)
 {
        unsigned int n, addr;
        unsigned int *pgdir, *pgtbl;

@mikaku
Copy link
Owner

mikaku commented Jun 1, 2023

I recently noticed, while doing a complete package build, that some packages showed the following message when its ./configure script was running:

checking for working mmap... no

Why that no? As far as I can tell, sys_mmap() seems pretty stable after all.
Well, I think this is some way related to the problem in the example 5.

I was able to extract the program that tested mmap() inside configure, then removed irrellevant lines, and added some printf() to help to track process mapping at each step:

#include <stdio.h>
/* Thanks to Mike Haertel and Jim Avera for this test.
   Here is a matrix of mmap possibilities:
        mmap private not fixed
        mmap private fixed at somewhere currently unmapped
        mmap private fixed at somewhere already mapped
        mmap shared not fixed
        mmap shared fixed at somewhere currently unmapped
        mmap shared fixed at somewhere already mapped
   For private mappings, we should verify that changes cannot be read()
   back from the file, nor mmap's back from the file at a different
   address.  (There have been systems where private was not correctly
   implemented like the infamous i386 svr4.0, and systems where the
   VM page cache was not coherent with the file system buffer cache
   like early versions of FreeBSD and possibly contemporary NetBSD.)
   For shared mappings, we should conversely verify that changes get
   propogated back to all the places they're supposed to be.

   Grep wants private fixed already mapped.
   The main things grep needs to know about mmap are:
   * does it exist and is it safe to write into the mmap'd area
   * how to use it (BSD variants)  */

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/errno.h>
#include <stdlib.h>

static void show_maps(void)
{
        char buf[80];
        FILE *f;

        sprintf(buf, "/proc/self/maps");
        f = fopen(buf, "r");
        while(fgets(buf, 79, f)) {
                printf("%s", buf);
        }
        fclose(f);
}

int
main (void)
{
  char *data, *data2, *data3;
  int i, pagesize;
  int fd;

  pagesize = getpagesize ();

  /* First, make a file with some known garbage in it. */
  show_maps();
  data = (char *) malloc (pagesize);
  printf("data = (0x%x) (pagesize = %d)\n", (unsigned int)data, pagesize);
  show_maps();
  if (!data) {
    printf("ERR: first malloc failed!\n");
    return (1);
  }
  for (i = 0; i < pagesize; ++i)
    *(data + i) = rand ();
  umask (0);
  fd = creat ("conftest.mmap", 0600);
  if (fd < 0) {
    printf("ERR: creat(conftest.mmap)\n");
    return (1);
  }
  if (write (fd, data, pagesize) != pagesize) {
    printf("ERR: write(..., pagesize)\n");
    return (1);
  }
  close (fd);

  /* Next, try to mmap the file at a fixed address which already has
     something else allocated at it.  If we can, also make sure that
     we see the same garbage.  */
  fd = open ("conftest.mmap", O_RDWR);
  if (fd < 0) {
    printf("ERR: open(conftest.mmap)\n");
    return (1);
  }
  data2 = (char *) malloc (2 * pagesize);
  printf("data2 = (0x%x) (pagesize = %d)\n", (unsigned int)data2, pagesize);
  show_maps();
  if (!data2) {
    printf("ERR: not data2\n");
    return (1);
  }
  data2 += (pagesize - ((int) data2 & (pagesize - 1))) & (pagesize - 1);
  printf("data2 = (0x%x)\n", (unsigned int)data2);
  show_maps();
  if (data2 != mmap (data2, pagesize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_FIXED, fd, 0L)) {
    printf("ERR: mmap()\n");
    return (1);
  }
  for (i = 0; i < pagesize; ++i)
    if (*(data + i) != *(data2 + i)) {
      printf("data != data2\n");
      return (1);
    }

  /* Finally, make sure that changes to the mapped area do not
     percolate back to the file as seen by read().  (This is a bug on
     some variants of i386 svr4.0.)  */
  for (i = 0; i < pagesize; ++i)
    *(data2 + i) = *(data2 + i) + 1;
  data3 = (char *) malloc (pagesize);
  perror("data3");
  printf("data3 = (0x%x) (pagesize = %d)\n", (unsigned int)data3, pagesize);
  show_maps();
  if (!data3) {
    printf("ERR: not data3\n");
    return (1);
  }
  if (read (fd, data3, pagesize) != pagesize) {
    printf("read != pagesize\n");
    return (1);
  }
  for (i = 0; i < pagesize; ++i)
    if (*(data + i) != *(data3 + i)) {
      printf("data != data3\n");
      return (1);
    }
  close (fd);
  return (0);
}

This is the output of this program under Fiwix:

# ./err_mmap 
08048000-08062000 r-xp 00000000 03:02 18662      [text]
08062000-08063000 rw-p 0001a000 03:02 18662      [data]
08063000-08064000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
data = (0x80639c8) (pagesize = 4096)
08048000-08062000 r-xp 00000000 03:02 18662      [text]
08062000-08063000 rw-p 0001a000 03:02 18662      [data]
08063000-08066000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
data2 = (0x80649d0) (pagesize = 4096)
08048000-08062000 r-xp 00000000 03:02 18662      [text]
08062000-08063000 rw-p 0001a000 03:02 18662      [data]
08063000-08069000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
data2 = (0x8065000)
08048000-08062000 r-xp 00000000 03:02 18662      [text]
08062000-08063000 rw-p 0001a000 03:02 18662      [data]
08063000-08069000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
data3: Success
data3 = (0x0) (pagesize = 4096)
08048000-08062000 r-xp 00000000 03:02 18662      [text]
08062000-08063000 rw-p 0001a000 03:02 18662      [data]
08063000-08065000 rw-p 00000000 00:00 0          [heap]
08065000-08066000 rw-p 00000000 03:02 17963      [mmap]
08066000-0806b000 rw-p 00000000 00:00 0          [heap]
bffff000-c0000000 rwxp 00000000 00:00 0          [stack]
ERR: not data3

Why malloc() returns NULL to data3?

And this is the output under Linux (i386) :

# ./err_mmap 
08048000-08062000 r-xp 00000000 03:02 18662
08062000-08063000 rw-p 0001a000 03:02 18662
08063000-08064000 rwxp 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
data = (0x80634b0) (pagesize = 4096)
08048000-08062000 r-xp 00000000 03:02 18662
08062000-08063000 rw-p 0001a000 03:02 18662
08063000-08066000 rwxp 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
data2 = (0x80644b8) (pagesize = 4096)
08048000-08062000 r-xp 00000000 03:02 18662
08062000-08063000 rw-p 0001a000 03:02 18662
08063000-08069000 rwxp 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
data2 = (0x8065000)
08048000-08062000 r-xp 00000000 03:02 18662
08062000-08063000 rw-p 0001a000 03:02 18662
08063000-08069000 rwxp 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
data3: Success
data3 = (0x80664c0) (pagesize = 4096)
08048000-08062000 r-xp 00000000 03:02 18662
08062000-08063000 rw-p 0001a000 03:02 18662
08063000-08065000 rwxp 00000000 00:00 0
08065000-08066000 rw-p 00000000 03:02 17963
08066000-08069000 rwxp 00003000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0

As you can see, there is again an offset on an anonymous mapping:

08066000-08069000 rwxp 00003000 00:00 0

I need to investigate further.

@mikaku
Copy link
Owner

mikaku commented Jun 3, 2023

OK, I've found the bug:

Fiwix/mm/mmap.c

Line 248 in 2fd0923

free_vma_pages(a, b->start, a->end - b->start);

This line should be:

       free_vma_pages(a, b->start, b->end - b->start);

otherwise it ends up freeing too many pages.

After applying this patch the test 5 no longer segfaults and, after another complete package build, all mmap checkings now say that mmap works.

@mikaku
Copy link
Owner

mikaku commented Jun 6, 2023

I think we can safely close it now.
We can reopen it if we discover another bug related to this issue.

@mikaku mikaku closed this as completed Jun 6, 2023
mikaku added a commit that referenced this issue Aug 3, 2023
sys_lchown, and creating the new #182 sys_chown which will follow
(dereference) symbolic links.

This might break the compatibility with Linux 2.0 ABI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants