-
-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segmentation fault in function storePhysicalMemory #8
Comments
Selfie allocates memory for the emulator in 4KB chunks using malloc. There may be something wrong with that in particular if the returned address is a negative integer. You will need to track those malloc calls and see if memory access matches the blocks returned by malloc. |
Also, make sure the binary is 32-bit. |
yliam, have you been able to resolve the problem? I would like to close the issue. Thanks! |
Hello again Sir, I was unsuccessful in resolving the segmentation fault issue I reported to you in May. I don't think I have enough knowledge of Selfie's internal architecture at this point (and programming skill) to find and kill the bug. I brought the issue to your attention in hopes that, seeing how you engineered an wrote Selfie in an inherently portable manner, you might know of a simple fix that could easily be implemented to eliminate the bug. I realize you have developed Selfie on Linux for Linux, but I think if it could be tweaked to run successfully on Windows (natively) without Cygwin, additional (including beginning programming and/or computer architecture student) users might find the project very appealing and enjoyable, as it is inherently simple in concept and completely self hosting! If you feel there is no time right now to investigate the bug further, it's no problem for me if you close the issue on GitHub. Perhaps one of us (or another enterprising party) will figure it out in time. I congratulate you for the progress you are making on the book manuscript. This will prove to be an invaluable reference for all Selfie enthusiasts, including the previously mentioned beginning programming and/or computer architecture students too! Thank you so much for all the inspirational work you have put into Selfie thus far! Sincerely, |
I could reproduce the segfault on a VM with MSYS2/GCC 6.1.0 and revision 2f93229. (gdb) backtrace I think isVirtualAddressMapped() assumes that uninitialized heap memory is filled with zeros. However, Windows' LocalAlloc() initializes memory content with 0xBAADF00D, which is why it returns 1. |
Impressive! Nice work! @Blazefrost, zeroing all registers as well is perfectly fine. I remember that I thought about doing that a long time ago but then forgot. Could you please do a pull request on the master branch? @yliam, thanks a lot for encouraging us to look into this. As soon as I have the pull request, I will integrate it into the master branch for you to try on your system. |
@yliam, let me also thank you for your kind words on selfie and the manuscript. It is nice to hear that people are interested in this project! Finishing the manuscript will take more time but we are actively working on it, also on the code, experimenting with RISC-V, 64-bit, and multicore support. |
Done! It's pull request #9. |
Thanks! I just pulled this in and then polished the code a bit. @yliam, can you try the code on your machine? |
Well, that DID fix the segmentation fault in function storePhysicalMemory. However, now that the code execution gets past the point of the original (now fixed) segmentation fault, I'm getting another segmentation fault in function loadPhysicalMemory. The new segmentation fault occurs on the 19th call into loadPhysicalMemory. The top of the call stack shows loadPhysicalMemory(paddr=0x300) when the crash occurs. I'm guessing the cause of this bug is very similar to the original one. |
Interesting. @yliam: could you please post the whole console output? @Blazefrost: could you please try to reproduce that error? Try my latest version as well as your fix. Thanks! |
I'm not sure if this is what you're looking for, but here's the session log (with unix line endings). |
Thanks, @yliam I'd need the command and its output that leads to the new seg fault. |
The command issued in windows is: The output leading to the new seg fault is: Then the seg fault occurs. |
Looks like selfie.m is either corrupted or the file is ok but incorrectly loaded. Can you post the file here? |
I'm sorry that I accidentally hit the close issue button. I immediately reopened it. |
That's ok, no worries. Could you please run: ./selfie -c selfie.c -s selfie.s -o selfie.m and send me selfie.s and selfie.m? Please erase both selfie.s and selfie.m before doing so. |
Indeed, there is a issue with both reading and writing binaries due to Windows' strange way of handling files. On POSIX systems, the data you write goes into the file, whereas on Windows there are two file modes. In text mode, the default option, Windows does weird things like handling CRLF or reporting EOF when encountering 0x1A (Ctrl-Z). In binary mode, Windows treats files similar to POSIX (See http://cygwin.com/faq.html#faq.api.cr-lf, first three paragraphs). Binary mode can be enabled by using the binary mode flag (0x8000). The flags are set correctly in the patch below: diff --git a/selfie.c b/selfie.c
index ed324d6..eeabb28 100644
--- a/selfie.c
+++ b/selfie.c
@@ -163,11 +163,11 @@ int* string_buffer; // buffer for string output
int* filename_buffer; // buffer for filenames
int* io_buffer; // buffer for binary I/O
-// 0 = O_RDONLY (0x0000)
-int O_RDONLY = 0;
+// 32768 = 0x8000 = _O_BINARY (0x8000) | _O_RDONLY (0x0000)
+int O_RDONLY = 32768;
-// 577 = 0x0241 = O_CREAT (0x0040) | O_WRONLY (0x0001) | O_TRUNC (0x0200)
-int O_CREAT_WRONLY_TRUNC = 577; // flags for opening write-only files
+// 33537 = 0x8301 = _O_BINARY (0x8000) | _O_CREAT (0x0100) | _O_WRONLY (0x0001) | _O_TRUNC (0x0200)
+int O_CREAT_WRONLY_TRUNC = 33537; // flags for opening write-only files
// 420 = 00644 = S_IRUSR (00400) | S_IWUSR (00200) | S_IRGRP (00040) | S_IROTH (00004)
int S_IRUSR_IWUSR_IRGRP_IROTH = 420; // flags for rw-r--r-- file permissions
|
Nice! Is the only difference between Windows and Unix then S_IRUSR_IWUSR_IRGRP_IROTH versus S_IREAD_IWRITE? Can we have the _O_BINARY set in O_CREAT_WRONLY_TRUNC and O_RDONLY even on Unix? I am asking because I would like to identify the minimal difference in selfie between Windows and Unix. Thanks! |
Unfortunately not. Not only S_IREAD_IWRITE/S_IRUSR_IWUSR_IRGRP_IROTH differ but also the open() flags, e.g. O_CREAT is 0x0100 on Windows, but 0x0040 on UNIX. |
@Blazefrost: Excellent! Thanks for the clarification! Once I set the binary mode flag (0x8000) for Windows, the segmentation fault went away and selfie now runs to completion! @ckirsch: I'm noticing a very slight difference between the selfie1.m and selfie2.m files generated during self-compilation using the Windows equivalent of the Linux command: ./selfie -c selfie.c -o selfie1.m -m 2 -c selfie.c -o selfie2.m Which, in Windows, is: .\selfie.exe -c .\selfie.c -o .\selfie1.m -m 2 -c .\selfie.c -o .\selfie2.m I'll do some more testing and report back on this, with copies of the .m files for comparison. |
Great! @yliam: please generate assembly files as well to see if there is a difference: .\selfie.exe -c .\selfie.c -o .\selfie1.m -s selfie1.s -m 2 -c .\selfie.c -o .\selfie2.m -s selfie2.s @Blazefrost: I will eventually need a pull request with your fix. Could you please do that in such a way that going from the Unix to the Windows version is a simple matter of changing a few comments? But for now, let's first see if there is a problem with the above. Many thanks! |
Hello again all, I took selfie.c from commit b2b6fa3 and implemented the changes outlined by Blazefrost. After running the following: I got the following results (including modified selfie.c): There are slight differences between selfie1.m and selfie2.m. |
Function implementOpen() also contains the following call:
Does this need to be addressed as well? |
Thanks, @yliam! No, this:
is fine and the way you applied @Blazefrost's patch is also fine. After comparing the binary and assembly files that you sent us I found the issues in the code leading to the slight differences in both binary and assembly files. They were again related to non-zeroed memory. Please update to the latest version and try again. |
By the way, is anyone willing to find flags that work on all systems? I ran selfie with the flags for Windows on my Mac and it worked...The only difference is that files created by selfie get rw------- rather than rw-r--r-- permissions which is at least not problematic. I would anyway prefer to have a solution that is based on proper reasoning. |
Thanks Professor Kirsch! After applying @Blazefrost's patch to your most recent bug fix in commit 2067b46, selfie is now working perfectly for me natively on Windows. Cygwin is no longer required! Here's the modified source file I used (rename it to selfie.c before use): First, I compiled it with: Then, I ran it in self-compilation mode with: The generated files selfie1.m and selfie2.m are identical! |
I've digged through MinGW's <sys/stat.h> and found out that all permission flags are present in it, too. Luckily, the values of the flags match with POSIX's specs, so S_IRUSR_IWUSR_IRGRP_IROTH doesn't need to be changed after all 😉. I've changed the patch above accordingly. |
Does anybody know what is affected if you set the Windows specific _O_BINARY oflag (0x8000) bit in calls to the open() function on POSIX systems? |
Thank you, @Blazefrost! I pulled this in but now made the Windows flags default. Let's see if travis fails or not. It does work on my Mac. I also found a mistake in the Unix flags...but things worked anyway. |
@yliam, it does seem that _O_BINARY is ignored on Unix systems... |
Ok, the Linux build with the WINDOWS flags fails. So, I went back to the LINUX/MAC configuration as default. However, I am not yet giving up on finding a one-configuration-fits-all setting. @Blazefrost, you seem to have access to both Linux and Windows machines, right? I am wondering what happens on Windows when we use a combination of all flags, that is, 34561 or 0x8701. But more systematically, could you try to write code that figures out what OS it is dealing with by opening the file that needs to be opened but using a subset of all flags in such a way that failure tells you the OS and then allows you to retry with the correct flags? Ideally, if the second attempt also fails then it should be a true failure, in particular not misuse of flags anymore. |
By the way, this has to work independently of the to-be-opened file already existing or not. |
If that doesn't work you may rely on the variable |
I continued experimenting a bit more. The latest version works again on Linux and Mac but probably not on Windows. Can anyone check? |
Per your request, I tried 628b1e8. It compiled fine with MinGW, but gave me the following error: selfie.exe: could not create binary output file .\selfie1.m At this point, it looks like either the POSIX configuration or the Windows configuration can work by default, but not both! |
Too bad. I am now experimenting with something new but I have difficulties finding the correct flags for Linux in fact. Turns out that our original flag value 577 worked for Linux and Mac but only by accident somehow. Does anyone know why 4353 does not work on Linux? |
Ok, I figured the Linux flags out: 577 is indeed correct. I did that by running a test program checked into a new branch called flags. Can someone please check this out and try to get In the meantime I checked in selfie in the master branch that may in fact work on all platforms now, please try as well. Thanks! |
38cf8f8 does indeed work on Windows, when the new function openWriteOnly() makes the third call to function open(), after rolling over to the configuration that uses the Windows flags! I also got make to run the selfie Makefile in Windows PowerShell after making only a few changes to the Makefile. In case you don't know, Windows PowerShell is Microsoft's preferred command shell in Windows. Here are the details on how I got make to run the selfie Makefile on Windows: 1 - "CFLAGS" variable: MinGW doesn't like the define "-D'main(a,b)=main(a,char**argv)'", so it was removed. I'm not sure what I missed here that prevented the define from working. 2 - "selfie" target: "$(CC)" causes an error, so it was hard coded to "gcc" instead. I probably also missed something simple here that would have let "$(CC)" work. 3a - "test" target: Built-in Windows commands can always be loaded in PowerShell without specifying their locations. However, for security reasons, PowerShell doesn't load non-built-in Windows commands from the current location by default, so attempting to load selfie by using either "selfie" or "selfie.exe" (without explicitly telling PowerShell to run the selfie executable in the current location) does not work. In this case though, selfie is trusted, so this default behavior is easily overridden by simply telling PowerShell to load the command from the current location using either ".\selfie" or ".\selfie.exe". Either command works, because Windows automatically checks to see if there is an executable file name with a ".exe" extension for the command that was entered. 3b - "test" target: The Linux utility program named "diff" does not exist in Windows. In PowerShell, the command "diff" is an alias to a built-in PowerShell cmdlet named "Compare-Object", which is not what is wanted here. However, Windows does have a built-in Windows command named "fc" (actual file name "fc.exe"), which, similarly to "diff" on Linux, compares two files or sets of files and displays the differences between them. Strangely though, in the case of the built-in Windows command "fc", if it is being used in a PowerShell shell, then the complete file name "fc.exe" must be used instead of the usually acceptable form of "fc". This is because, in PowerShell, the command "fc" is an alias to another built-in PowerShell cmdlet named "Format-Custom". The desired action must therefore be invoked by issuing the command as "fc.exe" (instead of in the usually acceptable form of "fc" without the ".exe") to avoid it being interpreted by PowerShell as the "Format-Custom" cmdlet's alias "fc" instead of "fc.exe". 4 - "clean" target: The Linux "rm -rf" command was replaced by its Windows equivalent, "del". Here's the Makefile for MinGW on Windows: Finally, here's the log for the successful run of make test for 38cf8f8 on Windows: |
Additionally, I compiled test.c and windows.c from the new flags branch using MinGW on Windows and ran each of them in PowerShell on Windows. Here are the results: PS C:\selfie-flags> .\test PS C:\selfie-flags> .\windows |
@yliam: thanks a lot! This confirms the flag settings we have in selfie.c now. Very nice. I am not sure yet if we can support your MinGW version of the Makefile but we will definitely keep the Windows support in selfie.c As soon as I get to it I will work on the readme.md file for selfie to reflect Windows support. |
I cleaned up the code and updated the Readme file. I also finished the in-memory linker and pulled it into the master branch. Yes, we now have a linker. I also removed the linker and flags branches. @yliam: I believe the original issue is solved. Could you please close the issue? Many thanks, everyone, for your help! |
Yes, the original issue has been successfully resolved! Thank you @ckirsch, @Blazefrost, and anyone else who contributed in figuring out what was causing the problem. Now we all have a fine cross-platform version of selfie to use as a result! Thanks again, |
Greetings Again Sir,
I realize your development platform and therefore main operating focus for Selfie is Linux. As you will recall from our previous correspondence in February, I have successfully run Selfie in Cygwin. However, I have also compiled selfie natively on Windows with MinGW. I only had to change the following six lines by looking up and changing open() oflag values to reflect those found in the MinGW fcntl.h file and file access permission (sflag) values to reflect those found in the MinGW sys/stat.h file. Original and changed versions of each line are shown below:
165
// 577 = 0x0241 = O_CREAT (0x0040) | O_WRONLY (0x0001) | O_TRUNC (0x0200)
// 769 = 0x0301 = _O_CREAT (0x0100) | _O_WRONLY (0x0001) | _O_TRUNC (0x0200)
166
int O_CREAT_WRONLY_TRUNC = 577; // flags for opening write-only files
int win_O_CREAT_WRONLY_TRUNC = 769; // flags for opening write-only files
168
// 420 = 00644 = S_IRUSR (00400) | S_IWUSR (00200) | S_IRGRP (00040) | S_IROTH (00004)
// 384 = 0x0180 = _S_IREAD (0x0100) | _S_IWRITE (0x0080)
169
int S_IRUSR_IWUSR_IRGRP_IROTH = 420; // flags for rw-r--r-- file permissions
int win_S_IREAD_IWRITE = 384; // flags for rw file permissions
3976
fd = open(binaryName, O_CREAT_WRONLY_TRUNC, S_IRUSR_IWUSR_IRGRP_IROTH);
fd = open(binaryName, win_O_CREAT_WRONLY_TRUNC, win_S_IREAD_IWRITE);
6015
assemblyFD = open(assemblyName, O_CREAT_WRONLY_TRUNC, S_IRUSR_IWUSR_IRGRP_IROTH);
assemblyFD = open(assemblyName, win_O_CREAT_WRONLY_TRUNC, win_S_IREAD_IWRITE);
My issue is that I get a signal SIGSEV, Segmentation Fault when line 4878 is executed inside the storePhysicalMemory function on native Windows compiled with MinGW. Do you have any idea as to what might be causing this? Maybe something to do with the paddr pointer?
Thanks,
yliam
The text was updated successfully, but these errors were encountered: