MoarVM build fails on WSL (Bash on Windows) #470

Closed
yorickdowne opened this Issue Dec 28, 2016 · 11 comments

yorickdowne commented Dec 28, 2016

Building MoarVM currently fails out of the box on WSL. Tested with Rakudo * 2016.11 on Windows 10 build 14986 with Ubuntu 16.04 LTS. Output from perl Configure.pl --backend=moar --gen-moar is attached as configure-rakudo.txt.

This is due to a missing feature in WSL: it does not implement execstack, so a library that requests an executable stack cannot be loaded. See the discussion at Microsoft/WSL#286. MS may, at some future point, implement execstack, at which point this issue would go away.

MoarVM can be built by manually clearing the execstack flag after the first failed attempt via execstack -c /path/to/libmoar.so, then running the build again. See the second attached file, configure-rakudo-no-execstack-pass2.txt.

The two instances of libmoar.so are:
./MoarVM/libmoar.so
./install/lib/libmoar.so
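
As a rough sketch, the workaround above could be scripted as follows. The function name clear_execstack and the EXECSTACK override variable are illustrative assumptions, not part of MoarVM or Rakudo tooling. The two library paths are the ones listed above, taken relative to the Rakudo Star source directory.

```shell
# Hypothetical helper: clear the executable-stack flag on both copies of
# libmoar.so. Run it from the Rakudo Star source directory after the first
# failed build, then run the build again.
# EXECSTACK can be overridden (for example EXECSTACK=echo) to preview the
# commands instead of running them.
clear_execstack() {
    root=${1:-.}
    for lib in "$root/MoarVM/libmoar.so" "$root/install/lib/libmoar.so"; do
        if [ -f "$lib" ]; then
            "${EXECSTACK:-execstack}" -c "$lib" || return 1
        else
            echo "skipping $lib (not built yet)" >&2
        fi
    done
}
```

Calling clear_execstack with no argument works on the current directory. A copy of libmoar.so that does not exist yet is skipped with a note on stderr.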

Edit 2017-01-04: This can likely be solved upstream at dyncall, see next comment.

There are four ways to handle this issue that come to mind:

  1. Do nothing, wait for MS to implement execstack. They will do so eventually. I've asked for a roadmap and don't currently have an answer.
  2. Do nothing much, document the execstack -c workaround at http://rakudo.org/how-to-get-rakudo/#Installing-Rakudo-Star-Source-Prerequisites-Debian. Though, on second thought: This might be iffy. This just clears the execstack p_flag. Does libmoar.so actually require execstack?
  3. Detect WSL and link libmoar.so with -Wl,-z,noexecstack, see also Microsoft/WSL#283. This would avoid the build error. This workaround for MS' missing feature would then, likely, be removed again when MS releases execstack support in a production version of Windows 10 / WSL. Same concern as 2).

3a: To detect WSL, this code snippet that was considered for a "perl 5 on WSL" issue (since fixed) could be adapted in some form. Test for "Microsoft" or "WSL" in /proc/version:

if ($^O eq 'linux') {
    # Look for "Microsoft" or "WSL" in the kernel version string
    if (open my $pv, '<', '/proc/version') {
        my $lv = <$pv>;
        close $pv;
        if (index($lv,'Microsoft') != -1 || index($lv,'WSL') != -1) {
            # Special-case the build here
        }
    }
}
  4. Investigate why libmoar.so has execstack set. The files that cause it are in dyncall. As per https://linux.die.net/man/8/execstack, GCC "trampoline" code, an example being nested functions, requires execstack. Compiling without should be instructive. See also https://www.win.tue.nl/~aeb/linux/hh/protection.html.
    However, linking without execstack is successful, see 4a. Testing fails on spectests, although it shows 0 Failed tests (see below). I am not sure whether that's expected.
    See also https://wiki.gentoo.org/wiki/Hardened/GNU_stack_quickstart for finding the .o files that require execstack. It turns out it's in dyncall, see 4b.
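
The /proc/version test from 3a can also be tried directly from a shell before touching any build scripts. This is a minimal sketch: the function name is_wsl and its optional file argument are illustrative assumptions. Unlike the Perl snippet it matches case-insensitively, on the assumption that some WSL kernels spell "microsoft" in lower case.

```shell
# Sketch: succeed (exit status 0) when the kernel version string looks like WSL.
# The optional argument (default /proc/version) exists only so the function
# can be tried against a sample file.
is_wsl() {
    version_file=${1:-/proc/version}
    [ -r "$version_file" ] && grep -qiE 'microsoft|wsl' "$version_file"
}

if is_wsl; then
    echo "WSL detected: special-case the build here"
fi
```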

4a: I've tested disabling execstack during compile/link. MoarVM builds, rakudo builds, make test (mostly) succeeds. Which leaves me a little confused as to why GCC is setting the execstack flag. Is there a seldom-used code path that requires it? Or is this a fluke, and MoarVM does not need execstack after all?
This is how I tested:

Edit MoarVM/build/Makefile.in

Append -Wl,-z,noexecstack to LDFLAGS
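
For repeated test builds, the manual edit could be scripted roughly as below. This is only a sketch: the helper name add_noexecstack is made up, and it assumes LDFLAGS is assigned on a single line that starts with "LDFLAGS" followed by "=", which may not match how Makefile.in actually defines it.

```shell
# Hypothetical helper: append -Wl,-z,noexecstack to every line that assigns
# LDFLAGS in the given makefile template. Lines that already mention
# noexecstack are left alone, so running it twice is harmless.
add_noexecstack() {
    makefile=$1
    ldflags_tmp=$(mktemp) || return 1
    sed '/noexecstack/!s/^LDFLAGS[[:space:]]*=.*/& -Wl,-z,noexecstack/' \
        "$makefile" > "$ldflags_tmp" && cat "$ldflags_tmp" > "$makefile"
    status=$?
    rm -f "$ldflags_tmp"
    return $status
}
```

It would be invoked as add_noexecstack MoarVM/build/Makefile.in before running Configure.pl.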

These succeed:

perl Configure.pl --backend=moar --gen-moar
make
make rakudo-test

This fails, but succeeds "mostly":
make rakudo-spectest

Test Summary Report
-------------------
t/spec/S03-metaops/hyper.rakudo.moar                        (Wstat: 0 Tests: 402 Failed: 0)
  TODO passed:   115
t/spec/S05-modifier/counted-match.rakudo.moar               (Wstat: 0 Tests: 29 Failed: 0)
  TODO passed:   19-20, 25-27
t/spec/S05-substitution/subst.rakudo.moar                   (Wstat: 0 Tests: 179 Failed: 0)
  TODO passed:   63
t/spec/S15-nfg/regex.rakudo.moar                            (Wstat: 0 Tests: 12 Failed: 0)
  TODO passed:   10
t/spec/S26-documentation/09-configuration.rakudo.moar       (Wstat: 0 Tests: 17 Failed: 0)
  TODO passed:   8-10
t/spec/S32-io/pipe.t                                        (Wstat: 65280 Tests: 10 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 14 tests but ran 10.
t/spec/S32-str/substr.rakudo.moar                           (Wstat: 0 Tests: 56 Failed: 0)
  TODO passed:   56
t/spec/integration/advent2012-day15.rakudo.moar             (Wstat: 0 Tests: 11 Failed: 0)
  TODO passed:   9
Files=1092, Tests=51351, 1090 wallclock secs ( 5.69 usr 11.26 sys + 637.14 cusr 241.83 csys = 895.92 CPU)
Result: FAIL
Makefile:514: recipe for target 'm-spectest5' failed
make[1]: *** [m-spectest5] Error 1

This succeeds:
make modules-test

4b: Running scanelf on MoarVM after a failed build (Makefile.in not modified) shows that dyncall is the reason execstack is set. I've reached out to D. Adler to see whether he can spare cycles to shed light on when (or whether) dyncall executes code on the stack.

tbehrens@WSL-Insider-Test:~/rakudo/rakudo-star-2016.11/MoarVM$ scanelf -qeR .
!WX --- --- ./3rdparty/dyncall/dyncall/dyncall_call.o
!WX --- --- ./3rdparty/dyncall/dyncallback/dyncall_callback_arch.o
RWX --- --- ./libmoar.so
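
If scanelf is not installed, the same flag can be read from the GNU_STACK program header with readelf from binutils. The sketch below assumes readelf's wide (-W) output layout, where the flags are the second-to-last column. The function name gnu_stack_is_exec is illustrative.

```shell
# Sketch: read `readelf -lW` output on stdin and succeed (exit status 0)
# when the GNU_STACK program header carries the E (execute) flag.
# A file with no GNU_STACK header at all is reported as "not flagged" here.
gnu_stack_is_exec() {
    awk '$1 == "GNU_STACK" { found = 1; if ($(NF-1) ~ /E/) is_exec = 1 }
         END { exit !(found && is_exec) }'
}

# Usage:
#   readelf -lW ./libmoar.so | gnu_stack_is_exec && echo "executable stack requested"
```

Keeping the parsing separate from the readelf call makes it easy to check against saved output from another machine.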

configure-rakudo.txt
configure-rakudo-no-execstack-pass2.txt

yorickdowne commented Jan 4, 2017

dyncall-noexec-asm-patch.txt
dyncall-configure-patch.txt

Progress and a suggested solution.

1), 2) and 3): All reasonable, and they all miss the mark a little bit.

4): I have been in touch with the dyncall devs. They do not require an executable stack. That two .o files are built with the execstack flag is an artifact of their make environment.
Suggested solution: Wait for dyncall to resolve this, then incorporate into MoarVM from upstream.

One possible fix is to specify ASFLAGS in "configure" in the dyncall root directory.

dyncall devs will instead investigate adding .section .note.GNU-stack,"",@progbits to relevant .S files in dyncall. This might come as part of 1.0 and "will take a while."

For reference, I've attached patch files showing either approach. Final solution by dyncall devs will most likely not be that exact patch.

This is the statement by one of the dyncall devs:

I haven’t done anything with the execstack and we do not need an executable C stack (if that’s what’s behind execstack).
The dyncallback library (for closures) allocates memory chunks that are written and then made executable. But these chunks come from mmap or VirtualAlloc on Win32.
The only thing that dyncall(back) does with the stack is to copy buffered argument data from heap to stack (for dyncalls) and the reverse (for dyncallbacks).

ryanerwin commented Apr 21, 2017

Any luck with a MoarVM update for this, perhaps a change in Configure.pl to check for /proc/version for Microsoft, or even update the default LDFLAGS to clear execstack since it's an extra dependency that doesn't actually seem to be required?

I've tried building new releases of MoarVM each month (2017.01 02 03 04) but this still needs to be manually patched each time, making rakudobrew unusable...

BenGoldberg1 commented Apr 22, 2017

The execstack flag is misnamed; it really means "make all readable memory executable".

If your program doesn't have execstack enabled and you malloc a buffer, put executable bytes into it, then immediately call the function in the buffer, the call will fail.

The preferred solutions are to either use mprotect to make the page containing the buffer executable, or use mmap instead of malloc, passing the PROT_EXEC flag when you do so.

See: http://stackoverflow.com/questions/23276488/why-is-execstack-required-to-execute-code-on-the-heap

It would not surprise me in the slightest if dyncall mallocs memory for trampoline functions.

yorickdowne commented Jun 2, 2017

A fix for dyncall's use of execstack has been committed upstream, in the current dyncall dev version. Kudos to the devs for keeping at it; that rabbit hole went deep across different architectures.

dyncall 1.0 is still "a good while out". I'm optimistic though and hope we'll see it in 2017. Once dyncall 1.0 is available upstream, I'll update here again, and then MoarVM devs can take a look at using that instead of 0.9. At which point this bug can be laid to rest.

FelipeMartin commented Jun 6, 2017

Hi guys.

I came here because of a segmentation fault when running a binary compiled with the following line:
$ gcc -Wall -Wa,--execstack -o test test.c
It works fine in a Fedora (Red Hat) VM when I run $ ./test, but in Bash on Windows I get this error.
Someone on Microsoft/WSL#286 said that this package could solve it:
$ apt-get install execstack

Has anyone tried it?

yorickdowne commented Jun 7, 2017

@FelipeMartin Does your code need execstack? And if so, why?
PROT_GROWSDOWN has not been implemented in WSL, and given the security issues with executable stack, there's no consensus right now that it needs to be implemented.
As per @BenGoldberg1's link above, there are better ways to execute code on the heap than setting execstack (mmap and mprotect are mentioned). And you really shouldn't rely on code that needs to execute on the stack.

FelipeMartin commented Jun 7, 2017

@yorickdowne Actually this is an example just for studying purpose (to understand how machine code works).
The following is my program and its gdb session:

#include <stdio.h>

typedef int (*funcp) (int x);

int main(void) {
	int i;
	unsigned char codigo[] = {0x55,0x89,0xe5,0x8b,0x45,0x08,0x83,0xc0,0x01,0x89,0xec,0x5d,0xc3};
	funcp f = (funcp)codigo;
	i = (*f)(10);
	printf("%d\n", i);
	return 0;
}

5       int main(void) {
(gdb) s
7               unsigned char codigo[] = {0x55,0x89,0xe5,0x8b,0x45,0x08,0x83,0xc0,0x01,0x89,0xec,0x5d,0xc3};
(gdb)
8               funcp f = (funcp)codigo;
(gdb)
9               i = (*f)(10);
(gdb)

Program received signal SIGSEGV, Segmentation fault.
0x00007ffffffde4d0 in ?? ()

Could I do something similar (compiling differently, or using mmap or mprotect) to get the same effect?

yorickdowne commented Jun 9, 2018

dyncall 1.0 is out and includes the change to avoid execstack, among a lot of great improvements. http://www.dyncall.org/changelog#dc10

Using dyncall 1.0 will resolve this issue and, it looks like, might resolve a few others, see #794

deven commented Jun 14, 2018

Until MoarVM incorporates dyncall 1.0, a workaround that seems to work for me is to first download and install dyncall 1.0 into /usr/local, then build MoarVM with the --has-dyncall option so it will use it.

If you're building MoarVM via Rakudo's Configure.pl, you can use --moar-option="--has-dyncall" to pass this option through to MoarVM's Configure.pl script.
