Cosmopolitan Lua test runner broken #601

Open
mingodad opened this issue Sep 8, 2022 · 31 comments
Labels
contributions welcome We'll commit to review and maintenance if the people who need it write the changes.

@mingodad

mingodad commented Sep 8, 2022

While playing with cosmopolitan I decided to build some projects with it and test them; one of them is Lua. Cosmopolitan already ships Lua in https://github.com/jart/cosmopolitan/tree/master/third_party/lua, and when testing it with a battery of tests I've collected from several sources (mainly from Lua itself) I noticed that it doesn't pass them, while the original Lua-5.4.4 and Lua-5.3.6 do.

The attached Lua tests have been run with Lua (5.1.5, 5.2.4, 5.3.6, 5.4.4), and all of them pass every test.
lua-tests.zip

Here is the output of Lua from cosmopolitan:

cosmopolitan/o/third_party/lua/lua.com lua-tests.lua 
	strings         pass:142   fail: 0      1ms
lua-tests.lua:294: field 'atan2' is not callable (a nil value)
stack traceback:
lua-tests.lua:294: in local 'testfunc'
./minctest.lua:74: in function 'lrun'
lua-tests.lua:194: in main chunk
[C]: in ?
	math            

Several math functions fail, and after commenting them out it fails other tests.

Using this Lua script https://lua-users.org/lists/lua-l/2009-02/msg00284.html as a base, I wrote another one that creates an amalgamation of the various Lua versions (also attached).

As a test I created a Lua-5.3.6 amalgamation
am-lua-5.3.6.c.zip
and built it with cosmopolitan-2.0.1 using the script shown below. When I run the attached tests with it, it segfaults:

cosmopolitan/dad/am-lua-5.3.6.com lua-tests.lua 
	strings         pass:142   fail: 0      1ms
	math            pass:103   fail: 0     13ms
	calls           pass:79   fail: 0     20ms
	closures        pass:93   fail: 0      0ms
	metatables      pass:173   fail: 0      1ms
	literals        pass:296   fail: 0      1ms
	tables,next,for pass:59602   fail: 0     25ms
Segmentation fault (core dumped)

Building it without cosmopolitan passes all tests:

gcc -DWTHOUTH_COSMOPOLITAN -DMAKE_LUA_CMD -DLUA_PROGNAME='"lua"' -DLUA_COMPAT_5_2 -o am-lua-5.3.6 am-lua-5.3.6.c -lm -ldl
cosmopolitan/dad/am-lua-5.3.6 lua-tests.lua 
	strings         pass:142   fail: 0      0ms
	math            pass:103   fail: 0      6ms
	calls           pass:79   fail: 0     18ms
	closures        pass:93   fail: 0      0ms
	metatables      pass:173   fail: 0      1ms
	literals        pass:296   fail: 0      1ms
	tables,next,for pass:59629   fail: 0     37ms
	pattern matchingpass:143   fail: 0     21ms
	table sort      pass:150030   fail: 0    217ms
	vararg          pass:65   fail: 0      0ms
ALL TESTS PASSED (210753/210753)

This is the script used for the cosmopolitan build:

# run gcc compiler in freestanding mode
gcc -g -Os -static -fno-pie -no-pie -nostdlib -nostdinc \
  -fno-omit-frame-pointer -pg -mnop-mcount -mno-tls-direct-seg-refs \
  -o am-lua-5.3.6.com.dbg am-lua-5.3.6.c \
  -DMAKE_LUA_CMD \
  -DLUA_PROGNAME='"lua"' \
  -DLUA_COMPAT_5_2 \
  -DLUA_WITH_COSMOPOLITAN \
  -Wl,--gc-sections -fuse-ld=bfd -Wl,--gc-sections \
  -Wl,-T,ape.lds -include cosmopolitan.h crt.o ape-no-modify-self.o cosmopolitan.a
objcopy -S -O binary am-lua-5.3.6.com.dbg am-lua-5.3.6.com

# NOTE: scp it to windows/mac/etc. *before* you run it!
# ~40kb static binary (can be ~16kb w/ MODE=tiny)
./ape.elf ./am-lua-5.3.6.com
@mingodad
Author

mingodad commented Sep 8, 2022

All of the above was run on Ubuntu 18.04 and gcc-9.4:

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-nrQql7/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~18.04) 

@mingodad changed the title from "Cosmopolitan Lua doesn't pass testes" to "Cosmopolitan Lua doesn't pass tests" on Sep 8, 2022
@mingodad
Author

mingodad commented Sep 8, 2022

Here is the output of gdb:

gdb --args cosmopolitan/dad/am-lua-5.3.6.com.dbg lua-tests.lua 
GNU gdb (Ubuntu 10.2-0ubuntu1~18.04~2) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from cosmopolitan/dad/am-lua-5.3.6.com.dbg...

warning: Loadable section ".tdata" outside of ELF segments
(gdb) r
Starting program: cosmopolitan/dad/am-lua-5.3.6.com.dbg lua-tests.lua
	strings         pass:142   fail: 0      0ms
	math            pass:103   fail: 0      3ms
	calls           pass:79   fail: 0     10ms
	closures        pass:93   fail: 0      0ms
	metatables      pass:173   fail: 0      1ms
	literals        pass:296   fail: 0      0ms
	tables,next,for pass:59579   fail: 0     20ms

Program received signal SIGSEGV, Segmentation fault.
0x00000000004239b9 in str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20862
20862	  const char *src = luaL_checklstring(L, 1, &srcl);  /* subject */
(gdb) bt
#0  0x00000000004239b9 in str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20862
#1  0x0000000000413ca6 in luaD_precall (L=0x100080000018, L@entry=0x423991 <str_gsub>, func=0x1000800f0940, 
    func@entry=0x10008006aee0, nresults=nresults@entry=-1) at am-lua-5.3.6.c:7725
#2  0x000000000040e402 in luaV_execute (L=0x423991 <str_gsub>, L@entry=0x10008006aee0) at am-lua-5.3.6.c:15513
#3  0x000000000040eb7d in luaD_call (L=0x10008006aee0, L@entry=0x1000800f0940, func=<optimized out>, 
    nResults=nResults@entry=-2147045664) at am-lua-5.3.6.c:7790
#4  0x000000000040eba5 in luaD_callnoyield (L=L@entry=0x1000800f0940, func=<optimized out>, 
    nResults=nResults@entry=-2147045664) at am-lua-5.3.6.c:7800
#5  0x000000000040ec2c in lua_callk (L=0x1000800f0940, nargs=<optimized out>, nresults=-2147045664, 
    ctx=<optimized out>, k=<optimized out>) at am-lua-5.3.6.c:4244
#6  0x0000000000423b64 in add_value (tr=6, e=0x1000800f0940 "\221\071B", s=0x10008006aee0 "@\t\017\200", 
    b=0x700000002870, ms=0x700000002648) at am-lua-5.3.6.c:20837
#7  str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20884
#8  0x0000000000413ca6 in luaD_precall (L=0x100080000018, L@entry=0x423991 <str_gsub>, func=0x1000800f0940, 
    func@entry=0x10008006aee0, nresults=nresults@entry=-1) at am-lua-5.3.6.c:7725
#9  0x000000000040e402 in luaV_execute (L=0x423991 <str_gsub>, L@entry=0x10008006aee0) at am-lua-5.3.6.c:15513
#10 0x000000000040eb7d in luaD_call (L=0x10008006aee0, L@entry=0x1000800f0940, func=<optimized out>, 
    nResults=nResults@entry=-2147045664) at am-lua-5.3.6.c:7790
#11 0x000000000040eba5 in luaD_callnoyield (L=L@entry=0x1000800f0940, func=<optimized out>, 
    nResults=nResults@entry=-2147045664) at am-lua-5.3.6.c:7800
#12 0x000000000040ec2c in lua_callk (L=0x1000800f0940, nargs=<optimized out>, nresults=-2147045664, 
    ctx=<optimized out>, k=<optimized out>) at am-lua-5.3.6.c:4244
#13 0x0000000000423b64 in add_value (tr=6, e=0x1000800f0940 "\221\071B", s=0x10008006aee0 "@\t\017\200", 
    b=0x700000004ca0, ms=0x700000004a78) at am-lua-5.3.6.c:20837
#14 str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20884
#15 0x0000000000413ca6 in luaD_precall (L=0x100080000018, L@entry=0x423991 <str_gsub>, func=0x1000800f0940, 
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) q

@mingodad
Author

mingodad commented Sep 8, 2022

I just updated lua-tests to skip non-existent math functions:
lua-tests.zip
And here again is the output of Lua from cosmopolitan:

cosmopolitan/o/third_party/lua/lua.com lua-tests.lua 
	strings         pass:142   fail: 0      1ms
	math            pass:99   fail: 0     12ms
	calls           pass:79   fail: 0     19ms
	closures        pass:93   fail: 0      0ms
lua-tests.lua:1263: attempt to compare two table values
stack traceback:
lua-tests.lua:1263: in local 'test'
lua-tests.lua:1271: in local 'testfunc'
./minctest.lua:74: in function 'lrun'
lua-tests.lua:1099: in main chunk
[C]: in ?
	metatables      

@mingodad
Author

mingodad commented Sep 8, 2022

I'm testing with the latest commit 0547eab and I'm getting this:

bash mk-lua-5.3.6.sh
In file included from <command-line>:
./cosmopolitan.h:28299: warning: "IFNAMSIZ" redefined
28299 | #define IFNAMSIZ    IF_NAMESIZE
      | 
./cosmopolitan.h:28245: note: this is the location of the previous definition
28245 | #define IFNAMSIZ 16
      | 

@mingodad
Author

mingodad commented Sep 8, 2022

I could isolate the test that causes the segfault:

-- deep nest of gsubs
function rev (s)
  return string.gsub(s, "(.)(.+)", function (c,s1) return rev(s1)..c end)
end

local x = string.rep('012345', 10)
print(rev(rev(x)) == x)

cosmopolitan/dad/am-lua-5.3.6.com.dbg deep-nested-gsubs.lua 
Segmentation fault (core dumped)

@pkulchenko
Collaborator

> cosmopolitan/dad/am-lua-5.3.6.com.dbg deep-nested-gsubs.lua
> Segmentation fault (core dumped)

@mingodad, I can't reproduce this failure; it works for me with redbean-based Lua.

> lua-tests.lua:1263: attempt to compare two table values

Also, these failures are all related to the __le metamethod not being present (the tests expect it to be emulated with __lt), but this support has been removed in Lua 5.4:

> The use of the __lt metamethod to emulate __le has been removed. When needed, this metamethod must be explicitly defined.

See https://www.lua.org/manual/5.4/manual.html#8.1

@mingodad
Author

mingodad commented Sep 8, 2022

Thank you for the reply, but all references to the segfault concern the attached Lua-5.3.6 amalgamation.

@mingodad
Author

mingodad commented Sep 8, 2022

When I add these flags from the cosmopolitan default Lua build:

-fno-math-errno -fno-trapping-math -fno-fp-int-builtin-inexact -fno-ident -fno-common -fno-gnu-unique -fstrict-aliasing -fstrict-overflow -fno-semantic-interposition -mno-tls-direct-seg-refs -Wall -Werror -fdebug-prefix-map=/home/mingo/dev/c/A_programming-languages/cosmopolitan= -frecord-gcc-switches -fno-schedule-insns2 -fno-optimize-sibling-calls -mno-omit-leaf-frame-pointer -DSYSDEBUG -O2 -fno-code-hoisting -g -gdescribe-dies -DCOSMO -DMODE="" -DIMAGE_BASE_VIRTUAL=0x800000 -nostdinc -iquote .  -Wa,-W -Wa,-I. -Wa,--noexecstack -Wa,--nocompress-debug-sections -msse3 -fno-math-errno -fno-trapping-math -fno-fp-int-builtin-inexact -fno-ident -fno-common -fno-gnu-unique -fstrict-aliasing -fstrict-overflow -fno-semantic-interposition -mno-tls-direct-seg-refs -std=gnu2x -fno-gcse -ffunction-sections -fdata-sections

and try to build the Lua-5.3.6 amalgamation, I get this:

bash mk-lua-5.3.6.sh
In file included from <command-line>:
./cosmopolitan.h:28299: error: "IFNAMSIZ" redefined [-Werror]
28299 | #define IFNAMSIZ    IF_NAMESIZE
      | 
./cosmopolitan.h:28245: note: this is the location of the previous definition
28245 | #define IFNAMSIZ 16
      | 
am-lua-5.3.6.c: In function 'read_chars':
am-lua-5.3.6.c:19023:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
19023 | }
      | ^
am-lua-5.3.6.c: In function 'read_line':
am-lua-5.3.6.c:18997:1: error: the frame size of 8240 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
18997 | }
      | ^
am-lua-5.3.6.c: In function 'read_all':
am-lua-5.3.6.c:19010:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
19010 | }
      | ^
am-lua-5.3.6.c: In function 'findloader':
am-lua-5.3.6.c:22988:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
22988 | }
      | ^
am-lua-5.3.6.c: In function 'luaL_loadfilex':
am-lua-5.3.6.c:16760:1: error: the frame size of 4128 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
16760 | }
      | ^
am-lua-5.3.6.c: In function 'str_dump':
am-lua-5.3.6.c:20297:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20297 | }
      | ^
am-lua-5.3.6.c: In function 'str_char':
am-lua-5.3.6.c:20277:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20277 | }
      | ^
am-lua-5.3.6.c: In function 'utfchar':
am-lua-5.3.6.c:22289:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
22289 | }
      | ^
am-lua-5.3.6.c: In function 'os_date':
am-lua-5.3.6.c:20009:1: error: the frame size of 8352 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20009 | }
      | ^
am-lua-5.3.6.c: In function 'str_upper':
am-lua-5.3.6.c:20217:1: error: the frame size of 8240 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20217 | }
      | ^
am-lua-5.3.6.c: In function 'str_reverse':
am-lua-5.3.6.c:20191:1: error: the frame size of 8240 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20191 | }
      | ^
am-lua-5.3.6.c: In function 'str_lower':
am-lua-5.3.6.c:20204:1: error: the frame size of 8240 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20204 | }
      | ^
am-lua-5.3.6.c: In function 'str_rep':
am-lua-5.3.6.c:20243:1: error: the frame size of 8256 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20243 | }
      | ^
am-lua-5.3.6.c: In function 'str_pack':
am-lua-5.3.6.c:21512:1: error: the frame size of 8320 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
21512 | }
      | ^
am-lua-5.3.6.c: In function 'tconcat':
am-lua-5.3.6.c:21869:1: error: the frame size of 8240 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
21869 | }
      | ^
am-lua-5.3.6.c: In function 'str_gsub':
am-lua-5.3.6.c:20896:1: error: the frame size of 8880 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
20896 | }
      | ^
am-lua-5.3.6.c: In function 'str_format':
am-lua-5.3.6.c:21191:1: error: the frame size of 8320 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
21191 | }
      | ^
am-lua-5.3.6.c: In function 'luaL_gsub':
am-lua-5.3.6.c:17029:1: error: the frame size of 8240 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
17029 | }
      | ^
am-lua-5.3.6.c: In function 'searchpath':
am-lua-5.3.6.c:22849:1: error: the frame size of 8224 bytes is larger than 4096 bytes [-Werror=frame-larger-than=]
22849 | }
      | ^
cc1: all warnings being treated as errors
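
To see at a glance which functions carry the largest stack frames, the `-Wframe-larger-than=` diagnostics above can be ranked by size. This is only a sketch: it assumes the compiler output was saved to a file and uses plain ASCII quotes as in the log above, and the helper name `rank_frames` and the file name `build.log` are made up for illustration.

```shell
#!/bin/sh
# rank_frames FILE: print "<bytes> <function>" for every frame-size
# diagnostic in a saved GCC log, largest frame first.
rank_frames() {
  awk '
    # Remember the most recent "In function" header; the name sits between quotes.
    /In function/ { split($0, q, "\047"); fn = q[2] }
    # "... the frame size of N bytes is larger than ..."
    /the frame size of/ {
      for (i = 1; i <= NF; i++)
        if ($i == "of" && $(i + 2) == "bytes") print $(i + 1), fn
    }
  ' "$1" | sort -rn
}
```

For example, `rank_frames build.log | head -n 3` would list the three largest frames. In the log above the largest one is `str_gsub` at 8880 bytes, which is also the function that recurses in the failing test.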

@mingodad
Author

mingodad commented Sep 8, 2022

> Also, these failures are all related to the __le metamethod not being present (the tests expect it to be emulated with __lt), but this support has been removed in Lua 5.4:

Also note that all of the attached tests pass with the standard Lua (5.1.5, 5.2.4, 5.3.6, 5.4.4).

@mingodad
Author

mingodad commented Sep 8, 2022

And when I mention cosmopolitan Lua I am referring to cosmopolitan/o/third_party/lua/lua.com, which I suspect is the same one that redbean is using.

@pkulchenko
Collaborator

> Also, these failures are all related to the __le metamethod not being present (the tests expect it to be emulated with __lt), but this support has been removed in Lua 5.4:
> Also note that all of the attached tests pass with the standard Lua (5.1.5, 5.2.4, 5.3.6, 5.4.4).

That may be the case, but only because the "standard" Lua 5.4.x is likely built with LUA_COMPAT_5_3, which enables LUA_COMPAT_LT_LE; cosmopolitan's version of Lua 5.4 doesn't do that, so it fails those tests.

> And when I mention cosmopolitan Lua I am referring to cosmopolitan/o/third_party/lua/lua.com, which I suspect is the same one that redbean is using.

Correct; all I'm saying is that it's not expected to pass those <= and >= tests, as the __le metamethod needs to be provided explicitly. The fact that it works for other Lua 5.4 binaries is likely explained by the use of LUA_COMPAT_5_3.
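
One way to check which behaviour a given binary has is a small probe whose metatable defines only `__lt` and then evaluates `a <= b`. The sketch below writes such a probe and runs it; the helper name `write_lt_le_probe` and the `LUA_BIN` variable are hypothetical, and the two printed messages are just labels chosen for this example.

```shell
#!/bin/sh
# write_lt_le_probe PATH: write a small Lua script whose metatable defines
# __lt only, so `a <= b` works only if __le is still derived from __lt.
write_lt_le_probe() {
  cat > "$1" <<'EOF'
local mt = { __lt = function (a, b) return a.v < b.v end }
local a = setmetatable({ v = 1 }, mt)
local b = setmetatable({ v = 2 }, mt)
local ok = pcall(function () return a <= b end)
print(ok and "__le emulated via __lt" or "__le must be defined explicitly")
EOF
}

# LUA_BIN is a hypothetical variable naming the interpreter under test.
LUA_BIN=${LUA_BIN:-lua}
probe=${TMPDIR:-/tmp}/lt-le-probe.$$.lua
write_lt_le_probe "$probe"
if command -v "$LUA_BIN" >/dev/null 2>&1; then
  "$LUA_BIN" "$probe"
else
  echo "skipped: $LUA_BIN not found"
fi
rm -f "$probe"
```

Running it with `LUA_BIN=cosmopolitan/o/third_party/lua/lua.com`, once as built by default and once after adding `-DLUA_COMPAT_5_3`, should show whether the compatibility option accounts for the metatables failure.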

@mingodad
Author

mingodad commented Sep 8, 2022

Thank you again for the reply!
And what about the segfault of the attached Lua-5.3.6 amalgamation?

@mingodad
Author

mingodad commented Sep 8, 2022

In fact I did a build with cosmopolitan of Lua-(5.1.5, 5.2.4, 5.3.6 and 5.4.4) and all of them segfault.

@mingodad
Author

mingodad commented Sep 8, 2022

Hello Paul!
I just looked at your profile and saw that you have done a lot of nice projects with/for Lua. I once did a translation of your ZeroBraneStudio project (https://github.com/mingodad/ZeroBraneStudioLJS) to LJS to test LJS and succeeded with it.

@mingodad
Author

mingodad commented Sep 8, 2022

I just added OVERRIDE_CFLAGS += -DLUA_COMPAT_5_3 to cosmopolitan/third_party/lua/lua.mk and rebuilt it. Now only one test fails, because of the cosmopolitan change under which integer literals such as 033 are interpreted as octal. This is the output:

cosmopolitan/o/third_party/lua/lua.com lua-tests.lua 
	strings         pass:142   fail: 0      0ms
	math            pass:103   fail: 0      3ms
	calls           pass:79   fail: 0     10ms
	closures        pass:93   fail: 0      0ms
	metatables      pass:173   fail: 0      1ms
	literals        lua-tests.lua:1487 error 
pass:295   fail: 1      0ms
	tables,next,for pass:59568   fail: 0     16ms
	pattern matchingpass:143   fail: 0     15ms
	table sort      pass:150030   fail: 0    104ms
	vararg          pass:65   fail: 0      0ms
SOME TESTS FAILED (210691/210692)

But I would like to know what is preventing my own builds of Lua (5.1.5, 5.2.4, 5.3.6 and 5.4.4) from passing the tests. It seems to be related to compiler flags that I should be using but am not; at the moment I'm using the values provided in the README https://github.com/jart/cosmopolitan#getting-started .
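
As a side note on the single remaining `literals` failure: the two readings of a literal like 033 differ, which is presumably what the test at lua-tests.lua:1487 trips over (the test itself is not shown here, so that is an assumption). POSIX shell arithmetic happens to follow the same leading-zero-means-octal rule, so it can illustrate the difference:

```shell
#!/bin/sh
# Two readings of the literal 033.
literal=033
as_octal=$(($literal))     # a leading 0 marks an octal constant: 3*8 + 3
as_decimal=${literal#0}    # drop the leading zero to read it as plain decimal
echo "read as octal:   $as_octal"
echo "read as decimal: $as_decimal"
```

A test written against stock Lua expects the decimal reading, so it would report a mismatch on a build that uses the octal one.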

@pkulchenko
Collaborator

> I just looked at your profile and saw that you have done a lot of nice projects with/for Lua,

I did, indeed ;).

> I once did a translation of your ZeroBraneStudio project (https://github.com/mingodad/ZeroBraneStudioLJS) to LJS to test LJS and succeeded with it.

Yes, I saw that; great job with that translation!

> am-lua-5.3.6.com
> And what about the segfault of the attached Lua-5.3.6 amalgamation?
> In fact I did a build with cosmopolitan of Lua-(5.1.5, 5.2.4, 5.3.6 and 5.4.4) and all of them segfault.

That I'm not sure about, but I should be able to check on it in a bit. Can you run it with --ftrace (and maybe with --strace)? I'm curious where exactly it segfaults. A debug build may help as well, as it enables ASAN, which provides additional diagnostics for memory management issues.
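
When collecting such traces it helps to keep the log and the exit status together. A minimal sketch, assuming the trace is written to stderr; the helper name `run_traced` and the log file names are made up for illustration. A shell reports a process killed by SIGSEGV as status 139 (128 + signal number 11).

```shell
#!/bin/sh
# run_traced LOG CMD...: run CMD with stderr captured in LOG and report
# whether the process was killed by SIGSEGV (shell exit status 128 + 11).
run_traced() {
  log=$1
  shift
  status=0
  "$@" 2> "$log" || status=$?
  if [ "$status" -eq 139 ]; then
    echo "segfault (status $status), trace saved to $log"
  else
    echo "exit status $status, trace saved to $log"
  fi
  return "$status"
}
```

For example: `run_traced strace.log cosmopolitan/dad/am-lua-5.3.6.com.dbg --strace deep-nested-gsubs.lua`, and the same again with `--ftrace` and a second log file.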

@mingodad
Author

mingodad commented Sep 8, 2022

See the gdb backtrace below; it seems that the lua_State gets lost when calling str_gsub:

gdb --args cosmopolitan/dad/am-lua-5.3.6.com.dbg lua-tests.lua 
GNU gdb (Ubuntu 10.2-0ubuntu1~18.04~2) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from cosmopolitan/dad/am-lua-5.3.6.com.dbg...
(gdb) r
Starting program: cosmopolitan/dad/am-lua-5.3.6.com.dbg lua-tests.lua
	strings         pass:142   fail: 0      0ms
	math            pass:103   fail: 0      5ms
	calls           pass:79   fail: 0     17ms
	closures        pass:93   fail: 0      0ms
	metatables      pass:173   fail: 0      1ms
	literals        pass:296   fail: 0      1ms
	tables,next,for pass:59669   fail: 0     36ms
	pattern matching
Program received signal SIGSEGV, Segmentation fault.
0x0000000000432cf8 in str_gsub (L=0x0) at am-lua-5.3.6.c:20860
20860	static int str_gsub (lua_State *L) {
(gdb) bt
#0  0x0000000000432cf8 in str_gsub (L=0x0) at am-lua-5.3.6.c:20860
#1  0x00000000004121d1 in luaD_precall (L=0x100080000018, func=0x1000800f0820, nresults=-1) at am-lua-5.3.6.c:7725
#2  0x0000000000427014 in luaV_execute (L=0x100080000018) at am-lua-5.3.6.c:15513
#3  0x000000000041255e in luaD_call (L=0x100080000018, func=0x1000800f07d0, nResults=1) at am-lua-5.3.6.c:7790
#4  0x00000000004125bc in luaD_callnoyield (L=0x100080000018, func=0x1000800f07d0, nResults=1) at am-lua-5.3.6.c:7800
#5  0x000000000040c4d8 in lua_callk (L=0x100080000018, nargs=2, nresults=1, ctx=0, k=0x0) at am-lua-5.3.6.c:4244
#6  0x0000000000432bff in add_value (ms=0x700000004af0, b=0x700000002ad0, 
    s=0x100080069fe8 "45012345012345012345012345012345012345012345012345", e=0x10008006a01a "", tr=6)
    at am-lua-5.3.6.c:20837
#7  0x0000000000432e8f in str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20884
#8  0x00000000004121d1 in luaD_precall (L=0x100080000018, func=0x1000800f0790, nresults=-1) at am-lua-5.3.6.c:7725
#9  0x0000000000427014 in luaV_execute (L=0x100080000018) at am-lua-5.3.6.c:15513
#10 0x000000000041255e in luaD_call (L=0x100080000018, func=0x1000800f0740, nResults=1) at am-lua-5.3.6.c:7790
#11 0x00000000004125bc in luaD_callnoyield (L=0x100080000018, func=0x1000800f0740, nResults=1) at am-lua-5.3.6.c:7800
#12 0x000000000040c4d8 in lua_callk (L=0x100080000018, nargs=2, nresults=1, ctx=0, k=0x0) at am-lua-5.3.6.c:4244
#13 0x0000000000432bff in add_value (ms=0x700000007530, b=0x700000005510, 
    s=0x100080069e98 "345012345012345012345012345012345012345012345012345", e=0x100080069ecb "", tr=6)
    at am-lua-5.3.6.c:20837
#14 0x0000000000432e8f in str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20884
#15 0x00000000004121d1 in luaD_precall (L=0x100080000018, func=0x1000800f0700, nresults=-1) at am-lua-5.3.6.c:7725
#16 0x0000000000427014 in luaV_execute (L=0x100080000018) at am-lua-5.3.6.c:15513
#17 0x000000000041255e in luaD_call (L=0x100080000018, func=0x1000800f06b0, nResults=1) at am-lua-5.3.6.c:7790
#18 0x00000000004125bc in luaD_callnoyield (L=0x100080000018, func=0x1000800f06b0, nResults=1) at am-lua-5.3.6.c:7800
#19 0x000000000040c4d8 in lua_callk (L=0x100080000018, nargs=2, nresults=1, ctx=0, k=0x0) at am-lua-5.3.6.c:4244
#20 0x0000000000432bff in add_value (ms=0x700000009f70, b=0x700000007f50, 
    s=0x100080069d48 "2345012345012345012345012345012345012345012345012345", e=0x100080069d7c "", tr=6)
--Type <RET> for more, q to quit, c to continue without paging--

@mingodad
Author

mingodad commented Sep 8, 2022

And here is the output of --ftrace and --strace:

cosmopolitan/dad/am-lua-5.3.6.com.dbg --strace deep-nested-gsubs.lua
SYS      0             55'030 bell system five system call support 231 magnums loaded on gnu/systemd
SYS  18262             92'536 mmap(0x700000000000, 131'072, PROT_READ|PROT_WRITE, MAP_STACK|MAP_ANONYMOUS, -1, 0) → 0x700000000000 (131'072 bytes total)
SYS  18262            828'324 getenv("TMPDIR") → NULL
SYS  18262            837'263 getenv("TERM") → "xterm-256color"
SYS  18262            872'241 mmap(0, 65'536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) → 0x100080000000 (196'608 bytes total)
SYS  18262            911'021 gettimeofday([{1662665866, 908490}], 0) → 0
SYS  18262          1'033'870 getenv("LUA_PATH_5_3") → NULL
SYS  18262          1'042'698 getenv("LUA_PATH") → NULL
SYS  18262          1'052'700 getenv("LUA_CPATH_5_3") → NULL
SYS  18262          1'059'628 getenv("LUA_CPATH") → NULL
SYS  18262          1'282'954 getenv("LUA_INIT_5_3") → NULL
SYS  18262          1'290'148 getenv("LUA_INIT") → NULL
SYS  18262          1'315'038 openat(AT_FDCWD, "deep-nested-gsubs.lua", 0, 0) → 3
SYS  18262          1'335'410 readv(3, [{{u"-", 1}, {u"- deep nest of gsubs◙function rev (s)◙  "..., 4'084}}], 2) → 176
SYS  18262          1'377'556 readv(3, [{}], 2) → 0
SYS  18262          1'489'416 close(3) → 0
SYS  18262          1'505'579 sigaction(SIGINT, {.sa_handler=&4379f9, .sa_flags=0x10000000, .sa_mask={}}, [{.sa_handler=0, .sa_flags=0, .sa_mask={}}]) → 0
Segmentation fault (core dumped)

deep-nested-trace.zip

@ahgamut
Collaborator

ahgamut commented Sep 8, 2022

deeply-nested-gsub.lua

I just saw deeply_nested_repr for Python tests yesterday, where a segfault occurred because of stack/recursion limit. @jart @pkulchenko perhaps this deeply-nested-gsub test is related?
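
The numbers already posted in this thread allow a rough check of the stack-limit idea. The sketch below uses only figures read off the output above (the 131'072-byte MAP_STACK mapping from the --strace log and the luaL_Buffer addresses `b=` in the second gdb backtrace); it is an estimate for the Lua-5.3.6 debug build under those assumptions, not a confirmed diagnosis.

```shell
#!/bin/sh
# Figures read off the output posted in this thread (not measured here):
stack_bytes=131072              # MAP_STACK mapping in the --strace log
b_outer=$((0x700000005510))     # luaL_Buffer address in frame #13
b_inner=$((0x700000002ad0))     # luaL_Buffer address in frame #6

# Stack consumed by one rev -> gsub -> callback round trip.
per_level=$((b_outer - b_inner))
# How many such levels fit in the mapped stack.
max_levels=$((stack_bytes / per_level))

echo "bytes per gsub level: $per_level"
echo "levels that fit:      $max_levels"
# rev() nests roughly one gsub per character, and
# string.rep('012345', n) is 6*n characters long.
for n in 10 5 3; do
  echo "string.rep n=$n needs about $((6 * n)) levels"
done
```

Under these assumptions each level costs 10816 bytes, so about 12 levels fit in the mapping, far fewer than the 60 nested gsub calls the test needs. The addresses in the backtrace also shrink towards 0x700000000000, the bottom of that mapping, as the recursion deepens, which is what a stack overflow would look like. Other Lua versions use different buffer sizes, so the per-level cost differs for them.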

@mingodad
Author

mingodad commented Sep 9, 2022

From my lua-tests.lua the only test that fails now is deep-nested-gsubs.lua, and I've tried adding several compiler flags/macros without any success:

  -DSTACK_FRAME_UNLIMITED \
  -DCOSMO -DMODE="" -DIMAGE_BASE_VIRTUAL=0x400000 -fno-gcse -ffunction-sections -fdata-sections 

@mingodad
Author

mingodad commented Sep 9, 2022

Interestingly, when I change deep-nested-gsubs.lua like this:

-- deep nest of gsubs
function rev (s)
  return string.gsub(s, "(.)(.+)", function (c,s1) return rev(s1)..c end)
end

local x = string.rep('012345', 10) --!!!<<< here changing the second parameter to 5 and then to 3 
print(rev(rev(x)) == x)

When I change the second parameter of string.rep from 10 to 5, only my build of the Lua-5.4.4 amalgamation works; all the others segfault. When I change it from 10 to 3, only Lua-5.3.6 segfaults; the others (Lua-5.1.5, 5.2.4, 5.4.4) work fine.

But again, to be clear, with the standard build all of them work fine.

@mingodad
Author

mingodad commented Sep 9, 2022

I also noticed that the README at https://github.com/jart/cosmopolitan#getting-started uses -pg to build hello.c. Isn't this a copy-and-paste mistake?
Why build with profiling info?
I'm using those compiler flags in my tests, but they don't feel right for normal builds. Is there any justification for them?

@mingodad
Author

mingodad commented Sep 9, 2022

Here is all source code to replicate the issue reported here:

unzip -l am-lua-all.zip 
Archive:  am-lua-all.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   455381  2022-09-09 13:02   am-lua-5.1.5.c
   561751  2022-09-09 12:59   am-lua-5.2.4.c
   691834  2022-09-09 12:59   am-lua-5.3.6.c
   866550  2022-09-09 12:59   am-lua-5.4.4.c
      265  2022-09-09 13:18   deep-nested-gsubs.lua
      735  2022-09-09 13:00   mk-lua-5.1.5.sh
      773  2022-09-09 12:59   mk-lua-5.2.4.sh
     1792  2022-09-09 13:00   mk-lua-5.3.6.sh
      853  2022-09-09 13:00   mk-lua-5.4.4.sh
     3166  2022-09-09 12:49   mk-lua-amalgamation.lua
      302  2022-09-09 13:20   test-deep-nested-gsubs.sh
---------                     -------
  2583402                     11 files

am-lua-all.zip

Each mk-lua-x.x.x.sh script builds Lua-x.x.x both with and without cosmopolitan; test-deep-nested-gsubs.sh then tests them with deep-nested-gsubs.lua three times, with parameters (10, 5, 3).

And here is the output of test-deep-nested-gsubs.sh:

./test-deep-nested-gsubs.sh
=> Now testing 5.1.5
string.rep	10
true
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
string.rep	5
true
string.rep	5
Segmentation fault (core dumped)
string.rep	5
Segmentation fault (core dumped)
string.rep	3
true
string.rep	3
true
string.rep	3
true

=> Now testing 5.2.4
string.rep	10
true
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
string.rep	5
true
string.rep	5
Segmentation fault (core dumped)
string.rep	5
Segmentation fault (core dumped)
string.rep	3
true
string.rep	3
true
string.rep	3
true

=> Now testing 5.3.6
string.rep	10
true
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
string.rep	5
true
string.rep	5
Segmentation fault (core dumped)
string.rep	5
Segmentation fault (core dumped)
string.rep	3
true
string.rep	3
Segmentation fault (core dumped)
string.rep	3
Segmentation fault (core dumped)

=> Now testing 5.4.4
string.rep	10
true
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
string.rep	5
true
string.rep	5
true
string.rep	5
true
string.rep	3
true
string.rep	3
true
string.rep	3
true

@pkulchenko
Collaborator

> => Now testing 5.4.4
> string.rep 10
> true
> string.rep 10
> Segmentation fault (core dumped)

Does this mean that the code sometimes works with 10 and sometimes doesn't?

> See the gdb backtrace below; it seems that the lua_State gets lost when calling str_gsub:
> 0x0000000000432cf8 in str_gsub (L=0x0) at am-lua-5.3.6.c:20860
> 20860	static int str_gsub (lua_State *L) {

It does seem like the lua_State value gets lost, but it's interesting that in your earlier stack trace, it was still present, but got lost two lines later:

> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004239b9 in str_gsub (L=0x100080000018) at am-lua-5.3.6.c:20862
> 20862	  const char *src = luaL_checklstring(L, 1, &srcl);  /* subject */

I'm not sure what exactly it means yet.

@mingodad
Author

Thank you again for the reply!
And again I'm failing to communicate properly.
The first entry for each test was the libc build, which never fails.

I've changed the test script to add a bit more info to the output, to make it less confusing:

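#!/bin/sh

# Run deep-nested-gsubs.lua with one Lua version ($1) and one string.rep
# count ($2): first the libc build, then both cosmopolitan binaries.
testDeepNrep() {
	tlua=deep-nested-gsubs.lua
	echo "  => libc build"
	"./am-lua-$1" "$tlua" "$2"
	echo "  => cosmopolitan build"
	"./am-lua-$1.com" "$tlua" "$2"
	"./am-lua-$1.com.dbg" "$tlua" "$2"
}

# Try the three string.rep counts for one Lua version ($1).
testDeep() {
	echo "=> Now testing $1"
	testDeepNrep "$1" 10
	testDeepNrep "$1" 5
	testDeepNrep "$1" 3
	echo ""
}

testDeep 5.1.5
testDeep 5.2.4
testDeep 5.3.6
testDeep 5.4.4

./test-deep-nested-gsubs.sh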
=> Now testing 5.1.5
  => libc build
string.rep	10
true
  => cosmopolitan build
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
  => libc build
string.rep	5
true
  => cosmopolitan build
string.rep	5
Segmentation fault (core dumped)
string.rep	5
Segmentation fault (core dumped)
  => libc build
string.rep	3
true
  => cosmopolitan build
string.rep	3
true
string.rep	3
true

=> Now testing 5.2.4
  => libc build
string.rep	10
true
  => cosmopolitan build
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
  => libc build
string.rep	5
true
  => cosmopolitan build
string.rep	5
Segmentation fault (core dumped)
string.rep	5
Segmentation fault (core dumped)
  => libc build
string.rep	3
true
  => cosmopolitan build
string.rep	3
true
string.rep	3
true

=> Now testing 5.3.6
  => libc build
string.rep	10
true
  => cosmopolitan build
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
  => libc build
string.rep	5
true
  => cosmopolitan build
string.rep	5
Segmentation fault (core dumped)
string.rep	5
Segmentation fault (core dumped)
  => libc build
string.rep	3
true
  => cosmopolitan build
string.rep	3
Segmentation fault (core dumped)
string.rep	3
Segmentation fault (core dumped)

=> Now testing 5.4.4
  => libc build
string.rep	10
true
  => cosmopolitan build
string.rep	10
Segmentation fault (core dumped)
string.rep	10
Segmentation fault (core dumped)
  => libc build
string.rep	5
true
  => cosmopolitan build
string.rep	5
true
string.rep	5
true
  => libc build
string.rep	3
true
  => cosmopolitan build
string.rep	3
true
string.rep	3
true
# run gcc compiler in freestanding mode
gcc -g -Os -static -fno-pie -no-pie -nostdlib -nostdinc \
  -fno-omit-frame-pointer -mnop-mcount -mno-tls-direct-seg-refs \
  -o am-lua-5.4.4.com.dbg am-lua-5.4.4.c \
  -DMAKE_LUA_CMD \
  -DLUA_PROGNAME='"lua"' \
  -DLUA_COMPAT_5_3 \
  -DLUA_WITH_COSMOPOLITAN \
  -Wl,--gc-sections -fuse-ld=bfd -Wl,--gc-sections \
  -Wl,-T,ape.lds -include cosmopolitan.h crt.o ape-no-modify-self.o cosmopolitan.a
objcopy -S -O binary am-lua-5.4.4.com.dbg am-lua-5.4.4.com

gcc -Os -DMAKE_LUA_CMD -DLUA_PROGNAME='"lua"' -DLUA_COMPAT_5_3 -DLUA_USE_LINUX -D_XOPEN_SOURCE=500 -o am-lua-5.4.4 am-lua-5.4.4.c -lm -ldl

# NOTE: scp it to windows/mac/etc. *before* you run it!
# ~40kb static binary (can be ~16kb w/ MODE=tiny)
./ape.elf ./am-lua-5.4.4.com
#  -DSTACK_FRAME_UNLIMITED \
#  -fno-gcse -ffunction-sections -fdata-sections \

@mingodad
Author

Anyone can try to reproduce it with the full code in this zip file, https://github.com/jart/cosmopolitan/files/9534950/am-lua-all.zip, previously posted here: #601 (comment).

@mingodad
Author

I also did a build with a command line almost identical to the one used to build cosmopolitan/third_party/lua, and it also segfaults:

gcc \
	-msse3 \
	-fno-math-errno \
	-fno-trapping-math \
	 -fno-fp-int-builtin-inexact \
	 -fno-ident \
	 -fno-common \
	 -fno-gnu-unique \
	 -fstrict-aliasing \
	 -fstrict-overflow \
	 -fno-semantic-interposition \
	 -mno-tls-direct-seg-refs \
	 -Wall \
	 -Werror \
	 -frecord-gcc-switches \
	 -fno-schedule-insns2 \
	 -fno-optimize-sibling-calls \
	 -mno-omit-leaf-frame-pointer \
	 -O2 \
	 -fno-code-hoisting \
	 -g \
	 -gdescribe-dies \
	 -DCOSMO \
	 -DMODE="" \
	 -DIMAGE_BASE_VIRTUAL=0x400000 \
	 -nostdinc \
	 -iquote . \
	 -DSYSDEBUG \
	 -Wa,-W \
	 -Wa,-I. \
	 -Wa,--noexecstack \
	 -Wa,--nocompress-debug-sections \
	 -msse3 \
	 -fno-math-errno \
	 -fno-trapping-math \
	 -fno-fp-int-builtin-inexact \
	 -fno-ident \
	 -fno-common \
	 -fno-gnu-unique \
	 -fstrict-aliasing \
	 -fstrict-overflow \
	 -fno-semantic-interposition \
	 -mno-tls-direct-seg-refs \
	 -std=gnu2x \
	 -DLUA_COMPAT_5_3 \
	 -ffunction-sections \
	 -fdata-sections \
	 -c \
	 -pg \
	 -D__PG__ \
	 -mno-red-zone \
	 -D__MNO_RED_ZONE__ \
	 -fno-omit-frame-pointer \
	 -D__FNO_OMIT_FRAME_POINTER__ \
	 -o am-lua-5.4.4.com.dbg am-lua-5.4.4.c \
	-DMAKE_LUA_CMD \
	-DLUA_PROGNAME='"lua"' \
	-DLUA_COMPAT_5_3 \
	-DLUA_WITH_COSMOPOLITAN \
	-DSTACK_FRAME_UNLIMITED \
	-Wl,--gc-sections -fuse-ld=bfd -Wl,--gc-sections \
	-Wl,-T,ape.lds -include cosmopolitan.h crt.o ape-no-modify-self.o cosmopolitan.a

objcopy -S -O binary am-lua-5.4.4.com.dbg am-lua-5.4.4.com

#	 -fdebug-prefix-map=cosmopolitan=
#	 -include ../libc/integral/normalize.inc

@mingodad
Author

I finally found the missing piece to make it work. Looking through cosmopolitan/third_party/lua/lua.main.c, this line caught my attention: STATIC_STACK_SIZE(0x40000);. Adding it at the top of my Lua-x.x.x amalgamations solved the segfault problem, but each version of Lua requires a different value to be able to execute deep-nested-gsubs.lua with string.rep(xx, 10):

Lua-5.4.4 works fine with => STATIC_STACK_SIZE(0x40000);
Lua-5.3.6 works fine with => STATIC_STACK_SIZE(0x100000);
Lua-5.2.4 works fine with => STATIC_STACK_SIZE(0x80000);
Lua-5.1.5 works fine with => STATIC_STACK_SIZE(0x80000);

It would be nice if this were mentioned in the README, or wherever building with cosmopolitan is described, to save some hair-pulling!
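
For anyone who wants to repeat this, here is a minimal sketch of applying the fix to an amalgamation file. The helper name prepend_stack_size is made up for illustration. The guard assumes the cosmopolitan build passes -DLUA_WITH_COSMOPOLITAN and -include cosmopolitan.h (as in the build script posted earlier), so the libc build is left untouched.

```shell
#!/bin/sh
# Sketch only: prepend a guarded STATIC_STACK_SIZE line to an amalgamation.
# prepend_stack_size is a hypothetical helper, not part of cosmopolitan or Lua.
prepend_stack_size() {
    src=$1     # e.g. am-lua-5.3.6.c
    size=$2    # e.g. 0x100000
    tmp="$src.tmp.$$"
    {
        # The guard keeps the plain libc build compiling unchanged.
        printf '#ifdef LUA_WITH_COSMOPOLITAN\n'
        printf 'STATIC_STACK_SIZE(%s);\n' "$size"
        printf '#endif\n'
        cat "$src"
    } > "$tmp" && mv "$tmp" "$src"
}

# Values reported above for string.rep(xx, 10):
#   prepend_stack_size am-lua-5.4.4.c 0x40000
#   prepend_stack_size am-lua-5.3.6.c 0x100000
#   prepend_stack_size am-lua-5.2.4.c 0x80000
#   prepend_stack_size am-lua-5.1.5.c 0x80000
```

Running the helper twice on the same file would add the line twice, so it is meant for a freshly generated amalgamation.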

@pkulchenko
Collaborator

@mingodad, thank you for getting to the bottom of this! I think this is what @ahgamut was alluding to in his earlier message (apologies for not making clear the implications of his suggestion):

I just saw deeply_nested_repr for Python tests yesterday, where a segfault occurred because of stack/recursion limit. @jart @pkulchenko perhaps this deeply-nested-gsub test is related?

I agree that it should be more prominently listed, but this is very much app-dependent. If you ask for more nested-gsub loops, you're going to run into the same issue again. This commit has more details on STATIC_STACK_SIZE and notes that 30k is sufficient for everything in the Cosmopolitan repo except Python. STATIC_YOINK("stack_usage_logging") can be used to figure out how much stack is needed.
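
To make the app-dependence concrete, here is a rough sketch of probing how far a given build gets before it crashes. The helper name max_passing_depth is hypothetical. It assumes the program takes the repeat count as its last argument, as deep-nested-gsubs.lua does in the test script above, and that failures are monotonic: once one count crashes, larger ones do too.

```shell
#!/bin/sh
# Sketch only: max_passing_depth is a hypothetical helper.
# Runs "<command...> N" for N = 1..limit, stops at the first non-zero exit
# status (a segfault counts), and prints the last N that succeeded (0 if none).
max_passing_depth() {
    limit=$1
    shift
    n=1
    best=0
    while [ "$n" -le "$limit" ]; do
        if "$@" "$n" >/dev/null 2>&1; then
            best=$n
        else
            break
        fi
        n=$((n + 1))
    done
    echo "$best"
}

# Example, using the file names from this thread:
#   max_passing_depth 20 ./am-lua-5.3.6.com deep-nested-gsubs.lua
```

Rebuilding with a different STATIC_STACK_SIZE and running the probe again should show the limit moving with the stack size.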

@mingodad
Author

Thank you again for the reply!
That commit is from a year ago. I also searched this repository for references to deeply_nested_repr and found nothing, and the documentation is a bit dry (it is basically a cosmopolitan libc reference and doesn't include STATIC_STACK_SIZE(sz)).
I'll say it again: I hope this gets mentioned everywhere a cosmopolitan build is described.

And yes, I understand that it's application dependent, but knowing that the default stack size is not the same as with libc/the OS would help make sense of segfaults like the one described in this issue with less wasted time and sanity.
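
As a rough illustration of the gap, the sketch below converts the STATIC_STACK_SIZE values from this thread to KiB so they can be compared with what the shell reports for a regular process. The helper name stack_kib is made up. It assumes the shell accepts hex constants in arithmetic expansion and supports ulimit -s, which reports KiB (or "unlimited") and is often 8192 on Linux.

```shell
#!/bin/sh
# Sketch only: stack_kib is a hypothetical helper that converts a byte count
# (hex or decimal) to KiB, the unit that `ulimit -s` reports.
stack_kib() {
    echo $(( $1 / 1024 ))
}

for size in 0x40000 0x80000 0x100000; do
    printf 'STATIC_STACK_SIZE(%s) = %s KiB\n' "$size" "$(stack_kib "$size")"
done

# Soft stack limit the OS gives an ordinary process started from this shell.
printf 'ulimit -s reports: %s\n' "$(ulimit -s)"
```

The three values come out as 256, 512, and 1024 KiB, so even the largest setting that was needed here is well below a typical 8 MiB default.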

@jart
Owner

jart commented Sep 12, 2022

Could you try building open source Lua using our toolchain? https://gist.github.com/jart/3571ae8571a07435e3246a3f06ebaf58

Lua passes its tests in the cosmo repo. We got them to pass very early in the project, though we probably made some local modifications to them along the way. The way we run the checked-in ones is:

(m=; make -j8 MODE=$m o/$m/third_party/lua/lua.com && cd third_party/lua/test/ && ~/cosmo/o/$m/third_party/lua/lua.com all.lua)

However, Lua's tests aren't very good. They rely mostly on running Lua as a subprocess and parsing the results out of the REPL. As such, when I added a newer, better Bestline-based REPL to lua.com, it interfered with the delicate way the Lua tests expect things to work. So I haven't been able to run the Lua tests in months.

Contributions that help us fix things are welcome.

@jart jart changed the title Cosmopolitan Lua doesn't pass tests Cosmopolitan Lua test runner broken Sep 12, 2022
@jart jart added the contributions welcome We'll commit to review and maintenance if the people who need it write the changes. label Sep 12, 2022