The reader test seems to stuck in a deadlock (zim-testing-suite) for some reason. #670

Open
Animeshz opened this issue Feb 15, 2022 · 12 comments · May be fixed by #723

@Animeshz

Running meson test fails: the first test of the reader test (reader.cpp) hits the 120s timeout.

Logs:

27/27 reader              TIMEOUT        120.06s   killed by signal 15 SIGTERM
>>> ZIM_TEST_DATA_DIR=/builddir/libzim-7.2.0/build/test/data MALLOC_PERTURB_=48 /builddir/libzim-7.2.0/build/test/reader
――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
Running main() from ../googletest/src/gtest_main.cc
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from FileReader
[ RUN      ] FileReader.shouldJustWork
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


Summary of Failures:

27/27 reader              TIMEOUT        120.06s   killed by signal 15 SIGTERM


Ok:                 26  
Expected Fail:      0   
Fail:               0   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            1   

Full log written to /builddir/libzim-7.2.0/build/meson-logs/testlog.txt
FAILED: meson-test 
/usr/bin/meson test --no-rebuild --print-errorlogs
ninja: build stopped: subcommand failed.
=> ERROR: libzim-7.2.0_1: do_check: '${make_cmd} -C ${meson_builddir} ${makejobs} ${make_check_args} ${make_check_target}' exited with 1
=> ERROR:   in do_check() at common/build-style/meson.sh:141
@kelson42
Contributor

Have you tried SKIP_BIG_MEMORY_TEST=1 meson test?

@kelson42 kelson42 added this to the 7.3.0 milestone Feb 15, 2022
@kelson42 kelson42 self-assigned this Feb 15, 2022
@Animeshz
Author

@kelson42 It still seems to be stuck there:

[animesh@/home/animesh/Projects/void-packages/masterdir build]$ SKIP_BIG_MEMORY_TEST=1 meson test
ninja: Entering directory `/builddir/libzim-7.2.0/build'
ninja: no work to do.
 1/27 lrucache                    OK               0.02s
 2/27 dirent                      OK               0.02s
 3/27 header                      OK               0.01s
 4/27 template                    OK               0.01s
 5/27 iterator                    OK               0.02s
 6/27 dirent_lookup               OK               0.01s
 7/27 istreamreader               OK               0.01s
 8/27 find                        OK               0.16s
 9/27 rawstreamreader             OK               0.01s
10/27 bufferstreamer              OK               0.01s
11/27 parseLongPath               OK               0.01s
12/27 random                      OK               0.04s
13/27 tooltesting                 OK               0.01s
14/27 creator                     OK               0.29s
15/27 tinyString                  OK               0.01s
16/27 cluster                     OK               0.53s
17/27 indexing_criteria           OK               0.66s
18/27 decoderstreamreader         OK               0.90s
19/27 defaultIndexdata            OK               0.03s
20/27 uuid                        OK               1.01s
21/27 search                      OK               0.93s
22/27 suggestion_iterator         OK               1.25s
23/27 search_iterator             OK               0.72s
24/27 archive                     OK               2.20s
25/27 suggestion                  OK               2.18s
26/27 compression                 OK               3.76s
27/27 reader                      TIMEOUT        120.01s   killed by signal 15 SIGTERM
>>> ZIM_TEST_DATA_DIR=/builddir/libzim-7.2.0/build/test/data MALLOC_PERTURB_=247 /builddir/libzim-7.2.0/build/test/reader



Ok:                 26  
Expected Fail:      0   
Fail:               0   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            1   

Full log written to /builddir/libzim-7.2.0/build/meson-logs/testlog.txt

@kelson42
Contributor

@mgautierfr @veloman-yunkan Any idea?

@veloman-yunkan
Collaborator

@Animeshz

Please run

gdb -ex run -args /builddir/libzim-7.2.0/build/test/reader

wait a few seconds, hit CTRL-C, enter the bt command at the gdb prompt, and paste its output here.

@Animeshz
Author

@veloman-yunkan

[animesh@/home/animesh/Projects/void-packages/masterdir /]$ gdb -ex run -args /builddir/libzim-7.2.0/build/test/reader
GNU gdb (GDB) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /builddir/libzim-7.2.0/build/test/reader...
Starting program: /builddir/libzim-7.2.0/build/test/reader 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
Running main() from ../googletest/src/gtest_main.cc
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from FileReader
[ RUN      ] FileReader.shouldJustWork
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7b5b366 in __libc_pread64 (fd=3, buf=0x7fffffffe62f, count=1, offset=26) at ../sysdeps/unix/sysv/linux/pread64.c:25
25	../sysdeps/unix/sysv/linux/pread64.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7b5b366 in __libc_pread64 (fd=3, buf=0x7fffffffe62f, count=1, offset=26) at ../sysdeps/unix/sysv/linux/pread64.c:25
#1  0x00007ffff7f7726c in zim::unix::FD::readAt(char*, zim::zsize_t, zim::offset_t) const () from /builddir/libzim-7.2.0/build/test/../src/libzim.so.7
#2  0x00007ffff7f62150 in zim::FileReader::read(zim::offset_t) const () from /builddir/libzim-7.2.0/build/test/../src/libzim.so.7
#3  0x000055555555c717 in (anonymous namespace)::FileReader_shouldJustWork_Test::TestBody() ()
#4  0x00007ffff7f07d97 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () from /usr/lib64/libgtest.so.1.11.0
#5  0x00007ffff7efc8ee in testing::Test::Run() () from /usr/lib64/libgtest.so.1.11.0
#6  0x00007ffff7efca65 in testing::TestInfo::Run() () from /usr/lib64/libgtest.so.1.11.0
#7  0x00007ffff7efcfe9 in testing::TestSuite::Run() () from /usr/lib64/libgtest.so.1.11.0
#8  0x00007ffff7efd71a in testing::internal::UnitTestImpl::RunAllTests() () from /usr/lib64/libgtest.so.1.11.0
#9  0x00007ffff7f08307 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () from /usr/lib64/libgtest.so.1.11.0
#10 0x00007ffff7efcb28 in testing::UnitTest::Run() () from /usr/lib64/libgtest.so.1.11.0
#11 0x00007ffff7f220e0 in main () from /usr/lib64/libgtest_main.so.1.11.0
#12 0x00007ffff7a95e0a in __libc_start_main (main=0x7ffff7f220a0 <main>, argc=1, argv=0x7fffffffeca8, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7fffffffec98) at ../csu/libc-start.c:314
#13 0x00005555555595aa in _start () at ../sysdeps/x86_64/start.S:120
(gdb) 

@veloman-yunkan
Collaborator

The test at

ASSERT_THROW(reader->read(offset_t(26)), std::runtime_error);
works in a debug build because the expected exception is generated by an assertion at
ASSERT(offset.v, <, _size.v);

In a release build, the assertion is disabled and the execution proceeds past that point leading to an infinite loop. @mgautierfr as the author of that test should decide how this should be fixed.
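The thread does not show the loop itself; the backtrace only shows the process sitting in pread64 with count=1 and offset=26. As an assumption about how such a hang can arise, here is a minimal sketch of a "read exactly N bytes" loop. The names PreadLike, make_memory_file and read_exact are hypothetical and are not libzim's API. The point is the n == 0 branch: a pread-style call returns 0 at end of file, so a loop that only checks for -1 never makes progress once the offset is at or past the end.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <string>

// Modeled on POSIX pread: returns the number of bytes read,
// 0 at end of file, and -1 on error. Hypothetical type for this sketch.
using PreadLike =
    std::function<long(char* buf, std::size_t count, std::size_t offset)>;

// In-memory stand-in for a file, so the sketch needs no real file descriptor.
inline PreadLike make_memory_file(std::string contents) {
  return [contents](char* buf, std::size_t count, std::size_t offset) -> long {
    if (offset >= contents.size()) return 0;  // end of file
    const std::size_t n = std::min(count, contents.size() - offset);
    std::memcpy(buf, contents.data() + offset, n);
    return static_cast<long>(n);
  };
}

// Read exactly `count` bytes starting at `offset`, or throw.
inline void read_exact(const PreadLike& pread_like, char* dest,
                       std::size_t count, std::size_t offset) {
  while (count > 0) {
    const long n = pread_like(dest, count, offset);
    if (n < 0) throw std::runtime_error("read error");
    // Without this branch, a read at or past end of file returns 0 on
    // every iteration and `count` never decreases: an infinite loop.
    if (n == 0) throw std::runtime_error("unexpected end of file");
    dest += n;
    count -= static_cast<std::size_t>(n);
    offset += static_cast<std::size_t>(n);
  }
}
```

With a 26-byte in-memory file, reading one byte at offset 25 succeeds, while reading one byte at offset 26 throws instead of spinning.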

@kelson42 kelson42 added the bug label Feb 15, 2022
@kelson42 kelson42 assigned mgautierfr and unassigned kelson42 Feb 16, 2022
@kelson42 kelson42 removed the question label Feb 16, 2022
@veloman-yunkan
Collaborator

@kelson42 @mgautierfr I think we should have a release build configuration (for at least one platform) in our CI.

@kelson42
Contributor

@veloman-yunkan Supportive of this idea.

@kelson42
Contributor

kelson42 commented Jul 3, 2022

@mgautierfr Your feedback about a solution approach is expected here.

@mgautierfr
Collaborator

Sorry, I've totally missed this issue.

> In a release build, the assertion is disabled and the execution proceeds past that point leading to an infinite loop

I'm surprised by that. We build libzim in the release build configuration in kiwix-build every time we do a release. If building in release mode were enough to trigger the bug, we would have seen it long ago.
=> After verification: asserts are not removed by default in release mode. We have to explicitly set b_ndebug to true or if-release in meson to remove the assertions.
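To illustrate what removing assertions means for a test that expects an exception, here is a minimal sketch. SKETCH_ASSERT, sketch_assertions_enabled and read_byte are hypothetical names for this example, not libzim's ASSERT macro or reader. The sketch assumes an assertion that throws std::runtime_error when enabled and expands to nothing when NDEBUG is defined.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical throwing assertion. With NDEBUG defined the check is
// compiled out entirely, so no exception can ever come from it.
#ifdef NDEBUG
#  define SKETCH_ASSERT(lhs, op, rhs) ((void)0)
#else
#  define SKETCH_ASSERT(lhs, op, rhs)                                  \
     do {                                                               \
       if (!((lhs) op (rhs))) {                                         \
         throw std::runtime_error("assertion failed: " #lhs " " #op     \
                                  " " #rhs);                            \
       }                                                                \
     } while (0)
#endif

// Reports whether SKETCH_ASSERT is active in this translation unit.
inline bool sketch_assertions_enabled() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// Returns the byte at `offset`. The bounds check exists only while
// assertions are enabled.
inline char read_byte(const std::string& data, std::size_t offset) {
  SKETCH_ASSERT(offset, <, data.size());
  // For std::string, index == size() is still defined (it yields '\0');
  // anything further out of range would be undefined behavior.
  return data[offset];
}
```

Calling read_byte on a 26-byte string with offset 26 throws when assertions are enabled and silently returns '\0' when they are compiled out, so a test written as "expect a throw" only passes in the first configuration.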

> @mgautierfr as the author of that test should decide how this should be fixed.

From a global perspective, ASSERTs are a last line of defense for detecting that something has gone wrong. They should not be considered a valid way to check that a value is OK. (This is consistent with the fact that we should remove them in release mode.)
ASSERTs here are mainly used to check that we are not reading data out of bounds. An out-of-bounds read can have two causes:

  • A bug in our code. ASSERTs, review and testing should catch it.
  • A wrong value in the zim file. We must check the value when we read it and throw a ZimFormatError if something is wrong. If we don't, that should be considered a bug (and ASSERTs help us catch it).

In this context, low-level ASSERTs should not be considered normative behavior and we should not test them. Reading out of range (at this level in libzim) can be considered undefined behavior (as it is for the std::vector [] operator).

We have several things to do:

  • Remove the reader test that checks for the ASSERT
  • Review all ASSERTs in our code to check whether they are genuinely valid ASSERTs or whether they must be replaced by a check & throw
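A check & throw, as opposed to an ASSERT, stays active in every build configuration. The following is a minimal sketch under stated assumptions: SketchFormatError and check_range are hypothetical names for this example and stand in for whatever exception type and validation point the real code uses.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical exception type standing in for a file-format error.
struct SketchFormatError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Validates an (offset, size) pair read from a file against the actual
// file size. Unlike an assertion, this is never compiled out.
inline void check_range(std::uint64_t offset, std::uint64_t size,
                        std::uint64_t file_size) {
  // Compare against file_size - offset instead of computing
  // offset + size, which could overflow for hostile values.
  if (offset > file_size || size > file_size - offset) {
    throw SketchFormatError("range at offset " + std::to_string(offset) +
                            " with size " + std::to_string(size) +
                            " exceeds file size " +
                            std::to_string(file_size));
  }
}
```

For a 26-byte file, check_range(0, 26, 26) passes, while check_range(26, 1, 26) throws. The low-level read can then keep its ASSERT as a debugging aid, because values coming from the file have already been validated.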

@kelson42
Contributor

kelson42 commented Aug 11, 2022

@mgautierfr @veloman-yunkan What is the status/next step on this? FYI, this is currently the only known bug in libzim!

@mgautierfr
Collaborator

I have an old branch with some work started on this. I've just created a WIP PR: #723

@kelson42 kelson42 linked a pull request Aug 12, 2022 that will close this issue
@kelson42 kelson42 modified the milestones: 8.2.0, 8.3.0 Apr 6, 2023
@kelson42 kelson42 modified the milestones: 9.0.0, 9.1.0 Sep 26, 2023
@kelson42 kelson42 modified the milestones: 9.1.0, 9.2.0 Dec 3, 2023
@kelson42 kelson42 modified the milestones: 9.2.0, 9.3.0 Feb 27, 2024