test: on asan flaky box/on_shutdown.test.lua test #199

@avtikhon

Description

Tarantool version:
Tarantool 2.6.0-48-gbfeb61b33
Target: Linux-x86_64-RelWithDebInfo
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
Compiler: /usr/bin/clang-8 /usr/bin/clang++-8
C_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -fsanitize=address -fsanitize-blacklist=/tarantool/asan/asan.supp -std=c11 -Wall -Wextra -Wno-strict-aliasing -fsanitize=alignment,bool,bounds,builtin,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,return,shift,unreachable,vla-bound -fno-sanitize-recover=alignment,bool,bounds,builtin,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,return,shift,unreachable,vla-bound -Wno-char-subscripts -Wno-gnu-alignof-expression -Werror
CXX_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -fsanitize=address -fsanitize-blacklist=/tarantool/asan/asan.supp -std=c++11 -Wall -Wextra -Wno-strict-aliasing -fsanitize=alignment,bool,bounds,builtin,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,return,shift,unreachable,vla-bound -fno-sanitize-recover=alignment,bool,bounds,builtin,enum,float-cast-overflow,float-divide-by-zero,function,integer-divide-by-zero,return,shift,unreachable,vla-bound -Wno-char-subscripts -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Werror

OS version:
Debian 10 (Buster)

Bug description:

2020-08-26 09:04:06.707 [42629] main/103/proxy C> Tarantool 2.6.0-48-gbfeb61b33
2020-08-26 09:04:06.707 [42629] main/103/proxy C> log level 5
2020-08-26 09:04:06.707 [42629] main/103/proxy I> mapping 117440512 bytes for memtx tuple arena...
2020-08-26 09:04:06.707 [42629] main/103/proxy I> mapping 134217728 bytes for vinyl tuple arena...
2020-08-26 09:04:06.712 [42629] main/103/proxy I> instance uuid 67365709-814e-48d7-a5a0-7f0aa3853f17
2020-08-26 09:04:06.712 [42629] main/103/proxy I> instance vclock {}
2020-08-26 09:04:06.712 [42629] iproto/101/main I> binary: bound to unix/:(socket)
2020-08-26 09:04:06.712 [42629] main/103/proxy I> recovery start
2020-08-26 09:04:06.712 [42629] main/103/proxy I> recovering from `/tnt/test/var/001_box/proxy/00000000000000000000.snap'
2020-08-26 09:04:06.716 [42629] main/103/proxy I> cluster uuid 1b9f9aae-f1b5-4818-b9aa-3ecab7414bce
2020-08-26 09:04:06.729 [42629] main/103/proxy I> assigned id 1 to replica 67365709-814e-48d7-a5a0-7f0aa3853f17
2020-08-26 09:04:06.730 [42629] main/103/proxy I> recover from `/tnt/test/var/001_box/proxy/00000000000000000000.xlog'
2020-08-26 09:04:06.730 [42629] main/103/proxy I> done `/tnt/test/var/001_box/proxy/00000000000000000000.xlog'
2020-08-26 09:04:06.730 [42629] main/103/proxy I> ready to accept requests
2020-08-26 09:04:06.730 [42629] main/103/proxy C> leaving orphan mode
2020-08-26 09:04:06.730 [42629] main/103/proxy I> set 'log_level' configuration option to 5
2020-08-26 09:04:06.730 [42629] main/106/checkpoint_daemon I> scheduled next checkpoint for Wed Aug 26 10:48:34 2020
2020-08-26 09:04:06.730 [42629] main/103/proxy I> set 'memtx_memory' configuration option to 107374182
2020-08-26 09:04:06.730 [42629] main/103/proxy I> set 'listen' configuration option to "\/tnt\/test\/var\/001_box\/proxy.socket-iproto"
2020-08-26 09:04:06.730 [42629] main/103/proxy I> set 'log_format' configuration option to "plain"
2020-08-26 09:04:06.731 [42629] main/117/console/unix/:/tnt/test/var/001_box/proxy.socket-admin I> started
2020-08-26 09:04:06.731 [42629] main C> entering the event loop
2020-08-26 09:04:06.750 [42629] main/102/on_shutdown [string "_ = box.ctl.on_shutdown(function() log.warn("..."]:1 W> on_shutdown 5
Starting instance proxy...
Run console at unix/:/tnt/test/var/001_box/proxy.control
Start failed: builtin/box/console.lua:865: failed to create server unix/:/tnt/test/var/001_box/proxy.control: Address already in use

This happens on the ASAN build because the server stop routine

test-run/lib/preprocessor.py:TestState.server_stop() -> test-run/lib/tarantool_server.py:TarantoolServer.stop()

needs some time to release the proxy.control socket created by

test-run/lib/preprocessor.py:TestState.server_start() -> tarantoolctl:process_local()

If the server is started again before the socket is released, the start fails with "Address already in use", as in the log above. To fix the issue, a fiber.sleep() call was added to provide the needed delay.
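The race described above could also be handled with a bounded poll on the control socket instead of a fixed delay. The following is only a minimal Python sketch and is not code from test-run: the helper names `socket_in_use` and `wait_socket_released`, and their parameters, are hypothetical. It assumes that the socket counts as released once nothing accepts connections on its path.

```python
import os
import socket
import time


def socket_in_use(path):
    """Return True if a process still accepts connections on the Unix socket at `path`."""
    if not os.path.exists(path):
        return False
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
        return True
    except OSError:
        # Any connect failure (for example ECONNREFUSED on a stale socket file,
        # or ENOENT if the file vanished meanwhile) is treated as "nobody listening".
        return False
    finally:
        probe.close()


def wait_socket_released(path, timeout=5.0, interval=0.05):
    """Poll until the socket at `path` is no longer in use.

    Returns True once the socket is free, False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while socket_in_use(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A caller would invoke something like `wait_socket_released("/tnt/test/var/001_box/proxy.control")` between stopping and restarting the server. Compared with a fixed sleep, the poll returns as soon as the socket is free and reports a timeout explicitly.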

Steps to reproduce:
Start the docker container:

docker run --network=host -v $PWD:/source -w /source -ti registry.gitlab.com/tarantool/tarantool/testing/debian-buster:latest

Inside the docker container run:

rm -f unit/guard.skipcond
mkdir /tnt ; cd /tnt/ ; rm -rf /tnt/*

CC=clang-8 CXX=clang++-8 cmake /tarantool \
        -DCMAKE_BUILD_TYPE=RelWithDebInfo \
        -DENABLE_WERROR=ON \
        -DENABLE_ASAN=ON \
        -DENABLE_UB_SANITIZER=ON && make -j

cd test
while ASAN=ON LSAN_OPTIONS=suppressions=/tarantool/asan/lsan.supp ASAN_OPTIONS=heap_profile=0:unmap_shadow_on_exit=1:detect_invalid_pointer_pairs=1:symbolize=1:detect_leaks=1:dump_instruction_bytes=1:print_suppressions=0 ./test-run.py --builddir /tnt --vardir /tnt/test/var box/on_shutdown.test.lua ; do date ; done
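The shell loop above reruns the test until the first failure and never stops if the failure does not occur. A bounded variant can be sketched in Python as follows. This is an illustration only: the function name `run_until_failure` and the `max_runs` parameter are hypothetical and not part of test-run.

```python
import subprocess


def run_until_failure(cmd, max_runs=100, env=None):
    """Run `cmd` repeatedly until it exits with a non-zero status.

    Returns the 1-based number of the first failing run,
    or None if all `max_runs` runs succeed.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(cmd, env=env)
        if result.returncode != 0:
            return attempt
    return None
```

Here `cmd` would be the `./test-run.py ...` argument list from the loop above, and `env` a copy of `os.environ` extended with the `ASAN`, `LSAN_OPTIONS` and `ASAN_OPTIONS` values shown there.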

Optional (but very desirable):

  • coredump
  • backtrace
  • netstat

test: fix flaky box/on_shutdown.test.lua on asan
