Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: box-tap/cfg test fails on SuSE 15.x images #4594

Closed
avtikhon opened this issue Oct 29, 2019 · 1 comment
Closed

test: box-tap/cfg test fails on SuSE 15.x images #4594

avtikhon opened this issue Oct 29, 2019 · 1 comment
Assignees
Labels
bug Something isn't working qa Issues related to tests or testing subsystem

Comments

@avtikhon
Copy link
Contributor

avtikhon commented Oct 29, 2019

Tarantool version:
master

OS version:
SuSE 15.x

Bug description:
Issue:

[001] box-tap/cfg.test.lua                                            
[001] not ok 91 - bootstrap from non-empty vinyl_dir #  
[001] Expected: 256
[001] Got:      0
[001] Traceback:
[001] [main] at </source/test/box-tap/cfg.test.lua:0>
[001] 
[001] Rejected result file: box-tap/cfg.reject
[001] [ fail ]

Steps to reproduce:
Use any of the ways below:

  1. Run commands:
cat >gh-4594.lua <<EOF
box.cfg{vinyl_dir = '/tmp/gh-4594'}
s = box.schema.space.create('test', {engine = 'vinyl'})
s:create_index('pk')
os.exit(0)
EOF

rm -rf gh-4594/ /tmp/gh-4594 ; mkdir gh-4594 && mkdir /tmp/gh-4594 && \
( cd gh-4594 && tarantool ../gh-4594.lua && rm -rf * && tarantool ../gh-4594.lua ; echo $? ) 

  1. Create box-tap/cfg_gh-4594.test.lua:
#!/usr/bin/env tarantool
  
local tap = require('tap')
local test = tap.test('cfg')
local fio = require('fio')
test:plan(1)

local tarantool_bin = arg[-1]
local PANIC = 256
function run_script(code)
    local dir = fio.tempdir()
    local script_path = fio.pathjoin(dir, 'script.lua')
    local script = fio.open(script_path, {'O_CREAT', 'O_WRONLY', 'O_APPEND'},
        tonumber('0777', 8))
    script:write(code)
    script:write("\nos.exit(0)")
    script:close()
    local cmd = [[/bin/sh -c 'cd "%s" && "%s" ./script.lua 2> /dev/null']]
    local res = os.execute(string.format(cmd, dir, tarantool_bin))
    fio.rmtree(dir)
    return res
end

--
-- gh-2872 bootstrap is aborted if vinyl_dir contains vylog files
-- left from previous runs
--
vinyl_dir = fio.tempdir()

run_script(string.format([[
box.cfg{vinyl_dir = '%s'}
s = box.schema.space.create('test', {engine = 'vinyl'})
s:create_index('pk')
os.exit(0)
]], vinyl_dir))

code = string.format([[
box.cfg{vinyl_dir = '%s'}
os.exit(0)
]], vinyl_dir)

test:is(run_script(code), PANIC, "bootstrap from non-empty vinyl_dir")
fio.rmtree(vinyl_dir)

test:check()
os.exit(0)

Run:
./test-run.py box-tap/cfg_gh-4594.test.lua

Optional (but very desirable):

  • coredump
  • backtrace
  • netstat
@avtikhon avtikhon self-assigned this Oct 29, 2019
@avtikhon avtikhon added bug Something isn't working qa Issues related to tests or testing subsystem labels Oct 29, 2019
avtikhon added a commit that referenced this issue Oct 29, 2019
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Closes #4562
@kyukhin kyukhin added this to the 2.2.2 milestone Nov 8, 2019
@kyukhin kyukhin modified the milestones: 2.2.2, 2.2.3 Dec 30, 2019
@avtikhon
Copy link
Contributor Author

avtikhon commented Apr 10, 2020

Waiting sub-tasks at:
packpack/packpack#113
#4562

@kyukhin kyukhin modified the milestones: 2.2.3, 2.3.3 Apr 20, 2020
avtikhon added a commit that referenced this issue Apr 23, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Closes #4562
@kyukhin kyukhin removed this from the 2.3.3 milestone Apr 23, 2020
avtikhon added a commit that referenced this issue Apr 28, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Closes #4562
avtikhon added a commit that referenced this issue Apr 28, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Closes #4562

t
avtikhon added a commit that referenced this issue Apr 30, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.
avtikhon added a commit that referenced this issue Apr 30, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Close #4562
avtikhon added a commit that referenced this issue Jun 24, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Close #4562
avtikhon added a commit that referenced this issue Jun 24, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jun 24, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jun 24, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jun 24, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jun 26, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jul 1, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jul 1, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jul 4, 2020
Added makefile to build the SuSE based packages.
Implemented build with testing for opensuse images:
SuSE 15.0, 15.1, 15.2

Tarantool spec file changed with:
Added checks of %{sle_version} according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Running tests by make test, because opensuse uses out-of-source
build by default.
Set complete testing except replication suite.
Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs with testings.

Close #4562
avtikhon added a commit that referenced this issue Jul 6, 2020
Implemented SuSE packages build with testing for images:
opensuse-leap:15.[0-2]

Added %{sle_version} checks in Tarantool spec file according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that opensuse adding linker flag like '--no-undefined' which
produces the fails on building test modules at app-tap/app/box-tap
suites and tests in luajit repository. Decided to block this flag.

Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs.

Close #4562
avtikhon added a commit that referenced this issue Jul 6, 2020
Implemented openSuSE packages build with testing for images:
opensuse-leap:15.[0-2]

Added %{sle_version} checks in Tarantool spec file according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that openSuSE adding linker flag '--no-undefined' which produces
the fails on building tests. Decided to block this flag due to dynamic
libraries will be loaded from tarantool executable and will use symbols
from it. So it is completely okay to have unresolved symbols at build
time.

Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594: test not ready to be used, because
of problems around forking processes from Tarantool.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs.

Close #4562

t
avtikhon added a commit that referenced this issue Jul 6, 2020
Implemented openSuSE packages build with testing for images:
opensuse-leap:15.[0-2]

Added %{sle_version} checks in Tarantool spec file according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that openSuSE adding linker flag '--no-undefined' which produces
the fails on building tests. Decided to block this flag due to dynamic
libraries will be loaded from tarantool executable and will use symbols
from it. So it is completely okay to have unresolved symbols at build
time.

Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594: test not ready to be used, because
of problems around forking processes from Tarantool.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs.

Close #4562
avtikhon added a commit that referenced this issue Jul 6, 2020
Implemented openSuSE packages build with testing for images:
opensuse-leap:15.[0-2]

Added %{sle_version} checks in Tarantool spec file according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that openSuSE adding linker flag '--no-undefined' which produces
the fails on building tests. Decided to block this flag due to dynamic
libraries will be loaded from tarantool executable and will use symbols
from it. So it is completely okay to have unresolved symbols at build
time.

Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594: test not ready to be used, because
of problems around forking processes from Tarantool.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs.

Close #4562
avtikhon added a commit that referenced this issue Jul 7, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
avtikhon added a commit that referenced this issue Jul 7, 2020
Implemented openSuSE packages build with testing for images:
opensuse-leap:15.[0-2]

Added %{sle_version} checks in Tarantool spec file according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that openSuSE adding linker flag '--no-undefined' which produces
the fails on building tests. Decided to block this flag due to dynamic
libraries will be loaded from tarantool executable and will use symbols
from it. So it is completely okay to have unresolved symbols at build
time.

Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594: test not ready to be used, because
of problems around forking processes from Tarantool.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs.

Closes #4562
avtikhon added a commit that referenced this issue Jul 7, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
avtikhon added a commit that referenced this issue Jul 7, 2020
Implemented openSuSE packages build with testing for images:
opensuse-leap:15.[0-2]

Added %{sle_version} checks in Tarantool spec file according to
https://en.opensuse.org/openSUSE:Packaging_for_Leap#RPM_Distro_Version_Macros

Found that openSuSE adding linker flag '--no-undefined' which produces
the fails on building tests. Decided to block this flag due to dynamic
libraries will be loaded from tarantool executable and will use symbols
from it. So it is completely okay to have unresolved symbols at build
time.

Test box-tap/cfg fails and temporary blocked according to the
issue created for it #4594: test not ready to be used, because
of problems around forking processes from Tarantool.

Added rpm/prebuild-opensuse-leap.sh script to fix the issue with
broken repositories in zypper cache. It refreshes zypper cache.

Added opensuse-leap of 15.1 and 15.2 versions to Gitlab-CI packages
building/deploing jobs.

Closes #4562
avtikhon added a commit that referenced this issue Jul 7, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
Needed for #4562
@avtikhon avtikhon added this to ON REVIEW in Quality Assurance Jul 7, 2020
avtikhon added a commit that referenced this issue Jul 8, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
Needed for #4562
avtikhon added a commit that referenced this issue Jul 10, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
Needed for #4562
avtikhon added a commit that referenced this issue Jul 14, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
Needed for #4562
avtikhon added a commit that referenced this issue Aug 14, 2020
Found that test fails on its subtest checking:
  "gh-2872 bootstrap is aborted if vinyl_dir
   contains vylog files left from previous runs"

Debugging src/box/xlog.c found that all checks
were correct, but at function:
  src/box/vy_log.c:vy_log_bootstrap()
the check on of the errno on ENOENT blocked the
negative return from function:
  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before
the xdir_scan() call. To fix the issue the errno
must be clean before the call to xdir_scan, because
we are interesting in it only from this function.
The same issue fixed at function:
  src/box/vy_log.c:vy_log_begin_recovery()

Closes #4594
Needed for #4562
avtikhon added a commit that referenced this issue Aug 14, 2020
During implementation of openSUSE got failed box-tap/cfg.test.lua test.
Found that when memtx_dir wasn't existed, while vinyl_dir existed and
errno was set to ENOENT, box configuration succeeded, but it shouldn't.
Reason of this wrong behaviour was that not all of the failure paths in
xdir_scan() were set errno, but the caller assumed it.

Usual C convention is to report success or failure using a return
value and set errno at any error. So a caller usually just checks a
return value and if it means a failure (usually -1), it checks errno
to determine an exact reason.

Usual convention in tarantool is a bit different: we use a special
diagnostics area to report a reason of a failure.

Not all failure paths of xdir_scan() sets errno (including our
'invalid instance UUID' case), so we cannot be sure that errno is
not remains unchanged after a failure of the function.

However the solution with checking errno against ENOENT (No such file
or directory) is not good. For example:

- What if xdir_scan() would be changed in future and, say, some call
  will rewrite errno after the opendir() call?
- What if some other call inside xdir_scan() will set ENOENT: say,
  open() in xdir_open_cursor() due to some race?

We lean on implementation details of the callee, not its contract. This
way is too fragile and it should either check whether the directory
exists before xdir_scan() call or pass a flag to xdir_scan() whether
the directory should exist. Decided to use second variant - it does not
lead to code duplication.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 14, 2020
avtikhon added a commit that referenced this issue Aug 14, 2020
During implementation of openSUSE got failed box-tap/cfg.test.lua test.
Found that when memtx_dir wasn't existed, while vinyl_dir existed and
errno was set to ENOENT, box configuration succeeded, but it shouldn't.
Reason of this wrong behaviour was that not all of the failure paths in
xdir_scan() were set errno, but the caller assumed it.

Usual C convention is to report success or failure using a return
value and set errno at any error. So a caller usually just checks a
return value and if it means a failure (usually -1), it checks errno
to determine an exact reason.

Usual convention in tarantool is a bit different: we use a special
diagnostics area to report a reason of a failure.

Not all failure paths of xdir_scan() sets errno (including our
'invalid instance UUID' case), so we cannot be sure that errno is
not remains unchanged after a failure of the function.

However the solution with checking errno against ENOENT (No such file
or directory) is not good. For example:

- What if xdir_scan() would be changed in future and, say, some call
  will rewrite errno after the opendir() call?
- What if some other call inside xdir_scan() will set ENOENT: say,
  open() in xdir_open_cursor() due to some race?

We lean on implementation details of the callee, not its contract. This
way is too fragile and it should either check whether the directory
exists before xdir_scan() call or pass a flag to xdir_scan() whether
the directory should exist. Decided to use second variant - it does not
lead to code duplication.

Added subtest to box-tap/cfg.test.lua test file, to check the currently
fixed issue.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 17, 2020
During implementation of openSUSE got failed box-tap/cfg.test.lua test.
Found that when memtx_dir didn't exist and vinyl_dir existed and also
errno was set to ENOENT, box configuration succeeded, but it shouldn't.
Reason of this wrong behaviour was that not all of the failure paths in
xdir_scan() set errno, but the caller assumed it.

Debugging src/box/xlog.c found that all checks were correct, but at:

  src/box/vy_log.c:vy_log_bootstrap()
  src/box/vy_log.c:vy_log_begin_recovery()

the checks on of the errno on ENOENT blocked the negative return from:

  src/box/xlog.c:xdir_scan()

Found that errno was already set to ENOENT before the xdir_scan() call.
To fix the issue the errno could be clean before the call to xdir_scan,
because we are interesting in it only from xdir_scan function.

After discussions found that there were alternative better solution to
fix it. The fix with resetting errno was not good because xdir_scan()
was not system call in real and some internal routines could set it
to ENOENT itself, so it couldn't be controled from outside of function.

To be sure in behaviour of the changing errno decided to pass a flag to
xdir_scan() if the directory should exist.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 19, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behaviour was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
whould be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself than the check was wrong. The previous logic of the
check was to catch the error ENOENT from inside the xdir_scan()
function to handle the situation when vinyl_dir was not exist. In this
way errno should be reseted before xdir_scan() call to give the ability
for xdir_scan() to use return value without errno set and correctly
handle errno from inside the xdir_scan().

After discussions found that there was alternative better solution to
fix it. To be sure in behaviour of the changing errno decided to use
variant with the flag in xdir_scan() if the directory should exist.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 19, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behaviour was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
whould be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself than the check was wrong. The previous logic of the
check was to catch the error ENOENT from inside the xdir_scan()
function to handle the situation when vinyl_dir was not exist. In this
way errno should be reseted before xdir_scan() call to give the ability
for xdir_scan() to use return value without errno set and correctly
handle errno from inside the xdir_scan().

After discussions found that there was alternative better solution to
fix it. As mentioned above xdir_scan() function is not system call and
can be changed inside it in any possible way. So check outside of this
function on errno could be broken, because of the xdir_scan() changes.
To avoid of it we must avoid of errno checks outside of the function.
Better solution was to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 19, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself than the check was wrong. The previous logic of the
check was to catch the error ENOENT from inside the xdir_scan()
function to handle the situation when vinyl_dir was not exist. In this
way errno should be reset before xdir_scan() call to give the ability
for xdir_scan() to use return value without errno set and correctly
handle errno from inside the xdir_scan().

After discussions found that there was alternative better solution to
fix it. As mentioned above xdir_scan() function is not system call and
can be changed inside it in any possible way. So check outside of this
function on errno could be broken, because of the xdir_scan() changes.
To avoid of it we must avoid of errno checks outside of the function.
Better solution was to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
@avtikhon avtikhon moved this from ON REVIEW to LGTM in Quality Assurance Aug 19, 2020
avtikhon added a commit that referenced this issue Aug 20, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself than the check was wrong. The previous logic of the
check was to catch the error ENOENT which set in the xdir_scan()
function to handle the situation when vinyl_dir was not exist. In this
way errno should be reset before xdir_scan() call to give the ability
for xdir_scan() to use return value without errno set and correctly
handle errno which could be set in the xdir_scan().

After discussions found that there was alternative better solution to
fix it. As mentioned above xdir_scan() function is not system call and
can be changed inside it in any possible way. So check outside of this
function on errno could be broken, because of the xdir_scan() changes.
To avoid of it we must avoid of errno checks outside of the function.
Better solution was to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 20, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself than the check was wrong.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it errno had to be reset before xdir_scan() call,
because errno could be passed from any other function before call to
xdir_scan() and success system calls in xdir_scan() hadn't clean it up.

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid of it we must avoid of errno checks outside of the function.
Better solution was to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 20, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself it produced the wrong check.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it possible fix could be reset errno before
xdir_scan() call, because errno could be passed from any other function
before call to xdir_scan().

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid of it we must avoid of errno checks outside of the function.
Better solution is to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 27, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself it produced the wrong check.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it possible fix could be reset errno before
xdir_scan() call, because errno could be passed from any other function
before call to xdir_scan().

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid that we must not check errno after call of the function.
Better solution is to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue Aug 28, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself it produced the wrong check.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it possible fix could be reset errno before
xdir_scan() call, because errno could be passed from any other function
before call to xdir_scan().

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid that we must not check errno after call of the function.
Better solution is to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
(cherry picked from commit 77367c0)
@avtikhon avtikhon moved this from ON REVIEW WITH 1 LGTM to 2 LGTMs in Quality Assurance Aug 28, 2020
kyukhin pushed a commit that referenced this issue Aug 31, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself it produced the wrong check.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it possible fix could be reset errno before
xdir_scan() call, because errno could be passed from any other function
before call to xdir_scan().

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid that we must not check errno after call of the function.
Better solution is to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
(cherry picked from commit 43c776a)
kyukhin pushed a commit that referenced this issue Aug 31, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself it produced the wrong check.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it possible fix could be reset errno before
xdir_scan() call, because errno could be passed from any other function
before call to xdir_scan().

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid that we must not check errno after call of the function.
Better solution is to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
(cherry picked from commit 43c776a)
kyukhin pushed a commit that referenced this issue Aug 31, 2020
During implementation of openSUSE build with testing got failed test
box-tap/cfg.test.lua. Found that when memtx_dir didn't exist and
vinyl_dir existed and also errno was set to ENOENT, box configuration
succeeded, but it shouldn't. Reason of this wrong behavior was that
not all of the failure paths in xdir_scan() set errno, but the caller
assumed it.

Debugging the issue found that after xdir_scan() there was incorrect
check for errno when it returned negative values. xdir_scan() is not
system call and negative return value from it doesn't mean that errno
would be set too. Found that in situations when errno was left from
previous commands before xdir_scan() and xdir_scan() returned negative
value by itself it produced the wrong check.

The previous failed logic of the check was to catch the error ENOENT
which set in the xdir_scan() function to handle the situation when
vinyl_dir was not exist. It failed, because checking ENOENT outside
the xdir_scan() function, we had to be sure that ENOENT had come from
xdir_scan() function call indeed and not from any other functions
before. To be sure in it possible fix could be reset errno before
xdir_scan() call, because errno could be passed from any other function
before call to xdir_scan().

As mentioned above xdir_scan() function is not system call and can be
changed in any possible way and it can return any result value without
need to setup errno. So check outside of this function on errno could
be broken.

To avoid that we must not check errno after call of the function.
Better solution is to use the flag in xdir_scan(), to check if the
directory should exist. So errno check was removed and instead of it
the check for vinyl_dir existence using flag added.

Closes #4594
Needed for #4562

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
(cherry picked from commit 43c776a)
@avtikhon avtikhon moved this from 2 LGTMs ready to push to DONE in Quality Assurance Aug 31, 2020
@avtikhon avtikhon removed this from DONE in Quality Assurance Sep 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working qa Issues related to tests or testing subsystem
Projects
None yet
Development

No branches or pull requests

2 participants