travis-ci: box/on_shutdown.test.lua test flaky fails under highload #4134

Closed
avtikhon opened this issue Apr 10, 2019 · 3 comments
Labels: flaky test, qa (Issues related to tests or testing subsystem)

@avtikhon
Contributor

avtikhon commented Apr 10, 2019

Tarantool version:
master

OS version image:
packpack/packpack:debian-stretch

Bug description:
The following check still failed even after grep_log was replaced with wait_log using a 30-second timeout:

[079] --- box/misc.result Wed Apr 10 15:13:37 2019
[079] +++ box/misc.reject Wed Apr 10 18:12:13 2019
[079] @@ -1406,7 +1406,6 @@
[079] ...
[079] test_run:wait_log('test', 'on_shutdown 5', nil, 30, {noreset=true})
[079] ---
[079] -- on_shutdown 5
[079] ...
[079] -- make sure we exited because of os.exit(), not a signal.
[079] test_run:wait_log('test', 'signal', nil, 10, {noreset=true})
[079]

Steps to reproduce:
./test-run.py -j 50 box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua box/misc.test.lua
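
The argument list in the command above can also be generated rather than typed out. The sketch below uses a hypothetical helper, repeat_test, which is not part of test-run.py; it only prints a test path a given number of times. The copy count is a parameter chosen by the caller.

```shell
# Hypothetical helper, not part of test-run.py: print a test path N times,
# separated by spaces, so the result can be passed as arguments.
repeat_test() {
    test_path=$1
    copies=$2
    i=0
    while [ "$i" -lt "$copies" ]; do
        printf '%s ' "$test_path"
        i=$((i + 1))
    done
}

# Example invocation (the counts here are illustrative):
#   ./test-run.py -j 50 $(repeat_test box/misc.test.lua 50)
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate arguments.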

sergepetrenko added a commit that referenced this issue Apr 11, 2019
Rewrite the test so that it doesn't depend on timeouts, which can be
exceeded under high load, and simplify it a little.

Closes #4134
avtikhon pushed a commit that referenced this issue Apr 11, 2019
Rewrite the test so that it doesn't depend on timeouts, which can be
exceeded under high load, and simplify it a little.

Closes #4134
sergepetrenko added a commit that referenced this issue Apr 11, 2019
Rewrite the test so that it doesn't depend too much on timeouts, which
can be exceeded under high load, and simplify it a little.

Also extract the whole part regarding on_shutdown triggers into a
separate test, so that its sporadic failures would be easier to
investigate.

Part of #4134
sergepetrenko added a commit that referenced this issue Apr 12, 2019
This part of the test is flaky when tests are run in parallel. Besides,
it is quite big on its own, so extract it into a separate file to add
more flexibility in running tests and to make finding problems easier.

Part of #4134
sergepetrenko added a commit that referenced this issue Apr 12, 2019
The test is flaky under high load (e.g. when it is run in parallel with a
lot of workers). Make it less dependent on arbitrary timeouts to improve
stability.

Part of #4134
locker pushed a commit that referenced this issue Apr 12, 2019
This part of the test is flaky when tests are run in parallel. Besides,
it is quite big on its own, so extract it into a separate file to add
more flexibility in running tests and to make finding problems easier.

Part of #4134
locker pushed a commit that referenced this issue Apr 12, 2019
The test is flaky under high load (e.g. when it is run in parallel with a
lot of workers). Make it less dependent on arbitrary timeouts to improve
stability.

Part of #4134
locker pushed a commit that referenced this issue Apr 12, 2019
This part of the test is flaky when tests are run in parallel. Besides,
it is quite big on its own, so extract it into a separate file to add
more flexibility in running tests and to make finding problems easier.

Part of #4134

(cherry picked from commit 70ea999)
locker pushed a commit that referenced this issue Apr 12, 2019
The test is flaky under high load (e.g. when it is run in parallel with a
lot of workers). Make it less dependent on arbitrary timeouts to improve
stability.

Part of #4134

(cherry picked from commit c026052)
avtikhon pushed a commit that referenced this issue May 13, 2019
This part of the test is flaky when tests are run in parallel. Besides,
it is quite big on its own, so extract it into a separate file to add
more flexibility in running tests and to make finding problems easier.

Part of #4134
avtikhon pushed a commit that referenced this issue May 13, 2019
The test is flaky under high load (e.g. when it is run in parallel with a
lot of workers). Make it less dependent on arbitrary timeouts to improve
stability.

Part of #4134
avtikhon pushed a commit that referenced this issue May 15, 2019
This part of the test is flaky when tests are run in parallel. Besides,
it is quite big on its own, so extract it into a separate file to add
more flexibility in running tests and to make finding problems easier.

Part of #4134
avtikhon pushed a commit that referenced this issue May 15, 2019
The test is flaky under high load (e.g. when it is run in parallel with a
lot of workers). Make it less dependent on arbitrary timeouts to improve
stability.

Part of #4134
kyukhin added the qa label (Issues related to tests or testing subsystem) Jun 7, 2019
kyukhin added this to the 2.1.3 milestone Jun 7, 2019
@sergepetrenko
Collaborator

Doesn't reproduce on my side anymore. (Mac OS X 10.14.5)

./test-run.py -j128 box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. 
box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown. box/on_shutdown.
...
Statistics:

  • pass: 128

sergepetrenko added the needs feedback label (Something is unclear with the issue) Jun 17, 2019
@avtikhon
Contributor Author

@Totktonada
Member

Note: The test case is now extracted into box/on_shutdown.test.lua.

Totktonada pushed a commit that referenced this issue Jun 21, 2019
The test is flaky and often fails in parallel testing. We want to enable
parallel testing within the scope of #4156 (enabling GitLab CI).

It should be re-enabled in the scope of #4134.

Needed for #4156.
avtikhon added a commit that referenced this issue Jun 25, 2019
The test is flaky and often fails in parallel testing. We want to enable
parallel testing within the scope of #4156 (enabling GitLab CI).

It should be re-enabled in the scope of #4134.

Needed for #4156.
Totktonada changed the title from "travis-ci: box/misc test flaky fails under highload" to "travis-ci: box/on_shutdown.test.lua test flaky fails under highload" Jun 25, 2019
Totktonada pushed a commit that referenced this issue Jun 25, 2019
The test is flaky and often fails in parallel testing. We want to enable
parallel testing within the scope of #4156 (enabling GitLab CI).

It should be re-enabled in the scope of #4134.

Needed for #4156.
sergepetrenko added a commit that referenced this issue Jun 26, 2019
Replace prints that indicate on_shutdown trigger execution with
log.warn, which is more reliable. This eliminates occasional test
failures. Also instead of waiting for the server to start and executing
grep_log, wait for the desired log entries to appear with wait_log.

Closes #4134
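
The difference between a one-shot log search and waiting for a log entry can be sketched in shell. This only illustrates the general polling idea; it is not the test-run implementation of grep_log or wait_log. The function name wait_for_log, its arguments, and the one-second polling interval are assumptions made for the example, and `date +%s` is assumed to be available (it is on common Linux and macOS systems).

```shell
# Hypothetical sketch: poll a log file until a fixed string appears
# or the timeout (in seconds) expires.
# Returns 0 if the string was found, 1 on timeout.
wait_for_log() {
    log_file=$1
    pattern=$2
    timeout=$3
    deadline=$(( $(date +%s) + timeout ))
    while :; do
        # -F: treat the pattern as a literal string; -q: no output.
        if grep -qF -- "$pattern" "$log_file" 2>/dev/null; then
            return 0
        fi
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep 1
    done
}

# Example: wait up to 30 seconds for a trigger message.
#   wait_for_log test.log 'on_shutdown 5' 30
```

A single search performed immediately can run before the entry has been written when the machine is heavily loaded. The polling version returns as soon as the entry appears and fails only once the deadline has passed.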
avtikhon added a commit that referenced this issue Jul 1, 2019
The test is flaky and often fails in parallel testing. We want to enable
parallel testing within the scope of #4156 (enabling GitLab CI).

It should be re-enabled in the scope of #4134.

Needed for #4156.
sergepetrenko added the ready for review label and removed the needs feedback label (Something is unclear with the issue) Jul 3, 2019
avtikhon added a commit that referenced this issue Jul 3, 2019
The test is flaky and often fails in parallel testing. We want to enable
parallel testing within the scope of #4156 (enabling GitLab CI).

It should be re-enabled in the scope of #4134.

Needed for #4156.
Totktonada pushed a commit that referenced this issue Jul 4, 2019
Replace prints that indicate on_shutdown trigger execution with
log.warn, which is more reliable. This eliminates occasional test
failures. Also instead of waiting for the server to start and executing
grep_log, wait for the desired log entries to appear with wait_log.

Closes #4134

(cherry picked from commit 5046069)