Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate flaky test-http2-session-timeout #20628

Closed
Trott opened this issue May 9, 2018 · 21 comments
Closed

Investigate flaky test-http2-session-timeout #20628

Trott opened this issue May 9, 2018 · 21 comments
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. http2 Issues or PRs related to the http2 subsystem.

Comments

@Trott
Copy link
Member

Trott commented May 9, 2018

  • Version: v11.0.0-pre (master)
  • Platform: ubuntu1604_sharedlibs_openssl110_x64
  • Subsystem: http2

https://ci.nodejs.org/job/node-test-commit-linux-containered/4283/nodes=ubuntu1604_sharedlibs_openssl110_x64/console

03:10:22 not ok 2130 sequential/test-http2-session-timeout
03:10:22   ---
03:10:22   duration_ms: 1.252
03:10:22   severity: fail
03:10:22   exitcode: 1
03:10:22   stack: |-
03:10:22     (node:6609) ExperimentalWarning: The http2 module is an experimental API.
03:10:22     assert.js:77
03:10:22       throw new AssertionError(obj);
03:10:22       ^
03:10:22     
03:10:22     AssertionError [ERR_ASSERTION]: function should not have been called at /home/iojs/build/workspace/node-test-commit-linux-containered/nodes/ubuntu1604_sharedlibs_openssl110_x64/test/sequential/test-http2-session-timeout.js:11
03:10:22         at Http2Server.mustNotCall (/home/iojs/build/workspace/node-test-commit-linux-containered/nodes/ubuntu1604_sharedlibs_openssl110_x64/test/common/index.js:540:12)
03:10:22         at Http2Server.emit (events.js:182:13)
03:10:22         at ServerHttp2Session.sessionOnTimeout (internal/http2/core.js:2526:15)
03:10:22         at Object.onceWrapper (events.js:273:13)
03:10:22         at ServerHttp2Session.emit (events.js:182:13)
03:10:22         at ServerHttp2Session._onTimeout (internal/http2/core.js:1256:10)
03:10:22         at ontimeout (timers.js:428:11)
03:10:22         at tryOnTimeout (timers.js:291:5)
03:10:22         at listOnTimeout (timers.js:252:5)
03:10:22         at Timer.processTimers (timers.js:212:10)
03:10:22   ...

@nodejs/http2

@Trott Trott added flaky-test Issues and PRs related to the tests with unstable failures on the CI. http2 Issues or PRs related to the http2 subsystem. labels May 9, 2018
@apapirovski
Copy link
Member

This tests a naturally flaky behaviour... I'm not sure what we can do other than keep increasing the timeouts which I'm not a big fan of. As far as I know this is the first failure in a long time. Maybe just ignore unless it happens again sometime soon?

@Trott
Copy link
Member Author

Trott commented May 13, 2018

I'm not sure what we can do other than keep increasing the timeouts which I'm not a big fan of. As far as I know this is the first failure in a long time. Maybe just ignore unless it happens again sometime soon?

@apapirovski If possible, I would like to take this opportunity to at least improve it a little bit. PTAL at #20692.

Trott added a commit to Trott/io.js that referenced this issue May 14, 2018
Check actual expired time rather than relying on a number of calls to
setTimeout() in test-http2-session-timeout more robust.

Fixes: nodejs#20628
MylesBorins pushed a commit that referenced this issue May 22, 2018
Check actual expired time rather than relying on a number of calls to
setTimeout() in test-http2-session-timeout more robust.

PR-URL: #20692
Fixes: #20628
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
@Trott
Copy link
Member Author

Trott commented Jul 29, 2018

Failed again today.

https://ci.nodejs.org/job/node-test-commit-linux-containered/5860/nodes=ubuntu1604_sharedlibs_shared_x64/console

00:14:31 not ok 2231 sequential/test-http2-session-timeout
00:14:31   ---
00:14:31   duration_ms: 1.277
00:14:31   severity: fail
00:14:31   exitcode: 1
00:14:31   stack: |-
00:14:31     (node:27510) ExperimentalWarning: The http2 module is an experimental API.
00:14:31     assert.js:84
00:14:31       throw new AssertionError(obj);
00:14:31       ^
00:14:31     
00:14:31     AssertionError [ERR_ASSERTION]: function should not have been called at /home/iojs/build/workspace/node-test-commit-linux-containered/nodes/ubuntu1604_sharedlibs_shared_x64/test/sequential/test-http2-session-timeout.js:10
00:14:31         at Http2Server.mustNotCall (/home/iojs/build/workspace/node-test-commit-linux-containered/nodes/ubuntu1604_sharedlibs_shared_x64/test/common/index.js:533:12)
00:14:31         at Http2Server.emit (events.js:182:13)
00:14:31         at ServerHttp2Session.sessionOnTimeout (internal/http2/core.js:2547:15)
00:14:31         at Object.onceWrapper (events.js:273:13)
00:14:31         at ServerHttp2Session.emit (events.js:182:13)
00:14:31         at ServerHttp2Session._onTimeout (internal/http2/core.js:1272:10)
00:14:31         at ontimeout (timers.js:454:11)
00:14:31         at tryOnTimeout (timers.js:326:5)
00:14:31         at listOnTimeout (timers.js:300:5)
00:14:31         at processTimers (timers.js:257:5)
00:14:31   ...

Wondering if reliability can be improved by checking number of ticks rather than a duration. This is something @apapirovski seems to suggest above and something @addaleax did for a different flaky test recently. I'll try to make the change and hopefully get some stress test results to see if we can reproduce the unreliability here and make it reliable.

@Trott Trott reopened this Jul 29, 2018
@Trott
Copy link
Member Author

Trott commented Jul 29, 2018

Stress test to see if we can measure the flakiness on current master: https://ci.nodejs.org/job/node-stress-single-test/1972/nodes=ubuntu1604_sharedlibs_debug_x64/

@Trott
Copy link
Member Author

Trott commented Jul 30, 2018

Wondering if the issue is setTimeout(makeReq, callTimeout). Increasing callTimeout increases flakiness for more locally, so perhaps changing that to setImmediate() will make things a fair bit more reliable?

Trott added a commit to Trott/io.js that referenced this issue Jul 30, 2018
Use `setImmediate()` instead of `setTimeout()` to improve robustness of
test-http2-session-timeout.

Fixes: nodejs#20628
@Trott Trott closed this as completed in d2ffcac Aug 1, 2018
targos pushed a commit that referenced this issue Aug 2, 2018
Use `setImmediate()` instead of `setTimeout()` to improve robustness of
test-http2-session-timeout.

Fixes: #20628

PR-URL: #22026
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
BethGriggs pushed a commit that referenced this issue Oct 17, 2018
Check actual expired time rather than relying on a number of calls to
setTimeout() in test-http2-session-timeout more robust.

Backport-PR-URL: #22850
PR-URL: #20692
Fixes: #20628
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
BethGriggs pushed a commit that referenced this issue Oct 17, 2018
Use `setImmediate()` instead of `setTimeout()` to improve robustness of
test-http2-session-timeout.

Fixes: #20628

Backport-PR-URL: #22850
PR-URL: #22026
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Jon Moss <me@jonathanmoss.me>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
jasnell pushed a commit that referenced this issue Oct 17, 2018
PR-URL: #23639
Refs: #20628
Reviewed-By: Rich Trott <rtrott@gmail.com>
@Trott
Copy link
Member Author

Trott commented Oct 22, 2018

Showing up on Windows now looking different (timing out after 322 requests rather than afetr 1.)

Host is test-azure_msft-win10-x64-4.

https://ci.nodejs.org/job/node-test-binary-windows/20919/COMPILED_BY=vs2017,RUNNER=win10,RUN_SUBSET=0/console

06:40:05 not ok 590 sequential/test-http2-session-timeout
06:40:05   ---
06:40:05   duration_ms: 0.694
06:40:05   severity: fail
06:40:05   exitcode: 1
06:40:05   stack: |-
06:40:05     assert.js:128
06:40:05       throw err;
06:40:05       ^
06:40:05     
06:40:05     AssertionError [ERR_ASSERTION]: Timeout after 322 request(s)
06:40:05         at Http2Server.mustNotCall (c:\workspace\node-test-binary-windows\test\sequential\test-http2-session-timeout.js:13:10)
06:40:05         at Http2Server.emit (events.js:182:13)
06:40:05         at ServerHttp2Session.sessionOnTimeout (internal/http2/core.js:2633:15)
06:40:05         at Object.onceWrapper (events.js:273:13)
06:40:05         at ServerHttp2Session.emit (events.js:182:13)
06:40:05         at ServerHttp2Session._onTimeout (internal/http2/core.js:1306:10)
06:40:05         at listOnTimeout (timers.js:324:15)
06:40:05         at processTimers (timers.js:268:5)
06:40:05   ...

MylesBorins pushed a commit that referenced this issue Oct 30, 2018
PR-URL: #23639
Refs: #20628
Reviewed-By: Rich Trott <rtrott@gmail.com>
@Trott
Copy link
Member Author

Trott commented Nov 8, 2018

https://ci.nodejs.org/job/node-test-binary-arm/3941/RUN_SUBSET=2,label=pi2-docker/console

16:10:08 not ok 328 sequential/test-http2-session-timeout # TODO : Fix flaky test
16:10:08   ---
16:10:08   duration_ms: 2.247
16:10:08   severity: flaky
16:10:08   exitcode: 1
16:10:08   stack: |-
16:10:08     assert.js:128
16:10:08       throw err;
16:10:08       ^
16:10:08     
16:10:08     AssertionError [ERR_ASSERTION]: Timeout after 2 request(s)
16:10:08         at Http2Server.mustNotCall (/home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-http2-session-timeout.js:13:10)
16:10:08         at Http2Server.emit (events.js:182:13)
16:10:08         at ServerHttp2Session.sessionOnTimeout (internal/http2/core.js:2644:15)
16:10:08         at Object.onceWrapper (events.js:273:13)
16:10:08         at ServerHttp2Session.emit (events.js:182:13)
16:10:08         at ServerHttp2Session._onTimeout (internal/http2/core.js:1310:10)
16:10:08         at listOnTimeout (timers.js:324:15)
16:10:08         at processTimers (timers.js:268:5)
16:10:08   ...

rvagg pushed a commit that referenced this issue Nov 28, 2018
PR-URL: #23639
Refs: #20628
Reviewed-By: Rich Trott <rtrott@gmail.com>
MylesBorins pushed a commit that referenced this issue Nov 29, 2018
PR-URL: #23639
Refs: #20628
Reviewed-By: Rich Trott <rtrott@gmail.com>
@Trott
Copy link
Member Author

Trott commented Dec 6, 2018

https://ci.nodejs.org/job/node-test-binary-arm/4703/RUN_SUBSET=4,label=pi1-docker/console

test-requireio_bengl-debian9-armv6l_pi1p-1

01:14:04 not ok 328 sequential/test-http2-session-timeout # TODO : Fix flaky test
01:14:04   ---
01:14:04   duration_ms: 6.20
01:14:04   severity: flaky
01:14:04   exitcode: 1
01:14:04   stack: |-
01:14:04     assert.js:128
01:14:04       throw err;
01:14:04       ^
01:14:04     
01:14:05     AssertionError [ERR_ASSERTION]: Timeout after 1 request(s)
01:14:05         at Http2Server.mustNotCall (/home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-http2-session-timeout.js:13:10)
01:14:05         at Http2Server.emit (events.js:189:13)
01:14:05         at ServerHttp2Session.sessionOnTimeout (internal/http2/core.js:2622:15)
01:14:05         at Object.onceWrapper (events.js:277:13)
01:14:05         at ServerHttp2Session.emit (events.js:189:13)
01:14:05         at ServerHttp2Session._onTimeout (internal/http2/core.js:1307:10)
01:14:05         at listOnTimeout (timers.js:324:15)
01:14:05         at processTimers (timers.js:268:5)
01:14:05   ...

@Trott
Copy link
Member Author

Trott commented Dec 6, 2018

I'm hopeful that a robust solution might be to determine the right timeout empirically at runtime. Working it out right now....

@Trott
Copy link
Member Author

Trott commented Dec 6, 2018

Proposed fix: #24877

Trott added a commit to Trott/io.js that referenced this issue Dec 6, 2018
Instead of using magic values for the server timeout in
test-http2-session-timeout, measure the duration of the first request
(which should be longer than subsequent requests) and use that value.

Fixes: nodejs#20628
Trott added a commit to Trott/io.js that referenced this issue Dec 8, 2018
A fix to test-http2-session-timeout makes it sufficiently robust that it
can be moved to the parallel directory.

PR-URL: nodejs#24877
Fixes: nodejs#20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
@Trott Trott closed this as completed in 3fe00ef Dec 8, 2018
Flakees in CI automation moved this from Known flakes to Closed Dec 8, 2018
BethGriggs pushed a commit that referenced this issue Dec 17, 2018
Instead of using magic values for the server timeout in
test-http2-session-timeout, measure the duration of the first request
(which should be longer than subsequent requests) and use that value.

Fixes: #20628

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
BethGriggs pushed a commit that referenced this issue Dec 17, 2018
A fix to test-http2-session-timeout makes it sufficiently robust that it
can be moved to the parallel directory.

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
refack pushed a commit to refack/node that referenced this issue Jan 14, 2019
Instead of using magic values for the server timeout in
test-http2-session-timeout, measure the duration of the first request
(which should be longer than subsequent requests) and use that value.

Fixes: nodejs#20628

PR-URL: nodejs#24877
Fixes: nodejs#20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
refack pushed a commit to refack/node that referenced this issue Jan 14, 2019
A fix to test-http2-session-timeout makes it sufficiently robust that it
can be moved to the parallel directory.

PR-URL: nodejs#24877
Fixes: nodejs#20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
BethGriggs pushed a commit that referenced this issue Feb 12, 2019
Instead of using magic values for the server timeout in
test-http2-session-timeout, measure the duration of the first request
(which should be longer than subsequent requests) and use that value.

Fixes: #20628

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
BethGriggs pushed a commit that referenced this issue Feb 12, 2019
A fix to test-http2-session-timeout makes it sufficiently robust that it
can be moved to the parallel directory.

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
BethGriggs pushed a commit that referenced this issue Feb 20, 2019
Instead of using magic values for the server timeout in
test-http2-session-timeout, measure the duration of the first request
(which should be longer than subsequent requests) and use that value.

Fixes: #20628

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
BethGriggs pushed a commit that referenced this issue Feb 20, 2019
A fix to test-http2-session-timeout makes it sufficiently robust that it
can be moved to the parallel directory.

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
rvagg pushed a commit that referenced this issue Feb 28, 2019
Instead of using magic values for the server timeout in
test-http2-session-timeout, measure the duration of the first request
(which should be longer than subsequent requests) and use that value.

Fixes: #20628

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
rvagg pushed a commit that referenced this issue Feb 28, 2019
A fix to test-http2-session-timeout makes it sufficiently robust that it
can be moved to the parallel directory.

PR-URL: #24877
Fixes: #20628
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. http2 Issues or PRs related to the http2 subsystem.
Projects
Flakees in CI
  
Closed
3 participants