test: flaky box/net.box_wait_connected_gh-3856 test on FreeBSD #5083
avtikhon added a commit that referenced this issue on Jun 16, 2020

Found issue running test on FreeBSD VBox host:

    [011] --- box/net.box_wait_connected_gh-3856.result Mon Jun 15 09:39:49 2020
    [011] +++ box/net.box_wait_connected_gh-3856.reject Fri May 8 08:23:30 2020
    [011] @@ -12,7 +12,8 @@
    [011]  - opts:
    [011]      wait_connected: false
    [011]    host: 8.8.8.8
    [011] -  state: initial
    [011] +  state: error
    [011] +  error: Invalid argument
    [011]    port: '123456'
    [011]  ...
    [011]  c:close()

To avoid this issue and to verify that "wait_connected = false" is indeed ignored, the test should wait until the connection state becomes 'initial' and only then check the result.

Closes #5083
avtikhon added a commit that referenced this issue on Jun 16, 2020

Found issue running test on FreeBSD VBox host:

    [011] --- box/net.box_wait_connected_gh-3856.result Mon Jun 15 09:39:49 2020
    [011] +++ box/net.box_wait_connected_gh-3856.reject Fri May 8 08:23:30 2020
    [011] @@ -12,7 +12,8 @@
    [011]  - opts:
    [011]      wait_connected: false
    [011]    host: 8.8.8.8
    [011] -  state: initial
    [011] +  state: error
    [011] +  error: Invalid argument
    [011]    port: '123456'
    [011]  ...
    [011]  c:close()

The test uses an external Google DNS IP address; for information on it, see:
https://developers.google.com/speed/public-dns/docs/using

The issue appears because the address is external and the connection may fail from time to time. In this case the test should wait until the connection state becomes 'initial' and only then continue.

Closes #5083
avtikhon pushed a commit that referenced this issue on Jun 20, 2020

Found issue running test on FreeBSD VBox host:

    [011] --- box/net.box_wait_connected_gh-3856.result Mon Jun 15 09:39:49 2020
    [011] +++ box/net.box_wait_connected_gh-3856.reject Fri May 8 08:23:30 2020
    [011] @@ -12,7 +12,8 @@
    [011]  - opts:
    [011]      wait_connected: false
    [011]    host: 8.8.8.8
    [011] -  state: initial
    [011] +  state: error
    [011] +  error: Invalid argument
    [011]    port: '123456'
    [011]  ...
    [011]  c:close()

The test failed because getaddrinfo() returns EAI_SERVICE for an out-of-range TCP/IP port on FreeBSD, whereas on Linux/glibc it wraps the port modulo 65536. Checked with a local script './getaddrinfo':

    (Linux/glibc) $ ./getaddrinfo 8.8.8.8 123456
    ----
    family: AF_INET
    socktype: SOCK_STREAM
    protocol: IPPROTO_TCP
    host: 8.8.8.8
    serv: 57920

    (FreeBSD) $ ./getaddrinfo 8.8.8.8 123456
    getaddrinfo: Service was not recognized for socket type

So the obvious fix is to change 123456 to something less than or equal to 65535. Say, 1234.

The test also depended on the order in which fibers were scheduled (net_box.connect() creates a separate fiber for connecting in the background using fiber.create(), which yields). It is unlikely that our fiber would not get execution time during the connection attempt, so this was more of a formal concern. But we can decrease the probability of this situation even further if we grab all connection fields right when net_box.connect() returns, not after a yield in the console (which happens while waiting for the next command from test-run).

Closes #5083

Reviewed-by: Alexander V. Tikhonov <avtikhon@tarantool.org>
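The platform difference described in the commit message can be probed with a short Python sketch. This is not the './getaddrinfo' script mentioned there (that script is not shown in this thread); `resolve_port` is a hypothetical helper that calls the platform's getaddrinfo() through Python's socket module, so the result for an out-of-range port depends on the libc in use.

```python
import socket

MAX_PORT = 65535

def resolve_port(host, service):
    """Return the port chosen by getaddrinfo(), or the gaierror it raised."""
    try:
        infos = socket.getaddrinfo(
            host, service,
            family=socket.AF_INET,
            type=socket.SOCK_STREAM,
            proto=socket.IPPROTO_TCP,
            flags=socket.AI_NUMERICHOST,  # numeric host, so no DNS lookup
        )
    except socket.gaierror as exc:
        return exc
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return infos[0][4][1]

# What a libc that truncates the port to 16 bits would produce.
wrapped = 123456 % (MAX_PORT + 1)  # 57920

for service in ("123456", "1234"):
    print(service, "->", resolve_port("8.8.8.8", service))
```

For "123456" the printed value should be either the wrapped port or a gaierror, depending on the platform; "1234" is in range and should resolve to itself everywhere, which is why the fix picks a port no greater than 65535.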
avtikhon added a commit that referenced this issue on Jun 22, 2020

Found issue running test on FreeBSD VBox host:

    [011] --- box/net.box_wait_connected_gh-3856.result Mon Jun 15 09:39:49 2020
    [011] +++ box/net.box_wait_connected_gh-3856.reject Fri May 8 08:23:30 2020
    [011] @@ -12,7 +12,8 @@
    [011]  - opts:
    [011]      wait_connected: false
    [011]    host: 8.8.8.8
    [011] -  state: initial
    [011] +  state: error
    [011] +  error: Invalid argument
    [011]    port: '123456'
    [011]  ...
    [011]  c:close()

A. Turenko investigated the issue in depth and found that the test failed because getaddrinfo() returns EAI_SERVICE for an out-of-range TCP/IP port on FreeBSD, whereas on Linux/glibc it wraps the port modulo 65536. Checked with his local script './getaddrinfo':

    (Linux/glibc) $ ./getaddrinfo 8.8.8.8 123456
    ----
    family: AF_INET
    socktype: SOCK_STREAM
    protocol: IPPROTO_TCP
    host: 8.8.8.8
    serv: 57920

    (FreeBSD) $ ./getaddrinfo 8.8.8.8 123456
    getaddrinfo: Service was not recognized for socket type

So the obvious fix is to change 123456 to something less than or equal to 65535. Say, 1234.

The test also depended on the order in which fibers were scheduled (net_box.connect() creates a separate fiber for connecting in the background using fiber.create(), which yields). It is unlikely that our fiber would not get execution time during the connection attempt, so this was more of a formal concern. But we can decrease the probability of this situation even further if we grab all connection fields right when net_box.connect() returns, not after a yield in the console (which happens while waiting for the next command from test-run).

Closes #5083

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
avtikhon added a commit that referenced this issue on Jun 22, 2020

Found issue running test on FreeBSD VBox host:

    [011] --- box/net.box_wait_connected_gh-3856.result Mon Jun 15 09:39:49 2020
    [011] +++ box/net.box_wait_connected_gh-3856.reject Fri May 8 08:23:30 2020
    [011] @@ -12,7 +12,8 @@
    [011]  - opts:
    [011]      wait_connected: false
    [011]    host: 8.8.8.8
    [011] -  state: initial
    [011] +  state: error
    [011] +  error: Invalid argument
    [011]    port: '123456'
    [011]  ...
    [011]  c:close()

A. Turenko investigated the issue in depth and found that the test failed because getaddrinfo() returns EAI_SERVICE for an out-of-range TCP/IP port on FreeBSD, whereas on Linux/glibc it wraps the port modulo 65536. Checked with his local script './getaddrinfo':

    (Linux/glibc) $ ./getaddrinfo 8.8.8.8 123456
    ----
    family: AF_INET
    socktype: SOCK_STREAM
    protocol: IPPROTO_TCP
    host: 8.8.8.8
    serv: 57920

    (FreeBSD) $ ./getaddrinfo 8.8.8.8 123456
    getaddrinfo: Service was not recognized for socket type

So the obvious fix is to change 123456 to something less than or equal to 65535. Say, 1234.

The test also depended on the order in which fibers were scheduled (net_box.connect() creates a separate fiber for connecting in the background using fiber.create(), which yields). It is unlikely that our fiber would not get execution time during the connection attempt, so this was more of a formal concern. But we can decrease the probability of this situation even further if we grab all connection fields right when net_box.connect() returns, not after a yield in the console (which happens while waiting for the next command from test-run).

Closes #5083

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
kyukhin pushed a commit that referenced this issue on Jun 26, 2020

Found issue running test on FreeBSD VBox host:

    [011] --- box/net.box_wait_connected_gh-3856.result Mon Jun 15 09:39:49 2020
    [011] +++ box/net.box_wait_connected_gh-3856.reject Fri May 8 08:23:30 2020
    [011] @@ -12,7 +12,8 @@
    [011]  - opts:
    [011]      wait_connected: false
    [011]    host: 8.8.8.8
    [011] -  state: initial
    [011] +  state: error
    [011] +  error: Invalid argument
    [011]    port: '123456'
    [011]  ...
    [011]  c:close()

A. Turenko investigated the issue in depth and found that the test failed because getaddrinfo() returns EAI_SERVICE for an out-of-range TCP/IP port on FreeBSD, whereas on Linux/glibc it wraps the port modulo 65536. Checked with his local script './getaddrinfo':

    (Linux/glibc) $ ./getaddrinfo 8.8.8.8 123456
    ----
    family: AF_INET
    socktype: SOCK_STREAM
    protocol: IPPROTO_TCP
    host: 8.8.8.8
    serv: 57920

    (FreeBSD) $ ./getaddrinfo 8.8.8.8 123456
    getaddrinfo: Service was not recognized for socket type

So the obvious fix is to change 123456 to something less than or equal to 65535. Say, 1234.

The test also depended on the order in which fibers were scheduled (net_box.connect() creates a separate fiber for connecting in the background using fiber.create(), which yields). It is unlikely that our fiber would not get execution time during the connection attempt, so this was more of a formal concern. But we can decrease the probability of this situation even further if we grab all connection fields right when net_box.connect() returns, not after a yield in the console (which happens while waiting for the next command from test-run).

Closes #5083

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
(cherry picked from commit d51be6f)
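The scheduling point in the commit message (read the connection fields before yielding, not after) can be illustrated with a loose analogy in Python's asyncio. This is not Tarantool or net.box code: `Conn` and `worker` are hypothetical stand-ins, and asyncio differs from fibers in that a new task does not start running until its creator yields, whereas the commit message says fiber.create() itself yields. The sketch only shows why a value read before a yield can differ from one read after it.

```python
import asyncio

class Conn:
    """Toy stand-in for a connection object; not the net.box API."""
    def __init__(self):
        self.state = "initial"

async def worker(conn):
    # Background connect attempt that fails at once, e.g. on a bad port.
    conn.state = "error"

async def main():
    conn = Conn()
    task = asyncio.create_task(worker(conn))  # scheduled, not yet run
    early = conn.state      # read before yielding control
    await asyncio.sleep(0)  # yield: the worker now gets to run
    late = conn.state       # read after the yield
    await task
    return early, late

result = asyncio.run(main())
print(result)
```

Reading the state before the yield pins down the value the test wants to compare, while a read after the yield depends on what the background worker managed to do in the meantime.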
Tarantool version:
Tarantool 2.5.0-142-ged935572b
Target: FreeBSD-amd64-RelWithDebInfo
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=OFF
Compiler: /usr/bin/cc /usr/bin/c++
C_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-common -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-gnu-alignof-expression -Werror
CXX_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-common -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Werror
OS version:
FreeBSD 12
Bug description:
https://gitlab.com/tarantool/tarantool/-/jobs/596437958
https://gitlab.com/tarantool/tarantool/-/jobs/596349477
https://gitlab.com/tarantool/tarantool/-/jobs/594179381
Steps to reproduce:
Optional (but very desirable):