Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AppVeyor CI msys2/cygwin debug jobs permafail for test2600 after #11690 #11767

Closed
vszakats opened this issue Aug 30, 2023 · 5 comments
Closed
Labels
CI Continuous Integration name lookup DNS and related tech regression

Comments

@vszakats
Copy link
Member

vszakats commented Aug 30, 2023

I did this

Pushed a PR. These AppVeyor CI jobs always fail for test2600:

  • autotools, msys2, Debug, no Proxy, no SSL
  • autotools, msys2, Debug, no SSL
  • autotools, cygwin, Debug, no SSL

All 64-bit builds.
The MSYS2 Release build is always OK however.
Release build has AsynchDNS enabled, while Debug ones don't.

Happening since:
https://ci.appveyor.com/project/curlorg/curl/builds/47840460

Original report: #11690 (comment)
Ref: 8627416 #11690

I expected the following

AppVeyor CI jobs green. Or, green after a rebuild in case of flaky results.

curl/libcurl version

master

operating system

Windows MSYS2/Cygwin

@vszakats vszakats added the CI Continuous Integration label Aug 30, 2023
@vszakats vszakats changed the title AppVeyor CI msys2/cygwin debug jobs permafail for test2600 after 862741637 AppVeyor CI msys2/cygwin debug jobs permafail for test2600 after 862741637bd31f7c621a988e47c5452ebbb58cb4 Aug 30, 2023
@vszakats vszakats changed the title AppVeyor CI msys2/cygwin debug jobs permafail for test2600 after 862741637bd31f7c621a988e47c5452ebbb58cb4 AppVeyor CI msys2/cygwin debug jobs permafail for test2600 after #11690 Aug 30, 2023
@jay
Copy link
Member

jay commented Aug 30, 2023

I've looked at a few and afaics it's always the happy eyeballs subtests that fail.

 5: test case took 403ms
 unit2600.c:309 test failed: '5: expected ipv6 to start delayed after 150ms, instead first attempt made after 201ms'

...

 6: test case took 403ms
 unit2600.c:309 test failed: '6: expected ipv4 to start delayed after 150ms, instead first attempt made after 201ms'

The error message is misleading because USE_ALARM_TIMEOUT is defined in this case (which may be the issue?) and that means the happy eyeballs timeout is set to 1000ms (instead of he_timeout which is 150ms). So it's failing because it expects a delay of 1s and it's not getting it. As you noted in your original report these fails are all with synchronous dns on Windows.

curl/tests/unit/unit2600.c

Lines 329 to 337 in 226d042

#ifdef USE_ALARM_TIMEOUT
/* we only 1sec timer resolution here */
curl_easy_setopt(easy, CURLOPT_HAPPY_EYEBALLS_TIMEOUT_MS,
(tc->he_timeout_ms > tc->connect_timeout_ms)?
3000L : 1000L);
#else
curl_easy_setopt(easy, CURLOPT_HAPPY_EYEBALLS_TIMEOUT_MS,
(long)tc->he_timeout_ms);
#endif

curl/tests/unit/unit2600.c

Lines 300 to 310 in 226d042

#ifndef USE_ALARM_TIMEOUT
if(stats2->first_created < tc->he_timeout_ms) {
#else
if(stats2->first_created < 1000) {
#endif
curl_msprintf(msg, "%d: expected ip%s to start delayed after %dms, "
"instead first attempt made after %dms",
tc->id, stats2->family, (int)tc->he_timeout_ms,
(int)stats2->first_created);
fail(msg);
}

/cc @icing

@vszakats vszakats added name lookup DNS and related tech regression labels Aug 31, 2023
@vszakats
Copy link
Member Author

vszakats commented Sep 5, 2023

Closing due to no interest.

@vszakats vszakats closed this as completed Sep 5, 2023
@dfandrich
Copy link
Contributor

There's definitely interest in dealing with failing tests. The problem is lack of developers with access to Windows. We can't have tests permanently failing.

@dfandrich
Copy link
Contributor

It turns out there is no CI build that 1) enables debug mode, 2) disables threaded resolver, 3) runs in non-event-based mode, 4) runs in non-torture mode, and 5) runs on a non-Windows platform. If there were, we'd be seeing this test fail there as well, as it's easily reproducible in that configuration locally. This basically confirms the two previous commenters' assumptions that it isn't actually Windows related but rather synchronous resolver related.

@dfandrich
Copy link
Contributor

I tried disabling the special USE_ALARM_TIMEOUT support and sent it to CI and 2600 is passing everywhere so far. Those three Appveyor jobs have been permafailing on 2600 since #11690 (commit 8627416), and, ironically, it looks to me like the whole reason for special-casing USE_ALARM_TIMEOUT was removed in that same commit. That commit increased the minimum CURLOPT_CONNECTTIMEOUT_MS used in the test well beyond the 1 second threshold that the alarm timeouts can handle. Since that's the only thing that is handled differently in USE_ALARM_TIMEOUT mode, it's no longer required. Even Solaris 11 works when removing that special case.

I'm going to clean it up into a PR and see how that fares.

dfandrich added a commit that referenced this issue Sep 14, 2023
This was originally added to handle platforms that supported only 1
second granularity in connect timeouts, but after some recent changes
the test currently permafails on several Windows platforms.

The need for this special-case was removed in commit 8627416, which
increased the connect timeout in all cases to well above 1 second.

Fixes #11767
Closes #11849
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
This was originally added to handle platforms that supported only 1
second granularity in connect timeouts, but after some recent changes
the test currently permafails on several Windows platforms.

The need for this special-case was removed in commit 8627416, which
increased the connect timeout in all cases to well above 1 second.

Fixes curl#11767
Closes curl#11849
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration name lookup DNS and related tech regression
Development

Successfully merging a pull request may close this issue.

3 participants